Monday, November 30, 2015

How to use Cloudera Manager to set up a new cluster (according to Sean from Cloudera, 2015/07/17)


Deploying nodes from quickstart VM?
https://community.cloudera.com/t5/Apache-Hadoop-Concepts-and/Deploying-nodes-from-quickstart-VM/td-p/29715


If you want to have a virtual cluster, I would strongly recommend just
starting with vanilla Linux VMs, downloading and running the Cloudera
Manager installer on one of them, and building a new cluster. You may be
surprised at how easy it is, once you have the VMs networked together
properly. You would have to reset so much on the QuickStart VM to get it to
incorporate a copy of itself as another node in the cluster that it would
actually be harder than starting from scratch. The QuickStart VM is
designed to "just work" as robustly as possible regardless of how the
virtual network is set up, and that requires it to assume that it is just
a single node. So be aware that you're going to run into some issues if
you try this, and we do not try to cater to this use case.

Specifically, you're going to run into a lot of networking issues. The VM
has the hostname quickstart.cloudera 'baked' into it. To add another node,
you would need another hostname, and that's going to require changing so
many config files and resetting so many services that you would basically
be starting from scratch anyway. You would also need to be careful with IP
addresses. If another network device is not available early enough in the
boot, the VM will use 127.0.0.1 - which works fine for a single node, but
that's not how you want machines to refer to themselves in a distributed
system, because as soon as the address is resolved on another machine it
points to the wrong host. So you'd need to make sure the VM had an
externally routable IP (e.g. use bridged networking, or a similar option)
and was rebooted (in my experience, you have to reboot twice after making
the change) so that the correct networking device is available early
enough in the boot process. Bear in mind that this is all in theory - I
don't know that anyone has successfully done this. Again - it's so much
easier to just install using Cloudera Manager on top of some new Linux VMs.
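
The loopback problem described above is easy to check before installing
anything. What follows is only a minimal sketch, not anything from Cloudera:
the function names is_loopback and check_hostname are made up for
illustration, and it assumes Python 3 is available on the VM. It resolves
the machine's own hostname and reports whether the result is a loopback
address such as 127.0.0.1, which other nodes could not use to reach it.

```python
import ipaddress
import socket


def is_loopback(address: str) -> bool:
    """Return True if the address is in a loopback range (e.g. 127.0.0.0/8)."""
    return ipaddress.ip_address(address).is_loopback


def check_hostname(hostname: str) -> str:
    """Resolve a hostname and describe whether other nodes could use the result."""
    try:
        address = socket.gethostbyname(hostname)
    except socket.gaierror as exc:
        return f"{hostname}: does not resolve ({exc})"
    if is_loopback(address):
        return f"{hostname} -> {address}: loopback, other nodes cannot reach this"
    return f"{hostname} -> {address}: not loopback"


if __name__ == "__main__":
    # Check the name this machine uses for itself.
    print(check_hostname(socket.getfqdn()))
```

Running this on each VM before adding it to the cluster should show a
non-loopback address. A passing check only rules out the loopback case; it
does not prove that the address is routable from the other nodes.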
