2

We plan to deploy a Hadoop cluster on several hundred (say 300) physical x86 nodes. Since we do not have much production deployment experience, we would like to hear from experienced people on the question in the title. What are the best practices? Should we deploy Hadoop directly on the physical machines, or do we need a virtual machine layer (i.e., an IaaS cloud) to manage the computing resources for the Hadoop cluster? What concerns should we take into account when making this decision?

John Wang
  • Your question will probably receive some downvotes. Be aware that the aim of serverfault is to help when you have a problem, not to give general advice (http://serverfault.com/help/on-topic) – alphamikevictor Apr 23 '15 at 09:01

1 Answer

2

Hadoop was designed to run on bare metal hardware.
It is meant to manage resource allocation for you.
Another layer underneath is just overhead that can be avoided.

That said, it is hard to name a single best practice, because it depends on many factors.
You should read https://cwiki.apache.org/confluence/display/HADOOP2/Virtual+Hadoop and make your decision.
That page addresses the reasons against running Hadoop in a virtual environment and explains why some people may still want to do it.

faker