Is it even possible to build a Hadoop cluster from Raspberry Pi-based nodes? Can such a cluster meet Hadoop's hardware requirements? And if so, how many Raspberry Pi nodes are required to meet them?

I understand that a cluster of several Raspberry Pi nodes is cheap but not powerful. My goal is to set up a cluster without the risk of losing personal data from my desktop or notebook, and to use this cluster to study Hadoop.

I'd appreciate it if you could suggest any better ideas for setting up a cheap Hadoop cluster for study purposes.

UPD: I've seen that the recommended hardware for Hadoop is 16-24 GB of memory, multi-core processors, and 1 TB of HDD, but these don't look like minimum requirements.

UPD2: I now understand that serverfault.com is a place for questions about production systems. Questions about configuring systems for fun and personal use are out of scope. Sorry for asking this question.

Dmitriy Sukharev
  • For test/study purposes, a bunch of VMs will do the job. – womble Jul 08 '12 at 14:21
  • I actually think it's a reasonable proposition... the Hadoop heap doesn't have to be 1 GB, especially if you're not doing large jobs. – jayunit100 Nov 05 '12 at 19:34
  • What are the alternatives? – Alex Gordon Dec 04 '12 at 00:17
  • Hadoop can work in 3 different configurations: single node (only a few of the most important services are started), pseudo-cluster (all Hadoop services run on the same node), and cluster (distributed configuration). It's quite OK to learn Apache Hadoop with only 1-2 nodes. It will work, but with a more mature configuration you may run into some new, unexpected bugs. Personally I have 2 nodes, a desktop and a notebook, with 10 cores in total, which is quite enough for me. At any rate, everything depends on your needs. – Dmitriy Sukharev Dec 04 '12 at 13:37

1 Answer

It would be incredibly bad. Hadoop eats heap like an elephant. The default heap size is 1000 MB, but pretty much everybody increases it. 256 MB of RAM will not get you far: after the GPU and operating system take their share, you'd probably have to restrict Java to 128 MB, and even the smallest jobs would run out of heap.
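
To make that arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. The 256 MB total and the 1000 MB default heap come from the figures above; the 64 MB GPU reservation and 64 MB operating system footprint are assumptions picked only for illustration, not measurements, and the helper name `heap_budget_mb` is hypothetical.

```python
# Rough memory budget for a single 256 MB Raspberry Pi node.
TOTAL_RAM_MB = 256             # RAM on the board, as stated above
DEFAULT_HADOOP_HEAP_MB = 1000  # default Hadoop heap size, as stated above


def heap_budget_mb(total_mb, gpu_mb, os_mb):
    """Return the RAM left for the JVM after GPU and OS reservations.

    Never returns a negative number: if the reservations exceed the
    total, there is simply nothing left for the heap.
    """
    return max(total_mb - gpu_mb - os_mb, 0)


# ASSUMPTION: 64 MB for the GPU and 64 MB for the OS are illustrative
# values only; real numbers depend on firmware settings and the distro.
available = heap_budget_mb(TOTAL_RAM_MB, gpu_mb=64, os_mb=64)

# How far the node falls short of the default heap size.
shortfall = DEFAULT_HADOOP_HEAP_MB - available

print(f"Heap available: {available} MB")
print(f"Short of the default heap by: {shortfall} MB")
```

Under these assumed reservations the node has roughly an eighth of the default heap to offer, which is why even small jobs are likely to fail.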

The JVM is also very slow on ARM. Red Hat plans to work with Linaro to improve this by Red Hat Enterprise Linux 7, but don't expect to run Java server workloads on ARM at a reasonable speed for a while.

Pierre Carrier