What type of distributed/parallel computing infrastructure you build depends a lot on the problem being worked on. The easiest workloads to distribute are those that subdivide cleanly: carve the problem set into 4 chunks, farm the chunks out to 4 machines, and stitch the results back together once processing is done. Workloads that are poor candidates for subdivision are those with strong dependencies on previously processed data or on data currently being processed elsewhere.
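As a rough illustration of that carve/farm/stitch pattern, here is a minimal single-machine sketch in C where fork() stands in for the remote machines and a pipe stands in for the network; the chunk count and the workload (summing a range of integers) are made up purely for illustration:

```c
/* Sketch of the "carve, farm out, stitch" pattern on one machine.
 * Each child sums one quarter of the range 1..N; the parent stitches
 * the partial sums back together.
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

#define NCHUNKS 4
#define N 1000000LL

int main(void)
{
    int fds[2];
    if (pipe(fds) != 0) { perror("pipe"); return 1; }

    for (int i = 0; i < NCHUNKS; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            /* Child: process chunk i (sum its slice of 1..N). */
            long long lo = (long long)i * (N / NCHUNKS) + 1;
            long long hi = (long long)(i + 1) * (N / NCHUNKS);
            long long partial = 0;
            for (long long x = lo; x <= hi; x++)
                partial += x;
            write(fds[1], &partial, sizeof partial);   /* report result */
            _exit(0);
        }
    }

    /* Parent: collect the partial results and stitch them together. */
    long long total = 0;
    for (int i = 0; i < NCHUNKS; i++) {
        long long partial;
        read(fds[0], &partial, sizeof partial);
        total += partial;
    }
    while (wait(NULL) > 0)
        ;
    printf("sum of 1..%lld = %lld\n", (long long)N, total);
    return 0;
}
```

In a real distributed setup the interesting (and hard) part is everything the pipe hides here: shipping the chunks to other machines, handling slow or dead workers, and merging results that arrive out of order.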
For workloads that can't be subdivided, your best bet is to look into one of the single-system-image (SSI) frameworks out there (see link for a list). These make multiple systems emulate a single larger system. Even then, care must be taken to design the processing so that inter-system communication is minimized. Setups like these are where low-latency interconnects such as InfiniBand really earn their keep.
For workloads that can be subdivided, you have a lot more options. Perhaps the largest framework is BOINC, which is designed around very high-latency work-unit reporting (hours, days, or even weeks). I've heard of private BOINC clusters out there as well.
One I used back in college is PVM (Parallel Virtual Machine). It's a C library (a Perl wrapper now exists as well) that enables message passing between systems over a variety of transports.
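To give a feel for the programming model, here is a minimal, hypothetical master-side sketch using the PVM 3 C API as I remember it; the "worker" binary name, message tags, and worker count are made up for illustration:

```c
/* master.c -- hypothetical PVM master/worker sketch.
 * Spawns NWORKERS copies of a "worker" binary, sends each a chunk index,
 * and collects one integer result from each. Compile with: cc master.c -lpvm3
 */
#include <stdio.h>
#include <pvm3.h>

#define NWORKERS   4
#define TAG_WORK   1
#define TAG_RESULT 2

int main(void)
{
    int tids[NWORKERS];
    pvm_mytid();                          /* enrol this process in the virtual machine */

    /* Spawn the worker executable on whatever hosts PVM picks. */
    int n = pvm_spawn("worker", NULL, PvmTaskDefault, "", NWORKERS, tids);
    if (n < NWORKERS) {
        fprintf(stderr, "only spawned %d of %d workers\n", n, NWORKERS);
        pvm_exit();
        return 1;
    }

    /* Scatter: send each worker the index of the chunk it should process. */
    for (int i = 0; i < NWORKERS; i++) {
        pvm_initsend(PvmDataDefault);
        pvm_pkint(&i, 1, 1);
        pvm_send(tids[i], TAG_WORK);
    }

    /* Gather: receive one partial result from each worker and combine. */
    int total = 0;
    for (int i = 0; i < NWORKERS; i++) {
        int partial;
        pvm_recv(-1, TAG_RESULT);         /* -1 = accept from any task */
        pvm_upkint(&partial, 1, 1);
        total += partial;
    }
    printf("combined result: %d\n", total);

    pvm_exit();                           /* leave the virtual machine */
    return 0;
}
```

The worker side would be the mirror image: receive the chunk index, do its share of the work, and send the partial result back with the result tag.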
Whatever you pick, you'll still have to redesign how your computation framework works. It'll be a lot of work, but at least you'll be able to throw more resources at your problems. It is vanishingly unlikely that you can just drop your existing code into a distributed computing framework and have it all work; just getting the distributed framework up and running will be a challenge in itself.