
I'm having some problems with a file-sharing storage configuration that we recently set up. We moved our video streaming service onto a NAS (1x24TB). We set up 9 front-end servers that communicate with the NAS over NFS, and we're running into problems: because there are so many file requests and only one physical drive, the NAS can't keep up, leaving us with I/O issues.

In any case, we've decided to change to a different configuration, because the NAS setup isn't working well for us. I guess it's not ideal for hundreds of random file downloads.

We're looking for an alternative solution. We thought of setting up two 4-server clusters and running Gluster on each, but I don't think that's the best solution. We also thought of setting up MogileFS on a cluster of 8 servers; however, we need to use nginx to serve files, and that can't be done with MogileFS.

Can anyone recommend ideas for a larger-scale file-sharing service that would provide some redundancy and scalability? I'm not an expert, and this is the first time I've tried configuring something this large.

Graham

4 Answers


Another configuration worth some consideration: have the 9 nginx servers use the proxy_cache feature to keep a copy of each file as they serve it, so that a second request for the same file is served from the server's local disk. You can limit the cache's size and how long files are stored. This will reduce the load on your NFS server and won't require you to replicate the data to each node.

Once the files are local, you can also use the OS's sendfile feature, which lets the kernel take over sending the data to the socket rather than having nginx do it. This greatly improves performance, but is not recommended over NFS. A sketch of both settings follows.
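
A minimal sketch of such a configuration, assuming hypothetical cache paths, zone names, and an origin address (adjust these to your layout):

```nginx
# Cache up to 200 GB of files on local disk; evict anything
# untouched for 7 days. The zone name "video_cache" is arbitrary.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=video_cache:50m
                 max_size=200g inactive=7d;

server {
    listen 80;

    # Serve local copies via the kernel's sendfile(2) -- fast for
    # local disk, but avoid it if the files were still on NFS.
    sendfile on;

    location /videos/ {
        proxy_pass        http://10.0.0.10;   # placeholder: the NFS-backed origin
        proxy_cache       video_cache;
        proxy_cache_key   $uri;
        proxy_cache_valid 200 7d;             # keep successful responses a week
    }
}
```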

Allan Jude

You might take a look at Lustre + DRBD. The first would probably provide the scalability you're looking for; I suggest DRBD purely for redundancy at the block-device level. I haven't tested this setup myself, though, so I can't say much about its performance. Let me know how it goes. A rough example of a DRBD resource definition follows.
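
For reference, a minimal DRBD resource definition looks roughly like this (hostnames, devices, and addresses here are placeholders, not taken from your setup):

```
# /etc/drbd.d/r0.res -- mirror one block device between two nodes
resource r0 {
    protocol C;                     # synchronous replication
    on node1 {
        device    /dev/drbd0;       # the replicated device you format/mount
        disk      /dev/sdb1;        # local backing disk
        address   192.168.1.1:7789;
        meta-disk internal;
    }
    on node2 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   192.168.1.2:7789;
        meta-disk internal;
    }
}
```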

Sergio Galvan

The title of your question has little to do with what you are asking for.

Using a shared filesystem, you're going to run into scalability problems. You can defer the point at which this becomes an issue by using a SAN (even if it's just 10Gb iSCSI), but for your application, where there are very few write operations and (presumably) updates only need to be near-realtime rather than atomic, I'd recommend going with either a distributed peer-to-peer or a replicated filesystem.

AFAIK, there are limited off-the-shelf options for peer-to-peer filesystems, but it would be trivial to implement this in your application.

Replicated filesystems are a much more mature technology; AFS comes as standard in Linux these days.

symcbean

Take a look at GridFS on MongoDB, and tell me if you ever get it into a "disk full" or "out of resources" state.
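
For context, here's a rough sketch of storing and streaming a file through GridFS with pymongo (the connection string, database, and file names are made up for illustration):

```python
# Sketch only: GridFS splits large files into chunks inside MongoDB,
# so capacity and redundancy follow the database deployment itself.
from pymongo import MongoClient
import gridfs

client = MongoClient("mongodb://localhost:27017")  # placeholder address
db = client.media
fs = gridfs.GridFS(db)

# Store a video; GridFS chunks it across the fs.files/fs.chunks collections.
with open("clip.mp4", "rb") as f:
    file_id = fs.put(f, filename="clip.mp4")

# Stream it back out in chunks rather than reading it all into memory.
grid_out = fs.get(file_id)
for chunk in iter(lambda: grid_out.read(255 * 1024), b""):
    pass  # hand each chunk to the response socket here
```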

mailq
  • Mailq, are you suggesting GridFS? I've been put in charge of doing the digging, as our system admin is away at the moment... so please keep that in mind when asking questions, as I'm limited in what I know. – Graham Oct 26 '11 at 23:28