
I'm building an application that needs to distribute a standard file server across a few sites over a WAN. Basically, each site needs to write a lot of miscellaneous files of varying size (some in the 100s-of-MB range, but most small), and the application is written such that collisions aren't a problem. I'd like to have a system set up that meets the following qualifications:

  1. Each site can store files in a shared "namespace". That is, all the files would show up in the same filesystem.
  2. Each site would not send data over the WAN unless necessary. I.e., there would be local storage on each side of the WAN that would be "merged" into the same logical filesystem.
  3. Linux & free ($$$) is a plus

Basically, something like a central NFS share would meet most of these requirements; however, it would not allow the locally written data to stay local. All data from the remote sides of the WAN would be copied over the link all the time.

I have looked into Lustre and have run some successful tests with it; however, it appears to distribute files fairly uniformly across the distributed storage. I have dug through the documentation and have not found anything that will automatically "prefer" local storage over remote storage. Even something that chose the lowest-latency storage would be fine: it would work most of the time, which would meet this application's requirements.


Some answers to some questions asked below:

  • Server nodes: 2 or 3 to start. Each server would have dozens of simultaneous read/write clients connecting.
  • WAN Topology is full mesh and reliable. (large corporation, cost isn't as limiting as red tape)
  • Client failover: I actually hadn't thought about having the clients fail over (mostly because our current application doesn't do this at just one site). I suppose the practical answer is that the servers at each geographically distributed site are expected to be single points of failure for the clients they are serving. Though, if you are thinking about something specific here, I think it would be quite germane to the discussion.
  • Roll-my-own: I have thought about rsync/unison; however, I would need quite a bit of fancy logic to make the "dynamic" part of this work seamlessly, i.e., a file appears to be local but is only retrieved on demand.
  • MS-DFS: It certainly appears to be something I should look into. My main concern would be uncertainty about NFS server configuration/reliability/performance on Windows, as many of the connecting clients are NFS clients.
dpb

9 Answers


Shame about the Linux requirement. This is exactly what Windows DFS does. Since 2003 R2, it does it on a block-level basis, too.

Chris Thorpe
  • Chris, thanks for the answer. I think DFS is pretty much what I'm looking for, though on Windows. Certainly something for me to look into. – dpb Mar 25 '10 at 01:47
  • DFS does not work on a block-level basis. The replication service is non-transactional and operates on a per-file basis. – eckes Nov 28 '14 at 20:49

Some questions:

  • How many "server" nodes are you thinking about having participate in this thing?

  • What's the WAN connectivity topology like: hub-and-spoke, full mesh? How reliable is it?

  • Do you expect clients to failover to a geographically non-local server in the event the local server fails?

Windows DFS-R certainly would do what you're looking for, albeit at some potentially hefty licensing cost.

You say that collisions aren't a problem and you don't need a distributed lock manager, so you could do this with userland tools like rsync or Unison and just export the resulting corpus of files with NFS to the local clients. It's ugly, and you'd have to knock together some kind of system to generate a replication topology and actually run the userland tools, but it would certainly be cheap as licensing goes; a rough sketch follows.
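
For illustration only, a minimal sketch of the push side, assuming a share rooted at /srv/shared and two peer servers (all paths and hostnames here are made up); Unison would take the place of rsync if you want bidirectional propagation:

    #!/bin/sh
    # Hypothetical push replication, run from cron on each site's server.
    # Collisions are assumed not to matter, per the question.
    SRC=/srv/shared/
    for PEER in siteb-fs sitec-fs; do
        rsync -az --partial "$SRC" "$PEER:$SRC" \
            || echo "replication to $PEER failed" >&2
    done

Each server would then export /srv/shared over NFS to its local clients as usual.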

Evan Anderson
  • Thanks for the answer, Evan; I have updated my question with the data you were asking for. I'm interested in your unison/rsync idea but don't quite see how the dynamic aspect would be handled. (I don't have a lot of experience with Unison, only rsync.) – dpb Mar 25 '10 at 01:45
  • @dpb: I wasn't getting the sense of that requirement from your original edit. Microsoft DFS-R won't do that, either. The on-demand retrieval behaviour is going to require something "active" in the filesystem to intercept read requests for file stubs that don't have their local data cached, go get the data, and fulfill the read. I'm not aware of any geographically distributed filesystem with that behaviour; that's more like an HSM. – Evan Anderson Mar 25 '10 at 11:48
  • For those as clueless as me: http://en.wikipedia.org/wiki/Hierarchical_storage_management. Thanks again @Evan. I'm not nearly as interested in rearranging the underlying storage location in a dynamic way as choosing it initially in a dynamic way. I think HSM sounds very cool, but the cool part of it is pretty overkill for what I'm doing. – dpb Mar 26 '10 at 15:17

Have you considered AFS?

The Andrew File System (AFS) is a distributed networked file system which uses a set of trusted servers to present a homogeneous, location-transparent file name space to all the client workstations.

As I understand it, most of the recent development has been behind the OpenAFS project.

I can't pretend to be familiar enough with the project to know if the "preferred locality" feature is available, but otherwise it sounds like a good fit.

Insyte

Have you looked at OST pools in Lustre?

It won't be automatic, but with OST pools you can assign directories/files to specific OSTs/OSSes: basically policy-based storage allocation, rather than the default round-robin striping across OSTs.

So you could set up a directory per site and assign that directory to the local OSTs for that site, which will direct all I/O to the local OSTs while still presenting a single global namespace; see the sketch below.
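
As a rough sketch of how that might look (the filesystem name, pool name, and OST indices are invented for the example):

    # On the MGS: create a pool for site A and add that site's OSTs to it
    lctl pool_new wanfs.siteA
    lctl pool_add wanfs.siteA wanfs-OST[0000-0003]

    # On a client: direct new files under site A's directory to its pool
    lfs setstripe --pool siteA /mnt/wanfs/siteA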

There's a lot of work going into improving Lustre over WAN connections (local caching servers and things like that) but it's all still under heavy development AFAIK.

James
  • Thanks @James, That is almost exactly what I'm looking for. I'm not keen on the munged namespace at the top level (assign particular directories to an OST pool), but perhaps that would be OK. It's at least good to know what the use case and limitation is in Lustre. Thanks again! – dpb Mar 26 '10 at 15:20

Maybe NFS with CacheFS on the application servers would accomplish part of your goal. As I understand it, everything written would still go to the central server, but at least reads could end up being cached locally. Depending on your usage patterns, this could take a lot of the delay off of reads; something like the setup below.
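
On a recent Linux client this would be FS-Cache plus the fsc mount option; a minimal sketch, with the server name and export path made up:

    # Install and start the local cache daemon (package name varies by distro)
    apt-get install cachefilesd
    /etc/init.d/cachefilesd start

    # Mount the central export with 'fsc' so reads are cached on local disk
    mount -t nfs -o fsc,rw central-fs:/export/shared /mnt/shared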

Also, maybe UnionFS is worth looking into. With this, I think each location would be an NFS export, and you could then use UnionFS at each location to make its local storage and all the other locations' NFS mounts appear as one filesystem. I don't have experience with this, though; a hypothetical mount is sketched below.
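
Purely hypothetical, assuming a unionfs-capable kernel and that the other sites' exports are already NFS-mounted read-only; local writes would land in the first (rw) branch:

    # Merge the local branch with two remote NFS mounts into one view
    mount -t unionfs -o dirs=/srv/local=rw:/mnt/siteB=ro:/mnt/siteC=ro \
        none /mnt/merged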

Kyle Brandt
  • Thanks @Kyle, I didn't know about UnionFS, along with aggressive caching, NFS could be a good solution for this. I'm thinking that it could get to be more trouble to maintain as the number of locations grew, but I'm going to look into it before I decide. – dpb Mar 26 '10 at 15:37

You could look into DRBD to replicate the disks: http://www.drbd.org/. It is a Linux high-availability solution that has just recently made it into the kernel.

However, it has some limitations (a minimal config sketch follows the list):

  1. Only two nodes can be set up
  2. WAN might be too unreliable to keep DRBD robust.
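
For what it's worth, a minimal two-node resource might look like the sketch below (hostnames, addresses, and devices are placeholders); protocol A is DRBD's asynchronous mode, which is the usual choice over a WAN link:

    resource r0 {
        protocol A;              # asynchronous replication, suits WAN links
        device    /dev/drbd0;
        disk      /dev/sdb1;     # placeholder backing device
        meta-disk internal;
        on sitea-fs { address 10.0.1.10:7788; }
        on siteb-fs { address 10.0.2.10:7788; }
    }
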
  • Interesting idea, however I don't think it would give my application anything over other distributed filesystems. (lustre, glusterfs, etc). Thanks for posting... – dpb Mar 25 '10 at 01:49

If you want to keep it simple, then have a look at rsync; it solves a lot of problems and can be scripted.

The Unix Janitor

Check out chironfs.

Maybe it can do what you want, on a filesystem basis.

Dom

Btsync is another solution that I've had good experience with. It uses the BitTorrent protocol to transfer the files, so the more servers you have, the faster it is at synchronizing new files.

Unlike rsync-based solutions, it detects when you rename files/folders and renames them on all the nodes instead of doing a delete and re-copy.

Your btsync clients can then share the folders on a local network.

The only downside I found (compared to MS DFS) is that it will not detect a local file copy; instead it will treat the copy as a new file and upload it to all the peers.

So far btsync seems to be the best synchronization solution, and it can be installed on Windows, Linux, Android, and ARM devices (e.g., NAS boxes).

Alex G