
Does anyone have a good suggestion for doing a forward sync to HDFS? (By "forward sync" I mean a one-way sync, in contrast to a bi-directional sync.)

Basically, I have a large number of files I want to put into HDFS. The set is so large that I will often lose connectivity, say, before the upload finishes. What I would like to do is simply resume the upload. However, hadoop fs -put will just upload the whole directory again (or complain if it already exists).

Does anyone have a good way to continue an incomplete HDFS upload?

Nate Murray

1 Answer


If you're running a new enough Hadoop, you could mount HDFS using FUSE and just use rsync.

It might also be possible to build a local-only HDFS and then use distcp.
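
If neither of those is an option, a third approach is to script the resume yourself. What follows is only a minimal sketch, not something tested against a real cluster: it walks the local directory, asks HDFS whether each file already exists, and uploads only the missing ones. It assumes the hadoop command is on the PATH and that the version is recent enough for hadoop fs -mkdir -p (Hadoop 2 or later). The function names resume_upload and remote_exists are made up for illustration. It also only checks for existence, so a file left half-written by an interrupted upload on an older Hadoop would be skipped rather than repaired, and it starts one hadoop process per file, which is slow for many small files.

```python
import os
import posixpath
import subprocess


def remote_exists(remote_path, run=subprocess.call):
    """Return True if remote_path exists in HDFS.

    `hadoop fs -test -e` exits with status 0 when the path exists.
    """
    return run(["hadoop", "fs", "-test", "-e", remote_path]) == 0


def resume_upload(local_root, remote_root, run=subprocess.call):
    """Upload files under local_root that are not yet under remote_root.

    `run` is any callable that takes a command list and returns an exit
    status; it defaults to subprocess.call and is injectable for testing.
    Returns the list of HDFS paths uploaded by this call.
    """
    uploaded = []
    for dirpath, dirnames, filenames in os.walk(local_root):
        dirnames.sort()  # walk subdirectories in a stable order
        rel_dir = os.path.relpath(dirpath, local_root)
        if rel_dir == ".":
            remote_dir = remote_root
        else:
            remote_dir = posixpath.join(remote_root, rel_dir.replace(os.sep, "/"))
        # Assumption: `-mkdir -p` is available (Hadoop 2+).
        run(["hadoop", "fs", "-mkdir", "-p", remote_dir])
        for name in sorted(filenames):
            remote_path = posixpath.join(remote_dir, name)
            if remote_exists(remote_path, run):
                continue  # already uploaded on an earlier attempt
            local_path = os.path.join(dirpath, name)
            status = run(["hadoop", "fs", "-put", local_path, remote_path])
            if status != 0:
                raise RuntimeError("upload failed for " + local_path)
            uploaded.append(remote_path)
    return uploaded
```

Calling resume_upload("/local/data", "/user/me/data") again after a dropped connection would then skip whatever is already in HDFS; both paths here are placeholders.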

Robert Novak