"Input/output errors" using encfs folder inside Dropbox folder

I have a 200 gig Encfs encrypted file system living in my Dropbox and being accessed by multiple machines, and I've never had any problems with it until now.

I moved about 10 gigs of data around on one (Ubuntu) computer X, and 2 days later, when the sync had finished on another (Ubuntu) computer Y, there were some problems: some of the files could not be read on Y and gave me Input/output errors, e.g.

$ file myfile.txt
myfile.txt: ERROR: cannot read `myfile.txt' (Input/output error)

So somehow the file system has been corrupted. All the files can be read fine on computer X. I've run into about 20 files with this property; there could be more. In a directory typically only a few files will fail with this error, and many more will be fine.

I also have the system running on a Windows machine Z; I looked at the files in Z and I also got IO errors (although the Windows error messages were rather more cryptic). So in some sense the problem is almost certainly "at X's end".

I have managed to navigate to a directory in the actual encrypted Dropbox directory which corresponds to a directory where the input/output errors are occurring. All the (encrypted) files can be read fine, so the problem doesn't seem to be an actual IO error with the physical disc, the problem seems to be with encfs.
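
(In case it helps anyone trying the same thing: encfsctl can translate names between the two worlds, so you don't have to guess which encrypted directory corresponds to which plaintext one. A sketch, with ~/Dropbox/.encrypted standing in for your own encrypted root and the file paths as placeholders:)

# print the encrypted name corresponding to a plaintext path
encfsctl encode ~/Dropbox/.encrypted some/dir/myfile.txt
# and the reverse: recover the plaintext name from an encrypted one
encfsctl decode ~/Dropbox/.encrypted <encrypted name>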

I have all the data backed up and I could simply delete it all and rewrite it, but the non-corrupted copy is on a system that has a very slow upload speed (it's in my home) and it took 2 days to sync; I'm reluctant to restart (not because I don't have 2 days, but because I don't want to basically make my home internet sluggish for 2 days).

Google has not led me to anything. I am at a loss to know what to do next, short of "restart and try again" which as I say I'm currently hoping to avoid. I don't really understand how a file system can be stored in a directory so I don't know how to start debugging the problem.

If I do have to restart, can someone tell me a nice way to check which files in a directory have IO errors? Edit: in the end I used a horrible way -- running file on each file via find, then hacking my way to a list of bad files using grep and emacs, a method which won't work if any files are called things like "output error" :-)
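
(For completeness, here is a more robust sketch that avoids parsing file's output altogether: it reads every file in full and records the names of those that fail. NUL separators between find and the loop mean spaces or strings like "output error" in filenames can't confuse the check; the final list is newline-separated, so it is still imperfect for names containing newlines. The paths are placeholders.)

find /path/to/decrypted -type f -print0 |
while IFS= read -r -d '' f; do
    # reading the whole file forces any I/O error to surface
    cat -- "$f" >/dev/null 2>&1 || printf '%s\n' "$f"
done > ~/bad_files.txt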

EDIT (one year later): I have lived with this issue for over a year now. I have been using malte's workaround. However last week, for the first time ever, I actually lost data. I made substantial changes in an encfs directory, I did nothing weird other than moving data around, and then my nightly script (which, I might add, takes over an hour to run with a lot of disc reading, every night, on both the Ubuntu machines where I have Dropbox and Encfs running) told me that certain files were giving I/O errors at both ends. I had to restore the files using Dropbox's "restore deleted files" functionality, which was a pain because of course all the filenames are encrypted, so I had to use encfsctl etc.
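
(For anyone facing the same restore job: encfsctl is what I used to map the encrypted names Dropbox shows you back to plaintext names, along these lines, with the root path a placeholder for your own encrypted directory:)

# translate an encrypted filename back to its plaintext name
encfsctl decode ~/Dropbox/.encrypted <encrypted name>
# list the plaintext names in the volume root without mounting it
encfsctl ls ~/Dropbox/.encrypted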

This prompted me into action. So I bit the bullet and set up a second Encfs directory, this time with different global settings. (I do not know how to change these settings in an existing encfs directory, and I am pretty sure it's impossible, so the only way I could do this, as far as I could see, was to copy the now 300 gigs from one directory to another. I had to do this now because when I get up to 500 gigs I won't be able to store two copies in my Dropbox, which has a limit of 1000 gigs.)

So what did I do? I set up another encrypted file storage system using no filename initialisation vector chaining, no per-file initialisation vectors and no external IV chaining. Yes I know this is less secure! Yes I know this doesn't work for everyone! Yes I even know that a security audit on Encfs came to the conclusion that I should not store 100,000 userids, passwords and credit card details using Encfs! But this is not what I am using encfs for. All I want to do is to use Dropbox but to ensure that if Dropbox is hacked, or there is a disgruntled Dropbox employee who leaks data, then my data is not the stuff being sold on. I do not have munitions-grade secrets here, I just have photos of my family and work-related stuff like references which I don't want to be randomly leaked.
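
(For the record, these settings can only be chosen when a volume is created, in encfs's expert configuration mode. Roughly, and with placeholder paths -- the exact prompt wording varies between versions:)

encfs ~/Dropbox/.encrypted2 ~/Private2
# answer "x" at the first prompt to get expert configuration mode,
# then answer the questions to choose the cipher and filename
# encoding, and to say no to per-file IVs, filename IV chaining
# and external IV chaining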

While I am here, let me mention some other links that I have found in the last year which may, or may not, be relevant to this issue. I do not understand enough about how FUSE works to know. But given that this is my question and this has been a major problem for me for a year now, I thought I would use it as a personal collection of what I have discovered about this and possibly related issues.

https://stackoverflow.com/questions/24966676/transport-endpoint-is-not-connected

https://github.com/vdudouyt/mhddfs-nosegfault

https://github.com/vgough/encfs/issues/109

And also the suggestion to use fsck on the encfs directory.
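(I believe newer versions of encfs -- 1.8 and later, if I remember correctly; my 1.7.4 did not have it -- ship such a checker as a command-line flag, invoked against the encrypted directory, something like:)

encfs --fsck ~/Dropbox/.encrypted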

I am not enough of an expert to know whether any of these are relevant. What I do know is that as of yesterday I have "started again" with Encfs, and I will report back in a couple of months about whether this has fixed the problem for me.

UPDATE Two years later I can now confidently state that changing these Encfs file settings has fixed the problem, at the cost of possibly weakening my security. I've had no I/O errors since I made these changes in my set-up.

eric

Posted 2015-08-01T20:17:34.703

Reputation: 187

EncFS has some logging that could be helpful, and possibly look into hard drive bad sectors or possible failure on the problem computer ("Y" maybe?) – Xen2050 – 2015-10-01T12:59:59.817

I still live in fear of this problem :-( and am still none the wiser of what causes it. – eric – 2015-12-23T14:03:02.617

@eric I find it frustrating too, hence the bounty. Still not sure how to fix. – Andrew Ferrier – 2015-12-26T17:43:19.240

Due to security vulnerabilities found in the current version, using Encfs in a Dropbox is not secure. See https://www.cryfs.org/comparison#encfs for details. – Heinzi – 2016-02-13T18:09:47.097

@Heinzi: that might be true, but this is not the question. I don't want my data to be munitions grade secure, I just want it to be much less interesting to a random person who hacked Dropbox and wants to sell stuff on, than all the unencrypted stuff. – eric – 2016-10-25T21:40:04.063

Answers

If you are running encfs in "maximum security" mode, or you have enabled "filename to IV header chaining", it will break on any Dropbox-like service. Don't enable it. Actually, don't ever use it; it's just plain stupid to rely upon the file path for the file data encryption IV.

I would use "stream" filename encoding and enable only the "per-file initialization vectors" and "file holes passed through to ciphertext" features, to make encfs reliable.

And don't listen to the guys who say that encfs is vulnerable to watermarking attacks. Of course it is, due to its nature. Just don't put files with recognizable patterns, like ripped CDs, in there.

This would be the correct encfs setup. Only unique IV per file and sparse file support are enabled.

Version 6 configuration; created by EncFS 1.7.4 (revision 20100713)
Filesystem cipher: "ssl/aes", version 3:0:0 (using 3:0:2)
Filename encoding: "nameio/stream", version 2:1:0 (using 2:1:2)
Key Size: 256 bits
Using PBKDF2, with 206833 iterations
Salt Size: 160 bits
Block Size: 1024 bytes
Each file contains 8 byte header with unique IV data.
File holes passed through to ciphertext.
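
You can check what an existing volume uses by pointing encfsctl at the ciphertext directory (the path below is just an example):

encfsctl info ~/Dropbox/.encrypted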

Dilyin

Posted 2015-08-01T20:17:34.703

Reputation: 154

You clearly have no understanding of 'Maximum security mode' @eric; consider a service like SpiderOak or Own/NextCloud, which natively has this support for the ultra-secure types among us. – linuxdev2013 – 2016-07-09T12:15:06.233

@linuxdev2013: I don't know if your comment is supposed to be directed at Dilyin or me. It is true that when I wrote the question I had no understanding of the options available to me when starting a new encfs file system; now I do, and I believe I've chosen options in my new set-up which are far more Dropbox-friendly and, strictly speaking, less encryption-fascist-friendly; however, this currently does not bother me. – eric – 2016-10-25T21:42:40.803

I have the exact same problem, it also just started a couple of weeks ago. Just to make this more complete:

  • Moving files out & in again does fix the symptoms
  • All my machines are Ubuntu, so it can't be Windows-related
  • I have three machines in the sync group, and the problem occurs on at least two of them. See below for an extended script so that each machine can a) list its errors and b) try fixing those of the others

Find corrupt files:

saveFile="$(hostname)-corruptFiles"
dir="."   # adjust to the root of the decrypted mount
find "$dir" -type f -exec file '{}' \; | grep "output error" > /tmp/corruptFilesRaw.txt
# keep everything before the first colon, i.e. the filename
awk -F ':' '{print $1}' /tmp/corruptFilesRaw.txt > "$saveFile"

Fix corrupt files:

# fileList: the list of corrupt files saved by another machine
# (hypothetical name; point it at that machine's output file)
fileList="otherhostname-corruptFiles"

while read -r i <&3; do
    # check if the file is corrupted on this machine as well
    # (note: some versions of file(1) exit 0 even on read errors;
    # grepping its output for "output error" is an alternative)
    file "$i" >/dev/null 2>&1
    retcode=$?
    if [ $retcode -eq 0 ]; then
        # if not, fix it: move it out of the volume and back in
        mv "$i" /tmp/crap
        sleep 5
        mv /tmp/crap "$i"
        sleep 1
    else
        # if it is corrupt here as well, skip it
        echo "$i" >> /tmp/remainingCorruptedFiles
    fi
done 3<"$fileList"

# replace file list with list of remaining corrupt files
rm "$fileList"
mv /tmp/remainingCorruptedFiles "$fileList"

I have these two scripts in the decrypted folder's root, so both the scripts and the lists of corrupted files are synchronized between all machines.

malte

Posted 2015-08-01T20:17:34.703

Reputation: 61

This is a much better approach than mine (although I would be scared to run it on more than one machine at once perhaps) – eric – 2015-10-02T19:48:47.817

Update: if your experience is anything like mine, then this will work fine until the day it doesn't work, and then you just lost some data (you have I/O errors on every machine). – eric – 2016-10-25T21:43:51.087

OK so I wanted to get this sorted out today, so here's what I did. YMMV.

Note: I did not ever find out what had caused the problem. But testing indicated that if I found a file on computer Y with an I/O error then taking the file on computer X, moving it out of the file system and back in again, would fix the problem. I don't really like this solution because there's an underlying problem which might presumably bite me again, but I don't know how to diagnose the underlying problem.

OK so first I backed up everything on computer X.

Secondly I ran (in the directory where all the problems were on Y)

$ find . -exec file '{}' \; | grep "output error" > ~/io_problems.txt

[some of my filenames had spaces in but none had newlines or anything else like that]

I ran wc on io_problems.txt and found that I had just over 2000 lines in that file, and hence just over 2000 I/O errors in my system. Ouch.

Then I used a short emacs macro to edit io_problems.txt: in each line I located the string : ERROR: cannot read and just deleted all the rest of the line starting from the colon. I did this by typing (in emacs) C-x ( C-s : ERROR: cannot read [now press the left arrow key to get back to the first colon] C-k [right arrow key] C-x ) C-u 2500 C-x e. I'm sure I could have used sed or awk or whatever, I'm just more used to emacs. I renamed the resulting file list.txt.
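
(For reference, the sed equivalent would presumably be something like the following one-liner, which deletes everything from the first ": ERROR: cannot read" to the end of each line; it has the same weakness with pathological filenames as the emacs method:)

sed 's/: ERROR: cannot read.*$//' ~/io_problems.txt > ~/list.txt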

So far I am left with a file list.txt that contains a list of filenames (which might have spaces in) which are problematic on Y.

Now for the big moment: I need to loop through this list of files and for each file move it out of the file system and back in again. The filenames might contain spaces. So I use a file descriptor for the loop.

while read -r i <&3; do
    # move the file out of the encfs volume and back in again
    mv "$i" ~/crap
    sleep 5
    mv ~/crap "$i"
    sleep 5
done 3<~/list.txt

The sleep is so that I don't overwhelm Dropbox, which is what somehow caused the original problems (although I do not believe that the problem is with Dropbox itself; I did extensive testing with the encrypted files and couldn't find any differences between the files at X and Y; my ignorance of encfs/fuse prevented me from doing any more rigorous testing to actually find what the problem was).

2000 files and 10 seconds per file means that the entire operation will take over 5 hours. This works for me.

I'm currently waiting for this loop to terminate but preliminary tests seem to indicate that the problem is being solved slowly but surely.

eric

Posted 2015-08-01T20:17:34.703

Reputation: 187

I have the same problem and made the following observation: If a file has a problem on a machine, it is never on the machine I added the file from. I can read the file perfectly fine on the 'source' machine, but it will report 'Input/output error' on one or more of my other machines. – NZD – 2017-04-22T03:56:18.547