
Some time ago I developed a simple script to detect duplicated files. It works in the following way:

  • Locates Excel files in a folder.
  • Computes the SHA-256 of each file.
  • Stores a mapping from filename to SHA-256.
  • Flags the file as read-only.
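For readers who want something concrete, the steps above could look roughly like the sketch below. It is not the original script, which is not shown in the question; the function names (`sha256_of`, `index_excel_files`, `find_duplicates`) and the choice of `.xlsx`/`.xls` extensions are assumptions made for illustration.

```python
import hashlib
import os
import stat
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 of the file's bytes, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_excel_files(folder: Path) -> dict[str, str]:
    """Hash every .xlsx/.xls file in `folder`, then mark it read-only."""
    hashes: dict[str, str] = {}
    for path in sorted(folder.iterdir()):
        # Assumption: "Excel files" means these two extensions.
        if path.is_file() and path.suffix.lower() in {".xlsx", ".xls"}:
            hashes[path.name] = sha256_of(path)
            # Clears the write bits; on Windows this sets the read-only attribute.
            os.chmod(path, stat.S_IREAD)
    return hashes


def find_duplicates(hashes: dict[str, str]) -> dict[str, list[str]]:
    """Group filenames by digest and keep only digests seen more than once."""
    groups: dict[str, list[str]] = {}
    for name, digest in hashes.items():
        groups.setdefault(digest, []).append(name)
    return {d: names for d, names in groups.items() if len(names) > 1}
```

Note that the digest is computed from the file's bytes only, so the read-only flag set afterwards is not part of what gets hashed.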

Now, I am well aware that Excel stores metadata in its own structure, so opening and saving an Excel file even if no modification has been made will modify the SHA-256 for this file.

My question is: if I flag the file as read-only after obtaining its SHA-256, is it possible to obtain a different SHA-256 while this flag is enabled?

I have noticed that I'm getting different SHA-256 hashes for files flagged as read-only, and I would like to understand if this is even possible.

S.L. Barth
Jausk

2 Answers


There are two scenarios that would explain why your file can still be changed, despite it being write protected:

  1. If you start your program (Excel) as a user who is allowed to change the file's permissions, most notably its write permission, then the program could simply re-enable write permission without telling you. That's generally a bad idea, but maybe Excel is dumb. LibreOffice Calc asks whether you want to modify the write-permission flag; I'd assume Excel would, too.

  2. If your program DELETES the file and recreates a file with the same name, it will look as if the file changed, even though it was actually removed and replaced by a different file that just happens to look mostly identical. This is possible because denying write permission on a file does not prevent deleting it; on *nix, deletion is governed by write permission on the containing directory.
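The second scenario can be reproduced with a minimal sketch like the one below. It assumes a POSIX-style file system, where replacing a file only needs write permission on the directory; on Windows the read-only attribute can block this call, so treat the result there as untested. The helper names (`replace_read_only`, `demo`) and the file name `Report.xlsx` are hypothetical.

```python
import hashlib
import os
import stat
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def replace_read_only(target: Path, new_bytes: bytes) -> None:
    """Swap in new content without ever opening `target` for writing."""
    # Write the new content to a scratch file in the same directory...
    fd, tmp_name = tempfile.mkstemp(dir=target.parent)
    with os.fdopen(fd, "wb") as handle:
        handle.write(new_bytes)
    # ...then rename it over the read-only file.
    os.replace(tmp_name, target)


def demo() -> tuple[bool, bool]:
    """Return (hash changed, underlying file was replaced)."""
    with tempfile.TemporaryDirectory() as folder:
        target = Path(folder) / "Report.xlsx"
        target.write_bytes(b"original bytes")
        os.chmod(target, stat.S_IREAD)  # flag as read-only
        before_hash, before_id = sha256_of(target), os.stat(target).st_ino
        replace_read_only(target, b"re-saved bytes")
        after_hash, after_id = sha256_of(target), os.stat(target).st_ino
        return before_hash != after_hash, before_id != after_id
```

Recording the inode number (or the creation time on Windows) next to each hash is one way to tell "same file, modified" apart from "different file, same name".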

Out of Band

When you say "flag the file as read-only", do you mean changing the attributes/permissions of the file? If so, I'll address that.

Attributes (Windows)/Permissions (*nix) are "meta" information on the file system. In NTFS, this data is stored in the Master File Table (MFT). In ext3, this data would be stored in the inode. This means that renaming the file, moving the file, or changing permissions of the file should not change the contents of the file, which is what is hashed.

You could always run a quick test to confirm:

>rem Sum the file first.
>sha256sum ExcelFile.xlsx
4bb7303b56a728665f639c36ffdc6169ac4debd774a0e9bedd27ca15b451c8ad *ExcelFile.xlsx

>rem Check the attributes of the file.
>attrib ExcelFile.xlsx
A            C:\Users\User\Documents\ExcelFile.xlsx

>rem Add the read-only attribute.
>attrib +r ExcelFile.xlsx

>rem Sum the file again.
>sha256sum ExcelFile.xlsx
4bb7303b56a728665f639c36ffdc6169ac4debd774a0e9bedd27ca15b451c8ad *ExcelFile.xlsx

>rem Rename the file.
>ren ExcelFile.xlsx ExcelFile.xlsx2

>rem Sum the file again.
>sha256sum ExcelFile.xlsx2
4bb7303b56a728665f639c36ffdc6169ac4debd774a0e9bedd27ca15b451c8ad *ExcelFile.xlsx2

>rem Check the attributes of the file again.
>attrib ExcelFile.xlsx2
A    R       C:\Users\User\Documents\ExcelFile.xlsx2

>rem Remove the read-only attribute.
>attrib -r ExcelFile.xlsx2

>rem Rename the file again.
>ren ExcelFile.xlsx2 ExcelFile.xlsx

>rem Sum the file again.
>sha256sum ExcelFile.xlsx
4bb7303b56a728665f639c36ffdc6169ac4debd774a0e9bedd27ca15b451c8ad *ExcelFile.xlsx

>
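The same check can be scripted so it is easy to repeat on another machine. This is only a sketch that works on a scratch file in a temporary directory, not on a real workbook; the function name `digests_across_metadata_changes` is made up for the example, and `os.chmod` stands in for `attrib`.

```python
import hashlib
import os
import stat
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def digests_across_metadata_changes(data: bytes) -> list[str]:
    """Hash a scratch file before and after read-only, rename, and restore."""
    with tempfile.TemporaryDirectory() as folder:
        path = Path(folder) / "ExcelFile.xlsx"
        path.write_bytes(data)
        digests = [sha256_of(path)]

        os.chmod(path, stat.S_IREAD)  # roughly: attrib +r
        digests.append(sha256_of(path))

        renamed = path.with_name("ExcelFile.xlsx2")
        os.rename(path, renamed)  # roughly: ren
        digests.append(sha256_of(renamed))

        os.chmod(renamed, stat.S_IREAD | stat.S_IWRITE)  # roughly: attrib -r
        os.rename(renamed, path)
        digests.append(sha256_of(path))
        return digests
```

If all four digests are equal, the metadata changes did not touch the hashed content, which matches the session above.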

I also opened the file in Excel while it was set to read-only (the title bar showed [Read Only]). While the file was open, I ran sha256sum again and the hash was still unchanged. When I tried to save changes to the file, Excel prompted me for a new file name.

Damian T.