How are the file metadata stored in Windows?

12

3

(I'm using Windows XP but I guess it's similar in all the recent versions of Windows.)

When you create, for example, a new empty text document, you'll find in its properties that it has a size of 0 bytes. Zero bytes means no information. No data.
But still, the file has a name, and it can carry the dates of last access, modification and creation. It carries the information whether it's a hidden file or not, whether it's read-only or not...

So where are all the metadata stored?

Jeyekomon

Posted 2013-10-13T20:41:11.127

Reputation: 269

1

For example: http://ntfs.com/ntfs_basics.htm

– Koray Tugay – 2015-02-25T19:13:08.997

there is no magic here. Read these answers here: http://stackoverflow.com/questions/4954991/are-0-bytes-files-really-0-bytes

– HighTechGeek – 2013-10-13T20:52:14.420

A long time ago I remember I used to have a kind of virus that somehow corrupted a couple of files in my PC so that they appeared to be about 100GB in size. Each of them. On my 40GB harddisc. So there must've been some kind of magic... :-D – Jeyekomon – 2013-10-13T21:16:02.227

Answers

11

You've been taught that hard disks contain files, but that's not the entire truth. Actually, a hard drive contains one very, very big number expressed as a long sequence of single bits. But this interpretation doesn't make any sense to you or your computer, because processing single big numbers isn't very common (and I'm talking about REALLY HUGE numbers). Instead, the computer splits it into smaller 'words' (8-bit, 16-bit, 32-bit or whatever) and works with those. Still, that's just a bunch of words (let's assume 8-bit words, i.e. bytes).

Now, that drive is partitioned. I have explained why partitioning is a good idea in this answer:

Generally speaking, drives can be used without partitioning. Most pendrives work like that. But using partitions has many advantages, just to name some of them:

  • You can have two OSes sitting on the same hard drive and not interfering with each other. Each one will treat its partition as a logical drive and won't mess with other ones unless you tell it to.
  • You can logically separate your data. If one partition becomes corrupted for some reason, other partitions will very likely remain intact.
  • Using partitions is better than using multiple smaller hard drives, because your system is quieter, consumes less energy and you can resize, delete, move them around etc.
  • You can use some parts of the hard drive for some special purposes.

Now, every partition has its own filesystem. Modern versions of Windows use NTFS, but FAT, FAT32 and exFAT are supported for external media or legacy partitions. Everyday-use Linux installations usually use ext filesystems, ext4 being the latest one.

A filesystem defines how files are physically laid out on the disk. You can think of it like this: if you had a 10000-page book without any chapters, page numbers or line breaks, it would be very hard to use. Of course page numbers and chapter titles take up some space on the page, but they make using the book a lot easier and faster. If you want to jump to chapter, let's say, 42, you just look it up in the table of contents. Then you leaf through the book until you find the chapter you want. Your files are the chapters and your filesystem is the book. Filesystem metadata, like file boundaries, filenames etc., takes up space too, but it's a comparatively small amount of space, and it makes things work a lot faster.

If your "chapter" is empty, it can still have a heading and a page number, right? An empty file contains zero bytes of data. Its metadata does take up space, but that space belongs to the filesystem, not to the file. Otherwise you'd see filenames inside your text files?
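To make that concrete, here is a minimal sketch in Python (my choice of language, nothing Windows-specific is required) that creates an empty file and asks the filesystem about it. The file name empty.txt and the helper describe_empty_file are made up for illustration.

```python
import os
import stat
import tempfile


def describe_empty_file(directory):
    """Create an empty file and return what the filesystem knows about it."""
    path = os.path.join(directory, "empty.txt")  # hypothetical file name
    open(path, "w").close()                      # create it, write nothing

    info = os.stat(path)                         # answered from filesystem metadata
    return {
        "name": os.path.basename(path),
        "size": info.st_size,                    # bytes of file content
        "modified": info.st_mtime,               # seconds since the epoch
        # No owner write permission is how Python reports a read-only file.
        "read_only": not (info.st_mode & stat.S_IWUSR),
    }


with tempfile.TemporaryDirectory() as tmp:
    details = describe_empty_file(tmp)
    print(details)
```

The size comes back as 0, yet the name, the timestamp and the read-only flag are all there, because none of them is read from the file's content.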

By the way, that's why early versions of DOS accepted only 8.3 names: the space reserved for each filename was very limited. NTFS allows filenames up to 255 characters long[1].
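As a rough illustration of how little room there was, the sketch below packs and unpacks one classic 32-byte FAT directory entry in Python. It assumes the standard FAT short-name layout (8 bytes of name, 3 bytes of extension, one attribute byte, timestamps, starting cluster, size). NTFS keeps the same kind of information in its own, more elaborate records, so treat this as an analogy rather than a description of NTFS. The sample entry README.TXT is invented.

```python
import struct

# Assumed layout of one FAT short-name directory entry (32 bytes, little-endian):
# name(8) ext(3) attributes(1) reserved(1) create-time-fine(1)
# create-time(2) create-date(2) access-date(2) cluster-high(2)
# write-time(2) write-date(2) cluster-low(2) size(4)
ENTRY = struct.Struct("<8s3sBBBHHHHHHHI")

ATTR_READ_ONLY = 0x01
ATTR_HIDDEN = 0x02


def parse_entry(raw):
    """Decode the name, two attribute flags and the size from a raw entry."""
    fields = ENTRY.unpack(raw)
    name, ext, attrs, size = fields[0], fields[1], fields[2], fields[-1]
    short_name = name.decode("ascii").rstrip()
    extension = ext.decode("ascii").rstrip()
    return {
        "name": f"{short_name}.{extension}" if extension else short_name,
        "read_only": bool(attrs & ATTR_READ_ONLY),
        "hidden": bool(attrs & ATTR_HIDDEN),
        "size": size,
    }


# An invented entry: a hidden, read-only, empty file called README.TXT.
sample = ENTRY.pack(b"README  ", b"TXT", ATTR_HIDDEN | ATTR_READ_ONLY,
                    0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
entry = parse_entry(sample)
print(ENTRY.size, entry)
```

The name, the flags and the size all sit in the directory entry, so a file whose size field is 0 still has a complete entry of its own.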


Just one more word on your comment:

I used to have a kind of virus that somehow corrupted a couple of files in my PC so that they appeared to be about 100GB in size. Each of them. On my 40GB harddisc. So there must've been some kind of magic... :-D

It's entirely possible to have valid files bigger than your hard drive, thanks to a feature called sparse files. Hennes has an excellent explanation of these in his comment on this question:

Imagine a binder capable of holding 100 pages. If you use that binder as a regular file you could insert 100 pages. You could read all 100. You could write to all 100. Now imagine a sparse binder. You insert the first page and write "page 1: content A". You then insert the second page and write "page 9999: content B". Whenever you try to read a page you look if it exists. If it does not, your answer will be "this is an empty page". If it does exist you return the contents of the page. Whenever you write to a page which does not yet exist in the binder you add a new sheet of paper.
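The binder idea can be tried out with a short Python sketch. It seeks far past the start of a new file and writes a single byte, so the reported size is large even though almost nothing was written. Whether the gap is really left unallocated depends on the filesystem: many Linux filesystems do this automatically, while on NTFS a file normally has to be marked sparse first (the FSCTL_SET_SPARSE control code), which this sketch does not do. The file name sparse.bin and the helper make_file_with_hole are made up.

```python
import os
import tempfile

LOGICAL_SIZE = 10 * 1024 * 1024  # 10 MiB, kept small on purpose


def make_file_with_hole(path, logical_size):
    """Write one byte at the very end of the file and leave the rest untouched."""
    with open(path, "wb") as f:
        f.seek(logical_size - 1)   # jump over the "missing pages"
        f.write(b"\0")             # the only byte actually written
    return os.stat(path)


with tempfile.TemporaryDirectory() as tmp:
    info = make_file_with_hole(os.path.join(tmp, "sparse.bin"), LOGICAL_SIZE)
    logical = info.st_size
    # st_blocks (512-byte units) exists on POSIX systems only, so it may be absent.
    blocks = getattr(info, "st_blocks", None)
    print("reported size:", logical)
    if blocks is not None:
        print("bytes actually allocated:", blocks * 512)
```

On a filesystem that supports holes, the allocated figure stays far below the reported size, which is how a file can claim to be larger than the disk it lives on.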

gronostaj

Posted 2013-10-13T20:41:11.127

Reputation: 33 047

This does not answer the question, does it? So where are all the metadata stored? – Koray Tugay – 2015-02-25T18:52:48.290

@KorayTugay I believe the actual question was "How are the file metadata stored in Windows so that they don't take up space". In my opinion the best answer you can give in a Super User post is to explain that they are stored in the filesystem, not directly in the file, and that's why they don't count towards the file size. They are in the book, but not as a part of the text. – gronostaj – 2015-02-25T21:51:12.560

"Otherwise you'd see filenames inside your text files?" Well, many rich file types like pictures or PDF files can contain a lot of metadata. Even simple UTF-8 encoded text files can start with the byte sequence EFBBBF, which is hidden by most text editors, so I expected the file metadata to be just another hidden and inaccessible part of the file. Anyway, you'd be an awesome teacher! Every answer given here was (and will be) really helpful in some way but I appreciate your effort the most. – Jeyekomon – 2013-10-13T22:21:05.347

5

Just learned today about Windows Alternate Data Streams (ADS). An ADS is a hidden stream of data attached to a file, similar to a resource fork. It has been part of NTFS since Windows NT 3.1.

For example, if you have a blank text file but fill in some of the summary information in the file's properties tab, a hidden ADS is created and attached to the text file. Most versions of Windows don't include the size of the ADS when reporting the size of the original file.

You can create and view ADS files from a command prompt.

echo "ABCDE" > test.txt:hidden.txt

This will create a test.txt file with an ADS called hidden.txt.

You can use this command to edit the stream:

notepad test.txt:hidden.txt
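The same thing can be tried from a script. The sketch below is in Python and assumes it runs on Windows with the temporary directory on an NTFS volume; on anything else it just skips the attempt. The helpers stream_path and write_and_read_stream are made up for illustration, and the only mechanism relied on is the file:stream naming shown in the commands above.

```python
import os
import tempfile


def stream_path(file_path, stream_name):
    """Build the 'file:stream' name used to address an alternate data stream."""
    return f"{file_path}:{stream_name}"


def write_and_read_stream(file_path, stream_name, text):
    """Write text into the named stream, then read it back."""
    target = stream_path(file_path, stream_name)
    with open(target, "w", encoding="utf-8") as f:
        f.write(text)
    with open(target, "r", encoding="utf-8") as f:
        return f.read()


if os.name == "nt":
    with tempfile.TemporaryDirectory() as tmp:
        host = os.path.join(tmp, "test.txt")
        try:
            content = write_and_read_stream(host, "hidden.txt", "ABCDE")
            print("stream content:", content)
            # The main file was never written to, only its extra stream.
            print("size of test.txt:", os.path.getsize(host))
        except OSError as exc:
            # For example, the volume is not NTFS.
            print("could not use a stream here:", exc)
else:
    print("alternate data streams need Windows and NTFS; skipping")
```

If it works, the reported size of test.txt should not include the data held in the stream, which matches the behaviour described above.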

Here's an easy to read article that goes into greater detail.

HighTechGeek

Posted 2013-10-13T20:41:11.127

Reputation: 1 467

@Jeyekomon it seems 'type' doesn't support it, http://pastebin.com/raw/4Ae3GGkN but I see echo and notepad do (tested in win7)

– barlop – 2016-12-28T16:55:36.377

That's a really interesting thing! I've actually never heard about it either... Thank you. – Jeyekomon – 2013-11-01T18:58:32.930

2

I did a search and came across a similar question asked on Stack Overflow.

It basically says that even an empty file takes up some space once it is created, anywhere from a few bytes of bookkeeping to a whole cluster, depending on the file system and its allocation granularity.

They discuss it here: https://stackoverflow.com/questions/4954991/are-0-bytes-files-really-0-bytes

with additional links for further research.

HighTechGeek

Posted 2013-10-13T20:41:11.127

Reputation: 1 467

2

On an NTFS volume this information is stored in metafiles. In particular, the file name and timestamps are stored in a metafile called $MFT (the Master File Table). The metafiles can't be reached through the usual Windows tools such as Explorer or the command prompt.

For more reading:

http://ntfs.com/ntfs-system-files.htm

http://en.wikipedia.org/wiki/NTFS

David Marshall

Posted 2013-10-13T20:41:11.127

Reputation: 6 698

Thank you. So the answer lies in a deeper understanding of my HDD's filesystem... And just out of curiosity - do you know any "unnormal" Windows method? The one that is actually useful for accessing those metafiles? A couple of keywords for google would be enough... – Jeyekomon – 2013-10-13T21:44:27.693

1

@Jeyekomon You need to use a sector editor. There's an example here: http://blogs.technet.com/b/askcore/archive/2013/03/01/where-did-my-space-go.aspx

– David Marshall – 2013-10-13T23:25:32.067