Why don't file managers use the Master File Table for instant search results?

27

6

I've just discovered UltraSearch and was blown away by its file and folder search speed. It's instantaneous, and it doesn't use any indexing service. It simply reads the NTFS Master File Table, which already stores all the filenames on an NTFS partition.
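
I don't know UltraSearch's exact internals, but for reference, here is a minimal sketch of one documented way to do this on Windows: the FSCTL_ENUM_USN_DATA control code walks the MFT's file records directly instead of recursing through directories (requires administrator rights; compile against a recent Windows SDK; error handling mostly omitted):

    /* Sketch: enumerate every file record on an NTFS volume via the MFT. */
    #include <windows.h>
    #include <winioctl.h>
    #include <stdio.h>

    int main(void)
    {
        HANDLE hVol = CreateFileW(L"\\\\.\\C:", GENERIC_READ,
                                  FILE_SHARE_READ | FILE_SHARE_WRITE,
                                  NULL, OPEN_EXISTING, 0, NULL);
        if (hVol == INVALID_HANDLE_VALUE)
            return 1;  /* typically ERROR_ACCESS_DENIED without admin rights */

        MFT_ENUM_DATA_V0 med = { 0, 0, MAXLONGLONG };  /* all records, all USNs */
        BYTE buf[64 * 1024];
        DWORD bytes;

        /* Each call returns a resume key (DWORDLONG) followed by packed
           USN_RECORD structures -- thousands of filenames per round-trip. */
        while (DeviceIoControl(hVol, FSCTL_ENUM_USN_DATA, &med, sizeof(med),
                               buf, sizeof(buf), &bytes, NULL))
        {
            PUSN_RECORD rec = (PUSN_RECORD)(buf + sizeof(DWORDLONG));
            while ((PBYTE)rec < buf + bytes) {
                /* FileName is counted (FileNameLength bytes), not NUL-terminated. */
                PWCHAR name = (PWCHAR)((PBYTE)rec + rec->FileNameOffset);
                wprintf(L"%.*s\n", (int)(rec->FileNameLength / sizeof(WCHAR)), name);
                rec = (PUSN_RECORD)((PBYTE)rec + rec->RecordLength);
            }
            med.StartFileReferenceNumber = *(DWORDLONG *)buf;  /* resume key */
        }
        CloseHandle(hVol);
        return 0;
    }

Note that this yields bare names plus parent references, not full paths, so a tool still has to stitch the directory tree together itself -- presumably why these tools build an in-memory index of the whole volume at startup.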

The question is: why isn't this capability far more popular among file managers, starting with Windows Explorer's search (Win+F)?

Dan Dascalescu

Posted 2013-01-07T04:11:56.457

Reputation: 3 406

Question was closed 2013-01-10T18:29:52.727

2

Also see Everything by VoidTools, which does the same thing.

– David d C e Freitas – 2015-05-07T19:58:16.313

1

Great job guys closing a question with 20+ upvotes as "not constructive"! – Dan Dascalescu – 2016-12-15T02:21:22.247

Answers

29

Because of Security!

That's the real reason. (And the only real reason, in my opinion -- writing a read-only parser for the major file systems is feasible, if by no means easy; writing to them is the real challenge.)

A program like this bypasses the entire (file) system's security infrastructure, so only an administrator (or someone else who has "Manage Volume" privileges) can actually run it.
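
To illustrate, here's roughly the standard Windows check such a tool has to pass before it can even open the raw volume (a sketch; the function name is mine):

    /* Sketch: does the caller belong to the Administrators group?
       Without this, opening \\.\C: for raw reads will simply fail. */
    #include <windows.h>

    BOOL CallerIsAdmin(void)
    {
        BOOL isAdmin = FALSE;
        PSID adminGroup = NULL;
        SID_IDENTIFIER_AUTHORITY ntAuth = SECURITY_NT_AUTHORITY;

        if (AllocateAndInitializeSid(&ntAuth, 2,
                SECURITY_BUILTIN_DOMAIN_RID, DOMAIN_ALIAS_RID_ADMINS,
                0, 0, 0, 0, 0, 0, &adminGroup))
        {
            /* Checks the effective token, honoring UAC elevation. */
            if (!CheckTokenMembership(NULL, adminGroup, &isAdmin))
                isAdmin = FALSE;
            FreeSid(adminGroup);
        }
        return isAdmin;
    }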

So obviously, it wouldn't work in many scenarios -- and I don't think Microsoft (or any other big company) would ever consider making a product like this and then encouraging users to run as administrators, because of the security ramifications.

It would be theoretically possible to make a system which runs in the background and filters out secured data, but in practice it would be a lot of work to get it correct and free of security holes in a production-quality product.

By the way, I haven't used UltraSearch, but I wrote a very similar program myself a few years ago, which I open-sourced just last month! Check it out if you're interested. :)

user541686

Posted 2013-01-07T04:11:56.457

Reputation: 21 330

1

This does not feel like the right reason. The OS could expose a view for unprivileged search, just like a DBMS does. An API or a restricted view should give public access to public files. And if the file table doesn't know anything about the security of different directories, then that is probably bad design on the OS's end. – LifeH2O – 2015-05-24T13:01:49.707

@LifeH2O: The problem is that adding security checks is going to be a massive performance hit, which entirely defeats the point of the tool. – user541686 – 2015-05-24T17:27:11.930

1

How can the performance hit be more than scanning directories? Only the security of the containing directories would need to be checked. I don't know how much can be done with the Windows file table. – LifeH2O – 2015-05-24T18:31:20.473

1

@LifeH2O: Have you considered how complicated it is to "check" something? Users belong to multiple groups, groups and users can each have allow/deny/neither permissions on either some directory on the chain or on the file itself, and you have to figure out the effective permissions for the current user on each file using its ACL. Now add to that the synchronization required with the kernel's security manager subsystem, and you're going to get massive performance hits just "checking" all the files. – user541686 – 2015-05-24T18:50:02.377
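
To make the cost concrete, here is a rough sketch of the per-file work such filtering implies on Windows (the function name is illustrative, and you would still need to obtain an impersonation token for the user first, e.g. via ImpersonateSelf plus OpenThreadToken):

    /* Sketch: what a "security-filtered" MFT search must do for EVERY hit:
       fetch the file's security descriptor, then evaluate its ACL against
       the user's token. userToken must be an impersonation token. */
    #include <windows.h>

    BOOL UserCanReadFile(LPCWSTR path, HANDLE userToken)
    {
        BYTE sd[4096];  /* self-relative security descriptor */
        DWORD needed;

        /* One metadata read per file, just to get the ACL. */
        if (!GetFileSecurityW(path,
                OWNER_SECURITY_INFORMATION | GROUP_SECURITY_INFORMATION |
                DACL_SECURITY_INFORMATION,
                (PSECURITY_DESCRIPTOR)sd, sizeof(sd), &needed))
            return FALSE;

        GENERIC_MAPPING map = { FILE_GENERIC_READ, FILE_GENERIC_WRITE,
                                FILE_GENERIC_EXECUTE, FILE_ALL_ACCESS };
        DWORD desired = FILE_GENERIC_READ;
        MapGenericMask(&desired, &map);

        PRIVILEGE_SET privs;
        DWORD privLen = sizeof(privs), granted = 0;
        BOOL allowed = FALSE;

        /* Walks every ACE, resolving group memberships and deny rules. */
        if (!AccessCheck((PSECURITY_DESCRIPTOR)sd, userToken, desired,
                         &map, &privs, &privLen, &granted, &allowed))
            return FALSE;
        return allowed;
    }

Run that for each of a few hundred thousand MFT records and the "instant" result is gone.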

@LifeH2O: And this is all assuming the security information is already in memory or in MFT, which it typically isn't. – user541686 – 2015-05-24T18:55:59.250

So can we assume that only if precomputed effective permissions were stored in the MFT, or in another table, would it be possible to search the MFT on behalf of any user? – LifeH2O – 2015-05-24T19:09:10.890

@LifeH2O: Not even then, because if you store per-user permissions for every file, the size of the MFT would blow up, and reading it would take much longer. It's a lose-lose except in the most trivial cases, and OSes aren't designed for trivial cases. – user541686 – 2015-05-24T19:13:56.277

1

You need to provide something authoritative to back up what you are saying; otherwise people can't differentiate speculation from information. I agree with others: this is pure speculation. – user34660 – 2016-12-04T03:20:45.073

And the only real reason, in my opinion - what about files that don't actually exist? http://stackoverflow.com/a/13937860/478656 e.g. a backup system that presents virtual files which you can enumerate and see through the normal filesystem API calls, but which really reside on a backup store and have no entries in the MFT. Or PSProviders in PowerShell, or files in the Offline Folders client-side cache, etc. – TessellatingHeckler – 2016-12-14T20:35:42.420

6

File managers have to be able to support every single filesystem that could be encountered. As such, they have to call into the VFS via its API. There is no (sane) way to return a large array from an API call, which results in the file enumeration being serial regardless of the presence of an MFT/FAT/superblock.
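
In other words, a portable file manager is stuck with something like the following (a rough Win32 sketch; error handling and long-path support omitted), no matter how cheaply the underlying filesystem could hand over its whole catalog:

    /* Sketch: the generic, filesystem-agnostic way to enumerate files --
       one directory entry per API round-trip, recursing per directory. */
    #include <windows.h>
    #include <stdio.h>

    static void walk(const wchar_t *dir)
    {
        wchar_t pattern[MAX_PATH];
        swprintf(pattern, MAX_PATH, L"%s\\*", dir);

        WIN32_FIND_DATAW fd;
        HANDLE h = FindFirstFileW(pattern, &fd);
        if (h == INVALID_HANDLE_VALUE)
            return;
        do {  /* entries come back one at a time: inherently serial */
            if (wcscmp(fd.cFileName, L".") != 0 &&
                wcscmp(fd.cFileName, L"..") != 0) {
                wprintf(L"%s\\%s\n", dir, fd.cFileName);
                if (fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) {
                    wchar_t sub[MAX_PATH];
                    swprintf(sub, MAX_PATH, L"%s\\%s", dir, fd.cFileName);
                    walk(sub);
                }
            }
        } while (FindNextFileW(h, &fd));
        FindClose(h);
    }

    int main(void) { walk(L"C:"); return 0; }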

Ignacio Vazquez-Abrams

Posted 2013-01-07T04:11:56.457

Reputation: 100 516

1

If you were a programmer, then you would know how APIs manage large amounts of data like that. And no, a search program is not required to support multiple file systems. – user34660 – 2016-12-04T03:23:40.660

@user34660: They have two choices: 1) Use enumeration. 2) Run very slowly when handling very large datasets. And a search tool that only supports a single filesystem is of very limited utility. – Ignacio Vazquez-Abrams – 2016-12-04T03:31:00.340

3

A file indexing service is for users who want to search the content (most likely text) and metadata of files, not merely filenames. That's why it takes a long time to walk through all the files, and why the index built by such services is big and relatively slow. You can disable the indexing service in Windows, but Windows Explorer is stupid enough to keep searching file content after filenames. As Ignacio Vazquez-Abrams said, file managers cannot take advantage of low-level file system structures.

neo

Posted 2013-01-07T04:11:56.457

Reputation: 564