I have a Samba-based fileserver with many gigabytes of data on it, mostly Word, Excel, OpenOffice and PDF documents.
I've set up a simple web-based search interface (Apache, PHP, mlocate) that just searches on file paths and mtimes. It works, as far as it goes, but it would be great to have the documents' contents indexed by Apache Solr, which by all accounts is blazingly fast and can cope with all these different document types.
But it's a fileserver, not a website, so I'd need something to crawl all the files, and to keep re-crawling and re-indexing the ones that change; people aren't POSTing documents, they're just pressing Save.
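To make the requirement concrete, here's the sort of thing I picture having to write myself if nothing exists: a rough Python sketch (the core name "docs", the share path, and the field names are all made up for illustration) that walks the share, compares mtimes against the previous run, and pushes changed files through Solr's ExtractingRequestHandler.

```python
#!/usr/bin/env python3
# Rough sketch only. Assumes a Solr core named "docs" with the
# ExtractingRequestHandler (Solr Cell / Tika) enabled at /update/extract;
# the paths and field names below are invented for illustration.
import json
import os

import requests  # third-party: pip install requests

SOLR_BASE = "http://localhost:8983/solr/docs"   # assumed core name
SHARE_ROOT = "/srv/samba/share"                 # the Samba share (assumed path)
STATE_FILE = "/var/tmp/solr-crawl-state.json"   # remembered mtimes between runs
EXTS = {".doc", ".docx", ".xls", ".xlsx", ".odt", ".ods", ".pdf"}


def index_file(path, mtime):
    """Push one file through Solr's extract handler; Tika does the parsing."""
    with open(path, "rb") as fh:
        resp = requests.post(
            SOLR_BASE + "/update/extract",
            # literal.* params become stored fields on the indexed document;
            # "id" and "mtime_l" are guesses at a schema, not gospel.
            params={"literal.id": path, "literal.mtime_l": int(mtime)},
            files={"file": (os.path.basename(path), fh)},
            timeout=60,
        )
    resp.raise_for_status()


def main():
    try:
        with open(STATE_FILE) as fh:
            seen = json.load(fh)
    except (FileNotFoundError, ValueError):
        seen = {}

    for dirpath, _dirs, filenames in os.walk(SHARE_ROOT):
        for name in filenames:
            if os.path.splitext(name)[1].lower() not in EXTS:
                continue
            path = os.path.join(dirpath, name)
            mtime = os.path.getmtime(path)
            if seen.get(path) == mtime:  # unchanged since last crawl: skip
                continue
            index_file(path, mtime)
            seen[path] = mtime

    # Commit once at the end of the crawl rather than per document.
    requests.get(SOLR_BASE + "/update", params={"commit": "true"}, timeout=60)
    with open(STATE_FILE, "w") as fh:
        json.dump(seen, fh)


if __name__ == "__main__":
    main()
```

Something like that, run from cron, would probably limp along, but it ignores deletions and renames, and re-walking the whole tree instead of watching for changes (inotify territory) feels like reinventing a wheel somebody must already have built.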
Is there a project out there that does this?