I'm looking for any open source command line tool or tools which will allow me to index and search a large number of plain text files. Approximate search would be a plus. The tool only needs to print the files that match, although some match context would be useful. A GUI tool isn't useful for my application, nor is anything that searches files one by one (grep for example). I'm basically targeting unix platforms (osx, linux, bsd).
EDIT: I'm not interested in any sort of tool that is system-wide, or needs to run in the background. Basically, I want to build an index for a directory tree full of text files and then later be able to search against it. Preferably the index is one or a few files that I can specify the location of.
Any ideas?
Just about any way you do it, you will have to scan each file for matches. Even if you dump everything into a DB, as one answer proposes, you still have to feed each file into the DB one by one. I don't know why grep won't work for you, but it will give you exactly the results you're asking for: the matching file and the context of the match. Just redirect the output to a file and you have a searchable index.
grep -r searchterm /somedir/* > index.txt
– None – 2011-03-13T23:42:45.753

@Deleted Account, A query using grep is O(n) where n is the number of files. An index usually implies a data structure that gives you better than O(n) for most searches. Your index.txt idea is worse than grep by itself as it is an extra step, and I'm really not sure what the point would be. I don't have a problem with a database, I'd just prefer a lightweight one like sqlite or similar. – ergosys – 2011-03-14T01:28:43.727
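Since the comment above mentions sqlite as an acceptable lightweight backend, here is a minimal sketch of that approach using Python's standard `sqlite3` module with the FTS5 full-text extension (available in most modern SQLite builds, though not guaranteed in all). The index lives in a single database file whose location you choose, and queries use the inverted index rather than rescanning every file. The function and table names here are illustrative, not from any existing tool.

```python
import os
import sqlite3

def build_index(root, db_path="index.db"):
    """Walk a directory tree of text files and store their contents
    in a single-file SQLite FTS5 full-text index."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE VIRTUAL TABLE IF NOT EXISTS docs USING fts5(path, body)")
    con.execute("DELETE FROM docs")  # rebuild the index from scratch
    for dirpath, _, names in os.walk(root):
        for name in names:
            full = os.path.join(dirpath, name)
            try:
                with open(full, encoding="utf-8", errors="replace") as f:
                    con.execute(
                        "INSERT INTO docs (path, body) VALUES (?, ?)",
                        (full, f.read()),
                    )
            except OSError:
                pass  # skip unreadable files
    con.commit()
    return con

def search(con, query):
    """Return (path, context snippet) pairs for files matching the query,
    best matches first; snippet() marks the hit and trims surrounding text."""
    return con.execute(
        "SELECT path, snippet(docs, 1, '[', ']', '...', 8) "
        "FROM docs WHERE docs MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()
```

FTS5's `MATCH` also accepts prefix queries like `index*`, which gives a rough form of approximate search; true fuzzy matching would need something like the spellfix1 extension or an external engine.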