I'm crawling a large website (over 200k pages) with wget (is there a better tool, by the way?). Wget is saving all of the files to a single directory.
The partition is HFS (I think); will that cause problems if I keep all the files in one directory? I will only access them from the console (I know Finder has trouble with directories containing more than 5k files).
Or is there perhaps a way to create a compressed micro-partition that would allow fast, optimized access to this many files?
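For what it's worth, a minimal sketch of how the single-directory problem could be side-stepped with wget's own options, assuming GNU wget (the URL and prefix directory here are placeholders):

```
# Spread the crawl across a directory tree instead of one flat directory.
# -r        recursive download
# -np       don't ascend to the parent directory
# -x        force a directory hierarchy mirroring the URL paths
# -nH       drop the hostname component from the local paths (optional)
# -P crawl  put everything under ./crawl
wget -r -np -x -nH -P crawl http://example.com/
```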
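On the micro-partition idea, one hedged sketch on macOS would be to pack the finished crawl into a compressed, read-only disk image with hdiutil (the folder and image names are assumptions):

```
# UDZO produces a zlib-compressed, read-only image; reads decompress on the fly.
hdiutil create -srcfolder ./crawl -format UDZO crawl.dmg
hdiutil attach crawl.dmg   # mounts the image under /Volumes/...
```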
What flags are you using with wget? – Majenko – 2011-04-12T13:25:26.917
@Matt: -np, why do you ask? – kolinko – 2011-04-12T13:45:37.770
I usually specify -m; it keeps the site's file tree structure for me locally. I don't know the layout of the site you're scraping, but that might reduce the number of files in each directory. – Majenko – 2011-04-12T17:06:11.133
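For reference, a minimal sketch of the mirror invocation Majenko describes (in GNU wget, -m is shorthand for -r -N -l inf --no-remove-listing; the URL is a placeholder):

```
# -m mirrors the site, preserving its directory layout locally,
# so files are spread across many directories instead of one.
# -np still prevents wget from ascending above the start URL.
wget -m -np http://example.com/
```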