wget -nd (--no-directories) option not working as expected

2

0

I am using wget 1.12 (in msys, if it makes a difference) and am trying to mirror a website with the -nd option, since the file and folder names on this site are very long. The docs state:

‘-nd’
‘--no-directories’
    Do not create a hierarchy of directories when retrieving recursively.
    With this option turned on, all files will get saved to the current
    directory, without clobbering (if a name shows up more than once, the
    filenames will get extensions ‘.n’).

However, this is not the case. The identically named files keep getting overwritten (think index.html on a large site). How can I get the correct behavior?
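For reference, the naming scheme the docs describe can be sketched roughly as below. This is only an illustration of the documented ".n" behavior, not wget's actual code; `unique_name` is a hypothetical helper name made up for this example.

```shell
# Hypothetical helper, not part of wget: print the first unused name
# among NAME, NAME.1, NAME.2, ... in the current directory.
unique_name() {
    name=$1
    if [ ! -e "$name" ]; then
        printf '%s\n' "$name"
        return 0
    fi
    n=1
    while [ -e "$name.$n" ]; do
        n=$((n + 1))
    done
    printf '%s\n' "$name.$n"
}
```

With an existing index.html in the directory, `unique_name index.html` would give index.html.1, which is what I expected to see for the second index.html that wget downloads.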

P.S. The reason the names are so long is that they are in Hebrew and are being converted to ASCII %HH escapes. Is there another way to do this?

Baruch

Posted 2011-06-16T10:05:35.523

Reputation: 243

I have 1.11.4, which I think I got from gnuwin32. I hadn't heard of msys; it looks similar. gnuwin32 is better known, so you could try that. But do you have an example of a site with the problem? – barlop – 2011-06-16T12:23:37.567

Out of interest, does it convert the names into different ASCII chars or just squares? I find I just get square chars. There is a good alternative GUI for the cygwin command prompt that shows any Unicode char, maybe mintty; not sure about cmd.exe. – barlop – 2011-06-16T12:50:26.847

If you can include the command line you are using, and naturally the site, or a site with the problem, then that'd help. – barlop – 2011-06-16T12:51:28.593

I upvoted this prematurely, thinking it was a worthwhile question, but since he has been back and still not provided a link to the site or to another site with the problem, it is just a nuisance. If I'd known, I wouldn't have upvoted it, so it'd be at zero if lucky, and if I'd downvoted it, this question would be at -1, which it almost, and perhaps does, deserve. It certainly doesn't deserve my upvote. I just can't cancel it. – barlop – 2011-08-10T18:37:25.620

Perhaps it is a limitation of your filesystem? Are you using NTFS or FAT? What if you do a test run on some files that have no dots in them, so that the added .n adds the only dot? – Flimzy – 2011-06-16T11:17:28.117

Answers

0

Very possibly you also used -N (--timestamping), which is implied by -m (--mirror), for example. It effectively disables the preservation of files with the same name. The manual for the -nc (--no-clobber) option says:

When running Wget with ‘-N’ ... the decision as to whether or not to download a newer copy of a file depends on the local and remote timestamp and size of the file

Usually, if there are two files with the same name but different paths (e.g., index.html), they will have different sizes, and because of how timestamping works, the file will always be overwritten when -N is combined with the -nd option.
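To make that concrete, here is a simplified model of the -N decision as the manual describes it (download when the remote copy is newer or the sizes differ). It is a sketch under that assumption, not wget's source; `should_download` and the sample timestamps and sizes are made up for illustration.

```shell
# Simplified model of the -N decision (an assumption, not wget's source).
# Usage: should_download LOCAL_MTIME REMOTE_MTIME LOCAL_SIZE REMOTE_SIZE
# Returns success (0) when the remote copy would replace the local file.
should_download() {
    [ "$2" -gt "$1" ] || [ "$3" -ne "$4" ]
}

# Two different index.html pages: same timestamp, different sizes.
if should_download 1308218735 1308218735 5120 7344; then
    echo "overwrite index.html"
else
    echo "keep index.html"
fi
```

Because the sizes differ, the model takes the "overwrite" branch: the second index.html is treated as an update of the first rather than as a separate file that needs a new name.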

You can read more in the documentation for timestamping.
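If you were indeed using -m, one thing to try is spelling out what -m stands for but leaving out -N. The manual lists -m as currently equivalent to -r -N -l inf --no-remove-listing. The sketch below only prints the resulting command; the function name and the URL are placeholders, and I have not tested this against your site.

```shell
# Hypothetical wrapper: the options -m expands to, minus -N, plus -nd.
# It echoes the command for review; drop "echo" to actually run wget.
mirror_no_timestamp() {
    echo wget -r -l inf --no-remove-listing -nd "$1"
}

# Placeholder URL; substitute the site being mirrored.
mirror_no_timestamp "http://example.com/"
```

Without -N, wget should fall back to the behavior described under -nd and save later files with the same name as index.html.1, index.html.2, and so on.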

danadam

Posted 2011-06-16T10:05:35.523

Reputation: 375