Renaming files fetched via “wget --mirror” in Bash before uploading to an Amazon S3 statically hosted area

1

I’m trying to archive an old website and upload it to a statically hosted Amazon S3 area.

I was able to get the contents with wget, using the following command:

wget --mirror --no-parent --html-extension --page-requisites http://original.com

Then I replaced all the links with their new URL:

ag -l original\.com -0 | xargs -0 sed -i '' \
's|original.com|old.original.com|g'

After this, I uploaded the website to Amazon S3 using s3cmd sync.

My only problem now is that all the “cache-busted” assets are access denied on Amazon. wget saved those files with the query parameters included in their file names, so I need to rename them.

So I’d like to rename files recursively, in all subfolders, like this:

  • style.css?ver=4.2.5.css is renamed to style.css

How can I do that in Mac OS X using Bash 3.2?

hyperknot

Posted 2015-09-29T15:24:49.843

Reputation: 734

Answers

1

This should work:

find . -maxdepth 1 -type f -name '*\?*' |\
  while IFS= read -r FILENAME
  do
    IFS='?'
    SPLIT_FILENAME=(${FILENAME})
    unset IFS
    echo mv "${FILENAME}" "${SPLIT_FILENAME}"
    # mv "${FILENAME}" "${SPLIT_FILENAME}"
  done

The find . means the search starts in the current directory and descends into its child directories; feel free to change that . to the full path of the directory you are acting on. The -name '*\?*' test matches files with a literal question mark (?) in their name.
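The backslash in that pattern matters, because an unescaped ? in a find pattern matches any single character. Here is a minimal sketch of the difference, run in a throwaway directory; the file names and variable names are made up for illustration:

```shell
# Create a scratch directory with one plain file and one "cache-busted" file.
demo_dir=$(mktemp -d "${TMPDIR:-/tmp}/qdemo.XXXXXX")
touch "$demo_dir/plain.css" "$demo_dir/style.css?ver=4.2.5.css"

# Unescaped ?: matches any single character, so both files are counted.
unescaped_count=$(find "$demo_dir" -type f -name '*?*' | wc -l | tr -d ' ')

# Escaped \?: matches only a literal question mark, so one file is counted.
escaped_count=$(find "$demo_dir" -type f -name '*\?*' | wc -l | tr -d ' ')

echo "unescaped: $unescaped_count, escaped: $escaped_count"
rm -rf "$demo_dir"
```

With these two sample files it should print "unescaped: 2, escaped: 1".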

This initial demo version also sets -maxdepth to 1, so the process doesn’t run out of control on your filesystem, and it uses an echo version of the command to show you what it would do before you run it for real.

If you run that and the output looks good, adjust -maxdepth 1 to something like -maxdepth 9, or remove it entirely, then comment out the echo line and uncomment the mv line so it looks like this:

find . -type f -name '*\?*' |\
  while IFS= read -r FILENAME
  do
    IFS='?'
    SPLIT_FILENAME=(${FILENAME})
    unset IFS
    # echo mv "${FILENAME}" "${SPLIT_FILENAME}"
    mv "${FILENAME}" "${SPLIT_FILENAME}"
  done

Using your test file example of style.css?ver=4.2.5.css, I got this output when running this script on my Mac OS X 10.9.5 (Mavericks) system:

mv ./style.css?ver=4.2.5.css ./style.css

Looks like a good switch to me. I ran it with the real mv command and the file was successfully renamed to style.css. This also works with files that have spaces in their names, such as test files named “this is my style.css?ver=4.2.5.css” and “my style.css?ver=4.2.5.css”.
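As an alternative sketch, not the script I tested above, the same rename can be done with parameter expansion instead of changing IFS. The function name strip_query_suffixes and the DRY_RUN variable are invented for this example, and it assumes no file name contains a newline:

```shell
# strip_query_suffixes DIR
# Renames every file under DIR whose name contains "?" to the part of the
# name before the first "?". With DRY_RUN=1 (the default) it only prints
# the mv commands it would run.
strip_query_suffixes() {
  find "$1" -type f -name '*\?*' |
    while IFS= read -r path; do
      base=${path##*/}      # file name without its directory
      dir=${path%/*}        # directory part of the path
      new=${base%%\?*}      # everything before the first "?"
      [ -n "$new" ] || continue   # skip names that start with "?"
      if [ "${DRY_RUN:-1}" = 1 ]; then
        echo mv "$path" "$dir/$new"
      else
        mv "$path" "$dir/$new"
      fi
    done
}

# Usage: preview first, then rename for real.
#   strip_query_suffixes .
#   DRY_RUN=0; strip_query_suffixes .
```

Only the file name is stripped, so a ? in a directory name is left alone. Note that two files such as style.css?ver=1 and style.css?ver=2 would both map to style.css, and the second mv would overwrite the first.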

JakeGould

Posted 2015-09-29T15:24:49.843

Reputation: 38 217

1

This would work on Mac OS X assuming there’s just a single ? in the URL/original file name:

find . -name "*\?*" -exec sh -c 'var="{}" ; mv "{}" "${var%\?*}"' \;

For reference, this also works on Linux, or any other system, that has the Perl-based rename tool installed:

find . -name "*\?*" -exec rename "s/\?.*//" "{}" \;
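The single-? assumption comes from the % operator, which removes the shortest matching suffix, while %% removes the longest. The sketch below shows the difference, followed by a variant of the find call that hands each file name to sh as a positional parameter instead of pasting {} into the script text. The function name strip_with_find is made up for this example, and it assumes directory names themselves contain no ?:

```shell
# % removes the shortest matching suffix, %% the longest.
name='a.css?x=1?y=2'
short=${name%\?*}     # cuts at the last "?"
long=${name%%\?*}     # cuts at the first "?"
echo "$short"
echo "$long"

# strip_with_find DIR
# File names arrive in the inner shell as positional parameters, so quotes
# or $ characters in a name cannot change the script. The "_" fills $0.
strip_with_find() {
  find "$1" -type f -name '*\?*' -exec sh -c '
    for f do
      new=${f%%\?*}
      mv "$f" "$new"
    done' _ {} +
}
```

The two echo lines should print a.css?x=1 and a.css. Calling strip_with_find . would then rename files under the current directory.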

SΛLVΘ

Posted 2015-09-29T15:24:49.843

Reputation: 1 157

rename doesn’t exist in Mac OS X. – JakeGould – 2015-09-29T17:06:56.637

Running your command in Mac OS X results in: “find: rename: No such file or directory” – JakeGould – 2015-09-29T17:11:11.570

I believe if you'd mention "brew install rename" it'd be a valid answer for OS X. – hyperknot – 2015-09-29T18:22:12.993

@zsero I don’t hate Homebrew, but the problem with Homebrew for most people is, brew install rename is not something most users can do without heavy lifting. First, Homebrew requires Xcode (4GB+ download) and related command line tools be installed. Then Homebrew itself needs to be installed. And only then does brew install [something] seem elegant. More casual users than not who need a simple solution are not going down that path to solve a simple issue that could be handled via tools already a part of Mac OS X. – JakeGould – 2015-09-29T18:37:45.747

You can actually use $1, that way it'd be more elegant, no need for var. – hyperknot – 2015-09-30T14:11:45.100

-1

I'll use echo to demonstrate.

# echo 'style.css?ver=4.2.5.css' | cut -d? -f2-9999
ver=4.2.5.css

Recursion:

cd <yourdir>
for f in *; do
    newf=$( echo $f | cut -d? -f2-9999 )
    mv $f $newf
done

Assumptions: <yourdir> contains only files that you want changed. If not, alter the for f in * glob to suit. You should test the final command first with echo, i.e. replace mv $f $newf with echo $f $newf and ensure it does what you want.

Erik Bryer

Posted 2015-09-29T15:24:49.843

Reputation: 21

Your method would rename the file (style.css?ver=4.2.5.css) to ver=4.2.5.css but that is not the goal of the original poster’s question. – JakeGould – 2015-09-29T17:17:53.770

Yep. I suppose I should have written 'cut -d? -f1'. – Erik Bryer – 2015-10-03T16:35:41.430
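The -f1 correction from the last comment could look roughly like the sketch below. The function names strip_with_cut and rename_in_dir are invented here, the variables are quoted so names with spaces survive, and the loop only handles one directory, not subfolders:

```shell
# Field 1 is everything before the first "?". The delimiter is quoted so the
# shell never treats it as a glob character.
strip_with_cut() {
  printf '%s\n' "$1" | cut -d '?' -f 1
}

strip_with_cut 'style.css?ver=4.2.5.css'

# rename_in_dir DIR
# Renames files in DIR (non-recursive) whose names contain "?".
# The subshell keeps the cd from affecting the caller.
rename_in_dir() (
  cd "$1" || exit 1
  for f in *\?*; do
    [ -e "$f" ] || continue        # the glob matched nothing
    mv "$f" "$(strip_with_cut "$f")"
  done
)
```

The direct call prints style.css.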