
I use gsutil to back up photos from a dedicated box to Google Nearline storage.

Recently I moved all of my photos from that dedicated box to a new dedicated box. I made sure to use the relevant archival rsync flags to avoid changing the files' mtimes.

I've just run gsutil for the first time on the new box, walked away, and expected it to only move over any NEW files that hadn't been backed up previously.

gsutil -m rsync -r /originals/. gs://my-bucket

Instead I came back to find an output comprising hundreds of lines of

Copying mtime from src to dst for gs://my-bucket/photo123456.jpg

I can see (via ls -ltu) that the last-accessed time was affected when I moved the files, but the modified time is correctly untouched, and in this instance shows a date from 2010.

I've cancelled the job for now. What have I done wrong? Can I fix this so it won't try to do that for all 3 million files?

On further inspection I can see that the backed-up files in Nearline have a modified date of 2015, when they were initially backed up. The ones that produced the message today now have today's date on them.

Why would that happen? The backup job has run 1000 times since the files were originally backed up, without overwriting the files' modified time, so why isn't it happy now?

I wonder if it could be because I'm running a newer version of gsutil now?

Codemonkey

1 Answer


In this Stack Overflow question, using the -c option with gsutil rsync helped. This option:

Causes the rsync command to compute and compare checksums (instead of comparing mtime) for files if the size of source and destination match. This option increases local disk I/O and run time if either src_url or dst_url are on the local file system.

(Source: GCP docs)
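As a rough sketch of how that might look with the command from the question: the two helper names below (preview_sync and sync_photos) are made up for illustration, and the source path and bucket are simply the ones from the question. The first helper adds -n, gsutil rsync's dry-run flag, so you can see what would be copied before touching 3 million files.

```shell
# Hypothetical helpers (names are illustrative, not part of gsutil).

# Dry run: -n only reports what would be copied, nothing is changed.
# -c compares checksums instead of mtime when source and destination sizes match.
preview_sync() {
  gsutil -m rsync -r -c -n "$1" "$2"
}

# Real run with the same checksum comparison.
sync_photos() {
  gsutil -m rsync -r -c "$1" "$2"
}
```

You could then run preview_sync /originals/. gs://my-bucket first, and only call sync_photos with the same arguments if the dry run lists just the genuinely new files. Bear in mind the docs' warning above: checksumming local files increases disk I/O and run time.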

arudzinska