I am looking to remove all text after the four-digit year in movie file names:

Input:
Some.Movie.Name.2011.1080p.BluRay.x265.mp4
Another.Movie.Name.1999.1080p.BluRay.x264.mp4
Another.Movie.Name.II.2001.1080p.BluRay.x264.mp4

Desired Output:
Some.Movie.Name.2011
Another.Movie.Name.1999
Another.Movie.Name.II.2001

I have used awk with regex:
echo "Some.Movie.Name.2011.1080p.BluRay.x265.mp4" |awk -F'.[0-9]{4}' '{print$1}'

Which gives me:
Some.Movie.Name

I can't find a way, with awk or anything else, to also print the four-digit year that I am using as the delimiter.

1 Answer

You could use

sed -E "s/(.*\.[[:digit:]]{4})\..*/\1/"

to achieve what you're after.

back-check

for f in Some.Movie.Name.2011.1080p.BluRay.x265.mp4 Another.Movie.Name.1999.1080p.BluRay.x264.mp4 Another.Movie.Name.II.2001.1080p.BluRay.x264.mp4 ; do
    echo "$f" | sed -E "s/(.*\.[[:digit:]]{4})\..*/\1/"
done

produces your desired output.
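
A couple of edge cases may be worth checking as well. The sketch below wraps the same sed call in a helper; the function name strip_after_year and the extra file names are made up for illustration and are not part of the original answer. It shows that a title containing a second four-digit run keeps everything up to the last one (the leading .* is greedy), and that a name without a dot-delimited year passes through unchanged.

```shell
# Hypothetical helper: same sed expression as above, wrapped in a function.
strip_after_year() {
    # printf avoids surprises that echo can have with unusual names
    printf '%s\n' "$1" | sed -E 's/(.*\.[[:digit:]]{4})\..*/\1/'
}

strip_after_year "Some.Movie.Name.2011.1080p.BluRay.x265.mp4"
# prints: Some.Movie.Name.2011

strip_after_year "Blade.Runner.2049.2017.1080p.BluRay.x264.mp4"
# prints: Blade.Runner.2049.2017   (greedy .* reaches the last ".YYYY.")

strip_after_year "No.Year.Here.mp4"
# prints: No.Year.Here.mp4         (no match, so sed leaves the line alone)
```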

explanation

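The -E option switches sed to extended regular expressions, so the parentheses and {4} need no backslashes. The pattern makes sed look for four digits ([[:digit:]]{4}) with a literal dot (\.) on either side, preceded and followed by anything (.*). Everything up to and including the four digits, but not the dot right after them, is captured as a group, hence the parentheses (). The substitution replaces the whole line with that group through the back-reference \1. Because the leading .* is greedy, the group extends to the last dot-delimited run of four digits in the name.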
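
Since the question started from awk, the same idea (find a dot, four digits and a dot, then keep everything through the digits) can probably be sketched with awk's match() and substr() as well. This is only an illustration under a few assumptions: the helper name keep_through_year is invented here, the digits are spelled out instead of using {4} because not every awk supports interval expressions, and match() reports the leftmost match, so for a title that itself contains a dot-delimited four-digit number it would cut at the first one rather than the last, unlike the sed version.

```shell
# Hypothetical helper, not part of the original answer.
keep_through_year() {
    printf '%s\n' "$1" | awk '
        # match() sets RSTART to the position of the dot before the year
        match($0, /\.[0-9][0-9][0-9][0-9]\./) {
            # keep characters 1 .. RSTART+4: up to the dot plus four digits
            print substr($0, 1, RSTART + 4)
            next
        }
        { print }   # no year found: print the line unchanged
    '
}

keep_through_year "Another.Movie.Name.II.2001.1080p.BluRay.x264.mp4"
# prints: Another.Movie.Name.II.2001
```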

Timor