33
53
I need to remove an annoying email watermark that extends across every page of a public domain book. I looked at the pdftk man page and some examples but still cannot figure out how to remove the watermark. I would appreciate any hints.
35
This is a very simple task to perform.
Use sed:
sed -e "s/watermarktextstring/ /g" <input.pdf >unwatermarked.pdf
Afterwards, be sure to repair the resulting PDF:
pdftk unwatermarked.pdf output fixed.pdf && mv fixed.pdf unwatermarked.pdf
All in one command:
sed -e "s/watermarktextstring/ /g" <input.pdf >unwatermarked.pdf && pdftk unwatermarked.pdf output fixed.pdf && mv fixed.pdf unwatermarked.pdf
Text watermarks are nothing more than text between two tags inside the PDF's code.
50
Just a little add-on to Dingo's answer as it did not work for me:
I had to first uncompress the PDF document in order to be able to find the watermark and replace it with sed.
The first step involves uncompressing the PDF document using pdftk:
pdftk original.pdf output uncompressed.pdf uncompress
Now the uncompressed.pdf can be used as in Dingo's answer:
sed -e "s/watermarktextstring/ /" uncompressed.pdf > unwatermarked.pdf
I then repaired and recompressed the document:
pdftk unwatermarked.pdf output fixed.pdf compress
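The three steps above can be collected into one small script. This is only a sketch, assuming bash and pdftk are available; the function names blank_watermark and remove_watermark are made up for illustration. Following gdecaso's comment below, the replacement is padded with spaces to the length of the watermark so the file size does not change. It also assumes the watermark is plain ASCII and contains no "|" character; sed still treats it as a regular expression.

```shell
#!/bin/bash
# Hypothetical helpers, not part of pdftk.

# Replace every occurrence of WATERMARK with the same number of spaces,
# so the byte length of the file stays the same.
# usage: blank_watermark INPUT OUTPUT WATERMARK
blank_watermark() {
    local src=$1 dst=$2 mark=$3 pad
    # Build a run of spaces exactly as long as the watermark string.
    pad=$(printf '%*s' "${#mark}" '')
    # LC_ALL=C makes sed work on raw bytes instead of decoding text.
    LC_ALL=C sed -e "s|$mark|$pad|g" "$src" > "$dst"
}

# Uncompress, blank the watermark, then repair and recompress.
# usage: remove_watermark INPUT.pdf OUTPUT.pdf WATERMARK
remove_watermark() {
    local src=$1 dst=$2 mark=$3 tmp status
    tmp=$(mktemp -d) || return 1
    pdftk "$src" output "$tmp/uncompressed.pdf" uncompress &&
    blank_watermark "$tmp/uncompressed.pdf" "$tmp/unwatermarked.pdf" "$mark" &&
    pdftk "$tmp/unwatermarked.pdf" output "$dst" compress
    status=$?
    rm -rf "$tmp"
    return "$status"
}

# Example call (not executed here):
#   remove_watermark original.pdf fixed.pdf 'watermarktextstring'
```

Keeping the text step in its own function makes it easy to try on a copy of the uncompressed file before involving pdftk again.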
@Alexander Garden It doesn't work: "TypeError: str() takes at most 1 argument (2 given)" when used following the usage advice given. – 8bitjunkie – 2016-02-28T19:40:45.603
@8bitjunkie Can you open a github issue with a full stack trace? – Alexander Garden – 2016-02-29T20:33:59.253
I was having issues with this approach due to pdftk not being able to open the unwatermarked.pdf file. What did the trick was to replace the watermarktextstring via sed using a replacement string which was just N number of space characters where N is the length of the original watermark. In other words, make sure your uncompressed.pdf and unwatermarked.pdf have the same length – gdecaso – 2017-04-11T17:39:35.033
+1 I used the sed command /watermarktextstring/d instead because my watermark string was interlaced with formatting instructions or typographic hints or something like that. – David Foerster – 2017-10-12T16:12:39.947
@Philippe The second command gives an error: "sed: RE error: illegal byte sequence", what should I do? – Karlo – 2018-03-31T17:46:09.007
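On the "illegal byte sequence" error: it typically shows up with the BSD sed shipped on macOS when it tries to decode the binary PDF as UTF-8 text. A common workaround is to run sed in the C locale so each byte is treated as one character. The wrapper name sed_bytes below is made up for illustration; this is a sketch, not a tested fix for every setup.

```shell
# Hypothetical wrapper: run sed in the C locale so invalid UTF-8
# sequences in the PDF no longer abort the substitution.
sed_bytes() {
    LC_ALL=C sed "$@"
}

# Usage with the file names from the answer above (not executed here):
#   sed_bytes -e "s/watermarktextstring/ /" uncompressed.pdf > unwatermarked.pdf
```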
Since qpdf is the default tool on many distros, here is how to uncompress using qpdf. – akhan – 2018-11-15T17:39:26.603
@Philippe any idea on how to batch remove watermark? – Clain Dsilva – 2018-11-20T12:43:10.437
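The qpdf command that akhan's comment pointed to is not preserved here. One way to get an uncompressed copy with qpdf, which may or may not be what was meant, is its QDF mode. The helper name qpdf_uncompress is made up for illustration, and the sketch assumes qpdf is installed.

```shell
# Hypothetical helper: write an uncompressed, editable copy of a PDF.
# --qdf writes qpdf's editor-friendly layout with uncompressed streams;
# --object-streams=disable keeps objects out of compressed object streams.
# usage: qpdf_uncompress INPUT.pdf OUTPUT.pdf
qpdf_uncompress() {
    local src=$1 dst=$2
    if ! command -v qpdf >/dev/null 2>&1; then
        echo "qpdf not found" >&2
        return 127
    fi
    qpdf --qdf --object-streams=disable "$src" "$dst"
}

# Example call (not executed here):
#   qpdf_uncompress original.pdf uncompressed.pdf
```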
Didn't work to remove watermark added by Master PDF Editor. – fccoelho – 2018-12-27T12:10:21.667
You are a life-saver! Thank you!!! :) – johndodo – 2013-11-07T11:11:19.957
This is really awesome! – qed – 2014-01-29T14:59:44.040
I took this process, made it slightly fancier, and wrapped it up in a Python script. It is on GitHub here. – Alexander Garden – 2014-04-11T04:00:09.533
-2
To remove www.it-ebooks.info:
Open the PDF in Notepad++ or TextPad.
Replace www.it-ebooks.info with nothing (blank).
Save the file.
Open it in the standard Adobe Reader.
Exit; you will be prompted to save the file.
Save it.
Is this a general solution? What is www.it-ebooks.info? – Karlo – 2018-03-31T17:23:03.627
pdftk crashed when I ran this. – Cerin – 2018-09-03T11:28:54.797
@Dingo how do I batch process it? I mean multiple files. – Clain Dsilva – 2018-11-20T13:11:48.067
Multiple files having same text string to replace or different strings for each file? – Dingo – 2018-11-20T15:15:34.560
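For the simpler of the two cases, the same string in every file, a loop over the files is enough. The sketch below assumes bash and pdftk; the function names for_each_pdf and strip_one and the directory names are made up for illustration. The per-file step follows the uncompress, sed, recompress sequence from the answers above.

```shell
#!/bin/bash
# Hypothetical batch helper: run one per-file command on every PDF in a directory.
# usage: for_each_pdf SRC_DIR DST_DIR COMMAND [ARGS...]
# COMMAND is called as: COMMAND [ARGS...] INPUT OUTPUT
for_each_pdf() {
    local src_dir=$1 dst_dir=$2 f
    shift 2
    mkdir -p "$dst_dir" || return 1
    for f in "$src_dir"/*.pdf; do
        [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
        "$@" "$f" "$dst_dir/$(basename "$f")" || echo "failed: $f" >&2
    done
}

# Per-file step: uncompress, blank the string, then repair and recompress.
# usage: strip_one WATERMARK INPUT.pdf OUTPUT.pdf
strip_one() {
    local mark=$1 src=$2 dst=$3 tmp status
    tmp=$(mktemp -d) || return 1
    pdftk "$src" output "$tmp/u.pdf" uncompress &&
    LC_ALL=C sed -e "s/$mark/ /g" "$tmp/u.pdf" > "$tmp/w.pdf" &&
    pdftk "$tmp/w.pdf" output "$dst" compress
    status=$?
    rm -rf "$tmp"
    return "$status"
}

# Example call (not executed here):
#   for_each_pdf ./books ./books-clean strip_one 'watermarktextstring'
```

If each file carries a different string, the same loop could read file and string pairs from a list instead of taking one fixed argument.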
Fantastic! Worked like a charm. Please just change the email address to a fictitious one. I don't want the guy who spoiled the book to be targeted by spammers, especially as he is probably the one who made the PDF. Many thanks. – hnns – 2012-07-12T14:17:39.053
done! Changed specific string with a generic string – None – 2012-07-12T14:28:35.190
Does anyone know how to modify this solution to get rid of a link watermark? I got rid of the text, but there's still a small square left where the text used to be. – 425nesp – 2013-10-20T07:43:04.080