Stop Microsoft Word 2010 from smoothing screenshots?

30

2

When I insert JPEG screenshots into Microsoft Word, it smooths them instead of preserving the original pixels from the bitmap. When I then print to PDF (using Acrobat Distiller), I get either blurry screenshots or hugely bloated files, depending on my downsample settings.

What I want:

I would like Word and Acrobat to leave the bitmaps alone so that they make it through the process with their pixels intact. This is what the original image looks like when you zoom in:

What I want

What I get:

This is what the Word document looks like when you insert the same image and zoom in. When this is printed to PDF, all those extra pixels result in a much larger file.

What I get

Sample files:

  • Test.png (56K) A sample screenshot image file
  • Test.docx (69K) A Word file containing nothing but this image
  • Test.PDF (9.4MB) A PDF file printed from the Word file using Distiller, with all downsampling turned off
  • Test2.PDF (98K) A PDF file generated using Word 2010's "Save as PDF" tool (note the very low quality of the compressed image)

Edit: This is with Word 2010 - I've updated the tags to reflect that.


Edit: I've confirmed that OpenOffice doesn't have this problem. I've opened Test.docx (referenced above) and exported it as a PDF from OO (choosing "lossless compression" under Images in the options), and the image comes through unharmed.

Unfortunately, OpenOffice mangles the formatting on more complex Word documents that I've created; so I can't just create the documents in Word and use OO to render the PDFs; I'd have to switch to OO altogether, which is a bigger step than I'm prepared to take right now.

Herb Caudill

Posted 2011-04-18T02:19:06.513

Reputation: 879

What are you pasting them as? Bitmap, Enhanced Metafile, JPEG, GIF, PNG, Windows Metafile? – Rhys Gibson – 2011-04-18T04:01:13.110

They're JPEG files (saved from Photoshop with maximum quality) inserted into the document using "Insert picture from file". – Herb Caudill – 2011-04-18T12:31:41.653

Have you tried an alternative (non-lossy) file format (e.g. PNG)? If you're lucky, it will be something that Word and Distiller are much less likely to attempt to helpfully re-compress. – DMA57361 – 2011-04-18T14:04:38.553

PNG has the same problems (I just added an example above). – Herb Caudill – 2011-05-04T16:19:35.933

Interesting problem you're having here... the ultimate goal here is to reduce the size of the PDF? – James Mertz – 2011-05-04T18:20:24.180

@KronoS: Yes, I would like for the PDF not to be bloated by a factor of 100 relative to the Word document; I'd also like to preserve the exact pixels in the screenshot, rather than blur them the way Word is doing. – Herb Caudill – 2011-05-05T17:45:15.263

Hi Herb, did you try OpenOffice in the end as I suggested? As far as I can tell it solves both of these issues but let me know if you get any problems. – James P – 2011-05-11T08:24:46.887

@James - I haven't tried it yet - I'm reluctant to abandon Word over this one issue. – Herb Caudill – 2011-05-11T12:50:27.957

OpenOffice can open Word documents natively so you would only need to use it as a PDF exporter. You can still create the files in Word as before but use OpenOffice in place of Distiller. I don't think you could set it up as a virtual printer though. – James P – 2011-05-11T13:08:29.630

@Herb Just in case you've overlooked my comment under my answer: I don't have Word, so I would like to ask you to upload 2 more files: 1) PDF w/ PNG from PDFCreator, 2) PDF w/ my last JPG from Word (using built-in PDF output). With these files I will hopefully be able to solve two still-open mysteries. TIA – przemoc – 2011-05-15T22:02:46.797

FWIW I've posted this question at Microsoft Answers, without much luck so far. http://goo.gl/niN9H – Herb Caudill – 2011-05-19T12:33:59.920

Answers

9

Word probably just renders an upscaled image and sends that to the printer (I presume Distiller works as a printer). That is fine for real printers, but inefficient for fake printers that produce PDF files.

For instance, pdfLaTeX embeds the image properly in its output file. Check my PDF uploaded to the min.us gallery: Embedding image in LaTeX document

What matters is which PDF-producing stack you use. If trying another PDF printer, such as the great and free PDFCreator, does not fix the problem, then you should try a dedicated PDF export, i.e. one that does not work as a printer. AFAIK recent Word versions have PDF export built in, so if it is properly implemented, you will get a small file, because the images used in the document are embedded directly.

HUGE EDIT

Gallery has been renamed to Embedding PNG image in LaTeX vs Word

I've looked more thoroughly at my mytest.pdf generated by pdfLaTeX and your test2.pdf generated by Word.

mytest.pdf test2.pdf

Let's start by uncompressing the files. If you look into an uncompressed file, you'll easily spot the beginning of the image stream (a <<...>>stream line with Width and Height parameters matching test.png, i.e. 176x295), which ends with an endstream line. Peek time.

(WARNING: at this point pdftk is assumed to be version 1.41)

test2.pdf

$ pdftk test2.pdf output test2uc.pdf uncompress
$ sed '\,^<</Width 176[^>]*/Height 295[^>]*>>stream$,!d' test2uc.pdf
<</Width 176/BitsPerComponent 8/Interpolate true/Height 295/Filter[/DCTDecode]/Subtype/Image/Length 20003/ColorSpace/DeviceRGB/Type/XObject>>stream
$ sed '1,\,^<</Width 176[^>]*/Height 295[^>]*>>stream$,d;/^endstream$/,$d' test2uc.pdf > test2stream
$ xxd test2stream | head -10
0000000: ffd8 ffe0 0010 4a46 4946 0001 0101 0048  ......JFIF.....H
0000010: 0048 0000 ffe1 005c 4578 6966 0000 4d4d  .H.....\Exif..MM
0000020: 002a 0000 0008 0004 0302 0002 0000 0016  .*..............
0000030: 0000 003e 5110 0001 0000 0001 0100 0000  ...>Q...........
0000040: 5111 0004 0000 0001 0000 0b13 5112 0004  Q...........Q...
0000050: 0000 0001 0000 0b13 0000 0000 5068 6f74  ............Phot
0000060: 6f73 686f 7020 4943 4320 7072 6f66 696c  oshop ICC profil
0000070: 6500 ffe2 0c58 4943 435f 5052 4f46 494c  e....XICC_PROFIL
0000080: 4500 0101 0000 0c48 4c69 6e6f 0210 0000  E......HLino....
0000090: 6d6e 7472 5247 4220 5859 5a20 07ce 0002  mntrRGB XYZ ....
$ file test2stream 
test2stream: JPEG image data, JFIF standard 1.01

So Word hands over a JPEG instead of a PNG in its internal output for further PDF processing. Just WOW! The same thing may happen when sending output to a printer.

test2stream.jpg

mytest.pdf

$ pdftk mytest.pdf output mytestuc.pdf uncompress
$ sed '\,^<</Width 176[^>]*/Height 295[^>]*>>stream$,!d' mytestuc.pdf
<</Width 176/BitsPerComponent 8/Height 295/Subtype/Image/Length 155760/ColorSpace/DeviceRGB/Type/XObject>>stream
$ sed '1,\,^<</Width 176[^>]*/Height 295[^>]*>>stream$,d;/^endstream$/,$d' mytestuc.pdf > myteststream
$ xxd myteststream | head -10
0000000: ebeb ebea eaea ecec eceb ebeb ebeb ebeb  ................
0000010: ebeb ebeb ebec ecec ebeb ebeb ebeb ebeb  ................
0000020: ebeb ebeb ebeb ebeb ebeb ebeb ebeb ebeb  ................
0000030: ebeb ebea eaea eaea eaec ecec eaea eaec  ................
0000040: ecec ebeb ebec ecec ebeb ebeb ebeb ebeb  ................
0000050: ebeb ebeb ebeb ebeb ebeb ebeb ebeb ebeb  ................
0000060: ebeb ebeb ebeb ebeb ebeb ebeb ebeb ebeb  ................
0000070: ebeb ebeb ebeb ebeb ebeb ebeb ebeb ebeb  ................
0000080: ebea eaea ecec eceb ebeb ebeb ebea eaea  ................
0000090: ebeb ebeb ebeb ebeb ebeb ebeb ebeb ebeb  ................
$ file myteststream 
myteststream: DOS executable (COM)

It's not a COM file, but it's not a PNG either.

$ du -b test.png test2stream myteststream 
57727   test.png
20004   test2stream
155761  myteststream

You see it now? The image stream (of the PNG) in the PDF produced by pdfLaTeX is possibly a simple raw format (176*295*3=155760; the extra 1 byte comes from a superfluous newline). Let's check it:

$ convert -depth 8 -size 176x295 rgb:myteststream myteststream.png

And we have our original image back! No, wait. It looks like pdftk 1.41's uncompression is buggy: the image was almost the same, but with a few flaws. I upgraded to pdftk 1.44, but this version does not decompress the image stream at all. Moreover, it does not output the stream dictionary on one line, so the sed extraction above no longer works, but there is no point in fixing it now.
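The size argument above can be double-checked with shell arithmetic. This is only a sketch: `raw_rgb_bytes` is a made-up helper name, and it assumes 3 components at 8 bits each, as the DeviceRGB and BitsPerComponent 8 entries in the stream dictionary suggest.

```shell
#!/bin/sh
# raw_rgb_bytes WIDTH HEIGHT [COMPONENTS]
# Hypothetical helper: size in bytes of an uncompressed image
# stored with 8 bits per component (3 components by default).
raw_rgb_bytes() {
    echo $(( $1 * $2 * ${3:-3} ))
}

expected=$(raw_rgb_bytes 176 295)
echo "expected raw size: $expected bytes"

# du reported 155761 bytes for myteststream; whatever is left over
# is the trailing newline picked up by the sed extraction.
echo "extra bytes: $(( 155761 - expected ))"
```

If the two numbers differed by more than the stray newline, the stream would most likely still be compressed or stored in some other layout.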

So what can we do about Word? Not much, methinks. At least you can transplant the embedded image from one PDF to another. I uncompressed both PDFs again using the recent pdftk, opened them in vim, replaced the <<...>>stream...endstream block in test2uc.pdf with its counterpart from mytestuc.pdf, saved the result as test2fixuc.pdf, and compressed it to test2fix.pdf.

test2fix.pdf

test.pdf

It would be a sin not to check your big PDF after all. OK, I've prepared another one-liner that works on PDFs uncompressed by pdftk 1.44 and lists the image streams together with the line numbers where their data begins. I'll start by uncompressing test.pdf.

(WARNING: at this point pdftk is assumed to be version 1.44)

$ pdftk test.pdf output testuc.pdf uncompress
$ awk '{if(i)h=h$0} /^[0-9]+ [0-9]+ obj $/{i=1;h=""}/^stream$/{i=0;if(h!~/\/Image/)next;print h,":"NR+1}' testuc.pdf 
<</ColorSpace /DeviceRGB/Subtype /Image/Length 10443804/Width 707/Type /XObject/BitsPerComponent 8/Height 4924>>stream :619
<</ColorSpace /DeviceRGB/Subtype /Image/Length 11264460/Width 953/Type /XObject/BitsPerComponent 8/Height 3940>>stream :12106
<</ColorSpace /DeviceRGB/Subtype /Image/Length 2813256/Width 953/Type /XObject/BitsPerComponent 8/Height 984>>stream :12910
<</ColorSpace /DeviceRGB/Subtype /Image/Length 11264460/Width 953/Type /XObject/BitsPerComponent 8/Height 3940>>stream :18547
<</ColorSpace /DeviceRGB/Subtype /Image/Length 2813256/Width 953/Type /XObject/BitsPerComponent 8/Height 984>>stream :19312
<</ColorSpace /DeviceRGB/Subtype /Image/Length 4845216/Width 328/Type /XObject/BitsPerComponent 8/Height 4924>>stream :19326

Something is really insane here! 6 raw images (apparently this time pdftk had no problem uncompressing them) taking up 43444452 bytes together! Let's recheck test2uc.pdf and mytestuc.pdf.

$ awk '{if(i)h=h$0} /^[0-9]+ [0-9]+ obj $/{i=1;h=""}/^stream$/{i=0;if(h!~/\/Image/)next;print h,":"NR+1}' test2uc.pdf 
<</Width 176/BitsPerComponent 8/Interpolate true/Height 295/Filter /DCTDecode/Subtype /Image/Length 20003/ColorSpace /DeviceRGB/Type /XObject>>stream :113
$ awk '{if(i)h=h$0} /^[0-9]+ [0-9]+ obj $/{i=1;h=""}/^stream$/{i=0;if(h!~/\/Image/)next;print h,":"NR+1}' mytestuc.pdf 
<</DecodeParms <</Colors 3/Columns 176/Predictor 10/BitsPerComponent 8>>/Width 176/BitsPerComponent 8/Height 295/Filter /FlateDecode/Subtype /Image/Length 54954/ColorSpace /DeviceRGB/Type /XObject>>stream :22

In both cases there is only one image stream. Why the heck would there be more of them?!

$ sed '1,618d;/^endstream $/q' testuc.pdf | convert -depth 8 -size 707x4924 rgb:- testuc-stream1.png
$ sed '1,12105d;/^endstream $/q' testuc.pdf | convert -depth 8 -size 953x3940 rgb:- testuc-stream2.png
$ sed '1,12909d;/^endstream $/q' testuc.pdf | convert -depth 8 -size 953x984 rgb:- testuc-stream3.png
$ sed '1,18546d;/^endstream $/q' testuc.pdf | convert -depth 8 -size 953x3940 rgb:- testuc-stream4.png
$ sed '1,19311d;/^endstream $/q' testuc.pdf | convert -depth 8 -size 953x984 rgb:- testuc-stream5.png
$ sed '1,19325d;/^endstream $/q' testuc.pdf | convert -depth 8 -size 328x4924 rgb:- testuc-stream6.png

The image was cut into many pieces... It looks like some kind of utterly stupid protection, maybe introduced by Distiller (and maybe it can be turned off)? I doubt PDFCreator would spit out the same thing, unless it's Word that performs this unbelievable insanity...

testuc-stream1.png and others (use right arrow to navigate)
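The six stream dictionaries listed above can also be checked mechanically: each Length should equal Width*Height*3 if the stream really is raw 8-bit RGB, and the lengths should add up to the 43444452 bytes mentioned earlier. A rough sketch, where `check_raw_streams` is a hypothetical helper that reads dictionary lines (as printed by the awk one-liner) on standard input:

```shell
#!/bin/sh
# check_raw_streams: hypothetical helper. For every stream dictionary line
# on stdin, compare /Length with Width*Height*3 and sum up all lengths.
check_raw_streams() {
    awk '
    {
        len = w = h = 0
        # Take the digits that follow each key ("/Length " and "/Height "
        # are 8 characters long, "/Width " is 7).
        if (match($0, /\/Length [0-9]+/)) len = substr($0, RSTART + 8, RLENGTH - 8) + 0
        if (match($0, /\/Width [0-9]+/))  w   = substr($0, RSTART + 7, RLENGTH - 7) + 0
        if (match($0, /\/Height [0-9]+/)) h   = substr($0, RSTART + 8, RLENGTH - 8) + 0
        raw = w * h * 3
        printf "%dx%d length=%d raw=%d %s\n", w, h, len, raw, (len == raw ? "raw" : "encoded")
        total += len
    }
    END { printf "total=%d\n", total }'
}

check_raw_streams <<'EOF'
<</ColorSpace /DeviceRGB/Subtype /Image/Length 10443804/Width 707/Type /XObject/BitsPerComponent 8/Height 4924>>stream :619
<</ColorSpace /DeviceRGB/Subtype /Image/Length 11264460/Width 953/Type /XObject/BitsPerComponent 8/Height 3940>>stream :12106
<</ColorSpace /DeviceRGB/Subtype /Image/Length 2813256/Width 953/Type /XObject/BitsPerComponent 8/Height 984>>stream :12910
<</ColorSpace /DeviceRGB/Subtype /Image/Length 11264460/Width 953/Type /XObject/BitsPerComponent 8/Height 3940>>stream :18547
<</ColorSpace /DeviceRGB/Subtype /Image/Length 2813256/Width 953/Type /XObject/BitsPerComponent 8/Height 984>>stream :19312
<</ColorSpace /DeviceRGB/Subtype /Image/Length 4845216/Width 328/Type /XObject/BitsPerComponent 8/Height 4924>>stream :19326
EOF
```

A stream that still carries a /Filter entry such as DCTDecode or FlateDecode would be reported as encoded, because its Length no longer matches the raw pixel count.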

Conclusion

Important things are:

  • you can clearly see that the huge image that was cut into pieces is actually an upscaled JPEG, so my hypothesis was correct,
  • because PDFCreator also gives you a huge output file, it is Word that hands an awfully big image to the fake PDF printer, so my earlier supposition was also correct.

Phew. This investigation took some time. Word is a piece of junk.

Workarounds?

In the meantime some suggestions were given. Let me comment them.

Using a word processor with decent PDF support like LibreOffice (forget about OpenOffice, it's obsolete now) is a good solution, unless some incompatibilities prevent you from working with it.

Using a bigger image in the same box on the page is also not a bad idea, because even after JPEG-izing, artifacts will be less visible.

My two cents, though, is to use JPEG from the beginning. That way Word shouldn't recompress it (you never know...) and you can provide the highest possible JPEG quality. There is also lossless JPEG compression. Developers from Redmond presumably thought it wasn't needed, so I won't be surprised if Word doesn't handle such JPEGs. Well, TBH it's not widely supported (even in the open source world), much like arithmetic coding (where the situation is even worse).

convert test.png -quality 100 -resize $((100*300/72))% test-300dpi-mitchell.jpg
convert test.png -quality 100 -filter box -resize $((100*300/72))% test-300dpi-box.jpg
convert test.png -quality 100 test.jpg

(On Windows, use 416 instead of the $(()) arithmetic expansion, which is available in POSIX shells)
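For reference, this is where the 416 comes from. The sketch below only repeats the arithmetic of the commands above; `scale_percent` is a hypothetical helper name, and 72 is the source resolution assumed by those commands.

```shell
#!/bin/sh
# scale_percent TARGET_DPI SOURCE_DPI
# Hypothetical helper: resize percentage for going from SOURCE_DPI to TARGET_DPI.
# Shell arithmetic is integer-only, so 100*300/72 (416.66...) is truncated.
scale_percent() {
    echo $(( 100 * $1 / $2 ))
}

pct=$(scale_percent 300 72)
echo "resize by ${pct}%"

# Approximate pixel size of the 176x295 screenshot after scaling
# (ImageMagick may round slightly differently).
echo "roughly $(( 176 * pct / 100 ))x$(( 295 * pct / 100 )) pixels"
```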

I think the default Mitchell filter is a good one for upscaling, but if you really want such a pixelated image, then go with Box as @ceving suggested. Of course, the first 2 files are useful only if you must (for some reason) use fake PDF printers.

I've uploaded all three files.

test-300dpi-mitchell.jpg (426 KB) test-300dpi-box.jpg (581 KB) test.jpg (74 KB)

If my hypothesis is right and Word won't recompress a JPEG image, then just use the last one, which is not upscaled, and go with the built-in PDF output, because it has fewer shortcomings (at least it avoids the needless upscale).

przemoc

Posted 2011-04-18T02:19:06.513

Reputation: 2 176

Thanks, @przemoc. I tried PDFCreator and I get the same results as with Distiller (the images are blurred just the way they are in Word, and if I turn off compression, I get a huge file). I also tried "save as PDF" and I get a highly compressed version of the image, with lots of JPEG artifacts (although the original image was PNG and I have image compression turned off as @nihcap suggested). I'll upload the result. – Herb Caudill – 2011-05-09T14:23:06.110

@Herb Important update. I haven't solved your problem, but I shed some light on it and it should be interesting reading. At least I hope so... – przemoc – 2011-05-10T19:03:34.420

tl;dr I've provided one 100% working workaround and one possibly working one. 1) The 100% working one is transplanting images from a PDF with properly embedded, losslessly compressed raw images (generated by LibreOffice or pdfLaTeX) into your PDF generated by Word (avoid using fake PDF printers!). Unfortunately it can be tiresome if you have many images. 2) Assuming that Word won't recompress a JPEG image for its internal output, use JPEG with the highest possible quality, e.g. produced by convert from ImageMagick or by XnView. In this case you lose some quality, but at least the loss is controllable. – przemoc – 2011-05-10T22:02:12.593

I'm awarding the bounty because of the incredible amount of research you've done. Of course my problem remains unsolved, but it seems nothing can be done - it looks like Microsoft took a huge step backwards in image handling between 2007 and 2010. – Herb Caudill – 2011-05-11T12:45:13.027

@Herb Thanks. Actually I've never looked into PDFs at the internal level before, so this investigation was interesting and informative (and I ran into some new problems that I will have to resolve for my own pleasure of understanding things). I'm a bit sad that I couldn't give you a more satisfactory result, i.e. a real solution instead of some grasp of what's going on, who's guilty, and workarounds. But let's not close the case yet; there are some open questions here. I don't have Word, so I have to ask you to upload 2 more files: 1) PDF w/ PNG from PDFCreator, 2) PDF w/ my last JPG from Word. TIA – przemoc – 2011-05-11T13:32:55.253

7

Open File > Options > Advanced, then in the Image size and quality section, check the option Do not compress images in files (see the screen capture for where this option is located)
Word settings

The following image is the same JPG image (document capture 400% zoomed in to show anti-aliasing difference) inserted before and after activating that option:
enter image description here

Francisco Alvarado

Posted 2011-04-18T02:19:06.513

Reputation: 311

Any idea where this setting can be found in Word 2007? – dimo414 – 2011-05-03T08:33:55.070

I actually misstated the problem in my original post - it's not that Word is compressing or anti-aliasing the image, it's that it's smoothing it rather than showing the original pixels. I've tried the setting that you point out here, but it's still smoothing the image, which results in bloated PDF output. – Herb Caudill – 2011-05-04T16:10:46.240

@dimo414 Click the Office button then Settings, other steps should be the same. – nyuszika7h – 2011-05-04T16:11:52.133

@Nyuszika7H I go to Office > Word Options > Advanced but I do not see a "Image size and quality" section, nor can I find anywhere in the settings to "not compress images in files" – dimo414 – 2011-05-04T19:36:17.627

This is a new Word 2010 option. – harrymc – 2011-05-05T08:00:47.153

I feel like I should clarify since this is getting so many up votes - this is a good setting to know about, but it doesn't affect the issue I'm having at all. – Herb Caudill – 2011-05-11T12:42:48.363

1

It looks like Microsoft Word's zoom feature uses bilinear filtering. This should not change the image itself, but only how it is displayed at magnifications other than 100%. What you want is nearest neighbor scaling, but I doubt MS Word has an option for that.
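To illustrate the difference between the two scaling methods, here is a one-dimensional sketch in shell and awk. It is not a description of what Word does internally; `upscale_row` is a hypothetical helper that stretches one row of grey values either by repeating pixels (nearest neighbor) or by blending linearly toward the next pixel, which is the 1-D analogue of bilinear filtering.

```shell
#!/bin/sh
# upscale_row FACTOR MODE VALUE...
# Hypothetical helper: enlarge one row of pixel values by an integer FACTOR.
# MODE "nearest" repeats each pixel; any other mode blends linearly
# toward the next pixel (the last pixel has no neighbor and is repeated).
upscale_row() {
    factor=$1; mode=$2; shift 2
    printf '%s\n' "$*" | awk -v f="$factor" -v mode="$mode" '
    {
        out = ""
        for (i = 1; i <= NF; i++) {
            for (k = 0; k < f; k++) {
                if (mode == "nearest" || i == NF)
                    v = $i                               # copy the source pixel unchanged
                else
                    v = $i + ($(i + 1) - $i) * k / f     # blend toward the next pixel
                out = out (out == "" ? "" : " ") int(v + 0.5)
            }
        }
        print out
    }'
}

# A hard black-to-white edge, enlarged 4x.
upscale_row 4 nearest 0 255
upscale_row 4 linear  0 255
```

With nearest neighbor the edge stays a single jump from 0 to 255, while the linear variant introduces intermediate grey values, which is the blur visible in the zoomed screenshot.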

trough

Posted 2011-04-18T02:19:06.513

Reputation: 11

0

I have repeated the manipulation of inserting Test.png into a document in Word 2007, and found to my astonishment that the result depends upon the mechanism that one uses.

If one uses Insert / Picture then the picture is smoothed.
But if one enters an image-editor and does copy, then paste into Word, then the image is not smoothed.

Other possible workarounds are :

  1. Try using Paste Special as Bitmap or Device independent bitmap.
  2. Do not paste images. Use Insert tab / Illustrations group / Picture command and change the "Insert" drop-down button to "Link to file". The image file can be optimized for the Web to take less space.

harrymc

Posted 2011-04-18T02:19:06.513

Reputation: 306 093

Hmm - that hasn't been my experience. When I paste from MS Paint or Photoshop, I get the same smoothing as if I used Insert Picture from File. I'm using Word 2010, I wonder if that's the difference - I don't remember having this problem when I used Word 2007. – Herb Caudill – 2011-05-04T17:40:04.557

Hmm, is this a new Word 2010 "feature" ? Maybe the Microsofties realized that paste wasn't doing the "right" thing and "fixed" it in Word 2010. – harrymc – 2011-05-04T18:43:47.023

Regarding the screenshot pixelation, you can use Vista's snipping tool and save the file as a PNG which will prevent that noise. – dimo414 – 2011-05-04T19:37:57.593

@dimo414: Thanks, a very useful hint. – harrymc – 2011-05-05T07:35:40.660

@Herb Caudill: What happens if you try to insert the picture into a .doc, while ensuring that Word options / Advanced / Compatibility is "Word 2003" ? – harrymc – 2011-05-05T08:01:03.157

@harrymc - I've tried saving as different file formats (Word 2003, XML, etc) - doesn't make any difference. – Herb Caudill – 2011-05-05T18:03:36.550

I meant pasting into a .doc rather than saving as .doc – harrymc – 2011-05-05T20:08:00.373

@harrymc - Yeah, no difference, unfortunately. – Herb Caudill – 2011-05-06T01:30:48.757

Well, the only (lousy) advice I can think of is to downgrade to Word 2007 and use paste. Both 2007 and 2010 can be installed side-by-side, although passing from one to the other may require waiting. – harrymc – 2011-05-06T08:21:10.270

Added a couple more possibilities. – harrymc – 2011-05-09T08:10:52.190

@harrymc - I've tried all the "Paste Special" options with no difference in the results. I've also tried inserting the picture as a file, with the various options (Insert, Link, Insert and Link) - no difference. – Herb Caudill – 2011-05-09T14:29:51.163

Are you sure that it is the image that changes, rather than Word 2010 only beautifying the display ? – harrymc – 2011-05-09T14:38:36.280

It appears to store the original image unchanged, but that doesn't help me since I don't have any way to generate a clean PDF from it. – Herb Caudill – 2011-05-11T12:49:22.090

0

The easiest solution is probably to scale the original images to 300dpi, or whatever resolution you use during your PDF export. ImageMagick's convert program can do this, for example.

The original image is 176 pixels wide. If you want to scale it to 4 inches at 300dpi, the target width is 1200 pixels. This will do it:

convert test.png -filter Box -resize 1200 test_300dpi.png
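The target width is just inches times dpi. A small sketch of that arithmetic, with `target_width` as a hypothetical helper name; it also prints the resulting scale factor, which is not a whole number here, so with the Box filter the enlarged source pixels will presumably not all come out the same width.

```shell
#!/bin/sh
# target_width INCHES DPI
# Hypothetical helper: pixel width needed to print INCHES wide at DPI.
target_width() {
    echo $(( $1 * $2 ))
}

w=$(target_width 4 300)
echo "target width: $w px"

# Scale factor relative to the 176-pixel-wide original
# (awk is used because shell arithmetic is integer-only).
awk -v t="$w" -v s=176 'BEGIN { printf "scale factor: %.2f\n", t / s }'
```

If evenly sized pixel blocks matter, one option is to pick a width that is a whole multiple of 176 and let the print size differ slightly from 4 inches.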

In my experience it is always better to prevent Microsoft products from trying to guess what might be good for you, and to decide for yourself.

ceving

Posted 2011-04-18T02:19:06.513

Reputation: 1 737

I believe that PDF files have a native embedded image resolution, so if you resize your images to match then Word might be able to skip a step. Unfortunately I fear the resulting PDF size might be unreasonable. – Mark Ransom – 2011-05-10T21:00:55.627

-1

Correct me if this comment is too obvious or not relevant:

When I paste a crisp image of, say, a page of text (I tested .bmp and .png) into a Word 2010 document (.docx), the result is a blurry version of the original. This is due to automatic resizing and image processing done by Word, seemingly regardless of the relevant settings in "Options". However, if I then

  1. select the image
  2. go to the ribbon header "Format"
  3. select the little icon in the leftmost area of the ribbon which looks like a little image with an "undo" arrow
  4. pull down the associated menu
  5. select the lower item called "Revert Image and Size" (that was a loose translation from German),

then the crisp image I pasted reappears in place of the blurred one.


Note: if I use the built-in photograph tool in Acrobat Reader, then paste directly to Word, the above does not work. I need to take a screenshot of the whole screen or go via IrfanView.

KUK

Posted 2011-04-18T02:19:06.513

Reputation: 1

-1

This question is similar to this one

It has to do with the wrapping style ... set it to the top and bottom only. Read here.

pcunite

Posted 2011-04-18T02:19:06.513

Reputation: 1 126

No, changing the wrapping options doesn't make a difference. – Herb Caudill – 2011-05-09T14:22:37.733

@Herb, this issue may require a code change on MS part. I miss Outlook 2003 where you could do an insert image and expect it to be viewable as expected. – pcunite – 2011-05-10T02:15:49.283