11

I have noticed only now that my Word 2010 (docx) documents that are just a single page long and include a simple WMF vector graphic and a bit of text become almost 1 MB large when saved as PDF. The Word document itself is only 50 kB, and a PDF file created with the Bullzip PDF printer is about the same size. So what is Microsoft writing into the other 950 kB?

Update: Since I keep getting answers that do not apply, I'd like to save you the work. The issue went away after I switched from Windows XP to Windows 7 (which I did over a year ago). Something doesn't seem to be supported on the old system; I suspect it's font subsetting or something similar. I also cannot try your suggestions because the issue no longer exists, so I'm not able to accept answers to this.

ygoe

6 Answers

3

This is still a problem with Word 2016. Perhaps not the same as the OP had, but it's still there: start with a 1 page 20 KB document, save as PDF, get a 300 KB PDF.

I can't say why Word does this, but there is an easy way to shrink these PDF files: install Ghostscript, then run the following command:

gswin64c.exe -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH "-sOutputFile=%2" "%1"

where %1 is the input PDF and %2 is the output PDF. Turns that 300 KB PDF into a 40 KB PDF. Still not as small as CutePDF (that one managed about 30 KB for the same document) but a vast improvement.
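If there are many PDFs to shrink, the command above can be wrapped in a small script. The following is only a sketch: the flags are copied from the command as given, while the function names, the ".small.pdf" output naming, and the assumption that gswin64c.exe is on the PATH are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_gs_command(input_pdf, output_pdf, gs_exe="gswin64c.exe"):
    """Build the Ghostscript argument list quoted in the answer.

    gs_exe is an assumption: it must be on PATH or be a full path.
    """
    return [
        gs_exe,
        "-sDEVICE=pdfwrite",
        "-dCompatibilityLevel=1.4",
        "-dPDFSETTINGS=/screen",
        "-dNOPAUSE",
        "-dQUIET",
        "-dBATCH",
        f"-sOutputFile={output_pdf}",
        str(input_pdf),
    ]


def shrink_folder(folder, gs_exe="gswin64c.exe"):
    """Write NAME.small.pdf next to every NAME.pdf in folder (hypothetical naming)."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        if pdf.name.endswith(".small.pdf"):
            continue  # skip files produced by an earlier run
        out = pdf.with_name(pdf.stem + ".small.pdf")
        subprocess.run(build_gs_command(pdf, out, gs_exe), check=True)
        print(pdf.name, pdf.stat().st_size, "->", out.stat().st_size, "bytes")
```

Calling shrink_folder with the folder that holds the Word-generated PDFs prints the size before and after for each file, which makes it easy to see whether the /screen preset is worth it for a given document.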

Or just skip this step and print to CutePDF directly.

RomanSt
  • 9,959
1

Many reasons.

  1. XML Styling
  2. Images converted to base64, which is 33% larger than the original
  3. Other stuff like fonts etc...
  4. A lot of stuff that seemingly doesn't do anything!
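The 33% figure in item 2 is the generic overhead of base64, which turns every 3 input bytes into 4 output characters. Whether Word really stores images this way is the claim made in this answer and is not verified here; the sketch below only checks the arithmetic. The function name and the sample sizes are arbitrary.

```python
import base64
import os


def base64_overhead(n_bytes):
    """Return encoded size divided by raw size for n_bytes of random data."""
    raw = os.urandom(n_bytes)
    encoded = base64.b64encode(raw)
    return len(encoded) / len(raw)


# Sizes chosen arbitrarily; the ratio approaches 4/3 as the input grows.
for size in (3_000, 50_000, 1_000_000):
    print(size, round(base64_overhead(size), 4))
```

By the same arithmetic, base64 alone would turn a 50 kB image into roughly 67 kB, so it cannot account for a jump to almost 1 MB by itself.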
Nobody
1

Check your options settings in Word 2010. You may be instructing Word to embed one or several entire fonts into your document. This causes terrible document bloat, especially if you are using Unicode fonts. Uncheck that option if it is checked, and Word will embed only the characters that are actually used within your document.

You should also be aware that *.docx is a compressed file format. Its content has to be decompressed before it can be converted to a PDF file, which adds to the size.

If this does not work for you, there are several PDF optimizing tools that are available through Adobe and Nuance.

Hope this helps.

0

Thought: Word is converting the vector graphic into a bitmap or PNG and embedding it in the document with limited or no compression. Check the PDF settings and see if you can adjust that.

Analysis: One way to check that is to change the file extension of the Word file to .ZIP, and see for yourself what Word is doing!
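The same check can be done without renaming anything, because a .docx file is a ZIP archive that Python's zipfile module opens directly. This is a minimal sketch; the function names are made up, and the file name in the usage note is a placeholder.

```python
import zipfile


def list_zip_entries(path):
    """Return (name, compressed_size, original_size) tuples, largest first."""
    with zipfile.ZipFile(path) as archive:
        entries = [
            (info.filename, info.compress_size, info.file_size)
            for info in archive.infolist()
        ]
    # Sort by the space each entry takes up inside the archive.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def print_report(path):
    """Print one line per archive entry: packed size, unpacked size, name."""
    for name, packed, unpacked in list_zip_entries(path):
        print(f"{packed:>10} {unpacked:>10}  {name}")
```

Calling print_report("report.docx") lists the parts of the document by size. If Word has rasterized the vector graphic, a large image should show up near the top, typically under word/media/.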

Joshua
0

This is because the formatting of the PDF document will contain styles for (probably) each character. I did a similar export to HTML once, and what should have been a 20 KB HTML file came out as 600 KB.

0

Use software that is designed for a specific purpose. Word is good at creating Word documents, and because a lot of other software suites offer PDF export, MS can't leave it out. I don't really see why they would choose to spend a lot of time and effort optimizing something that most people don't even use or care much about. The people who do care don't use Word for PDF printing.

You should look into installing a dedicated PDF printer on your computer and using the PRINT function to create a PDF file. There are many free and commercial packages available that do a perfect job and keep your PDF file compressed to a minimum.

WHY exactly Word creates such huge PDF files is something you had better ask the MS engineers on their forums... only they can tell. Here you'll just get a lot of guesses as to why MS does things the way they do.

Jakke