9

I have extracted a docx file (say file.docx) to file/ using 7-Zip (or, alternatively, changed the extension to .zip and then extracted it).

I modified some images inside file\word\media\ and now I want to recompile/rebundle/repackage the folder into a docx file. I am using MS Office 2007.

How can I do that?

  • I tried zipping it back and then changing the extension to docx, but that didn't work: MS Word showed a corrupted-file error, its attempt to restore/repair the file failed, and it then exited with an error.

As for the XY problem: I am trying to edit all the selected images in the docx via some manipulation (e.g. with magick and thereafter in GIMP).

This question is similar in spirit to:

3 Answers

13

To correctly re-zip an unzipped .docx file, follow these rules:

  • Replace the files in their original folder
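  • Zip the contents of the folder, not the folder itself (a common error): [Content_Types].xml must sit at the root of the archive
  • Use plain Zip (DEFLATE) compression. Other methods such as DEFLATE64, LZMA or BZip2 are not accepted by Word.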
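The rules above can be sketched in Python using only the standard library. This is a minimal sketch, not the answerer's own method: the function name `repack_docx` and the example paths are hypothetical, and it assumes the output file is written outside the extracted folder so that it does not end up inside its own archive. `zipfile.ZIP_DEFLATED` selects plain DEFLATE.

```python
import zipfile
from pathlib import Path


def repack_docx(src_dir: str, out_path: str) -> None:
    """Zip the *contents* of src_dir (not the folder itself) into out_path.

    Hypothetical helper: out_path should live outside src_dir.
    """
    root = Path(src_dir)
    # ZIP_DEFLATED is plain DEFLATE (not DEFLATE64)
    with zipfile.ZipFile(out_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Store names relative to the folder, with forward slashes,
                # so [Content_Types].xml ends up at the archive root.
                zf.write(path, path.relative_to(root).as_posix())


if __name__ == "__main__":
    # Example paths are assumptions taken from the question's naming.
    repack_docx("file", "file_new.docx")
```

Writing to a new file name (here `file_new.docx`) keeps the original document untouched in case Word still rejects the result.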
harrymc
3

I would clone the backup copy (you did make a backup, right?) and replace the files you modified directly in the DOCX "archive" instead of trying to fiddle with the extracted contents.

If you renamed your DOCX file as a ZIP, Windows Explorer can open it like a folder and you can do a limited level of editing directly with the archive.
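The same idea (swap the modified file into a copy of the original archive rather than rebuilding it from an extracted folder) can be scripted. The sketch below is an assumption-laden illustration: `replace_member` and the example file names are hypothetical. Python's `zipfile` module cannot overwrite a member in place, so the function writes a new archive and copies every other member across unchanged.

```python
import zipfile


def replace_member(src_docx: str, dst_docx: str, member: str, new_bytes: bytes) -> None:
    """Copy src_docx to dst_docx, swapping the bytes of one member.

    Hypothetical helper; raises KeyError if `member` is not in the archive.
    """
    with zipfile.ZipFile(src_docx) as zin:
        zin.getinfo(member)  # fail early if the member name is wrong
        with zipfile.ZipFile(dst_docx, "w", compression=zipfile.ZIP_DEFLATED) as zout:
            for item in zin.infolist():
                if item.is_dir():
                    continue  # explicit directory entries are not needed
                data = new_bytes if item.filename == member else zin.read(item.filename)
                zout.writestr(item.filename, data)


if __name__ == "__main__":
    # File names below are examples only.
    with open("image1_edited.png", "rb") as fh:
        replace_member("file.docx", "file_edited.docx", "word/media/image1.png", fh.read())
```

Member names inside the archive use forward slashes (`word/media/image1.png`), even on Windows.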

Nelson
2

From: https://deparkes.co.uk/2016/12/23/how-word-files-store-images/

You may also need to update various XML files in the 'ZIP'. Excerpts from the page referred to:

"The docx xml stores properties about the images.

Within the ‘word’ folder in the unzipped docx document you’ll find document.xml which contains the structure of the document. Open this file up and you’ll see a series of ‘p’ – paragraph – elements which make up your document."


In the example document there are three p elements: one for each line of text and one for the image. In this case the image is the second of the paragraphs. We can explore the branches of this tree until we find “r:id” under “v:imagedata”.
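In the Office Open XML package format, an “r:id” value is a relationship ID that is resolved through word/_rels/document.xml.rels, which maps each ID to a target such as media/image1.png. The following is a minimal sketch for listing those mappings, so you can check that every image you edited is still referenced under the same name; the function name `image_relationships` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the package relationship (.rels) parts
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def image_relationships(docx_path: str) -> dict:
    """Return {relationship id: target path} for image relationships.

    Hypothetical helper; targets are relative to the word/ folder.
    """
    with zipfile.ZipFile(docx_path) as zf:
        root = ET.fromstring(zf.read("word/_rels/document.xml.rels"))
    return {
        rel.get("Id"): rel.get("Target")
        for rel in root.iter(RELS_NS + "Relationship")
        if rel.get("Type", "").endswith("/image")
    }


if __name__ == "__main__":
    # "file.docx" is the example name used in the question.
    for rel_id, target in image_relationships("file.docx").items():
        print(rel_id, "->", target)
```

If an edited image keeps its original file name and format, these entries should not need to change.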


Possibly the hex values are checksums? If your modifications change, for example, the image dimensions or those checksums, that may be what is causing the errors when Word opens the recompressed DOCX file.

I'll also include the URL to the video the page refers to: https://youtu.be/p9MqsEIHFXE