How can I easily remove all comments and annotations (added with Foxit Reader) from all the PDFs in a folder?
7 Answers
I just came across this problem, and none of the answers given here worked for me. What did work was the rewritepdf tool from the Ubuntu package libcam-pdf-perl:
rewritepdf -C in.pdf out.pdf
Wrapping this in a short loop to remove annotations from all PDF files in a directory is now easy:
for i in *.pdf; do rewritepdf -C "$i" "$i.new"; done
As usual, you can install libcam-pdf-perl via the Software Center or with sudo apt install libcam-pdf-perl.
I haven't tested it a great deal, but the following seems to work. It deletes all annotations except internal document links (which none of the other answers here seem to preserve). This script depends on the pdfrw Python library.
#!/usr/bin/env python3
import sys

import pdfrw

try:
    in_path = sys.argv[1]
    out = sys.argv[2]
except IndexError:
    print("Usage:\tannotclean IN.pdf OUT.pdf")
    sys.exit(1)

reader = pdfrw.PdfReader(in_path)
for p in reader.pages:
    if p.Annots:
        # Keep only link annotations; drop every other subtype.
        # See PDF reference, Sec. 12.5.6 for all annotation types
        p.Annots = [a for a in p.Annots if a.Subtype == "/Link"]
pdfrw.PdfWriter(out, trailer=reader).write()
Usage:
- Save as a script somewhere (I assume in your PATH), e.g. /usr/local/bin/annotclean.
- annotclean in.pdf cleaned.pdf
- (optional) batch processing:
# fish shell syntax
for p in **pdf # pdfs from current directory and subdirectories
    annotclean $p $p.new
    mv $p.new $p # overwrite the old
end
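Since the script is already Python, the batch step could also be done in Python rather than fish. Below is a minimal sketch, assuming pdfrw is installed. The names keep_only_links and clean_folder and the "_clean" suffix are my own choices, not part of pdfrw, and unlike the fish loop it writes cleaned copies instead of overwriting the originals:

```python
from pathlib import Path


def keep_only_links(pages):
    """Drop every annotation except /Link ones; return how many were removed."""
    removed = 0
    for page in pages:
        annots = page.Annots
        if annots:
            links = [a for a in annots if a.Subtype == "/Link"]
            removed += len(annots) - len(links)
            page.Annots = links
    return removed


def clean_folder(folder, suffix="_clean"):
    """Write a cleaned copy next to each PDF under `folder` (recursive)."""
    import pdfrw  # imported lazily so keep_only_links works without pdfrw

    # sorted() materialises the file list before any output is written,
    # so freshly written copies are not picked up during the same run.
    for src in sorted(Path(folder).rglob("*.pdf")):
        if src.stem.endswith(suffix):
            continue  # skip copies produced by an earlier run
        reader = pdfrw.PdfReader(str(src))
        removed = keep_only_links(reader.pages)
        dst = src.with_name(src.stem + suffix + ".pdf")
        pdfrw.PdfWriter(str(dst), trailer=reader).write()
        print(f"{src}: removed {removed} annotation(s) -> {dst}")


# Example call (hypothetical path):
# clean_folder("path/to/pdfs")
```

Keeping the originals untouched makes it easy to compare before and after. Once you trust the result you can move the copies over the originals.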
Provided you're on a Unix system:
cd <directory containing PDFs>
find . -type f -name '*.pdf' -exec perl -pi -e 's:/Annots \[[^]]+\]::g' {} +
This is a hack that removes all /Annots commands from the PDF (the commands that draw the annotations). It leaves the annotation objects in the file (you can open the PDF with a text editor and search for them); they're just not drawn.
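To see what that substitution does, here is the same pattern applied in Python to a made-up, uncompressed page dictionary. This is only an illustration: the sample bytes and the function name strip_annots_refs are invented for the example.

```python
import re

# Same pattern as the Perl one-liner: "/Annots [" ... "]" with no "]" inside.
ANNOTS_ARRAY = re.compile(rb"/Annots \[[^]]+\]")


def strip_annots_refs(pdf_bytes):
    """Remove inline /Annots arrays from raw PDF bytes; leave the rest untouched."""
    return ANNOTS_ARRAY.sub(b"", pdf_bytes)


# A made-up page dictionary, as it might appear in an uncompressed PDF.
sample = b"<< /Type /Page /Annots [12 0 R 13 0 R] /Contents 5 0 R >>"
print(strip_annots_refs(sample))
```

Note what the pattern does not catch: an /Annots entry that points to an indirect array (e.g. /Annots 7 0 R) has no brackets and is left alone, and page dictionaries stored inside compressed object streams are not visible as plain text at all. Also, perl -p works line by line, whereas this sketch runs over the whole buffer, so it would additionally match an array that spans several lines.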
A fast (in-memory) way is to use pdftk and sed as in the following shell pipeline:
pdftk in.pdf output - uncompress |\
sed '/^\/Annots/d' |\
pdftk - output out.pdf compress
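The sed step in the middle is just a line filter. A rough Python equivalent, assuming (as the sed pattern implies) that the uncompressed pdftk output puts each /Annots key at the start of its own line, could look like this. The sample lines and the name drop_annots_lines are invented for illustration:

```python
def drop_annots_lines(lines):
    # Mimics the sed filter above: delete every line that starts with /Annots,
    # pass all other lines through unchanged.
    return [line for line in lines if not line.startswith(b"/Annots")]


# Made-up lines in the style of an uncompressed page dictionary.
sample = [
    b"<<",
    b"/Annots [12 0 R 13 0 R]",
    b"/Contents 5 0 R",
    b"/Type /Page",
    b">>",
]
print(drop_annots_lines(sample))
```

Like the sed version, this only removes the key that tells the viewer which annotations belong to the page. Whether the now-unreferenced annotation objects survive depends on what the second pdftk pass does with them.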
I think you can do that most easily by "refrying" the PDF. Refrying means: first convert the file to PostScript, then convert the PostScript back to PDF. Refrying is usually frowned upon, because you usually lose quality and some content. In your case, losing that content is exactly what you want. The refrying can be done with Ghostscript (and the helper batch files shipping with it; download gs900w32.exe if you are on Windows), so here you go, with 2 easy commands:
pdf2ps.bat input.pdf output.ps
ps2pdf.bat output.ps input_refried.pdf
OK, you said you'd also consider a commercial solution....
I'd recommend you try callas pdfToolbox. It's available for Windows and Mac OS X. (They have a CLI for Linux as well, but you can only use pre-configured "profiles" with it. With the Windows GUI, you can create your custom profiles and re-use them with the Linux CLI, though.)
The pdfToolbox has lots and lots and lots of ways to manipulate and fix many, many individual PDF problems.
One of the "Fixups" is to remove all annotations.
You don't need to shell out any money to test it first; callas gives out 14-day trial licenses for free.
BatchPurifier can remove all comments and annotations from all PDFs in a folder at once.
It's a paid app for Windows only, with a wizard-like GUI. It comes with a trivial installer for setting it up. To use it, you just need to select the files to process, select the hidden data types to remove and specify output options.