I tried several digital PDF signing utilities based on iText v1 or v2 and found that they appear to load the whole PDF into memory (for a 60M PDF the process can take up to 300-400MB of memory).
Can recent iText versions sign a PDF without loading it into memory?
Updates
I tested Bruno's example with itextpdf 5.5.6:
- The PdfReader constructor doesn't matter: it can be (src), (src, null, true), or (src, null, false), and the result is the same.
- What matters is passing new File(tmp) to createSignature.
But memory consumption is still too big. I tried to sign a 100M file (a PDF with an embedded attachment), and peak memory was about 325M. Sure, that's better than the 540M without a temporary file, but not good enough (((.
With a 32K file, max. memory was 65M (that's the JVM and the Java code itself, I guess).
Memory was measured with /usr/bin/time -v java ....
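For anyone who wants to cross-check the numbers from inside the JVM, here is a minimal sketch using only the standard java.lang.management API. PeakHeap and peakHeapBytes are names I made up for illustration. It reports heap only, so it will not match the resident set size that /usr/bin/time prints, and summing per-pool peaks is an upper bound rather than an exact figure.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

public class PeakHeap {
    // Sums the peak usage of all heap memory pools.
    // The pools may peak at different moments, so this is an upper
    // bound on the true simultaneous heap peak, not an exact value.
    public static long peakHeapBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage peak = pool.getPeakUsage();
            if (peak != null) {
                total += peak.getUsed();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Stand-in for the signing call: allocate something noticeable.
        byte[] block = new byte[32 * 1024 * 1024];
        block[0] = 1;
        System.out.println("peak heap (bytes, upper bound): " + peakHeapBytes());
    }
}
```

In a real test the allocation in main would be replaced by the signing code, with peakHeapBytes() printed just before exit.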
I limited the Java heap with -Xmx100m, but then it crashed with an out-of-memory error:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at com.itextpdf.text.pdf.PdfReader.getStreamBytesRaw(PdfReader.java:2576)
    at com.itextpdf.text.pdf.PdfReader.getStreamBytesRaw(PdfReader.java:2615)
    at com.itextpdf.text.pdf.PRStream.toPdf(PRStream.java:230)
    at com.itextpdf.text.pdf.PdfIndirectObject.writeTo(PdfIndirectObject.java:158)
    at com.itextpdf.text.pdf.PdfWriter$PdfBody.write(PdfWriter.java:420)
    at com.itextpdf.text.pdf.PdfWriter$PdfBody.add(PdfWriter.java:398)
    at com.itextpdf.text.pdf.PdfWriter.addToBody(PdfWriter.java:887)
    at com.itextpdf.text.pdf.PdfStamperImp.close(PdfStamperImp.java:412)
    at com.itextpdf.text.pdf.PdfStamperImp.close(PdfStamperImp.java:386)
    at com.itextpdf.text.pdf.PdfSignatureAppearance.preClose(PdfSignatureAppearance.java:1316)
    at com.itextpdf.text.pdf.security.MakeSignature.signDetached(MakeSignature.java:140)
Code is:
    public static byte[] getStreamBytesRaw(final PRStream stream, final RandomAccessFileOrArray file) throws IOException {
        PdfReader reader = stream.getReader();
        byte b[];
        if (stream.getOffset() < 0)
            b = stream.getBytes();
        else {
---->       b = new byte[stream.getLength()];
            file.readFully(b);
I see in the debugger that the stream type is EmbeddedFile and its length is 100M, so the whole embedded file is being read into memory.
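To make clear what I would expect instead, here is a minimal sketch in plain JDK code (not iText code; ChunkedCopy and copy are hypothetical names of mine). It copies a stream of known length through a fixed-size buffer, so heap use depends on the buffer size and not on the stream length. Whether iText's writer can be made to work this way is exactly what I'm asking.

```java
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ChunkedCopy {
    // Copies exactly `length` bytes from `in` to `out` using a buffer of
    // `bufferSize` bytes, instead of allocating new byte[length] up front.
    public static long copy(InputStream in, OutputStream out, long length, int bufferSize)
            throws IOException {
        byte[] buf = new byte[bufferSize];
        long remaining = length;
        while (remaining > 0) {
            int toRead = (int) Math.min(buf.length, remaining);
            int n = in.read(buf, 0, toRead);
            if (n < 0) {
                throw new EOFException("stream ended with " + remaining + " bytes still expected");
            }
            out.write(buf, 0, n);
            remaining -= n;
        }
        return length;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java ChunkedCopy <source> <target>");
            return;
        }
        java.io.File src = new java.io.File(args[0]);
        try (InputStream in = new FileInputStream(src);
             OutputStream out = new FileOutputStream(args[1])) {
            long copied = copy(in, out, src.length(), 64 * 1024);
            System.out.println("copied " + copied + " bytes");
        }
    }
}
```

With an approach like this, a 100M embedded file would need only the 64K buffer on the heap while being written out.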
Update - creating a big PDF
It's difficult to share a 100M file )), but here is how to create one:
- Run
dd if=/dev/urandom of=file.bin bs=1048000 count=100
- Go to http://blog.didierstevens.com/programs/pdf-tools/ and take http://didierstevens.com/files/software/make-pdf_V0_1_6.zip
- Unzip and run
python make-pdf-embedded.py file.bin file.pdf
Here you are )
I should note that it's important to use /dev/urandom. With /dev/zero the result is a compressed PDF of only 100K.
Anyway, in case you need my file, I've put a 50M file on a server: http://50mpdf.tk/50m.pdf
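The size difference is easy to reproduce with a minimal sketch, assuming the tool deflates the embedded stream (which the 100K result suggests). CompressibilityDemo and deflatedSize are hypothetical names, and java.util.Random stands in for /dev/urandom here.

```java
import java.util.Random;
import java.util.zip.Deflater;

public class CompressibilityDemo {
    // Returns the number of bytes produced by deflating `input`
    // with the default compression level.
    public static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buf = new byte[8192];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buf);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) {
        byte[] zeros = new byte[1 << 20];      // like /dev/zero
        byte[] random = new byte[1 << 20];     // stand-in for /dev/urandom
        new Random(42).nextBytes(random);
        System.out.println("zeros  -> " + deflatedSize(zeros) + " bytes");
        System.out.println("random -> " + deflatedSize(random) + " bytes");
    }
}
```

The zero-filled input shrinks to a tiny fraction of its size, while the random input stays at roughly its original size, so only random data gives a PDF that is actually big.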