EDIT: Updated flow diagram to better explain the (likely unnecessary) complexity of what I'm doing.
We at the company I work for are attempting to create complex PDF files using Java iText (the free version 2.1 line). The documents are built piece-by-piece from individual "template" files, which are added to the final document one after another using the PdfStamper class, as well as filled using AcroForms.
Visual Example
The current design performs a loop which runs for each template that needs to be added in order (as well as the custom logic needed to fill each). In each iteration of the loop, it does the following:
- Creates a PdfReader to open the template file
- Creates a PdfStamper that reads from the PdfReader and writes to a "template buffer"
- Fills AcroForm fields, as well as measures the height of the template by getting the location of an "end" AcroField
- Closes the PdfReader and PdfStamper
- Creates a PdfReader to read a "working buffer" that stores the current final document in progress
- Creates a PdfStamper that reads from the PdfReader and writes to a "storage buffer"
- Closes the PdfReader, opens a new PdfReader to the "template buffer"
- Imports the page from the "template buffer," adds it to the ContentByte of the PdfStamper
- Closes the PdfReader and PdfStamper
- Swaps the "storage buffer" with the "working buffer" in order to be ready to repeat.
Here is a diagram of the above process, which runs once per iteration of the loop, that is, once per template:
Flow diagram
However, as discovered through this Stack Overflow question (where example code can also be viewed) and the response by iText author Bruno Lowagie, this way of using the PdfStamper can cause significant issues. "Abusing" the PdfStamper by creating and closing it too many times can corrupt the resulting file in ways that only affect some programs, producing a seemingly good document that may fail in certain contexts.
What is the alternative? Mr. Lowagie's response suggests that there is a simpler or more direct way to use PdfStampers, though I do not quite understand it myself yet. Could this be done using only a single stamper? Could it be done without using a rotating series of buffers?
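The buffer rotation described in the list above can be modeled roughly in plain Java. This sketch does not use iText at all: byte arrays stand in for PDF content, and the names BufferRotationSketch, bytesWrittenWithRotation, and bytesWrittenWithSingleOutput are hypothetical, introduced only for illustration. It shows one mechanical consequence of the design, assuming each iteration re-reads the whole "working buffer" and rewrites it into a fresh "storage buffer": the in-progress document is copied again on every pass, whereas an output that stays open receives each template once.

```java
import java.io.ByteArrayOutputStream;

public class BufferRotationSketch {

    /** Rotating buffers: every iteration rewrites the whole document so far. */
    public static long bytesWrittenWithRotation(byte[][] templates) {
        byte[] working = new byte[0];          // stand-in for the "working buffer"
        long written = 0;
        for (byte[] template : templates) {
            ByteArrayOutputStream storage = new ByteArrayOutputStream(); // "storage buffer"
            storage.write(working, 0, working.length);   // re-copy everything so far
            storage.write(template, 0, template.length); // then add the new template
            written += working.length + template.length;
            working = storage.toByteArray();             // swap storage into working
        }
        return written;
    }

    /** One output kept open for the whole run: each template is written once. */
    public static long bytesWrittenWithSingleOutput(byte[][] templates) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long written = 0;
        for (byte[] template : templates) {
            out.write(template, 0, template.length);
            written += template.length;
        }
        return written;
    }

    public static void main(String[] args) {
        byte[][] templates = new byte[4][10];  // four made-up 10-byte "templates"
        System.out.println(bytesWrittenWithRotation(templates));
        System.out.println(bytesWrittenWithSingleOutput(templates));
    }
}
```

With four 10-byte stand-ins, the rotating version writes 10 + 20 + 30 + 40 = 100 bytes and the single-output version writes 40. Real PDF handling involves parsing and rebuilding document structures rather than raw copying, so this is only a model of the repeated work, not of the corruption issue itself.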
2 Answers
I'm always disappointed when I read "we are using iText 2.1" because that's really not a wise choice as explained here, but this is a question about design, so here is a possible approach:
Schema diagram
You create a new document with Document document = new Document(); (step 1), you create a PdfWriter instance (step 2), you open the document (step 3), and you add content in a loop (step 4):
1. You have different templates, and by templates we mean: existing PDF documents with fillable fields (AcroForms). You fill them out using PdfStamper and AcroFields (see your code on StackOverflow). This results in separate "flattened" form snippets kept in memory.
2. If you want to keep these snippets together, you can do so by creating a Document/PdfWriter instance to create a new PDF in memory that combines all the snippets that belong together. You get a snippet like this: PdfImportedPage snippet = writer.getImportedPage(reader, 1); and you add the snippet to the writer using the addTemplate() method.
3. You get the combined result using PdfImportedPage combined = writer.getImportedPage(reader, 1); you wrap the result in an image like this: Image image = Image.getInstance(combined); and you add the image to the document: document.add(image);
Step 2 could be omitted. You could add the different snippets straight to the document that is initially created. Repeat steps 1 to 3 as many times as needed, then close the document (step 5).
Omitting step 2 will result in a lower XObject nesting count, but keeping step 2 isn't problematic.
In pseudo code, we'd have:
[1.] The outer loop (the large part to the right of the schema, marked PdfWriter)
// step 1
Document document = new Document();
// step 2
PdfWriter writer = PdfWriter.getInstance(document, os);
// step 3
document.open();
// step 4
for (int i = 0; i < parameters.length; i++)
    document.add(getSnippetCombination(writer, parameters[i]));
// step 5
document.close();
[2.] The creation of one unit (the arrow marked PdfWriter in the middle)
public Image getSnippetCombination(PdfWriter w, Parameters parameters)
        throws DocumentException, IOException {
    // step 1
    Document document = new Document();
    // step 2
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    PdfWriter writer = PdfWriter.getInstance(document, baos);
    // step 3
    document.open();
    // step 4
    PdfContentByte canvas = writer.getDirectContent();
    for (int i = 0; i < parameters.getNumberOfSnippets(); i++)
        canvas.addTemplate(getSnippet(writer, parameters.getSnippet(i)),
            parameters.getX(i), parameters.getY(i));
    // step 5
    document.close();
    // Convert PDF in memory to Form XObject wrapped in Image object
    PdfReader reader = new PdfReader(baos.toByteArray());
    PdfImportedPage page = w.getImportedPage(reader, 1);
    return Image.getInstance(page);
}
[3.] Filling out data in separate snippets (the arrows marked PdfStamper)
public PdfTemplate getSnippet(PdfWriter w, Snippet snippet)
        throws DocumentException, IOException {
    // Using PdfStamper to fill out the fields
    PdfReader reader = new PdfReader(snippet.getBytes());
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    PdfStamper stamper = new PdfStamper(reader, baos);
    stamper.setFormFlattening(true);
    AcroFields form = stamper.getAcroFields();
    // fill out the fields; you've already implemented this
    stamper.close();
    // return the template, read back from the flattened result
    PdfReader stampedReader = new PdfReader(baos.toByteArray());
    return w.getImportedPage(stampedReader, 1);
}
There may be better solutions, for instance involving XFA, but I don't know if that's feasible as I don't know if the templates (the light blue part in my schema) are always the same. It would also involve creating new templates in the XML Forms Architecture.
- The part that is still confusing, which is perhaps key to understanding this new way of doing things compared to my own, is how my PdfStamper object should be created. In my previous methodology, I create it with PdfStamper stamper = new PdfStamper(reader, templateBuffer); first, and then stamper = new PdfStamper(reader, storageBuffer); second, both within the loop. Do I create it outside the loop in this case? Do I pass it the same output file as I do to the writer? I'm afraid my understanding of the proper usage of the stamper is still a bit fuzzy. – Southpaw Hare, May 15, 2014 at 17:16
- We were able to solve this problem, although admittedly I still don't quite understand your examples relative to my problem; they're actually quite similar in style to the examples from your book which I failed to understand or find useful. Your discussions did point my co-worker and me in the correct direction, though, so thanks for that. I do hope that you come to give people more benefit of the doubt about that book thing in the future - I think it's just a matter that some people learn differently than you do. – Southpaw Hare, May 16, 2014 at 22:13
A co-worker of mine worked out how to remove the extraneous stampers and buffers and to open the remaining ones in the right places, so that the final document stays open for the entire process and is closed only once. This lets the underlying tables and such in the file be created correctly, which Mr. Lowagie suggested was the issue.
Here is a revised version of the previous test example with these modifications:
// Imports assume the iText 2.1 package layout (com.lowagie.text)
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import com.lowagie.text.Document;
import com.lowagie.text.DocumentException;
import com.lowagie.text.PageSize;
import com.lowagie.text.Rectangle;
import com.lowagie.text.pdf.AcroFields;
import com.lowagie.text.pdf.PdfContentByte;
import com.lowagie.text.pdf.PdfImportedPage;
import com.lowagie.text.pdf.PdfReader;
import com.lowagie.text.pdf.PdfStamper;
import com.lowagie.text.pdf.PdfWriter;
public class PDFFileMakerRevised {
    private static final int INCH = 72;
    private static final float MARGIN_TOP = INCH / 4;
    private static final float MARGIN_BOTTOM = INCH / 2;
    private static final String DIREC = "/pdftest/";
    private static final String OUTPUT_FILEPATH = DIREC + "coolerdoc_%d.pdf";
    private static final String TEMPLATE1_FILEPATH = DIREC + "template1.pdf";
    private static final Rectangle PAGE_SIZE = PageSize.LETTER;
    private static final Rectangle TEMPLATE_SIZE = PageSize.LETTER;
    private static final int DEFAULT_NUMBER_OF_TIMES = 200;
    // Vertical offset of the next template on the current page (0 or negative)
    private float currPosition;
    private int currPage;
    public static void main (String [] args) {
        System.out.println("Starting...");
        PDFFileMakerRevised maker = new PDFFileMakerRevised();
        File file = null;
        try {
            file = maker.createPDF(DEFAULT_NUMBER_OF_TIMES);
        }
        catch (Exception e) {
            e.printStackTrace();
        }
        if (file == null || !file.exists()) {
            System.out.println("File failed to be created.");
        }
        else {
            System.out.println("File creation successful.");
        }
    }
    public File createPDF(int numTimes) throws FileNotFoundException, IOException, DocumentException, InterruptedException {
        currPosition = 0;
        currPage = 1;
        String sFilepath = String.format(OUTPUT_FILEPATH, numTimes);
        // The final document and its writer are opened once, before the loop
        Document d = new Document(PAGE_SIZE);
        PdfWriter w = PdfWriter.getInstance(d, new FileOutputStream(sFilepath));
        d.open();
        PdfContentByte cb = w.getDirectContent();
        ByteArrayOutputStream stampedBuffer;
        for (int i = 0; i < numTimes; i++) {
            // One short-lived stamper per template, writing to its own buffer
            PdfReader templateReader = new PdfReader(new FileInputStream(TEMPLATE1_FILEPATH));
            stampedBuffer = new ByteArrayOutputStream();
            PdfStamper stamper = new PdfStamper(templateReader, stampedBuffer);
            stamper.setFormFlattening(true);
            AcroFields form = stamper.getAcroFields();
            // Get Size: positions come back as [page, llx, lly, urx, ury],
            // so area[4] is the top edge of the "end" field
            float[] area = form.getFieldPositions("end");
            float size = TEMPLATE_SIZE.getHeight() - MARGIN_TOP - area[4];
            // Requires Page Break
            if (size >= PAGE_SIZE.getHeight() - MARGIN_TOP - MARGIN_BOTTOM + currPosition) {
                currPosition = 0;
                currPage += 1;
                d.newPage();
            }
            form.setField("field1", String.format("Form Text %d", i+1));
            form.setField("page", String.format("Page %d", currPage));
            stamper.close();
            templateReader.close();
            form = null;
            // Import the flattened template into the single open writer
            PdfReader stampedReader = new PdfReader(stampedBuffer.toByteArray());
            PdfImportedPage page = w.getImportedPage(stampedReader, 1);
            cb.addTemplate(page, 0, currPosition);
            currPosition = currPosition - size;
        }
        // The final document is closed only once, after all templates are added
        d.close();
        w.close();
        return new File(sFilepath);
    }
}
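The placement arithmetic inside the loop can also be looked at on its own. The following is a minimal sketch in plain Java with no iText dependency; the class StackingLayoutSketch, the Placement type, and the layOut method are hypothetical names introduced here, and the snippet heights are made-up values. It mirrors the page-break test and the running currPosition from createPDF, assuming every template height has already been measured.

```java
import java.util.ArrayList;
import java.util.List;

public class StackingLayoutSketch {

    /** Where one template lands: 1-based page and the y offset given to addTemplate. */
    public static final class Placement {
        public final int page;
        public final float yOffset;

        Placement(int page, float yOffset) {
            this.page = page;
            this.yOffset = yOffset;
        }

        @Override
        public String toString() {
            return "page " + page + ", yOffset " + yOffset;
        }
    }

    /**
     * Stacks templates top-down. usableHeight is the page height minus both
     * margins; position starts at 0 and goes negative as templates are added.
     */
    public static List<Placement> layOut(float[] heights, float usableHeight) {
        List<Placement> placements = new ArrayList<>();
        float position = 0f;
        int page = 1;
        for (float height : heights) {
            // Same test as in createPDF: not enough room left, so start a new page
            if (height >= usableHeight + position) {
                position = 0f;
                page++;
            }
            placements.add(new Placement(page, position));
            position -= height;
        }
        return placements;
    }

    public static void main(String[] args) {
        // Letter page (792 pt high) with an 18 pt top and 36 pt bottom margin
        float usableHeight = 792f - 18f - 36f;
        for (Placement p : layOut(new float[] {300f, 300f, 300f}, usableHeight)) {
            System.out.println(p);
        }
    }
}
```

With 738 points of usable height and three 300-point templates, the first two land on page 1 at offsets 0 and -300, and the third moves to page 2 at offset 0, because 300 is not smaller than the 138 points still free. As in the original test, a template at least as tall as the usable height triggers a page break even at the top of a page.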