Generating PDF files using individual template components

Question 1

EDIT: Updated flow diagram to better explain the (likely unnecessary) complexity of what I'm doing.

My company is trying to create complex PDF files with iText for Java (the free 2.1 line). The documents are built piece by piece from individual "template" files, which are filled in as AcroForms and added to the final document one after another using the PdfStamper class.

Visual Example

The current design loops over the templates in the order they need to be added, running the custom logic needed to fill each one. Each iteration of the loop does the following:

1. Creates a PdfReader to open the template file.
2. Creates a PdfStamper that reads from the PdfReader and writes to a "template buffer".
3. Fills the AcroForm fields and measures the height of the template by getting the location of an "end" AcroField.
4. Closes the PdfReader and PdfStamper.
5. Creates a PdfReader to read a "working buffer" that stores the final document in progress.
6. Creates a PdfStamper that reads from that PdfReader and writes to a "storage buffer".
7. Closes the PdfReader and opens a new PdfReader on the "template buffer".
8. Imports the page from the "template buffer" and adds it to the ContentByte of the PdfStamper.
9. Closes the PdfReader and PdfStamper.
10. Swaps the "storage buffer" with the "working buffer", ready for the next iteration.
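
The height measurement in the list above can be sketched without iText. This is a minimal sketch, assuming the array layout that iText 2.1.x documents for AcroFields.getFieldPositions (five floats per field widget: page, llx, lly, urx, ury). The class name, the method name, and the sample numbers are hypothetical.

```java
/** iText-free sketch: derive a template's used height from its "end" field position. */
class EndFieldHeightSketch {

    /**
     * @param fieldPositions array assumed to be laid out as [page, llx, lly, urx, ury]
     * @param templateHeight full height of the template page, in points
     * @param marginTop      top margin to exclude, in points
     * @return the vertical space the filled template occupies, in points
     */
    static float usedHeight(float[] fieldPositions, float templateHeight, float marginTop) {
        if (fieldPositions == null || fieldPositions.length < 5) {
            throw new IllegalArgumentException("no position found for the end marker field");
        }
        // Index 4 is the upper edge (ury) of the marker field under the assumed layout.
        float upperEdgeOfEndField = fieldPositions[4];
        return templateHeight - marginTop - upperEdgeOfEndField;
    }

    public static void main(String[] args) {
        // Hypothetical marker field on page 1 whose upper edge sits at y = 520.
        float[] end = {1f, 36f, 500f, 100f, 520f};
        System.out.println(usedHeight(end, 792f, 18f));
    }
}
```

With a US Letter template (792 points tall), an 18-point top margin, and an "end" field whose upper edge sits at y = 520, the sketch reports 254 points of used height.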

Here is a diagram of the above process, which is repeated for each template in the loop:

Flow diagram

However, as discovered through this Stack Overflow question (where example code can also be viewed) and the response by iText author Bruno Lowagie, this way of using the PdfStamper can cause significant issues. "Abusing" the PdfStamper by creating and closing it too many times can corrupt the resulting file in ways that only affect some programs, producing a seemingly good document that fails in certain contexts.

What is the alternative? Mr. Lowagie's response suggests that there is a simpler or more direct way to use PdfStampers, though I do not quite understand it myself yet. Could this be done using only a single stamper? Could it be done without using a rotating series of buffers?

Bruno Lowagie · Answer 1 · 2014-05-15 14:36:50Z

I'm always disappointed when I read "we are using iText 2.1" because that's really not a wise choice as explained here, but this is a question about design, so here is a possible approach:

Schema of the proposed approach

You create a new document with Document document = new Document(); (step 1), create a PdfWriter instance (step 2), open the document (step 3), and add content in a loop (step 4):

1. You have different templates, and by templates we mean existing PDF documents with fillable fields (AcroForms). You fill them out using PdfStamper and AcroFields (see your code on Stack Overflow). This results in separate "flattened" form snippets kept in memory.
2. If you want to keep these snippets together, you can create a Document/PdfWriter instance that builds a new PDF in memory combining all the snippets that belong together. You get a snippet with PdfImportedPage snippet = writer.getImportedPage(reader, 1); and you add it to the writer's content using the addTemplate() method.
3. You get the combined result with PdfImportedPage combined = writer.getImportedPage(reader, 1); then wrap it in an image with Image image = Image.getInstance(combined); and add the image to the document with document.add(image);

Step 2 could be omitted: you could add the different snippets straight to the document that is initially created. Repeat steps 1 to 3 as many times as needed, and close the document (step 5).

Omitting step 2 will result in a lower XObject nesting count, but keeping step 2 isn't problematic.

In pseudo code, we'd have:

[1.] The outer loop (the large part to the right of the schema, marked PdfWriter)

// step 1
Document document = new Document();
// step 2
PdfWriter writer = PdfWriter.getInstance(document, os);
// step 3
document.open();
// step 4
for (int i = 0; i < parameters.length; i++)
 document.add(getSnippetCombination(writer, parameters[i]));
// step 5
document.close();

[2.] The creation of one unit (the arrow marked PdfWriter in the middle)

public Image getSnippetCombination(PdfWriter w, Parameters parameters)
        throws IOException, DocumentException {
    // step 1
    Document document = new Document();
    // step 2
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    PdfWriter writer = PdfWriter.getInstance(document, baos);
    // step 3
    document.open();
    // step 4: import each filled snippet into the inner writer and position it
    PdfContentByte canvas = writer.getDirectContent();
    for (int i = 0; i < parameters.getNumberOfSnippets(); i++) {
        canvas.addTemplate(getSnippet(writer, parameters.getSnippet(i)),
                parameters.getX(i), parameters.getY(i));
    }
    // step 5
    document.close();
    // Convert the PDF in memory to a Form XObject wrapped in an Image object
    PdfReader reader = new PdfReader(baos.toByteArray());
    PdfImportedPage page = w.getImportedPage(reader, 1);
    return Image.getInstance(page);
}

[3.] Filling out data in separate snippets (the arrows marked PdfStamper)

public PdfTemplate getSnippet(PdfWriter w, Snippet snippet)
        throws IOException, DocumentException {
    // Use PdfStamper to fill out the fields
    PdfReader templateReader = new PdfReader(snippet.getBytes());
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    PdfStamper stamper = new PdfStamper(templateReader, baos);
    stamper.setFormFlattening(true);
    AcroFields form = stamper.getAcroFields();
    // fill out the fields; you've already implemented this
    stamper.close();
    // return the filled, flattened page as a template
    PdfReader filledReader = new PdfReader(baos.toByteArray());
    return w.getImportedPage(filledReader, 1);
}

There may be better solutions, for instance involving XFA, but I don't know if that's feasible as I don't know if the templates (the light blue part in my schema) are always the same. It would also involve creating new templates in the XML Forms Architecture.
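
The object lifecycle behind the three fragments above can be illustrated without iText: the final output is opened once and closed once, while every template gets its own short-lived in-memory buffer. The sketch below is only an analogy under that assumption. Plain strings stand in for PDF content, and CountingOutput, fill, and build are hypothetical names, not iText API.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** iText-free analogy: one long-lived output, one short-lived buffer per template. */
class SingleWriterSketch {

    /** Stand-in for the final document's output; counts how often it is closed. */
    static final class CountingOutput extends OutputStream {
        private final ByteArrayOutputStream delegate = new ByteArrayOutputStream();
        int closeCount = 0;

        @Override public void write(int b) { delegate.write(b); }
        @Override public void write(byte[] b) { delegate.write(b, 0, b.length); }
        @Override public void close() { closeCount++; }

        String contents() {
            return new String(delegate.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Stand-in for the stamper pass: fill one template into its own buffer. */
    static byte[] fill(String template, String value) {
        ByteArrayOutputStream snippetBuffer = new ByteArrayOutputStream();
        byte[] filled = template.replace("{field}", value).getBytes(StandardCharsets.UTF_8);
        snippetBuffer.write(filled, 0, filled.length);
        return snippetBuffer.toByteArray();
    }

    /** Stand-in for the outer loop: the final output stays open until every snippet is added. */
    static CountingOutput build(String template, String[] values) {
        CountingOutput finalDocument = new CountingOutput(); // created once, outside the loop
        for (String value : values) {
            finalDocument.write(fill(template, value));      // new buffer per template
        }
        finalDocument.close();                               // closed exactly once
        return finalDocument;
    }

    public static void main(String[] args) {
        CountingOutput out = build("[{field}]", new String[] {"a", "b"});
        System.out.println(out.contents() + " closed " + out.closeCount + " time(s)");
    }
}
```

The point of the analogy is the placement of the constructors: the per-template filler lives inside the loop and writes to its own buffer, and nothing ever reopens the final output.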

The part that is still confusing, which is perhaps key to understanding this new way of doing things compared to my own, is how my PdfStamper object should be created. In my previous methodology, I create it with PdfStamper stamper = new PdfStamper(reader, templateBuffer); first, and then stamper = new PdfStamper(reader, storageBuffer); second, both within the loop. Do I create it outside the loop in this case? Do I pass it the same output file as I do to the writer? I'm afraid my understanding of the proper usage of the stamper is still a bit fuzzy.
We were able to solve this problem, although admittedly I still don't quite understand your examples relative to my problem; they're actually quite similar in style to the examples from your book which I failed to understand or find useful. Your discussions did point my co-worker and me in the right direction, though, so thanks for that. I do hope that you come to give people more benefit of the doubt about that book thing in the future - I think it's just a matter that some people learn differently than you do.

Southpaw Hare · Answer 2 · 2014-05-16 22:01:56Z

A co-worker of mine worked out how to remove the extraneous stampers and buffers and to open the remaining ones in the right places, so that the final document stays open for the entire process and is closed only once. This lets iText write the file's underlying tables correctly, which is what Mr. Lowagie suggested was the issue.

Here is a revised version of the previous test example with these modifications:

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import com.lowagie.text.Document;
import com.lowagie.text.DocumentException;
import com.lowagie.text.PageSize;
import com.lowagie.text.Rectangle;
import com.lowagie.text.pdf.AcroFields;
import com.lowagie.text.pdf.PdfContentByte;
import com.lowagie.text.pdf.PdfImportedPage;
import com.lowagie.text.pdf.PdfReader;
import com.lowagie.text.pdf.PdfStamper;
import com.lowagie.text.pdf.PdfWriter;

public class PDFFileMakerRevised {
    private static final int INCH = 72;
    private static final float MARGIN_TOP = INCH / 4f;
    private static final float MARGIN_BOTTOM = INCH / 2f;
    private static final String DIREC = "/pdftest/";
    private static final String OUTPUT_FILEPATH = DIREC + "coolerdoc_%d.pdf";
    private static final String TEMPLATE1_FILEPATH = DIREC + "template1.pdf";
    private static final Rectangle PAGE_SIZE = PageSize.LETTER;
    private static final Rectangle TEMPLATE_SIZE = PageSize.LETTER;
    private static final int DEFAULT_NUMBER_OF_TIMES = 200;

    private float currPosition;
    private int currPage;

    public static void main(String[] args) {
        System.out.println("Starting...");
        PDFFileMakerRevised maker = new PDFFileMakerRevised();
        File file = null;
        try {
            file = maker.createPDF(DEFAULT_NUMBER_OF_TIMES);
        } catch (Exception e) {
            e.printStackTrace();
        }
        if (file == null || !file.exists()) {
            System.out.println("File failed to be created.");
        } else {
            System.out.println("File creation successful.");
        }
    }

    public File createPDF(int numTimes) throws IOException, DocumentException {
        currPosition = 0;
        currPage = 1;
        String sFilepath = String.format(OUTPUT_FILEPATH, numTimes);

        // The final document and its writer are created once and stay open for the whole loop
        Document d = new Document(PAGE_SIZE);
        PdfWriter w = PdfWriter.getInstance(d, new FileOutputStream(sFilepath));
        d.open();
        PdfContentByte cb = w.getDirectContent();

        for (int i = 0; i < numTimes; i++) {
            // One short-lived reader/stamper pair per template, writing to its own buffer
            PdfReader templateReader = new PdfReader(new FileInputStream(TEMPLATE1_FILEPATH));
            ByteArrayOutputStream stampedBuffer = new ByteArrayOutputStream();
            PdfStamper stamper = new PdfStamper(templateReader, stampedBuffer);
            stamper.setFormFlattening(true);
            AcroFields form = stamper.getAcroFields();

            // Get size: the "end" field marks where the template's content stops
            float[] area = form.getFieldPositions("end");
            if (area == null) {
                throw new IllegalStateException("Template has no \"end\" field");
            }
            float size = TEMPLATE_SIZE.getHeight() - MARGIN_TOP - area[4];

            // Requires page break (currPosition is zero or negative)
            if (size >= PAGE_SIZE.getHeight() - MARGIN_TOP - MARGIN_BOTTOM + currPosition) {
                currPosition = 0;
                currPage += 1;
                d.newPage();
            }

            form.setField("field1", String.format("Form Text %d", i + 1));
            form.setField("page", String.format("Page %d", currPage));
            stamper.close();
            templateReader.close();

            // Import the flattened page into the single final writer
            PdfReader stampedReader = new PdfReader(stampedBuffer.toByteArray());
            PdfImportedPage page = w.getImportedPage(stampedReader, 1);
            cb.addTemplate(page, 0, currPosition);
            currPosition = currPosition - size;
        }

        // The final document is closed exactly once, after all templates are added
        d.close();
        w.close();
        return new File(sFilepath);
    }
}
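
The vertical bookkeeping in createPDF can be isolated from iText and checked on its own. The sketch below is a hypothetical restatement of that logic, not part of the original code: it takes already-measured template heights and returns the page number and y offset each template would receive. It keeps the original >= comparison, so a template that exactly fills the remaining space still moves to a new page.

```java
import java.util.ArrayList;
import java.util.List;

/** iText-free sketch of the page-break and offset bookkeeping. */
class PageFlowSketch {

    /** Where one template lands: 1-based page number and the y offset used when adding it. */
    static final class Placement {
        final int page;
        final float yOffset;

        Placement(int page, float yOffset) {
            this.page = page;
            this.yOffset = yOffset;
        }
    }

    static List<Placement> place(float[] heights, float pageHeight,
                                 float marginTop, float marginBottom) {
        List<Placement> placements = new ArrayList<>();
        float usable = pageHeight - marginTop - marginBottom;
        float currPosition = 0f; // zero or negative: distance already used on this page
        int currPage = 1;
        for (float size : heights) {
            // usable + currPosition is the room left on the current page
            if (size >= usable + currPosition) {
                currPosition = 0f;
                currPage++;
            }
            placements.add(new Placement(currPage, currPosition));
            currPosition -= size;
        }
        return placements;
    }

    public static void main(String[] args) {
        for (Placement p : place(new float[] {300f, 300f, 300f}, 792f, 18f, 36f)) {
            System.out.println("page " + p.page + ", y offset " + p.yOffset);
        }
    }
}
```

For three 300-point templates on a 792-point page with 18- and 36-point margins, the first two land on page 1 at offsets 0 and -300, and the third starts page 2 at offset 0.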
