Working With PDFs in Node.js Using pdf-lib

by Valeri Karpov (@code_barbarian), April 08, 2020

The pdf-lib npm module is a great tool for creating and editing PDFs with Node.js. Puppeteer is a great tool for generating PDFs from HTML, but unfortunately browser support for print layouts in CSS is not very good in my experience. The pdf-lib module gives you fine-grained control over PDFs, and is great for tasks like merging PDFs, adding page numbers and watermarks, splitting PDFs, and basically anything else you might use the ILovePDF API for.

Getting Started

Let's use pdf-lib to create a simple PDF document. The PDF document will have 1 page with the Mastering JS logo in the middle.

const { PDFDocument } = require('pdf-lib');
const fs = require('fs');
run().catch(err => console.log(err));
async function run() {
  // Create a new document and add a new page
  const doc = await PDFDocument.create();
  const page = doc.addPage();
  // Load the image and store it as a Node.js buffer in memory
  let img = fs.readFileSync('./logo.png');
  img = await doc.embedPng(img);
  // Draw the image on the center of the page
  const { width, height } = img.scale(1);
  page.drawImage(img, {
    x: page.getWidth() / 2 - width / 2,
    y: page.getHeight() / 2 - height / 2
  });
  // Write the PDF to a file
  fs.writeFileSync('./test.pdf', await doc.save());
}

Running the above script generates the below PDF. Working with pdf-lib is pretty easy, but there are a few gotchas: PDFDocument#embedPng() and PDFDocument#save() return promises, so you need to use await.
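The centering math in the script assumes the logo is smaller than the page. If your image might be bigger, one option is to shrink it before drawing. Below is a minimal sketch of that idea. The centerOnPage() and drawImageCentered() helpers and the margin parameter are names I made up for illustration, not part of pdf-lib. The only pdf-lib calls are the ones already used above: img.scale(), page.getWidth(), page.getHeight(), and page.drawImage().

```javascript
// Hypothetical helper (not part of pdf-lib): compute the position and size
// needed to center an image on a page, shrinking the image if it does not
// fit inside the page minus `margin` on every side.
function centerOnPage(imgWidth, imgHeight, pageWidth, pageHeight, margin = 0) {
  const maxWidth = pageWidth - 2 * margin;
  const maxHeight = pageHeight - 2 * margin;
  // Only ever scale down, never up
  const factor = Math.min(1, maxWidth / imgWidth, maxHeight / imgHeight);
  const width = imgWidth * factor;
  const height = imgHeight * factor;
  return {
    x: (pageWidth - width) / 2,
    y: (pageHeight - height) / 2,
    width,
    height
  };
}

// Hypothetical wrapper: `page` and `img` are the objects you get from
// `doc.addPage()` and `doc.embedPng()` in the script above.
function drawImageCentered(page, img, margin = 0) {
  const { width, height } = img.scale(1);
  page.drawImage(
    img,
    centerOnPage(width, height, page.getWidth(), page.getHeight(), margin)
  );
}
```

With these helpers, the page.drawImage() call in the script could become drawImageCentered(page, img, 50), which leaves at least 50 points of space around the logo.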

Merging 2 PDFs

The killer feature for pdf-lib is that you can modify existing PDFs, not just create new ones. For example, suppose you have two PDFs: one containing the cover of an eBook, and one containing the eBook content. How can you merge the two? I used the ILovePDF API for my last eBook, but pdf-lib makes this task easy in Node.js.

Here are two PDF files: cover.pdf and page-30-31.pdf. The script below uses pdf-lib to combine the two into a single test.pdf file.

const { PDFDocument } = require('pdf-lib');
const fs = require('fs');
run().catch(err => console.log(err));
async function run() {
  // Load cover and content pdfs
  const cover = await PDFDocument.load(fs.readFileSync('./cover.pdf'));
  const content = await PDFDocument.load(fs.readFileSync('./page-30-31.pdf'));
  // Create a new document
  const doc = await PDFDocument.create();
  // Add the cover to the new doc
  const [coverPage] = await doc.copyPages(cover, [0]);
  doc.addPage(coverPage);
  // Add individual content pages
  const contentPages = await doc.copyPages(content, content.getPageIndices());
  for (const page of contentPages) {
    doc.addPage(page);
  }
  // Write the PDF to a file
  fs.writeFileSync('./test.pdf', await doc.save());
}

Below is what the merged PDF looks like.
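The same copyPages() and addPage() pattern should also work in the other direction, for splitting one PDF into several smaller ones. Here's a rough sketch, not a battle-tested implementation. The chunkPageIndices() and splitPdf() functions, their parameters, and the output file naming are my own assumptions for illustration. The pdf-lib calls are PDFDocument.load(), getPageCount(), copyPages(), addPage(), and save().

```javascript
// Hypothetical helper: group page indices 0..pageCount-1 into consecutive
// chunks of at most `pagesPerFile` indices each.
function chunkPageIndices(pageCount, pagesPerFile) {
  if (!Number.isInteger(pagesPerFile) || pagesPerFile < 1) {
    throw new Error('pagesPerFile must be a positive integer');
  }
  const chunks = [];
  for (let start = 0; start < pageCount; start += pagesPerFile) {
    const end = Math.min(start + pagesPerFile, pageCount);
    const indices = [];
    for (let i = start; i < end; ++i) {
      indices.push(i);
    }
    chunks.push(indices);
  }
  return chunks;
}

// Sketch: write each chunk of `inputPath` to `<outputPrefix>-<n>.pdf`.
// The file naming scheme is an assumption, pick whatever suits you.
async function splitPdf(inputPath, pagesPerFile, outputPrefix) {
  // Loaded here so `chunkPageIndices()` stays usable on its own
  const { PDFDocument } = require('pdf-lib');
  const fs = require('fs');

  const src = await PDFDocument.load(fs.readFileSync(inputPath));
  const chunks = chunkPageIndices(src.getPageCount(), pagesPerFile);
  for (const [n, indices] of chunks.entries()) {
    // Each output file is a fresh document that copies a slice of pages
    const doc = await PDFDocument.create();
    const pages = await doc.copyPages(src, indices);
    for (const page of pages) {
      doc.addPage(page);
    }
    fs.writeFileSync(`${outputPrefix}-${n + 1}.pdf`, await doc.save());
  }
}
```

For example, splitPdf('./test.pdf', 2, './part') would write the merged PDF back out in 2-page pieces named part-1.pdf, part-2.pdf, and so on.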

Adding Page Numbers

One of the biggest pain points of generating PDFs from HTML with Puppeteer is adding page numbers. It seems simple, but CSS print layouts still don't quite work for that case. Take a look at the time I wrote a for loop with hard-coded pixel offsets to get page numbers to show up correctly.

For example, here's a PDF containing the first 4 pages of Mastering Async/Await without page numbers: ./content.pdf. Below is a script that adds page numbers to each page in the PDF.

const { PDFDocument, StandardFonts, rgb } = require('pdf-lib');
const fs = require('fs');
run().catch(err => console.log(err));
async function run() {
  const content = await PDFDocument.load(fs.readFileSync('./content.pdf'));
  // Add a font to the doc
  const helveticaFont = await content.embedFont(StandardFonts.Helvetica);
  // Draw a number at the bottom of each page.
  // Note that the bottom of the page is `y = 0`, not the top
  // `getPages()` is synchronous, so no `await` is needed here
  const pages = content.getPages();
  for (const [i, page] of pages.entries()) {
    page.drawText(`${i + 1}`, {
      x: page.getWidth() / 2,
      y: 10,
      size: 15,
      font: helveticaFont,
      color: rgb(0, 0, 0)
    });
  }
  // Write the PDF to a file
  fs.writeFileSync('./test.pdf', await content.save());
}

Below is what the page numbers the script added look like.
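One detail worth knowing: x is where the text starts, so x: page.getWidth() / 2 puts the left edge of the number at the middle of the page rather than centering it. For single-digit numbers you can barely tell, but it gets more noticeable with longer labels. Here's a minimal sketch of one way to correct for that using the font's widthOfTextAtSize() method. The centeredTextX() and drawCenteredPageNumber() helpers are hypothetical names of my own, and the y offset of 10 is just carried over from the script above.

```javascript
// Hypothetical helper: x coordinate that horizontally centers `text` on a
// page of width `pageWidth`. `font` should be an embedded pdf-lib font, like
// `helveticaFont` in the script above.
function centeredTextX(font, text, size, pageWidth) {
  const textWidth = font.widthOfTextAtSize(text, size);
  return (pageWidth - textWidth) / 2;
}

// Hypothetical wrapper: draw `pageNumber` centered near the bottom of `page`.
// Remember that `y = 0` is the bottom of the page, not the top.
function drawCenteredPageNumber(page, font, pageNumber, size = 15) {
  const label = String(pageNumber);
  page.drawText(label, {
    x: centeredTextX(font, label, size, page.getWidth()),
    y: 10,
    size,
    font
  });
}
```

Inside the loop over pages, the page.drawText() call could then be replaced with drawCenteredPageNumber(page, helveticaFont, Number(i) + 1).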

Moving On

The Node.js ecosystem is filled with excellent libraries for solving almost any problem you can think of. The pdf-lib module lets you modify PDFs, sharp lets you handle almost anything with images, pkg bundles Node projects into standalone executables, and there are many more. Before you start looking for an online API to solve an issue you're seeing, try searching npm. You might find a better solution.
