tuffstuff9/nextjs-pdf-parser

Next.js PDF Parser Template 📄🔍

Introduction

I was having some trouble parsing PDFs in Next.js, so I thought I would make this template for anyone else who was facing the same issues as me. I hope this template saves you some time and trouble. It's a basic create-next-app with PDF parsing implemented using the pdf2json library and file uploading facilitated by FilePond.

Installation & Setup 🚀

Clone the repository:
git clone [repository-url]
Navigate to the project directory:
cd nextjs-pdf-parser
Install dependencies:
```
npm install
# or
yarn install
```
Windows only: In app\api\upload\route.ts on line 22, change tempFilePath to an absolute path that exists on your machine. Make sure it starts from the drive letter, for example: C:/coding/nextjs-pdf-parser/public/${fileName}.pdf
Run the development server:
```
npm run dev
# or
yarn dev
```
Visit http://localhost:3000 to view the application.
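
The Windows-only step is needed because tempFilePath is a hardcoded location. As a rough sketch (not the template's actual code), a path built with Node's os.tmpdir() and path.join resolves to an absolute path on Windows, macOS, and Linux alike. The helper name buildTempPdfPath and the choice of the OS temp directory are assumptions made for illustration.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: returns an absolute path for a temporary PDF file.
// baseDir defaults to the operating system's temp directory (an assumption;
// the template itself writes to a hardcoded location).
function buildTempPdfPath(fileName: string, baseDir: string = os.tmpdir()): string {
  // Keep only the final path segment so the file stays inside baseDir.
  const safeName = path.basename(fileName);
  return path.join(baseDir, `${safeName}.pdf`);
}
```

With something like this, tempFilePath could be set to buildTempPdfPath(fileName) without editing the path per machine.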

Usage 🖱

Navigate to http://localhost:3000 and use the FilePond uploader to select and upload a PDF. Once the upload finishes, the content of the PDF is parsed and printed to the server console, meaning the terminal running the dev server. It will not be printed to the browser console.
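
For anyone curious what the server side of this flow might look like, here is a minimal sketch of pulling the uploaded bytes out of the request. It assumes the upload arrives as multipart form data under FilePond's default field name, filepond, and that the same field can carry a metadata string alongside the file. The function extractUploadedPdf is hypothetical and is not taken from the template's route.ts.

```typescript
// Hypothetical helper: reads the first file entry from a multipart upload.
// fieldName defaults to "filepond", FilePond's default field name (assumed
// to be what the uploader in this template sends).
async function extractUploadedPdf(
  request: Request,
  fieldName: string = "filepond"
): Promise<Uint8Array | null> {
  const form = await request.formData();
  for (const entry of form.getAll(fieldName)) {
    // String entries are metadata; anything else is a file-like Blob.
    if (typeof entry !== "string") {
      return new Uint8Array(await entry.arrayBuffer());
    }
  }
  // No file was found under that field name.
  return null;
}
```

Anything logged from a handler like this runs in Node, which is why the parsed text shows up in the terminal rather than in the browser.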

Technical Details 🛠

nodeUtil is not defined Error:

To bypass the nodeUtil is not defined error, the following configuration was added to next.config.js:

const nextConfig = {
  experimental: {
    serverComponentsExternalPackages: ['pdf2json'],
  },
};
module.exports = nextConfig;

See more details here
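
The option above tells Next.js not to bundle pdf2json for server code and to load it through Node's regular module resolution instead. In newer Next.js releases (15 and later, as far as I know) the same setting lives at the top level under the name serverExternalPackages. The sketch below assumes such a release and a next.config.ts file. It is not part of this template, which targets the experimental option shown above.

```typescript
// next.config.ts sketch, assuming a Next.js version that supports the
// top-level serverExternalPackages option (not what this template ships).
const nextConfig = {
  // Packages listed here are left out of the server bundle.
  serverExternalPackages: ["pdf2json"],
};

export default nextConfig;
```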

Blank output from pdfParser.getRawTextContent():

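This issue appears to occur when PDFParser is constructed without its second argument, which enables raw text mode. The library's type definitions may not declare that argument, so TypeScript rejects the call. There are two potential solutions:
1. Fix TypeScript definitions: Update the type definition for PDFParser so that its constructor accepts both arguments.
2. Bypass type checking: Cast PDFParser to any when instantiating it, as shown: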
  
  const pdfParser = new (PDFParser as any)(null, 1);
For more details, refer to my comment on this GitHub issue.
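
To show where that constructor call fits, here is a minimal sketch that wraps the parser's events in a Promise and resolves with the raw text. The parser is passed in, so the sketch does not depend on how it was constructed. The RawTextParser interface and the parseToRawText function are hypothetical names. The event names and methods are the ones pdf2json exposes, to the best of my knowledge.

```typescript
// Hypothetical interface covering only the parts of the parser used here.
interface RawTextParser {
  on(
    event: "pdfParser_dataReady" | "pdfParser_dataError",
    listener: (payload: any) => void
  ): unknown;
  parseBuffer(buffer: Buffer): void;
  getRawTextContent(): string;
}

// Resolves with the raw text once parsing finishes, or rejects on a parser error.
function parseToRawText(parser: RawTextParser, pdfBuffer: Buffer): Promise<string> {
  return new Promise((resolve, reject) => {
    // Register listeners before parsing starts so no event is missed.
    parser.on("pdfParser_dataError", (err) => reject(err?.parserError ?? err));
    parser.on("pdfParser_dataReady", () => resolve(parser.getRawTextContent()));
    parser.parseBuffer(pdfBuffer);
  });
}

// Intended usage with pdf2json (assumes the package is installed):
//   const pdfParser = new (PDFParser as any)(null, 1);
//   const text = await parseToRawText(pdfParser, fileBuffer);
```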

Acknowledgements 🙏

A special thanks to the following libraries and their contributors:

FilePond: for providing a seamless and user-friendly file uploading experience.
pdf2json: for its efficient and robust PDF parsing capabilities.

License 📜

MIT License

About

Next.js template for seamless PDF parsing using pdf2json and FilePond. Ideal for developers seeking a ready-to-use solution for PDF content extraction in Next.js projects.

twitter.com/tuff_stuff9
