Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

parserdata/parserdata-python-invoice-batch

Repository files navigation

Python Parserdata API

Parserdata: Batch Invoice Extraction with Python

This example shows how to extract structured invoice data from a folder of PDF / image invoices using the Parserdata API.

You point the script at a folder, it uploads each invoice to the /v1/extract endpoint, and prints back clean JSON with the extracted fields.


Official ParserData pages


What this example does

  • Reads all files from a folder that match a pattern (e.g. ~/Downloads/invoices/*)
  • Filters to supported file types: PDF, PNG, JPG, JPEG
  • Sends each file to the Parserdata API with a natural language prompt
  • Prints a structured JSON result for each invoice (invoice number, date, supplier, total, line items, etc.)
  • Shows basic error handling and debugging output

Requirements

  • Python 3.8+
  • A Parserdata API key (starts with pd_live_...)

1. Clone this repository

git clone https://github.com/parserdata/parserdata-python-invoice-batch.git
cd parserdata-python-invoice-batch

2. Install dependencies

We recommend using a virtual environment, but it's optional.

python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -r requirements.txt

3. Set your API key

You can set your API key as an environment variable:

macOS / Linux:

export PARSERDATA_API_KEY="pd_live_..."

Windows (PowerShell):

$env:PARSERDATA_API_KEY="pd_live_..."

Or you can hard-code it in the script (for quick tests only), but avoid committing your key to GitHub.

4. Put some invoices in a folder

Create a folder with your test invoices, for example:

  • C:\Users\YourName\Downloads\invoices\ (Windows)

  • /Users/yourname/Downloads/invoices/ (macOS)

The script will pick up files with these extensions: .pdf, .png, .jpg, .jpeg

5. Configure the input folder in the script

Open invoice_folder_extractor.py and adjust this line:

INPUT_GLOB = r"C:\Users\Admin\Downloads\invoices\*"

Examples:

  • Windows: INPUT_GLOB = r"C:\Users\YourName\Downloads\invoices*"

  • macOS / Linux: INPUT_GLOB = "/Users/yourname/Downloads/invoices/*"

6. Run the script

From the project root:

python invoice_folder_extractor.py

You should see output like:

=== invoice-001.pdf ===
Status: 200
{
 "invoice_number": "INV-001",
 "invoice_date": "2024-01-10",
 "supplier_name": "ACME Supplies",
 "total_amount": 1234.56,
 "line_items": [
 {
 "description": "Widget A",
 "quantity": 10,
 "unit_price": 12.34,
 "net_amount": 123.4
 }
 ]
}

If there's an error, you'll see:

=== invoice-001.pdf ===
Status: 400
Error body: { ... }

7. How the prompt works

In this example, we use a natural language prompt:

PROMPT = (
 "Extract invoice number, invoice date, supplier name, total amount, and line items "
 "(description, quantity, unit price, net amount)."
)

You can change this to match your use case:

  • Add/remove fields

  • Request alternative names

  • Ask for currency codes, tax amounts, etc.

For stricter outputs, you can switch to schema mode in the API. This example keeps it simple and just uses a prompt.

8. Common issues

  • 401 / 403 errors

Double-check your API key and that it's set in PARSERDATA_API_KEY.

  • Timeouts

Default timeout is 300 seconds per request in this script. You can reduce it if needed:

r = requests.post(..., timeout=60)
  • No files found

Check INPUT_GLOB and make sure the path and pattern match where your invoices actually are.

9. Next steps

  • Pipe results into a database or a CSV instead of printing

  • Wrap the logic into a function and reuse it from another app

  • Combine with automation tools (Zapier, Make, n8n, etc.) for fully automated flows

License

MIT – feel free to copy and adapt this example.


Need help or a custom setup?

This repository is a reference example.

If you need help tailoring it to your workflow, or want advice on a more advanced Parserdata API integration (custom schemas, scale, or production use), reach out to us: support@parserdata.com

About

Python example for batch invoice extraction with the Parserdata API. Upload PDFs or images from a local folder and receive structured JSON output.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

Contributors

Languages

AltStyle によって変換されたページ (->オリジナル) /