Process PDF for OCR

Developers ♥ Pipedream

Chase Roberts
@chsrbrts
@benedictevans If you haven’t used @pipedream yet, then you haven’t lived.
✨Ellie Day✨
@heyellieday
Evaluation update: @pipedream has quite literally been a dream to work with! I’m excited to leverage this tool for all the various workflows I need to write. I’m currently at 11k invocations a day from the initial workflows I’ve written in the past couple weeks.
Michael Braedley
@MBraedley
Update: I got it working properly, and it's working so well that I'm dropping IFTTT. @pipedream can do everything that IFTTT basic can, and most (if not all things) IFTTT pro can for free or at a reasonable price if you need it. I am recommending it for basically any power user.
Thomas Cutting
@mrthomascutting
Want quick+dirty integrations for a serverless workflow - @pipedream is my new go-to 😃
Matthew Roberts
@mattdotroberts
day 013 - finally hit node js. This is the secret sauce of taking #nocode projects that one step further. Pumped about getting deeper into @pipedream now
Kenneth Auchenberg 💭
@auchenberg
Yahoo Pipes is back! Kinda 😍 @pipedream
Raymond Camden 🥑
@raymondcamden
Awesome video by the @pipedream folks showing real time twitter sentiment analysis integrated with Google Sheets. This is where Pipedream *really* shines, connecting systems together in easy workflows.
Nacho Caballero
@nachocaballero
I couldn't recommend @pipedream more. It's an amazing service to integrate different APIs. Much more powerful than Zapier and more user-friendly than AWS Lambda. I'm very proud to wear this t-shirt #NoCode
Jason Snow
@jyksnw
Developed a working prototype environmental sensor IoT solution with @particle Photon, @pipedream, and @MongoDB with full graphing and alerting in less than a day! All amazing technology, will def. be exploring these more.
Steven Terrana
@steven_terrana
@burgwyn you've inspired me to finally set up my own blog. I'll make sure my first blog post explains the tech behind the setup. think @obsdmd + @GatsbyJS + @pipedream.
🚄 James Augeri, PhD
@DotDotJames
Want to low-code your back end, need more horsepower than @Bubble / @KnackHQ, or just miss Yahoo! Pipes? Check out @PipeDream
Sébastien Chopin
@Atinux
GitHub issues should be like @linear_app for maintainers. Looking forward more integrations with GH actions or tools like @pipedream 👀
Raul
@raul_predescu
If you're a dev and not using @pipedream, you're missing out. Been using it for months, daily. FREE for devs. Plenty of integrations and good limits. Absolutely love it.
Bruno Skvorc
@bitfalls
So @pipedream is pretty amazing. In 3 minutes I just made a flow which adds @rickastley's Never Gonna Give You Up to my @spotify playlist whenever a new pull request arrives in an old repo of mine.
Zach Lanich
@ZachLanich
Um, wow 🤯 @pipedream
Steven Bell
@bellontech
I just used @pipedream to build a Shopify App. Wow, they make small backed tasks easy.
Jay Hack 🎩🇺🇸
@_jayhack_
Very impressed with this bad boi - it reminds me of a @PalantirTech internal tool, but geared towards integrations instead of data analysis and far more customizable. Great expectations here 🚀🤩
Tree Sturgeon 🔥🚴‍♂️🌳
@philsturgeon
For context this is day 2 of a really challenging and stupid migration from Notion to @airtable with disparate/missing data. It's going better than expected and thanks to @pipedream I don't have to bother the iOS dev to add W3W.

import common from "../common/process-base.mjs";

export default {
  ...common,
  key: "ocrspace-process-pdf",
  name: "Process PDF for OCR",
  description: "Submit a PDF for OCR processing. [See the documentation](https://ocr.space/ocrapi)",
  version: "0.1.2",
  annotations: {
    destructiveHint: false,
    openWorldHint: true,
    readOnlyHint: false,
  },
  type: "action",
  props: {
    ...common.props,
    file: {
      propDefinition: [
        common.props.ocrspace,
        "file",
      ],
      label: "PDF File",
      description: "The URL of the PDF file or the path to the file saved to the `/tmp` directory (e.g. `/tmp/example.pdf`) to process. [See the documentation](https://pipedream.com/docs/workflows/steps/code/nodejs/working-with-files/#the-tmp-directory).",
    },
    syncDir: {
      type: "dir",
      accessMode: "read",
      sync: true,
      optional: true,
    },
  },
  methods: {
    getSummary() {
      return "Submitted PDF for OCR processing.";
    },
  },
};
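The component inherits its request logic from `../common/process-base.mjs`, which is not shown on this page. As a rough illustration of what such a base might do, the sketch below maps the action's props to form fields for an OCR request. The helper name `buildOcrFields` and the `localPath` key are hypothetical, and the field names (`url`, `filetype`, `OCREngine`, and so on) are assumed from OCR.space's public API documentation rather than taken from the component source.

```javascript
// Hypothetical helper, not part of the component source.
// Field names are assumed to match OCR.space's documented form parameters.
function buildOcrFields(props) {
  const fields = { filetype: "PDF" };

  // Strings starting with http:// or https:// are treated as remote URLs;
  // anything else is assumed to be a local path such as /tmp/example.pdf.
  if (/^https?:\/\//i.test(props.file)) {
    fields.url = props.file;
  } else {
    // Illustrative key only: a real caller would read this path and
    // attach the file contents to the request.
    fields.localPath = props.file;
  }

  const optional = {
    language: props.language,
    isOverlayRequired: props.isOverlayRequired,
    detectOrientation: props.detectOrientation,
    scale: props.scale,
    isTable: props.isTable,
    OCREngine: props.ocrEngine,
  };

  // Copy only the options the user actually set, as strings,
  // since form fields carry text values.
  for (const [name, value] of Object.entries(optional)) {
    if (value !== undefined && value !== null) {
      fields[name] = String(value);
    }
  }
  return fields;
}

// Example call with a remote PDF and two options set.
const exampleFields = buildOcrFields({
  file: "https://example.com/invoice.pdf",
  language: "eng",
  isTable: true,
});
console.log(exampleFields);
```

Leaving unset options out of the request lets the API apply its own defaults, such as `scale=false`.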

Label	Prop	Type	Description
OCRSpace	`ocrspace`	`app`	This component uses the OCRSpace app.
Language	`language`	`string`	Select a value from the drop-down menu: `ara` (Arabic), `bul` (Bulgarian), `chs` (Chinese (Simplified)), `cht` (Chinese (Traditional)), `hrv` (Croatian), `cze` (Czech), `dan` (Danish), `dut` (Dutch), `eng` (English), `fin` (Finnish), `fre` (French), `ger` (German), `gre` (Greek), `hun` (Hungarian), `kor` (Korean), `ita` (Italian), `jpn` (Japanese), `pol` (Polish), `por` (Portuguese), `rus` (Russian), `slv` (Slovenian), `spa` (Spanish), `swe` (Swedish), `tur` (Turkish)
Is Overlay Required	`isOverlayRequired`	`boolean`	If true, returns the coordinates of the bounding boxes for each word. If false, the OCR'ed text is returned only as a text block (this makes the JSON response smaller). Overlay data can be used, for example, to show text over the image.
Detect Orientation	`detectOrientation`	`boolean`	If set to true, the API auto-rotates the image correctly and sets the TextOrientation parameter in the JSON response. If the image is not rotated, then TextOrientation=0; otherwise it is the degree of the rotation, e.g. "270".
Scale	`scale`	`boolean`	If set to true, the API does some internal upscaling. This can improve the OCR result significantly, especially for low-resolution PDF scans. Note that the front page demo uses scale=true, but the API uses scale=false by default. See also this OCR forum post.
Is Table	`isTable`	`boolean`	If set to true, the OCR logic makes sure that the parsed text result is always returned line by line. This switch is recommended for table OCR, receipt OCR, invoice processing, and all other types of input documents that have a table-like structure.
OCR Engine	`ocrEngine`	`string`	Select a value from the drop-down menu: `1` (OCR Engine 1), `2` (OCR Engine 2)
PDF File	`file`	`string`	The URL of the PDF file or the path to the file saved to the `/tmp` directory (e.g. `/tmp/example.pdf`) to process. [See the documentation](https://pipedream.com/docs/workflows/steps/code/nodejs/working-with-files/#the-tmp-directory).
N/A	`syncDir`	`dir`	This component uses `dir` to share files between component executions.
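The `isOverlayRequired` and `detectOrientation` options change what comes back in the JSON response. The sketch below shows one way a later workflow step might read that response. The function name `summarizeOcrResult` is hypothetical, and the response field names (`ParsedResults`, `ParsedText`, `TextOverlay`, `IsErroredOnProcessing`, and so on) are assumed from OCR.space's public API documentation; only `TextOrientation` is named on this page, so check the others against a real response before relying on them.

```javascript
// Hypothetical helper for a downstream step, not part of the component.
// Response field names are assumptions based on OCR.space's API docs.
function summarizeOcrResult(response) {
  if (response.IsErroredOnProcessing) {
    // ErrorMessage may be a string or an array of strings; normalize it.
    const messages = [].concat(response.ErrorMessage || "Unknown OCR error");
    throw new Error(messages.join("; "));
  }

  return (response.ParsedResults || []).map((page, index) => {
    const lines = (page.TextOverlay && page.TextOverlay.Lines) || [];
    return {
      page: index + 1,
      text: page.ParsedText || "",
      // Treated as 0 when missing; otherwise the rotation in degrees.
      orientation: Number(page.TextOrientation || 0),
      // Word boxes are only present when isOverlayRequired was true.
      words: lines.flatMap((line) =>
        (line.Words || []).map((word) => ({
          text: word.WordText,
          left: word.Left,
          top: word.Top,
          width: word.Width,
          height: word.Height,
        }))
      ),
    };
  });
}

// Hand-written sample shaped like the assumed response, not real API output.
const sampleResponse = {
  IsErroredOnProcessing: false,
  ParsedResults: [
    {
      ParsedText: "Total 42.00",
      TextOrientation: "270",
      TextOverlay: {
        Lines: [
          {
            Words: [
              { WordText: "Total", Left: 10, Top: 20, Width: 50, Height: 12 },
              { WordText: "42.00", Left: 70, Top: 20, Width: 48, Height: 12 },
            ],
          },
        ],
      },
    },
  ],
};
console.log(summarizeOcrResult(sampleResponse));
```

When overlay data was not requested, `words` is simply an empty array, so the same code handles both response shapes.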

Process PDF for OCR with OCRSpace API

Pipedream makes it easy to connect APIs for OCRSpace and 3,000+ other apps remarkably fast.

Trusted by 1,000,000+ developers from startups to Fortune 500 companies
