An npm package that extracts text from PowerPoint, PDF, and Word documents, so you can analyze, process, or integrate that text into your applications.
- Parse text from PPT, PDF, and DOCX files
- Easy-to-use API
- High performance and accuracy
- Supports multiple file formats
- Lightweight and fast
Install the package via npm:
npm install @xoxoharsh/multiparser
Here's how to use the package in your project:
- To parse a whole file:

    import Parser from '@xoxoharsh/multiparser';

    // filePath is the path to the PPT, PDF, or DOCX file to parse
    const parser = new Parser(filePath);

    parser
      .extractAll()
      .then((text) => {
        console.log(text);
      })
      .catch((error) => {
        console.error("Error extracting text:", error);
      });
- To parse a particular page:

    import Parser from '@xoxoharsh/multiparser';

    // Currently this feature is not available for Word documents
    const parser = new Parser(filePath);

    parser
      .extractPage(pageNo)
      .then((text) => {
        // Log the requested page number rather than a hard-coded one
        console.log(`Page ${pageNo} text:`, text);
      })
      .catch((error) => {
        console.error("Error extracting text:", error);
      });
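Because page extraction is not available for Word documents, a caller may want to fall back to whole-file extraction for those files. The following is a minimal sketch of one way to do that, not part of the package itself. The helper names `supportsPageExtraction` and `extractText` and the `WORD_EXTENSIONS` list are hypothetical, and detecting Word files by file extension is an assumption made for illustration. The only package calls it relies on are `extractAll()` and `extractPage(pageNo)` from the examples above.

```javascript
// Extensions treated as Word documents. This list is an assumption for
// illustration; the package may detect file types differently.
const WORD_EXTENSIONS = ['.doc', '.docx'];

// Returns true when the file name does not look like a Word document.
function supportsPageExtraction(filePath) {
  const lower = filePath.toLowerCase();
  return !WORD_EXTENSIONS.some((ext) => lower.endsWith(ext));
}

// `parser` is any object exposing extractAll() and extractPage(pageNo),
// such as a Parser instance created as in the examples above.
async function extractText(parser, filePath, pageNo) {
  if (pageNo !== undefined && supportsPageExtraction(filePath)) {
    return parser.extractPage(pageNo);
  }
  // Fall back to the whole document when no page is requested
  // or the file looks like a Word document.
  return parser.extractAll();
}
```

With a real parser, this could be called as `extractText(new Parser(filePath), filePath, pageNo)` inside an `async` function, wrapped in `try`/`catch` to handle extraction errors.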
We welcome contributions!