docx-parser

Here are 15 public repositories matching this topic...

Languages: Python (12), HTML (1), Rust (1)

ispras / dedoc

Star 610

Dedoc is a library (service) for automated document parsing and conversion to a uniform format. It automatically extracts content, logical structure, tables, and meta information from textual electronic documents. (Parse document; Document content extraction; Logical structure extraction; PDF parser; Scanned document parser; DOCX parser; HTML parser

html pdf ocr table-of-contents excel html-parser docx documents doc scanned-documents txt document-analysis odt pdf-parser table-recognition docx-parser document-content-extraction logical-structure-extraction

Updated Sep 22, 2025
Python

has-abi / docparser

Star 11

Extract text from your DOCX documents.

text-parser document-parser doc-parser docx-parser

Updated Feb 10, 2024
Python

omar2535 / BioLife-AU-01-attendance-parser

Star 2

Biolife-AU-01 time clock (punch clock) parser

parser html-parser docx docx-parser

Updated Aug 2, 2025
Python

sarabjit1003 / resume-tracker

Star 2

A smart resume screening tool that matches resumes to job descriptions using Streamlit and Python.

python data-analysis pdf-parser job-matching ai-project streamlit docx-parser career-tools resume-tracker portfolio-proj

Updated Jun 5, 2025
Python

lukethacoder / docx-to-html

Star 2

📃 A GUI-based DOCX-to-HTML parser. Useful for stripping inline styles out of DOCX files.

docx rich-text docx-parser

Updated Oct 14, 2025
HTML

FayazK / Document-Metadata-Extractor

Star 1

A Python tool that uses Google's Gemini AI to automatically extract structured metadata from PDF and DOCX documents, saving results to Excel for easy analysis and organizing raw responses as JSON files.

nlp text-analysis data-extraction document-management metadata-extraction pdf-parser document-processing excel-export json-output python-automation docx-parser generative-ai gemini-ai-project content-indexing

Updated Mar 21, 2025
Python

Talabov / Resume-Parser-API

Star 1

Extract key details from resumes (PDF or DOCX) via a fast Flask API. Returns name, contact info, skills, experience, and education in clean JSON.

resume-parser flask-api pdf-processing docx-parser hr-tools

Updated Jun 21, 2025

kchernokozinsky / paper-sage

Star 0

AI-powered student assignment evaluator written in Rust. Supports code, PDF, and DOCX files. Uses local or remote LLMs to grade submissions based on configurable criteria, and exports results to Excel.

rust education ai grading openai code-review gpt student-assignments cli-tool excel-export pdf-processing docx-parser llm ollama automated-evaluation

Updated Jun 26, 2025
Rust

xsukax / xsukax-Word-Document-Comparison-Tool

Star 0

A powerful, privacy-focused web application for side-by-side comparison of Word documents with intelligent diff highlighting, comprehensive analytics, and multilingual support including Arabic and RTL languages.

word-diff visual-diff document-analysis document-processing single-page-application diff-viewer change-tracking comparison-tool word-processor document-comparison docx-parser office-documents rtl-support content-comparison word-document-diff docx-comparison side-by-side-diff text-difference-viewer file-comparison-tool similarity-checker

Updated Oct 15, 2025
Python

coffeemesh / compareFootnotes

Star 0

Small script for comparing footnotes in .docx files. The result is written to a .csv file.

python script compare-text docx-parser docx2python

Updated Jan 11, 2025
Python

Imtiazsalaf-01 / Automated-Resume-Parser

Star 0

Automated Resume Parser, built at Codec Technologies during an internship. An intelligent parser that extracts candidate details (name, contact, skills, experience, education) from PDF/DOCX resumes and converts them into structured data to improve recruitment efficiency.