eliask/pdfssa4met

===============================================================================
PDF Structure and Syntactic Analysis for Metadata Extraction and Tagging
===============================================================================
PDFSSA4MET attempts to extract and tag metadata, based on structural and
syntactic analysis of content in XML.

PDFSSA4MET depends on pdf2xml, which is available in binary form at
https://github.com/eliask/pdf2xml/tags. The master branch of that
repository also contains the source.

PDFSSA4MET is written in Python.

The following scripts can be run directly and write their results to STDOUT:

    pdf2xml.py
    headings.py
    references.py
    socialtags.py

Help, options and arguments for each script are available with the -h or
--help option, e.g.:

    python references.py --help
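
The same help lookup can be scripted for all four tools. The sketch below is
only an illustration and is not part of PDFSSA4MET: the helper names
(help_command, show_help) are hypothetical, it assumes the scripts sit in the
current directory, and it uses Python 3's subprocess.run for the wrapper.
Which interpreter version the scripts themselves need is not stated here, so
the interpreter is left as a parameter.

```python
import subprocess
import sys

# Script names exactly as listed in this README.
SCRIPTS = ["pdf2xml.py", "headings.py", "references.py", "socialtags.py"]


def help_command(script, python=sys.executable):
    """Build the argv list that asks `script` for its help text.

    `python` is the interpreter to use (an assumption: adjust it to
    whichever interpreter the scripts actually run under).
    """
    if script not in SCRIPTS:
        raise ValueError("unknown script: %s" % script)
    return [python, script, "--help"]


def show_help(script, python=sys.executable):
    """Run `script --help` and return whatever it wrote to STDOUT."""
    result = subprocess.run(
        help_command(script, python), capture_output=True, text=True
    )
    return result.stdout


if __name__ == "__main__":
    # Print the commands only; call show_help(name) to actually run them.
    for name in SCRIPTS:
        print(" ".join(help_command(name, python="python")))
```

Building the command as a list rather than a single string avoids shell
quoting issues if the scripts are later given file paths as arguments.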

About

PDF Structure and Syntactic Analysis for Metadata Extraction and Tagging - https://code.google.com/p/pdfssa4met/
