eliask/pdfssa4met

Folders and files

LICENSE.txt
README.txt
config.py
headings.py
openCalais.py
pdf2xml.py
references.py
socialtags.py
utils.py


===============================================================================
PDF Structure and Syntactic Analysis for Metadata Extraction and Tagging
===============================================================================
PDFSSA4MET attempts to provide metadata extraction and tagging based on
structural and syntactic analysis of content in XML.

PDFSSA4MET depends on pdf2xml, which is available in binary form at
https://github.com/eliask/pdf2xml/tags. The master branch of the
repository also contains the source.

PDFSSA4MET is written in Python.

The following scripts can be run directly and write their results to STDOUT:

    pdf2xml.py
    headings.py
    references.py
    socialtags.py

Help, options, and arguments for each script are available with the -h or
--help option, e.g.:

    python references.py --help
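
Because each script writes to STDOUT, its output can also be captured from
another Python program. The following is a minimal sketch, not part of
PDFSSA4MET: the helper names help_command and capture_help are hypothetical,
and it assumes the current directory is a checkout of this repository and
that the chosen interpreter can run the scripts. Only the --help option
documented above is used.

```python
import subprocess

# Script names as listed in this README.
SCRIPTS = ["pdf2xml.py", "headings.py", "references.py", "socialtags.py"]


def help_command(script, interpreter="python"):
    """Build the argument list for `<interpreter> <script> --help`."""
    if script not in SCRIPTS:
        raise ValueError("unknown script: %s" % script)
    return [interpreter, script, "--help"]


def capture_help(script, interpreter="python"):
    """Run a script with --help and return the text it wrote to STDOUT.

    Assumption: the working directory contains the script and
    `interpreter` is an executable that can run it.
    """
    result = subprocess.run(
        help_command(script, interpreter),
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode STDOUT as text
    )
    return result.stdout
```

From a checkout, capture_help("references.py") would return the same help
text that the command above prints to the terminal.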

About

PDF Structure and Syntactic Analysis for Metadata Extraction and Tagging - https://code.google.com/p/pdfssa4met/
