This project uses the docx2hub library to convert docx files to XML.
For the hello-word sample, docx2hub generates flat Hub XML annotated with CSSa XML attributes:
<hub xmlns="http://docbook.org/ns/docbook"
     xmlns:css="http://www.w3.org/1996/css"
     xml:base="file:///C:/cygwin64/home/kraetke/docx2hub-frontend/sample/hello-word.hub.xml"
     xml:lang="de"
     css:rule-selection-attribute="role"
     css:version="3.0-variant le-tex_Hub-1.2"
     version="5.1-variant le-tex_Hub-1.2">
  <info>
    <keywordset role="hub">
      <!-- (...) hub format properties -->
    </keywordset>
    <keywordset role="docProps">
      <!-- (...) document properties -->
    </keywordset>
    <css:rules>
      <css:rule layout-type="para"
                native-name="heading 1"
                css:font-size="14pt"
                css:font-family="Calibri"
                css:page-break-after="avoid"
                css:margin-top="24pt"
                css:margin-bottom="0pt"
                remap="h1"
                css:font-weight="bold"
                css:color="#365F91"
                name="berschrift1"/>
    </css:rules>
  </info>
  <para role="berschrift1"><phrase css:font-style="italic">Hello Word!</phrase></para>
</hub>
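To illustrate how the paragraph `role` ties back to a `css:rule` in the `info` section, here is a minimal Python sketch that reads such a file with the standard library. It is not part of docx2hub; the helper names `style_rules` and `paragraphs` are made up for this example, and the embedded document is a trimmed copy of the sample output.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the sample output.
NS = {
    "db": "http://docbook.org/ns/docbook",
    "css": "http://www.w3.org/1996/css",
}

# Trimmed copy of the sample; a real file would be loaded with ET.parse(path).
HUB_SAMPLE = """\
<hub xmlns="http://docbook.org/ns/docbook"
     xmlns:css="http://www.w3.org/1996/css"
     version="5.1-variant le-tex_Hub-1.2">
  <info>
    <css:rules>
      <css:rule layout-type="para" native-name="heading 1" name="berschrift1"
                css:font-size="14pt" css:font-weight="bold"/>
    </css:rules>
  </info>
  <para role="berschrift1"><phrase css:font-style="italic">Hello Word!</phrase></para>
</hub>
"""


def style_rules(root):
    """Map each css:rule name to its CSSa properties (attributes in the css namespace)."""
    prefix = "{%s}" % NS["css"]
    rules = {}
    for rule in root.iterfind("db:info/css:rules/css:rule", NS):
        rules[rule.get("name")] = {
            key[len(prefix):]: value
            for key, value in rule.attrib.items()
            if key.startswith(prefix)
        }
    return rules


def paragraphs(root):
    """Return (role, text) pairs for the top-level paragraphs of the flat Hub document."""
    return [(p.get("role"), "".join(p.itertext())) for p in root.iterfind("db:para", NS)]


root = ET.fromstring(HUB_SAMPLE)
rules = style_rules(root)
for role, text in paragraphs(root):
    # Look up the style rule that the paragraph's role attribute points to.
    print(role, rules.get(role, {}), text)
```

Because the Hub output is flat, iterating over the direct `para` children of the root is enough here; nested structures would need a recursive search instead.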
At least Java 1.7 is required.
This project depends on Git submodules, so you have to clone it with the --recursive
option to fetch the submodules as well:
git clone https://github.com/transpect/docx2hub-frontend --recursive
For convenient use on the command line, we provide a simple Bash script. You can run it like this:
./docx2hub.sh sample/hello-word.docx
We also provide Bash and Windows Batch scripts to invoke the XProc pipeline directly:
./calabash.sh -o result=sample/hello-word.xml xpl/docx2hub-frontend.xpl docx=sample/hello-word.docx
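If several files need converting, the direct pipeline invocation can be wrapped in a small script. The following Python sketch is only an illustration: the helper `calabash_command` is hypothetical, it assumes the script is started from the repository root, and it assumes the result should sit next to the input with an `.xml` extension, as in the invocation above.

```python
import subprocess
import sys
from pathlib import PurePosixPath


def calabash_command(docx_path):
    """Build the calabash.sh argument list for one docx file.

    Hypothetical helper, not part of the project. The result path is the
    input path with its extension replaced by .xml.
    """
    result = PurePosixPath(docx_path).with_suffix(".xml")
    return [
        "./calabash.sh",
        "-o", f"result={result}",
        "xpl/docx2hub-frontend.xpl",
        f"docx={docx_path}",
    ]


if __name__ == "__main__":
    # Convert every docx file named on the command line; stop at the first failure.
    for docx in sys.argv[1:]:
        subprocess.run(calabash_command(docx), check=True)
```

Saved under a name of your choice, for example `convert_all.py`, it could be called as `python convert_all.py sample/hello-word.docx`.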
Please refer to this tutorial for more extensive documentation.
Currently docx files from the following applications are supported:
- Microsoft Word
- LibreOffice
- Google Docs