Hermes - a semantic XML+MathML+Unicode e-publishing/self-archiving tool for LaTeX authored scientific articles

Download version 0.9.4:

Users, developers and/or philosophers are invited to subscribe to the Hermes mailing list or to send their comments to the author. Hermes has a blog too: your feedback is appreciated.

Examples

Some results of Hermes-assisted conversions are available here; the source distribution also contains an article in LaTeX source, as well as a content-oriented source sample.

What is Hermes?

Hermes is a grammar-based translator from (AMS)LaTeX to Unicode (UTF-8) XML+MathML+metadata. It is software libre.
Hermes does not yet support translating pure (AMS)TeX documents; this facility will be added sooner or later, depending on user interest.

What for?

Hermes is here to help individuals with self-archiving, libraries with long-term archiving, and publishers by giving them a reference document for their various specific services.

How does it work?

Hermes follows the steps below, in the specified order:
  1. semantically seeds a copy of your TeX source
  2. lets the TeX program do its job (texing) on this semantically enriched source
  3. parses the resulting semantic dvi
  4. generates the XML reference document, an XML reflection of your TeX source.
It works on Linux, Windows and OS X.

What is the Hermes reference document?

It is a Unicode XML document with a generic structure, containing free text and various XML vocabularies.
Its validating XML Schema will be published once this generic structure becomes less fluid.
Currently, the generic structure consists of:
  1. sections, paragraphs,
  2. presentation hints (currently font names and sizes),
  3. free text (TeX glyphs, including accented ones, mapped to their Unicode equivalents),
  4. metadata (title, author, date, etc.),
  5. bibliography,
  6. internal and external references (no need for special LaTeX packages to get these activated in the XML),
  7. tables, images.
These items are in a one-to-one relationship with the corresponding structures in the source/semantic dvi. This list is easily extensible.
The XML vocabularies reflect the vocabularies used in the LaTeX source, e.g. mathematical regions in the LaTeX source correspond to MathML regions in the reference document.
MathML is the only validatable XML vocabulary currently implemented and supported by Hermes (SVG and other vocabularies, like MARC, or other open standards, may follow if users are interested).
When Hermes translates legacy LaTeX files (that is, sources which were not edited with semantic vocabularies in mind) without manual intervention on the source, it generates only Presentation MathML.
Content MathML can only be generated if a newly authored LaTeX source uses the semantic LaTeX macros available in the Hermes distribution.

Installation requirements

To compile the program and get the proper example output, you need a standard LaTeX system, gcc, bison, flex, make and libxml/xslt on your system. Windows developers can check out the Cygwin distribution; Windows users will have a binary distribution (hermes.exe and seed.exe), issued (almost) synchronously with the source distribution.
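Before running make, it can help to confirm that the tools listed above are actually on your PATH. The snippet below is only a sketch and not part of the Hermes distribution: the function name missing_tools is hypothetical, and checking for xsltproc as a stand-in for libxml/xslt is an assumption.

```shell
#!/bin/sh
# missing_tools is a hypothetical helper, not shipped with Hermes.
# It prints, one per line, each named tool that is NOT found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, `missing_tools latex gcc bison flex make xsltproc` prints nothing when every listed tool is available, and otherwise lists the ones you still need to install.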
Developers and Unix users can unpack the source distro and run make.
After a successful 'make' you get:

General use

Follow the steps below:
'Validate' your source:
  1. write an (AMS)LaTeX text containing mathematical expressions and fix all your editing errors ;).
  2. run latex document.tex; if you didn't get a dvi, return to step 1.
Use Hermes to get the reference document (library) and renderable (publish) XML files:
  1. run ./seed document.tex; if you didn't get document.s.tex, go to found-a-bug.
  2. run latex document.s.tex; if you didn't get document.s.dvi, go to found-a-bug.
  3. run ./hermes document.s.dvi > document.lib.xml; if you didn't get document.lib.xml, go to found-a-bug.
  4. run xsltproc pub.xslt document.lib.xml > document.pub.xml; if you didn't get document.pub.xml, go to found-a-bug.
  5. now you can archive document.lib.xml or send it to your library, and post document.pub.xml on your website, along with the MathML stylesheets, for others to read/reuse.
found-a-bug:
let the author know, fix it yourself, or ask around.
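The Hermes steps above can be chained in a small shell wrapper. This is a minimal sketch, not a script shipped with Hermes: the function names are hypothetical, and it assumes you run it from a directory that contains seed, hermes, pub.xslt and the already validated LaTeX source. The commands themselves are the ones listed in the steps.

```shell
#!/bin/sh
# Hypothetical helpers: derive the file names used in the steps above
# from the LaTeX source name, e.g. document.tex -> document.s.tex.
seeded_tex() { printf '%s.s.tex\n' "${1%.tex}"; }
seeded_dvi() { printf '%s.s.dvi\n' "${1%.tex}"; }
lib_xml()    { printf '%s.lib.xml\n' "${1%.tex}"; }
pub_xml()    { printf '%s.pub.xml\n' "${1%.tex}"; }

# Fail with a "found-a-bug" message unless the file exists and is non-empty.
# (A shell redirection creates its target even when the command fails,
# so checking for mere existence would not be enough.)
need_file() {
    [ -s "$1" ] || { echo "found-a-bug: $1 was not produced" >&2; return 1; }
}

# Run the four conversion steps in order, stopping at the first failure.
hermes_convert() {
    src=$1
    ./seed "$src"
    need_file "$(seeded_tex "$src")" || return 1
    latex "$(seeded_tex "$src")"
    need_file "$(seeded_dvi "$src")" || return 1
    ./hermes "$(seeded_dvi "$src")" > "$(lib_xml "$src")"
    need_file "$(lib_xml "$src")" || return 1
    xsltproc pub.xslt "$(lib_xml "$src")" > "$(pub_xml "$src")"
    need_file "$(pub_xml "$src")" || return 1
}
```

Calling `hermes_convert document.tex` should leave document.lib.xml (for archiving) and document.pub.xml (for publishing) next to the source, or stop with a message naming the first file that was not produced.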

Architecture of Hermes

Developer's tips

To do

Hermes has its own website, hosted by the Albert Einstein Institute, and is also hosted on the author's personal website. Check them often: minor releases come monthly.

Credits

Hermes is covered by the GNU GPL. It is being developed by Romeo Anghelache. It was created in the framework of the EU-funded MoWGLI IST project (ended in Feb. 2005), as a task for LivingReviews, from the Max Planck Institute for Gravitational Physics, Golm, Germany.
Its further development is currently supported by the Max Planck Institute for Gravitational Physics.