Christian Gagné
August 2016
The following is a small demo of the use of Emacs Org-mode coupled with a TeX engine to produce Web content in different formats. A preliminary version of the present page was shown at the TeX Users Group 2016 meeting in Toronto.
Below is a comparison of the same text content, first in HTML, then as SVG inside an img element with text converted to paths, and finally as directly-included SVG with proper SVG Text. Display may vary depending on the browser, even in recent versions. The text enunciates a principle that the author of the present page holds dear.
With highly structured notations, the verbosity might actually be embraced if the material is narrative, with both long stretches of linear sequences and arbitrary depth in some places. The depth is often the result of aesthetic goals that require many levels of tag nesting, which might indeed create a Welter of Delimiters.
When the intended fonts are present, the last two instances have much better layout since the content was typeset by a TeX engine in DVI mode and subsequently converted by the wonderful tool dvisvgm. The intended fonts, in this case, are Charis SIL or Bitstream Charter: the metrics in the DVI were calculated with Charter chosen as the main body font. Many font families with similar metrics will give pleasing results. Such sophisticated typesetting affords both economy and ergonomics, and that is why I consider it worthy of the effort for Web content production – and also because this allows one to typeset for the Web the kind of complex content that TeX excels at, be it book-quality tables, expansive bibliographies or multi-language text in several directions. Perhaps someday the Web browsers will include magical unicorn TeX-equivalent layout engines, but that day has not yet arrived.
With the third instance, we get the best of both worlds: the SVG Text can be made as semantic and as accessible as one desires by adding the appropriate markup (e.g. ARIA). It can also be selected, copied and pasted into applications that recognize the content type as text.
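The DVI-to-SVG step described above can be sketched roughly as follows. This is only an illustration, not the actual build procedure used for this page: it assumes that `latex` and `dvisvgm` are installed and on the PATH, that plain `latex` is the engine producing the DVI, and that a hypothetical source file such as `demo.tex` sits in the current directory. The function names are invented for the example.

```python
import subprocess
from pathlib import Path


def build_commands(tex_file: str, text_as_paths: bool = True) -> list[list[str]]:
    """Return two command lines: TeX -> DVI, then DVI -> SVG."""
    stem = Path(tex_file).stem
    # Assumption: plain `latex` is the engine that writes DVI output.
    to_dvi = ["latex", "-interaction=nonstopmode", tex_file]
    to_svg = ["dvisvgm"]
    if text_as_paths:
        # --no-fonts asks dvisvgm to draw glyphs as paths
        # instead of emitting text that refers to fonts.
        to_svg.append("--no-fonts")
    to_svg += [f"--output={stem}.svg", f"{stem}.dvi"]
    return [to_dvi, to_svg]


def convert(tex_file: str, text_as_paths: bool = True) -> None:
    """Run both steps; raises CalledProcessError if either tool fails."""
    for command in build_commands(tex_file, text_as_paths):
        subprocess.run(command, check=True)
```

Calling `convert("demo.tex")` would correspond to the second instance (text as paths), while `convert("demo.tex", text_as_paths=False)` leaves the text as SVG text. How the fonts are then made available to the browser is a separate matter that this sketch does not address.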
Now, the following is an example of how Org-mode syntax can be used to produce such reusable content, with flexible structure notations that export to multiple formats (in this case primarily HTML5 and LaTeX).
#+begin_section
#+attr_html: :id about-blocks :class mt_sectitle
#+begin_header
How block-delimited lines become paragraphs,
which are at the syntactic level
of mt_concept(elements) in Org parlance
#+end_header
The parsing rules dictate that, because Org
is very much a mt_emphasis(line-based) format
(in its surface guise), paragraphs are created
inside the special blocks as expected.
As for the lexical macros, they are expanded
at the very beginning of the export process.
#+end_section
The above contains Org special blocks, which create so-called ‘greater elements’ and ‘elements’ (roughly equivalent to CommonMark’s container blocks and leaf blocks), plus some lexical macros, which are Org ‘objects’ (inlines in Markdown/CommonMark). The mt_ prefix means ‘multi-target’, since the constructs will be replaced by the appropriate HTML or TeX markup depending on the selected export format. Note that the macro names are here rendered in a different font so as to denote that they are user-defined names in the Org source.
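To make the ‘multi-target’ idea concrete, here is a minimal Python sketch of such a lexical substitution. It is not how Org itself expands macros, and the HTML and LaTeX templates are hypothetical: the markup that the actual macros produce is not shown here, so `<dfn>`, `<em>`, `\textit` and `\emph` are only plausible stand-ins.

```python
import re

# Hypothetical templates: illustrative guesses, one table per export target.
TEMPLATES = {
    "html": {
        "mt_concept": "<dfn>{}</dfn>",
        "mt_emphasis": "<em>{}</em>",
    },
    "latex": {
        "mt_concept": r"\textit{{{}}}",
        "mt_emphasis": r"\emph{{{}}}",
    },
}

# A macro call is an mt_ name directly followed by a parenthesized argument.
MACRO = re.compile(r"\b(mt_\w+)\(([^()]*)\)")


def expand(text: str, target: str) -> str:
    """Replace every known mt_ macro call with markup for `target`."""
    templates = TEMPLATES[target]

    def substitute(match: re.Match) -> str:
        name, argument = match.group(1), match.group(2)
        template = templates.get(name)
        # Unknown mt_ names are left untouched rather than guessed at.
        return match.group(0) if template is None else template.format(argument)

    return MACRO.sub(substitute, text)
```

For instance, `expand("a mt_emphasis(line-based) format", "latex")` gives `a \emph{line-based} format`, whereas ordinary parentheses such as `(in its surface guise)` pass through unchanged because no mt_ name precedes them.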
The previous example exports to the following HTML, whose rendition depends on the present page’s associated CSS:
How block-delimited lines become paragraphs, which are at the syntactic level of elements in Org parlance
The parsing rules dictate that, because Org is very much a line-based format (in its surface guise), paragraphs are created inside the special blocks as expected.
As for the lexical macros, they are expanded at the very beginning of the export process.
Here is the same Org fragment exported to LaTeX, but handled differently from the first example above: here the LaTeX is typeset to PDF, and the PDF is converted to SVG with the Poppler conversion tools. The Poppler tools are widely available today: they are part of Inkscape and can also be used through the command-line utility pdf2svg on Unix-like systems, which means that nowadays they can even be used on Windows if one installs the Windows Subsystem for Linux. The rendition here is a minimalistic example of what can be done with the pdfTeX engine; the body text is typeset in the rather exclusive Concrete Roman family, which is very suitable for the Web:
Finally, the following is one small step in my long quest to find the most satisfying notation for structuring text. Here, the delimiters are inspired by the substitution operator of equational logic, based on the famous Leibniz rule. Interestingly, this parenthesis-then-square-bracket arrangement is the reverse of Markdown’s link syntax, which might be a good thing or not depending on who you ask. This syntax is just an idea that I am floating; I certainly don’t want to use an actual equational logic system to write documents. Someday, the most beautiful and useful syntax will emerge from such essays.
(
(
How block-delimited lines become paragraphs,
which are at the syntactic level
of (elements)[ps_concept] in Org parlance
)
[ps_header]
The parsing rules dictate that, because Org
is very much a (line-based)[ps_emphasis] format
(in its surface guise), paragraphs are created
inside the special blocks as expected.
As for the lexical macros, they are expanded
at the very beginning of the export process.
)
[ps_section]
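As a rough illustration of how such a notation could be processed, the following Python sketch turns `(content)[ps_name]` into a nested `<name>content</name>` element and leaves ordinary parentheses alone. No grammar is specified for this notation, so everything here is an assumption: the rule that a tag may be separated from its closing parenthesis by whitespace, the direct mapping of ps_ names to element names, and the function names themselves.

```python
import re

# A tag is optional whitespace followed by [ps_name].
TAG = re.compile(r"\s*\[ps_(\w+)\]")


def find_close(text: str, start: int) -> int:
    """Return the index of the ')' that matches the '(' at `start`."""
    depth = 0
    for index in range(start, len(text)):
        if text[index] == "(":
            depth += 1
        elif text[index] == ")":
            depth -= 1
            if depth == 0:
                return index
    raise ValueError("unbalanced parenthesis")


def convert(text: str) -> str:
    """Rewrite (content)[ps_name] groups as <name>content</name>, recursively."""
    out = []
    position = 0
    while position < len(text):
        if text[position] != "(":
            out.append(text[position])
            position += 1
            continue
        close = find_close(text, position)
        inner = convert(text[position + 1:close])
        tag = TAG.match(text, close + 1)
        if tag:
            # Tagged group: the parentheses become an element.
            out.append(f"<{tag.group(1)}>{inner}</{tag.group(1)}>")
            position = tag.end()
        else:
            # Plain parenthetical: keep the parentheses as text.
            out.append(f"({inner})")
            position = close + 1
    return "".join(out)
```

With this sketch, `convert("of (elements)[ps_concept] in Org")` yields `of <concept>elements</concept> in Org`, while a plain aside such as `(in its surface guise)` is returned unchanged. Splitting the remaining text into paragraphs is left out.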