# The making of a LaTeX pre-processor with Haskell - Part II

16 February 2015

In the previous part, the program could parse text for emphasized and bold characters. So there’s some functionality, except that it doesn’t do anything useful like produce output. In this part, I’ll implement Markdown-style link parsing (this is really more of an implementation of Markdown than anything else at the moment), and then turn the parsed text into something useful!

To start off with, I removed support for bold emphasized text. This is largely because there’s no plan for how to deal with embedded styles, so I’d rather not start now. Link parsing is implemented in the `link` function, which is very similar to the other parsers, except that it produces two strings. It was surprisingly easy to put together in this case, and it illustrates the parsing process a bit more clearly. Here is the resulting code:
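A minimal sketch of a parser along these lines, assuming the Parsec library. Only `link` and `bodyText` are names from this post; the `Inline` type and the other helpers are hypothetical, so treat this as an illustration rather than the exact listing:

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical result type: one constructor per kind of inline markup.
data Inline
  = Plain String
  | Emph String
  | Bold String
  | Link String String  -- description, then target
  deriving (Eq, Show)

-- **bold**: everything up to the closing double asterisk.
bold :: Parser Inline
bold = Bold <$> (string "**" *> manyTill anyChar (try (string "**")))

-- *emphasis*: everything up to the closing asterisk.
emph :: Parser Inline
emph = Emph <$> (char '*' *> manyTill anyChar (char '*'))

-- (description)[target]: the only parser here that yields two strings.
link :: Parser Inline
link = do
  desc   <- between (char '(') (char ')') (many1 (noneOf ")"))
  target <- between (char '[') (char ']') (many1 (noneOf "]"))
  return (Link desc target)

-- A run of characters that cannot start any markup.
plain :: Parser Inline
plain = Plain <$> many1 (noneOf "*(")

-- Fallback: a stray '*' or '(' that did not begin valid markup.
stray :: Parser Inline
stray = Plain . (: []) <$> anyChar

-- Entry point: try each markup parser in turn, backtracking on failure.
bodyText :: Parser [Inline]
bodyText = many (choice [try bold, try emph, try link, plain, stray]) <* eof
```

The `try` wrappers matter here: without them, a failed `bold` attempt on `*emphasis*` would already have consumed the first asterisk and `emph` would never get a chance.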

So not a whole lot changed. Without too much effort, it’s reasonably easy to follow the parser starting from `bodyText`.

The next thing to do is make the parser return LaTeXified text. This turned out to be simpler than I thought, though it helps that all I am doing is returning strings. All that was required was to wrap the content in the return statements with LaTeX commands, and to alter the main function to simply concatenate the resulting parsed text:
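As a sketch of that change, again assuming Parsec: each parser now returns a plain `String` that is already wrapped in the matching LaTeX command, and the pieces are concatenated at the end. The helper name `toLaTeX` is hypothetical, and `\href` assumes the hyperref package will be loaded in the final document:

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- **bold** becomes \textbf{...}
bold :: Parser String
bold = do
  body <- string "**" *> manyTill anyChar (try (string "**"))
  return ("\\textbf{" ++ body ++ "}")

-- *emphasis* becomes \emph{...}
emph :: Parser String
emph = do
  body <- char '*' *> manyTill anyChar (char '*')
  return ("\\emph{" ++ body ++ "}")

-- (description)[target] becomes \href{target}{description}
link :: Parser String
link = do
  desc   <- between (char '(') (char ')') (many1 (noneOf ")"))
  target <- between (char '[') (char ']') (many1 (noneOf "]"))
  return ("\\href{" ++ target ++ "}{" ++ desc ++ "}")

-- Ordinary text passes through unchanged; a stray '*' or '(' is kept as-is.
plain :: Parser String
plain = many1 (noneOf "*(") <|> ((: []) <$> anyChar)

bodyText :: Parser [String]
bodyText = many (choice [try bold, try emph, try link, plain]) <* eof

-- Hypothetical helper: parse the source and concatenate the LaTeX fragments.
toLaTeX :: String -> Either ParseError String
toLaTeX src = concat <$> parse bodyText "input.htex" src
```

A `main` built on this would only need to read the input file and print the result, for example `readFile "input.htex" >>= either print putStr . toLaTeX`.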

Running it with `input.htex` containing

`This is not in italics. *But this is.* **This is bold.** This is not bold. (Description for a link)[link]. This is not in bold italics.`

results in

`This is not in italics. \emph{But this is.} \textbf{This is bold.} This is not bold. \href{link}{Description for a link}. This is not in bold italics.`


Easy! It’s not quite right, since you can’t just run LaTeX on the output, but that requires a bit more thinking to manage package handling and other preamble bits.

### Where to?

At this point, I could continue and create a fully fledged pre-processor, but would it satisfy a true need? As I said in the previous part, LaTeX syntax is somewhat verbose for a lot of uses. What I can see my pre-processor doing is letting users supply only content to create PDFs, with some customization via something like a YAML header (like Jekyll), allowing for a bit of a “hands off” experience. But Pandoc already allows for the creation of LaTeX files from Markdown, so what further use would a slightly different Markdown derivative be?

Templates are another possible extension, but this functionality is already available in Pandoc, and can be implemented fairly easily with Python using Jinja. One problem with templates is that with the separation of content and design, the content must be written without regard for where it is on the page. This reduces to having to edit a .tex file anyway.

This all applies to normal text, not math notation, since there’s no real alternative markup for math. That is a real shame, since LaTeX is a bit too verbose for what you need it for (especially if you’re writing any calculus). As for ordinary text, in my experience helping create my partner’s master’s thesis in LyX, all that’s needed is a good framework to handle bibliographies and references.

So with all of the above, I’m going to leave this project as is. On the plus side, what if you could write math in Haskell? Say, if you wanted to typeset

as LaTeX? This doesn’t look too difficult, and could produce the output:
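To give a feel for the idea, here is a minimal sketch of a Haskell expression type rendered to LaTeX. The `Expr` type, the `render` function, and the sample expression are all hypothetical stand-ins, not the example from this post:

```haskell
-- Hypothetical expression type for simple arithmetic.
data Expr
  = Var String
  | Num Int
  | Add Expr Expr
  | Mul Expr Expr
  | Div Expr Expr
  deriving (Eq, Show)

-- Render an expression as a LaTeX math string.
render :: Expr -> String
render (Var v)   = v
render (Num n)   = show n
render (Add a b) = render a ++ " + " ++ render b
render (Mul a b) = factor a ++ " \\cdot " ++ factor b
render (Div a b) = "\\frac{" ++ render a ++ "}{" ++ render b ++ "}"

-- Only sums are parenthesised inside a product; a fuller version
-- would also handle negative literals and more operators.
factor :: Expr -> String
factor e@(Add _ _) = "\\left(" ++ render e ++ "\\right)"
factor e           = render e

-- Sample expression: (a + b) / (2 * (c + d))
example :: Expr
example = Div (Add (Var "a") (Var "b"))
              (Mul (Num 2) (Add (Var "c") (Var "d")))
```

Here `render example` yields `\frac{a + b}{2 \cdot \left(c + d\right)}`, which can be dropped straight into a math environment.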

Looks fairly trivial to do. How about some calculus? How would you represent

in Haskell? This example is a bit complicated math-wise, since we don’t know what \theta is as a function, and \lambda may not be invertible, so rearranging is not so easy. In addition, both sides of the equation have functions applied to \theta. But luckily, we don’t care about the math, only the typesetting! Inside a LaTeX (or even HTML) file, one could have:

Where the type definitions are:

Then a parser could use type introspection (for variable names and operations) alongside evaluating `equationOne`, mapping names to LaTeX commands listed in a text file.
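One way this could look, as a rough sketch only: instead of real type introspection it uses an explicit syntax tree, and the "text file" is stood in for by a string with one `name command` pair per line. Apart from the name `equationOne`, every type and function below is hypothetical, and the equation itself is made up for illustration:

```haskell
import Data.List (intercalate)
import Data.Maybe (fromMaybe)

-- Hypothetical syntax tree for typesetting (no evaluation is attempted).
data Expr
  = Sym String           -- a named quantity, e.g. "x"
  | Call String [Expr]   -- a named function applied to arguments
  | Partial Expr String  -- partial derivative with respect to a variable
  deriving (Eq, Show)

data Equation = Equation Expr Expr
  deriving (Eq, Show)

-- Mapping from Haskell-side names to LaTeX commands.
type Symbols = [(String, String)]

-- Read the mapping from text; lines that are not exactly two words are skipped.
parseSymbols :: String -> Symbols
parseSymbols txt = [ (name, cmd) | [name, cmd] <- map words (lines txt) ]

-- Names without an entry are typeset as they are.
texName :: Symbols -> String -> String
texName syms name = fromMaybe name (lookup name syms)

renderExpr :: Symbols -> Expr -> String
renderExpr syms expr = case expr of
  Sym n       -> texName syms n
  Call f args -> texName syms f ++ "("
                   ++ intercalate ", " (map (renderExpr syms) args) ++ ")"
  Partial e v -> "\\frac{\\partial}{\\partial " ++ texName syms v ++ "} "
                   ++ renderExpr syms e

renderEquation :: Symbols -> Equation -> String
renderEquation syms (Equation l r) =
  renderExpr syms l ++ " = " ++ renderExpr syms r

-- theta is an unknown function, so only its dependence on x, y, z is recorded.
theta :: Expr
theta = Call "theta" [Sym "x", Sym "y", Sym "z"]

-- A made-up equation, not the one from this post.
equationOne :: Equation
equationOne = Equation (Partial (Call "lambda" [theta]) "t") (Call "f" [theta])
```

With a mapping such as `theta \theta` and `lambda \lambda`, `renderEquation` produces `\frac{\partial}{\partial t} \lambda(\theta(x, y, z)) = f(\theta(x, y, z))`.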

In this case, the Haskell version is a bit longer, and its readability compared to the LaTeX version is debatable. But one thing it does allow you to do is reuse functions quickly (without using those bloody backslashes everywhere!). For instance, if, say, you wanted to rewrite the above as a system of differential equations, all you would have to do is write
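A rough sketch of that kind of reuse, using hypothetical types and a made-up equation shape rather than the one above: the equation is written once as a function of its unknown, then mapped over several unknowns to build a system. The `align` environment assumes the amsmath package.

```haskell
import Data.List (intercalate)

-- Hypothetical syntax tree, kept deliberately small.
data Expr
  = Sym String
  | Call String [Expr]
  | Partial Expr String
  deriving (Eq, Show)

data Equation = Equation Expr Expr
  deriving (Eq, Show)

render :: Expr -> String
render (Sym n)       = n
render (Call f args) = f ++ "(" ++ intercalate ", " (map render args) ++ ")"
render (Partial e v) = "\\frac{\\partial " ++ render e ++ "}{\\partial " ++ v ++ "}"

-- The reusable part: one equation shape, parameterised by the unknown.
evolution :: Expr -> Equation
evolution u = Equation (Partial u "t") (Call "f" [u])

-- Build a system by applying the same shape to each unknown.
system :: [String] -> [Equation]
system = map (evolution . Sym)

-- Typeset the system as rows of an align environment.
renderSystem :: [Equation] -> String
renderSystem eqs =
  intercalate "\n" ["\\begin{align}", intercalate " \\\\\n" rows, "\\end{align}"]
  where
    rows = [ render l ++ " &= " ++ render r | Equation l r <- eqs ]
```

Calling `renderSystem (system ["u", "v", "w"])` gives three aligned rows without any of the LaTeX being typed by hand.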

One other way that may work is to take working mathematical functions and rely heavily on type introspection to get the metadata. But in the case of the above differential equation, theta is unknown and may not have a closed form. So you would need some method of saying theta is a function that explicitly depends on x, y, and z. Using this method opens up support for incorporating other languages, but I’m not sure there’d be a reliable way of implementing it considering the diverse range of languages used in the computing world. I’ll stick to the former method and see where it takes me.