Earmark—A Pure Elixir Markdown Processor
N.B.
This README contains the docstrings and doctests from the code by means of extractly
and the following code examples are therefore verified with ExUnit doctests.
Dependency
{ :earmark, "> x.y.z" }
Earmark
Abstract Syntax Tree and Rendering
The AST generation has now been moved out to EarmarkParser
which is installed as a dependency.
This brings some changes to this documentation and also deprecates the usage of Earmark.as_ast
Earmark takes care of rendering the AST to HTML, exposing some AST Transformation Tools and providing a CLI as escript.
Therefore you will no longer find a detailed description of the supported Markdown here; that is now covered by the EarmarkParser documentation.
Earmark.as_ast
WARNING: This is just a proxy towards EarmarkParser.as_ast and is deprecated; it will be removed in version 1.5!
Replace your calls to Earmark.as_ast with EarmarkParser.as_ast as soon as possible.
N.B. If all you use is Earmark.as_ast, consider using only EarmarkParser.
Please also refer to the documentation of EarmarkParser.
The function is described below and the other two API functions as_html and as_html! are now based upon
the structure of the result of as_ast.
{:ok, ast, []} = EarmarkParser.as_ast(markdown)
{:ok, ast, deprecation_messages} = EarmarkParser.as_ast(markdown)
{:error, ast, error_messages} = EarmarkParser.as_ast(markdown)
Earmark.as_html
{:ok, html_doc, []} = Earmark.as_html(markdown)
{:ok, html_doc, deprecation_messages} = Earmark.as_html(markdown)
{:error, html_doc, error_messages} = Earmark.as_html(markdown)
Earmark.as_html!
html_doc = Earmark.as_html!(markdown, options)
Formats the error_messages returned by as_html and adds the filename to each.
Then prints them to stderr and just returns the html_doc
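The return shapes listed above can be handled with a single case expression. The following is only a sketch: it assumes it is run as a script with Elixir 1.12 or later, and the version requirement passed to Mix.install is illustrative rather than taken from this README. The messages are simply inspected, since their exact structure is not described here.

```elixir
# Assumption: run as a script with Elixir >= 1.12; the version requirement is illustrative.
Mix.install([{:earmark, "~> 1.4"}])

markdown = "# Hello\nWorld"

case Earmark.as_html(markdown) do
  {:ok, html_doc, []} ->
    # Clean conversion, no messages.
    IO.puts(html_doc)

  {:ok, html_doc, deprecation_messages} ->
    # Conversion succeeded, but deprecated features were used.
    IO.inspect(deprecation_messages, label: "deprecations")
    IO.puts(html_doc)

  {:error, html_doc, error_messages} ->
    # HTML is still returned; the caller decides whether to use it.
    IO.inspect(error_messages, label: "errors")
    IO.puts(html_doc)
end
```

Earmark.as_html! is the shorter choice when printing the messages to stderr is acceptable and only the HTML is needed.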
Options
Options can be passed into as_html/2 or as_html!/2 according to the documentation.
Either a keyword list with legal options (cf. Earmark.Options) or an Earmark.Options struct is accepted.
{status, html_doc, errors} = Earmark.as_html(markdown, options)
html_doc = Earmark.as_html!(markdown, options)
{status, ast, errors} = EarmarkParser.as_ast(markdown, options)
Rendering
All options passed through to EarmarkParser.as_ast are defined therein; however, some options concern only
the rendering of the returned AST.
These are:
compact_output: defaults to false
Normally Earmark aims to produce Human Readable output.
This will give results like these:
iex(0)> markdown = "# Hello\nWorld"
...(0)> Earmark.as_html!(markdown, compact_output: false)
"<h1>\nHello</h1>\n<p>\nWorld</p>\n"
But sometimes whitespace is not desired:
iex(1)> markdown = "# Hello\nWorld"
...(1)> Earmark.as_html!(markdown, compact_output: true)
"<h1>Hello</h1><p>World</p>"
Be cautious, though, when using this option; lines will become loooooong.
escape: defaulting to true
If set HTML will be properly escaped
iex(2)> markdown = "Hello<br />World"
...(2)> Earmark.as_html!(markdown)
"<p>\nHello&lt;br /&gt;World</p>\n"
However, disabling escape: gives you maximum control of the created document, which in some
cases (e.g. inside tables) might even be necessary.
iex(3)> markdown = "Hello<br />World"
...(3)> Earmark.as_html!(markdown, escape: false)
"<p>\nHello<br />World</p>\n"
postprocessor: defaults to nil
Before rendering, the AST is transformed by a postprocessor.
For details see the description of Earmark.Transform.map_ast below, which accepts the same postprocessor.
As a matter of fact, specifying postprocessor: fun is conceptually the same as
markdown
|> EarmarkParser.as_ast
|> Earmark.Transform.map_ast(fun)
|> Earmark.Transform.transform
with all the necessary bookkeeping for options and messages.
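As a rough illustration of the postprocessor: option, the sketch below adds a class attribute to every paragraph. The function add_lead_class and the class name "lead" are hypothetical, the Mix.install version requirement is an assumption, and the function follows the map_ast return convention described in the Transformations section (a quadruple whose third element is nil for nodes, a string for strings).

```elixir
# Assumption: run as a script with Elixir >= 1.12; the version requirement is illustrative.
Mix.install([{:earmark, "~> 1.4"}])

# Hypothetical postprocessor: tags every paragraph node with a CSS class.
add_lead_class = fn
  # Paragraph nodes get an extra attribute; the third element is ignored, hence nil.
  {"p", attrs, _children, meta} -> {"p", [{"class", "lead"} | attrs], nil, meta}
  # All other nodes are passed through unchanged (again with nil as third element).
  {tag, attrs, _children, meta} -> {tag, attrs, nil, meta}
  # Text leaves must be returned as strings.
  text when is_binary(text) -> text
end

html_doc = Earmark.as_html!("Hello\n\nWorld", postprocessor: add_lead_class)
IO.puts(html_doc)
```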
renderer: defaults to Earmark.HtmlRenderer
The module used to render the final document.
smartypants: defaulting to true
If set, the following replacements will be made during rendering of inline text:
"---" → "—"
"--" → "–"
"' → "’"
?" → "”"
"..." → "…"
Command line
$ mix escript.build
$ ./earmark file.md
Some options defined in the Earmark.Options struct can be specified as command line switches.
Use
$ ./earmark --help
to find out more, but here is a short example
$ ./earmark --smartypants false --code-class-prefix "a- b-" file.md
will call
Earmark.as_html!( ..., %Earmark.Options{smartypants: false, code_class_prefix: "a- b-"})
Timeouts
By default, that is, if the timeout option is not set, Earmark uses parallel mapping as implemented in Earmark.pmap/2,
which uses Task.await with its default timeout of 5000ms.
In rare cases that might not be enough.
By indicating a longer timeout option in milliseconds Earmark will use parallel mapping as implemented in Earmark.pmap/3,
which will pass timeout to Task.await.
In both cases one can override the mapper function with either the mapper option (used if and only if timeout is nil) or the
mapper_with_timeout function (used otherwise).
For the escript only the timeout command line argument can be used.
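To make the role of the timeout concrete, here is a minimal sketch of parallel mapping built on Task.async and Task.await. PmapSketch is a hypothetical stand-in written for this illustration; it is not Earmark's implementation of pmap/2 or pmap/3.

```elixir
defmodule PmapSketch do
  # Hypothetical stand-in for a parallel map; not Earmark's actual code.
  # Every element is processed in its own task, then each result is awaited
  # for at most `timeout` milliseconds (5000 is Task.await's default).
  def pmap(collection, fun, timeout \\ 5000) do
    collection
    |> Enum.map(fn item -> Task.async(fn -> fun.(item) end) end)
    |> Enum.map(fn task -> Task.await(task, timeout) end)
  end
end

# Default timeout of 5000ms.
IO.inspect(PmapSketch.pmap([1, 2, 3], fn x -> x * 2 end))

# Explicit, longer timeout for slow work.
IO.inspect(PmapSketch.pmap([1, 2, 3], fn x -> x * 2 end, 10_000))
```

If a task does not reply in time, Task.await exits the calling process, which is why a long-running conversion may need a larger value. With Earmark itself this corresponds to passing the timeout option, for example timeout: 10_000, among the options.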
Security
Please be aware that Markdown is not a secure format. It produces
HTML from Markdown and HTML. It is your job to sanitize and/or
filter the output of Earmark.as_html if you cannot trust the input
and are to serve the produced HTML on the Web.
Transformations
Structure Conserving Transformers
For the convenience of processing the output of EarmarkParser.as_ast we expose two structure conserving
mappers.
map_ast
takes a function that will be called for each node of the AST, where a leaf node is either a quadruple
like {"code", [{"class", "inline"}], ["some code"], %{}} or a text leaf like "some code".
The result of the function call must be
for nodes → a quadruple whose third element is ignored (that might change in the future) and is therefore conventionally
nil. The other elements replace those of the node.
for strings → strings
A third parameter ignore_strings, which defaults to false, can be used to avoid invoking the mapper
function for text nodes.
As an example let us transform an ast to have symbol keys
iex(0)> input = [
...(0)> {"h1", [], ["Hello"], %{title: true}},
...(0)> {"ul", [], [{"li", [], ["alpha"], %{}}, {"li", [], ["beta"], %{}}], %{}}]
...(0)> map_ast(input, fn {t, a, _, m} -> {String.to_atom(t), a, nil, m} end, true)
[ {:h1, [], ["Hello"], %{title: true}},
{:ul, [], [{:li, [], ["alpha"], %{}}, {:li, [], ["beta"], %{}}], %{}} ]
N.B. If this returning convention is not respected, map_ast might not complain, but the resulting
transformation might no longer be suitable for Earmark.Transform.transform. It follows that
any function passed in as the value of the postprocessor: option must obey these conventions.
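The convention above (third element ignored, strings map to strings) may be easier to see in a small re-implementation. AstSketch below is a hypothetical module written for this illustration only; it mirrors the described behaviour but is not Earmark's code.

```elixir
defmodule AstSketch do
  # Hypothetical, simplified structure conserving mapper; not Earmark's code.
  def map_ast(ast, fun, ignore_strings \\ false)

  # A list of nodes is mapped element by element.
  def map_ast(nodes, fun, ignore_strings) when is_list(nodes) do
    Enum.map(nodes, &map_ast(&1, fun, ignore_strings))
  end

  # Text leaves are passed to `fun` unless ignore_strings is true.
  def map_ast(text, fun, ignore_strings) when is_binary(text) do
    if ignore_strings, do: text, else: fun.(text)
  end

  # For a quadruple, tag, attributes and meta come from `fun`;
  # the children always come from the original node, so the structure is conserved.
  def map_ast({_tag, _atts, children, _meta} = node, fun, ignore_strings) do
    {tag, atts, _ignored, meta} = fun.(node)
    {tag, atts, map_ast(children, fun, ignore_strings), meta}
  end
end

ast = [{"p", [], ["hello ", {"em", [], ["world"], %{}}], %{}}]

upcased =
  AstSketch.map_ast(ast, fn
    text when is_binary(text) -> String.upcase(text)
    {tag, atts, _, meta} -> {tag, atts, nil, meta}
  end)

IO.inspect(upcased)
# => [{"p", [], ["HELLO ", {"em", [], ["WORLD"], %{}}], %{}}]
```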
map_ast_with
this is like map_ast, but an accumulator can also be passed through, as in a reducer.
For that reason the function is called with two arguments, the first being the same value
as in map_ast and the second the accumulator. The return values need to be equally augmented
tuples.
A simple example annotates the traversal order in the meta map's :count key. As we are not
interested in text nodes, we use the fourth parameter ignore_strings, which defaults to false.
iex(0)> input = [
...(0)> {"ul", [], [{"li", [], ["one"], %{}}, {"li", [], ["two"], %{}}], %{}},
...(0)> {"p", [], ["hello"], %{}}]
...(0)> counter = fn {t, a, _, m}, c -> {{t, a, nil, Map.put(m, :count, c)}, c+1} end
...(0)> map_ast_with(input, 0, counter, true)
{[ {"ul", [], [{"li", [], ["one"], %{count: 1}}, {"li", [], ["two"], %{count: 2}}], %{count: 0}},
{"p", [], ["hello"], %{count: 3}}], 4}
Structure Modifying Transformers
For structure modifications a tree traversal is needed and no clear pattern of how to assist this task with tools has emerged yet.
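Until such tools exist, a hand-written recursive traversal is one way to go. The sketch below removes every node with a given tag; DropNodes is a hypothetical helper for this illustration and is not part of Earmark.

```elixir
defmodule DropNodes do
  # Hypothetical helper, not part of Earmark: removes all nodes tagged `tag`,
  # including their children, anywhere in the tree.
  def drop(ast, tag) when is_list(ast) do
    Enum.flat_map(ast, fn
      # Matching node: contribute nothing to the result.
      {^tag, _atts, _children, _meta} -> []
      # Other nodes are kept, with their children filtered recursively.
      {other, atts, children, meta} -> [{other, atts, drop(children, tag), meta}]
      # Text leaves are kept as they are.
      text when is_binary(text) -> [text]
    end)
  end
end

ast = [{"p", [], ["see ", {"img", [{"src", "a.png"}], [], %{}}, "here"], %{}}]

IO.inspect(DropNodes.drop(ast, "img"))
# => [{"p", [], ["see ", "here"], %{}}]
```

Using Enum.flat_map rather than Enum.map is what allows a node to be replaced by zero (or several) nodes, which is exactly what the structure conserving mappers cannot do.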
Contributing
Pull Requests are happily accepted.
Please be aware of one caveat when correcting/improving README.md.
The README.md is generated by Extractly as mentioned above, and therefore contributors should not modify it directly, but
README.md.eex and the imported docs instead.
Thank you to all who have already helped with Earmark; your names are duly noted in RELEASE.md.
Author
Copyright © 2014,5,6,7,8,9, 2020,1 Dave Thomas, The Pragmatic Programmers & Robert Dober @/+pragdave, dave@pragprog.com
LICENSE
Same as Elixir, which is Apache License v2.0. Please refer to LICENSE for details.
SPDX-License-Identifier: Apache-2.0