The CodeDown Manual

CodeDown is implemented in Haskell and codedown is the executable that can be called from the command line to perform any of these conversions. See the appendices below for instructions on how to install and use it.

1.2 Overview

CodeDown is deliberately not smart. Its central idea is extremely simple and only requires understanding two or three syntax rules for the core conversion. However, this simplicity builds on the inherent properties of the Markdown lightweight markup language, so you need to become familiar with Markdown. We therefore start with a short recap of its features.

CodeDown is designed, implemented, and presented here in two steps. First, Core CodeDown defines the back and forth conversions between the different types of code and Markdown. The simple rules for these conversions are introduced here, first for the concrete example of PHP code, and then for all other types of code in general. In practice, however, one usually wants to generate document formats other than Markdown, say HTML. To integrate this into a convenient codedown command, the universal document converter Pandoc² is merged into CodeDown. The section on Pandoc CodeDown explains the corresponding options.

2 Markdown

Markdown was originally designed as a way to ease the generation and comprehension of HTML source code. Meanwhile, however, several Markdown extensions and implementations (including Pandoc) promote Markdown as a default authoring format for documents in general.

2.1 Markdown syntax, part 1

2.2 A typical example

For example, suppose we want to publish an HTML file example.html with the following content:

Instead of writing out this "tag soup", we could just create a file, say example.markdown, containing this:

and then generate example.html from example.markdown with the original Perl executable

2.3 Markdown syntax, part 2

There are three more Markdown syntax rules that will be particularly important for the CodeDown conversions later on:

2.4 Markdown for program documentation

Markdown is an excellent format for documenting program source code! If you ever have to write a manual for a program or application, it is a very convenient format. It is very easy to read and write, and the syntax for inline code and code blocks just mentioned is particularly efficient and intuitive. The large number of Markdown converter implementations, including some online tools, makes it ubiquitously available. And they convert not only to HTML, but to any documentation format you could possibly wish for: groff man pages, PDF, RTF, LaTeX, DocBook XML, you name it. Besides, it is very readable even in its plain-text form.

By the way, this very document CodeDownManual.html was originally written in Markdown and then converted to HTML.⁵ The source text CodeDownManual.markdown should thus be a good example of the ease and beauty of the Markdown syntax (in the extended Pandoc version).

3 Core CodeDown

The main idea behind CodeDown is the use of Markdown as the language for the documentation of program code and also as the primary target format of the document generation.

The core of the CodeDown program consists of the functions that convert source code (in C, JavaScript, Scheme, or whatever) to Markdown text, and vice versa.

3.1 Document generation

First, let us explain the document generator, i.e. the function that converts code to Markdown.

Here, code can be virtually any type of source code, say PHP, JavaScript or C.

By the way, a list of all the currently implemented types of code is given by a call of

3.1.1 Calling the document generator

Later on, we will explain the syntax and semantics of the codedown executable. But in order to give an idea of how the conversion works in practice, let us take the following example.

Suppose we want to generate a Markdown document named example.markdown from a given PHP file example.php. We can do this by calling

where the order of the options is arbitrary and format specifications (here PHP and Markdown) are case-insensitive (i.e. we could have written php and markdown, or Php and MARKDOWN, instead).

3.1.2 The rules for the conversion to Markdown

Obviously, converting a file from PHP to Markdown changes the content and character of the file. Let us first introduce the conversion for the concrete example of PHP code.

As usual, a comment is a part of the source code that is ignored by code applications like interpreters or compilers.

By a variation of these comments, CodeDown distinguishes the following areas in the PHP source code:

All other code outside Markdown document lines or blocks and literal code blocks is simply ignored.

3.1.3 An example

3.1.3.1 Generating the example

Suppose the original PHP code of the previous example is the content of a file named HelloWorld.php.

We can generate the Markdown text as content of a file called HelloWorld.markdown by calling

A subsequent conversion of this Markdown file into the HTML file called HelloWorld.html is achieved by

3.1.4 Calling for help

3.2 Document generation for arbitrary code languages

We just described how CodeDown defines the Markdown document generator for the PHP programming language. But CodeDown is a Markdown document generator for any (mainstream) code language. In fact, the implementation is designed so that adding a new document generator for yet another type of code XYZ requires just a few lines of code.⁷ In this sense, CodeDown is a true "document generator generator".

In the following, we explain the document generation with CodeDown for arbitrary types of code.

3.2.1 Any-time help

Once the general principles of the document generation are understood, it should be possible to work with codedown without consulting this manual anymore. All specific information is then available from the following calls:

3.2.2 Line and block comments

All mainstream programming languages allow the insertion of comments into the source code. These are parts of the text that are ignored by the machine (i.e. the interpreter or compiler). The syntax for comments always works according to at least one of the following two principles:

Again, every modern programming language provides at least one of these two kinds of comments. Some, such as Scheme, bash scripts or Perl, only have line comments.⁸ Others, such as SML and SQL, only have block comments. And languages like C and Haskell have both.⁹
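The distinction between line-only, block-only and mixed languages can be sketched in a few lines of Python. This is only an illustration, not CodeDown's Haskell implementation: the names CommentSyntax, LANGUAGES and comment_type are hypothetical, the comment symbols are the usual ones of each language rather than values taken from CodeDown's own table, and the assignment of languages to the three groups simply follows the classification given in this manual.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CommentSyntax:
    """Comment symbols of one language (hypothetical helper type)."""
    line: Optional[str] = None                # symbol that starts a line comment
    block: Optional[Tuple[str, str]] = None   # (opening, closing) block symbols


# Assumption: standard comment symbols, grouped as in the manual's examples.
LANGUAGES = {
    "scheme": CommentSyntax(line=";"),
    "bash": CommentSyntax(line="#"),
    "perl": CommentSyntax(line="#"),
    "sml": CommentSyntax(block=("(*", "*)")),
    "sql": CommentSyntax(block=("/*", "*/")),
    "c": CommentSyntax(line="//", block=("/*", "*/")),
    "haskell": CommentSyntax(line="--", block=("{-", "-}")),
}


def comment_type(syntax: CommentSyntax) -> int:
    """Return 1 for line-only, 2 for block-only, 3 for both kinds of comments."""
    if syntax.line and syntax.block:
        return 3
    if syntax.line:
        return 1
    if syntax.block:
        return 2
    raise ValueError("a language needs at least one kind of comment")


if __name__ == "__main__":
    for name, syntax in LANGUAGES.items():
        print(name, "is of type", comment_type(syntax))
```

The numbering 1, 2, 3 mirrors the type 1, type 2 and type 3 terminology that the manual uses for the CoreCodeDown.hs module.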

3.2.3 Markdown document lines and blocks, and literal code blocks

The universal CodeDown document generator modifies the comments of a given code language so that each source contains certain designated parts:

All other source code outside Markdown document parts and literal code blocks is ignored during the document generation.

Note that the special CodeDown symbols (e.g. "// //", "/***" and "***/", "///BEGIN///" and "///END///" in PHP) always have to be at the beginning of a line.
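To make the idea concrete, here is a minimal Python sketch of such a document generator for the PHP symbols just mentioned. It is not the actual CodeDown implementation, and several details are assumptions made for illustration: that "// //" introduces a single Markdown document line, that "/***" and "***/" enclose a Markdown document block, that "///BEGIN///" and "///END///" enclose a literal code block, that one space after "// //" is dropped, and that a literal code block is emitted as a Markdown code block indented by four spaces and surrounded by blank lines. The function name php_to_markdown is hypothetical.

```python
# Special symbols for PHP as quoted in the manual; their exact roles are assumed.
LINE_MARK = "// //"
BLOCK_OPEN, BLOCK_CLOSE = "/***", "***/"
CODE_OPEN, CODE_CLOSE = "///BEGIN///", "///END///"


def php_to_markdown(source: str) -> str:
    """Collect Markdown lines, Markdown blocks and literal code blocks.

    All symbols are only recognized at the beginning of a line;
    every other source line is ignored.
    """
    out = []
    mode = "ignore"  # "ignore", "doc" (Markdown block) or "literal" (code block)
    for line in source.splitlines():
        if mode == "doc":
            if line.startswith(BLOCK_CLOSE):
                mode = "ignore"
            else:
                out.append(line)
        elif mode == "literal":
            if line.startswith(CODE_CLOSE):
                out.append("")              # blank line closes the code block
                mode = "ignore"
            else:
                out.append("    " + line)   # indented Markdown code block
        elif line.startswith(BLOCK_OPEN):
            mode = "doc"
        elif line.startswith(CODE_OPEN):
            out.append("")                  # blank line opens the code block
            mode = "literal"
        elif line.startswith(LINE_MARK):
            text = line[len(LINE_MARK):]
            out.append(text[1:] if text.startswith(" ") else text)
        # any other line is ordinary code and is ignored
    return "\n".join(out)


if __name__ == "__main__":
    example = "\n".join([
        "<?php",
        "// // # Hello World",
        "/***",
        "This script prints a greeting.",
        "***/",
        "///BEGIN///",
        'echo "Hello World!";',
        "///END///",
        "$ignored = 1;",
        "?>",
    ])
    print(php_to_markdown(example))
```

Because every symbol is tested with startswith, a symbol that appears in the middle of a line has no effect, which matches the rule that the symbols must stand at the beginning of a line.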

3.2.3.1 For example: document generation in C

An exhaustive overview of the CodeDown document generation rules for the C programming language is shown after a call of

For example, a C source file HelloWorld.c might contain the following code:

3.2.3.2 For example: document generation in Scheme

3.2.3.3 For example: document generation in C

3.2.4 Good style recommendations

3.3 Code generation or literal programming

4 Pandoc CodeDown

5 Appendix: Installation

6 Appendix: The codedown user manual

We explain the syntax and options of the codedown executable in some detail. A short summary can be obtained at any time by calling the help function without a value, i.e.

6.1 Formats

CodeDown is a universal converter between different text formats, and there are three types of formats:

The names of all these formats need to be specified in the source (--from or --read) and target (--to or --write) options of the codedown command. These name values are case-insensitive. For example, LaTeX, latex and LATEX are equally possible.
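The behaviour described here (synonymous option keys and case-insensitive format names) can be imitated with Python's standard argparse module. The following is only a sketch of that behaviour, not the way the Haskell executable parses its arguments; the function name build_parser and the attribute names source and target are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser sketch: --from/--read and --to/--write are synonyms."""
    parser = argparse.ArgumentParser(prog="codedown-sketch")
    # type=str.lower normalizes the value, so LaTeX, latex and LATEX coincide.
    parser.add_argument("-f", "--from", "--read", dest="source", type=str.lower)
    parser.add_argument("-t", "--to", "--write", dest="target", type=str.lower)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--to=LaTeX", "--from=Markdown"])
    print(args.source, args.target)
```

Normalizing the value once at parse time means that all later comparisons can use plain lowercase names.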

6.2 Syntax of the codedown call

6.2.1 Syntax for options

Each OPTION is a combination of a key and possible values. Each OPTION has a long form

and often also an equivalent short version, where the key K stands for just one letter

where each of these options has a short version and can thus be replaced by ¹²

6.2.2 The codedown options

7 Appendix: Haskell implementation

Haskell module	Haddock documentation	CodeDown documentation in HTML	CodeDown documentation in Markdown
		CodeDownManual.html	CodeDownManual.markdown
CodeDown.hs	CodeDown.html	CodeDown.hs.html	CodeDown.hs.markdown
CoreCodeDown.hs	CoreCodeDown.html	CoreCodeDown.hs.html	CoreCodeDown.hs.markdown
PandocCodeDown.hs	PandocCodeDown.html	PandocCodeDown.hs.html	PandocCodeDown.hs.markdown

8 Appendix: Table of code symbols

The following table lists all types of code languages currently implemented in CodeDown, together with the comment symbols.

9 Appendix: Links and resources

1. This is entirely thanks to John MacFarlane's Pandoc, which does all the hard work hidden behind the scenes.
2. Compared to the simplicity of CodeDown, Pandoc is a huge and very sophisticated program written by John MacFarlane, which does all the heavy conversion work between the different document formats.
3. The syntax of the codedown command is very similar to the pandoc command syntax. There is one big difference, however, namely the --input option, which does not exist for pandoc. There, the input files are added at the end of the call, as the example shows.
4. There is yet another version for code blocks in Markdown, but only in the extended Markdown version of Pandoc, namely delimited code blocks between tilde-lines, with an option to use syntax highlighting for many types of code. You can use that, too, but the official version of CodeDown does not mention this explicitly.
5. The conversion was done with the command codedown --from=markdown --to=html --input CodeDownManual.markdown --output=CodeDownManual.html --table-of-contents --standalone --css=CodeDown.css, which has the same effect as pandoc --from=markdown --to=html --output=CodeDownManual.html --table-of-contents --standalone --css=CodeDown.css CodeDownManual.markdown.
6. In fact, there are two versions of a line comment in PHP, namely the // and the # symbol. But CodeDown takes only one of the two. By taking // and neglecting #, PHP behaves the same way as the other languages from the C-like syntax family, like JavaScript, C and Java.
7. See the CoreCodeDown.hs.html documentation of the Haskell CoreCodeDown.hs module, which explains how a new programming language is added to the supported types of code. This simple customization of CodeDown is complete: it even includes the automatic generation of the help messages, i.e. a call of codedown --help=XYZ.
8. In this context, Perl, Python and Ruby are considered languages that only have line comments, because their block comments use a special markup for their own document converters.
9. In the implementation of the general document generators in the CoreCodeDown.hs module we say that a code language is of type 1 if it has a line comment, but no block comment. If it is the other way round, we call it a type 2 code language. If it has both line and block comments, it is of type 3. For example, Scheme and bash are type 1, SML and SQL are type 2, and C and (Common) Lisp are type 3.
10. PDF output is generated via LaTeX and is supported with the markdown2pdf wrapper, included in the Pandoc installation. By using codedown, all this is done automatically. For example, calling codedown -f markdown -t pdf -i example.markdown -o example.pdf should work just fine.
11. To be precise, the order of the options in a codedown call is not entirely arbitrary: it matters when the same option is specified several times. But this is never intended, and average users will avoid doing that anyway.
12. As is common for one-letter UNIX command options without values, these one-letter flags can be condensed into a single one. For example, in UNIX, a call of ls -A -l -r -R -S is equivalent to ls -AlrRS. This works in CodeDown and Pandoc, too, but the time and space needed to mention it is probably not worth the time saved by using these abbreviations.