The tersen dictionary format

If you have not yet read Terms, you may wish to do so before reading this section, as it will freely use the terms presented there.

You can find a full-featured example tersen dictionary based on Dutton Speedwords in the examples/ directory of the source distribution; note that it relies on the example annotations file to function correctly.

Basics

The abbreviation table is defined in a tersen dictionary, a text file with the following format.

Blank lines and lines beginning with the comment character # are ignored. Every other line creates one or more mappings. The order of entries in the dictionary does not matter unless you map the same source to more than one destination. In that case the mapping listed first in the file wins and the later ones are ignored. A warning is printed to let you know there might be a mistake, unless you explicitly suppress it.

The simplest kind of dictionary line looks like this:

source => destination

This creates a mapping that will replace source with destination in output.

You may want to map several sources to the same destination. You can do this by separating them with commas:

source1, source2 => destination
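
The rules above can be sketched in a few lines of Python. This is only an illustration of the format as described in this section, not tersen's actual parser: the function name parse_basic_dictionary is made up for the example, and tolerating leading whitespace before # is an assumption.

```python
import warnings


def parse_basic_dictionary(text):
    """Build a source -> destination table from simple dictionary lines.

    Hypothetical helper for illustration; it handles only comments, blank
    lines, and "source1, source2 => destination" lines (no annotations or flags).
    """
    table = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()  # assumption: surrounding whitespace is not significant
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored
        sources, arrow, destination = line.partition("=>")
        if not arrow:
            raise ValueError(f"line {lineno}: expected 'source => destination'")
        destination = destination.strip()
        for source in sources.split(","):
            source = source.strip()
            if source in table:
                # The mapping listed first wins; later ones only trigger a warning.
                warnings.warn(f"line {lineno}: {source!r} is already mapped")
                continue
            table[source] = destination
    return table
```

Feeding it the two-source line shown above yields one table entry per source, both pointing at the same destination.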

Annotated mappings

You can apply an annotation to any dictionary line by putting an at-sign, the name of the annotation, and optionally some arguments at the end of the line. Annotations programmatically post-process your mapping in some way, reducing the number of repetitive entries that you need to include in your dictionary file.

There are three different forms of annotations. The first is argumentless:

tree => bo @n

The second uses a single pair of square brackets to delimit the arguments, and individual arguments are separated by spaces:

easy => fas @adj[easier easiest]

Each argument can contain any character except whitespace and closing square brackets.

The third and last uses curly braces, with each argument in a separate pair of braces:

log in => lgn @v{logs in}{logged in}{logged in}{logging in}

Each argument can contain any character except newlines and closing curly braces.
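
As a rough sketch of how the three forms differ, the following Python function splits the annotation off the end of a line and returns its arguments. It is not taken from tersen's source: the name split_annotation, the assumption that annotation names consist of word characters, and the requirement of whitespace before the at-sign are all choices made for this example.

```python
import re

# Matches an annotation at the end of a line in any of the three forms.
ANNOTATION_RE = re.compile(
    r"""\s@(?P<name>\w+)                  # at-sign and annotation name
        (?:\[(?P<square>[^\]]*)\]         # second form: [arg1 arg2]
          |(?P<curly>(?:\{[^}\n]*\})+)    # third form: {arg one}{arg two}
        )?\s*$""",
    re.VERBOSE,
)


def split_annotation(line):
    """Return (mapping_text, annotation_name, args).

    annotation_name is None and args is empty when the line has no annotation.
    """
    match = ANNOTATION_RE.search(line)
    if match is None:
        return line.strip(), None, []
    if match.group("square") is not None:
        # Square brackets: arguments are separated by whitespace.
        args = match.group("square").split()
    elif match.group("curly") is not None:
        # Curly braces: one argument per pair, spaces allowed inside.
        args = re.findall(r"\{([^}\n]*)\}", match.group("curly"))
    else:
        args = []  # argumentless form
    return line[:match.start()].strip(), match.group("name"), args
```

The curly-brace form exists precisely because its arguments may contain spaces, which the square-bracket form would split apart.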

For instance, a complete annotated line might look like this:

you're => v_e @apos

If you annotate a line that has comma-separated sources, the annotation function will be applied to each of the sources in turn.
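
To make the per-source behavior concrete, here is a small Python sketch. Everything in it is hypothetical: the annotation signature (source, destination, args), the also_capitalized annotation, and the expand_line helper are invented for illustration and are not tersen's real annotation interface, which is described in the Annotations section.

```python
def also_capitalized(source, destination, args):
    """Hypothetical annotation: map the source and its capitalized form."""
    return [(source, destination), (source.capitalize(), destination)]


def expand_line(sources, destination, annotation, args=()):
    """Apply the annotation to each comma-separated source in turn."""
    mappings = []
    for source in sources:
        # Each source is processed independently, as if it had its own line.
        mappings.extend(annotation(source, destination, list(args)))
    return mappings


# A line like "source1, source2 => destination @some_annotation" would expand to:
mappings = expand_line(["source1", "source2"], "destination", also_capitalized)
```

Under these assumptions, mappings holds four pairs: two for source1 followed by two for source2.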

See the Annotations section for details on tersen’s built-in annotations and how you can write your own.

Flags

The following punctuation characters, called flags, have special effects when placed at the start of a line in the dictionary. Any number of flags can be used together, and repeating a flag has the same effect as using it once.

!

The cut flag causes tersen to stop parsing the dictionary immediately after this entry. This may be useful if you’re trying to debug a small portion of the dictionary. To reduce the risk of accidentally leaving a cut in the dictionary file, a warning is printed whenever a cut is present, indicating which line the cut is on.

?

The trace flag causes tersen to print its internal lookup-table structure for all entries generated by this dictionary line. This can be useful when debugging annotations. For multi-word tokens, the structure is printed starting from the first token (so if “Internet Protocol” has a ? by it, the entry for “Internet” will appear, containing “Internet Protocol” as a continuation member).

A warning is printed whenever a trace is present.

Note

The trace is printed for all flagged items only after the entire lookup table is built. Therefore, if the entry being traced is not actually inserted successfully (for instance, because it had the same source as an earlier entry), it won’t show up in a trace.

-

The suppress redefinition flag silences the warnings that would otherwise be displayed when mappings created by this line conflict with existing mappings. It only affects the display of the warning; the original mappings still win, as they would without the flag. This flag is useful when an annotation happens to generate an item with the same source as a previous mapping. For instance, the present tense of the verb lead and the singular form of the noun lead are identical, but you might still want to include both base forms in the dictionary and attach noun and verb annotations to them.
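
The flag behavior described in this section can be sketched as follows. This is a simplified model, not tersen's implementation: the names FLAG_NAMES, split_flags, and load are invented for the example, the trace output itself is omitted, annotations are ignored, and the destinations used in the usage example are arbitrary placeholders.

```python
FLAG_NAMES = {"!": "cut", "?": "trace", "-": "suppress_redefinition"}


def split_flags(line):
    """Return (set of flag names, rest of line); repeated flags collapse."""
    flags = set()
    index = 0
    while index < len(line) and line[index] in FLAG_NAMES:
        flags.add(FLAG_NAMES[line[index]])
        index += 1
    return flags, line[index:].strip()


def load(lines):
    """Build a table from flagged lines, collecting warnings as strings."""
    table, messages = {}, []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        flags, rest = split_flags(line)
        sources, _, destination = rest.partition("=>")
        for source in (s.strip() for s in sources.split(",")):
            if source in table:
                # The earlier mapping always wins; the flag only hides the warning.
                if "suppress_redefinition" not in flags:
                    messages.append(f"line {lineno}: {source!r} is already mapped")
                continue
            table[source] = destination.strip()
        if "trace" in flags:
            messages.append(f"line {lineno}: trace flag present")
        if "cut" in flags:
            messages.append(f"line {lineno}: cut flag present, stopping here")
            break  # nothing after this entry is parsed
    return table, messages


# Placeholder destinations; only the flag handling matters here.
table, messages = load(["lead => aaa", "-lead => bbb", "!?!tree => bo", "never => read"])
```

In this run the second line is dropped silently, the third line is stored and reported for both its trace and its cut, and the fourth line is never read.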