Important concepts

Time for our first proper contact with arara! I must stress that it is very important to understand a few concepts on which arara relies before we proceed to the usage itself. Do not worry, these concepts are easy to follow, yet they are vital to the comprehension of the application and the logic behind it.

Rules

A rule is a formal description of how arara handles a certain task. For instance, if we want to use pdflatex with our tool, we should have a rule for that. Directives are mapped to rules, so a call to a non-existent rule foo, for instance, will indeed raise an error:

  __ _ _ __ __ _ _ __ __ _
 / _` | '__/ _` | '__/ _` |
| (_| | | | (_| | | | (_| |
 \__,_|_|  \__,_|_|  \__,_|

Processing "doc1.tex" (size: 31 B, last modified: 12/28/2020
07:37:37), please wait.

  ERROR

I could not find a rule named "foo" in the provided rule paths.
Perhaps a misspelled word? I was looking for a file named
"foo.yaml" in the following paths in order of priority:
(/opt/islandoftex/arara/rules)

Total: 0.03 seconds
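The lookup described by this error message can be pictured with a small sketch. It is not arara's actual implementation (arara is not written in Python); the function name find_rule and its parameters are hypothetical, and the sketch assumes that a rule named foo lives in a file named foo.yaml, as the message states, ignoring the prefixed naming scheme of the core rules.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_rule(name: str, rule_paths: Iterable[Path]) -> Optional[Path]:
    """Return the first '<name>.yaml' found in rule_paths, or None.

    rule_paths is assumed to be ordered from highest to lowest priority.
    """
    for directory in rule_paths:
        candidate = Path(directory) / f"{name}.yaml"
        if candidate.is_file():
            return candidate  # earlier paths win over later ones
    return None  # the caller would report the "could not find a rule" error
```

With a single rule path such as the one shown in the message, find_rule("foo", [Path("/opt/islandoftex/arara/rules")]) returns None whenever that directory has no foo.yaml file, which corresponds to the error above.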

Once a rule is defined, arara automatically provides an access layer to that rule through directives in the source code, a concept to be formally introduced later on. Observe that a directive reflects a particular instance of a rule of the same name (i.e., a foo directive in a certain source code is an instance of the foo rule).

A note about rules
From version 6.0 on, rules included in the core distribution have been renamed to have a unique prefix in the texmf tree. File names should not be relied upon.

In short, a rule is a plain text file written in the YAML format, described in YAML. I opted for this format because back then it was cleaner and more intuitive to use than other markup languages such as XML, besides of course being a data serialization standard for programming languages.

Animal jokes
As a bonus, the acronym YAML rhymes with the word camel, so arara is heavily environmentally friendly. Speaking of camels, there is the programming reference as well, since this amusing animal is usually associated with Perl and friends.

The default rules, i.e., the rules shipped with arara, are placed inside a special subdirectory named rules/ inside another special directory named ARARA_HOME (the place where our tool is installed). We will learn later on that we can add an arbitrary number of paths for storing our own rules, in order of priority, so do not worry too much about the location of the default rules, although it is important to understand and acknowledge their existence. Observe, however, that rules in the core distribution have a different naming scheme than the ones located in the user space.

The following list describes the basic structure of an arara rule by presenting the proper elements (or keys, if we consider the proper YAML nomenclature). Observe that elements marked as [M] are mandatory (i.e., the rule has to have them in order to work). Similarly, elements marked as [O] are optional, so you can safely ignore them when writing a rule for our tool. A key preceded by context → indicates a context and should be properly defined inside it.

This is the rule structure in the YAML format used by arara. Keep in mind that all subtasks in a rule are checked against their corresponding exit status. If an abnormal execution is detected, the tool will instantly halt and the rule will fail. Even arara itself will return an exit code different than zero when this situation happens (detailed in Command line).
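The halting behavior just described can be sketched as follows. This is only an illustration under stated assumptions, not arara's code: the function run_subtasks is hypothetical, each subtask is modeled as a plain command line, and "abnormal execution" is taken to mean a nonzero exit status.

```python
import subprocess
from typing import List, Sequence, Tuple


def run_subtasks(subtasks: Sequence[List[str]]) -> Tuple[bool, int]:
    """Run the subtasks in order and stop at the first failure.

    Returns (success, number_of_subtasks_executed).
    """
    executed = 0
    for argv in subtasks:
        executed += 1
        status = subprocess.run(argv).returncode
        if status != 0:
            # Assumption: any nonzero status counts as abnormal execution,
            # so the remaining subtasks are skipped and the rule fails.
            return False, executed
    return True, executed
```

A caller could then turn a failed rule into a nonzero exit code of its own, for instance with sys.exit(1), mirroring what the text says about arara itself.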

Directives

A directive is a special comment inserted in the source file in which you indicate how arara should behave. You can insert as many directives as you want. By default, the tool reads and extracts directives from the beginning of the file. See Enabling header mode by default in the next section for more information.
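To make the idea of extracting directives from the beginning of a file more concrete, here is a minimal sketch. It is not how arara is implemented; the names DIRECTIVE and extract_directives are hypothetical, the comment character is assumed to be % (as in TeX files), and the header is assumed to end at the first line that is not a comment.

```python
import re
from typing import Iterable, List, Tuple

# Assumed pattern: a TeX comment whose text starts with "arara:".
DIRECTIVE = re.compile(r"^\s*%\s*arara:\s*(?P<body>.*\S)\s*$")


def extract_directives(lines: Iterable[str],
                       header_only: bool = True) -> List[Tuple[int, str]]:
    """Return (line number, directive text) pairs found in lines."""
    found = []
    for number, line in enumerate(lines, start=1):
        match = DIRECTIVE.match(line)
        if match:
            found.append((number, match.group("body")))
        elif header_only and not line.lstrip().startswith("%"):
            break  # simplification: the first non-comment line ends the header
    return found
```

With header_only set to False, the sketch scans the whole file instead, so a directive placed after the preamble would also be collected.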

There are two types of directives in arara which determine the way the corresponding rules will be instantiated. They are listed as follows. Note that directives are always preceded by the arara: pattern.

When handling parametrized directives, arara always checks if directive parameters and rule arguments match. If we try to inject a non-existent parameter in a parametrized directive, the tool will raise an error about it:

  __ _ _ __ __ _ _ __ __ _
 / _` | '__/ _` | '__/ _` |
| (_| | | | (_| | | | (_| |
 \__,_|_|  \__,_|_|  \__,_|

Processing "hello.tex" (size: 102 B, last modified: 12/28/2020
10:28:00), please wait.

  ERROR

I found these unknown keys in the directive: (foo). This should
be an easy fix, just remove them from your map.

Total: 0.21 seconds

As the message suggests, we need to remove the unknown parameter key from our directive or rewrite the rule in order to include it as an argument. The first option is, of course, easier.
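The check behind this message boils down to comparing two sets of names. The sketch below is a simplification under assumptions: unknown_keys is a hypothetical helper, rule arguments are modeled as a plain list of identifiers, and the special files parameter (described later in this section) is assumed to be accepted regardless of the rule.

```python
from typing import Any, Dict, Iterable, List


def unknown_keys(parameters: Dict[str, Any],
                 rule_arguments: Iterable[str],
                 special: Iterable[str] = ("files",)) -> List[str]:
    """Return the directive keys that the rule does not declare."""
    # Assumption: keys listed in `special` are always allowed.
    allowed = set(rule_arguments) | set(special)
    return sorted(key for key in parameters if key not in allowed)
```

For a rule declaring only shell and synctex, unknown_keys({"shell": "yes", "foo": "bar"}, ["shell", "synctex"]) reports foo, which is the situation shown in the error above.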

Sometimes, directives can span several columns of a line, particularly the ones with several parameters. We can split a directive into multiple lines by adding the arara: --> mark (also known as arrow notation during development) to each line which should compose the directive. We call it a multiline directive. Let us see an example:

% arara: pdflatex: {
% arara: --> shell: yes,
% arara: --> synctex: yes
% arara: --> }

It is important to observe that there is no need for them to be on contiguous lines, i.e., provided that the syntax for parametrized directives holds for the line composition, lines can be distributed all over the code. In fact, the log file (when enabled) will contain a list of all line numbers that compose a directive. This feature is discussed later on.
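The composition of a multiline directive can be sketched as below. This is an illustration only, not arara's implementation: join_multiline is a hypothetical function that receives (line number, text after arara:) pairs, and raising an error for a continuation line with nothing to extend is an assumption of the sketch.

```python
from typing import Dict, List, Tuple


def join_multiline(directives: List[Tuple[int, str]]) -> List[Dict[str, object]]:
    """Merge '-->' continuation lines into the directive they extend."""
    joined: List[Dict[str, object]] = []
    for number, body in directives:
        if body.startswith("-->"):
            if not joined:
                raise ValueError("continuation line without a directive to extend")
            # Append the continued text and remember where it came from.
            joined[-1]["text"] += " " + body[3:].strip()
            joined[-1]["lines"].append(number)
        else:
            joined.append({"text": body, "lines": [number]})
    return joined
```

Feeding it the four lines of the example above yields a single entry with the text pdflatex: { shell: yes, synctex: yes } and the line list [1, 2, 3, 4]; the line numbers do not need to be consecutive.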

Keep lines together
Although it is possible to spread lines of a multiline directive all over the code, it is considered good practice to keep them together for easier reading and editing. In any case, you can always see which lines compose a directive by inspecting the log file.

arara provides logical expressions, written in the MVEL language, and special operators processed at runtime in order to determine whether and how a directive should be processed. This feature is named directive conditional, or simply conditional as an abbreviation. The following list describes all conditional operators available in the directive context.

Several methods are available in the directive context in order to ease the writing of conditionals, such as ❖ missing, ❖ changed, ❖ found, ❖ unchanged, and ❖ exists featured in the previous examples. They will be properly detailed later on.

No infinite loops
Although there are no conceptual guarantees for proper halting of unbounded loops, we have provided a technical solution for potentially infinite iterations: arara has a predefined maximum number of loops. The default value is set to 10, but it can be overridden either in the configuration file or with a command line flag. We discuss this feature later on.
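The safeguard can be pictured as a loop with a hard upper bound. The sketch below is hypothetical (run_bounded and its parameters are not part of arara), and the exact way arara counts iterations may differ; only the default of 10 comes from the text.

```python
from typing import Callable

DEFAULT_MAX_LOOPS = 10  # default value stated in the text


def run_bounded(condition: Callable[[], bool],
                task: Callable[[], None],
                max_loops: int = DEFAULT_MAX_LOOPS) -> int:
    """Repeat task while condition() holds, but never more than max_loops times.

    Returns the number of times task was executed.
    """
    iterations = 0
    while iterations < max_loops and condition():
        task()
        iterations += 1
    return iterations
```

Even a condition that never becomes false, such as lambda: True, stops after max_loops executions, which is the whole point of the limit.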

All directives, regardless of their type, are internally mapped alongside the reference parameter, discussed earlier on, as a special variable in the rule context. When inspecting the log file, you will find all map keys and values for each extracted directive (actually, there is an entire log section devoted to detailing directives found in the code). See, for instance, the report of the directive extraction and normalization process performed by arara when inspecting doc2.tex, available in the log file. Note that timestamps were deliberately removed in order to declutter the output, and line breaks were included in order to easily spot the log entries.

% arara: pdflatex
% arara: pdflatex: { shell: yes }
\documentclass{article}

\begin{document}
Hello world.
\end{document}
Directive: { identifier: pdflatex, parameters:
{reference=/home/islandoftex/doc2.tex},
conditional: { NONE }, lines: [1] }

Directive: { identifier: pdflatex, parameters:
{shell=yes, reference=/home/islandoftex/doc2.tex},
conditional: { NONE }, lines: [2] }

The directive context also features another special parameter named files which expects a non-empty list of file names as plain string values. For each element of this list, arara will replicate the current directive and set the element being iterated as the current reference value (resolved to a proper absolute, canonical path of the file name). See, for instance, the report of the directive extraction and normalization process performed by arara when inspecting doc3.tex, available in the log file.

% arara: pdflatex: { files: [ doc1.tex, doc2.tex ] }
Hello world.
\bye
Directive: { identifier: pdflatex, parameters:
{reference=/home/islandoftex/doc1.tex},
conditional: { NONE }, lines: [1] }

Directive: { identifier: pdflatex, parameters:
{reference=/home/islandoftex/doc2.tex},
conditional: { NONE }, lines: [1] }

It is important to observe that, in this case, doc3.tex is a plain TeX file, but pdflatex is actually being called on two LaTeX documents, first doc1.tex and then, at last, doc2.tex.
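The replication performed for the files parameter can be sketched as follows. Again, this is a conceptual illustration and not arara's code: expand_directive is hypothetical, directives are modeled as plain dictionaries, and resolving relative names against an explicit base_dir is an assumption of the sketch.

```python
from pathlib import Path
from typing import Any, Dict, List


def expand_directive(identifier: str, parameters: Dict[str, Any],
                     current_file: Path, base_dir: Path) -> List[Dict[str, Any]]:
    """Replicate a directive once per entry of its 'files' parameter."""
    params = dict(parameters)  # do not modify the caller's mapping
    if "files" in params:
        names = params.pop("files")
        if not names:
            raise ValueError("'files' expects a non-empty list of file names")
        targets = [base_dir / name for name in names]
    else:
        targets = [current_file]  # no 'files': the directive refers to the file itself
    return [
        {"identifier": identifier,
         "parameters": {**params, "reference": str(target.resolve())}}
        for target in targets
    ]
```

Applied to the directive in doc3.tex, the sketch produces two pdflatex entries whose reference values point to doc1.tex and doc2.tex, in that order, matching the two log entries shown above.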

Even when a directive is interpreted with a file other than the one being processed by arara (through the magic of the files parameter), it is possible to use helper methods in the rule context to get access to the original file and reference. Such methods are detailed later on.

Orb tag expansion in parameter values
From version 6.0 on, arara is able to expand orb tags within a special options parameter in the directive context. For instance:

% arara: lualatex: {
% arara: --> options: [ '--output-directory=@{getSession().
% arara: -->                          get("arg:builddir")}'
% arara: -->          ]
% arara: --> }

This feature supports the following methods with their documented meanings, as seen in Methods: ❖ getBasename, ❖ getSession and ❖ getOriginalReference.

Keep in mind that this feature is disabled when arara is running in safe mode, as seen in Command line.

Important changes in version 7

A note to users
If this is your first time using arara or you do not have custom rules in the old format, you can safely ignore this section. All rules shipped with our tool are already written in the new format.

Enabling header mode by default
The header mode (parse only the first commented lines of a file) is now enabled by default. You may return to the old behavior by disabling header mode in the configuration file or by using the -w/--whole-file command line flag.

Using our own I/O API instead of Java's File objects
In previous versions, arara's rules relied on Java's File API. That was bad for several reasons. Most importantly, we switched to Java's Path API internally quite a while ago. Hence, what was used internally and what users accessed diverged.

With our general refactoring, there has been a change of strategies: we now avoid exposing any Java-specific API. The new API which you have access to when using the toFile("some file.txt") method exposes the following properties and methods:

  • The properties isAbsolute, fileName, fileSize, lastModified, parent, exists, isDirectory, and isRegularFile do what their names indicate.
  • The method startsWith(File) checks whether the string representation of one file is a prefix of the other's.
  • normalize() turns a path into an absolute path and normalizes it.
  • resolve(String | File) resolves a child.
  • resolveSibling(String | File) resolves a sibling.
  • readLines() reads the file's content into a List<String>.
  • readText() reads the file's content into a continuous String.
  • writeText(String, append? = false) writes the argument to the file; the optional argument allows appending instead of overwriting.

If you use the toFile method in your rules, you do not need to change anything. All the arara-internal methods like exists(File) have been adjusted to accept objects of the new format. In the end, you only need to change rules in which you have accessed Java's File API yourself.

Add projects
arara now supports projects. See Projects for further information on this new feature.

This section pretty much covered the basics of the changes to this version. Of course, it is highly advisable to make use of the new features available in arara 7.0 for achieving better results. If you need any help, please do not hesitate to contact us. See Introduction for more details on how to get help.

If you are upgrading, you may also be interested in reading our changelog or the announcement blog post of this release in the news section on our website.