From Domain-Specific Languages Made Easy by Meinte Boersma
This article shows you how to use modern JavaScript in a smart way to comfortably implement templates for text/code generation, instead of using a template engine.
The following article is a standalone excerpt from chapter 8 of my book “Domain-Specific Languages Made Easy” for Manning Publications. This book is going to be available in spring 2021, but is already partially available as part of the Manning Early Access Program (MEAP). You can find its product page at: https://www.manning.com/books/domain-specific-languages-made-easy and take 40% off by entering fccboersma into the discount code box at checkout.
This article contains ideas, excerpts, and material from chapter 8: “Generating code from the AST.”
The book focuses on implementing Domain-Specific Languages (DSLs) for business domains, using projectional editing and a JavaScript-based technology stack. With such DSLs, domain experts can capture their extensive domain knowledge as DSL content using a Domain IDE. That knowledge is often “made executable” by generating code from it. This generated code implements the domain-specific part of a software system. This approach is usually called model-driven software development, with the DSL content more commonly known as the model.
A template is a function that takes input data, and returns code as text – meaning: a string. In the context of DSLs and model-driven software development, the input data typically consists of an Abstract Syntax Tree (AST), and possibly some additional parametrizing arguments. That AST represents the model: the DSL content written using the Domain IDE.
For this article, consider the following AST as input data. It represents the specification of an entity in a data model.
Listing 1. An AST that represents the specification of an entity in a data model, constructed in-memory as a JavaScript value. It’s going to serve as input data to generate SQL code from using a template function.
const entityAst = {
    "name": "pet store",
    "attributes": [
        {
            "name": "number of employees",
            "type": "integer"
        },
        // ...
    ]
}
Let’s try to generate something resembling SQL from this. At times, you might have resorted to the following awkward coding style, or a variant of it, to generate code text:
Listing 2. A JavaScript template function to generate some SQL for the entity given as input data, using ordinary string concatenation.
const generateSqlFrom = (inputData) => {    ❶
    let sql = "";
    sql += "CREATE TABLE " + withFirstUpper(camelCase(inputData.name)) + "(\n";    ❷
    sql += " ID int not null,\n";    ❸
    for (const attribute of inputData.attributes) {
        sql += " " + camelCase(attribute.name) + " -- TODO -> SQL type,\n";    ❹
    }
    sql += " PRIMARY KEY (ID)\n";
    sql += ");\n";
    return sql;
};
❶ Use a const declaration in combination with ES2015’s arrow function expression syntax to define a function.
❷ Assume that the functions camelCase and withFirstUpper are pre-defined.
❸ Every line to generate requires a code line of the form sql += " <actual content>\n";.
❹ Mapping an attribute’s type to an SQL type is not hard, just some work which I leave as an exercise to the reader.
The expression generateSqlFrom(entityAst) calls this function with the AST above, and produces the following text value:
CREATE TABLE PetStore(
 ID int not null,
 numberOfEmployees -- TODO -> SQL type,    ❶
 PRIMARY KEY (ID)
);
❶ Mapping the type data in the AST to SQL types is still TODO.
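As a rough sketch of how that remaining TODO could be tackled, the type mapping can be written as a lookup table. Only the "integer" type appears in the AST above; the other type names, the chosen SQL types, and the name sqlTypeFor are assumptions made for illustration.

```javascript
// Hypothetical mapping from AST attribute types to SQL column types.
// Only "integer" occurs in the article's AST; the other entries are assumed.
const sqlTypes = new Map([
    [ "integer", "int" ],
    [ "string", "varchar(255)" ],
    [ "boolean", "boolean" ]
]);

// Looks up the SQL type for an attribute, and fails loudly on unknown types
// instead of silently generating broken SQL.
const sqlTypeFor = (attribute) => {
    const sqlType = sqlTypes.get(attribute.type);
    if (sqlType === undefined) {
        throw new Error(`no SQL type known for attribute type "${attribute.type}"`);
    }
    return sqlType;
};

console.log(sqlTypeFor({ "name": "number of employees", "type": "integer" }));
```

Throwing on an unknown type is a design choice: a missing mapping then shows up at generation time rather than when the generated SQL is executed.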
In recent times, various programming languages have seen improvements in their string generation facilities. In particular, these improvements come with support for multi-line strings, and for interpolation of embedded expressions. An embedded expression occurring in a string literal is evaluated, or interpolated, and the result replaces the text in the string literal that makes up the embedded expression’s code. JavaScript gained template literals per the ES2015 specification.
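To make both features concrete, here is a minimal sketch of a template literal that spans two lines and embeds two expressions. The variable names are only illustrative.

```javascript
// Illustrative values; any expressions can be embedded.
const tableName = "PetStore";
const columns = [ "ID", "numberOfEmployees" ];

// The line break inside the backticks becomes part of the string,
// and each ${...} is replaced by the value of the expression inside it.
const text = `table: ${tableName}
number of columns: ${columns.length}`;

console.log(text);
```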
Let’s rewrite Listing 2 using template literals:
Listing 3. A JavaScript template function that uses a template literal to achieve the same as Listing 2.
const generateSqlFrom = (inputData) =>    ❶
`CREATE TABLE ${withFirstUpper(camelCase(inputData.name))}(    ❷
 ID int not null,
${inputData.attributes.map((attribute) => ` ${camelCase(attribute.name)} -- TODO -> SQL type,`).join("\n")}    ❸
 PRIMARY KEY (ID)
);
`;    ❹
❶ Instead of a statement block, this function directly returns the result of interpolating a template literal.
❷ Template literals start and end with backticks, and an expression is embedded as `${<expr>}`.
❸ Loop over all values in the array property attributes on the input data. Map each attribute to the result of interpolating ` ${camelCase(attribute.name)} -- TODO -> SQL type,` – note that template literals can be nested inside embedded expressions. Join the resulting strings with newlines.
❹ Ensure that the returned string ends with a newline.
The template literal in Listing 3 seems to be slightly more readable overall than Listing 2, although looping over the attributes seems less readable. We can improve on this by making use of a couple of constructs JavaScript has to offer beyond template literals. This not only makes our template functions more readable, it also allows us to use other JavaScript features which we otherwise couldn’t.
This approach is based on what I like to call nested strings: any value that’s composed entirely of strings and nested arrays. The arrays are allowed to be nested arbitrarily deeply, and the strings can be single- or multi-line. As an example, this is allowed:
[
    "foo (as regular string literal)",
    [ [ "bar" ], [] ],
    `multi-
line string (as template literal)`
]
This also includes any regular string, without any nesting.
The idea behind a loosely-defined data structure like this is that it’s flexible, but also easy to produce and process. If we redefine a template function to have the nested string type as return type, we don’t have to make sure anymore that every template function returns one string. Basically, array creation takes over the roles of both string concatenation, and joining-with-newlines. We just have to turn the outermost nested string produced by the code generator into an actual string so we can write it to a destination.
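The definition of a nested string is recursive, which a small checking function can make explicit. This is only a sketch: the name isNestedString is made up here, and nothing in the approach requires such a check at runtime.

```javascript
// A value is a nested string if it is a string, or an array
// whose members are all nested strings themselves.
const isNestedString = (value) =>
    typeof value === "string"
    || (Array.isArray(value) && value.every((member) => isNestedString(member)));

console.log(isNestedString([ "foo", [ [ "bar" ], [] ], "baz" ]));    // a valid nested string
console.log(isNestedString([ "foo", 42 ]));                          // not valid: contains a number
```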
Let’s rewrite Listing 3 to produce a nested string, instead of one string:
Listing 4. A JavaScript template function that uses a nested string.
const generateSqlFrom = (inputData) => [    ❶
    `CREATE TABLE ${withFirstUpper(camelCase(inputData.name))}(`,    ❷
    ` ID int not null,`,
    inputData.attributes.map((attribute) => ` ${camelCase(attribute.name)} -- TODO -> SQL type,`),    ❸
    ` PRIMARY KEY (ID)`,
    `);`
];
❶ Return an array instead of a string.
❷ Use one array element per line of code.
❸ Map the entity’s attributes to an array, which is itself an element of the outer array.
This already looks a bit cleaner, especially because we got rid of some syntactic noise when looping over the attributes. When calling this function with the AST, we get back the following structured data:
[
    'CREATE TABLE PetStore(',
    ' ID int not null,',
    [ ' numberOfEmployees -- ...,' ],
    ' PRIMARY KEY (ID)',
    ');'
]
We need to turn this data into a string again, taking care of newlines. Let’s define the following function for that:
Listing 5. The function asString that turns a nested string into a regular string, taking proper care of newlines.
const withNewlineEnsured = (str) => str + (str.endsWith("\n") ? "" : "\n");    ❶

const asString = (nestedString) =>
    Array.isArray(nestedString)    ❷
        ? nestedString.flat(Infinity)    ❸
            .map(withNewlineEnsured)    ❹
            .join("")    ❺
        : withNewlineEnsured(nestedString);    ❻
❶ Define a helper function withNewlineEnsured that adds a newline to a given string when it doesn’t already end in one.
❷ Check whether the given nestedString value is a single string or an array of nested strings.
❸ If nestedString is an array, first flatten it using <array>.flat(Infinity) – more on that directly below.
❹ Ensure that each of those strings ends in a newline, adding one where necessary, by mapping withNewlineEnsured over the flat(tened) array.
❺ Concatenate the strings in the array (without newlines or commas) using .join("").
❻ If nestedString is a single string, just ensure it ends in a newline.
The JavaScript built-in array method flat (Array.prototype.flat) “flattens” an array by moving members of sub arrays up. In other words: it removes nesting. As an example: [ 1, [ 2 ], [] ].flat() evaluates to [ 1, 2 ]. (Note that empty sub arrays have no members, so these basically disappear.)
The flat method takes an optional argument: the depth to which an array should be flattened. This argument has a default value of 1. As examples: [ [ 1, [ 2 ] ] ].flat() evaluates to [ 1, [ 2 ] ], while [ [ 1, [ 2 ] ] ].flat(2) evaluates to [ 1, 2 ]. Passing Infinity for the argument ensures that an array is flattened completely, with no nested arrays remaining. This is exactly what we need to work with nested strings, which can be nested to arbitrary depth.
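To see what complete flattening amounts to, here is a sketch of a recursive function that should give the same result as flat(Infinity) for nested strings. The name flattenCompletely is made up for this example. Such a function could serve as a fallback on JavaScript runtimes that predate ES2019, which is the edition that introduced flat.

```javascript
// Recursively collects all strings in a nested string into one flat array.
const flattenCompletely = (value) =>
    Array.isArray(value)
        ? value.reduce((flattened, member) => flattened.concat(flattenCompletely(member)), [])
        : [ value ];

const nested = [ "foo", [ [ "bar" ], [] ], "baz" ];

const viaRecursion = flattenCompletely(nested);
const viaBuiltIn = nested.flat(Infinity);

// Both arrays contain "foo", "bar", and "baz", with no nesting left.
console.log(viaRecursion, viaBuiltIn);
```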
Now we can change the generateSqlFrom function as follows:
Listing 6. A JavaScript template function that uses a nested string to achieve the same as Listing 2.
const generateSqlFrom = (inputData) => asString([
    // ...rest is the same...
]);
By adding a little machinery to work with nested strings, writing template functions became much more comfortable than in Listing 2. As long as we make sure to call asString on a nested string before we hand the result off, it doesn’t matter what shape of nested string we actually produce. As a result, it’s now also much easier to compose template functions.
To demonstrate that, let’s make a separate function for generating SQL columns from attributes:
const generateAttributeSqlFrom = (attribute) =>
    ` ${camelCase(attribute.name)} -- TODO -> SQL type,`    ❶
❶ The template literal is lifted directly from the argument to inputData.attributes.map on the third annotated line in Listing 4.
We can now change the third annotated line in Listing 4 to:
inputData.attributes.map(generateAttributeSqlFrom),
That’s already better, in the sense that the code becomes more readable, and is more modularized. Unfortunately, the generateAttributeSqlFrom function still has to manage indentation itself. Ideally, this function just has to know how to map a specification of an attribute to a fragment of SQL. That we’re going to use the fragments generated from attributes only in places where an indentation level of 1 makes sense, should not be this function’s concern. In other words: we should manage indentation at the call site of generateAttributeSqlFrom, instead of in that function itself.
One option is to go back to
inputData.attributes.map((attribute) => ` ${generateAttributeSqlFrom(attribute)}`),
but that seems like a big step backwards. A better way is to introduce a function indent which indents a nested string one level:
Listing 7. The function indent that indents a nested string passed as its argument one level.
const indentLine = (str) => ` ${str}`;

const indent = (nestedString) =>
    Array.isArray(nestedString)
        ? nestedString.flat(Infinity).map(indentLine)
        : indentLine(nestedString);
This function follows the same pattern as the asString function in Listing 5. Using this function we can modify the code as follows:
const generateAttributeSqlFrom = (attribute) =>
    `${camelCase(attribute.name)} -- TODO -> SQL type,`    ❶
...
    indent(inputData.attributes.map(generateAttributeSqlFrom)),    ❷
❶ Remove the explicit indentation.
❷ Indent the lines of SQL generated from the entity’s attributes.
We could improve even further by also introducing a function separated(<separator>) that helps with joining an array of nested strings together with the specified <separator>. I’ll leave that as an exercise to the reader.
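One possible take on that exercise is sketched below. It assumes a curried shape, separated(separator)(items), and it assumes that the separator should be appended to the last line of every item except the final one. Both are assumptions of this sketch rather than the book’s solution, and the helper linesOf is made up for it.

```javascript
// Turns any nested string into a flat array of lines.
const linesOf = (nestedString) =>
    Array.isArray(nestedString) ? nestedString.flat(Infinity) : [ nestedString ];

// separated(separator) returns a function that appends the separator to the
// last line of every item, except for the final item.
const separated = (separator) => (items) =>
    items.map((item, index) => {
        const lines = linesOf(item);
        const isLastItem = index === items.length - 1;
        return (isLastItem || lines.length === 0)
            ? lines
            : [ ...lines.slice(0, -1), lines[lines.length - 1] + separator ];
    });

// Illustrative column definitions: only the final one gets no trailing comma.
const columnLines = separated(",")([
    "ID int not null",
    "numberOfEmployees int",
    "PRIMARY KEY (ID)"
]).flat(Infinity);

console.log(columnLines);
```

Because the result is again a nested string, it can be passed on to functions such as indent and asString without further conversion.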
The final version of the template functions looks as follows:
Listing 8. The final version of the JavaScript template function.
const generateAttributeSqlFromNonIndented = (attribute) =>
    `${camelCase(attribute.name)} -- TODO -> SQL type,`;

const generateSqlFrom = (inputData) => asString([
    `CREATE TABLE ${withFirstUpper(camelCase(inputData.name))}(`,
    ` ID int not null,`,
    indent(inputData.attributes.map(generateAttributeSqlFromNonIndented)),
    ` PRIMARY KEY (ID)`,
    `);`
]);
This is syntactically almost as succinct as a separate template file for a template engine like Mustache, Handlebars, StringTemplate, etc. would be. The advantages of this approach over using such a template engine are:
- You don’t need to learn the template syntax of template files for the template engine.
- You don’t need to introduce an extra dependency.
- You can make more use of your existing skill in exploiting a programming language you’re already using. In particular: you can lean on its built-in modularization and abstraction mechanisms, such as defining functions.
That circumvents a common “feature” of template engines. Many of these explicitly force you to separate the template file from logic beyond simple if(-not)s and loops. This forces you to pre-compute all values derived from the input data, and add them to the input data somehow. That’s not only tedious, it can also hit performance, because every value that could be necessary in the template must be pre-computed.
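By contrast, a template function written in plain JavaScript can compute derived values right where it needs them. The following sketch illustrates that with a filter and a conditional inside the template function. The required property and the function name generateRequiredReportFrom are hypothetical and not part of the article’s AST.

```javascript
// NOTE: the "required" property is hypothetical; the article's AST does not have it.
const exampleEntity = {
    "name": "pet store",
    "attributes": [
        { "name": "number of employees", "type": "integer", "required": true },
        { "name": "motto", "type": "string" }
    ]
};

// The derived value (the required attributes) is computed inside the template
// function, using ordinary JavaScript, instead of being pre-computed elsewhere.
const generateRequiredReportFrom = (entity) => {
    const required = entity.attributes.filter((attribute) => attribute.required);
    return [
        `-- required attributes of "${entity.name}":`,
        required.length === 0
            ? `--   (none)`
            : required.map((attribute) => `--   ${attribute.name}`)
    ];
};

const reportLines = generateRequiredReportFrom(exampleEntity).flat(Infinity);
console.log(reportLines.join("\n"));
```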
Template engines often have specific features to help with getting indentation right without needlessly cluttering the template, or to inject separator strings when looping over arrays. It turns out that helper functions such as indent achieve the same, without adding too much syntactic noise.
For completeness, here is the code for defining the camelCase and withFirstUpper helper functions:
Listing 9. The helper functions camelCase and withFirstUpper.
const camelCase = (str) =>
    str
        .toLowerCase()
        .replace(/\s+([a-z])/g, (_, ch) => ch.toUpperCase())
        .replace(/\s+/g, "");    // remove any remaining whitespace, not just the first space

const withFirstUpper = (str) => str.charAt(0).toUpperCase() + str.substring(1);
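For readers who want to run everything at once, the pieces from the listings in this article can be assembled into a single script along the following lines. This is a sketch of one way to put them together: it removes all remaining whitespace in camelCase with a global regular expression, and it uses the one-space indentation shown in the listings.

```javascript
const camelCase = (str) =>
    str
        .toLowerCase()
        .replace(/\s+([a-z])/g, (_, ch) => ch.toUpperCase())
        .replace(/\s+/g, "");

const withFirstUpper = (str) => str.charAt(0).toUpperCase() + str.substring(1);

// Helper functions for nested strings (Listings 5 and 7).
const withNewlineEnsured = (str) => str + (str.endsWith("\n") ? "" : "\n");

const asString = (nestedString) =>
    Array.isArray(nestedString)
        ? nestedString.flat(Infinity).map(withNewlineEnsured).join("")
        : withNewlineEnsured(nestedString);

const indentLine = (str) => ` ${str}`;

const indent = (nestedString) =>
    Array.isArray(nestedString)
        ? nestedString.flat(Infinity).map(indentLine)
        : indentLine(nestedString);

// Template functions (Listing 8).
const generateAttributeSqlFromNonIndented = (attribute) =>
    `${camelCase(attribute.name)} -- TODO -> SQL type,`;

const generateSqlFrom = (inputData) => asString([
    `CREATE TABLE ${withFirstUpper(camelCase(inputData.name))}(`,
    ` ID int not null,`,
    indent(inputData.attributes.map(generateAttributeSqlFromNonIndented)),
    ` PRIMARY KEY (ID)`,
    `);`
]);

// Input data (Listing 1, without the elided attributes).
const entityAst = {
    "name": "pet store",
    "attributes": [
        { "name": "number of employees", "type": "integer" }
    ]
};

const generatedSql = generateSqlFrom(entityAst);
console.log(generatedSql);
```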
In summary:
- We introduced the notion of a nested string: any value that’s composed entirely of strings and nested arrays. You can think of this type as an extension of the regular string that’s easy to process.
- We wrote two helper functions taking a nested string as argument:
  - asString (Listing 5) turns any nested string back into a string, taking care of added newlines.
  - indent (Listing 7) indents any nested string one level.
- Using nested strings and their associated helper functions has the following advantages:
  - Writing template functions becomes less tedious, error-prone, and “noisy”.
  - Composing template functions becomes much easier, especially when you need to get indentation right.
As a result, this approach offers an alternative to using a template engine, without the disadvantages of those.
That’s all for this article. If you want to learn more about the book, you can check it out on our liveBook platform here.