Does this sound familiar to you?
Oh, I need to rename this misspelled property within our domain model. Ok, so let's start up this big UML monster... and by the way, let's get a new cup of coffee. Cool, it has started up already. Grabbing the mouse, clicking through the different diagrams and graph visualizations... Ahhh, there is the name of the property, right down there in the properties view. Let's change it, export it to XMI (...drinking coffee), and start the oAW generator (in a jiffy ;-)). Oh, the property is not allowed to be named that way: a constraint says that property names should start with a lower-case letter. Ok, let's change that and export again...
Some moments later, everything seems to work (tests are green). Ok, let's check it in!
Oh, someone else has also modified the model! Aaarrrgggh....
Think of this:
Want to change a property's name? Ok, open the respective text file, rename the property and save it. The editor complains about a violated constraint. Ok, fix the issue, save again and generate. Check the changes into SVN (CVS). Oh, there is a conflict. Ok, let's simply merge it using Diff.
And now? Let's have a cup of coffee :-)
Xtext is a textual DSL development framework that lets you describe your DSL using a simple EBNF notation. From this description Xtext will create a parser, a metamodel and a specific Eclipse text editor for you!
Xtext is contained in the openArchitectureWare SDK. The easiest way to install it is to download and unzip the "Eclipse 3.3 for RCP/Plug-in Developers" package from Eclipse's download site (http://www.eclipse.org/downloads/).
Afterwards, download the org.openarchitectureware.all_in_one_feature-4.2.0.*.zip release and extract it to the directory where you have unzipped the Eclipse release (i.e. the Eclipse installation dir).
Make sure that you start Eclipse with a Java VM Version greater than 5.0.
You should have a look at the Xtext tutorial screencast (TODO link to screencast). This will give you a good overview of how Xtext basically works. Come back to this document to find out about additional details.
At the heart of Xtext lies its grammar language. It is a lot like an extended Backus-Naur Form, but it describes not only the concrete syntax but also the abstract syntax (metamodel).
A grammar file consists of a list of so called Rules.
This is an example of a Rule describing something called an entity:
Entity :
"entity" name=ID "{"
(features+=Feature)+
"}";
Entity is both the name of the rule and the name of the metatype corresponding to this rule. The description of the rule follows the colon. A description is made up of tokens. The first token is a KeywordToken, which says that a description of an entity starts with the keyword entity. A so-called Assignment follows (name=ID).
The left-hand side refers to a property of the metatype (in this case the property name of type Entity). The right-hand side is a call to the built-in token ID, which stands for identifier and allows character sequences of the form ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*. The parser will assign ('=') the identifier to the specified property (name).
Then, enclosed in curly brackets ("{" and "}" are both essentially keyword tokens), one or more features can be declared ((features+=Feature)+). This is again an assignment. This time the token points to another rule (called Feature), and each feature is added (note the += operator) to the Entity's reference called features.
The Feature rule itself could be described like this:
Feature : type=ID name=ID ";";
so that the following description of an entity would be valid according to the grammar:
entity Customer {
String name;
String street;
Integer age;
Boolean isPremiumCustomer;
}
Note that the types (String, Integer, Boolean) used in this description of a customer are simple identifiers; they have not been mapped to Java types or anything else. So, according to the grammar, this would also be valid so far:
entity X {
X X;
X X;
X X;
cjbdlfjerifuerfijerf dkjdhferifheirhf;
}
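To make the reading of the two rules concrete, here is a minimal hand-written sketch in Python of what a parser for them might do. It is not the ANTLR parser Xtext generates; the names parse_entity, Entity and Feature are hypothetical, and the tokenizer simply skips anything that is not an identifier or one of the symbols { } ;.

```python
import re
from dataclasses import dataclass, field

ID = r"[A-Za-z_][A-Za-z_0-9]*"
# Tokens: identifiers (the keyword "entity" included) and the symbols { } ;
# Whitespace and any other characters are silently skipped in this sketch.
TOKEN = re.compile(ID + r"|[{};]")


@dataclass
class Feature:
    type: str
    name: str


@dataclass
class Entity:
    name: str
    features: list = field(default_factory=list)


def parse_entity(text):
    """Entity : "entity" name=ID "{" (features+=Feature)+ "}" """
    tokens = TOKEN.findall(text)
    pos = 0

    def keyword(expected):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != expected:
            raise SyntaxError(f"expected {expected!r} at token {pos}")
        pos += 1

    def identifier():
        nonlocal pos
        if pos >= len(tokens) or not re.fullmatch(ID, tokens[pos]):
            raise SyntaxError(f"expected an ID at token {pos}")
        pos += 1
        return tokens[pos - 1]

    def feature():
        # Feature : type=ID name=ID ";"
        feature_type = identifier()
        feature_name = identifier()
        keyword(";")
        return Feature(feature_type, feature_name)

    keyword("entity")
    entity = Entity(name=identifier())   # name=ID
    keyword("{")
    entity.features.append(feature())    # the "+" demands at least one feature
    while pos < len(tokens) and tokens[pos] != "}":
        entity.features.append(feature())  # features+=Feature
    keyword("}")
    return entity
```

Both example models above are accepted by this sketch, because the type of a feature is just an identifier; an entity with an empty body is rejected.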
As stated before, the grammar is not only used as input for the parser generator, it is also used to compute a meta model for your DSL. We will first talk about how an Xtext parser works in general, before we look at how a meta model is being constructed.
The parsing of text is divided into two separate tasks: the lexing and the parsing.
The lexer is responsible for creating a sequence of tokens from a character stream. Such tokens are identifiers, keywords, whitespace, comments, operators, etc. Xtext comes with a set of built-in lexer rules which can be extended or overwritten if necessary. You have already seen some of them (e.g. ID).
The parser gets the stream of tokens and creates a parse tree out of them. The type rules from the example are essentially parser rules.
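The lexing step can be illustrated with a small Python sketch. This is an illustration only, not Xtext's generated ANTLR lexer: the token names mirror the built-in rule names mentioned in this document, while lex, TOKEN_SPEC and the SYMBOL class are assumptions made for the example.

```python
import re

# One regular expression per token class, tried in this order.
TOKEN_SPEC = [
    ("ML_COMMENT", r"/\*.*?\*/"),
    ("SL_COMMENT", r"//[^\n\r]*"),
    ("WS", r"\s+"),
    ("ID", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("SYMBOL", r"[{};]"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)
HIDDEN = {"ML_COMMENT", "SL_COMMENT", "WS"}   # never handed to the parser
KEYWORDS = {"entity", "{", "}", ";"}          # static words from the grammar


def lex(text):
    """Turn a character stream into a list of (kind, value) tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        pos = match.end()
        kind, value = match.lastgroup, match.group()
        if kind in HIDDEN:
            continue
        if value in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, value))
    return tokens
```

A parser would then consume the resulting list token by token; whitespace and comments never reach it.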
Now let's have a look at how the meta model is constructed.
We've already seen how the Type Rules work in general. The rule's name is used as the name of the meta type generated by Xtext.
Each assignment token within an Xtext grammar is not only used to create a corresponding assignment action in the parser but also to compute the properties of the current meta type.
Properties can refer to simple types such as String, Boolean or Integer as well as to other complex meta types (i.e. other rules). The actual type depends on the assignment operator and on the type of the token on the right.
There are three different assignment operators:
Standard assignment '=' : The type will be computed from the token on the right.
Boolean assignment '?=' : The type will be Boolean
Add assignment '+=' : The type will be List. The list's inner type depends on the type returned by the token on the right.
Example:
Entity :
(isAbstract?="abstract")? "entity" name=ID "{"
(features+=Feature)*
"}";
The meta type Entity will have three properties:
1) Boolean isAbstract
2) String name
3) List[Feature] features
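The three operator rules can be written down as a tiny Python function. This is only a restatement of the rules above, not Xtext's implementation; property_type is a hypothetical name, and the sketch assumes that a keyword token, like ID, yields a String.

```python
def property_type(operator, token_type):
    """Derive a property's type from the assignment operator and the token type."""
    if operator == "=":
        return token_type            # standard assignment: type of the token
    if operator == "?=":
        return "Boolean"             # boolean assignment: was the token present?
    if operator == "+=":
        return f"List[{token_type}]"  # add assignment: list of the token's type
    raise ValueError(f"unknown assignment operator {operator!r}")


# The assignments of the Entity rule: (property, operator, type of the token).
entity_assignments = [
    ("isAbstract", "?=", "String"),   # the keyword "abstract"
    ("name", "=", "String"),          # ID returns a String
    ("features", "+=", "Feature"),    # call to the Feature rule
]
entity_properties = {
    name: property_type(operator, token_type)
    for name, operator, token_type in entity_assignments
}
```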
Parsers construct parse trees, not graphs. This means that the outcome of a parser has no cross references, only so-called containment references (composition).
In order to get cross links in your model, one usually has to add a third task: the linking. However, Xtext supports specifying the linking information in the grammar, so that the meta model contains cross references and the generated linker links the model elements automatically (for most cases). Linking semantics can be arbitrarily complex. Xtext generates a default semantic (find by id) which can be selectively overwritten. We will see how this can be done later in this document.
Let's concentrate on what the grammar language supports:
Entity :
"entity" name=ID ("extends" superType=[Entity])?
"{"
(features+=Feature)*
"}";
Have a look at the optional extends clause. Therein the rule name on the right is surrounded by square brackets. That's it.
By default the parser expects an ID to point to the referred element. If you want to refer to it with another kind of token, you can optionally specify it, separated by a vertical bar:
... ("extends" superType=[Entity|MyComplexTokenRule])? ...
where MyComplexTokenRule must be either a NativeLexerRule or a StringRule (explanation follows).
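The default "find by id" semantic can be pictured with the following Python sketch. It is a simplified illustration, not the generated linker: the Entity class, the parsed_superType field and the link function are assumptions, and the sketch treats an entity's name as its id.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entity:
    name: str
    parsed_superType: Optional[str] = None   # identifier text read by the parser
    superType: Optional["Entity"] = None     # cross reference set by the linker


def link(entities):
    """Resolve every parsed identifier to the first entity with that name."""
    for entity in entities:
        if entity.parsed_superType is None:
            continue
        candidates = [e for e in entities if e.name == entity.parsed_superType]
        entity.superType = candidates[0] if candidates else None


person = Entity("Person")
customer = Entity("Customer", parsed_superType="Person")
link([person, customer])
```

After linking, customer.superType points at the person object itself rather than at a copy, which is what turns the parse tree into a graph.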
We've seen how to define simple concrete meta types and their features. One can also define type hierarchies using Xtext's grammar language. Sometimes you want to abstract rules in order to let a feature contain elements of different types.
We have seen the Feature rule in the example. If you would like to have two different kinds of Feature (e.g. Attribute and Reference), you could create an abstract type rule like this:
Feature :
Attribute | Reference;
Attribute :
type=ID name=ID ";";
Reference :
"ref" (containment?="+")? type=ID name=ID ("<->" oppositeName=ID)? ";";
The transformation creating the meta model automatically normalizes the type hierarchy. This means that properties defined in all subtypes will automatically be moved to the common super type. In this case the abstract type 'Feature' would be created containing the two features (name and type). Attribute and Reference would be subtypes of Feature inheriting those properties.
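The normalization step can be sketched in Python as follows. This is a simplified model of the idea, not the actual transformation: properties are plain name/type pairs, and pull_up_common_properties is a hypothetical helper.

```python
def pull_up_common_properties(subtypes):
    """Move properties that every subtype defines identically to the supertype.

    subtypes maps a subtype name to a dict of property name -> property type.
    Returns (supertype_properties, remaining_subtype_properties).
    """
    common = set.intersection(*(set(props.items()) for props in subtypes.values()))
    supertype = dict(common)
    remaining = {
        name: {k: v for k, v in props.items() if (k, v) not in common}
        for name, props in subtypes.items()
    }
    return supertype, remaining


feature, subtypes = pull_up_common_properties({
    "Attribute": {"type": "String", "name": "String"},
    "Reference": {
        "containment": "Boolean",
        "type": "String",
        "name": "String",
        "oppositeName": "String",
    },
})
```

Here feature ends up with name and type, Attribute keeps no properties of its own, and Reference keeps containment and oppositeName.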
It is also possible to define concrete supertypes like this:
Feature :
type=ID name=ID ";" | Reference;
Reference :
"ref" (containment?="+")? type=ID name=ID ("<->" oppositeName=ID)? ";";
In this case Feature wouldn't be abstract but would be the supertype of Reference.
If you need multiple inheritance you can simply add an abstract rule. Such a rule must not be called from another rule.
Example:
Model :
TypeA TypeA TypeC;
TypeA :
"A" | TypeB;
TypeB :
"B";
TypeC :
"C";
CommonSuper :
TypeB | TypeC; // just for the type hierarchy
The resulting type hierarchy will look like this:
- Model
- TypeA
- TypeB extends TypeA, CommonSuper
- TypeC extends CommonSuper
- CommonSuper
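How such a hierarchy could be derived from the rules is sketched below in Python. The derivation rule used here is an assumption made for illustration: whenever a rule has several alternatives and one of them consists of nothing but a call to another rule, the called rule becomes a subtype of the calling rule. The dictionary encoding of the grammar and the name supertypes are hypothetical as well.

```python
# Each rule maps to its list of alternatives; an alternative is a token list.
rules = {
    "Model": [["TypeA", "TypeA", "TypeC"]],
    "TypeA": [['"A"'], ["TypeB"]],
    "TypeB": [['"B"']],
    "TypeC": [['"C"']],
    "CommonSuper": [["TypeB"], ["TypeC"]],
}


def supertypes(rules):
    """Return a dict mapping every rule name to the list of its supertypes."""
    result = {name: [] for name in rules}
    for name, alternatives in rules.items():
        if len(alternatives) < 2:
            continue  # a plain sequence such as Model defines no subtypes
        for alternative in alternatives:
            if len(alternative) == 1 and alternative[0] in rules:
                result[alternative[0]].append(name)
    return result


hierarchy = supertypes(rules)
```

Under that assumption the sketch reproduces the hierarchy listed above: TypeB gets both TypeA and CommonSuper as supertypes, TypeC gets CommonSuper only.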
The enum rule is used to define enumerations. For example if you would like to hardwire the possible datatypes for attributes into the language you could just write:
Attribute :
type=DataType name=ID ";";
Enum DataType :
String="string"|Integer="int"|Boolean="bool";
So that this would be valid:
entity Customer {
string name;
string street;
int age;
bool isPremiumCustomer;
}
but this would not:
entity Customer {
X name; // type X is not known
String street; // type String is not known (case sensitivity!)
}
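The behaviour of the enum rule resembles a value lookup in an ordinary enumeration. The following Python sketch mirrors the DataType rule; it is an analogy, not generated code, and parse_data_type is a hypothetical helper.

```python
from enum import Enum


class DataType(Enum):
    # literal name = text accepted in the model
    String = "string"
    Integer = "int"
    Boolean = "bool"


def parse_data_type(text):
    """Map the text found in the model to an enum literal (case sensitive)."""
    return DataType(text)   # raises ValueError for unknown text
```

parse_data_type("int") yields DataType.Integer, whereas both "X" and "String" are rejected, the latter because the lookup is case sensitive.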
Xtext provides built-in tokens (we have already seen the IdentifierToken and the KeywordToken). Sometimes this is not sufficient, so we might want to create our own tokens. For this purpose there is the so-called String Rule, which is implemented as a parser rule (it's not a lexer rule!).
Example
String JavaIdentifier :
ID ("." ID)*;
The contents of a String Rule are simply concatenated and returned as a string. One can refer to a String Rule in the same manner as we refer to any other rule.
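What "concatenated and returned as a string" means can be shown with a short Python sketch of the JavaIdentifier rule. It works on an already tokenized input and is an illustration only; the function name and the token list representation are assumptions.

```python
import re

ID = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


def java_identifier(tokens):
    """Consume ID ("." ID)* from the front of a token list.

    Returns the concatenated text and the tokens that were not consumed.
    """
    if not tokens or not ID.fullmatch(tokens[0]):
        raise SyntaxError("expected an ID")
    parts, pos = [tokens[0]], 1
    while (pos + 1 < len(tokens)
           and tokens[pos] == "."
           and ID.fullmatch(tokens[pos + 1])):
        parts += [".", tokens[pos + 1]]
        pos += 2
    return "".join(parts), tokens[pos:]
```

For the tokens java . lang . String the sketch returns the single string "java.lang.String", which is what would be assigned to a property such as javaType.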
So just in case you want to declare datatypes using your DSL and therein specify how they are mapped to Java (not platform-independent, I know, but expressive and pragmatic), you could do so using the following rules.
Attribute :
type=DataType name=ID ";";
DataType :
"datatype" name=ID "mappedto" javaType=JavaIdentifier;
String JavaIdentifier :
ID ("." ID)*;
A respective model could look like this:
entity Customer {
string name;
string street;
int age;
bool isPremiumCustomer;
}
datatype string mappedto java.lang.String
datatype int mappedto int
datatype bool mappedto boolean
You could of course point to a custom type-mapping implementation if you need to support multiple platforms (e.g. SQL, WSDL, Java, ...). Additionally, you should consider defining the datatypes in a separate file, so the users of your DSL can import and use them.
As mentioned before, Xtext provides some common built-in lexer rules. Let's start with the two simplest.
All static characters or words (keywords) can be specified directly in the grammar using the usual string literal syntax. We never need the value of a keyword because we know it (it's static). But sometimes there are optional keywords, e.g. the modifiers in Java. The existence of a keyword can be assigned using the boolean assignment operator '?='. However, if you want to assign the value of the keyword to a property, just use the assignment operator '='.
Example:
Entity :
(abstract?="abstract")? "entity" name=ID ("<" extends=ID)?
"{"
(features+=Feature)*
"}";
With this the type Entity will have the boolean property abstract, which is set to true if the respective keyword has been specified for an entity. (I've added the extends part because an abstract entity wouldn't make sense without inheritance.)
Note that operators such as '<' in the example are keywords, too.
We also have seen the identifier token (ID). This is the token rule expressed in Antlr grammar syntax:
('^')?('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*
So an identifier is a word starting with a letter or underscore, optionally followed by further letters, underscores or digits. The return value of the identifier token is a String. So if you use the usual assignment operator '=', the feature the value is assigned to will be of type String. You could also use the boolean assignment operator '?=', in which case the type will be Boolean.
If an identifier conflicts with a keyword or another lexer rule, it can be escaped with the '^' character.
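The ID pattern translates directly into a regular expression. The Python sketch below checks a candidate against it; note that stripping the leading '^' from the returned value is an assumption of this sketch, since this document does not state what value an escaped identifier yields.

```python
import re

# ('^')? ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*
ID = re.compile(r"\^?[A-Za-z_][A-Za-z_0-9]*")


def identifier_value(text):
    """Return the identifier's value, or raise ValueError if text is no ID."""
    if ID.fullmatch(text) is None:
        raise ValueError(f"not an identifier: {text!r}")
    # Assumption: the escape character is not part of the resulting name.
    return text[1:] if text.startswith("^") else text
```

With this, "^entity" can be used as a name even though entity is a keyword, while "1abc" is rejected because an identifier may not start with a digit.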
There is also a built-in String Token. Here is an example:
Attribute : type=DataType name=ID (description=STRING)? ";";
With this one can optionally specify a description for an attribute like this:
entity Customer {
string name ;
string street "should include the street number, too.";
int age;
bool isPremiumCustomer;
}
By default the two string literal syntaxes "my text" and 'my text' are supported. Note that, unlike in Java, multi-line strings are also supported:
entity Customer {
string name ;
string street "should include the street number, too.
And if you don't want to specify it, you
should consider specifying it somewhere else.";
int age;
bool isPremiumCustomer;
}
Sometimes you want to assign Integers. Xtext supports it with the built-in lexer rule INT.
Index: "#" index=INT;
The default pattern is:
('-')?('0'..'9')+
It can be overwritten (see next section), but you have to take care that the coercion (Integer.valueOf(String) is used) still works.
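The interplay of pattern and coercion can be mimicked in Python. This sketch is an analogy: int() stands in for Integer.valueOf(String), and the explicit range check imitates the 32-bit limit of a Java Integer, which Python's int does not have. The name int_value is hypothetical.

```python
import re

# ('-')? ('0'..'9')+
INT = re.compile(r"-?[0-9]+")


def int_value(text):
    """Match the INT pattern, then coerce the text to an integer."""
    if INT.fullmatch(text) is None:
        raise ValueError(f"not an INT token: {text!r}")
    value = int(text)
    if not -2**31 <= value <= 2**31 - 1:
        raise ValueError(f"does not fit into a 32-bit Integer: {text!r}")
    return value
```

A pattern that is overwritten to accept, say, "4.2" would pass the lexer but fail in the coercion step, which is the kind of mismatch the warning above is about.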
There are two different kinds of comments automatically available in any Xtext language.
// single-line comments and
/*
multi-line comments
*/
Note that those comments are ignored by the language's parser by default (i.e. they are not contained in the AST returned from the parser).
If you don't want comments to be ignored, or you want to have a different syntax, you need to overwrite the default implementation (the names are SL_COMMENT and ML_COMMENT, respectively).
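As an illustration of a different comment syntax, the Python sketch below treats bash-style '#' comments as hidden text. The regular expression is a transliteration of the pattern '#' ~('\n'|'\r')* '\r'? '\n' that the native rule example in this document uses; hide_comments is a hypothetical helper, and a real lexer would additionally have to leave '#' characters inside string literals alone.

```python
import re

# '#' ~('\n'|'\r')* '\r'? '\n'
SL_COMMENT = re.compile(r"#[^\n\r]*\r?\n")


def hide_comments(text):
    """Drop single-line comments, keeping one line break in their place."""
    return SL_COMMENT.sub("\n", text)
```

Because the pattern requires a terminating line break, a comment on the very last line of a file without a trailing newline is not matched by it.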
Every textual model contains whitespace. As most languages simply ignore whitespace, Xtext does so by default, too. If you want to have semantic whitespace in your language (as in Python, for example), you have to overwrite the built-in whitespace rule (its name is WS).
If you want to overwrite one or more of the built-in lexer rules or add an additional one, the so called native rule is your friend.
Example:
// overwriting SL_COMMENT: we don't want Java syntax (//) but bash syntax (#)
Native SL_COMMENT :
"'#' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}";
// fully qualified names as a lexer rule
Native FQN :
"ID ('.' ID)*";
The syntax is:
"Native" ID ":" STRING ";"
The string contains a valid ANTLR 3 lexer rule expression (see http://www.antlr.org/wiki/display/ANTLR3/ANTLR+v3+documentation).
It's assumed that you've used the Xtext Projects Wizard and that you
have successfully written an Xtext grammar file describing your little
language. Next, you need to start Xtext's generator in order to get a
parser, a metamodel and an editor. To do so, just right-click the workflow
file (*.oaw) located next to the grammar file and
choose "Run As" -> "Run Workflow" (in Eclipse, of course). The
generator will read the grammar file and create a bunch of files. Some
of them are located in the src-gen directories, others
in the src directory.
IMPORTANT : You should now (after first generation) open the *.properties file and set the "overwritePluginRes=true" option to false!
The generator can be configured with the following properties
defined in generate.properties:
| property name (default) | description |
|---|---|
| grammar | The grammar file to generate the code from. |
| debug.grammar (false) | Specifies whether a debug grammar should be generated. A debug grammar is an ANTLR grammar without any action code, so it can be interpreted within ANTLRWorks. |
| language.name | The name of the DSL. It is used throughout the generated code. |
| language.nsURI ("http://www.oaw.org/xtext/dsl/${language.name}") | A unique URI used within the derived ecore package. |
| language.fileextension ("${language.name}") | The file extension the generated editor is configured for. |
| overwrite.pluginresources ("false") | If this is set to true the plugin resources (META-INF/MANIFEST.MF, plugin.xml) will be overwritten by the generator!!! |
| wipeout.src-gen ("true") | Specifies whether the src-gen folders should be deleted before generation. |
| generator.project.name ("") | If this property is set to something, a project wizard will be generated referencing the generator project. |
| workspace.dir | The root of the workspace. |
| core.project.name | name of the main project |
| core.project.src ("src/") | src folder of the main project |
| core.project.src.gen ("src-gen/") | src-gen folder of the main project |
| editor.generate ("true") | should an editor be generated at all |
| editor.project.name ("${core.project.name}.editor") | name of the editor project |
| editor.project.src ("${core.project.src}") | src folder of the editor project |
| editor.project.src.gen ("${core.project.src.gen}") | src-gen folder of the editor project |
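Several defaults in the table refer to other properties via ${...} placeholders. How such defaults could be resolved is sketched below in Python. This is an assumption about the mechanism, not the generator's code; the property names and defaults are taken from the table, while resolve and the example values mydsl and org.example.mydsl are made up.

```python
import re

# Defaults from the table above that are relevant for placeholder expansion.
DEFAULTS = {
    "language.nsURI": "http://www.oaw.org/xtext/dsl/${language.name}",
    "language.fileextension": "${language.name}",
    "core.project.src": "src/",
    "core.project.src.gen": "src-gen/",
    "editor.project.name": "${core.project.name}.editor",
    "editor.project.src": "${core.project.src}",
    "editor.project.src.gen": "${core.project.src.gen}",
}
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve(user_properties):
    """Merge user settings over the defaults and expand all placeholders."""
    merged = {**DEFAULTS, **user_properties}

    def expand(value, depth=0):
        if depth > 10:
            raise ValueError("cyclic property reference")
        # A reference to an undefined property raises KeyError.
        return PLACEHOLDER.sub(
            lambda match: expand(merged[match.group(1)], depth + 1), value
        )

    return {key: expand(value) for key, value in merged.items()}


properties = resolve({
    "language.name": "mydsl",                  # hypothetical DSL name
    "core.project.name": "org.example.mydsl",  # hypothetical project name
})
```

With these two settings, the editor project name resolves to org.example.mydsl.editor and the namespace URI to http://www.oaw.org/xtext/dsl/mydsl.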
Any textual artifacts located in the src dir
(of any project) will always stay untouched. The generator just creates
them the first time when they don't exist.
Files generated to the src-gen directory
should never be touched! The whole directory will be wiped out the next
time one starts the generator.
Xtext generates artifacts into two different projects.
The name of the main project can be specified in the wizard. This project contains the main language artifacts and is 100% Eclipse-independent. The default locations of the most important resources are:
| Location | Description |
|---|---|
| src/[dslname].xtxt | The grammar file, containing the grammar rules describing your DSL. |
| src/generate.oaw | The workflow file for the Xtext generator. |
| src/generator.properties | Properties passed to the Xtext generator. |
| src/[base.package.name]/Checks.chk | The Check file used by the parser and within the editor. Used to add semantic constraints to your language. |
| src-gen/[base.package.name]/GenChecks.chk | The generated Check file contains checks automatically derived from the grammar. |
| src/[base.package.name]/Extensions.ext | The Extension file is used (imported) by all other extension and check files. It reexports the extensions from |
| src-gen/[base.package.name]/GenExtensions.ext | Generated extensions (reexported by |
| src/[base.package.name]/Linking.ext | Used by |
| src-gen/[base.package.name]/GenLinking.ext | Contains the default linking semantic for each cross reference. Example: Void link_featureName(my::MetaType this) : (let ents = this.allElements().typeSelect(my::ReferredType) : this.setFeatureName(ents.select(e\|e.id() == this.parsed_featureName).first()) ); This is: Get all instances of the referred type using the allElements() extension. Select the first one where the id() equals the parsed value (by default an identifier). Both extensions (id() and allElements()) can be overwritten or specialized in the |
| src-gen/[base.package.name]/[dslname].ecore | Metamodel derived from the grammar. |
| src-gen/[base.package.name]/parser/* | Generated ANTLR parser artifacts. |
The name of the editor project is derived from the main
project's name by appending the suffix .editor to
it. The editor project contains the information specific to the
Eclipse text editor. Note that it uses a generic xtext.editor plugin, which
does most of the job. These are the most important resources:
| Location | Description |
|---|---|
| src/[base.package.name]/[dslname]EditorExtensions.ext | The Xtend file is used by the outline view. If you want to customize the labels of the outline view, you can do that here. |
| src-gen/[base.package.name]/[dslname]Utilities.java | Contains all the important DSL-specific information. You should subclass it in order to overwrite the default behaviours. |
| src/[base.package.name]/[dslname]EditorPlugin.java | If you have subclassed the *Utilities class, make sure to change the respective instantiation here. |
The name of the generator project is derived from the main
project's name by appending the suffix .generator
to it. The generator project is intended to contain all needed
generator resources such as Xpand templates, platform-specific Xtend
files, etc.
These are the most important resources:
| Location | Description |
|---|---|
| src/[base.package.name]/generator.oaw | The generator's workflow, preconfigured with the generated DSL parser and the Xpand component. As this is just a proposal, feel free to change the workflow or add to it as you see fit. |
| src-gen/[base.package.name]/Main.xpt | The proposed Xpand template file. |
The generated editor supports a number of features known from other Eclipse editors. Although most of them have a decent default implementation, we will see how to tweak and enhance each of them.
Code Completion is controlled using oAW extensions. The default implementation provides keyword proposals as well as proposals for cross references.
Have a look at the extension file
ContentAssist.ext. Therein a comment describes how
to customize the default behaviour:
/*
 * There are two types of extensions one can define
 *
 * 1) completeMetaType_feature(ModelElement ele, String prefix)
 *    This one is called for assignments only. It gets the underlying model element and the current
 *    prefix passed in.
 *
 * 2) completeMetaType(xtext::Element grammarEle, ModelElement ele, String prefix)
 *    This one gets the grammar element which should be completed passed in as the first parameter.
 *    An xtext::Element can be of the following types:
 *    - xtext::RuleName (a call to a lexer rule (e.g. ID)),
 *    - xtext::Keyword,
 *    - xtext::Assignment
 *
 * Overwrite rules are as follows:
 * 1) if the first one returns null for a given xtext::Assignment or does not exist, the second one
 *    is called.
 * 2) if the second one returns null for a given xtext::Keyword or does not exist, a default keyword
 *    proposal will be added.
 *
 * Note that only proposals which match (case-insensitively) the current prefix will be proposed
 * in the editor
 */
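The fallback chain and the prefix filter described in that comment can be paraphrased in Python. This is a loose sketch of the rules, not the editor's implementation: propose is a hypothetical function, the handlers stand in for the two kinds of extensions, and the distinction between assignments and keywords is collapsed into a single chain.

```python
def propose(prefix, handlers, default_proposals):
    """Ask each handler in turn; fall back to the defaults if none answers.

    A handler that is None does not exist; a handler returning None declines.
    Only proposals matching the prefix (ignoring case) are returned.
    """
    for handler in handlers:
        if handler is None:
            continue
        proposals = handler(prefix)
        if proposals is not None:
            break
    else:
        proposals = default_proposals
    wanted = prefix.lower()
    return [p for p in proposals if p.lower().startswith(wanted)]


# No specific extension exists and the generic one declines: defaults are used.
keywords = propose("EN", [None, lambda prefix: None], ["entity", "extends"])

# A specific extension answers, so the defaults are never consulted.
names = propose("c", [lambda prefix: ["Customer", "Car", "Order"]], ["entity"])
```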
The navigation actions are implemented via extensions, too. The same
pattern applies as for code completion: there is a GenNavigation.ext
extension file in the src-gen folder, which can be overwritten or
specialized using the Navigation.ext file in the src folder
(reexporting the generated extensions).
Xtext supports two different actions:
The first one, the find-references action, can be invoked via [CTRL]+[SHIFT]+G or through the corresponding action in the context menu. The default implementation returns the cross references for a model element.
The signature to overwrite / specialize is:
List[UIContentNode] findReferences(String s, Object grammarelement, Object element) : ...;
UIContentNode is a meta class used by Xtext. A UIContentNode represents an element visualized in Eclipse.
Here is the declaration of UIContentNode (pseudo code):
package tree;
eclass UIContentNode {
UIContentNode parent;
UIContentNode[] children;
String label;
String image;
emf::EObject context;
}
A content node can have children and/or a parent (the tree structure is not used for find references). The label is used for the label in Eclipse, and the image points to an image relative to the icons folder in the editor project. The icon instances are automatically cached and managed.
The context points to the actual model element. This is used to get the file, line and offset of the declaration. If you don't fill it you cannot click on the item in order to get to it.
The second one, the go-to-declaration action, can be invoked via F3 as well as by holding CTRL, hovering over an identifier and left-clicking.
The default implementation goes to the declaration of a crossreference. You can implement or overwrite this action for all grammar elements.
emf::EObject findDeclaration(String identifier, emf::EObject grammarElement, emf::EObject modelElement) : ...;
Have a look at the generated extensions to see how it works.
The outline view is constructed using a tree of UIContentNode (see above).
Each time the outline view is created the following extension is called:
UIContentNode outlineTree(emf::EObject model)
It is expected to be declared in Outline.ext, which by
default exports a generic implementation from
GenOutline.ext (the same pattern again).
You can reuse the generic extension and just provide a
label() and image() extension
for your model elements (these should be added in
EditorExtensions.ext).
However, if you want to control the structure of the outline tree,
you can overwrite the extension outlineTree(emf::EObject
model) in Outline.ext.
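The shape of such an outline tree can be sketched in Python. The UIContentNode class below follows the pseudo-code declaration given earlier in this document; the outline_tree function, its callback parameters and the dictionary-based example model are assumptions made for illustration and do not reflect the generic implementation in GenOutline.ext.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class UIContentNode:
    label: str
    image: Optional[str] = None   # path relative to the icons folder
    context: Any = None           # the model element behind this node
    parent: Optional["UIContentNode"] = field(default=None, repr=False)
    children: list = field(default_factory=list)


def outline_tree(element, label, image, children_of):
    """Build a node for element and, recursively, for its contained elements."""
    node = UIContentNode(label=label(element), image=image(element), context=element)
    for child in children_of(element):
        child_node = outline_tree(child, label, image, children_of)
        child_node.parent = node
        node.children.append(child_node)
    return node


# A made-up model: one entity containing one attribute.
model = {
    "kind": "entity",
    "name": "Customer",
    "contents": [{"kind": "attribute", "name": "age", "contents": []}],
}
tree = outline_tree(
    model,
    label=lambda e: e["name"],
    image=lambda e: e["kind"] + ".gif",
    children_of=lambda e: e["contents"],
)
```

Customizing only the label and image callbacks corresponds to the first option described above; replacing the traversal itself corresponds to overwriting outlineTree.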
The default syntax highlighting distinguishes between comments, string literals, keywords and the rest.
If you just want to specify which words are to be coloured as keywords, you can extend the [basepackage.][Languagename]Utilities.java class from the editor plugin. You need to overwrite the following method (DON'T CHANGE IT DIRECTLY, BECAUSE IT WILL BE OVERWRITTEN THE NEXT TIME YOU START THE GENERATOR!):
public String[] allKeywords()
Each String returned by the method represents a keyword.
The utilities instance is created within [LanguageName]EditorPlugin.java. So make sure that you change the following lines, too:
// OLD -> private MyLangUtilities utilities = new MyLangUtilities();
private MyCustomUtilities utilities = new MyCustomUtilities();
public LanguageUtilities getUtilities() {
return utilities;
}
If you want to change the syntax of comments and string literals, you have to provide an alternative implementation of GeneratedPartitionScanner.
Don't touch the class directly; instead, use the Utilities method to return a different instance.
This part of the documentation deals with the discussion and solution of different requirements and problems.
If you have huge models, or want to separate parts of your models from others for other reasons (e.g. to provide a kind of library), you need to tweak Xtext a bit, because there is no first-class support for this as of now. However, Xtext has a built-in registry where you can get parsed models by name.
Xtext automatically manages a built-in registry. That is, it caches parsed models by their file name and provides an API (both Java and Xtend) to access cached models.
The registry is contained in the org.openarchitectureware.xtext.core.base bundle. The Java API consists of a handful of static methods from org.openarchitectureware.xtext.registry.Registry.java. The extensions are contained in org.openarchitectureware.xtext.registry.Reg.ext. This part of the framework will definitely be enhanced in the future, so expect API changes here.
All you have to do is to declare some kind of import statement in your DSL and overwrite the allElements() extension from GenExtensions.ext.
Here is an example:
MyModel :
(imports+=Import)* (types+=Type)*;
Import :
"import" name=ID;
Type : ...
Overwrite the allElements extension in Extensions.ext to something like this:
extension mydsl::GenExtensions reexport;
extension org::openarchitectureware::xtext::registry::Reg;
allElements(emf::EObject this) :
wholeModel().union(
((MyModel)eRootContainer)
.imports // for each import
.getRootNode(i) // get the root from the registry
.wholeModel() // get the whole model as a list
);
wholeModel(emf::EObject this) :
{eRootContainer}.union(eRootContainer.eAllContents);
As the allElements() extension is used everywhere the imported model elements will automatically be available for code completion, navigation, checking, etc.
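The combination of registry and import statement can be modelled in a few lines of Python. This is a conceptual sketch, not the API of Registry.java or Reg.ext: the Registry class, its get_root_node method, the dictionary-based models and the fake parse function are all made up for the example, and imports are not followed transitively.

```python
class Registry:
    """Caches parsed models by file name."""

    def __init__(self, parse):
        self._parse = parse   # function: file name -> root model element
        self._cache = {}

    def get_root_node(self, file_name):
        if file_name not in self._cache:
            self._cache[file_name] = self._parse(file_name)
        return self._cache[file_name]


def whole_model(root):
    """The root element followed by everything it contains."""
    return [root] + list(root["types"])


def all_elements(root, registry):
    """Elements of this model plus the elements of every imported model."""
    elements = whole_model(root)
    for imported_name in root["imports"]:
        elements += whole_model(registry.get_root_node(imported_name))
    return elements


# A made-up library model and a stand-in for the parser.
files = {"library": {"imports": [], "types": ["String", "Integer"]}}
parse_calls = []


def fake_parse(file_name):
    parse_calls.append(file_name)
    return files[file_name]


registry = Registry(fake_parse)
main_model = {"imports": ["library"], "types": ["Customer"]}
visible = all_elements(main_model, registry)
```

Calling all_elements a second time does not parse the library again, because the registry hands out the cached root.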