0% found this document useful (0 votes)
921 views

Xtext User Guide

Uploaded by

tontondimsum
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
921 views

Xtext User Guide

Uploaded by

tontondimsum
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 113

Xtext User Guide

Xtext User Guide


Heiko Behrens, Michael Clay, Sven Efftinge, Moritz Eysholdt, Peter Friese,
Jan Köhnlein, Knut Wannheden, Sebastian Zarnekow and contributors

Copyright 2008 - 2010

Xtext 1.0 1
1. Overview .................................................................................................................... 1
1.1. What is Xtext? ................................................................................................... 1
1.2. How Does It Work? ............................................................................................ 1
1.3. Xtext is Highly Configurable ................................................................................ 1
1.4. Who Uses Xtext? ................................................................................................ 1
1.5. Who is Behind Xtext? ......................................................................................... 1
1.6. What is a Domain-Specific Language ..................................................................... 2
2. Getting Started ............................................................................................................ 3
2.1. Creating a DSL ................................................................................................. 3
2.1.1. Create an Xtext Project ............................................................................. 3
2.1.2. Project Layout ......................................................................................... 4
2.1.3. Build Your Own Grammar ......................................................................... 5
2.1.4. Generate Language Artifacts ...................................................................... 8
2.1.5. Run the Generated IDE Plug-in ................................................................... 8
2.1.6. Adding a Namespace Concept .................................................................... 9
2.2. Processing Xtext Models .................................................................................... 11
2.2.1. Loading a Resource ............................................................................... 12
2.2.2. Working with EMF Models ...................................................................... 12
2.2.3. Working with the Parse Tree (aka Node Model) ........................................... 13
2.2.4. Working with the Grammar ...................................................................... 13
2.2.5. Working with Live Models ....................................................................... 13
2.2.6. Summary ............................................................................................... 14
2.3. Writing a Code Generator ................................................................................... 15
2.3.1. Xpand and MWE2 .................................................................................. 15
2.3.2. The Empty Generator Project .................................................................... 15
2.3.3. Replacing the Example Model File (1) ........................................................ 15
2.3.4. The Modeling Workflow Engine File (2) .................................................... 16
2.3.5. Using Xpand (3) and Xtend (4) for Code Generation ..................................... 18
3. The Grammar Language ............................................................................................ 21
3.1. A First Example ............................................................................................... 21
3.2. The Syntax ...................................................................................................... 22
3.2.1. Language Declaration .............................................................................. 22
3.2.2. EPackage Declarations ............................................................................. 22
3.2.3. Rules .................................................................................................... 24
3.2.4. Parser Rules ........................................................................................... 26
3.2.5. Hidden Terminal Symbols ........................................................................ 30
3.2.6. Data Type Rules ..................................................................................... 31
3.2.7. Enum Rules ........................................................................................... 31
3.3. Ecore Model Inference ....................................................................................... 32
3.3.1. Type and Package Generation ................................................................... 32
3.3.2. Feature and Type Hierarchy Generation ...................................................... 32
3.3.3. Enum Literal Generation .......................................................................... 33
3.3.4. Feature Normalization ............................................................................. 33
3.3.5. Customized Post Processing ...................................................................... 33
3.3.6. Error Conditions ..................................................................................... 34
3.4. Grammar Mixins ............................................................................................... 34
3.5. Common Terminals ........................................................................................... 34
4. Configuration ............................................................................................................ 36
4.1. The Generator .................................................................................................. 36
4.1.1. A Short Introduction to MWE2 ................................................................. 36
4.1.2. General Architecture .............................................................................. 37
4.1.3. Standard Generator Fragments .................................................................. 40
4.2. Dependency Injection in Xtext with Google Guice .................................................. 40
4.2.1. The Module API .................................................................................... 41
4.2.2. Obtaining an Injector ............................................................................. 42
5. Runtime Concepts ...................................................................................................... 43
5.1. Runtime Setup (ISetup) ...................................................................................... 43
5.2. Setup within Eclipse-Equinox (OSGi) ................................................................... 43

Xtext 1.0 ii
Xtext User Guide

5.3. Logging ........................................................................................................... 43


5.4. Validation ........................................................................................................ 43
5.4.1. Automatic Validation .............................................................................. 43
5.4.2. Custom Validation .................................................................................. 45
5.4.3. Validation with the Check Language .......................................................... 45
5.4.4. Validating Manually ................................................................................ 46
5.4.5. Test Validators ....................................................................................... 46
5.5. Linking ........................................................................................................... 48
5.5.1. Declaration of Crosslinks ......................................................................... 48
5.5.2. Default Runtime Behavior (Lazy Linking) ................................................... 48
5.6. Scoping ........................................................................................................... 48
5.6.1. Global Scopes and IResourceDescriptions .................................................. 49
5.6.2. Local Scoping ........................................................................................ 54
5.6.3. Imported Namespace-Aware Scoping ......................................................... 56
5.7. Value Converter ................................................................................................ 57
5.7.1. Annotation-Based Value Converters ........................................................... 57
5.8. Serialization ..................................................................................................... 57
5.8.1. The Contract .......................................................................................... 58
5.8.2. Roles of the Semantic Model and the Node Model During Serialization ............. 58
5.8.3. Parse Tree Constructor ............................................................................ 58
5.8.4. Options ................................................................................................. 59
5.8.5. Preserving Comments from the Node Model ................................................ 59
5.8.6. Transient Values ..................................................................................... 59
5.8.7. Unassigned Text ..................................................................................... 60
5.8.8. Cross-Reference Serializer ........................................................................ 60
5.8.9. Merge Whitespaces ................................................................................. 60
5.8.10. Token Stream ....................................................................................... 60
5.9. Formatting (Pretty Printing) ................................................................................ 60
5.9.1. Declarative Formatter .............................................................................. 61
5.9.2. Fragment Provider (Referencing Xtext Models From Other EMF Artifacts) ........ 63
5.10. Encoding in Xtext ........................................................................................... 64
5.10.1. Encoding at Language Design Time ......................................................... 64
5.10.2. Encoding at Language Runtime ............................................................... 64
5.10.3. Encoding of an XtextResource ................................................................. 65
5.10.4. Encoding in New Model Projects ............................................................. 65
5.10.5. Encoding of Xtext Source Code ............................................................... 65
6. MWE2 ...................................................................................................................... 66
6.1. Examples ......................................................................................................... 66
6.1.1. The Simplest Workflow ........................................................................... 66
6.1.2. A Simple Transformation ......................................................................... 67
6.1.3. A Stopwatch .......................................................................................... 69
6.2. Language Reference .......................................................................................... 69
6.2.1. Mapping to Java Classes .......................................................................... 70
6.2.2. Module ................................................................................................ 70
6.2.3. Properties .............................................................................................. 70
6.2.4. Mandatory Properties .............................................................................. 71
6.2.5. Named Components ................................................................................ 72
6.2.6. Auto Injection ........................................................................................ 72
6.3. Syntax Reference .............................................................................................. 72
6.3.1. Module ................................................................................................. 72
6.3.2. Property ................................................................................................ 73
6.3.3. Component ............................................................................................ 73
6.3.4. String Literals ........................................................................................ 74
6.3.5. Boolean Literals ..................................................................................... 74
6.3.6. References ............................................................................................. 74
7. IDE Concepts ............................................................................................................ 75
7.1. Label Provider .................................................................................................. 75
7.1.1. Label Providers For EObjects ................................................................... 75

Xtext 1.0 iii


Xtext User Guide

7.1.2. Label Providers For Index Entries .............................................................. 76


7.2. Content Assist .................................................................................................. 77
7.2.1. ProposalProvider ..................................................................................... 77
7.2.2. Sample Implementation ............................................................................ 77
7.3. Quick Fixes ...................................................................................................... 78
7.3.1. Quickfixes for Linking Errors and Syntax Errors .......................................... 79
7.4. Template Proposals ........................................................................................... 80
7.4.1. CrossReference TemplateVariableResolver .................................................. 80
7.4.2. Enumeration TemplateVariableResolver ...................................................... 81
7.5. Outline View .................................................................................................... 82
7.5.1. Influencing the Outline Structure ............................................................... 82
7.5.2. Filtering ................................................................................................ 83
7.5.3. Context menus ....................................................................................... 85
7.5.4. Quick Outline ........................................................................................ 86
7.6. Hyperlinking .................................................................................................... 86
7.6.1. Location Provider ................................................................................... 86
7.6.2. Customizing Available Hyperlinks ............................................................. 87
7.7. Syntax Coloring ................................................................................................ 87
7.7.1. Lexical Highlighting ................................................................................ 87
7.7.2. Semantic Highlighting ............................................................................. 88
7.8. Project Wizard .................................................................................................. 89
7.8.1. Customizing the Project Wizard ................................................................ 90
8. Referring to Java Types ............................................................................................. 91
8.1. How to Use Java Types .................................................................................... 91
8.2. Customization Points ......................................................................................... 91
9. Typical Language Configurations ................................................................................ 93
9.1. Case Insensitive Languages ................................................................................. 93
9.2. Languages Independent of JDT ............................................................................ 93
9.3. Parsing Expressions with Xtext ............................................................................ 94
9.3.1. Construction of an AST ........................................................................... 95
9.3.2. Associativity .......................................................................................... 98
10. Integration with EMF and Other EMF Editors ............................................................ 99
10.1. Model, Ecore Model, and Ecore ........................................................................ 99
10.2. EMF Code Generation .................................................................................... 100
10.3. XtextResource Implementation ......................................................................... 101
10.4. Integration with GMF Editors .......................................................................... 102
10.4.1. Stage 1: Make GMF Read and Write the Semantic Model As Text ................ 103
11. Migrating from Xtext 0.7.x to 1.0 ............................................................................. 105
11.1. Take the Shortcut ........................................................................................... 105
11.2. Migrating Step By Step ................................................................................... 105
11.2.1. Update the Plug-in Dependencies and Import Statements ............................. 105
11.2.2. Rename the Packages in the dsl.ui-Plug-in ............................................... 105
11.2.3. Update the Workflow ........................................................................... 105
11.2.4. MANIFEST.MF and plugin.xml ............................................................. 106
11.2.5. Noteworthy API Changes ..................................................................... 106
11.3. Now Go For The New Features ........................................................................ 107
12. The ANTLR IP Issue (Or Which Parser To Use?) ...................................................... 109

Xtext 1.0 iv
Chapter 1. Overview
1.1. What is Xtext?
No matter if you want to create a small textual domain-specific language (DSL) or you want to implement
a full-blown general purpose programming language. With Xtext you can create your very own languages
in a snap. Also if you already have an existing language but it lacks decent tool support, you can use Xtext
to create a sophisticated Eclipse-based development environment providing editing experience known
from modern Java IDEs in a surprisingly short amount of time. We call Xtext a language development
framework.

1.2. How Does It Work?


Xtext provides you with a set of domain-specific languages and modern APIs to describe the different
aspects of your programming language. Based on that information it gives you a full implementation
of that language running on the JVM. The compiler components of your language are independent of
Eclipse or OSGi and can be used in any Java environment. They include such things as the parser, the
type-safe abstract syntax tree (AST), the serializer and code formatter, the scoping framework and the
linking, compiler checks and static analysis aka validation and last but not least a code generator or
interpreter. These runtime components integrate with and are based on the Eclipse Modeling Framework
(EMF), which effectively allows you to use Xtext together with other EMF frameworks like for instance
the Graphical Modeling Project GMF.
In addition to this nice runtime architecture, you will get a full blown Eclipse-IDE specifically tailored
for your language. It already provides great default functionality for all aspects and again comes with
DSLs and APIs that allow to configure or change the most common things very easily. And if that’s not
flexible enough there is Guice to replace the default behaviour with your own implementations.

1.3. Xtext is Highly Configurable


Xtext uses the lightweight dependency injection (DI) framework Google Guice to wire up the whole
language as well as the IDE infrastructure. A central, external module is used to configure the DI
container. As already mentioned, Xtext comes with decent default implementations and DSLs and APIs
for the aspect that are common sweet spots for customization. But if you need something completely
different, Google Guice gives you the power to exchange every little class in a non-invasive way.

1.4. Who Uses Xtext?


Xtext is used in many different industries. It is used in the field of mobile devices, automotive
development, embedded systems or Java enterprise software projects and game development. People use
Xtext-based languages to drive code generators that target Java, C, C++, C#, Objective C, Python, or
Ruby code. Although the language infrastructure itself runs on the JVM, you can compile Xtext languages
to any existing platform. Xtext-based languages are developed for well known Open-Source projects such
as Maven, Eclipse B3, the Eclipse Webtools platform or Google’s Protocol Buffers and the framework
is also widely used in research projects.

1.5. Who is Behind Xtext?


Xtext is a professional Open-Source project. We, the main developers and the project lead, work for
itemis, which is a well known consulting company specialized on model-based development. Therefore
we are able to work almost full-time on the project. Xtext is an Eclipse.org project. Besides many other
advantages this means that you don’t have to fear any IP issues, because the Eclipse Foundation has their
own lawyers who take care that no intellectual property is violated.
You may ask: Where does the money for Open-Source development come from? Well, we provide
professional services around Xtext. Be it training or on-site consulting, be it development of prototypes
or implementation of full-blown IDEs for programming languages. We do not only know the framework
very well but we are also experts in programming and domain-specific language design. Don’t hesitate
to get in contact with us ( www.itemis.com).

Xtext 1.0 1
Overview

1.6. What is a Domain-Specific Language


A Domain-Specific Language (DSL) is a small programming language, which focuses on a particular
domain. Such a domain can be more or less anything. The idea is that its concepts and notation is as close
as possible to what you have in mind when you think about a solution in that domain. Of course we are
talking about problems which can be solved or processed by computers somehow.
The opposite of a DSL is a so called GPL, a General Purpose Language such as Java or any other common
programming language. With a GPL you can solve every computer problem, but it might not always be
the best way to solve it.
Imagine you want to remove the core from an apple. You could of course use a Swiss army knife to cut
it out, and this is reasonable if you have to do it just once or twice. But if you need to do that on a regular
basis it might be more efficient to use an apple corer.
There are a couple of well-known examples of DSLs. For instance SQL is actually a DSL which focuses
on querying relational databases. Other DSLs are regular expressions or even languages provided by tools
like MathLab. Also most XML languages are actually domain-specific languages. The whole purpose of
XML is to allow for easy creation of new languages. Unfortunately, XML uses a fixed concrete syntax,
which is very verbose and yet not adapted to be read by humans. Into the bargain, a generic syntax for
everything is a compromise.
Xtext is a sophisticated framework that helps to implement your very own DSL with appropriate IDE
support. There is no such limitation as with XML, you are free to define your concrete syntax as you
like. It may be as concise and suggestive as possible being a best match for your particular domain. The
hard task of reading your model, working with it and writing it back to your syntax is greatly simplified
by Xtext.

Xtext 1.0 2
Chapter 2. Getting Started
In this mini-tutorial we will implement a small domain-specific language and a corresponding Eclipse
IDE integration with Xtext. Later, we will create a code generator that is capable of reading the models
you create with the DSL editor and process them, i.e. generate some Java code from it.

2.1. Creating a DSL


Download and install the latest version of Xtext and set up a fresh workspace. We want to develop a small
language used to define domain models. It is a stripped-down version of our Domain-Model Example
which is shipped with Xtext and can be materialized into your workspace using the example wizard. The
following sample of a domain model should give you an idea about the language.
datatype String
datatype Bool

entity Session {
title: String
isTutorial : Bool
}

entity Conference {
name : String
attendees : Person*
speakers : Speaker*
}

entity Person {
name : String
}

entity Speaker extends Person {


sessions : Session*
}

2.1.1. Create an Xtext Project


In order to get started we first need to create some Eclipse projects. Use the Xtext wizard to do so:
File -> New -> Project... -> Xtext -> Xtext project
Choose a meaningful project name, language name and file extension, e.g.

Main project name: org.eclipse.xtext.example.domainmodel


Language name: org.eclipse.xtext.example.Domainmodel
DSL-File extension: dmodel

Keep “Create generator project” checked, as we will also create a code generator in a second step. If you
want to materialize the full blown Domainmodel example that ships with Xtext, it is strongly recommend
to use another language name and file extension for this tutorial.
Click on Finish to create the projects.

Xtext 1.0 3
Getting Started

2.1.2. Project Layout


In the Package Explorer you can see three new projects. In org.eclipse.xtext.example.domainmodel
the grammar is defined and all runtime aspects, such as linking, scoping and validation are
developed. The IDE-aspects such as the editor, any views, and incremental project builder go into
org.eclipse.xtext.example.domainmodel.ui. Both projects consist of generated classes derived from the
grammar and manual code such as the grammar itself and further classes to differentiate and adapt from
the default behavior. Although Xtext makes use of code generation, most of the code is actually written
as libraries, which are referenced by the means of OSGi using the Manifest.MF.

It is good to be clear and unambiguous whether the code is generated or is to be manipulated by the
developer. Thus, the generated code should be held separately from the manually written code. We follow
this pattern by having a folder src/ and a folder src-gen/ in each project. Keep in mind not to make changes
in the src-gen/ folder. They will be overwritten by the generator without any further warning.
A third project, org.eclipse.xtext.example.domainmodel.generator will later contain an Xpand code
generator that leverages the model created with the DSL editor. This is fully optional and it is also optional
to use Xpand. In fact you can use any programming language running on the JVM to implement a code

Xtext 1.0 4
Getting Started

generator for your Xtext models. Also it is of course perfectly ok, to have no code generator at all but use
the models dynamically at runtime. We call these kind of model processors interpreters.

2.1.3. Build Your Own Grammar


Please note : If you encounter strange errors while copying and pasting a snippet from this document to
your Eclipse editor, your documentation viewer most likely has inserted characters different from {space}
into your clipboard. Reenter the “whitespace” or type the text by hand to be sure everything works fine.
The wizard will automatically open the example grammar file Domainmodel.xtext from the first project in
the editor. A grammar has two purposes: First, it is used to describe the concrete syntax of your language.
Second, it contains information about how the parser shall create a model during parsing.
The first line of the grammar is the grammar declaration:
grammar org.eclipse.xtext.example.Domainmodel
with org.eclipse.xtext.common.Terminals

In Xtext each grammar has a unique name, which like public Java classes needs to reflect the location of
the file within the Java classpath. In our case the grammar file is located in /org/eclipse/xtext/example/
Domainmodel.xtext therefore the name of the grammar is org.eclipse.xtext.example.Domainmodel. The
second part of that statement ( with org.eclipse.xtext.common.Terminals) states, that this grammar reuses
and overrides rules from the specified grammar. The org.eclipse.xtext.common.Terminals is a library
grammar shipped with Xtext and predefines the most common terminal rules, such as ID, STRING and
INT. It also introduces single-line and multi-line comments as well as rules for whitespace, which may
occur everywhere by default. You can open that grammar in the editor to have a look at these rules. It
turns out that these set of rules are often the same and often used, so that most Xtext languages extend
this grammar. However, it is just a library, so you won’t use it if it doesn’t fit your needs. Also you can
use the grammar inheritance mechanism for your own grammar libraries, of course.
The next statement declares an EMF Ecore package, aka EPackage, to be derived from the grammar:
generate domainmodel "http://www.eclipse.org/xtext/example/Domainmodel"

Ecore EPackages are effectively a set of classes (in EMF they are called EClass) which are used to
represent the in-memory-model of a text file. Instead of model we also sometimes refer to it as semantic
model or _Abstract Syntax Tree (AST)_. Note, that in the case of Xtext it’s actually a graph rather than
a tree, since it also contains crosslinks. But as the term AST is so commonly used and well known we
are ignoring this minor detail.
In order to tell the parser which EClasses to use when creating the AST, we need to make them available
by either importing existing Ecore packages or like in our example let Xtext derive a package from the
grammar. An EPackage has a mandatory name and namespace URI (short: nsURI), so we have to declare
these two values in that statement.
That’s all for the prelude, now we have to define the parser rules which effectively define a bi-directional
mapping from text to a constructed AST. It is bi-directional because Xtext not only derives a parser from
the grammar but also a serializer. Anyway, we will mostly talk about parsing in this chapter.
Ignore and delete the parser rules generated by the wizard and start by declaring an entry rule for our
language. The entry rule for the parser will be called DomainModel. Xtext is fine with any name, it just
picks the first rule in a grammar and interprets it as the entry rule.
As a DomainModel consists of one or more Entity entries, this rule delegates to another rule named
Entity, which will be defined later on. As we can have an arbitrary number of entities within a model, the
cardinality is ' *'. There are four different cardinalities available.

(no operator) exactly one


? zero or one
* zero or more
+ one or more

Each rule ends with a semicolon. So our first rule reads for now as

Xtext 1.0 5
Getting Started

DomainModel :
Entity*;

An Xtext grammar does not only describe rules for the parser but also the structure of the resulting AST.
Usually, each parser rule will create a new object in that tree. The type of that element can be specified
after the rule name using the keyword returns. If the type’s name is the same as the rule’s name, it can
be omitted as in our case. Which effectively means that
DomainModel returns DomainModel: ...

is the same as
DomainModel : ...

In order to connect the different objects together, we have to assign the elements returned by called rules
to some feature of the current rule’s object. This is done by using so called assignments.
DomainModel :
(elements+=Entity)*;

The assignment with the ' +=' operator adds every object returned by the Entity rule to the elements
referenced of the AST type Model. If you use ' +=' as opposed to ' =' the feature needs to be a list – the
upperBound has to be -1. In addition to the two mentioned assignment operators there is a third one ' ?='
which is called boolean assignment and which sets a certain feature to true, if the part on the right hand
side was parsed successfully. Here’s the list of available assignment operators:

feature=... corresponds to setFeature(...)


list+=... corresponds to getList().add(...)
condition?=... corresponds to setCondition(true)

The next rule on our list is the rule Entity. Looking at our target syntax, each entity begins with the
keyword 'entity' followed by the entity’s name and an opening curly brace (we will handle the extends
clause in a second step). Then, an entity defines a number of features and ends with a closing curly brace.
The rule looks like this:
Entity :
'entity' name=ID '{'
(features+=Feature)*
'}';

Keywords are simply declared as string literals within the rules. The name uses the ID rule from the
mixed-in grammar org.eclipse.xtext.common.Terminals. The ID rule is a so called terminal rule which
always returns simple data types, such as String, Date, or Integer. Actually any Java class can be a simple
EMF EDataType. The ID rule recognizes alphanumeric words similar to identifiers in Java and returns
the value as a String. You can navigate to its declaration using F3. As mentioned before the assignment
operator ' =' denotes a single valued feature.
We want to allow inheritance for entities and therefore now add an optional ' extends' part.
Entity :
'entity' name=ID ('extends' superType=[Entity])? '{'
(features+=Feature)*
'}';

The question mark marks an element or as in this case an entire group of elements as optional. The
other new thing is the right hand side of of the assignment superType=[Entity]. This is a cross-reference
literal and states, that a cross-reference to an elsewhere declared Entity should be established. So in this
case Entity does not point to the corresponding rule but to the EClass. As both are named equally this
might cause a little confusion at first. The full syntax for a cross-reference is [TypeName|RuleCall] where
RuleCall defaults to ID. This means that ? would work equally well. The parser only parses the name of

Xtext 1.0 6
Getting Started

the cross-referenced element using the ID rule and stores it internally. Later on the linker establishes the
cross-reference using the name, the defined cross-reference’s type ( Entity in this case) and the defined
scoping rules. Scoping is not covered in this tutorial since the domain model DSL works with the default
scoping semantics. By default a namespace-based scoping is applied, which means that all entities living
in the same namespace are possible candidates for a cross-reference. We talk a bit about this later on in
this chapter when we introduce namespaces.
In our introductory example model we had not only defined entities but also two data types. So the
DomainModel rule needs to be able to parse data types as well. As both entities and data types will be
referenced from the yet to be defined Feature rule, we should introduce a common super type for them.
DomainModel :
(elements+=Type)*;

Type:
DataType | Entity;

DataType:
'datatype' name=ID;

A DomainModel now consists of types where a Type can either be a DataType or an Entity. Our rule Type
will just delegate to either of them, using the ' |' alternatives operator. Now that we have a common super
type for Entity and DataType we are able to refer to both types of elements with a single cross-reference.
Next up we need to define the syntax of Feature, which shouldn’t be surprising to you, as we do not
use any new concepts here.
Feature:
name=ID ':' type=TypeRef;

In many target platforms there is a difference between the notion of a simple attribute and a reference.
In relational databases, for instance, a table contains the values of attributes directly, but references are
“modeled” by the means of foreign keys. In object-relational persistence technologies such as JPA or
Hibernate references can have additional life-cycle semantics, which define what to do with referenced
objects in certain circumstances. Therefore you typically make a distinction between the two concepts,
as we do in the Domain Model Example you can create using the wizard. However, in this small scenario
we do not need a separation therefore we stick with one single concept for features.
As the type reference has an additional property for the multiplicity (many or one), we make it a separate
model element and parser rule TypeRef. The presence of the postfix ‘*’ in an input file should set the
boolean flag to ‘true’ that indicates a multi-value type reference in the AST model. This is the purpose
of the assignment operator ' ?='. The parser rule looks like this:
TypeRef :
referenced=[Type] (multi?='*')?;

Again we have specified a cross-reference, which this time not only allows to reference entities but all
instances of type. So far this only includes data type and entities, but you may want to introduce additional
concepts in a future version, like for instance a Value Object as proposed in Eric Evan’s great book
Domain Driven Design.
In the end your grammar editor should look like this:

Xtext 1.0 7
Getting Started

grammar org.eclipse.xtext.example.Domainmodel
with org.eclipse.xtext.common.Terminals

generate domainmodel "http://www.eclipse.org/xtext/example/Domainmodel"

DomainModel :
(elements+=Entity)*;

Type:
DataType | Entity;

DataType:
'datatype' name=ID;

Entity:
'entity' name=ID ('extends' superType=[Entity])? '{'
(features+=Feature)*
'}';

Feature:
name=ID ':' type=TypeRef;

TypeRef:
referenced=[Type] (multi?='*')?;

2.1.4. Generate Language Artifacts


Now that we have the grammar in place and defined we need to run the generator that will derive the
various language components. To do so locate the file GenerateDomainmodel.mwe2 file next to the
grammar file in the package explorer view. From its context menu, choose
Run As -> MWE2 Workflow.
That will trigger the Xtext language generator. It generates the parser and serializer and some additional
infrastructure code. You will see its logging messages in the Console View.

2.1.5. Run the Generated IDE Plug-in


If the code generation succeeded, we are now able to test the IDE integration. Right-click on the Xtext
project and choose
Run As -> Eclipse Application.
This will spawn a new Eclipse workbench with your projects as plug-ins installed. In the new workbench,
create a new project ( File -> New -> Project... -> General -> Project) and therein a new file with the
file extension you chose in the beginning ( *.dmodel). This will open the generated entity editor. Try it a
bit and discover the default functionality for code completion ( CTRL SPACE even for cross-references),
syntax highlighting, syntactic validation, linking errors, outline, find references etc.

Xtext 1.0 8
Getting Started

2.1.6. Adding a Namespace Concept


At the current stage, our language puts all types into the same scope. As cross-references are specified
by means of names, this can easily yield collisions as the model size increases. Avoiding such name
clashes is why most programming languages invent concepts like namespaces or modules. We will now
introduce the concept of a Package, which effectively is a namespace.
Our languages’s definition of a Package is as follows: A package has a name and contains other named
elements and namespace-imports. A named element is either another Package or a Type. Within a
package, elements must have unique names. The fully qualified name (FQN) of a named element consists
of the FQN of its package, a separator ' .' and its own simple name. Elements refer to each other using their
qualified names which may be relative to their common namespaces or absolute. By means of import
statements, these names can be further abbreviated.
That might sound a bit complicated at first but the good thing is that Xtext comes with exactly such
a default semantic for namespaces, which feels very natural in practice. Although all aspects of this
semantic are adaptable, we will stick to the defaults within this tutorial.
Before we extend the grammar, let’s have a look at how a the example model would look like when using
the newly introduced concepts:
package my.types {
datatype String
datatype Boolean
}

package my.entities {

import my.types.*

entity Session {
title: String
isTutorial : Boolean
}

entity Conference {
name : String
attendees : Person*
speakers : Speaker*
}

entity Person {
name : String
}

entity Speaker extends Person {


sessions : Session*
}
}

Let’s start to modify our grammar now. First we introduce the rule AbstractElement:
AbstractElement:
PackageDeclaration | Type;

DomainModel should now call AbstractElement instead of Type:


DomainModel:
(elements+=AbstractElement)*;

To allow qualified names, we need a new rule QualifiedName. This rule returns a string. But we do not
want this to be a terminal rule, as terminal rules have special semantics (they are used context-less in the
lexer, which often causes problems). Instead we want it to be a parser rule, but one which doesn’t return
an instance of an EClass from the referenced EPackage but just a plain String. Such rules are called data
type rules, because they return instances of EDataType as opposed to instances of EClass. String is the

Xtext 1.0 9
Getting Started

most often used EDataType in this context and is therefore the default return type for data type rules.
The rule looks like this:
QualifiedName:
ID ('.' ID)*;

Note, that Xtext automatically figures out that the rule is a data type rule rather then a normal parser rule,
because it doesn’t contain any assignments and all rule calls go to either other data type rules or terminal
rules. For a data type rule the parser simply concatenates the consumed text and returns it as a string. The
transformation to a user defined data type is done by so called ValueConverters. But as we use strings
we don’t need to care about ValueConverters here.
Next up we need to define the syntax of PackageDeclarations, which makes use of the QualifiedName
rule but doesn’t make use of anything we haven’t yet talked about.
PackageDeclaration:
'package' name=QualifiedName '{'
(elements+=AbstractElement)*
'}';

Having qualified names at hand, we want to specify cross-references that way, too. As mentioned by
default, Xtext assumes the rule ID as the syntax for cross-references, which has been fully sufficient so
far. But now we want to allow fully qualified names as well, so we explicitly specify the syntax rule
after a delimiter ' |':
fn.. Note, that the ' |' has nothing to do with the previously introduced alternative operator. In the context
of a cross-reference it is simply a delimiter.
Entity:
'entity' name=ID ('extends' superType=[Entity|QualifiedName])? '{'
(features+=Feature)*
'}';

and
TypeRef:
referenced=[Type|QualifiedName] (multi?='*')?;

As the last step, we introduce imports. An Import is an instance of AbstractElement, too, so that it can
occur as a child of DomainModel as well as of PackageDeclaration.
AbstractElement:
PackageDeclaration | Type | Import;

An imported namespace is not just a qualified name but it also allows an optional wildcard character '
*' at the end, so that multiple names can be imported with one statement. That requires a new data type
rule which we name QualifiedNameWithWildCard:
QualifiedNameWithWildCard:
QualifiedName '.*'?;

The declaration of the Import rule should look familiar by now:


Import:
'import' importedNamespace=QualifiedNameWithWildCard;

The default scoping implementation is based on naming conventions. First everything which has a name
is referenceable. By default something has a name if it has a property 'name'. If such an EAttribute
is available, the default implementation computes a qualified name by asking the container for its
name and concatenates the two names separated by a dot. The computation of qualified names can be
arbitrarily changed by implementing an IQualifedNameProvider. The other naming convention is that
if some element has an EAttribute ‘importedNamespace’ that value is used as a namespace import and
is automatically prefixed to any simple names used within that namespace. Also the asterisk ' *' is used
as a wildcard by default.

Xtext 1.0 10
Getting Started

That’s all for the grammar. It should now read as


grammar org.eclipse.xtext.example.Domainmodel
with org.eclipse.xtext.common.Terminals

generate domainmodel "http://www.eclipse.org/xtext/example/Domainmodel"

DomainModel:
(elements+=AbstractElement)*;

AbstractElement:
PackageDeclaration | Type | Import;

Import:
'import' importedNamespace=QualifiedNameWithWildCard;

PackageDeclaration:
'package' name=QualifiedName '{'
(elements+=AbstractElement)*
'}';

Type:
Entity | DataType;

DataType:
'datatype' name=ID;

Entity:
'entity' name=ID ('extends' superType=[Entity|QualifiedName])? '{'
(features+=Feature)*
'}';

Feature:
name=ID ':' type=TypeRef;

TypeRef:
referenced=[Type|QualifiedName] (multi?='*')?;

QualifiedName:
ID ('.' ID)*;

QualifiedNameWithWildCard:
QualifiedName '.*'?;

Now you should regenerate the language infrastructure as described in the previous section, and give the
editor a try. You can even split up your model into smaller parts and have cross-references across file
boundaries, as long as the referenced models are on the classpath.

2.2. Processing Xtext Models


Now that we know how to create a language we should talk about processing the parsed models somehow.
There are typically two useful things one can do with Xtext models: One is translating them to another
programming language, i.e. writing a code generator, the other is loading them at runtime and use them
dynamically. We’ll talk about code generation later. In this chapter we want to see how to interact with
Xtext models programmatically.
Text files parsed by Xtext are represented as object graphs in memory. We call these object graphs
Abstract Syntax Tree (AST), semantic model or simply model interchangeably. In Xtext models are
implemented using the Eclipse Modeling Framework (EMF), which can be seen as a very powerful
version of JavaBeans. It not only provides the typical getter and setter methods for the different features
of a model element but also comes with an long list of advanced concepts and semantics, which are
extremely useful in the context of Xtext.

Xtext 1.0 11
Getting Started

2.2.1. Loading a Resource


EMF models can be persisted by the means of a so called Resource. Xtext languages implement the
Resource interface which is why you can use the EMF API to load a model into memory (and also save
them):
new DomainmodelStandaloneSetup().createInjectorAndDoEMFRegistration();

ResourceSet rs = new ResourceSetImpl();


Resource resource = rs.getResource(URI.createURI("./mymodel.dmodel"), true);
EObject eobject = resource.getContents().get(0);

The first line initializes the language infrastructure to run in standalone mode. That is EMF is designed
to work in Eclipse and therefore makes use of Equinox extension points in order to register factories
and the like. In a vanilla Java project there is no Equinox, hence we do the registration programmaticly.
The generated MyLanguageStandaloneSetup class does just that. You don’t have to do this kind of
initialization when you run your plug-ins within Eclipse, since in that case the extension point declarations
are used.
The other thing the StandaloneSetup takes care of is creating a Guice injector. The use of Guice and
Dependency Injection is explained here.
Now that the language infrastructure is initialized and the different contributions to EMF are registered,
we want to load a Resource. To do so we first create a ResourceSet, which as the name suggests represents
a set of Resources. If one Resource references another Resource, EMF will automatically load that other
Resource into the same ResourceSet as soon as the cross-reference is resolved. Resolution of cross-
references is done lazy, i.e. on first access.
The 4th line loads the Resource using the resource set. We pass in a URI which points to the file in the
file system. EMF’s URI is a powerful concept. It supports a lot of different schemes to load resources
from file system, web sites, jars, OSGi bundles or even from Java’s classpath. And if that is not enough
you can come up with your own schemes. Also a URI can not only point to a resource but also to any
EObject in a resource. This is done by appending a so called URI fragment to the URI.
The second parameter denotes whether the resource should automatically be loaded if it wasn’t already
before. Alternatively we could have written
Resource resource = rs.getResource(URI.createURI("./mymodel.dmodel"), false);
resource.load(null);

The load method optionally takes a map of properties, which allows to define a contract between a client
and the specific implementation. In Xtext, for instance, we use the map to state whether cross-references
should be eagerly resolved. In order to find out what properties are supported, it’s usually best to look
into the concrete implementations. That said, in most cases you don’t need to pass any properties at all.
The last line
EObject eobject = resource.getContents().get(0);

assigns the root element to a local variable. Actually it is the first element from the contents list of a
Resource, but in Xtext a Resource always has just one root element.

2.2.2. Working with EMF Models


We have previously talked about Ecore models, which effectively is an EPackage containing a any
number of EClasses with EAttributes and EReferences. Ecore defines additional concepts but they are
not so important here. An EObject is an instance of an EClass. For instance, the root element of a domain
model would be an instance of the EClass DomainModel, as defined in the grammar in the last chapter.
EMF provides a reflection layer to work with EObjects in a generic way.
assertEquals("DomainModel", eobject.eClass().getName());

It is also possible to create new instances or get and set values using the reflection layer. That reflection
layer is very helpful when creating generic libraries for EMF models, however if you know what kind of
model you are working with it’s much nicer to program against the generated Java classes. As we know

Xtext 1.0 12
Getting Started

that the root element is an instance of DomainModel we can just cast it to the Java interface generated
by EMF:
DomainModel dm = (DomainModel) eobject;

The generated Java types provide getter and setter methods for each EAttribute and EReference, so that
you can easily navigate the model using Java:
EList<AbstractElement> elements = domainModel.getElements();
for (AbstractElement abstractElement : elements) {
if (abstractElement instanceof Entity) {
Entity entity = ((Entity)abstractElement);
System.out.println("entity "
+ entity.getName()
+ " extends "
+ entity.getSuperType().getName());
}
}

Note that you’ll find a lot of convenience API in EMF’s EcoreUtil and Xtext’s EcoreUtil2.

2.2.3. Working with the Parse Tree (aka Node Model)


In many situations the information from the AST is sufficient, but in some situations you need additional
syntactical information. In Xtext not only an AST is constructed while parsing but also a so called parse
tree, which contains all the textual information chunked in so called tokens. The parse tree, also called
node model, consists of two different kinds of nodes.
LeafNodes as the name suggests represent the leafs of the parse tree. Each leaf node represents one token.
If you go through all leaf nodes of a parse tree and concatenate all the tokens to one string, you’ll end up
with the whole textual representation of the model, including all hidden tokens such as whitespace and
comments. The following code does exactly that:
CompositeNode node = NodeUtil.getNode(domainModel);
Iterable<AbstractNode> contents = NodeUtil.getAllContents(node);
StringBuffer text = new StringBuffer();
for (AbstractNode abstractNode : contents) {
if (abstractNode instanceof LeafNode)
text.append(((LeafNode)abstractNode).getText());
}
System.out.println(text);

In addition to the text LeafNodes also hold information about the line, the offset and the length of that
token.
The other node type is called CompositeNode and is created for almost each grammar element. A
composite node can contain other composite nodes and leaf nodes. The super type of both node types is
AbstractNode. One can find a couple of convenience methods in NodeUtil and ParseTreeUtil.

2.2.4. Working with the Grammar


Also the grammar is represented as an EMF model and can be used in Java. In fact each node of the node
model references the element from the grammar which was responsible for parsing or lexing that node:
DomainModel domainModel = (DomainModel) eObject;
CompositeNode node = NodeUtil.getNode(domainModel);
ParserRule parserRule = (ParserRule) node.getGrammarElement();
assertEquals("DomainModel", parserRule.getName());

2.2.5. Working with Live Models


In a running Xtext Workbench, there are a number of components which access the semantic model of
an open editor, i.e. the parser, the linker, the validator, the outline, the index builder etc. While some of
these components are executed by the display thread, others like the parser or the indexer use different
concurrent threads to not deteriorate the editing experience. If you for example want to have a consistent

Xtext 1.0 13
Getting Started

outline of your model, it is essential to keep other threads from modifying the model while the outline
component reads it.
Many of the prominent locations where users can hook in their own code in Xtext are already called from
within a thread safe context, e.g. the API for quick fixes. Consequently, the following usually applies
only if you add additional functionality on top of Xtext, e.g. custom UI actions.
Each XtextEditor uses an IXtextDocument to store its model. To avoid synchronization trouble,
neither of them provides direct access to the XtextResource storing the semantic model. Instead, the
IXtextDocument has two methods readOnly() and modify(). Both take an argument of type IUnitOfWork
(<T>, IXtextResource) which defines a method <T> exec(IXtextResource) that contains what you want
to do with the model and allows to deliver a result of arbitrary type.
So here is an example of safely reading a model:
IXtextDocument myDocument = ...;
String rootElementName = myDocument.readOnly(
new IUnitOfWork(){
public String exec(IXtextResource resource) {
MyType type = (MyType)resource.getContents().get(0);
return myType.getName();
}
});

Direct write-access on the document is usually only performed inside the framework. If you want to
change a document by means of its semantic model, you should rather use an IDocumentEditor which
uses the modify() method internally but takes care of synchronizing the node model, too:
@Inject
private IDocumentEditor documentEditor;

public void setRootName(IXtextDocument myDocument,


final String newName) {
documentEditor.process(
new IUnitOfWork.Void() {
public void process(IXtextResource resource) {
MyType type = (MyType)resource.getContents().get(0);
myType.setName(newName);
}
}, myDocument);
}

2.2.6. Summary
Let’s summarize what we have learned: An Xtext model is loaded by an EMF Resource. The main model
is represented as an instance of so called EClasses which are themselves declared within Ecore models.
A parse tree is created as well, which effectively acts as a tracing model between the text, the AST and
the grammar. The following diagram illustrates the four different kinds of models.

Xtext 1.0 14
Getting Started

2.3. Writing a Code Generator


In the previous section we’ve seen how to process Xtext files in general (e.g. with Java). If you need
late binding, loading models dynamically and interpreting them is a good idea. If you do not need late
binding, code generation is a viable option which is widely used to make models executable.
You could of course use Java to do the code generation, but unfortunately Java doesn’t support this task
very well. Therefore we use a language specialized on code generation: M2T Xpand.

2.3.1. Xpand and MWE2


Xpand is also part of Eclipse and ships with its own documentation, so we don’t need to explain all the
details here, but rather want to point you to the existing documentation. Instead this chapter is about
providing some information about how to use Xtext and Xpand together.
The other technology we are going to use here is the Modeling Workflow Engine 2. It is used to describe
the generation workflow, that is how and which models should be loaded, how they should be validated
post-processed and where and how code generation should take place.
This section is based on the Getting Started section and uses the code generator project which is created
by Xtext’s project wizard.

2.3.2. The Empty Generator Project


Previously when we created the projects to develop the domain model language, we had
checked the option Create a generator project in the wizard, which as a result created a third
project for us in the workspace. If you’ve followed along you should have a project called
‘org.eclipse.xtext.example.domainmodel.generator’ in your workspace. It should look like shown in the
following screenshot:

2.3.3. Replacing the Example Model File (1)


The four files shown in the image above are very simple stubs generated for the ‘Hello World’-language
the wizard creates as a starting point. We’ll have to adapt them to match the domain model language we
have developed in the getting started section. The first thing we should do is open up the example model
file (1) and replace its contents with an instance of the domain model language. To do so copy and paste
the following into the editor:

Xtext 1.0 15
Getting Started

package java.lang {
datatype String
datatype Boolean
}

package my.entities {

import java.lang.*

entity Session {
title: String
isTutorial : Boolean
}

entity Conference {
name : String
attendees : Person*
speakers : Speaker*
}

entity Person {
name : String
}

entity Speaker extends Person {


sessions : Session*
}
}

2.3.4. The Modeling Workflow Engine File (2)


As mentioned MWE2 is used to define and configure the generation process. Actually, it is much more
general and can be useful for a lot of other things as well. Let’s have a look at the MWE2 file as generated
by the wizard:

Xtext 1.0 16
Getting Started

module workflow.DomainmodelGenerator

import org.eclipse.emf.mwe.utils.*

var targetDir = "src-gen"


var fileEncoding = "ISO-8859-1"
var modelPath = "src/model"

Workflow {

component = org.eclipse.xtext.mwe.Reader {
// lookup all resources on the classpath
// useJavaClassPath = true

// or define search scope explicitly


path = modelPath

// this class will be generated by the xtext generator


register = org.eclipse.xtext.example.DomainmodelStandaloneSetup {}
load = {
slot = "greetings"
type = "Greeting"
}
}

component = org.eclipse.xpand2.Generator {
expand = "templates::Template::main FOREACH greetings"
outlet = {
path = targetDir
}
fileEncoding = fileEncoding
}
}

The first line just declares a name for this workflow file. Any qualified Java identifier is allowed, but it
should match the file name of the MWE2 file (and in future versions we might make this mandatory). It
is followed by an import statement. MWE2 references Java classes and to make that convenient, you can
specify imports after the module declaration. Next up we see a couple of vars are declared. Such vars
can be overridden when invoking a workflow file.
The important part is the Workflow part. There we declare an instance of
org.eclipse.emf.mwe.utils.Workflow and add two instances to the component property. The first one is a
Reader which is used to initialize Xtext languages, read in Xtext files, and fetch specific elements from
those models in order to make them available to following workflow components (such as the declared
Generator).
Xtext language initialization is done by specifying any number of ISetup implementations. In our case the
generated DomainModelStandaloneSetup is registered, which makes sure that the infrastructure of your
domain model language is set up properly. If you have multiple languages just add additional assignments
for each language.
You have to tell the Reader which paths to scan for model files. In this particular case we just specified
one path. Another convenient option is to reuse the Java classpath, as suggested in the comments.
The load section specifies which elements to fetch from the loaded resources. In this case we state that
we want all elements of type Greeting. Note that this is completely file agnostic, it will provide you with
all elements from all files the Reader could find on the specified paths. The slot name is the name by
which other workflow components can access the stored elements.
The second workflow component is an instance of org.eclipse.xpand2.Generator, which is the MWE2
facade to the Xpand template language. The Xpand generator needs to know which template to invoke
for which models. Qualified names in Xpand are separated by a double colon '::'. That is the name
'templates::Template::main' points to the definition main in the file templates/Template.xpt on the Java
classpath. The second part FOREACH greetings references the greetings slot which has previously been
populated by the reader component.

Xtext 1.0 17
Getting Started

An Outlet describes where to put the generated sources. In Xpand there is a file-statement which refers
to outlets. If you only have one outlet you don’t have to give it a name, but you need to declare where
the root folder for that outlet can be found in the file system. I.e. you specifiy where the generated code
should go. Oulets allow you to specify a couple of other things as well. As MWE2 just instantiates Java
objects, you can go to the Java code of org.eclipse.xpand2.output.Outlet in order to find out (see the adder
and setter methods). about the different options.
Now that we understood the most important things in the workflow we have to adapt it to match our
domain model language. We only have to change the type which is used to fetch and filter the elements
in the load section. We want to generate code for entities, therefore we change it to Entity. We should
change the slot’s name, too:
module workflow.DomainmodelGenerator

import org.eclipse.emf.mwe.utils.*

var targetDir = "src-gen"


var fileEncoding = "ISO-8859-1"
var modelPath = "src/model"

Workflow {

component = org.eclipse.xtext.mwe.Reader {
// lookup all resources on the classpath
// useJavaClassPath = true

// or define search scope explicitly


path = modelPath

// this class will be generated by the xtext generator


register = org.eclipse.xtext.example.DomainmodelStandaloneSetup {}
load = {
slot = "entities" //changed to entities
type = "Entity" //changed to Entity
}
}

component = org.eclipse.xpand2.Generator {
expand = "templates::Template::main FOREACH entities" //changed to entities
outlet = {
path = targetDir
}
fileEncoding = fileEncoding
}
}

2.3.5. Using Xpand (3) and Xtend (4) for Code Generation
The MWE2 file invokes an Xpand definition called main located in templates/Template.xpt. Please open
that file and replace its contents with the following Xpand code:

Xtext 1.0 18
Getting Started

«IMPORT org::eclipse::xtext::example::domainmodel»
«EXTENSION templates::Extensions»

«DEFINE main FOR Entity-»


«FILE qualifiedName().replaceAll("\\.","/")+".java"-»
package «packageName()»;

public class «name» {


«EXPAND property FOREACH features»
}
«ENDFILE»
«ENDDEFINE»

«DEFINE property FOR Feature»


private «type.referenced.qualifiedName()» «name»;

public void set«name.toFirstUpper()»(


«type.referenced.qualifiedName()» «name») {
this.«name» = «name»;
}
public «type.referenced.qualifiedName()» get«name.toFirstUpper()»() {
return «name»;
}
«ENDDEFINE»

You might get a couple of error markers, because the Xpand file references another file ( Extensions.ext
(4)) which has not yet been updated. Let’s ignore this for a moment and have a look at the general structure
of an Xpand template file. In the first line we import the namespace (i.e. Java package) of the generated
AST classes, that is Entity and Feature, we want to refer to in the generator. Next up the previously
mentioned Xtend file is imported. An Xtend file defines functions which can be used in Xpand.
The rest of the file contains two so called definitions. A definition in Xpand is similar to a function in that
it can be called, it has a name and it is defined for one or more arguments. The general syntax is as follows:
«DEFINE name(ArgType1 arg1, ArgType2 arg2) FOR MainArgType»

Where MainArgType is bound to the variable name this which can like in Java be omitted when referring
to.
The first definition main is defined for Entity and is the one invoked from the previously discussed MWE2
file (2). The first thing the definition does is opening a file using the qualifiedName() of the given Entity.
The function qualifiedName() should be declared in the referenced Xtend file (4). To do so open that file
and replace its current contents by the following snippet:
import org::eclipse::xtext::example::domainmodel;

packageName(Type this) :
qualifiedName(eContainer());

packageName(Void this) : null;

String qualifiedName(Object this) : null;

String qualifiedName(Type this) :


packageName()==null? name : packageName()+"."+name;

String qualifiedName(PackageDeclaration this) :


if qualifiedName(eContainer())==null
then name
else qualifiedName(eContainer())+'.'+name;

It defines the qualifiedName() function not only for Entity but also generally for types. Note how it
computes the qualified name by calling qualifiedName() recursively on its eContainer(). Both Xpand and
Xtend are based on the same expression language, which is statically typed and allows for very convenient
navigation over object graphs and collections.

Xtext 1.0 19
Getting Started

Back to the Xpand file (3) we see some static text which goes directly into the opened file. The EXPAND
statement calls the other definition defined in this file. It will effectively generate a Java field and two
accessor-methods for each Feature.
You’ll now able able to run the generator by opening the context menu on the *.mwe2 file (2) and choose
Run As->MWE2 Workflow. For more information on Xpand and Xtend see the corresponding language
documentation (available through Eclipse Help). MWE2 is explained in detail in a later section.

Xtext 1.0 20
Chapter 3. The Grammar Language
The grammar language is the corner stone of Xtext. It is a domain-specific language, carefully designed
for the description of textual languages. The main idea is to describe the concrete syntax and how it is
mapped to an in-memory model created during parsing.

3.1. A First Example


To get an idea of how it works we’ll start by implementing a simple example introduced by Martin Fowler.
It’s mainly about describing state machines used as the (un)lock mechanism of secret compartments.
People who have secret compartments control their access in a very old-school way, e.g. by opening the
door first and turning on the light afterwards. Then the secret compartment, for instance a panel, opens
up. One of those state machines could look like this:
events
doorClosed D1CL
drawOpened D2OP
lightOn L1ON
doorOpened D1OP
panelClosed PNCL
end

resetEvents
doorOpened
end

commands
unlockPanel PNUL
lockPanel PNLK
lockDoor D1LK
unlockDoor D1UL
end

state idle
actions {unlockDoor lockPanel}
doorClosed => active
end

state active
drawOpened => waitingForLight
lightOn => waitingForDraw
end

state waitingForLight
lightOn => unlockedPanel
end

state waitingForDraw
drawOpened => unlockedPanel
end

state unlockedPanel
actions {unlockPanel lockDoor}
panelClosed => idle
end

So, we have a bunch of declared events, commands, and states. Within states there are references
to declared actions, which should be executed when entering such a state. Also there are transitions
consisting of a reference to an event and a state. Please read Martin’s description if you want to read
the whole story.
In order to implment this little language with Xtext, you need to write the following grammar:

Xtext 1.0 21
The Grammar Language

grammar my.pack.SecretCompartments
with org.eclipse.xtext.common.Terminals

generate secretcompartment "http://www.eclipse.org/secretcompartment"

Statemachine :
'events'
(events+=Event)+
'end'
('resetEvents'
(resetEvents+=[Event])+
'end')?
'commands'
(commands+=Command)+
'end'
(states+=State)+;

Event :
name=ID code=ID;

Command :
name=ID code=ID;

State :
'state' name=ID
('actions' '{' (actions+=[Command])+ '}')?
(transitions+=Transition)*
'end';

Transition :
event=[Event] '=>' state=[State];

Martin uses this example throughout his book “Domain-Specific Languages” to implement external and
internal DSLs using different technologies. Note, that none of his implmentations is nearly as readable
and concise as the description in Xtext’s grammar language above. That is of course because the grammar
language is designed to do just that, i.e. it is specific to the domain of language descriptions.

3.2. The Syntax


In the following the different concepts and syntactical constructs of the grammar language are explained.

3.2.1. Language Declaration


The first line in every grammar ...
grammar my.pack.SecretCompartments with org.eclipse.xtext.common.Terminals
declares the name of the grammar. Xtext leverages Java’s classpath mechanism. This means that the name
can be any valid Java qualifier. The file name needs to correspond to the grammar name and have the
file extension ' .xtext'. This means that the name has to be SecretCompartments.xtext and must be placed
in a package my.pack somewhere on your project’s classpath.
The first line is also used to declare any used language (for mechanism details see Grammar Mixins).

3.2.2. EPackage Declarations


Xtext parsers create in-memory object graphs while parsing text. Such object-graphs are instances of
EMF Ecore models. An Ecore model basically consists of an EPackage containing EClasses, EDataTypes
and EEnums (See the section on EMF for more details). Xtext can infer Ecore models from a grammar
(see Ecore model inference) but it is also possible to reuse existing Ecore models. You can even mix this
and use multiple existing Ecore models and infer some others from one grammar.

3.2.2.1. EPackage Generation


The easiest way to get started is to let Xtext infer the Ecore model from your grammar. This is what is
done in the secret compartment example. To do so just state:

Xtext 1.0 22
The Grammar Language

generate secretcompartment 'http://www.eclipse.org/secretcompartment'


That statement means: generate an EPackage with the name “secretcompartment” and the nsURI “http://
www.eclipse.org/secretcompartment”. Actually these are the properties that are required to create an
EPackage. Xtext will then add EClasses with properties ( EAttributes and EReferences) for the different
rules, as described in Ecore model inference.

3.2.2.2. EPackage Import


If you already have an existing EPackage, you can import it using either a namespace URI or a resource
URI. An URI (Uniform Resource Identifier) provides a simple and extensible means for identifying an
abstract or physical resource. For all the niftx details about EMF URIs please see its documentation.

Using Resource URIs to Import Existing EPackages


In order to import an existing Ecore model, you’ll have to have the *.ecore file describing the EPackage
you want to use somewhere in your workspace. In order to refer to that file you make use of the platform:/
resource scheme. Platform URIs are a special EMF concept, which allow to reference elements in the
workspace independent of the location of the workspace.
An import statement referring to an Ecore file by a platform:/resource/-URI looks like this:
import 'platform:/resource/my.project/src/my/pack/SecretCompartments.ecore'
If you want to mix generated and imported Ecore models you’ll have to configure the generator fragment
in your MWE file responsible for generating the Ecore classes ( EcoreGeneratorFragment) with resource
URIs that point to the generator models of the referenced Ecore models.
The *.genmodel provides all kind of generator configuration used by EMF’s code generator. Xtext will
automatically create a *.genmodel for generated ecore models, but if such a model references an existing
imported Ecore model, the code generator needs to know how that code was generated in order to generate
valid references in Java for the new Ecore model.
Example:
fragment = org.eclipse.xtext.generator.ecore.EcoreGeneratorFragment {
genModels =
"platform:/resource/my.project/src/my/pack/SecretCompartments.genmodel"
}

Using Classpath URIs to Import Existing EPackages


We like to leverage Java’s classpath mechanism, because besides that it is well understood and designed
it allows us to package libraries as jars. If you want to reference an *.ecore file which is contained in a
jar, you can make use of the classpath scheme we’ve introduced. For instance if you want to reference
Java elements, you can use the JvmType Ecore model which is shipped as part of Xtext.
Example:
import 'classpath:/model/JvmTypes.ecore' as types
As with platform resource URIs you’ll also have to tell the generator where the corresponding *.genmodel
can be found:
fragment = org.eclipse.xtext.generator.ecore.EcoreGeneratorFragment {
genModels =
"classpath:/model/JvmTypes.genmodel"
}

See the section on Referring Java Types for a full explanation of this useful feature.

Using Namespace URIs to Import Existing EPackages


You can also use nsURI in order to import existing EPackage. Note that this is generally not preferrable,
because you’ll have to have the corresponding EPackage installed in the workbench. There’s mainly just
one exception, that is Ecore itself. So if you refer to Ecore it is best to use its nsURI :
import "http://www.eclipse.org/emf/2002/Ecore" as ecore

3.2.2.3. Ecore Model Aliases for EPackages


If you want to use multiple EPackages you need to specify aliases in the following way:

Xtext 1.0 23
The Grammar Language

generate secretcompartment 'http://www.eclipse.org/secretcompartment'


import 'http://www.eclipse.org/anotherPackage' as another

When referring to a type somewhere in the grammar you need to qualify the reference using that alias
(example another::CoolType). We’ll see later where such type references occur.
It is also supported to put multiple EPackage imports into one alias. This is no problem as long as there
are not any two EClassifiers with the same name. In such cases none of them can be referenced. It is
even possible to import multiple and generate one Ecore model and all of them are declared for the same
alias. If you do so, for a reference to an EClassifier first the imported EPackages are scanned before it is
assumed that a type needs to by generated into the to-be-generated package.
Note, that using this feature is not recommended, because it might cause problems, which are hard to track
down. For instance, a reference to classA would as well be linked to a newly created EClass, because the
corresponding type in http://www.eclipse.org/packContainingClassA is spelled with a capital letter.

3.2.3. Rules
Basically parsing can be separated in the following phases.
1. lexing
2. parsing
3. linking
4. validation

3.2.3.1. Terminal Rules


In the first stage called lexing, a sequence of characters (the text input) is transformed into a sequence of
so called tokens. In this context, a token is sort of a strongly typed part of the input sequence. It consists of
one or more characters and was matched by a particular terminal rule or keyword and therefore represents
an atomic symbol. Terminal rules are also referred to as token rules or lexer rules. There is an informal
naming convention that names of terminal rules are all upper-case.
In the secret compartments example there are no explicitly defined terminal rules, since it only uses the
ID rule which is inherited from the grammar org.eclipse.xtext.common.Terminals (cf. Grammar Mixins).
Therein the ID rule is defined as follows:
terminal ID :
('^')?('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;

It says that a token ID starts with an optional '^' character (caret), followed by a letter (‘a’..‘z’|‘A’..‘Z’)
or underscore (‘_’) followed by any number of letters, underscores and numbers (‘0’..‘9’).
The caret is used to escape an identifier for cases where there are conflicts with keywords. It is removed
by the ID rule’s ValueConverter.
This is the formal definition of terminal rules:
TerminalRule :
'terminal' name=ID ('returns' type=TypeRef)? ':'
alternatives=TerminalAlternatives ';'
;

Note, that the order of terminal rules is crucial for your grammar, as they may hide each other. This is
especially important for newly introduced rules in connection with mixed rules from used grammars.
If you for instance want to add a rule to allow fully qualified names in addition to simple IDs, you should
implement it as a data type rule, instead of adding another terminal rule.

Return Types
A terminal rule returns a value, which is a string (type ecore::EString) by default. However, if you want
to have a different type you can specify it. For instance, the rule INT is defined as:
terminal INT returns ecore::EInt :
('0'..'9')+;

Xtext 1.0 24
The Grammar Language

This means that the terminal rule INT returns instances of ecore::EInt. It is possible to define any kind
of data type here, which just needs to be an instance of ecore::EDataType. In order to tell the parser
how to convert the parsed string to a value of the declared data type, you need to provide your own
implementation of IValueConverterService (cf. value converters). The value converter is also the point
where you can remove things like quotes from string literals or the caret (‘^’) from identifiers. Its
implementation needs to be registered as a service (cf. Service Framework).

3.2.3.2. Extended Backus-Naur Form Expressions


Token rules are described using “Extended Backus-Naur Form”-like (EBNF) expressions. The different
expressions are described in the following. The one thing all of these expressions have in common is the
cardinality operator. There are four different possible cardinalities
1. exactly one (the default, no operator)
2. one or none (operator ?)
3. any (zero or more, operator *)
4. one or more (operator +)

Keywords / Characters
Keywords are a kind of token rule literals. The ID rule in org.eclipse.xtext.common.Terminals for instance
starts with a keyword:
terminal ID : '^'? .... ;
The question mark sets the cardinality to “none or one” (i.e. optional) like explained above.
Note that a keyword can have any length and contain arbitrary characters.
The following standard Java notations for special characters are allowed: \n, \r, \t, \b, and \f. We currently
don’t support quoted unicode character notation, as \u123.

Character Ranges
A character range can be declared using the ‘..’ operator.
Example:
terminal INT returns ecore::EInt: ('0'..'9')+;
In this case an INT is comprised of one or more (note the + operator) characters between (and including)
'0' and '9'.

Wildcard
If you want to allow any character you can simple write the wildcard operator ' .' (dot): Example:
FOO : 'f' . 'o';
The rule above would allow expressions like ‘foo’, ‘f0o’ or even ‘f\no’.

Until Token
With the until token it is possible to state that everything should be consumed until a certain token occurs.
The multi-line comment is implemented this way:
terminal ML_COMMENT : '/*' -> '*/';
This is the rule for Java-style comments that begin with ' /*' and end with ' */'.

Negated Token
All the tokens explained above can be inverted using a preceding exclamation mark:
terminal BETWEEN_HASHES : '#' (!'#')* '#';

Rule Calls
Rules can refer to other rules. This is done by writing the name of the rule to be called. We refer to this
as rule calls. Rule calls in terminal rules can only point to terminal rules.
Example:
terminal DOUBLE : INT '.' INT;

Xtext 1.0 25
The Grammar Language

Alternatives
Using alternatives one can state multiple different alternatives. For instance, the whitespace rule uses
alternatives like this:
terminal WS : (' '|'\t'|'\r'|'\n')+;
That is a WS can be made of one or more whitespace characters (including ‘ ’, ' \t', ' \r', ' \n').

Groups
Finally, if you put tokens one after another, the whole sequence is referred to as a group. Example:
terminal ASCII : '0x' ('0'..'7') ('0'..'9'|'A'..'F');
That is the 2-digit hexadecimal code of ASCII characters.

3.2.4. Parser Rules


The parser reads a sequence of terminals and walks through the parser rules. Hence a parser rule - contrary
to a terminal rule – does not produce a single terminal token but a tree of non-terminal and terminal
tokens. They lead to a so called parse tree (in Xtext it is also referred as node model). Furthermore, parser
rules are handled as kind of a building plan for the creation of the EObjects that form the semantic model
(the linked abstract syntax graph or AST). Due to this fact, parser rules are even called production rules.
The different constructs like actions and assignments are used to derive types and initialize the semantic
objects accordingly.

3.2.4.1. Extended Backus-Naur Form Expressions


Not all the expressions that are available in terminal rules can be used in parser rules. Character ranges,
wildcards, the until token and the negation are only available for terminal rules.
The elements that are available in parser rules as well as in terminal rules are
1. groups,
2. alternatives,
3. keywords and
4. rule calls.
In addition to these elements, there are some expressions used to direct how the AST is constructed,
which are listed and explained in the following.

Assignments
Assignments are used to assign the parsed information to a feature of the current object. The type of the
current object, its EClass, is specified by the return type of the parser rule. If it is not explicitly stated it
is implied that the type’s name equals the rule’s name. The type of the feature is inferred from the right
hand side of the assignment.
Example:
State :
'state' name=ID
('actions' '{' (actions+=[Command])+ '}')?
(transitions+=Transition)*
'end'
;

The syntactic declaration for states in the state machine example starts with a keyword state followed
by an assignment:
name=ID
The left hand side refers to a feature ' name' of the current object (which has the EClass ' State' in this
case). The right hand side can be a rule call, a keyword, a cross-reference (explained later) or even an
alternative comprised by the former. The type of the feature needs to be compatible with the type of the
expression on the right. As ID returns an EString in this case, the feature ' name' needs to be of type
EString as well.
Assignment Operators
There are three different assignment operators, each with different semantics.

Xtext 1.0 26
The Grammar Language

1. The simple equal sign ' =' is the straight forward assignment, and used for features which take only
one element.
2. The ' +=' sign (the add operator) expects a multi-valued feature and adds the value on the right hand
to that feature, which is a list feature.
3. The ' ?=' sign (boolean assignment operator) expects a feature of type EBoolean and sets it to true if
the right hand side was consumed independently from the concrete value of the right hand side.
The used assignment operator does not effect the cardinality of the expected symbols on the right hand
side.

Cross-References
A unique feature of Xtext is the ability to declare crosslinks in the grammar. In traditional compiler
construction the crosslinks are not established during parsing but in a later linking phase. This is the same
in Xtext, but we allow to specify crosslink information in the grammar. This information is used by the
linker. The syntax for crosslinks is:
CrossReference :
'[' type=TypeRef ('|' ^terminal=CrossReferenceableTerminal )? ']'
;

For example, the transition is made up of two cross-references, pointing to an event and a state:
Transition :
event=[Event] '=>' state=[State]
;

It is important to understand that the text between the square brackets does not refer to another rule, but
to a type! This is sometimes confusing, because one usually uses the same name for the rules and the
returned types. That is if we had named the type for events differently like in the following the cross-
reference needs to be adapted as well:
Transition :
event=[MyEvent] '=>' state=[State]
;

Event returns MyEvent : ....;

Looking at the syntax definition of cross-references, there is an optional part starting with a vertical bar
(pipe) followed by CrossReferenceableTerminal. This is the part describing the concrete text, from which
the crosslink later should be established. If the terminal is omitted, it is expected to be ID.
You may even use alternatives as the referencable terminal. This way, either an ID or a STRING may be
used as the referencable terminal, as it is possible in many SQL dialects.
TableRef: table=[Table|(ID|STRING)];
Have a look at the linking section in order to understand how linking is done.

Unordered Groups
The elements of an unordered group can occur in any order but each element must appear once. Unordered
groups are separated with ' &', e.g.
Modifier:
static?='static'? & final?='final'? & visibility=Visibility;

enum Visibility:
PUBLIC='public' | PRIVATE='private' | PROTECTED='protected';

allows
public static final
static protected
final private static
public

Xtext 1.0 27
The Grammar Language

but not
static final static // ERROR: static appears twice
public static final private // ERROR: visibility appears twice
final // ERROR: visibility is missing

Note that if you want an element of an unordered group to appear once or not at all, you have to choose
a cardinality of ' ?'. In the example, the visibility is mandatory, while ' static' or ' final' are optional.
Elements with a cardinality of ' *' or ' +' have to appear continuously without interruption, i.e.
Rule:
values+=INT* & name=ID;

will parse
0 8 15 x
x 0 8 15

but not
0 x 8 15 // wrong, as values cannot be interrupted.

Simple Actions
By default the object to be returned by a parser rule is created lazily on the first assignment. Then the type
of the EObject to be created is determined from the specified return type or the rule name if no explicit
return type is specified. With Actions however, the creation of returned EObject can be made explicit.
Xtext supports two kinds of Actions:
1. simple actions, and
2. assigned actions.
If at some point you want to enforce the creation of a specific type you can use alternatives or simple
actions. In the following example TypeB must be a subtype of TypeA. An expression A ident should create
an instance of TypeA, whereas B ident should instantiate TypeB.
Example with alternatives:
MyRule returns TypeA :
"A" name=ID |
MyOtherRule
;

MyOtherRule returns TypeB :


"B" name = ID
;

Example with simple actions:


MyRule returns TypeA :
"A" name=ID |
"B" {TypeB} name=ID
;

Generally speaking, the instance is created as soon as the parser hits the first assignment. However,
actions allow to explicitly instantiate any EObject. The notation {TypeB} will create an instance of TypeB
and assign it to the result of the parser rule. This allows parser rules without any assignment and object
creation without the need to introduce unnecessary rules.

Unassigned Rule Calls


We previously explained, that the EObject to be returned is created lazily when the first assignment occurs
or when a simple action is evaluated. There is another way one can set the EObject to be returned, which
we call an “unassigned rule call”.

Xtext 1.0 28
The Grammar Language

Unassigned rule calls (the name suggests it) are rule calls to other parser rules, which are not used within
an assignment. If there is no feature the returned value shall be assigned to, the value is assigned to the
“to-be-returned” result of the calling rule.
With unassigned rule calls one can, for instance, create rules which just dispatch between several other
rules:
AbstractToken :
TokenA |
TokenB |
TokenC
;

As AbstractToken could possibly return an instance of TokenA, TokenB or TokenC its type must by a
super type of these types. It is now for instance as well possible to further change the state of the AST
element by assigning additional things.
Example:
AbstractToken :
( TokenA |
TokenB |
TokenC ) (cardinality=('?'|'+'|'*'))?
;

This way the cardinality is optional (last question mark) and can be represented by a question mark, a
plus, or an asterisk. It will be assigned to either an EObject of type TokenA, TokenB, or TokenC which are
all subtypes of AbstractToken. The rule in this example will never create an instance of AbstractToken
directly as long as the preceeding TokenX rule call returns an element.

Assigned Actions
LL-parsing has some significant advantages over LR algorithms. The most important ones for Xtext are,
that the generated code is much simpler to understand and debug and that it is easier to recover from
errors. Especially ANTLR has a very nice generic error recovery mechanism. This allows to construct
an AST even if there are syntactic errors in the text. You wouldn’t get any of the nice IDE features as
soon as there is one little error, if we hadn’t error recovery.
However, LL also has some drawbacks. The most important one is that it does not allow left recursive
grammars. For instance, the following is not allowed in LL-based grammars, because Expression ‘+’
Expression is left recursive:
Expression :
Expression '+' Expression |
'(' Expression ')' |
INT
;

Instead one has to rewrite such things by “left-factoring” it:


Expression :
TerminalExpression ('+' TerminalExpression)?
;

TerminalExpression :
'(' Expression ')' |
INT
;

In practice this is always the same pattern and therefore not that problematic. However, by simply
applying the Xtext AST construction features we’ve covered so far, a grammar ...

Xtext 1.0 29
The Grammar Language

Expression :
{Operation} left=TerminalExpression (op='+' right=TerminalExpression)?
;

TerminalExpression returns Expression:


'(' Expression ')' |
{IntLiteral} value=INT
;

... would result in unwanted elements in the AST. For instance the expression (42) would result in a tree
like this:
Operation {
left=Operation {
left=IntLiteral {
value=42
}
}
}

Typically one would only want to have one instance of IntLiteral instead.
One can solve this problem using a combination of unassigned rule calls and assigned actions:
Expression :
TerminalExpression ({Operation.left=current}
op='+' right=Expression)?
;

TerminalExpression returns Expression:


'(' Expression ')' |
{IntLiteral} value=INT
;

In the example above {Operation.left=current} is a so called tree rewrite action, which creates a new
instance of the stated EClass ( Operation in this case) and assigns the element currently to-be-returned
( current variable) to a feature of the newly created object (in this case feature left of the Operation
instance). In Java these semantics could be expressed as:
Operation temp = new Operation();
temp.setLeft(current);
current = temp;

3.2.5. Hidden Terminal Symbols


Because parser rules describe not a single token, but a sequence of patterns in the input, it is necessary
to define the interesting parts of the input. Xtext introduces the concept of hidden tokens to handle
semantically unimportant things like whitespaces, comments, etc. in the input sequence gracefully. It
is possible to define a set of terminal symbols, that are hidden from the parser rules and automatically
skipped when they are recognized. Nevertheless, they are transparently woven into the node model, but
not relevant for the semantic model.
Hidden terminals may (or may not) appear between any other terminals in any cardinality. They can be
described per rule or for the whole grammar. When reusing a single grammar its definition of hidden
tokens is reused as well. The grammar org.eclipse.xtext.common.Terminals comes with a reasonable
default and hides all comments and whitespace from the parser rules.
If a rule defines hidden symbols, you can think of a kind of scope that is automatically introduced. Any
rule that is called from the declaring rule uses the same hidden terminals as the calling rule, unless it
defines other hidden tokens itself.

Xtext 1.0 30
The Grammar Language

Person hidden(WS, ML_COMMENT, SL_COMMENT):


name=Fullname age=INT ';'
;

Fullname:
(firstname=ID)? lastname=ID
;

The sample rule “Person” defines multiline comments (_ML_COMMENT_), single-line comments
(_SL_COMMENT_), and whitespace ( WS) to be allowed between the Fullname and the age. Because
the rule Fullname does not introduce another set of hidden terminals, it allows the same symbols to appear
between firstname and lastname as the calling rule Person. Thus, the following input is perfectly valid
for the given grammar snippet:
John /* comment */ Smith // line comment
/* comment */
42 ; // line comment

A list of all default terminals like WS can be found in section Grammar Mixins.

3.2.6. Data Type Rules


Data type rules are parsing-phase rules, which create instances of EDataType instead of EClass. Thinking
about it, one may discover that they are quite similar to terminal rules. However, the nice thing about
data type rules is that they are actually parser rules and are therefore
1. context sensitive and
2. allow for use of hidden tokens.
If you, for instance, want to define a rule to consume Java-like qualified names (e.g. “foo.bar.Baz”) you
could write:
QualifiedName :
ID ('.' ID)*
;

In contrast to a terminal rule this is only valid in certain contexts, i.e. it won’t conflict with the rule ID.
If you had defined it as a terminal rule, it would possibly hide the ID rule.
In addition when this has been defined as a data type rule, it is allowed to use hidden tokens (e.g. “/*
comment **/”) between the IDs and dots (e.g. foo/* comment */. bar . Baz).
Return types can be specified in the same way as in terminal rules:
QualifiedName returns ecore::EString :
ID ('.' ID)*
;

Note that if a rule does not call another parser rule and does neither contain any actions nor assignments,
it is considered to be a data type rule and the data type EString is implied if none has been explicitly
declared. You have to import Ecore with the alias ecore in this case.
For conversion again value converters are responsible (cf. value converters).

3.2.7. Enum Rules


Enum rules return enumeration literals from strings. They can be seen as a shortcut for data type rules
with specific value converters. The main advantage of enum rules is their simplicity, type safety and
therefore nice validation. Furthermore it is possible to infer enums and their respective literals during the
Ecore model transformation.
If you want to define a ChangeKind org.eclipse.emf.ecore.change/model/Change.ecore with ADD,
MOVE and REMOVE you could write:
enum ChangeKind :
ADD | MOVE | REMOVE
;

Xtext 1.0 31
The Grammar Language

It is even possible to use alternative literals for your enums or reference an enum value twice:
enum ChangeKind :
ADD = 'add' | ADD = '+' |
MOVE = 'move' | MOVE = '->' |
REMOVE = 'remove' | REMOVE = '-'
;

Please note, that Ecore does not support unset values for enums. If you formulate a grammar like
Element: "element" name=ID (value=SomeEnum)?;
with the input of
element Foo
the resulting value of the element Foo will hold the enum value with the internal representation of
‘0’ (zero). When generating the EPackage from your grammar this will be the first literal you define. As
a workaround you could introduce a dedicated none-value or order the enums accordingly. Note that it
is not possible to define an enum literal with an empty textual representation.
enum Visibility:
package | private | protected | public
;

You can overcome this by modifying the infered Ecore model through a model to model transformation.

3.3. Ecore Model Inference


The Ecore model (or meta model) of a textual language describes the structure of its abstract syntax trees
(AST).
Xtext uses Ecore’s EPackages to define Ecore models. Ecore models are declared to be either inferred
(generated) from the grammar or imported. By using the generate directive, one tells Xtext to derive an
EPackage from the grammar.

3.3.1. Type and Package Generation


Xtext creates
• an EPackage
• for each generate-package declaration. After the directive generate a list of parameters follows.
The name of the EPackage will be set to the first parameter, its nsURI to the second parameter.
An optional alias as the third parameter allows to distinguish generated EPackages later. Only one
generated package declaration per alias is allowed.
• an EClass
• for each return type of a parser rule. If a parser rule does not define a return type, an implicit one
with the same name as the rule itself is assumed. You can specify more than one rule that return the
same type but only one EClass will be generated.
• for each type defined in an action or a cross-reference.
• an EEnum
• for each return type of an enum rule.
• an EDataType
• for each return type of a terminal rule or a data type rule.
All EClasses, EEnums, and EDataTypes are added to the EPackage referred to by the alias provided in
the type reference they were created from.

3.3.2. Feature and Type Hierarchy Generation


While walking through the grammar, the algorithm keeps track of a set of the currently possible return
types to add features to.
• Entering a parser rule the set contains only the return type of the rule.

Xtext 1.0 32
The Grammar Language

• Entering a group in an alternative the set is reset to the same state it was in when entering the first
group of this alternative.
• Leaving an alternative the set contains the union of all types at the end of each of its groups.
• After an optional element, the set is reset to the same state it was before entering it.
• After a mandatory (non-optional) rule call or mandatory action the set contains only the return type
of the called rule or action.
• An optional rule call does not modify the set.
• A rule call is optional, if its cardinality is ' ?' or ' *'.
While iterating the parser rules Xtext creates
• an EAttribute in each current return type
• of type EBoolean for each feature assignment using the ' ?=' operator. No further EReferences or
EAttributes will be generated from this assignment.
• for each assignment with the ' =' or ' +=' operator calling a terminal rule. Its type will be the return
type of the called rule.
• an EReference in each current return type
• for each assignment with the ' =' or ' +=' operator in a parser rule calling a parser rule. The
EReference's type will be the return type of the called parser rule.
• for each assigned action. The reference’s type will be set to the return type of the current calling rule.
Each EAttribute or EReference takes its name from the assignment or action that caused it. Multiplicities
will be 0...1 for assignments with the ' =' operator and 0...* for assignments with the ' +=' operator.
Furthermore, each type that is added to the currently possible return types automatically extends the
current return type of the parser rule. You can specify additional common super types by means of
“artificial” parser rules, that are never called, e.g.
CommonSuperType:
SubTypeA | SubTypeB | SubTypeC;

3.3.3. Enum Literal Generation


For each alternative defined in an enum rule, the transformer creates an enum literal, when another literal
with the same name cannot be found. The literal property of the generated enum literal is set to the right
hand side of the declaration. If it is omitted, you will get an enum literal with equal name and literal
attributes.
enum MyGeneratedEnum:
NAME = 'literal' | EQUAL_NAME_AND_LITERAL;

3.3.4. Feature Normalization


In the next step the generator examines all generated EClasses and lifts up similar features to super types
if there is more than one subtype and the feature is defined in every subtypes. This does even work for
multiple super types.

3.3.5. Customized Post Processing


As a last step, the generator invokes the post processor for every generated Ecore model. The post
processor expects an Xtend file with name MyDslPostProcessor.ext (if the name of the grammar file is
MyDsl.xtext) in the same folder as the grammar file. Further, for a successful invocation, the Xtend file
must declare an extension with signature process(xtext::GeneratedMetamodel). E.g.
process(xtext::GeneratedMetamodel this) :
process(ePackage)
;

process(ecore::EPackage this) :
... do something
;

Xtext 1.0 33
The Grammar Language

The invoked extension can then augment the generated Ecore model in place. Some typical use cases
are to:
• set default values for attributes,
• add container references as opposites of existing containment references, or
• add operations with implementation using a body annotation.
Great care must be taken to not modify the Ecore model in a way preventing the Xtext parser from
working correctly (e.g. removing or renaming model elements).

3.3.6. Error Conditions


The following conditions cause an error
• An EAttribute or EReference has two different types or different cardinality.
• There is an EAttribute and an EReference with the same name in the same EClass.
• There is a cycle in the type hierarchy.
• An new EAttribute, EReference or super type is added to an imported type.
• An EClass is added to an imported EPackage.
• An undeclared alias is used.
• An imported Ecore model cannot be loaded.

3.4. Grammar Mixins


Xtext supports the reuse of existing grammars. Grammars that are created via the Xtext wizard use
org.eclipse.xtext.common.Terminals by default which introduces a common set of terminal rules and
defines reasonable defaults for hidden terminals.
grammar org.xtext.example.MyDsl with org.eclipse.xtext.common.Terminals

generate myDsl 'http://www.xtext.org/example/MyDsl'

... some rules

Mixing in another grammar makes the rules defined in that grammar referable. It is also possible to
overwrite rules from the used grammar.
Example :
grammar my.SuperGrammar
...
RuleA : "a" stuff=RuleB;
RuleB : "{" name=ID "}";

grammar my.SubGrammar with my.SuperGrammar

Model : (ruleAs+=RuleA)*;

// overrides my.SuperGrammar.RuleB
RuleB : '[' name=ID ']';

Note that declared terminal rules always get a higher priority then imported terminal rules.

3.5. Common Terminals


Xtext ships with a default set of predefined, reasonable and often required terminal rules. This grammar
is defined as follows:

Xtext 1.0 34
The Grammar Language

grammar org.eclipse.xtext.common.Terminals
hidden(WS, ML_COMMENT, SL_COMMENT)

import "http://www.eclipse.org/emf/2002/Ecore" as ecore

terminal ID :
'^'?('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')* ;
terminal INT returns ecore::EInt: ('0'..'9')+ ;
terminal STRING :
'"' ( '\\' ('b'|'t'|'n'|'f'|'r'|'"'|"'"|'\\') | !('\\'|'"') )* '"' |
"'" ( '\\' ('b'|'t'|'n'|'f'|'r'|'"'|"'"|'\\') | !('\\'|"'") )* "'";
terminal ML_COMMENT : '/*' -> '*/' ;
terminal SL_COMMENT : '//' !('\n'|'\r')* ('\r'? '\n')? ;
terminal WS : (' '|'\t'|'\r'|'\n')+ ;
terminal ANY_OTHER: . ;

Xtext 1.0 35
Chapter 4. Configuration
4.1. The Generator
Xtext provides lots of generic implementations for your language’s infrastructure but also uses code
generation to generate some of the components. Those generated components are for instance the parser,
the serializer, the inferred Ecore model (if any) and a couple of convenient base classes for content assist,
etc.
The generator also contributes to shared project resources such as the plugin.xml, MANIFEST.MF and
the Guice modules.
Xtext’s generator leverages MWE2 – the modeling workflow engine from EMFT to configure the
generator.

4.1.1. A Short Introduction to MWE2


MWE2 allows to compose object graphs declaratively in a very compact manner. The nice thing about
it is that it just instantiates Java classes and the configuration is done through public setter and adder
methods as one is used to from Java Beans encapsulation principles. An in-depth documentation can be
found in the chapter MWE2.
Given the following simple Java class (POJO):
package com.mycompany;

public class Person {

private String name;

public void setName(String name) {


this.name = name;
}

private final List<Person> children = new ArrayList<Person>();

public void addChild(Person child) {


this.children.add(child);
}
}

One can create a family tree with MWE2 easily by describing it in a declarative manner without writing
a single line of Java code and without the need to compile classes:
module com.mycompany.CreatePersons

Person {
name = "Grandpa"
child = {
name = "Father"
child = {
name = "Son"
}
}
}

These couple of lines will, when interpreted by MWE2, result in an object tree consisting of three
instances of com.mycompany.Person. The interpreter will basically do the same as the following main
method:

Xtext 1.0 36
Configuration

package com.mycompany;

public class CreatePersons {


public static void main(String[] args) {
Person grandpa = new Person();
grandpa.setName("Grandpa");
Person father = new Person();
father.setName("Father");
grandpa.addChild(father);
Person son = new Person();
son.setName("Son");
father.addChild(son);
}
}

The root element is a class-name following the Java classpath visibility rules. As the module is a
sibling to the class com.mycompany.Person it is not necessary to use use fully qualified name. There
are other packages implicitly imported into this workflow as well to make it convenient to instantiate
actual workflows and components, but these ones are covered in depth in the appropriate chapter. The
constructed objects are furthermore configured according to the declaration in the module, e.g. a second
instance of Person will be created and added to the list of children of “Grandpa” while the third person –
the class is inferred from the assigned feature – becomes a child of “Father”. All three instances will have
their respective name assigned via a reflective invocation setName. If one wants to add another child to
“Father”, she can simply repeat the child assignment:
child = com.mycompany.Person {
name = "Father"
child = {
name = "Son"
}
child = {
name = "Daughter"
}
}

As you can see in the example above MWE2 can be used to instantiate arbitrary Java object models
without any dependency or limitation to MWE2 specific implementations. This is conceptually very close
to the dependency injection mechanism and the XML language in the Spring Framework.
Tip Whenever you are in an *.mwe2 file and wonder what kind of configuration the underlying component
may accept: Just use the Content Assist in the MWE2 Editor or navigate directly to the declaration of the
underlying Java implementation by means of F3 (Go To Declaration).
This is the basic idea of the MWE2 language. There are of course a couple of additional concepts and
features in the language and we also have not yet talked about the runtime workflow model. Please refer
to the dedicated MWE2 reference documentation for additional information.

4.1.2. General Architecture


Of course a generator in Xtext is not composed of fathers and sons, but of so called language
configurations. For each language configuration a URI pointing to its grammar file and the file extensions
for the DSL must be provided. In addition, a language is configured with a list of IGeneratorFragments.
The whole generator is composed of theses fragments. We have fragments for generating parsers, the
serializer, the EMF code, the outline view, etc.

Xtext 1.0 37
Configuration

4.1.2.1. Generator Fragments


Each fragment gets the grammar of the language as an EMF model passed in. A fragment
is able to generate code in one of the configured locations and contribute to several shared
artifacts. The main interface IGeneratorFragment is supported by a convenient abstract base class
AbstractGeneratorFragment, which by default delegates to an Xpand template with the same qualified
name as the class and delegates some of the calls to Xpand template definitions.
We suggest to have a look at the fragment we have written for label providers ( LabelProviderFragment).
It is pretty trivial and at the same time uses the most important call backs. In addition, the structure is
not cluttered with too much extra noise so that the whole package can serve as a template to write your
own fragment.

4.1.2.2. Configuration
As already explained we use MWE2 from EMFT in order to instantiate, configure and execute this
structure of components. In the following we see an exemplary Xtext generator configuration written in
MWE2 configuration code:

Xtext 1.0 38
Configuration

module org.xtext.example.MyDsl

import org.eclipse.emf.mwe.utils.*
import org.eclipse.xtext.generator.*
import org.eclipse.xtext.ui.generator.*

var grammarURI = "classpath:/org/xtext/example/MyDsl.xtext"


var file.extensions = "mydsl"
var projectName = "org.xtext.example.mydsl"
var runtimeProject = "../${projectName}"

Workflow {
bean = StandaloneSetup {
platformUri = "${runtimeProject}/.."
}

component = DirectoryCleaner {
directory = "${runtimeProject}/src-gen"
}

component = DirectoryCleaner {
directory = "${runtimeProject}.ui/src-gen"
}

component = Generator {
pathRtProject = runtimeProject
pathUiProject = "${runtimeProject}.ui"
projectNameRt = projectName
projectNameUi = "${projectName}.ui"

language = {
uri = grammarURI
fileExtensions = file.extensions

// Java API to access grammar elements


fragment = grammarAccess.GrammarAccessFragment {}

/* more fragments to configure the language */


...
}
}
}

Here the root element is Workflow and is part of the very slim runtime model shipped with MWE2. It
accepts bean s and component s. The var declaration is a first class concept of MWE2’s configuration
language and defines the interface of the module. They allow to externalize some common configuration
parameters. This comes especially handy in String variables where one can easily use ${variable} to
concatenate values.
The method Workflow.addBean(Object) does nothing but provides a means to apply global side-effects,
which unfortunately is required sometimes. In this example we do a so called EMF stand-alone setup.
This class initializes a bunch of things for a non-OSGi environment that are otherwise configured by
means of extension points, e.g. it allows to populate EMF’s singletons like the EPackage.Registry.
Following the bean assignment there are three component elements. The Workflow.addComponent()
method accepts instances of IWorkflowComponent, which is the primary concept of MWE2’s workflow
model. Xtext’s generator itself is an instance of IWorkflowComponent and can therefore be used within
MWE2 workflows.
There are two fragments ImplicitRuntimeFragment and ImplicitUiFragment will be called implicitly if
the required paths and project names are set. They take care of common defaults.

Xtext 1.0 39
Configuration

4.1.3. Standard Generator Fragments


In the following table the most important standard generator fragments are listed. Please refer to the
Javadocs for more detailed documentation.

Class Generated Artifacts Related Documentation


EcoreGeneratorFragment EMF code for generated models Model inference
XtextAntlrGeneratorFragment ANTLR grammar, parser, lexer
and related services
GrammarAccessFragment Access to the grammar
ResourceFactoryFragment EMF resource factory Xtext Resource
ParseTreeConstructorFragment Model-to-text serialization Serialization
ImportNamespacesScopingFragment
Index-based scoping Index-based namespace scoping
JavaValidatorFragment Model validation Model validation
FormatterFragment Code formatter Declarative formatter
LabelProviderFragment Label provider Label provider
OutlineNodeAdapterFactoryFragment
Outline view configuration Outline
TransformerFragment Outline view configuration Outline
JavaBasedContentAssistFragment Java-based content assist Content assist
XtextAntlrUiGeneratorFragment Content assist helper based on Content assist
ANTLR
SimpleProjectWizardFragment New project wizard Project wizard

4.2. Dependency Injection in Xtext with Google Guice


All Xtext components are assembled by means of Dependency Injection (DI). This means basically that
whenever some code is in need for functionality (or state) from another component, one just declares the
dependency rather then stating how to resolve it, i.e. obtaining that component.
For instance when some code wants to use a scope provider, it just declares a field (or method or
constructor) and adds the @Inject annotation:
public class MyLanguageLinker extends Linker {

@Inject
private IScopeProvider scopeProvider;

It is not the duty of the code to care about where the IScopeProvider comes from or how it is created.
When above’s class is instantiated, Guice sees that it requires an instance of IScopeProvider and assigns
it to the specified field or method parameter. This of course only works, if the object itself is created
by Guice. In Xtext almost every instance is created that way and therefore the whole dependency net is
controlled and configured by the means of Google Guice.
Guice of course needs to know how to instantiate real objects for declared dependencies. This is done in
so called Modules. A Module defines a set of mappings from types to either existing instances, instance
providers or concrete classes. Modules are implemented in Java. Here’s an example:
public class MyDslRuntimeModule extends AbstractMyDslRuntimeModule {
@Override
public void configure(Binder binder) {
super.configure(binder);
binder.bind(IScopeProvider.class).to(MyConcreteScopeProvider.class);
}
}

Xtext 1.0 40
Configuration

With plain Guice modules one implements a method called configure and gets a so called Binder passed
in. That binder provides a fluent API to define the mentioned mappings. This was just a very brief and
simplified description. We highly recommend to have a look at the website Google Guice to learn more.

4.2.1. The Module API


Xtext comes with a slightly enhanced module API. For your language you get two different modules:
One for the runtime bundle which is used when executing your language infrastructure outside of Eclipse
such as on the build server. The other is located in the UI bundle and adds or overrides bindings when
Xtext is used within an Eclipse environment.
The enhancement we added to Guice’s Module API is that we provide an abstract base class, which
reflectively looks for certain methods in order to find declared bindings. The most common kind of
method is :
public Class<? extends IScopeProvider> bindIScopeProvider() {
return MyConcreteScopeProvider.class;
}

which would do the same as the code snippet above. It simply declares a binding from IScopeProvider
to MyConcreteScopeProvider. That binding will make Guice instantiate and inject a new instance of
MyConcreteScopeProvider whenever a dependency to IScopeProvider is declared.
Having a method per binding allows to deactivate individual bindings by overriding the corresponding
methods and either change the binding by returning a different target type or removing that binding
completely by returning null.
There are two additional kinds of binding-methods supported. The first one allows to configure a provider.
A Provider is an interface with just one method :
public interface Provider<T> {

/**
* Provides an instance of {@code T}. Must never return {@code null}.
*/
T get();
}

This one can be used if you need a hook whenever an instance of a certain type is created. For instance if
you want to provide lazy access to a singleton or you need to do some computation each time an instance
is created (i.e. factory). If you want to point to a provider rather than to a concrete class you can use the
following binding method.
public Class<? extends Provider<IScopeProvider>> provideIScopeProvider() {
return MyConcreteScopeProviderFactory.class;
}

(Please forgive us the overuse of the term provider. The IScopeProvider is not a Guice provider .)
That binding tells Guice to instantiate MyConcreteScopeProviderFactory and invoke get() in order
to obtain an instance of IScopeProvider for clients having declared a dependency to that type. Both
mentioned methods are allowed to return an instance instead of a type. This may be useful if some global
state should be shared in the application:
public Provider<IScopeProvider> provideIScopeProvider() {
return new MyConcreteScopeProviderFactory();
}

or
public IScopeProvider bindIScopeProvider() {
return new MyConcreteScopeProvider();
}

respectively.

Xtext 1.0 41
Configuration

The last binding method provided by Xtext allows to do anything you can do with Guice’s binding API,
since it allows you to use it directly. If your method’s name starts with the name ‘configure’, has a return
type void and accepts one argument of type Binder
public void configureIScopeProvider(Binder binder) {
binder.bind(IScopeProvider.class).to(MyConcreteScopeProvider.class);
}

4.2.2. Obtaining an Injector


In every application wired up with Guice there is usually one point where you initialize a so called
Injector using the modules declared and after that using that injector to create the root instance of the
whole application. In plain Java environments this is something that’s done in the main method. It could
look like this:
public static void main(String[] args) {
Injector injector = Guice.createInjector(new MyDslRuntimeModule());
MyApplication application = injector.getInstance(MyApplication.class);
application.run();
}

Xtext uses EMF which makes use of a couple of global registries, which have to be configured
on startup. Because we of course want to leverage Guice also for all factories, etc. that we put
into those registries, we have introduced a so called ISetup which provides a method called Injector
createInjectorAndDoEMFRegistration(). So instead of using the plain Guice code shown above you
rather use the ISetup class generated for your language, which, as the method name suggests, creates an
Injector and uses it to initialize a couple of EMF objects and register them in the corresponding registries.
Injector injector =
new MyStandaloneSetup().createInjectorAndDoEMFRegistration();

These are the basic ideas around Guice and the small extension Xtext provides on top. For more
information we strongly encourage you to read through the documentation on the website of Google
Guice.

Xtext 1.0 42
Chapter 5. Runtime Concepts
Xtext itself and every language infrastructure developed with Xtext is configured and wired-up using
dependency injection. Xtext may be used in different environments which introduce different constraints.
Especially important is the difference between OSGi managed containers and plain vanilla Java programs.
To honor these differences Xtext uses the concept of ISetup-implementations in normal mode and uses
Eclipse’s extension mechanism when it should be configured in an OSGi environment.

5.1. Runtime Setup (ISetup)


For each language there is an implementation of ISetup generated. It implements a method called
createInjectorAndDoEMFRegistration(), which can be called to do the initialization of the language
infrastructure. This class is intended to be used for runtime and for unit testing, only.
The setup method returns an Injector, which can further be used to obtain a parser, etc. It also registers
the ResourceFactory and generated EPackages at the respective global registries provided by EMF. So
basically you can just run the setup and start using EMF API to load and store models of your language.

5.2. Setup within Eclipse-Equinox (OSGi)


Within Eclipse we have a generated Activator, which creates a Guice injector using the modules.
In addition an IExecutableExtensionFactory is generated for each language, which is used to create
ExecutableExtensions. This means that everything which is created via extension points is managed by
Guice as well, i.e. you can declare dependencies and get them injected upon creation.
The only thing you have to do in order to use this factory is to prefix the class with the factory
ExecutableExtensionFactory name followed by a colon.
<extension
point="org.eclipse.ui.editors">
<editor
class="<MyLanguageName>ExecutableExtensionFactory:
org.eclipse.xtext.ui.editor.XtextEditor"
contributorClass=
"org.eclipse.ui.editors.text.TextEditorActionContributor"
default="true"
extensions="ecoredsl"
id="org.eclipse.xtext.example.EcoreDsl"
name="EcoreDsl Editor">
</editor>
</extension>

5.3. Logging
Xtext uses Apache’s log4j for logging. It is configured using the so called log4j.properties, which is
looked up in the root of the Java classpath. If you want to change or provide configuration at runtime
(i.e. non-OSGI), all you have to do is putting such a log4j.properties in place and make sure that it is not
overridden by other log4j.properties in previous classpath entries.
In OSGi you provide configuration by creating a fragment for org.apache.log4j. In this case you need to
make sure that there’s no second fragment contributing a log4j.properties file.

5.4. Validation
Static analysis or validation is one of the most interesting aspects when developing a programming
language. The users of your languages will be grateful if they get informative feedback as they type. In
Xtext there are basically three different kinds of validation.

5.4.1. Automatic Validation


Some implementation aspects (e.g. the grammar, scoping) of a language have an impact on what is
required for a document or semantic model to be valid. Xtext automatically takes care of this.

Xtext 1.0 43
Runtime Concepts

5.4.1.1. Lexer/Parser: Syntactical Validation


The syntactical correctness of any textual input is validated automatically by the parser.
The error messages are generated by the underlying parser technology. One can use the
ISyntaxErrorMessageProvider-API to customize this messages. Any syntax errors can be retrieved from
the Resource using the common EMF API:
• org.eclipse.emf.ecore.resource.Resource.getErrors()
• org.eclipse.emf.ecore.resource.Resource.getWarnings()

5.4.1.2. Linker: Crosslink Validation


Any broken crosslinks can be checked generically. As crosslink resolution is done lazily (see linking), any
broken links are resolved lazily as well. If you want to validate whether all links are valid, you will have
to navigate through the model so that all installed EMF proxies get resolved. This is done automatically
in the editor.
Similar to syntax errors, any unresolvable crosslinks will be reported and can be obtained through:
• org.eclipse.emf.ecore.resource.Resource.getErrors()
• org.eclipse.emf.ecore.resource.Resource.getWarnings()

5.4.1.3. Serializer: Concrete Syntax Validation


The IConcreteSyntaxValidator validates all constraints that are implied by a grammar. Meeting these
constraints for a model is mandatory to be serialized.
Example:
MyRule:
({MySubRule} "sub")? (strVal+=ID intVal+=INT)*;

This implies several constraints:


1. Types: only instances of MyRule and MySubRule are allowed for this rule. Sub-types are prohibited,
since the parser never instantiates unknown sub-types.
2. Features: In case the MyRule and MySubRule have EStructuralFeatures besides strVal and intVal,
only strVal and intVal may have non-transient values.
3. Quantities: The following condition must be true: strVal.size() == intVal.size().
4. Values: It must be possible to convert all values to valid tokens for terminal rule STRING. The same
is true for intVal and INT.
The typical use cases for the concrete syntax validator are validation in non-Xtext-editors that, however,
use an XtextResource. This is, for example, the case when combining GMF and Xtext. Another use case
is when the semantic model is modified “manually” (not by the parser) and then serialized again. Since
it is very difficult for the serializer to provide meaningful error messages, the concrete syntax validator
is executed by default before serialization. A textual Xtext editor itself, however, is not a valid use case.
Here, the parser ensures that all syntactical constraints are met. Therefore, there is no value in additionally
running the concrete syntax validator.
There are some limitations to the concrete syntax validator which result from the fact that it treats the
grammar as declarative, which is something the parser doesn’t always do.
• Grammar rules containing assigned actions (e.g. {MyType.myFeature=current} are ignored.
Unassigned actions (e.g. {MyType}), however, are supported.
• Grammar rules that delegate to one or more rules containing assigned actions via unassigned rule calls
are ignored.
• Orders within list-features can not be validated. e.g. Rule: (foo+=R1 foo+=R2)* implies that foo is
expected to contain instances of R1 and R2 in an alternating order.
To use concrete syntax validation you can let Guice inject an instance of IConcreteSyntaxValidator and
use it directly. Furthermore, there is an adapter ( ConcreteSyntaxEValidator) allows integrating of the
concrete syntax validator as an EValidator. You can, for example, enable it in your runtime module, by
adding:

Xtext 1.0 44
Runtime Concepts

@SingletonBinding(eager = true)
public Class<? extends ConcreteSyntaxEValidator>
bindConcreteSyntaxEValidator() {
return ConcreteSyntaxEValidator.class;
}
To customize error messages please see IConcreteSyntaxDiagnosticProvider and subclass
ConcreteSyntaxDiagnosticProvider.

5.4.2. Custom Validation


In addition to the afore mentioned kinds of validation, which are more or less done automatically,
you can specify additional constraints specific for your Ecore model. We leverage existing EMF
API (mainly EValidator) and have put some convenience stuff on top. Basically all you need to do
is to make sure that an EValidator is registered for your EPackage. The registry for EValidators
( EValidator.Registry.INSTANCE) can only be filled programmatically. That means contrary to the
EPackage and Resource.Factory registries there is no Equinox extension point to populate the validator
registry.
For Xtext we provide a generator fragment for the convenient Java-based EValidator API. Just add the
following fragment to your generator configuration and you are good to go:
fragment = org.eclipse.xtext.generator.validation.JavaValidatorFragment {}

The generator will provide you with two Java classes. An abstract class generated to src-gen/ which
extends the library class AbstractDeclarativeValidator. This one just registers the EPackages for which
this validator introduces constraints. The other class is a subclass of that abstract class and is generated
to the src/ folder in order to be edited by you. That’s where you put the constraints in.
The purpose of the AbstractDeclarativeValidator is to allow you to write constraints in a declarative way
– as the class name already suggests. That is instead of writing exhaustive if-else constructs or extending
the generated EMF switch you just have to add the @Check annotation to any method and it will be
invoked automatically when validation takes place. Moreover you can state for what type the respective
constraint method is, just by declaring a typed parameter. This also lets you avoid any type casts. In
addition to the reflective invocation of validation methods the AbstractDeclarativeValidator provides a
couple of convenient assertions.
All in all this is very similar to how JUnit works. Here is an example:
public class DomainmodelJavaValidator
extends AbstractDomainmodelJavaValidator {

@Check
public void checkTypeNameStartsWithCapital(Type type) {
if (!Character.isUpperCase(type.getName().charAt(0)))
warning("Name should start with a capital",
DomainmodelPackage.TYPE__NAME);
}
}

You can also implement quick fixes for individual validation errors and warnings. See the chapter on
quick fixes for details.

5.4.3. Validation with the Check Language


In addition to the Java-based validation code you can use the language Check (from M2T/Xpand) to
implement constraint checks against your model. To do so, you have to configure the generator with the
CheckFragment. Please note, that you can combine both types of validation in your project.
fragment = org.eclipse.xtext.generator.validation.CheckFragment {}

After regenerating your language artifacts you will find three new files “YourLanguageChecks.chk”,
“YourLanguageFastChecks.chk” and “YourLanguageExpensiveChecks.chk” in the src/ folder in the
sub-package validation. The checks in these files will be executed when saving a file, while typing

Xtext 1.0 45
Runtime Concepts

(FastChecks) or when triggering the validation explicitly (ExpensiveChecks). When using Check the
example of the previous chapter could be written like this.
context Type#name WARNING "Name should start with a capital":
name.toFirstUpper() == name;

Each check works in a specific context (here: Type) and can further denote a feature to which a warning
or error should be attached to (here: name). Each check could either be a WARNING or an ERROR with
a given string to explain the situation. The essential part of each check is an invariant that must hold true
for the given context. If it fails the check will produce an issue with the provided explanation.
Please read more about the Check language as well as the underlying expression language in Xpand’s
reference documentation which is shipped as Eclipse help.

5.4.4. Validating Manually


As noted above, Xtext uses EMF’s EValidator API to register Java or Check validators. You can run the
validators on your model programmatically using EMF’s Diagnostician, e.g.
EObject myModel = myResource.getContents().get(0);
Diagnostic diagnostic = Diagnostician.INSTANCE.validate(myModel);
switch (diagnostic.getSeverity()) {
case Diagnostic.ERROR:
System.err.println("Model has errors: ",diagnostic);
break;
case Diagnostic.WARNING:
System.err.println("Model has warnings: ",diagnostic);
}

5.4.5. Test Validators


If you have implemented your validators by extending AbstractDeclarativeValidator, there are helper
classes which may assist you when testing your validators.
Testing validators typically works as follows:
1. The test creates some models which intentionally violate some constraints.
2. The test runs some choosen @Check-methods from the validator.
3. The test asserts whether the @Check-methods have raised the expected warnings and errors.
To create models, you can either use EMF’s ResourceSet to load models from your hard disk or you can
utilize the Factory (which EMF generates for each EPackage) to construct the needed model elements
manually. While the fist option has the advantages that you can edit your models in your textual concrete
syntax, the second option has the advantage that you can create partial models.
To run the @Check-methods and ensure they raise the intended errors and warnings, you can utilize
ValidatorTester as shown by the following example:
Validator:
public class MyLanguageValidator extends AbstractDeclarativeValidator {
@Check
public void checkFooElement(FooElement element) {
if(element.getBarAttribute().contains("foo"))
error("Only Foos allowed", element,
MyLanguagePackage.FOO_ELEMENT__BAR_ATTRIBUTE, 101);
}
}

JUnit-Test:

Xtext 1.0 46
Runtime Concepts

public class MyLanguageValidatorTest extends AbstractXtextTests {

private ValidatorTester<MyLanguageValidator> tester;

@Override
public void setUp() {
with(MyLanguageStandaloneSetup.class);
MyLanguageValidator validator = get(MyLanguageValidator.class);
tester = new ValidatorTester<TestingValidator>(validator);
}

public void testError() {


FooElement model = MyLanguageFactory.eINSTANCE.createFooElement()
model.setBarAttribute("barbarbarbarfoo");

tester.validator().checkFooElement(model);
tester.diagnose().assertError(101);
}

public void testError2() {


FooElement model = MyLanguageFactory.eINSTANCE.createFooElement()
model.setBarAttribute("barbarbarbarfoo");

tester.validate(model).assertError(101);
}
}

This example uses JUnit 3, but since the involved classes from Xtext have no dependency on
JUnit whatsoever, JUnit 4 and other testing frameworks will work as well. JUnit runs the setUp()-
method before each testcase and thereby helps to create some common state. In this example, the
validator ( MyLanguageValidator) is instantiated by means of Google Guice. As we inherit from
the AbstractXtextTests there are a plenty of useful methods available and the state of the global
EMF singletons will be restored in the tearDown(). Afterwards, the ValidatorTester is created and
parameterized with the actual validator. It acts as a wrapper for the validator, ensures that the validator
has a valid state and provides convenient access to the validator itself ( tester.validator()) as well as to the
utility classes which assert diagnostics created by the validator ( tester.diagnose()). Please be aware that
you have to call validator() before you can call diagnose(). However, you can call validator() multiple
times in a row.
While validator() allows to call the validator’s @Check-methods directly, validate(model) leaves it to
the framework to call the applicable @Check-methods. However, to avoid side-effects between tests, it
is recommended to call the @Check-methods directly.
diagnose() and validate(model) return an object of type AssertableDiagnostics which provides several
assert-methods to verify whether the expected diagnostics are present:
• assertError(int code): There must be one diagnostic with severity ERROR and the supplied error code.
• assertErrorContains(String messageFragment): There must be one diagnostic with severity ERROR
and its message must contain messageFragment.
• assertError(int code, String messageFragment): Verifies severity, error code and messageFragment.
• assertWarning(...): This method is available for the same combination of parameters as assertError().
• assertOK(): Expects that no diagnostics (errors, warnings etc.) have been raised.
• assertDiagnostics(int severity, int code, String messageFragment): Verifies severity, error code and
messageFragment.
• assertAll(DiagnosticPredicate... predicates): Allows to describe multiple diagnostics at the same
time and verifies that all of them are present. Class AssertableDiagnostics contains static
error() and warning()-methods which help to create the needed DiagnosticPredicate. Example:
assertAll(error(123), warning(“some part of the message”)).
• assertAny(DiagnosticPredicate predicate): Asserts that a diagnostic exists which matches the
predicate.

Xtext 1.0 47
Runtime Concepts

5.5. Linking
The linking feature allows for specification of cross-references within an Xtext grammar. The following
things are needed for the linking:
1. declaration of a crosslink in the grammar (at least in the Ecore model)
2. specification of linking semantics (usually provided via the scoping API)

5.5.1. Declaration of Crosslinks


In the grammar a cross-reference is specified using square brackets.
CrossReference :
'[' type=ReferencedEClass ('|' terminal=CrossReferenceableTerminal)? ']'
;

Example:
ReferringType :
'ref' referencedObject=[Entity|STRING]
;

The Ecore model inference would create an EClass ReferringType with an EReference referencedObject
of type Entity ( containment=false). The referenced object would be identified either by a STRING and
the surrounding information in the current context (see scoping). If you do not use generate but import
an existing Ecore model, the class ReferringType (or one of its super types) would need to have an
EReference of type Entity (or one of its super types) declared. Also the EReference's containment and
container properties needs to be set to false.

5.5.2. Default Runtime Behavior (Lazy Linking)


Xtext uses lazy linking by default and we encourage users to stick to this because it provides many
advantages. One of which is improved performance in all scenarios where you don’t have to load the
whole closure of all transitively referenced resources. Furthermore it automatically solves situations
where one link relies on other links. Though cyclic linking dependencies are not supported by Xtext at all.
When parsing a given input string, say
ref Entity01
the LazyLinker first creates an EMF proxy and assigns it to the corresponding EReference. In EMF
a proxy is described by a URI, which points to the real EObject. In the case of lazy linking the
stored URI comprises of the context information given at parse time, which is the EObject containing
the cross-reference, the actual EReference, the index (in case it’s a multi-valued cross-reference) and
the string which represented the crosslink in the concrete syntax. The latter usually corresponds to
the name of the referenced EObject. In EMF a URI consists of information about the resource the
EObject is contained in as well as a so called fragment part, which is used to find the EObject within
that resource. When an EMF proxy is resolved, the current ResourceSet is asked. The resource set
uses the first part to obtain (i.e. load if it is not already loaded) the resource. Then the resource is
asked to return the EObject based on the fragment in the URI. The actual cross-reference resolution
is done by LazyLinkingResource.getEObject(String) which receives the fragment and delegates to the
implementation of the ILinkingService. The default implementation in turn delegates to the scoping API.
A simple implementation of the linking service (the DefaultLinkingService) is shipped with Xtext and
used for any grammar per default. Usually any necessary customization of the linking behavior can best
be described using the scoping API.

5.6. Scoping
Using the scoping API one defines which elements are referable by a certain reference. For instance, using
the introductory example (fowler’s state machine language) a transition contains two cross-references:
One to a declared event and one to a declared state.
Example:

Xtext 1.0 48
Runtime Concepts

events
nothingImportant MYEV
end

state idle
nothingImportant => idle
end

The grammar rule for transitions looks like this:


Transition :
event=[Event] '=>' state=[State];

The grammar states that for the reference event only instances of the type Event are allowed and that for
the EReference state only instances of type State can be referenced. However, this simple declaration
doesn’t say anything about where to find the states or events. That is the duty of scopes.
An IScopeProvider is responsible for providing an IScope for a given context EObject and EReference.
The returned IScope should contain all target candidates for the given object and cross-reference.
public interface IScopeProvider {

/**
* Returns a scope for the given context. The scope
* provides access to the compatible visible EObjects
* for a given reference.
*
* @param context the element from which an element shall be referenced
* @param reference the reference to be used to filter the elements.
* @return {@link IScope} representing the inner most {@link IScope} for
* the passed context and reference. Note for implementors: The
* result may not be <code>null</code>. Return
* <code>IScope.NULLSCOPE</code> instead.
*/
IScope getScope(EObject context, EReference reference);

A single IScope represents an element of a linked list of scopes. That means that a scope can be nested
within an outer scope. Each scope works like a symbol table or a map where the keys are strings and the
values are so called IEObjectDescription, which is effectively an abstract description of a real EObject.

5.6.1. Global Scopes and IResourceDescriptions


In the state machine example we don’t have references across model files. Also there is no concept like
a namespace which would make scoping a bit more complicated. Basically, every State and every Event
declared in the same resource is visible by their name. However in the real world things are most likely not
that simple: What if you want to reuse certain declared states and events across different state machines
and you want to share those as library between different users? You would want to introduce some kind
of cross resource reference.
Defining what is visible from outside the current resource is the responsibility of global scopes. As the
name suggests, global scopes are provided by instances of the IGlobalScopeProvider. The data structures
used to store its elements are described in the next section.

5.6.1.1. Resource and EObject Descriptions ( IResourceDescription,


IEObjectDescription)
In order to make states and events of one file referable from another file you need to export them as part
of a so called IResourceDescription.
A IResourceDescription contains information about the resource itself (primarily its URI), a list of
exported EObjects (in the form of IEObjectDescription) as well as information about outgoing cross-
references and qualified names it references. The cross references contain only resolved references,
while the list of imported qualified names also contain those names, which couldn’t be resolved. This

Xtext 1.0 49
Runtime Concepts

information is important in order to compute the transitive hull of dependent resources, which the shipped
index infrastructure automatically does for you.
For users and especially in the context of scoping the most important information is the list of exported
EObjects. An IEObjectDescription contains information about the URI to the actual EObject and the
qualified name of that element as well as the corresponding EClass. In addition one can export arbitrary
information using the user data map. The following diagram gives an overview on the description classes
and their relationships.

A language is configured with a default implementation of IResourceDescription.Manager which


computes the list of exported IEObjectDescriptions by iterating the whole EMF model and applying
the getQualifiedName(EObject obj) from IQualifiedNameProvider on each EObject. If the object has
a qualified name an IEObjectDescription is created and exported (i.e. added to the list). If an EObject
doesn’t have a qualified name, the element is considered to be not referable from outside the resource
and consequently not indexed. If you don’t like this behavior, you can implement and bind your own
implementation of IResourceDescription.Manager.
There are also two different default implementations of IQualifiedNameProvider. Both work by
looking up an EAttribute ‘name’. The SimpleNameProvider simply returns the plain value, while the
DefaultDeclarativeQualifiedNameProvider concatenates the simple name with the qualified name of its
parent exported EObject. This effectively simulates the qualified name computation of most namespace-
based languages (like e.g. Java).
As mentioned above, in order to calculate an IResourceDescription for a resource the framework asks
the IResourceDescription.Manager. Here’s some Java code showing how to do that:
Manager manager = // obtain an instance of IResourceDescription.Manager
IResourceDescription description = manager.getResourceDescription(resource);
for (IEObjectDescription objDescription : description.getExportedObjects()) {
System.out.println(objDescription.getQualifiedName());
}

In order to obtain an IResourceDescription.Manager it is best to ask the corresponding


IResourceServiceProvider. That is because each language might have a totally different implementation
and as you might refer from your language to a different language you can’t reuse your
language’s IResourceDescription.Manager. One basically asks the IResourceServiceProvider.Registry
(there is usually one global instance) for an IResourceServiceProvider, which in turn provides an
IResourceDescription.Manager along other useful services.

Xtext 1.0 50
Runtime Concepts

If you’re running in a Guice enabled scenario, the code looks like this:
@Inject
private IResourceServiceProvider.Registry resourceServiceProviderRegistry;

private IResourceDescription.Manager getManager(Resource res) {


IResourceServiceProvider resourceServiceProvider =
resourceServiceProviderRegistry.getResourceServiceProvider(res.getURI());
return resourceServiceProvider.getResourceDescriptionManager();
}

If you don’t run in a Guice enabled context you will likely have to directly access the singleton:
private IResourceServiceProvider.Registry resourceServiceProviderRegistry =
IResourceServiceProvider.Registry.INSTANCE;

However, we strongly encourage you to use dependency injection. Now, that we know how to
export elements to be referenceable from other resources, we need to learn how those exported
IEObjectDescriptions can be made available to the referencing resources. That is the responsibility of
global scoping (i.e. IGlobalScopeProvider) which is described in the following chapter.

5.6.1.2. Global Scopes Based On Explicit Imports (ImportURI Mechanism)


A simple and straight forward solution is to have explicit references to other resources in your file by
explicitly listing pathes (or URIs) to all referenced resources in your model file. That is for instance what
most include mechanisms use. In Xtext we provide a handy implementation of an IGlobalScopeProvider
which is based on a naming convention and makes this semantics very easy to use. Talking of the
introductory example and given you would want to add support for referencing external States and Events
from within your state machine, all you had to do is add something like the following to the grammar
definition:
Statemachine :
(imports+=Import)* // allow imports
'events'
(events+=Event)+
'end'
('resetEvents'
(resetEvents+=[Event])+
'end')?
'commands'
(commands+=Command)+
'end'
(states+=State)+;

Import :
'import' importURI=STRING; // feature must be named importURI

This effectively allows import statements to be declared before the events section. In addition you’ll have
to make sure that you have bound the ImportUriGlobalScopeProvider for the type IGlobalScopeProvider
by the means of Guice. That implementation looks up any EAttributes named ‘importURI’ in your
model and interprets their values as URIs that point to imported resources. That is it adds the
corresponding resources to the current resource’s resource set. In addition the scope provider uses
the IResourceDescription.Manager of that imported resource to compute all the IEObjectDescriptions
returned by the IScope.
Global scopes based on import URIs are available if you use the ImportURIScopingFragment
in the workflow of your language. It will bind an ImportUriGlobalScopeProvider (
IgnoreCaseImportUriGlobalScopeProvider if the caseInsensitve flag is set) that handles importURI
features.

5.6.1.3. Global Scopes Based On External Configuration (e.g. Classpath-


Based)
Instead of explicitly referring to imported resources, the other possibility is to have some kind of external
configuration in order to define what is visible from outside a resource. Java for instances uses the notion

Xtext 1.0 51
Runtime Concepts

of classpaths to define containers (jars and class folders) which contain any referenceable elements. In
the case of Java also the order of such entries is important.
Since version 1.0.0 Xtext provides support for this kind of global scoping. To enable it, a
DefaultGlobalScopeProvider has to be bound to the IGlobalScopeProvider interface. For case insensitive
names use the IgnoreCaseDefaultGlobalScopeProvider.
By default Xtext leverages the classpath mechanism since it is well designed and already understood
by most of our users. The available tooling provided by JDT and PDE to configure the classpath adds
even more value. However, it is just a default: You can reuse the infrastructure without using Java and
independent from the JDT.
In order to know what is available in the “world” a global scope provider which relies on external
configuration needs to read that configuration in and be able to find all candidates for a certain EReference.
If you don’t want to force users to have a folder and file name structure reflecting the actual qualified
names of the referenceable EObjects, you’ll have to load all resources up front and either keep holding
them in memory or remembering all information which is needed for the resolution of cross-references.
In Xtext that information is provided by a so called IEObjectDescription .

About the Index, Containers and Their Manager


Xtext ships with an index which remembers all IResourceDescription and their IEObjectDescription
objects. In the IDE-context (i.e. when running the editor, etc.) the index is updated by an incremental
project builder. As opposed to that, in a non-UI context you typically do not have to deal with changes
such that the infrastructure can be much simpler. In both situations the global index state is held by an
implementation of IResourceDescriptions (Note the plural form!). The bound singleton in the UI scenario
is even aware of unsaved editor changes, such that all linking happens to the latest maybe unsaved
version of the resources. You will find the Guice configuration of the global index in the UI scenario
in SharedModule.
The index is basically a flat list of instances of IResourceDescription. The index itself doesn’t
know about visibility constraints due to classpath restriction. Rather than that, they are defined by
the referencing language by means of so called IContainers: While Java might load a resource via
ClassLoader.loadResource() (i.e. using the classpath mechanism), another language could load the same
resource using the file system paths.
Consequently, the information which container a resource belongs to depends on the referencing
context. Therefore an IResourceServiceProvider provides another interesting service, which is called
IContainer.Manager. For a given IResourceDescription, the IContainer.Manager provides you with the
IContainer as well as with a list of all IContainers which are visible from there. Note that the index (
IResourceDesciptions) is globally shared between all languages while the IContainer.Manager that adds
the semantics of containers can be very different depending on the language. The following method lists
all resources visible from a given Resource:
@Inject
IContainer.Manager manager;

public void listVisibleResources(


Resource myResource, IResourceDescriptions index) {
IResourceDescription descr =
index.getResourceDescription(myResource.getURI());
for(IContainer visibleContainer:
manager.getVisibleContainers(descr, index)) {
for(IResourceDescription visibleResourceDesc:
visibleContainer.getResourceDescription()) {
System.out.println(visibleResourceDesc.getURI());
}
}
}

Xtext ships two implementations of IContainer.Manager which are as usual bound with Guice:
The default binding is to SimpleResourceDescriptionsBasedContainerManager, which assumes all
IResourceDescription to be in a single common container. If you don’t care about container support,
you’ll be fine with this one. Alternatively, you can bind StateBasedContainerManager and an additional
IAllContainersState which keeps track of the set of available containers and their visibility relationships.

Xtext 1.0 52
Runtime Concepts

Xtext offers a couple of strategies for managing containers: If you’re running an Eclipse workbench,
you can define containers based on Java projects and their classpaths or based on plain Eclipse projects.
Outside Eclipse, you can provide a set of file system paths to be scanned for models. All of these only
differ in the bound instance of IAllContainerState of the referring language. These will be described in
detail in the following sections.

JDT-Based Container Manager


As JDT is an Eclipse feature, this JDT-based container management is only available in the UI scenario.
It assumes so called IPackageFragmentRoots as containers. An IPackageFragmentRoot in JDT is the
root of a tree of Java model elements. It usually refers to
• a source folder of a Java project,
• a referenced jar,
• a classpath entry of a referenced Java project, or
• the exported packages of a required PDE plug-in.
So for an element to be referable, its resource must be on the classpath of the caller’s Java project and
it must be exported (as described above).
As this strategy allows to reuse a lot of nice Java things like jars, OSGi, maven, etc. it is part of the default:
You should not have to reconfigure anything to make it work. Nevertheless, if you messed something
up, make sure you bind
public Class<? extends IContainer.Manager> bindIContainer$Manager() {
return StateBasedContainerManager.class;
}

Xtext 1.0 53
Runtime Concepts

in the runtime module and


public Provider<IAllContainersState> provideIAllContainersState() {
return org.eclipse.xtext.ui.shared.Access.getJavaProjectsState();
// return org.eclipse.xtext.ui.shared.Access.getStrictJavaProjectsState();
}

in the UI module of the referencing language. The latter looks a bit more difficult than a common
binding, as we have to bind a global singleton to a Guice provider. The StrictJavaProjectsState requires
all elements to be on the classpath, while the default JavaProjectsState also allows models in non-source
folders.

Eclipse Project-Based Containers


If the classpath-based mechanism doesn’t work for your case, Xtext offers an alternative container
manager based on plain Eclipse projects: Each project acts as a container and the project references
Properties->Project References are the visible containers.
In this case, your runtime module should define
public Class<? extends IContainer.Manager> bindIContainer$Manager() {
return StateBasedContainerManager.class;
}

and the UI module should bind


public Provider<IAllContainersState> provideIAllContainersState() {
return org.eclipse.xtext.ui.shared.Access.getWorkspaceProjectsState();
}

ResourceSet-Based Containers
If you need an IContainer.Manager that is independent of Eclipse projects, you can use the
ResourceSetBasedAllContainersState. This one can be configured with a mapping of container handles
to resource URIs.
It is unlikely you want to use this strategy directly in your own code, but it is used in the back-end of the
MWE2 workflow component Reader. This is responsible for reading in models in a workflow, e.g. for
later code generation. The Reader allows to either scan the whole classpath or a set of paths for all models
therein. When paths are given, each path entry becomes an IContainer of its own. In the following snippet,
component = org.eclipse.xtext.mwe.Reader {
// lookup all resources on the classpath
// useJavaClassPath = true

// or define search scope explicitly


path = "src/models"
path = "src/further-models"

...
}

5.6.2. Local Scoping


We now know how the outer world of referenceable elements can be defined in Xtext. Nevertheless, not
everything is available in any context and with a global name. Rather than that, each context can usually
have a different scope. As already stated, scopes can be nested, i.e. a scope can in addition to its own
elements contain elements of a parent scope. When parent and child scope contain different elements with
the same name, the parent scope’s element will usually be shadowed by the element from the child scope.
To illustrate that, let’s have a look at Java: Java defines multiple kinds of scopes (object scope, type
scope, etc.). For Java one would create the scope hierarchy as commented in the following example:

Xtext 1.0 54
Runtime Concepts

// file contents scope


import static my.Constants.STATIC;

public class ScopeExample { // class body scope


private Object field = STATIC;

private void method(String param) { // method body scope


String localVar = "bar";
innerBlock: { // block scope
String innerScopeVar = "foo";
Object field = innerScopeVar;
// the scope hierarchy at this point would look like this:
// blockScope{field,innerScopeVar}->
// methodScope{localVar, param}->
// classScope{field}-> ('field' is shadowed)
// fileScope{STATIC}->
// classpathScope{
// 'all qualified names of accessible static fields'} ->
// NULLSCOPE{}
//
}
field.add(localVar);
}
}

In fact the classpath scope should also reflect the order of classpath entries. For instance:
classpathScope{stuff from bin/}
-> classpathScope{stuff from foo.jar/}
-> ...
-> classpathScope{stuff from JRE System Library}
-> NULLSCOPE{}

Please find the motivation behind this and some additional details in this blog post .

5.6.2.1. Declarative Scoping


If you have to define scopes for certain contexts, the base class AbstractDeclarativeScopeProvider allows
to do that in a declarative way. It looks up methods which have either of the following two signatures:
IScope scope_<RefDeclaringEClass>_<Reference>(
<ContextType> ctx, EReference ref)

IScope scope_<TypeToReturn>(<ContextType> ctx, EReference ref)

The former is used when evaluating the scope for a specific cross-reference and here ContextReference
corresponds to the name of this reference (prefixed with the name of the reference’s declaring type and
separated by an underscore). The ref parameter represents this cross-reference.
The latter method signature is used when computing the scope for a given element type and is applicable
to all cross-references of that type. Here TypeToReturn is the name of that type.
So if you for example have a state machine with a Transition object owned by its source State and you
want to compute all reachable states (i.e. potential target states), the corresponding method could be
declared as follows (assuming the cross-reference is declared by the Transition type and is called target):
IScope scope_Transition_target(Transition this, EReference ref)
If such a method does not exist, the implementation will try to find one for the context object’s container.
Thus in the example this would match a method with the same name but State as the type of the first
parameter. It will keep on walking the containment hierarchy until a matching method is found. This
container delegation allows to reuse the same scope definition for elements in different places of the
containment hierarchy. Also it may make the method easier to implement as the elements comprising the
scope are quite often owned or referenced by a container of the context object. In the example the State
objects could for instance be owned by a containing StateMachine object.

Xtext 1.0 55
Runtime Concepts

If no method specific to the cross-reference in question was found for any of the objects in the containment
hierarchy, the implementation will start looking for methods matching the other signature. Again it will
first attempt to match the context object. Thus in the example the signature first matched would be:
IScope scope_State(Transition this, EReference ref)
If no such method exists, the implementation will again try to find a method matching the context object’s
container objects. In the case of the state machine example you might want to declare the scope with
available states at the state machine level:
IScope scope_State(StateMachine this, EReference ref)
This scope can now be used for any cross-references of type State for context objects owned by the state
machine.

5.6.3. Imported Namespace-Aware Scoping


The imported namespace aware scoping is based on qualified names and namespaces. It adds namespace
support to your language, which is comparable and similar to the one in Scala and C#. Scala and C#
both allow to have multiple nested packages within one file and you can put imports per namespace, so
that imported names are only visible within that namespace. See the domain model example: its scope
provider extends ImportedNamespaceAwareLocalScopeProvider.

5.6.3.1. IQualifiedNameProvider
The ImportedNamespaceAwareLocalScopeProvider makes use of the so called IQualifiedNameProvider
service. It computes qualified names for EObjects. The default implementation (
DefaultDeclarativeQualifiedNameProvider) uses a simple name look up and concatenates the result to
the qualified name of its parent object using a dot as separator.
It also allows to override the name computation declaratively. The following snippet shows how you
could make Transitions in the state machine example referable by giving them a name. Don’t forget to
bind your implementation in your runtime module.
FowlerDslQualifiedNameProvider
extends DefaultDeclarativeQualifiedNameProvider {
public String qualifiedName(Transition t) {
if(t.getEvent() == null || !(t.eContainer() instanceof State))
return null;
else
return ((State)t.eContainer()).getName() + "." t.getEvent().getName();
}
}

5.6.3.2. Importing Namespaces


The ImportedNamespaceAwareLocalScopeProvider looks up EAttributes with name
‘importedNamespace’ and interprets them as import statements. By default qualified names with or
without a wildcard at the end are supported. For an import of a qualified name the simple name is made
available as we know from e.g. Java, where
import java.util.Set;
makes it possible to refer to ‘java.util.Set’ by its simple name ‘Set’. Contrary to Java the import is not
active for the whole file but only for the namespace it is declared in and its child namespaces. That is
why you can write the following in the example DSL:
package foo {
import bar.Foo
entity Bar extends Foo {
}
}

package bar {
entity Foo {}
}

Xtext 1.0 56
Runtime Concepts

Of course the declared elements within a package are as well referable by their simple name:
package bar {
entity Bar extends Foo {}
entity Foo {}
}

The following would as well be ok:


package bar {
entity Bar extends bar.Foo {}
entity Foo {}
}

See the JavaDocs and this blog post for details.

5.7. Value Converter


Value converters are registered to convert the parsed text into a certain data type instance and vice versa.
The primary hook is called IValueConverterService and the concrete implementation can be registered
via the runtime Guice module. To do so override the corresponding binding in your runtime module like
shown in this example:
@Override
public Class<? extends IValueConverterService> bindIValueConverterService() {
return MySpecialValueConverterService.class;
}

5.7.1. Annotation-Based Value Converters


The most simple way to register additional value converters is to make use of
AbstractDeclarativeValueConverterService, which allows to declaratively register an IValueConverter
by means of an annotated method.
If you use the common terminals grammar org.eclipse.xtext.common.Terminals you should subclass
DefaultTerminalConverters and override or add additional value converters by adding the respective
methods. In addition to the explicitly defined converters in the default implementation, a delegating
converter is registered for each available EDataType that reuses the functionality of the corresponding
EMF EFactory.
As qualified names – i.e. names composed of namespaces separated by a delimiter – are expected to
occur often, we’ve added a QualifiedNameValueConverter. This one removes comments and whitespaces
and delegates to another value converter for each segment, allowing individually quoted segments. The
domainmodel example shows how to use it.
The protocol of an IValueConverter allows to throw a ValueConverterException if something
went wrong. The exception is propagated as a syntax error by the parser or as a validation
problem by the ConcreteSyntaxValidator if the value cannot be converted to a valid string.
The AbstractLexerBasedConverter is useful when implementing a custom value converter. If the
converter needs to know about the rule that it currently works with, it may implement the interface
IValueConverter.RuleSpecific. The framework will set the rule such as the implementation may use it
afterwards.

5.8. Serialization
Serialization is the process of transforming an EMF model into its textual representation. Thereby,
serialization complements parsing and lexing.
In Xtext, the process of serialization is split into the following steps:
1. Validating the semantic model. This is optional, enabled by default, done by the concrete syntax
validator and can be turned off in the save options.
2. Matching the model elements with the grammar rules and creating a stream of tokens. This is done
by the parse tree constructor.

Xtext 1.0 57
Runtime Concepts

3. Associating comments with semantic objects. This is done by the comment associator.
4. Associating existing nodes from the node model with tokens from the token stream.
5. Merging existing whitespace and line-wraps into to token stream.
6. Adding further needed whitespace or replacing all whitespace using a formatter.
Serialization is invoked when calling XtextResource.save(...). Furthermore, Serializer provides resource-
independent support for serialization. Serialization is not called when a textual editors contents is saved
to disk. Another situation that triggers serialization is applying Quick Fixes with semantic modifications.

5.8.1. The Contract


The contract of serialization says that a model that is serialized to its textual representation and then
loaded (parsed) again should yield a loaded model that equals the original model. Please be aware that
this does not imply, that loading a textual representation and serializing it back produces identical textual
representations. For example, this is the case when a default value is used in a textual representation and
the assignment is optional. Another example is:
MyRule:
(xval+=ID | yval+=INT)*;

MyRule in this example reads ID- and INT-elements which may occur in an arbitrary order in the textual
representation. However, when serializing the model all ID-elements will be written first and then all
INT-elements. If the order is important it can be preserved by storing all elements in the same list – which
may require wrapping the ID- and INT-elements into objects.

5.8.2. Roles of the Semantic Model and the Node Model During
Serialization
A serialized document represents the state of the semantic model. However, if there is a node model
available (i.e. the semantic model has been created by the parser), the serializer
• preserves existing whitespaces from the node model.
• preserves existing comments from the node model.
• preserves the representation of cross-references: If a cross-referenced object can be identified by
multiple names (i.e. scoping returns multiple EObjectDescriptions for the same object), the serializer
tries to keep the previously used name.
• preserves the representation of values: For values handled by the value converter, the serializer checks
whether the textual representation converted to a value equals the value from the semantic model. If
that is true, the textual representation is kept.

5.8.3. Parse Tree Constructor


The parse tree constructor usually does not need to be customized since it is automatically derived from
the Xtext Grammar. However, it can be helpful to look into it to understand its error messages and its
runtime performance.
For serialization to succeed, the parse tree constructor must be able to consume every non-transient
element of the to-be-serialized EMF model. To consume means, in this context, to write the element to
the textual representation of the model. This can turn out to be a not-so-easy-to-fulfill requirement, since
a grammar usually introduces implicit constraints to the EMF model as explained for the concrete syntax
validator.
If a model can not be serialized, an XtextSerializationException is thrown. Possible reasons are listed
below:
• A model element can not be consumed. This can have the following reasons/solutions:
• The model element should not be stored in the model.
• The grammar needs an assignment which would consume the model element.
• The transient value service can be used to indicate that this model element should not be consumed.

Xtext 1.0 58
Runtime Concepts

• An assignment in the grammar has no corresponding model element. The default transient value service
considers a model element to be transient if it is unset or equals its default value. However, the parse
tree constructor may serialize default values if this is required by a grammar constraint to be able to
serialize another model element. The following solution may help to solve such a scenario:
• A model element should be added to the model.
• The assignment in the grammar should be made optional.
• The type of the model element differs from the type in the grammar. The type of the model element
must be identical to the return type of the grammar rule or the action’s type. Sub-types are not allowed.
• Value conversion fails. The value converter can indicate that a value is not serializeable by throwing
a ValueConverterException.
• An enum literal is not allowed at this position. This can happen if the referenced enum rule only lists
a subset of the literals of the actual enumeration.
To understand error messages and performance issues of the parse tree constructor it is important to know
that it implements a backtracking algorithm. This basically means that the grammar is used to specify the
structure of a tree in which one path (from the root node to a leaf node) is a valid serialization of a specific
model. The parse tree constructor’s task is to find this path – with the condition, that all model elements
are consumed while walking this path. The parse tree constructor’s strategy is to take the most promising
branch first (the one that would consume the most model elements). If the branch leads to a dead end
(for example, if a model element needs to be consumed that is not present in the model), the parse tree
constructor goes back the path until a different branch can be taken. This behavior has two consequences:
• In case of an error, the parse tree constructor has found only dead ends but no leaf. It cannot tell which
dead end is actually erroneous. Therefore, the error message lists dead ends of the longest paths, a
fragment of their serialization and the reason why the path could not be continued at this point. The
developer has to judge on his own which reason is the actual error.
• For reasons of performance, it is critical that the parse tree constructor takes the most promising branch
first and detects wrong branches early. One way to achieve this is to avoid having many rules which
return the same type and which are called from within the same alternative in the grammar.

5.8.4. Options
SaveOptions can be passed to XtextResource.save(options) and to Serializer.serialize(..). Available
options are:
• Formatting. Default: false. If enabled, it is the formatters job to determine all whitespace information
during serialization. If disabled, the formatter only defines whitespace information for the places in
which no whitespace information can be preserved from the node model. E.g. When new model
elements are inserted or there is no node model.
• Validating. Default: true: Run the concrete syntax validator before serializing the model.

5.8.5. Preserving Comments from the Node Model


The ICommentAssociater associates comments with semantic objects. This is important in case an
element in the semantic model is moved to a different position and the model is serialized, one expects
the comments to be moved to the new position in the document as well.
Which comment belongs to which semantic object is surely a very subjective issue. The default
implementation ( DefaultCommentAssociater) behaves as follows, but can be customized:
• If there is a semantic token before a comment and in the same line, the comment is associated with
this token’s semantic object.
• In all other cases, the comment is associated with the semantic object of the next following object.

5.8.6. Transient Values


Transient values are values or model elements which are not persisted (written to the textual
representation in the serialization phase). If a model contains model elements which can not be serialized
with the current grammar, it is critical to mark them transient using the ITransientValueService,
or serialization will fail. The default implementation marks all model elements transient, which are

Xtext 1.0 59
Runtime Concepts

eStructuralFeature.isTransient() or not eObject.eIsSet(eStructuralFeature). By default, EMF returns


false for eIsSet(...) if the value equals the default value.

5.8.7. Unassigned Text


If there are calls of data type rules or terminal rules that do not reside in an assignment, the serializer by
default doesn’t know which value to use for serialization.
. Example:
PluralRule:
'contents:' count=INT Plural;

terminal Plural:
'item' | 'items';

Valid models for this example are contents 1 item or contents 5 items. However, it is not stored
in the semantic model whether the keyword item or items has been parsed. This is due to the
fact that the rule call Plural is unassigned. However, the parse tree constructor needs to decide
which value to write during serialization. This decision can be be made by customizing the
IValueSerializer.serializeUnassignedValue(EObject, RuleCall, AbstractNode).

5.8.8. Cross-Reference Serializer


The cross-reference serializer specifies which values are to be written to the textual representation
for cross-references. This behavior can be customized by implementing ICrossReferenceSerializer.
The default implementation delegates to various other services such as the IScopeProvider or the
LinkingHelper each of which may be the better place for customization.

5.8.9. Merge Whitespaces


After the parse tree constructor has done its job to create a stream of tokens which are to be written to
the textual representation, and the comment associator has done its work, existing whitespace form the
node model is merged into the stream.
The strategy is as follows: If two tokens follow each other in the stream and the corresponding nodes in
the node model follow each other as well, then the whitespace information in between is kept. In all other
cases it is up to the formatter to calculate new whitespace information.

5.8.10. Token Stream


The parse tree constructor and the formatter use an ITokenStream for their output, and the latter for
its input as well. This makes them chainable. Token streams can be converted to a String using the
TokenStringBuffer and to a Writer using the WriterTokenStream.
public interface ITokenStream {
void flush() throws IOException;
void writeHidden(EObject grammarElement, String value) throws IOException;
void writeSemantic(EObject grammarElement, String value) throws IOException;
}

5.9. Formatting (Pretty Printing)


A formatter can be implemented via the IFormatter service. Technically speaking, a formatter is a Token
Stream which inserts/removes/modifies hidden tokens (whitespace, line-breaks, comments).
The formatter is invoked during the serialization phase and when the user triggers formatting in the editor
(for example, using the CTRL+SHIFT+F shortcut).
Xtext ships with two formatters:
• The OneWhitespaceFormatter simply writes one whitespace between all tokens.

Xtext 1.0 60
Runtime Concepts

• The AbstractDeclarativeFormatter allows advanced configuration using a FormattingConfig. Both are


explained in the next chapter.

5.9.1. Declarative Formatter


A declarative formatter can be implemented by sub-classing AbstractDeclarativeFormatter, as shown in
the following example:
public class ExampleFormatter extends AbstractDeclarativeFormatter {

@Override
protected void configureFormatting(FormattingConfig c) {
ExampleLanguageGrammarAccess f = getGrammarAccess();

c.setAutoLinewrap(120);

// find common keywords an specify formatting for them


for (Pair<Keyword, Keyword> pair : f.findKeywordPairs("(", ")")) {
c.setNoSpace().after(pair.getFirst());
c.setNoSpace().before(pair.getSecond());
}
for (Keyword comma : f.findKeywords(",")) {
c.setNoSpace().before(comma);
}

// formatting for grammar rule Line


c.setLinewrap(2).after(f.getLineAccess().getSemicolonKeyword_1());
c.setNoSpace().before(f.getLineAccess().getSemicolonKeyword_1());

// formatting for grammar rule TestIndentation


c.setIndentationIncrement().after(
f.getTestIndentationAccess().getLeftCurlyBracketKeyword_1());
c.setIndentationDecrement().before(
f.getTestIndentationAccess().getRightCurlyBracketKeyword_3());
c.setLinewrap().after(
f.getTestIndentationAccess().getLeftCurlyBracketKeyword_1());
c.setLinewrap().after(
f.getTestIndentationAccess().getRightCurlyBracketKeyword_3());

// formatting for grammar rule Param


c.setNoLinewrap().around(f.getParamAccess().getColonKeyword_1());
c.setNoSpace().around(f.getParamAccess().getColonKeyword_1());

// formatting for Comments


cfg.setLinewrap(0, 1, 2).before(g.getSL_COMMENTRule());
cfg.setLinewrap(0, 1, 2).before(g.getML_COMMENTRule());
cfg.setLinewrap(0, 1, 1).after(g.getML_COMMENTRule());
}
}

The formatter has to implement the method configureFormatting(...) which declaratively sets up a
FormattingConfig.
The FormattingConfig consist of general settings and a set of formatting instructions:

5.9.1.1. General FormattingConfig Settings


• setAutoLinewrap(int) defines the amount of characters after which a line-break should be dynamically
inserted between two tokens. The instructions setNoLinewrap().???(), setNoSpace().???() and
setSpace(space).???() suppress this behavior locally. The default is 80.

5.9.1.2. FormattingConfig Instructions


Per default, the declarative formatter inserts one whitespace between two tokens. Instructions can be used
to specify a different behavior. They consist of two parts: When to apply the instruction and what to do.

Xtext 1.0 61
Runtime Concepts

To understand when an instruction is applied think of a stream of tokens whereas each token is associated
with the corresponding grammar element. The instructions are matched against these grammar elements.
The following matching criteria exist:
• after(ele): The instruction is applied after the grammar element ele has been matched. For example, if
your grammar uses the keyword “;” to end lines, this can instruct the formatter to insert a line break
after the semicolon.
• before(ele): The instruction is executed before the matched element. For example, if your grammar
contains lists which separate their values with the keyword “,”, you can instruct the formatter to
suppress the whitespace before the comma.
• around(ele): This is the same as before(ele) combined with after(ele).
• between(ele1, ele2): This matches if ele2 directly follows ele1 in the document. There may be no other
tokens in between ele1 and ele2.
• bounds(ele1, ele2): This is the same as after(ele1) combined with before(ele2).
• range(ele1, ele2): The rule is enabled when ele1 is matched, and disabled when ele2 is matched.
Thereby, the rule is active for the complete region which is surrounded by ele1 and ele2.
The term tokens is used slightly different here compared to the parser/lexer. Here, a token is a keyword
or the string that is matched by a terminal rule, data type rule or cross-reference. In the terminology of
the lexer a data type rule can match a composition of multiple tokens.
The parameter ele can be a grammar’s AbstractElement or a grammar’s AbstractRule. All grammar
rules and almost all abstract elements can be matched. This includes rule calls, parser rules, groups and
alternatives. The semantic of before(ele), after(ele), etc. for rule calls and parser rules is identical to when
the parser would “pass” this part of the grammar. The stack of called rules is taken into account. The
following abstract elements can not have assigned formatting instructions:
• Actions. E.g. {MyAction} or {MyAction.myFeature=current}.
• Grammar elements nested in data type rules. This is due to to the fact that tokens matched by a data type
rule are treated as atomic by the serializer. To format these tokens, please implement a ValueConverter.
• Grammar elements nested in CrossReferences.
After having explained how rules can be activated, this is what they can do:
• setIndentationIncrement() increments indentation by one unit at this position. Whether one
unit consists of one tab-character or spaces is defined by IIndentationInformation. The default
implementation consults Eclipse’s PreferenceStore.
• setIndentationDecrement() decrements indentation by one unit.
• setLinewrap(): Inserts a line-wrap at this position.
• setLinewrap(int count): Inserts count numbers of line-wrap at this position.
• setLinewrap(int min, int default, int max): If the amount of line-wraps that have been at this position
before formatting can be determined (i.g. when a node model is present), then the amount of of line-
wraps is adjusted to be within the interval [ min, max] and is then reused. In all other cases default line-
wraps are inserted. Example: setLinewrap(0, 0, 1) will preserve existing line-wraps, but won’t allow
more than one line-wrap between two tokens.
• setNoLinewrap(): Suppresses automatic line wrap, which may occur when the line’s length exceeds
the defined limit.
• setSpace(String space): Inserts the string space at this position. If you use this to insert something else
than whitespace, tabs or newlines, a small puppy will die somewhere in this world.
• setNoSpace(): Suppresses the whitespace between tokens at this position. Be aware that between some
tokens a whitespace is required to maintain a valid concrete syntax.

5.9.1.3. Grammar Element Finders


Sometimes, if a grammar contains many similar elements for which the same formatting instructions
ought to apply, it can be tedious to specify them for each grammar element individually. The
IGrammarAccess provides convenience methods for this. The find methods are available for the grammar
and for each parser rule.

Xtext 1.0 62
Runtime Concepts

• findKeywords(String... keywords) returns all keywords that equal one of the parameters.
• findKeywordPairs(String leftKw, String rightKw): returns tuples of keywords from the same grammar
rule. Pairs are matched nested and sequentially. Example: for Rule: ‘(’ name=ID (‘(’ foo=ID ‘)’) ‘)’ |
‘(’ bar=ID ')' findKeywordPairs(“(”, “)”) returns three pairs.

5.9.2. Fragment Provider (Referencing Xtext Models From Other


EMF Artifacts)
Although inter-Xtext linking is not done by URIs, you may want to be able to reference your EObject
from non-Xtext models. In those cases URIs are used, which are made up of a part identifying the resource
and a second part that points to an object. Each EObject contained in a resource can be identified by a
so called fragment.
A fragment is a part of an EMF URI and needs to be unique per resource.
The generic resource shipped with EMF provides a generic path-like computation of fragments. These
fragment paths are unique by default and do not have to be serialized. On the other hand, they can be
easily broken by reordering the elements in a resource.
With an XMI or other binary-like serialization it is also common and possible to use UUIDs. UUIDs are
usually binary and technical, so you don’t want them in human readable representations.
However with a textual concrete syntax we want to be able to compute fragments out of the human
readable information. We don’t want to force people to use UUIDs (i.e. synthetic identifiers) or fragile,
relative, generic paths in order to refer to EObjects.
Therefore one can contribute a so called IFragmentProvider per language. It has two methods:
getFragment(EObject, Fallback) to calculate the fragment of an EObject and getEObject(Resource,
String, Fallback) to go the opposite direction. The Fallback interface allows to delegate to the default
strategy – usually the fragment paths described above.
The following snippet from the GMF Example shows how to use qualified names as fragments:
public QualifiedNameFragmentProvider implements IFragmentProvider {

@Inject
private IQualifiedNameProvider qualifiedNameProvider;

public String getFragment(EObject obj, Fallback fallback) {


String qualifiedName = qualifiedNameProvider.getQualifiedName(obj);
return qualifiedName != null ? qualifiedName : fallback.getFragment(obj);
}

public EObject getEObject(Resource resource,


String fragment,
Fallback fallback) {
if (fragment != null) {
Iterator<EObject> i = EcoreUtil.getAllContents(resource, false);
while(i.hasNext()) {
EObject eObject = i.next();
String candidateFragment = (eObject.eIsProxy())
? ((InternalEObject) eObject).eProxyURI().fragment()
: getFragment(eObject, fallback);
if (fragment.equals(candidateFragment))
return eObject;
}
}
return fallback.getEObject(fragment);
}
}

For performance reasons it is usually a good idea to navigate the resource based on the fragment
information instead of traversing it completely. If you know that your fragment is computed from
qualified names and your model contains something like NamedElements, you should split your fragment
into those parts and query the root elements, the children of the best match and so on.

Xtext 1.0 63
Runtime Concepts

Furthermore it’s a good idea to have some kind of conflict resolution strategy to be able to distinguish
between equally named elements that actually are different, e.g. properties may have the very same
qualified name as entities.

5.10. Encoding in Xtext


Encoding, AKA character set, describes the way characters are encoded into bytes and vice versa. Famous
standard encodings are “UTF-8” or “ISO-8859-1”. The list of available encodings can be determined by
calling java.nio.Charset.availableCharsets(). There is also a list of encodings and their canonical Java
names in the API docs.
Unfortunately, each platform and/or spoken language tends to define its own native encoding, e.g. Cp1258
on Windows in Vietnamese or MacIceland on Mac OS X in Icelandic.
In an Eclipse workspace, files, folders, projects can have individual encodings, which are stored in
the hidden file .settings/org.eclipse.core.resources.prefs in each project. If a resource does not have an
explicit encoding, it inherits the one from its parent recursively. Eclipse chooses the native platform
encoding as the default for the workspace root. You can change the default workspace encoding in
the Eclipse preferences Preferences->Workspace->Default text encoding. If you develop on different
platforms, you should consider choosing an explicit common encoding for your text or code files,
especially if you use special characters.
While Eclipse allows to define and inspect the encoding of a file, your file system usually doesn’t. Given
an arbitrary text file there is no general strategy to tell how it was encoded. If you deploy an Eclipse
project as a jar (even a plug-in), any encoding information not stored in the file itself is lost, too. Some
languages define the encoding of a file explicitly, as in the first processing instruction of an XML file.
Most languages don’t. Others imply a fixed encoding or offer enhanced syntax for character literals, e.g.
\uXXXX in Java.
As Xtext is about textual modeling, it allows to tweak the encoding in various places.

5.10.1. Encoding at Language Design Time


The plug-ins created by the New Xtext Project wizard are by default encoded in the workspace’s standard
encoding. The same holds for all files that Xtext generates in there. If you want to change that, e.g. because
your grammar uses/allows special characters, you should manually set the encoding in the properties of
these projects after their creation. Do this before adding special characters to your grammar or at least
make sure the grammar reads correctly after the encoding change. To tell the Xtext generator to generate
files in the same encoding, set the encoding property in the workflow next to your grammar, e.g.
Generator {
encoding ="UTF-8"
...

5.10.2. Encoding at Language Runtime


As each language could handle the encoding problem differently, Xtext offers a service here. The
IEncodingProvider has a single method getEncoding(URI) to define the encoding of the resource with
the given URI. Users can implement their own strategy but keep in mind that this is not intended to be
a long running method. If the encoding is stored within the model file itself, it should be extractable in
an easy way, like from the first line in an XML file. The default implementation returns the default Java
character set in the runtime scenario.
In the UI scenario, when there is a workspace, users will expect the encoding of the model files
to be settable the same way as for other files in the workspace. The default implementation of the
IEncodingProvider in the UI scenario therefore returns the file’s workspace encoding for files in the
workspace and delegates to the runtime implementation for all other resources, e.g. models in a jar or
from a deployed plug-in. Keep in mind that you are going to loose the workspace encoding information
as soon as you leave this workspace, e.g. deploy your project.
Unless you want to enforce a uniform encoding for all models of your language, we advise to override
the runtime service only. It is bound in the runtime module using the binding annotation @Runtime:

Xtext 1.0 64
Runtime Concepts

@Override
public void configureRuntimeEncodingProvider(Binder binder) {
binder.bind(IEncodingProvider.class)
.annotatedWith(DispatchingProvider.Runtime.class)
.to(MyEncodingProvider.class);
}

For the uniform encoding, bind the plain IEncodingProvider to the same implementation in both modules:
@Override
public Class<? extends IEncodingProvider> bindIEncodingProvider() {
return MyEncodingProvider.class;
}

5.10.3. Encoding of an XtextResource


An XtextResource uses the IEncodingProvider of your language by default. You can override that by
passing an option on load and save, e.g.
Map<?,?> options = new HashMap();
options.put(XtextResource.OPTION_ENCODING, "UTF-8");
myXtextResource.load(options);
options.put(XtextResource.OPTION_ENCODING, "ISO-8859-1");
myXtextResource.save(options);

5.10.4. Encoding in New Model Projects


The SimpleProjectWizardFragment generates a wizard that clients of your language can use to create
model projects. This wizard expects its templates to be in the encoding of the Generator that created it
(see above). As for every new project wizard, its output will be encoded in the default encoding of the
target workspace.

5.10.5. Encoding of Xtext Source Code


The source code of the Xtext framework itself is completely encoded in “ISO 8859-1”, which is necessary
to make the Xpand templates work everywhere (they use french quotation markup). That encoding is
hard coded into the Xtext generator code. You are likely never going to change that.

Xtext 1.0 65
Chapter 6. MWE2
The Modeling Workflow Engine 2 (MWE2) is a rewritten backwards compatible implementation of
the Modeling Workflow Engine (MWE). It is a declarative, externally configurable generator engine.
Users can describe arbitrary object compositions by means of a simple, concise syntax that allows to
declare object instances, attribute values and references. One use case – that’s where the name had its
origins – is the definition of workflows. Such a workflow consists usually of a number of components
that interact with each other. There are components to read EMF resources, to perform operations
(transformations) on them and to write them back or to generate any number of other artifacts out of the
information. Workflows are typically executed in a single JVM. However there are no constraints the
prevent implementors to provide components that spawn multiple threads or new processes.

6.1. Examples
Let’s start with a couple of examples to demonstrate some usage scenarios for MWE2. The first examples
is a simple HelloWorld module that does nothing but print a message to standard out. The second module
is assembled of three components that read an Ecore file, transform the contained classifier-names to
upper-case and serialize the resource back to a new file. The last examples uses the life-cycle methods
to print the execution time of the workflow.

6.1.1. The Simplest Workflow


The arguably shortest MWE2 module may look like the following snippet.
module HelloWorld

SayHello {
message = "Hello World!"
}
It configures a very simple workflow component with a message that should be printed to System.out
when the workflow is executed. The module begins with a declaration of its name. It must fulfill the Java
conventions for fully qualified class-names. That’s why the module HelloWorld has to be placed into the
default package of a Java source folder. The second element in the module is the class-name SayHello
which introduces the root element of the module. The interpreter will create an instance of the given type
and configure it as declared between the curly braces. E.g. the assignment message = "Hello World!" in
the module will be interpreted as an invocation of the setMessage(String) on the instantiated object. As
one can easily imagine, the implementation of the class SayHello looks straight forward:
import org.eclipse.emf.mwe2.runtime.workflow.IWorkflowComponent;
import org.eclipse.emf.mwe2.runtime.workflow.IWorkflowContext;

public class SayHello implements IWorkflowComponent {

private String message = "Hello World!";


public void setMessage(String message) {
this.message = message;
}
public String getMessage() {
return message;
}

public void invoke(IWorkflowContext ctx) {


System.out.println(getMessage());
}

public void postInvoke() {}


public void preInvoke() {}
}
It looks like a simple POJO and that’s the philosophy behind MWE2. It is easily possible to assemble
completely independent objects in a declarative manner. To make the workflow executable with the
WorkflowRunner, the component SayHello must be nested in a root workflow:

Xtext 1.0 66
MWE2

module HelloWorld

Workflow {
component = SayHello {
message = "Hello World!"
}
}
The class Workflow is actually org.eclipse.emf.mwe2.runtime.workflow.Workflow but its package is
implicitly imported in MWE2 modules to make the the modules more concise. The execution result of
this workflow will be revealed after a quick Run As .. -> MWE2 Workflow in the console as
Hello World!

6.1.2. A Simple Transformation


The following workflow solves the exemplary task to rename every EClassifier in an *.ecore file. It
consists of three components that read, modify and write the model file:
module Renamer
Workflow {
component = ResourceReader {
uri = "model.ecore"
}
component = RenamingTransformer {}
component = ResourceWriter {
uri = "uppercaseModel.ecore"
}
}
The implementation of these components is surprisingly simple. It is easily possible to create own
components even for minor operations to automate a process.
The ResourceReader simply reads the file with the given URI and stores it in a so called slot of the
workflow context. A slot can be understood as a dictionary or map-entry.
public class ResourceReader extends WorkflowComponentWithSlot {
private String uri;
public void invoke(IWorkflowContext ctx) {
ResourceSet resourceSet = new ResourceSetImpl();
URI fileURI = URI.createFileURI(uri);
Resource resource = resourceSet.getResource(fileURI, true);
ctx.put(getSlot(), resource);
}

public void setUri(String uri) {


this.uri = uri;
}
public String getUri() {
return uri;
}
}
The actual transformer takes the model from the slot and modifies it. It simply iterates the content of the
resource, identifies each EClassifier and sets its name.

Xtext 1.0 67
MWE2

public class RenamingTransformer extends WorkflowComponentWithSlot {


private boolean toLowerCase = false;
public void invoke(IWorkflowContext ctx) {
Resource resource = (Resource) ctx.get(getSlot());
EcoreUtil.resolveAll(resource);
Iterator<Object> contents = EcoreUtil.getAllContents(resource, true);
Iterator<EClassifier> iter =
Iterators.filter(contents, EClassifier.class);
while(iter.hasNext()) {
EClassifier classifier = (EClassifier) iter.next();
classifier.setName(isToLowerCase()
? classifier.getName().toLowerCase()
: classifier.getName().toUpperCase());
}
}

public void setToLowerCase(boolean toLowerCase) {


this.toLowerCase = toLowerCase;
}
public boolean isToLowerCase() {
return toLowerCase;
}
}
After the model has been modified it should be written to a new file. That’s what the ResourceWriter
does. It actually takes the resource from the given slot and saves it with the configured URI:
public class ResourceWriter extends WorkflowComponentWithSlot {
private String uri;
public void invoke(IWorkflowContext ctx) {
Resource resource = (Resource) ctx.get(getSlot());
URI uri = URI.createFileURI(getUri());
uri = resource.getResourceSet().getURIConverter().normalize(uri);
resource.setURI(uri);
try {
resource.save(null);
} catch (IOException e) {
throw new WrappedException(e);
}
}

public void setUri(String uri) {


this.uri = uri;
}
public String getUri() {
return uri;
}
}
Last but not least, the common super-type for those components looks like this:
public abstract class WorkflowComponentWithSlot
implements IWorkflowComponent {
private String slot = "model";
public void setSlot(String slot) {
this.slot = slot;
}
public String getSlot() {
return slot;
}

public void postInvoke() {}


public void preInvoke() {}
}

Xtext 1.0 68
MWE2

Each of the mentioned implementations is rather simple and can be done in a couple of minutes. This
is true for many tedious tasks that developers face in their daily work. MWE2 can be used to automize
these tasks with minimum effort.

6.1.3. A Stopwatch
The last example demonstrates how to combine the MWE2 concepts to create a simple stopwatch that
allows to measure the execution time of a set of components. The idea is to add the very same stopwatch
twice as a component to a workflow. It will measure the time from the first pre-invoke to the last post-
invoke event and print the elapsed milliseconds to the console.
public class StopWatch implements IWorkflowComponent {
private long start;
private boolean shouldStop = false;
public void invoke(IWorkflowContext ctx) {}

public void postInvoke() {


if (shouldStop) {
long elapsed = System.currentTimeMillis() - start;
System.out.println("Time elapsed: " + elapsed + " ms");
}
shouldStop = true;
}

public void preInvoke() {


start = System.currentTimeMillis();
}
}
Clients who want to leverage this kind of stopwatch may use the following pattern. The stopwatch-
instance will be added as the first component and the last component to a workflow. Everything in between
will be measured. In this case, it is another workflow that does not need know about this decoration. The
idea is to use a local identifier for the instantiated StopWatch and reuse this one at the end to receive the
post-invoke life-cycle event twice.
module MeasuredWorkflow

Workflow {
component = StopWatch: stopWatch {}
component = @OtherWorkflow {}
component = stopWatch
}

6.2. Language Reference


MWE2 has a few well defined concepts which can be combined to assemble arbitrary object graphs in
a compact and declarative manner.
• A MWE2 file defines a module which exposes its root component as reusable artifact.
• Properties can be used to extract reusable, configurable parts of the workflow.
• Components are mapped to plain vanilla Java objects. Arbitrary set- and add-methods are used to
configure them.
Let’s consider the follow short example module and SampleClass to explain these concepts.
module com.mycompany.Example

import java.util.*

SampleClass {
singleValue = 'a string'
multiValue = ArrayList {}
child = {}
}

Xtext 1.0 69
MWE2

package com.mycompany;

import java.util.List;

public class SampleClass {


public void setSingleValue(String value) {..}
public void addMultiValue(List<?> value) {..}
public void addChild(SampleClass value) {..}
}

6.2.1. Mapping to Java Classes


The module com.mycompany.Example defines a root component of type com.mycompany.SampleClass.
It is possible to use the simple class-name because MWE2 uses the very same visibility rules as the Java
compiler. Classes that are in the same package as the module can be referenced by their simple name.
The same rule applies for classes from the java.lang package. For convenience reasons is the package
org.eclipse.emf.mwe2.runtime.workflow implicitly imported as well as it exposes some library workflow
components. However, the imports are more flexible then in Java since MWE2-imports can be relative,
e.g. the import java.* resolves the reference util.ArrayList to java.util.ArrayList.
The root instance of type SampleClass has to be configured after it has been created. Therefore the method
setSingleValue will be called at first. The given parameter is 'a string'. The method is identified by its
name which starts with set. To allow to assign multi-value properties, MWE provides access to methods
called add* as well.
If the right side of the assignment in the workflow file does not define a class explicitly, its type is inferred
from the method parameter. The line child = {} is equivalent to child = SampleClass {} and creates a
new instance of SampleClass.
MWE2 ships with nice tool support. The editor will provide content assist for the allowed types and
highlight incompatible assignments. The available properties for Java classes will be proposed as well.

6.2.2. Module
As MWE2 modules have a fully qualified name, it is possible to refer to them from other modules. The
type of the module is derived from the type of its root component. The com.mycompany.Example can be
assigned at any place where a com.mycompany.SampleClass is expected.
Let’s create a second module com.mycompany.Second like this:
module com.mycompany.sub.Second

import com.mycompany.*

SampleClass {
child = @Example {}
}
The child value will be assigned to an instance of SampleClass that is configured as in the first example
workflow. This enables nice composition and a very focused, reusable component design.
As the same rules apply in MWE2 like in Java, the module com.mycompany.sub.Second has to be defined
in a file called Second.mwe2 in the package com.mycompany.sub. The import semantic for other modules
is the same as for classes. The import statement allows to refer to com.mycompany.Example with a
shortened name.

6.2.3. Properties
MWE2 allows to extract arbitrary information into properties to ensure that these pieces are not cluttered
around the workflow and to allow for easier external customization. The exemplary component definition
was only changed slightly by introducing a property value.

Xtext 1.0 70
MWE2

module com.mycompany.Example

var value = 'a string'

SampleClass {
singleValue = value
}
The type of the property will be derived from the default value similar to the mechanism that is already
known from set- and add-methods. If no default value is given, java.lang.String will be assumed.
However, properties are not limited to strings. The second built in type is boolean via the familiar literals
true and false. More flexibility is available via actual component literals.
module com.mycompany.Example

var childInstance = SampleClass {


singleValue = "child"
}

SampleClass {
child = childInstance
}
If one wants to define string properties that are actual reusable parts for other properties, she may use
defined variables inside other literals like this:
var aString = "part"
var anotherString = "reuse the ${part} here"
This is especially useful for file paths in workflows as one would usually want to define some common
root directories only ones in the workflow and reuse this fragment across certain other file locations.

6.2.4. Mandatory Properties


It is not always feasible to define default values for properties. That is where mandatory properties come
into play. Modules define their interface not only via their fully qualified name and the type of the root
component but also by means of the defined properties.
module com.mycompany.Example

var optional = 'a string'


var mandatory

SampleClass {
singleValue = optional
child = {
singleValue = mandatory
}
}
This version of the example module exposes two externally assignable properties. The second one has no
default value assigned and is thereby considered to be mandatory. The mandatory value must be assigned
if we reuse org.mycompany.Example in another module like this:
module com.mycompany.Second

var newMandatory

@Example {
mandatory = "mandatoryValue"
optional = newMandatory
}
Note that it is even possible to reuse another module as the root component of a new module. In this
case we set the mandatory property of Example to a specific constant value while the previously optional
value is now redefined as mandatory by means of a new property without a default value.
It is not only possible to define mandatory properties for MWE2 modules but for classes as well. Therefore
MWE2 ships with the @Mandatory annotation. If a set- or add-method is marked as mandatory, the
module validation will fail if no value was assigned to that feature.

Xtext 1.0 71
MWE2

6.2.5. Named Components


Properties are not the only way to define something that can be reused. It is possible to assign a name
to any instantiated component whether it’s created from a class literal or from another component. This
allows to refer to previously created and configured instances. Named instances can come handy for
notification and call-back mechanisms or more general in terms of defined life-cycle events.
If we wanted to assign the created instance to a property of itself, we could use the following syntax:
module com.mycompany.Example

SampleClass : self {
child = self
}
A named component can be referenced immediately after its creation but it is not possible to define
forward references in a MWE2 file.

6.2.6. Auto Injection


Existing modules or classes often expose a set of properties that will be assigned to features of its root
component or set- and add- methods respectively. In many cases its quite hard to come up with yet another
name for the very same concept which leads to the situation where the properties itself have the very
same name as the component’s feature. To avoid the overall repetition of assignments, MWE2 offers the
possibility to use the auto-inject modifier on the component literal:
module com.mycompany.Example

var child = SampleClass {}

SampleClass auto-inject {
}
This example will implicitly assign the value of the property child to the feature child of the root
component. This is especially useful for highly configurable workflows that expose dozens of optional
parameters each of which can be assigned to one or more components.
The auto-inject modifier can be used for a subset of the available features as well. It will suppressed for
the explicitly set values of a component.

6.3. Syntax Reference


The following chapter serves as a reference for the concrete syntax of MWE2. The building blocks of a
module will be described in a few words.
MWE2 is not sensitive to white-space and allows to define line-comments and block comments
everywhere. The syntax is the same as one is used to from the Java language:
// This is a comment
/*
This is another one.
*/
Every name in MWE2 can be a fully qualified identifier and must follow the Java conventions. However,
in contrast to Java identifiers it is not allowed to use German umlauts or Unicode escape sequences in
identifiers. A valid ID-segment in MWE2 starts with a letter or an underscore and is followed by any
number of letters, numbers or underscores. An identifier is composed from one or more segments which
are delimited by a ‘.’ dot.
Name: ID ('.' ID)*;
ID: ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;
MWE2 does not use a semicolon as statement delimiter at any place.

6.3.1. Module

Xtext 1.0 72
MWE2

A module consists of four parts. The very first statement in a *.mwe2 file is the module declaration. The
name of the module must follow the naming convention for Java classes. That MWE2 file’s name must
therefore be the same as the last segment of the module-name and it has to be placed in the appropriate
package of a Java source path.
It is allowed to define any number of import statements in a module. Imports are either suffixed by a
wild-card or concrete for a class or module. MWE2 can handle relative imports in case one uses the wild-
card notation:
'import' name '.*'?

6.3.2. Property
The list of declared properties follows the optional import section. It is allowed to define modules without
any properties.

Each declared property is locally visible in the module. It furthermore defines an assignable feature of
the module in case one refers to it from another module. Properties may either have a default value or
they are considered to be mandatory. If the type of property is omitted it will be inferred from the default
value. The default type of a property is java.lang.String so if no default value is available, the property
is mandatory and of type String.
There are four types of values available in MWE2. One may either define a string, boolean or component
literal or a reference to a previously defined property.

6.3.3. Component
The last part of a module is the root component. It defines the externally visible type of the module and
may either be created from a Java type or from another module.

The type of the component can be derived in many cases except for the root component. That’s why
it’s optional in the component literal. If no type is given, it will be inferred from the left side of the
assignment. The assigned feature can either be a declared property of the module or a set- or add-method
of a Java class.
Components can be named to make them referable in subsequent assignments. Following the ':' keyword,
one can define an identifier for the instantiated component. The identifier is locally visible in the module
and any assignment that is defined after the named component can refer to this identifier and thereby
point to exactly the instantiated object.
The next option for a component is auto-inject. If this modifier is set on a component, any available
feature of the component that has the same name as a property or previously created named component
will be automatically assigned.
The core of a component is the list of assignments between the curly braces. An arbitrary number of
values can be set on the component by means of feature-to-value pairs.

Xtext 1.0 73
MWE2

The available constructs on the right hand side of the assignment are the same as for default values for
properties.

6.3.4. String Literals


String values are likely to be the most used literals in MWE2. There is a convenient syntax for string
concatenation available due to the high relevance in a description object composition and configuration
language. MWE2 strings are multi-line strings and can be composed of several parts.
var aString = 'a value'
var anotherString = 'It is possible to embed ${aString} into
a multi-line string'
This is especially convenient for path-substitution if one defines e.g. a common root directory and wants
to specify other paths relative to the base.
There are two different delimiters available for strings. Users are free to either use single- or double-
quotes to start and end strings. If a certain string contains a lot of single-quotes one would better choose
double-quotes as delimiter and vice versa. There is no semantic difference between both notations.
The escape character in MWE2 is the back-slash "\". It can be used to write line-breaks or tabular
characters explicitly and to escape the beginning of substitution variables ${ and the quotes itself. Allowed
escape sequences are:
\n .. line break
\r .. carriage return
\t .. tabular character
\' .. single-quote (can be omitted in double-quoted strings)
\" .. double-quote (can be omitted in single-quoted strings)
\${ .. escape the substitution variable start ${
\\ .. the back-slash itself
Other escape sequence are illegal in MWE2 strings.

6.3.5. Boolean Literals


MWE2 has native support for the boolean type. The literals are true and false.

6.3.6. References
Each assigned value in MWE2 either as default for properties or in a component assignment can be a
reference to a previously declared property or named component. The can be referenced intuitively by
their name.

Xtext 1.0 74
Chapter 7. IDE Concepts
For the following part we will refer to a concrete example grammar in order to explain certain aspect of
the UI more clearly. The used example grammar is as follows:
grammar org.eclipse.text.documentation.Sample
with org.eclipse.xtext.common.Terminals

generate gen 'http://www.eclipse.org/xtext/documentation/Sample'

Model :
"model" intAttribute=INT (stringDescription=STRING)? "{"
(rules += AbstractRule)*
"}"
;

AbstractRule:
RuleA | RuleB
;

RuleA :
"RuleA" "(" name = ID ")" ;

RuleB return gen::CustomType:


"RuleB" "(" ruleA = [RuleA] ")" ;

7.1. Label Provider


There are various places in the UI in which model elements have to be presented to the user: In the
outline view, in hyper links, in content proposals, find dialogs etc. Xtext allows to customize each of
these appearances by individual ILabelProviders.
An ILabelProvider has two methods: getText(Object) returns the text in an object’s label, while
getImage(Object) returns the icon. In addition, the Eclipse UI framework offers the IStyledLabelProvider,
which returns a styled string (i.e. with custom fonts, colors etc.) in the getStyledText(Object) method.
Almost all label providers in the Xtext framework inherit from the base class AbstractLabelProvider
which unifies both approaches. Subclasses can either return a styled string or a string in the
doGetText(Object) method. The framework will automatically convert it to a styled text (with default
styles) or to a plain text in the respective methods.
Dealing with images can be cumbersome, too, as image handles tend to be scarce system resources. The
AbstractLabelProvider helps you managing the images: In your implementation of doGetImage(Object)
you can as well return an Image or a string, representing a path in the icons/ folder of the containing plug-
in. This path is actually configurable by Google Guice. Have a look at the PluginImageHelper to learn
about the customizing possibilities.
If you have the LabelProviderFragment in the list of generator fragments in the MWE2 workflow for
your language, it will automatically create stubs and bindings for an EObjectLabelProvider and an
DescriptionLabelProvider which you can implement by manually.

7.1.1. Label Providers For EObjects


The first set of label providers refers to actually loaded and thereby available model elements. By
default, Xtext binds the DefaultEObjectLabelProvider to all use cases, but you can change the binding
individually for the Outline, Content Assist or other places. For that purpose, there is a so called binding
annotation for each use case. For example, to use a custom MyContentAssistLabelProvider to display
elements in the content assist, you have to override

Xtext 1.0 75
IDE Concepts

@Override
public void configureContentProposalLabelProvider(Binder binder) {
binder.bind(ILabelProvider.class)
.annotatedWith(ContentProposalLabelProvider.class)
.to(MyContentAssistLabelProvider.class);
}

p.in your language's UI module.

If your grammar uses an imported EPackage, there may be an existing .edit plug-in generated by EMF
that also provides label providers for model elements. To use this as a fallback, your label provider should
call the constructor with the delegate parameter and use DI for the initialization, e.g.
public class MyLabelProvider {
@Inject
public MyLabelProvider(AdapterFactoryLabelProvider delegate) {
super(delegate);
}
...
}

7.1.1.1. DefaultEObjectLabelProvider
The default implementation of the LabelProvider interface utilizes the polymorphic dispatcher idiom
to implement an external visitor as the requirements of the LabelProvider are kind of a best match
for this pattern. It comes down to the fact that the only thing you need to do is to implement a
method that matches a specific signature. It either provides a image filename or the text to be used to
represent your model element. Have a look at following example to get a more detailed idea about the
DefaultEObjectLabelProvider.
public class SampleLabelProvider extends DefaultLabelProvider {

String text(RuleA rule) {


return "Rule: " + rule.getName();
}

String image(RuleA rule) {


return "ruleA.gif";
}

String image(RuleB rule) {


return "ruleB.gif";
}
}

What is especially nice about the default implementation is the actual reason for its class name: It provides
very reasonable defaults. To compute the label for a certain model element, it will at first have a look
for an EAttribute name and try to use this one. If it cannot find such a feature, it will try to use the first
feature, that can be used best as a label. At worst it will return the class name of the model element, which
is kind of unlikely to happen.
You can a also customize error handling by overriding the methods handleTextError() or
handleImageError().

7.1.2. Label Providers For Index Entries


Xtext maintains an index of all model elements to allow quick searching and linking without loading
the referenced resource (See the chapter on index-based scopes for details). The elements from
this index also appear in some UI contexts, e.g. in the Find model elements dialog or in the Find
references view. For reasons of scalability, the UI should not automatically load resources, so we
need another implementation of a label provider that works with the elements from the index, i.e.
IResourceDescription, IEObjectDescription, and IReferenceDescription.

Xtext 1.0 76
IDE Concepts

The default implementation of this service is the DefaultDescriptionLabelProvider. It employs the


same polymorphic dispatch mechanism as the DefaultEObjectLabelProvider. The default text of an
EObjectDescription is its indexed name. The image is resolved by dispatching to image(EClass)
with the EClass of the described object. This is likely the only method you want to override.
IResourceDescriptions will be represented with their path and the icon registered for your language’s
editor.
To have a custom description label provider, make sure it is bound in your UI module:
public void configureResourceUIServiceLabelProvider(Binder binder) {
binder.bind(ILabelProvider.class)
.annotatedWith(ResourceServiceDescriptionLabelProvider.class)
.to(MyCustomDefaultDescriptionLabelProvider.class);
}

7.2. Content Assist


The Xtext generator, amongst other things, generates the following two content assist (CA) related
artifacts:
• an abstract proposal provider class named Abstract[Language]ProposalProvider generated into the
src-gen folder within the ui project
• a concrete sub-class in the src-folder of the ui project called ProposalProvider
First we will investigate the generated Abstract[Language]ProposalProvider with methods that look like
this:

7.2.1. ProposalProvider
public void complete[TypeName]_[FeatureName](
EObject model,
Assignment assignment,
ContentAssistContext context,
ICompletionProposalAcceptor acceptor) {
// clients may override
}

public void complete_[RuleName](


EObject model,
RuleCall ruleCall,
ContentAssistContext context,
ICompletionProposalAcceptor acceptor) {
// clients may override
}

The snippet above indicates that the generated ProposalProvider class contains a complete*-method for
each assigned feature in the grammar and for each rule. The brackets are place-holders that should give a
clue about the naming scheme used to create the various entry points for clients. The generated proposal
provider falls back to some default behavior for cross-references and keywords. Furthermore it inherits
the logic that was introduced in reused grammars.
Clients who want to customize the behavior may override the methods from the
AbstractProposalProvider or introduce new methods with a specialized first parameter. The framework
dispatches method calls according to the current context to the most concrete implementation, that can
be found.
It is important to know, that for a given offset in a model file, many possible grammar elements exist.
The framework dispatches to the method declarations for any valid element. That means, that a bunch
of complete* methods may be called.

7.2.2. Sample Implementation


To provide a dummy proposal for the description of a model object, you may introduce a specialization
of the generated method and implement it as follows. This will give ‘Description for model #7’ for a
model with the intAttribute '7'

Xtext 1.0 77
IDE Concepts

public void completeModel_StringDescription (


Model model,
Assignment assignment,
ContentAssistContext context,
ICompletionProposalAcceptor acceptor) {
// call implementation in superclass
super.completeModel_StringDescription(
model,
assignment,
context,
acceptor);

// compute the plain proposal


String proposal = "Description for model #" + model.getIntAttribute();

// convert it to a valid STRING-terminal


proposal = getValueConverter().toString(proposal, "STRING");

// create the completion proposal


// the result may be null as the createCompletionProposal(..) methods
// check for valid prefixes
// and terminal token conflicts
ICompletionProposal completionProposal =
createCompletionProposal(proposal, context);

// register the proposal, the acceptor handles null-values gracefully


acceptor.accept(completionProposal);
}

7.3. Quick Fixes


For validations written using the AbstractDeclarativeValidator it is possible to provide corresponding
quick fixes in the editor. To be able to implement a quick fix for a given diagnostic (a warning or error)
the underlying cause of the diagnostic must be known (i.e. what actual problem does the diagnostic
represent?), otherwise the fix doesn’t know what needs to be done. As we don’t want to deduce this from
the diagnostic’s error message we associate a problem specific code with the diagnostic.
In the following example taken from the DomainmodelJavaValidator the diagnostic’s code is given
by the third argument to the warning() method and it is a reference to the static String field
_==INVALID_TYPE_NAME==_ in the validator class.
warning("Name should start with a capital",
DomainmodelPackage.TYPE__NAME, INVALID_TYPE_NAME, type.getName());

Now that the validation has a unique code identifying the problem we can register quick fixes for it.
We start by adding the QuickfixProviderFragment to our workflow and after regenerating the code we
should find an empty class MyDslQuickfixProvider in our DSL’s UI project and new entries in the
_==plugin.xml_gen==_ file.
Continuing with the _==INVALID_TYPE_NAME==_ problem from the Domainmodel example we add
a method with which the problem can be fixed (have a look at the DomainmodelQuickfixProvider for
details):

Xtext 1.0 78
IDE Concepts

@Fix(DomainmodelJavaValidator.INVALID_TYPE_NAME)
public void fixName(final Issue issue, IssueResolutionAcceptor acceptor) {
acceptor.accept(issue,
"Capitalize name", // quick fix label
"Capitalize name of '" + issue.getData()[0] + "'", // description
"upcase.png", // quick fix icon
new IModification() {
public void apply(IModificationContext context)
throws BadLocationException {
IXtextDocument xtextDocument = context.getXtextDocument();
String firstLetter = xtextDocument.get(issue.getOffset(), 1);
xtextDocument.replace(issue.getOffset(), 1,
Strings.toFirstUpper(firstLetter));
}
}
);
}

By using the correct signature (see below) and annotating the method with the @Fix annotation
referencing the previously specified issue code from the validator, Xtext knows that this method
implements a fix for the problem. This also allows us to annotate multiple methods as fixes for the same
problem.
The first three parameters given to the IssueResolutionAcceptor define the UI representation of the
quick fix. As the document is not necessarily loaded when the quick fix is offered, we need to provide
any additional data from the model that we want to refer to in the UI when creating the issue in the
validator above. In this case, we provided the existing type name. The additional data is available as
Issue.getData(). As it is persisted in markers, only strings are allowed.
The actual model modification is implemented in the IModification. The IModificationContext provides
access to the erroneous document. In this case, we’re using Eclipse’s IDocument API to replace a text
region.
If you prefer to implement the quick fix in terms of the semantic model use a ISemanticModification
instead. Its apply(EObject, IModificationContext) method will be invoked inside a modify-transaction
and the first argument will be the erroneous semantic element. This makes it very easy for the fix method
to modify the model as necessary. After the method returns the model as well as the Xtext editor’s content
will be updated accordingly. If the method fails (throws an exception) the change will not be committed.
The following snippet shows a semantic quick fix for a similar problem.
@Fix(DomainmodelJavaValidator.INVALID_FEATURE_NAME)
public void fixFeatureName(final Issue issue,
IssueResolutionAcceptor acceptor) {
acceptor.accept(issue,
"Uncapitalize name", // label
"Uncapitalize name of '" + issue.getData()[0] + "'", // description
"upcase.png", // icon
new ISemanticModification() {
public void apply(EObject element, IModificationContext context) {
((Feature) element).setName(
Strings.toFirstLower(issue.getData()[0]));
}
}
);
}

7.3.1. Quickfixes for Linking Errors and Syntax Errors


You can even define quick fixes for linking errors. The issue codes are assigned by the
ILinkingDiagnosticMessageProvider. Have a look at the domain model example how to add quick fixes
for these errors.
Analogously, there is the ISyntaxErrorMessageProvider to assign issue codes to syntactical errors.

Xtext 1.0 79
IDE Concepts

7.4. Template Proposals


Xtext-based editors automatically support code templates. That means that you get the corresponding
preference page where users can add and change template proposals. If you want to ship a couple of default
templates, you have to put a file named templates.xml inside the templates directory of the generated UI-
plug-in. This file contains templates in a format as described in the Eclipse online help .

By default Xtext registers ContextTypes for each rule ( .[RuleName]) and for each
keyword ( .kw[keyword]_). If you don’t like these defaults you’ll have to subclass
XtextTemplateContextTypeRegistry and configure it via Guice.
In addition to the standard template proposal extension mechanism, Xtext ships with
a predefined set of TemplateVariableResolvers to resolve special variable types inside a
given template (i.e. TemplateContext). Besides the standard template variables available in
org.eclipse.jface.text.templates.GlobalTemplateVariables like ${user}, ${date}, ${time}, ${cursor},
etc., these TemplateVariableResolver support the automatic resolving of CrossReferences (type
CrossReferences) and Enumerations (type Enum) like it is explained in the following sections.
It is best practice to edit the templates in the preferences page, export them into the templates.xml-file
and put this one into the templates folder of your UI-plug-in. However, these templates will not be visible
by default. To fix it, you have to manually edit the xml-file and insert an ID attribute for each template
element.

7.4.1. CrossReference TemplateVariableResolver


Xtext comes with a specific template variable resolver TemplateVariableResolver called
CrossReferenceTemplateVariableResolver, which can be used to place cross-references within a
template.
The syntax is as follows:
${<displayText>:CrossReference([<MyPackageName>.]<MyType>.<myRef>)}
For example the following template:

Xtext 1.0 80
IDE Concepts

<template name="transition" description="event transition" id="transition"


context="org.eclipse.xtext.example.FowlerDsl.Transition" enabled="true">
${event:CrossReference('Transition.event')} =>
${state:CrossReference('Transition.state')
</template>

yields the text event => state and allows selecting any events and states using a drop down.

7.4.2. Enumeration TemplateVariableResolver


The EnumTemplateVariableResolver resolves a template variable to EEnumLiteral literals which are
assignment-compatible to the enumeration type declared as the first parameter of the the Enum
TemplateVariable.
The syntax is as follows:
${<displayText>:Enum([<MyPackage>.]<EnumType>)
For example the following template (taken from the domainmodel example):
<template name="Operation" description="template for an Operation"
id="org.eclipse.xtext.example.Domainmodel.Operation"
context="org.eclipse.xtext.example.Domainmodel.Operation"
enabled="true">
${visibility:Enum('Visibility')} op ${name}(${cursor}):
${type:CrossReference('Operation.type')}
</template>

yields the text public op name(): type where the display text ‘public’ is replaced with a drop down filled
with the literal values as defined in the EEnum Visibility. Also, name and type are template variables.

Xtext 1.0 81
IDE Concepts

7.5. Outline View


Xtext provides an outline view to help you navigate your models. By default, it provides a hierarchical
view on your model and allows you to sort tree elements alphabetically. Selecting an element in the
outline will highlight the corresponding element in the text editor. Users can choose to synchronize the
outline with the editor selection by clicking the Link with Editor button.

You can customize various aspects of the outline by providing implementation for its various interfaces.
The following sections show how to do this.

7.5.1. Influencing the Outline Structure


In its default implementation, the outline view shows the containment hierarchy of your model. This
should be sufficient in most cases. If you want to adjust the structure of the outline, i.e., by omitting
a certain kind of node or by introducing additional even virtual nodes, you customize the outline by
implementing ISemanticModelTransformer.
The Xtext wizard creates an empty transformer class ( MyDslTransformer) for your convenience. To
transform the semantic model delivered by the Xtext parser, you need to provide transformation methods
for each of the EClasses that are of interest:

Xtext 1.0 82
IDE Concepts

public class MyDslTransformer extends


AbstractDeclarativeSemanticModelTransformer {
/**
* This method will be called by naming convention:
* - method name must be createNode
* - first param: subclass of EObject
* - second param: ContentOutlineNode
*/
public ContentOutlineNode createNode(
Attribute semanticNode, ContentOutlineNode parentNode) {
ContentOutlineNode node = super.newOutlineNode(semanticNode, parentNode);
node.setLabel("special " + node.getLabel());
return node;
}

public ContentOutlineNode createNode(


Property semanticNode, ContentOutlineNode parentNode) {
ContentOutlineNode node = super.newOutlineNode(semanticNode, parentNode);
node.setLabel("pimped " + node.getLabel());
return node;
}

/**
* This method will be called by naming convention:
* - method name must be getChildren
* - first param: subclass of EObject
*/
public List<EObject> getChildren(Attribute attribute) {
return attribute.eContents();
}

public List<EObject> getChildren(Property property) {


return NO_CHILDREN;
}
}

To make sure Xtext picks up your new outline transformer, you have to register your implementation
with your UI module. This binding is usually added by the respective generator fragment.
public class MyDslUiModule extends AbstractMyDslUiModule {

@Override
public Class<? extends ISemanticModelTransformer>
bindISemanticModelTransformer() {
return MyDslTransformer.class;
}
...
}

7.5.2. Filtering
Often, you want to allow users to filter the contents of the outline to make it easier to concentrate
on the relevant aspects of the model. To add filtering capabilities to your outline, you need to add
AbstractFilterActions to the outline. Actions can be contributed by implementing and registering a
DeclarativeActionBarContributor.
To register a DeclarativeActionBarContributor, add the following lines to your MyDslUiModule class:

Xtext 1.0 83
IDE Concepts

public class MyDslUiModule extends AbstractMyDslUiModule {

@Override
public Class<? extends IActionBarContributor> bindIActionBarContributor() {
return MyDslActionBarContributor.class;
}
...
}

The action bar contributor will look like this:


public class MyDslActionBarContributor extends
DeclarativeActionBarContributor {
public Action addFilterParserRulesToolbarAction(
XtextContentOutlinePage page) {
return new FilterFooAction(page);
}
}

Filter actions must extend AbstractFilterAction (this ensures that the action toggle state is handled
correctly):
public class FilterFooAction extends AbstractFilterAction {

public FilterFooAction(XtextContentOutlinePage outlinePage) {


super("Filter Foo", outlinePage);
setToolTipText("Show / hide foo");
setDescription("Show / hide foo");
setImageDescriptor(Activator.getImageDescriptor("icons/foo.gif"));
setDisabledImageDescriptor(
Activator.getImageDescriptor("icons/foo.gif"));
}

@Override
protected String getToggleId() {
return "FilterFooAction.isChecked";
}

@Override
protected ViewerFilter createFilter() {
return new FooOutlineFilter();
}

The filtering itself will be performed by FooOutlineFilter:


public class FooOutlineFilter extends ViewerFilter {

@Override
public boolean select(
Viewer viewer, Object parentElement, Object element) {
if ((parentElement != null)
&& (parentElement instanceof ContentOutlineNode)) {
ContentOutlineNode parentNode = (ContentOutlineNode) parentElement;
EClass clazz = parentNode.getClazz();
if (clazz.equals(MyDslPackage.Literals.ATTRIBUTE)) {
return false;
}
}
return true;
}

Xtext 1.0 84
IDE Concepts

7.5.3. Context menus


You might want to register context menu actions for specific elements in the outline, e.g. to allow
users of your DSL to invoke a generator or to validate the selected element. As all elements in the
outline are ContentOutlineNodes, you cannot easily register an Object contribution. (Besides, using the
extension point org.eclipse.ui.popupMenus is regarded somewhat old school - you should rather use the
new command and expression framework, as depicted below).
To register context menus for specific node types of your Ecore model, we need to:
• implement IContentOutlineNodeAdapterFactory which will translate ContentOutlineNodes to their
underlying node type
• register a menu contribution to add a command / handler pair to the context menu for the specific node
types you’re interested in.

7.5.3.1. Registering an IContentOutlineNodeAdapterFactory


The Xtext code generator creates a subclass of DefaultContentOutlineNodeAdapterFactory. All we need
to do is specify a list of types that we later want to bind context menu contributions to.
public class MyDslContentOutlineNodeAdapterFactory extends
DefaultContentOutlineNodeAdapterFactory {
private static final Class<?>[] types = { Attribute.class };

public Class<?>[] getAdapterList() {


return types;
}
}

If you want to bind context menu actions to nodes representing Attribute and Entity, you need to change
the declaration of _types as follows:
private static final Class<?>[] types = { Attribute.class, Entity.class };

7.5.3.2. Registering a Menu Contribution


You can now add command / handler pairs to the context menu.
First, you need to define a command – it will serve as a handle to glue together the handler and the menu
contribution:
<extension
point="org.eclipse.ui.commands">
<command
id="org.example.mydsl.ui.editor.outline.SampleOutlineCommand"
name="Sample Command"
description="Just a sample command">
</command>
</extension>

Next, you need to define a handler which will eventually execute the code to operate on the selected node.
Please pay special attention to the attribute commandId - it must match the id attribute of your command.
<extension
point="org.eclipse.ui.handlers">
<handler
class="org.example.mydsl.ui.editor.outline.SampleOutlineNodeHandler"
commandId="org.example.mydsl.ui.editor.outline.SampleOutlineCommand">
</handler>
</extension>

Finally, define a menuContribution to add the command to the context menu:

Xtext 1.0 85
IDE Concepts

<extension
point="org.eclipse.ui.menus">
<menuContribution
locationURI="popup:org.eclipse.xtext.ui.outline?after=additions">
<command
commandId="org.example.mydsl.ui.editor.outline.SampleOutlineCommand"
label="Sample action registered for Attributes">
<visibleWhen checkEnabled="false">
<iterate>
<adapt type="org.example.mydsl.Attribute" />
</iterate>
</visibleWhen>
</command>
</menuContribution>
</extension>

Again, pay attention to the commandId attribute. The connection between your node type(s) and the menu
contribution is made by the part
<adapt></adapt>
.

7.5.4. Quick Outline


Xtext also provides a quick outline: If you press CTRL-O in an Xtext editor, the outline of the model is
shown in a popup window. The quick outline also supports drill-down search with wildcards. To enable
the quick outline, you have to put the QuickOutlineFragment into your workflow.

7.6. Hyperlinking
The Xtext editor provides hyperlinking support for any tokens corresponding to cross-references in your
grammar definition. You can either CTRL-click on any of these tokens or hit F3 while the cursor position
is at the token in question and this will take you to the referenced model element. As you’d expect this
works for references to elements in the same resource as well as for references to elements in other
resources. In the latter case the referenced resource will first be opened using the corresponding editor.

7.6.1. Location Provider


When navigating a hyperlink, Xtext will also select the text region corresponding to the referenced
model element. Determining this text region is the responsibility of the ILocationInFileProvider. The
default implementation ( DefaultLocationInFileProvider) implements a best effort strategy which can be
summarized as:
1. If the model element’s type (i.e. EClass) declares a feature name then return the region of the
corresponding token(s). As a fallback also check for a feature id.
2. If the model element’s parse tree contains any top-level tokens corresponding to ID rule calls in the
grammar then return a region spanning all those tokens.
3. As a last resort return the region corresponding to the first keyword token of the referenced model
element.

7.6.1.1. Customized Location Provider


As the default strategy is a best effort it may not always result in the selection you want. If that’s the case
you can override the ILocationInFileProvider binding in the UI module as in the following example:
public class MyDslUiModule extends AbstractMyDslUiModule {

@Override
public Class<? extends ILocationInFileProvider>
bindILocationInFileProvider() {
return MyDslLocationInFileProvider.class;
}
...
}

Xtext 1.0 86
IDE Concepts

Often the default strategy only needs some guidance (e.g. selecting the text corresponding to another
feature than name). In that case you can simply subclass DefaultLocationInFileProvider and override the
methods getIdentifierFeature() and / or useKeyword() to guide the first and last steps of the strategy as
described above (see XtextLocationInFileProvider for an example).

7.6.2. Customizing Available Hyperlinks


The hyperlinks are provided by the HyperlinkHelper which will create links for cross-referenced
objects by default. Clients may want to override createHyperlinksByOffset(XtextResource, int,
IHyperlinkAcceptor) to provide additional links or supersede the default implementation.

7.7. Syntax Coloring


Besides the already mentioned advanced features like content assist and code formatting the powerful
editor for your DSL is capable to mark up your model-code to improve the overall readability. It is
possible to use different colors and fonts according to the meaning of the different parts of your input
file. One may want to use some decent colors for large blocks of comments while identifiers, keywords
and strings should be colored differently to make it easier to distinguish between them. This kind of text
decorating markup does not influence the semantics of the various sections but helps to understand the
meaning and to find errors in the source code.

The highlighting is done in two stages. This allows for sophisticated algorithms that are executed
asynchronously to provide advanced coloring while simple pattern matching may be used to highlight
parts of the text instantaneously. The latter is called lexical highlighting while the first is based on the
meaning of your different model elements and therefore called semantic highlighting.
When you introduce new highlighting styles, the preference page for your DSL is automatically
configured and allows the customization of any registered highlighting setting. They are automatically
persisted and reloaded on startup.

7.7.1. Lexical Highlighting


The lexical highlighting can be customized by providing implementations of the interface
IHighlightingConfiguration and the abstract class AbstractTokenScanner. The latter fulfills the interface
ITokenScanner from the underlying JFace Framework, which may be implemented by clients directly.

Xtext 1.0 87
IDE Concepts

The IHighlightingConfiguration is used to register any default style without a specific binding to a pattern
in the model file. It is used to populate the preferences page and to initialize the ITextAttributeProvider,
which in turn is the component that is used to obtain the actual settings for a style’s id. An implementation
will usually be very similar to the DefaultHighlightingConfiguration and read like this:
public class DefaultHighlightingConfiguration
implements IHighlightingConfiguration {

public static final String KEYWORD_ID = "keyword";


public static final String COMMENT_ID = "comment";

public void configure(IHighlightingConfigurationAcceptor acceptor) {


acceptor.acceptDefaultHighlighting(KEYWORD_ID, "Keyword",
keywordTextStyle());
acceptor.acceptDefaultHighlighting(COMMENT_ID, "Comment", // ...
// ...
}

public TextStyle keywordTextStyle() {


TextStyle textStyle = new TextStyle();
textStyle.setColor(new RGB(127, 0, 85));
textStyle.setStyle(SWT.BOLD);
return textStyle;
}

// ...
}

Implementations of the ITokenScanner are responsible for splitting the content of a document into various
parts, the so called tokens, and return the highlighting information for each identified range. It is critical
that this is done very fast because this component is used on each keystroke. Xtext ships with a default
implementation that is based on the lexer that is generated by ANTLR which is very lightweight and
fast. This default implementation can be customized by clients easily. They simply have to bind another
implementation of the AbstractAntlrTokenToAttributeIdMapper. To get an idea about it, have a look at
the DefaultAntlrTokenToAttributeIdMapper.

7.7.2. Semantic Highlighting


The semantic highlighting stage is executed asynchronously in the background and can be used to
calculated highlighting states based on the meaning of the different model elements. Users of the editor
will notice a very short delay after they have edited the text until the styles are actually applied to the
document. This keeps the editor responsive while providing aid when reading and writing your model.
As for the lexical highlighting the interface to register the available styles is the
IHighlightingConfiguration. The ISemanticHighlightingCalculator is the primary hook to implement the
logic that will compute to-be-highlighted ranges based on the model elements.
The framework will pass the current XtextResource and an IHighlightedPositionAcceptor to the
calculator. It is ensured, that the resource will not be altered externally until the called method
provideHighlightingFor() returns. However, the resource may be null. The implementor’s task is to
navigate your semantic model and compute various ranges based on the attached node information and
associate styles with them. This may read similar to the following snippet:

Xtext 1.0 88
IDE Concepts

public void provideHighlightingFor(XtextResource resource,


IHighlightedPositionAcceptor acceptor) {
if (resource == null)
return;

Iterable<AbstractNode> allNodes = NodeUtil.getAllContents(


resource.getParseResult().getRootNode());
for (AbstractNode node : allNodes) {
if (node.getGrammarElement() instanceof CrossReference) {
acceptor.addPosition(node.getOffset(), node.getLength(),
MyHighlightingConfiguration.CROSS_REF);
}
}
}

This example refers to an implementation of the IHighlightingConfiguration that registers a style for a
cross-reference. It is pretty much the same implementation as for the previously mentioned sample of a
lexical IHighlightingConfiguration.
public class HighlightingConfiguration
implements IHighlightingConfiguration {

// lexical stuff goes here


// ..
public final static String CROSS_REF = "CrossReference";

public void configure(IHighlightingConfigurationAcceptor acceptor) {


// lexical stuff goes here
// ..
acceptor.acceptDefaultHighlighting(CROSS_REF,
"Cross-References", crossReferenceTextStyle());
}

public TextStyle crossReferenceTextStyle() {


TextStyle textStyle = new TextStyle();
textStyle.setStyle(SWT.ITALIC);
return textStyle;
}
}

The implementor of an IHighlightingCalculator should be aware of performance to ensure a good user


experience. It is probably not a good idea to traverse everything of your model when you will only
register a few highlighted ranges that can be found easier with some typed method calls. It is strongly
advised to use purposeful ways to navigate your model. The parts of Xtext’s core that are responsible
for the semantic highlighting are pretty optimized in this regard as well. The framework will only update
the ranges that actually have been altered, for example. This speeds up the redraw process. It will even
move, shrink or enlarge previously announced regions based on a best guess before the next semantic
highlighting pass has been triggered after the user has changed the document.

7.8. Project Wizard


Optionally, Xtext can generate a New Project Wizard for your DSL. Using this wizard a user can create a
new project with only a few clicks that is configured with the right dependencies and natures. By default
it contains a sample model file and workflow that serve as a scaffold for the user. Furthermore it has the
Xtext nature and thereby the builder assigned. You enable the generation of the project wizard by adding
the SimpleProjectWizardFragment fragment to the workflow:

Xtext 1.0 89
IDE Concepts

import org.eclipse.xtext.ui.generator.*

// project wizard fragment


fragment = projectWizard.SimpleProjectWizardFragment {
generatorProjectName = "${projectName}.generator"
modelFileExtension = file.extensions
}

Here
• the generatorProjectName is used to specify the project that contains the workflows and templates
used to generate artifacts from your DSL. The generated project wizard uses this to add a corresponding
dependency to the created project.
• and the modelFileExtension specifies the default file extension associated with your DSL. When the
user clicks the Finish button the project wizard will look for a file with the given extension in the
created project and open a editor.
After running the Xtext generator, the DSL’s UI project will contain a new Xpand template
MyDslNewProject.xpt in the src folder in the package .ui.wizard. Note: It may be necessary to manually
merge the new entry in the _plugin.xml___gen_ into your plugin.xml of the UI project to enable the
wizard contribution.
The generated Xpand template will be expanded by the project wizard when the user clicks the Finish
button and it is responsible for initializing the project with some sample content. When finishing the
wizard the template will be used to create a sample model file and a simple workflow to run the model
through the generator project’s MyDslGenerator.mwe workflow. However, this is only a pragmatic
default. As the Xpand template is in the src source folder you may of course modify it to generate whatever
initial content you want for a new project. Just make sure to leave the top-level main definition in place,
as that is the interface for the project wizard.
Note: To edit the generated Xpand template you should check that the JavaBeans meta model contributor
is enabled under Preferences > Xtend/Xpand. Further you should also configure the UI project with the
Xpand/Xtend nature using Configure > Add Xpand/Xtend Nature in the context menu.

7.8.1. Customizing the Project Wizard


To further customize the creation of the project you can implement your own project creator. The default
project creator is represented by the generated class MyDslProjectCreator in the src-gen folder. It is
highly extensible. Without any changes it will simply create a new plug-in project with the Xtext nature
assigned. Afterwards it will execute the main definition of the Xpand template as described above.
To add more pages or input fields to the project wizard you should subclass the class
MyDslNewProjectWizard as appropriate. Don’t forget to register the subclass in the UI project’s
MyDslUiModule. You may also want to make additionally entered user data available to the Xpand
template. In this case you should enhance the MyDslProjectInfo to allow that one to hold the information.
This is the context object which gets passed to the template when it’s executed. Don’t forget that your
specialized MyDslNewProjectWizard has to populate the data fields of your MyDslProjectInfo.

Xtext 1.0 90
Chapter 8. Referring to Java Types
A common use case when developing languages is the requirement to refer to existing concepts of other
languages. Xtext makes this very easy for other self defined languages. However, it is often very useful
to have access to the available types of the Java Virtual Machine. The JVM types Ecore model enables
clients to do exactly this. It is possible to create cross-references to classes, interfaces, and their fields and
methods. Basically every information about the structural concepts of the Java type system is available
via the JVM types. This includes annotations and their specific values and enumeration literals as well.
The implementation will be selected transparently to the client code depending on where the code is
executed. If the environment is a plain stand-alone Java or OSGi environment, the java.lang.reflect API
will be used to deduce the necessary data. On the contrary, the type-model will be created from the live
data of the JDT in an interactive Eclipse environment. All this happens behind the scenes via different
implementations that are bound to specific interfaces with Google Guice.

8.1. How to Use Java Types


Using the JVM types model is very simple. First of all, the grammar has to import the JavaVMTypes
Ecore model. Thanks to content assist this is easy to spot in the list of proposals.
import "http://www.eclipse.org/xtext/common/JavaVMTypes" as types

The next step is to actually refer to an imported concept. Let’s define a mapping to actually available
Java types for the simple data types in the self defined language. By means of cross-references this works
as one got already used to when dealing with references in Xtext.
Type:
'type' name=ID 'mapped-to' javaType=[types::JvmType|FQN];

Last but not least, the TypesGeneratorFragment has to be added to the workflow. The safest way is to
add it after the actually used scoping fragments as a specialized version of the IGlobalScopeProvider
will be configured. Don’t forget to refer to the genmodel of the Java VM types. The shortest possible
URI is a classpath-URI.
fragment = ecore.EcoreGeneratorFragment {
referencedGenModels="classpath:/model/JavaVMTypes.genmodel"
}

...
// scoping and exporting API
...

fragment = types.TypesGeneratorFragment {}

After regenerating your language, it will be allowed to define a type Date that maps to the java.util.Date
like this:
type Date mapped-to java.util.Date

These two steps will provide a nice integration into the Eclipse JDT. There is Find References on Java
methods, fields and types that will reveal results in your language files. Go To Declaration works as
expected and content assist will propose the list of available types.

8.2. Customization Points


There are several customization hooks in the runtime layer and on the editor side as well:
The AbstractTypeScopeProvider can be used to create scopes for members with respect to the override
semantics of the Java language. Of course it is possible to use this implementation to create scopes for
types as well.
As the Java VM types expose a lot of information about visibility, parameter- and return types, generic,
available annotations or enumeration literals, it is very easy to define constraints for the referred types.

Xtext 1.0 91
Referring to Java Types

The ITypesProposalProvider can be used to provide optimized proposals based on various filter criteria.
The most common selector can be used directly via createSubTypeProposals. The implementation is
optimized and uses the JDT Index directly to minimize the effort for object instantiation. The class
TypeMatchFilters provides a comprehensive set of reusable filters that can be easily combined to reduce
the list of proposals to a smaller number of valid entries.
import static org.eclipse.xtext.common.types.xtext.ui.TypeMatchFilters.*;
..
..

proposalProvider.createSubTypeProposals(factory, context,
and(not(canInstantiate()), isPublic()), acceptor);

Since Xtext 1.0.1:


Proposals can be created by means of a scoped proposal provider that is actually aware of existing import
statements and scoping rules while still leveraging the JDT index for faster proposal creation. Use the
ITypesProposalProvider.IScopableTypesProposalProvider to create a scoped proposal provider and use
this one instead of the plain implementation.
@Override
public void complete<TypeName>_<ReferenceName>(
EObject model,
Assignment assignment,
ContentAssistContext context,
ICompletionProposalAcceptor acceptor) {
IScope scope = getScopeProvider().getScope(
model,
<MyDsl>Package.Literals.<POINTER_TO_REFERENCE>);
ITypesProposalProvider scopedProposalProvider =
scopableProposalProvider.getScopedProposalProvider(model, scope);
scopedProposalProvider.create...
}

Xtext 1.0 92
Chapter 9. Typical Language
Configurations
9.1. Case Insensitive Languages
In some cases, e.g. if your SHIFT key is broken, you might want to design a case insensitive language.
Xtext offers separate generator fragments for this purpose.
For case insensitive keywords, open your MWE workflow and replace the Antlr related fragments:
// The antlr parser generator fragment.
fragment = parser.antlr.XtextAntlrGeneratorFragment {
// options = {
// backtrack = true
// }
}
...

// generates a more lightweight Antlr parser and lexer tailored ...


fragment = parser.antlr.XtextAntlrUiGeneratorFragment {
}

with
// The antlr parser generator fragment.
fragment = parser.antlr.ex.rt.AntlrGeneratorFragment {
options = {
ignoreCase = true
}
}
...

// generates a more lightweight Antlr parser and lexer tailored ...


fragment = parser.antlr.ex.ca.ContentAssistParserGeneratorFragment {
options = {
ignoreCase = true
}
}

For case insensitve element names, use the ignoreCase option in your scope fragment, i.e.
fragment = scoping.ImportNamespacesScopingFragment {
ignoreCase = true
}

or if you are using importURI based global scopes


fragment = scoping.ImportURIScopingFragment {
ignoreCase = true
}

9.2. Languages Independent of JDT


The following section describes how you make your language independent of Eclipse’s Java
Development Toolkit (JDT).
In the UIModule of your language you have to overwrite some bindings. First, remove the bindings to
components with support for the ' classpath:' URI protocol, i.e.

Xtext 1.0 93
Typical Language Configurations

@Override
public Class<? extends IResourceForEditorInputFactory>
bindIResourceForEditorInputFactory() {
return ResourceForIEditorInputFactory.class;
}

@Override
public Class<? extends IResourceSetProvider> bindIResourceSetProvider() {
return SimpleResourceSetProvider.class;
}

Second, configure the global scope provider to scan project root folders instead of the classpath of Java
projects.
@Override
public com.google.inject.Provider
<org.eclipse.xtext.resource.containers.IAllContainersState>
provideIAllContainersState() {
return org.eclipse.xtext.ui.shared.Access.getWorkspaceProjectsState();
}

The remaining steps show you how to adapt the project wizard for your language, if you have generated
one. The best way to do this is to create a new subclass of the generated IProjectCreator in the src/ folder
of the ui project and apply the necessary changes there. First, remove the JDT project configuration by
overriding configureProject with an empty body.
The next thing is to redefine the project natures and builders that should be applied to you language
projects.
In in this case just remove the JDT stuff in this way:
protected String[] getProjectNatures() {
return new String[] {
"org.eclipse.pde.PluginNature",
"org.eclipse.xtext.ui.shared.xtextNature"
};
}

protected String[] getBuilders() {


return new String[] {
"org.eclipse.pde.ManifestBuilder",
"org.eclipse.pde.SchemaBuilder"
};
}

After that you have to bind the new IProjectCreator


@Override
public Class<? extends IProjectCreator> bindIProjectCreator() {
return JDTFreeMyDslProjectCreator.class;
}

That’s all. Your language and its IDE should now no longer depend on JDT.

9.3. Parsing Expressions with Xtext


Parsing simple XML-like, structural languages with Xtext is a no-brainer. However, parsing nested
expressions is often considered a bit more complicated. This is because they are more complicated due
to their recursive nature and also because with Xtext you have to avoid left recursive parser rules. As
the underlying parser (generated by Antlr) uses a top-down approach it would recurse endlessly if you
had a left recursive grammar.
Let’s have a look at parsing a simple arithmetic expression:
2 + 20 * 2

Xtext 1.0 94
Typical Language Configurations

If you know EBNF a bit and wouldn’t think about avoiding left recursion, operator precedence or
associativity, you’ld probably write a grammar like this:
Expression :
Expression '+' Expression |
Expression '*' Expression |
INT;

This grammar would be left recursive because the parser reads the grammar top down and left to right and
would endlessly call the Expression rule without consuming any characters, i.e. altering the underlying
state of the parser. While this kind of grammar can be written for bottom-up parsers, you’ld still have
to deal with operator precedence in addition. That is define that a multiplication has higher precedence
than an addition for example.
In Xtext you define the precedence implicitly when left-factoring such a grammar. Left-factoring means
you get rid of left recursion by applying a certain technique, which we will show in the following.
So here is a left-factored grammar (not yet working with Xtext) for the expression language above :
Addition :
Multiplication ('+' Multiplication)*;

Multiplication:
NumberLiteral ('*' NumberLiteral)*;

NumberLiteral:
INT;

As you can see the main difference is that we have three rules instead of one and if you look a bit closer
you see, that there’s a certain delegation pattern involved. The rule Addition doesn’t call itself but calls
Multiplication instead. The operator precedence is defined by the order of delegation. The later the rule
is called the higher is its precedence. This is at least the case for the first two rules which are of a left
recursive nature (but we’ve left-factored them now). The last rule is not left recursive which is why you
can write them down without applying this pattern.
We should allow users to explicitly adjust precedence by adding parenthesis, e.g. write something like
(2 + 20) * 2. So let’s add support for that (note that the grammar is still not working with Xtext):
Addition :
Multiplication ('+' Multiplication)*;

Multiplication:
Primary ('*' Primary)*;

Primary :
NumberLiteral |
'(' Addition ')';

NumberLiteral:
INT;

So once again: if you have some construct that recurses on the left hand side, you need to put it into the
delegation chain according to their operator precedence. The pattern is always the same, the thing that
recurses delegates to the rule with the next higher precedence.

9.3.1. Construction of an AST


Now that we know how to avoid left recursion, let’s have a look at what the parser produces. In Xtext each
rule returns some value. Parser rules return AST nodes (i.e. EObject instances), enum rules return enum
literals and datatype rules as well as terminal rules return simple values like strings and the like (EDatatype
in EMF jargon). Xtext can automatically infer whether some rule is a parser rule, i.e. constructs and returns
an AST node, or if it is a datatype rule that returns a value. The grammar above only consists of datatype
rules all of them returning plain strings. In order to construct an AST we need to add Assignments and
Actions. But before we do that we need to talk about return types.

Xtext 1.0 95
Typical Language Configurations

The return type of a rule can be specified explicitly using the ‘returns’ keyword but can be inferred if the
type’s name is the same as the rule’s name. That is
NumberLiteral : ... ;

is a short form of
NumberLiteral returns NumberLiteral : ... ;

However in the case of the expressions grammar above, all rules need to return the same type since they
are recursive. So in order to make the grammar functional we need to add a common return type explicitly
(but the grammar is still missing some bits):
Addition returns Expression:
Multiplication ('+' Multiplication)*;

Multiplication returns Expression:


Primary ('*' Primary)*;

Primary returns Expression:


NumberLiteral |
'(' Addition ')';

NumberLiteral:
INT;

The AST type inference mechanism of Xtext will infer two types: Expression and NumberLiteral. Now
we need to add assignments and Actions in order to store all the important information in the AST and
to create reasonable subtypes for the two operations.
In the following you see the final fully working Xtext grammar:
Addition returns Expression:
Multiplication ({Addition.left=current} '+' right=Multiplication)*;

Multiplication returns Expression:


Primary ({Multiplication.left=current} '*' right=Primary)*;

Primary returns Expression:


NumberLiteral |
'(' Addition ')';

NumberLiteral:
value=INT;

Let’s go through the grammar as the parser would do it for the expression
(1 + 20) * 2

The parser always starts with the first rule (Addition). Therein the first element is an unassigned rule call
to Multiplication which in turn calls Primary. Primary now has two alternatives. The first one is calling
NumberLiteral which consists only of one assignment to a feature called ‘value’. The type of ‘value’ has
to be compatible to the return type of the INT rule.
But as the first token in the sample expression is an opening parenthesis ‘(’ the parser will take the second
alternative in Primary, consume the ‘(’ and call the rule Addition. Now the value ‘1’ is the lookahead token
and again Addition calls Multiplication and Multiplication calls Primary. This time the parser takes the
first alternative because ‘1’ was consumed by the INT rule (which btw. is a reused library terminal rule).
As soon as the parser hits an assignment it checks whether an AST node for the current rule was already
created. If not it will create one based on the return type of the current rule, which is NumberLiteral. The
Xtext generator created an EClass ‘NumberLiteral’ before which can now be instantiated. That type will
also have a property called value of type int (actually of type EInt), which will get the value ‘1’ set. This
is what the Java equivalent would look like:

Xtext 1.0 96
Typical Language Configurations

// value=INT
if (current == null)
current = new NumberLiteral();
current.setValue(ruleINT());
...

Now that the rule has been completed the created EObject is returned to the calling rule Primary, which
in turn returns the object unchanged to its own caller. Within Multiplication the call to Primary has been
successfully parsed and returned an instance of NumberLiteral. The second part of the rule (everything
within the parenthesis) is a so called group. The asterisk behind the closing parenthesis states that this part
can be consumed zero or more times. The first token to consume in this part is the multiplication operator
‘*’. Unfortunately in the current situation the next token to accept is the plus operator ‘+’, so the group is
not consumed at all and the rule returns what they got from the unassigned rule call (the NumberLiteral).
In rule Addition there’s a similar group but this time it expects the correct operator so the parser steps into
the group. The first element in the group is a so called action. As Xtext grammars are highly declarative
and bi-directional it is not a good idea to allow arbitrary expression within actions as it is usually the
case with parser generators. Instead we only support two kinds of actions. This one will create a new
instance of type Addition and assign what was the to-be-returned object to the feature left. In Java this
would have been something like:
// Multiplication rule call
current = ruleMultiplication();
// {Addition.left=current}
Addition temp = new Addition();
temp.setLeft(current);
current = temp;
...

As a result the rule would now return an instance of Addition which has a NumberLiteral set to its
property left. Next up the parser consumes the ‘+’ operator. We do not store the operator in the AST
because we have an explicit Addition type, which implicitly contains this information. The assignment
(‘right=Multiplication’) calls the rule Multiplication another time and assigns the returned object (a
NumberLiteral of value=20) to the property named right.
If we now had an additional plus operation ‘+’ (e.g. 1 + 2 + 3) the group would match another time and
create another instance of Addition. But we don’t and therefore the rule is completed and returns the
created instance of Addition to its caller which was the second alternative in Primary. Now the closing
parenthesis is matched and consumed and the parser stack is reduced once more.
We are now in rule Multiplication and have the multiplication operator ‘*’ on the lookahead. The
parser goes into the group and applies the action. Finally it calls the Primary rule, gets another instance
of NumberLiteral (value=2), assigns it as the ‘right’ operand of the Multiplication and returns the
Multiplication to the rule Addition which in turn returns the very same object as there’s nothing left to
parse.
The resulting AST looks like this:

Xtext 1.0 97
Typical Language Configurations

(It’s pretty hard to follow what’s going on just by reading this text. Therefore we have prepared a small
video which visualizes and explains the details. see http://vimeo.com/14358869)

9.3.2. Associativity
There is still one topic we should mention, which is associativity. There is left and right associativity as
well as none associativity. In the example we have seen left associativity. Associativity tells the parser
how to construct the AST when there are two infix operations with the same precedence. The following
example is taken from the corresponding wikipedia entry:
Consider the expression a ~ b ~ c. If the operator ~ has left associativity, this expression would be
interpreted as (a ~ b) ~ c and evaluated left-to-right. If the operator has right associativity, the expression
would be interpreted as a ~ (b ~ c) and evaluated right-to-left. If the operator is non-associative, the
expression might be a syntax error, or it might have some special meaning. We already know the most
important form which is left associativity:
Addition returns Expression:
Multiplication ({Addition.left=current} '+' right=Multiplication)*;

Right associativity is done using the following pattern (note the quantity operator and the call to the rule
itself at the end):
Addition returns Expression:
Multiplication ({Addition.left=current} '+' right=Addition)?;

And if you don’t want to allow multiple usages of the same expression in a row (hence non-associativity)
you write:
Addition returns Expression:
Multiplication ({Addition.left=current} '+' right=Multiplication)?;

Note that sometimes it’s better to allow associativity on parser level, but forbid it later using validation,
because you can come up with a better error message. Also the whole parsing process won’t be
interrupted, so your tooling will generally be more forgiving.

Xtext 1.0 98
Chapter 10. Integration with EMF and
Other EMF Editors
Xtext relies heavily on EMF internally, but it can also be used as the serialization back-end of other EMF-
based tools. In this section we introduce the basic concepts of the Eclipse Modeling Framework (EMF)
in the context of Xtext. If you want to learn more about EMF, we recommend reading the EMF book.

10.1. Model, Ecore Model, and Ecore


When Xtext uses EMF models as the in-memory representation of any parsed text files. This in-memory
object graph is called the Abstract Syntax Tree (AST). Depending on the community this concepts is
also called document object graph (DOM), semantic model, or simply model. We use model and AST
interchangeably. Given the example model from the introduction, the AST looks similar to this

The AST should contain the essence of your textual models and abstract over syntactical information. It is
used by later processing steps, such as validation, compilation or interpretation. In EMF a model is made
up of instances of EObjects which are connected and an EObject is an instance of an EClass. A set of
EClasses if contained in a so called EPackage, which are both concepts of Ecore In Xtext, meta models
are either inferred from the grammar or predefined by the user (see the section on package declarations
for details). The next diagram shows the meta model of our example:

The language in which the meta model is defined is called Ecore. In other words, the meta model is the
Ecore model of your language. Ecore an essential part of EMF. Your your models instantiate the meta
model, and your meta model instantiates Ecore. To put an end to this recursion, Ecore is defined in itself
(an instance of itself).

Xtext 1.0 99
Integration with EMF
and Other EMF Editors

The meta model defines the types of the semantic nodes as Ecore EClasses. EClasses are shown as
boxes in the meta mode diagram, so in our example, Model, Type, SimpleType, Entity, and Property are
EClasses. An EClass can inherit from other EClasses. Multiple inheritance is allowed in Ecore, but of
course cycles are forbidden.
EClasses can have EAttributes for their simple properties. These are shown inside the EClasses nodes.
The example contains two EAttributes name and one EAttribute isMulti. The domain of values for an
EAttribute is defined by its EDataType. Ecore ships with some predefined EDataTypes, which essentially
refer to Java primitive types and other immutable classes like String. To make a distinction from the Java
types, the EDataTypes are prefixed with an E. In our example, that’s EString and EBoolean.
In contrast to EAttributes, EReferences point to other EClasses. The containment flag indicates whether
an EReference is a containment reference or a cross-reference. In the diagram, references are edges and
containment references are marked with a diamond. At the model level, each element can have at most
one container, i.e. another element referring to it with a containment reference. This infers a tree structure
to the models, as can be seen in the sample model diagram. On the other hand, cross-references refer to
elements that can be contained anywhere else. In the example, elements and properties are containment
references, while type and extends are cross-references. For reasons of readability, we skipped the cross-
references in the sample model diagram. Note that in contrast to other parser generators, Xtext creates
ASTs with linked cross-references.
Other than associations in UML, EReferences in Ecore are always owned by one EClass and only
navigable in the direction form the owner to the type. Bi-directional associations must be modeled as two
references, being eOpposite of each other and owned by either end of the associations.
The superclass of EAttibutes and EReferences is EStructuralFeature and allows to define a name and a
cardinality by setting lowerBound and upperBound. Setting the latter to -1 means ‘unbounded’.
The common supertype of EDataType and EClass is EClassifier. An EPackage acts as a namespace and
container of EClassifiers.
We have summarized these most relevant concepts of Ecore in the following diagram:

10.2. EMF Code Generation


EMF also ships with a code generator that generates Java classes from your Ecore model. The code
generators input is the so called EMF generator model. It decorates (references) the Ecore model and adds

Xtext 1.0 100


Integration with EMF
and Other EMF Editors

additional information for the Ecore -> Java transformation. Xtext will automatically generate a generator
model with reasonable defaults for all generated metamodels, and run the EMF code generator on them.
The generated classes are based on the EMF runtime library, which offers a lot of infrastructure and tools
to work with your models, such as persistence, reflection, referential integrity, lazy loading etc.
Among other things, the code generator will generate
• A Java interface and a Java class for each EClassifier in your Ecore model. By default, all classes will
implement the interface org.eclipse.emf.ecore.EObject, linking a lot of runtime functionality.
• A Java bean property for each EStructuralFeature (member variable, accessor methods)
• A package interface and class, holding singleton objects for all elements of your Ecore model, allowing
reflection. EPackages are also registered to the EPackage.Registry to be usable at runtime.
• A factory interface and class for creating instances
• An abstract switch class implementing a visitor pattern to avoid if-instanceof cascades in your code.

10.3. XtextResource Implementation


Xtext provides an implementation of EMF’s resource, the XtextResource. This does not only encapsulate
the parser that converts text to an EMF model but also the serializer working the opposite direction. That
way, an Xtext model just looks like any other Ecore-based model from the outside, making it amenable
for the use by other EMF based tools. In fact, the Xpand templates in the generator plug-in created by
the Xtext wizard do not make any assumption on the fact that the model is described in Xtext, and they
would work fine with any model based on the same Ecore model of the language. So in the ideal case,
you can switch the serialization format of your models to your self-defined DSL by just replacing the
resource implementation used by your other modeling tools.

The generator fragment ResourceFactoryFragment registers a factory for the XtextResource to EMF’s
resource factory registry, such that all tools using the default mechanism to resolve a resource
implementation will automatically get that resource implementation.
Using a self-defined textual syntax as the primary storage format has a number of advantages over the
default XMI serialization, e.g.
• You can use well-known and easy-to-use tools and techniques for manipulation, such as text editors,
regular expressions, or stream editors.
• You can use the same tools for version control as you use for source code. Merging and diffing is
performed in a syntax the developer is familiar with.

Xtext 1.0 101


Integration with EMF
and Other EMF Editors

• It is impossible to break the model such that it cannot be reopened in the editor again.
• Models can be fixed using the same tools, even if they have become incompatible with a new version
of the Ecore model.
Xtext targets easy to use and naturally feeling languages. It focuses on the lexical aspects of a language
a bit more than on the semantic ones. As a consequence, a referenced Ecore model can contain more
concepts than are actually covered by the Xtext grammar. As a result, not everything that is possibly
expressed in the EMF model can be serialized back into a textual representation with regards to the
grammar. So if you want to use Xtext to serialize your models as described above, it is good to have a
couple of things in mind:
• Prefer optional rule calls (cardinality ? or *) to mandatory ones (cardinality + or default), such that
missing references will not obstruct serialization.
• You should not use an Xtext-Editor on the same model instance as a self-synchronizing other editor,
e.g. a canonical GMF editor (see :#gmf_integration_stage_1 for details). The Xtext parser replaces re-
parsed subtrees of the AST rather than modifying it, so elements will become stale. As the Xtext editor
continuously re-parses the model on changes, this will happen rather often. It is safer to synchronize
editors more loosely, e.g. on file changes.
• Implement an IFragmentProvider:#fragmentProvider to make the XtextResource return stable
fragments for its contained elements, e.g. based on composite names rather than order of appearance.
• Implement an IQualifiedNameProvider and an IScopeProvider:#scoping to make the names of all
linkable elements in cross-references unique.
• Provide an IFormatter:#formatting to improve the readability of the generated textual models.
• Register an IReferableElementsUnloader to turn deleted/replaced model elements into EMF proxies.
Design the rest of your application such that it does never keep references to EObjects or to cope with
proxies. That will improve the stability of your application drastically.
• Xtext will register an EMF ResourceFactory, so resources with the file extension you entered when
generating the Xtext plug-ins will be automatically loaded in an XtextResource when you use EMF’s
ResourceSet API to load it.

10.4. Integration with GMF Editors


The Graphical Modeling Framework (GMF) allows to create graphical diagram editors for Ecore models.
To illustrate how to build a GMF on top of an XtextResource we have provided an example. You must
have the Helios version 2.3 of GMF Notation, Runtime and Tooling and their dependencies installed in
your workbench to run the example. With other versions of GMF it might work to regenerate the diagram
code. Choose New->Examples->Xtext->Xtext GMF Integration to import it into your workbench. The
example consists of a number of plug-ins

Plug-in Framework Purpose Contents


o.e.x.example.gmf Xtext Xtext runtime plug-in Grammar, derived
metamodel and language
infrastructure
o.e.x.e.g.ui Xtext Xtext UI plug-in Xtext editor and services
o.e.x.e.g.edit EMF EMF.edit plug-in UI services generated
from the metamodel
o.e.x.e.g.models GMF GMF design models Input for the GMF code
generator
o.e.x.e.g.diagram GMF GMF diagram editor Purely generated from
the GMF design models
o.e.x.e.g.d.extensions GMF and Xtext GMF diagram editor Manual extensions to the
extensions generated GMF editor
for integration with
Xtext

Xtext 1.0 102


Integration with EMF
and Other EMF Editors

o.e.x.gmf.glue Xtext and GMF Glue code Generic code to integrate


Xtext and GMF

We will elaborate the example in three stages.

10.4.1. Stage 1: Make GMF Read and Write the Semantic Model As
Text
A diagram editor in GMF by default manages two resources: One for the semantic model, that is the model
we’re actually interested in for further processing. In our example it is a model representing entities and
datatypes. The second resource holds the notation model. It represents the shapes you see in the diagram
and their graphical properties. Notation elements reference their semantic counterparts. An entity’s name
would be in the semantic model, while the font to draw it in the diagram would be stored the notation
model. Note that in the integration example we’re only trying to represent the semantic resource as text.
To keep the semantic model and the diagram model in sync, GMF uses a so called CanonicalEditPolicy.
This component registers as a listener to the semantic model and automatically updates diagram elements
when their semantic counterparts change, are added or are removed. Some notational information can be
derived from the semantic model by some default mapping, but usually there is a lot of graphical stuff
that the user wants to change to make the diagram look better.
In an Xtext editor, changes in the text are transfered to the underlying XtextResource by a call to
the method org.eclipse.xtext.resource.XtextResource.update(int, int, String), which will trigger a partial
parsing of the dirty text region and a replacement of the corresponding subtree in the AST model (semantic
model).
Having an Xtext editor and a canonical GMF editor on the same resource can therefore lead to loss of
notational information, as a change in the Xtext editor will remove a subtree in the AST, causing the
CanonicalEditPolicy to remove all notational elements, even though it was customized by the user. The
Xtext rebuilds the AST and the notation model is restored using the default mapping. It is therefore not
recommended to let an Xtext editor and a canonical GMF editor work on the same resource.
In this example, we let each editor use its own memory instance of the model and synchronize on file
changes only. Both frameworks already synchronize with external changes to the edited files out-of-the-
box. In the glue code, a ConcurrentModificationObserver warns the user if she tries to edit the same file
with two different model editors concurrently.
In the example, we started with writing an Xtext grammar for an entity language. As explained above, we
preferred optional assignments and rather covered mandatory attributes in a validator. Into the bargain,
we added some services to improve the EMF integration, namely a formatter, a fragment provider and
an unloader. Then we let Xtext generate the language infrastructure. From the derived Ecore model and
its generator model, we generated the edit plug-in (needed by GMF) and added some fancier icons.
From the GMF side, we followed the default procedure and created a gmfgraph model, a gmftool model
and a gmfmap model referring to the Ecore model derived form the Xtext grammar. We changed some
settings in the gmfgen model derived by GMF from the gmfmap model, namely to enable printing and
to enable validation and validation decorators. Then we generated the diagram editor.
Voilà, we now have a diagram editor that reads/writes its semantic model as text. Also note that the
validator from Xtext is already integrated in the diagram editor via the menu bar.

10.4.1.1. Stage 2: Calling the Xtext Parser to Parse GMF Labels


GMF’s generated parser for the labels is a bit poor: It will work on attributes only, and will fail for cross-
references, e.g. an attibute’s type. So why not use the Xtext parser to process the user’s input?
An XtextResource keeps track of it’s concrete syntax representation by means of a so called node model
(see :#parser_rules for a more detailed description). The node model represents the parse tree and provides
information on the offset, length and text that has been parsed to create a semantic model element. The
nodes are attached to their semantic elements by means of a node adapter.
We can use the node adapter to access the text block that represents an attribute, and call
the Xtext parser to parse the user input. The example code is contained in AntlrParserWrapper.
SimplePropertyWrapperEditPartOverride shows how this is integrated into the generated GMF editor.
Use the EntitiesEditPartFactoryOverride to instantiate it and the EntitiesEditPartProviderOverride to

Xtext 1.0 103


Integration with EMF
and Other EMF Editors

create the overridden factory, and register the latter to the extension point. Note that this is a non-invasive
way to extend generated GMF editors.
When you test the editor, you will note that the node model will be corrupt after editing a few labels.
This is because the node model is only updated by the Xtext parser and not by the serializer. So we
need a way to automatically call the (partial) parser every time the semantic model is changed. You will
find the required classes in the package org.eclipse.xtext.gmf.glue.editingdomain. To activate node model
reconciling, you have to add a line
XtextNodeModelReconciler.adapt(editingDomain);

in the method createEditingDomain() of the generated EntitiesDocumentProvider. To avoid changing


the generated code, you can modify the code generation template for that class by setting
Dynamic Templates -> true
Template Directory -> "org.eclipse.xtext.example.gmf.models/templates"

in the GenEditorGenerator and


Required Plugins -> "org.eclipse.xtext.gmf.glue"

in the GenPlugin element of the gmfgen before generating the diagram editor anew.

10.4.1.2. Stage 3: A Popup Xtext Editor (experimental)


SimplePropertyPopupXtextEditorEditPartOverride demonstrates how to spawn an Xtext editor to edit a
model element. The editor pops up in its control and shows only the section of the selected element. It is
a fully fledged Xtext editor, with support of validation, code assist and syntax highlighting. The edited
text is only transfered back to the model if it does not have any errors.
Note that there still are synchronization issues, that’s why we keep this one marked as experimental.

Xtext 1.0 104


Chapter 11. Migrating from Xtext 0.7.x
to 1.0
Most of the tasks when migrating to Xtext 1.0 can be automated. Some changes will be necessary in the
manually written code where you have to carefully verify that your implementation is still working with
Xtext 1.0. A reliable test-suite helps a lot.
The grammar language is fully backward compatible. You should not have to apply any changes in the
primary artifact. However, we introduced some additional validation rules that mark inconsistencies in
your grammar. If you get any warnings in the grammar editor, it should be straight forward to fix them.
Tip: You’ll learn something about the new features if you compare a freshly created Xtext project based
on 0.7.x with a new Xtext project based on 1.0. Especially the new fragments in the workflow are a good
indicator for useful new features.

11.1. Take the Shortcut


If you haven’t made too many customizations to the generated defaults and if you’re not referencing
many classes of your Xtext language from the outside, you might consider starting with a new Xtext
project, copying your grammar and then manually restoring your changes step by step. If that does not
work for you, go on reading!

11.2. Migrating Step By Step


Before you start the migration to Xtext 1.0, you should make sure that no old plug-ins are in your target
platform. Some plug-ins from Xtext 0.7.x have been merged and do no longer exist.
Tip: The following steps try to use the Eclipse compiler to spot any source-incompatible changes while
fixing them with only a few well described user actions. Doing these steps in another order causes most
likely a higher effort.

11.2.1. Update the Plug-in Dependencies and Import Statements


You should update the constraints from version 0.7.x to [1.0.0,2.0.0) in your manifest files if
you specified any concrete versions. Make sure that your dsl.ui-projects do not refer to the plug-
in org.eclipse.xtext.ui.common or org.eclipse.xtext.ui.core but to org.eclipse.xtext.ui instead. The
arguably easiest way is a global text-based search and replace across the manifest files. The bundle
org.eclipse.xtext.log4j is obsolete as well. The generator will create import-package entries in the
manifests later on.
The next step is to fix the import statements in your classes to match the refactored naming
scheme in Xtext. Perform a global search for " import org.eclipse.xtext.ui.common." and " import
org.eclipse.xtext.ui.core." and replace the matches with " import org.eclipse.xtext.ui.". This fixes most
of the problems in the manually written code.

11.2.2. Rename the Packages in the dsl.ui-Plug-in


We changed the naming pattern for artifacts in the dsl.ui-plug-in to match the OSGI conventions. The
easiest way to update your existing projects is to apply a “Rename Package” refactoring on the packages
in the src- and src-gen folder before you re-run the workflow that regenerates your language. Make sure
you ticked “Rename sub-packages” in the dialog. It is error-prone to enable the search in non-Java files
as this will perform incompatible changes in the manifest files. Furthermore, it is important to perform
the rename operation in the src-gen folder, too. This ensures that the references in your manually written
code are properly updated.

11.2.3. Update the Workflow


The JavaScopingFragment does no longer exist. It has been superseded by the
ImportURIScopingFragment in combination with the SimpleNamesFragment. Please replace
<fragment
class="org.eclipse.xtext.generator.scoping.JavaScopingFragment"/>

Xtext 1.0 105


Migrating from Xtext 0.7.x to 1.0

with
<fragment
class="org.eclipse.xtext.generator.scoping.ImportURIScopingFragment"/>
<fragment
class="org.eclipse.xtext.generator.exporting.SimpleNamesFragment"/>
The PackratParserFragment has been abandoned as well. It is save to remove the reference to that one if
it is activated in your workflow. After you’ve changed your workflow, it should be possible to regenerate
your language without any errors in the console. It is ok to have compilation errors prior to executing
the workflow.

11.2.4. MANIFEST.MF and plugin.xml


The previous rename package refactoring updated most of the entries in the MANIFEST.MF and and
some entries in the plugin.xml. Others have to be fixed manually. The Eclipse compiler will point to many
of the remaining problems in the manifest files but it is unlikely that it will spot the erroneous references
in the plugin.xml.
• In the generated UI plug-in’s MANIFEST.MF, remove the package exports of no longer existing
packages and make sure the bundle activator points to the newly generated one (with .ui. in its package
name).
• It was already mentioned that the plug-ins org.eclipse.xtext.ui.core and org.eclipse.xtext.ui.common
have been merged into a new single plug-in org.eclipse.xtext.ui. The same happened to the respective
Java packages. Change eventually remaining bundle-dependencies in all manifests.
• The plug-in org.eclipse.xtext.log4j no longer exists. We use a package import of org.apache.log4j
instead. Also remove the buddy registration.
• Due to renamed packages, you have to fix all references to classes therein in the plugin.xml. A diff with
the _plugin.xml___gen_ will be a great help. If you haven’t added a lot manually, consider merging
these into the generated version instead of going the other way around. Note that warnings in the
plugin.xml can be considered to be real errors most of the time. Make sure
• the ExecutableExtensionFactory has the .ui. package prefix
• classes from org.eclipse.xtext.ui.common and org.eclipse.xtext.ui.core are now usually somewhere
in org.eclipse.xtext.ui. They are also referenced by the ExecutableExtensionFactory and thus not
covered by the editor’s validation.
• A number of new features are being registered in the plugin.xml, e.g. Find references, Quick
Outline, and Quick Fixes. You can enable them by manually copying the respective entries from
_plugin.xml___gen_ to plugin.xml.
• To run MWE2 workflows later on, you must change the plug-in dependencies from
org.eclipse.emf.mwe.core to org.eclipse.emf.mwe2.launch in your manifest. Optional resolution is fine.

11.2.5. Noteworthy API Changes


The src folders are generated once, so existing code will not be overwritten but has to be updated
manually. At least one new class has appeared in your src-folder of the ui plug-in. there will now be a
StandaloneSetup inheriting form the generated StandaloneSetupGenerated to allow customization.
You will face a couple of compilation problems due to changes in the API. Here’s a list of the most
prominent changes. It is usually only necessary to change your code, if you face any compilation
problems.
• The method IScopeProvider.getScope(EObject,EClass) has been removed. Use
IScopeProvider.getScope(EObject,EReference) instead.
• Renamed DefaultScopeProvider to SimpleNameScopeProvider. There have been further significant
changes in the scoping API that allow for optimized implementations. Consult the section on scoping
for details.
• The return type of AbstractInjectableValidator.getEPackages() was changed from List<? extends
EPackage> to List&lt;EPackage>.
• The parser interfaces now use java.io.Reader instead of java.io.InputStream to explicitly address
encoding. Have a look at the section on encoding for details.

Xtext 1.0 106


Migrating from Xtext 0.7.x to 1.0

• The handling of ILabelProvider in various contexts has been refactored. The former base class
DefaultLabelProvider no longer exists. Use the DefaultEObjectLabelProvider instead. See the section
on label providers for details.
• We have introduced a couple of new packages to better separate concerns. Most classes should be
easy to relocate.
• The runtime and UI modules have separate base classes DefaultRuntimeModule and DefaultUiModule
now. We use Guice’s module overrides to combine them with the newly introduced SharedModule.
You have to add a constructor the your UiModule that takes an AbstractUiPlugin as argument and pass
that one to the super constructor. Tip: There is an Eclipse quick fix available for that one.
• The interfaces ILexicalHighlightigConfiguration and ISemanticHighlightingConfiguration have been
merged into IHighlightingConfiguration.
• The DefaultTemplateProposalProvider takes an additional, injectable constructor parameter of type
ContextTypeIdHelper.
• The HyperlinkHelper uses field injection instead of constructor injection. The method
createHyperlinksByOffset should be overridden instead of the former findCrossLinkedEObject.
• The API to skip a node in the outline has changed. Instead of returning the _HIDDEN_NODE_ you’ll
have to implement boolean consumeNode(MyType) and return false.
• The Readonly*Storage implementations have been removed. There is a new API to open editors
for objects with a given URI. Please use the IURIEditorOpener to create an editor or the
IStorage2UriMapper to obtain an IStorage for a given URI.
• The interfaces IStateAccess and IEObjectHandle have been moved along with the IUnitOfWork to the
package org.eclipse.xtext.util.concurrent.
• The ValidationJobFactory is gone. Please implement a custom IResourceValidator instead.
• The grammar elements Alternatives and Group have a new common super type CompoundElement.
The methods getGroups and getTokens have been refactored to getElements.
• Completion proposals take a StyledString instead of a plain string as display string.
• The AbstractLabelProvider does no longer expose its IImageHelper. Use convertToImage instead or
inject your own IImageHelper.
• The implementation-classes from org.eclipse.xtext.index were superseded by the builder infrastructure.
Use the QualifiedNamesFragment and the ImportNamespacesScopingFragment instead of the
ImportedNamespacesScopingFragment. Please refer to the section about the builder infrastructure for
details.
• All the Xtend-based fragments were removed.
• ILinkingService.getLinkText was removed. Have a look at the LinkingHelper and the
CrossReferenceSerializer if you relied on this method.
• The SerializerUtil was renamed to Serializer. There were other heavy refactorings that involved the
serializer and its components (like e.g. the ITransientValueService) but it should be pretty straight
forward to migrate existing client code.
• The method-signatures of the IFragmentProvider have changed. The documentation will give a clue
on how to update existing implementations.
• Some static methods were removed from utility classes such as EcoreUtil2 and ParsetreeUtil in favor
of more sophisticated implementations.

11.3. Now Go For The New Features


After migrating, some of the new features in Xtext 1.0 will be automatically available. Others require
further configuration. We recommend reading the sections about
• qualified names and namespace imports
• the builder infrastructure
• quick fixes
• unordered groups

Xtext 1.0 107


Migrating from Xtext 0.7.x to 1.0

• quick outline
• MWE2
• refering to Java elements
For an overview over the new features consult our New and Noteworthy online.

Xtext 1.0 108


Chapter 12. The ANTLR IP Issue (Or
Which Parser To Use?)
In order to be able to parse models written in your language, Xtext needs to provide a special parser for
it. The parser is generated from the self defined language grammar.
It is recommended to use the ANTLR -based parser. ANTLR is a very sophisticated parser generator
framework which implements a so called LL(*) algorithm. It is fast, simple and at the same time has
some very nice and sophisticated features. Especially its support for error recovery is much better than
what other parser generators provide.
Xtext uses ANTLR 3 which comes in two parts: the runtime and the generator. Both are shipped under
the BSD license and have a clean intellectual property history. However the ANTLR parser generator
is unfortunately still implemented in an older version of itself (v 2.x), where it was not possible for the
Eclipse Foundation to ensure where exactly every line of code originated. Therefore ANTLR 2 didn’t
get the required IP approval. Eclipse has a strict IP policy, which makes sure that everything provided by
Eclipse can be consumed under the terms of the Eclipse Public License. The details are described in this
document. That is why we are not allowed to ship Xtext with the ANTLR generator but only with the
IP approvied runtime components. We have to provide it separately and for your convenience you’ll be
asked to download the ANTLR generator when you run your language generator for the first time. You
can even download it directly or install an additional plug-in into Eclipse:
• http://download.itemis.com/antlr-generator-3.0.1.jar
• or use the update site at http://download.itemis.com/updates
The workflow will not bother you with this issue on subsequent executions as the archive will be stored
in your project’s root directory and can thereby be reused during the next run.
IMPORTANT : Although if you use the non-IP approved ANTLR generator, you can still ship any
languages and the IDEs you’ve developed with Xtext without any worrying, because it is not needed at
runtime

Xtext 1.0 109

You might also like