Standalone Parsing with Xtext

Xtext is an awesome framework to create your own domain specific languages (DSLs). Providing a grammar, Xtext will create your data model, Lexer, Parser and even a powerful Editor integrated into an Eclipse IDE including syntax highlighting and auto completion 🙂

In this blog post I want to summarize some of my experiences with the behaviour of Xtext using different parsing approaches. These are useful if you want to parse input with Xtext in standalone applications or in the Eclipse context in order to get a model representation of your DSL code.

Approach 1: Injecting an IParser instance

The first approach uses the IParser interface. In a standalone application (that means if your code is not running in an Eclipse/Equinox environment), a parser instance can be retrieved using the injector returned by an instance of <MyDSL>StandaloneSetup:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
public class XtextParser {
 
    @Inject
    private IParser parser;
 
    public XtextParser() {
        setupParser();
    }
 
    private void setupParser() {
        Injector injector = new MyDSLStandaloneSetup().createInjectorAndDoEMFRegistration();
        injector.injectMembers(this);
    }
 
    /**
     * Parses data provided by an input reader using Xtext and returns the root node of the resulting object tree.
     * @param reader Input reader
     * @return root object node
     * @throws IOException when errors occur during the parsing process
     */
    public EObject parse(Reader reader) throws IOException
    {
        IParseResult result = parser.parse(reader);
        if(result.hasSyntaxErrors())
        {
            throw new ParseException("Provided input contains syntax errors.");
        }
        return result.getRootASTElement();
    }
}

Using this approach, your parse result can be retrieved with only very few lines of code. However, it only works in standalone applications. If you execute this code in the Eclipse context, the following errror is logged:

java.lang.IllegalStateException: Passed org.eclipse.xtext.builder.clustering.CurrentDescriptions not of type org.eclipse.xtext.resource.impl.ResourceSetBasedResourceDescriptions
    at org.eclipse.xtext.resource.containers.ResourceSetBasedAllContainersStateProvider.get(ResourceSetBasedAllContainersStateProvider.java:35)

To resolve this, the injector has to be created differently, using Guice.createInjector and the Module of your language:

1
Injector injector = Guice.createInjector(new MyDSLRuntimeModule());

Now the parser works fine, even in Eclipse. But if you use references to other resources or import mechanisms, you will find that the references to other resources can not be resolved. That’s why you need a resource to parse Xtext input properly.

Approach 2: Using an XtextResourceSet

To parse input using resources, you inject an XtextResourceSet and create a resource inside the ResourceSet. There are two ways to specify the input:

  1. an InputStream
  2. an URI specifying the location of a resource

In my implementation there are two methods for those two alternatives, respectively:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
public class XtextParser {
 
    @Inject
    private XtextResourceSet resourceSet;
 
    public XtextParser() {
        setupParser();
    }
 
    private void setupParser() {
        new org.eclipse.emf.mwe.utils.StandaloneSetup().setPlatformUri("../");
        Injector injector = Guice.createInjector(new MyDSLRuntimeModule());
        injector.injectMembers(this);
        resourceSet.addLoadOption(XtextResource.OPTION_RESOLVE_ALL, Boolean.TRUE);
    }
 
    /**
     * Parses an input stream and returns the resulting object tree root element.
     * @param in Input Stream
     * @return Root model object
     * @throws IOException When and I/O related parser error occurs
     */
    public EObject parse(InputStream in) throws IOException
    {
        Resource resource = resourceSet.createResource(URI.createURI("dummy:/inmemory.ext"));
        resource.load(in, resourceSet.getLoadOptions());
        return resource.getContents().get(0);
    }
 
    /**
     * Parses a resource specified by an URI and returns the resulting object tree root element.
     * @param uri URI of resource to be parsed
     * @return Root model object
     */
    public EObject parse(URI uri) {
        Resource resource = resourceSet.getResource(uri, true);
        return resource.getContents().get(0);
    }
 
}

In both cases, the resource set is injected using Guice. Also, the Eclipse platform path is initialized using new org.eclipse.emf.mwe.utils.StandaloneSetup().setPlatformUri(“../”). The load option RESOLVE_ALL is added to the resource set.

If an InputStream is provided, the underlying resource is a dummy resource created in the ResourceSet. Make sure that the file extension matches the one of your DSL.

In case of a given resource URI, your resource can be parsed directly using resourceSet.getResource(). Using this approach, all references (even to imported / other referenced resources) will be resolved.

The Dependency Injection Issue

Still there is another problem because we use Dependency Injection here in a way which is not exactly elegant. The point is that a class should not need to care about how its members are injected. In a well-designed DI-based application, there is only one injection call and all members are intantiated recursively from “outside the class”. To learn more about this issue, please read this excellent blog post by Jan Köhnlein.

I hope this post was useful to you. Please feel free to share your thoughts / ideas for improvements.