In Java, we can define behavior that is executed exclusively when an instance is initialized. Until now, our metamodel represented these behaviors as methods. This evolution represents them as Initializers.
We consider as initializers the following elements:
Constructors: they are called when creating a new instance. If a constructor does not start with an explicit call to another constructor, it implicitly calls the no-argument constructor of the superclass. Likewise, a class that declares no constructor gets an implicit default no-argument constructor, which calls the no-argument constructor of the superclass. We represent neither implicit constructors nor these implicit invocations.
Initialization blocks: blocks that are executed when a new instance is created. They are copied by the Java compiler into each constructor and avoid code duplication. We do not represent this implicit invocation.
In Famix: the <Initializer> method: we create a method to hold all attribute definitions in a type.
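The implicit invocations described above can be observed in plain Java. The following is a minimal sketch, not taken from the metamodel: the class names Base, Widget and InitializerOrderDemo are hypothetical and only serve to log the order in which initializers run.

```java
import java.util.ArrayList;
import java.util.List;

class Base {
    // Shared log used to observe the order in which initializers run.
    static final List<String> LOG = new ArrayList<>();

    Base() {
        LOG.add("Base()");
    }
}

class Widget extends Base {
    // Instance initialization block: it runs right after the superclass
    // constructor call and before the rest of the constructor body.
    {
        LOG.add("Widget init block");
    }

    // No explicit super(...) or this(...): the compiler inserts super().
    Widget() {
        LOG.add("Widget()");
    }

    // Explicit this(): delegates to Widget(), so the block runs only once.
    Widget(String label) {
        this();
        LOG.add("Widget(" + label + ")");
    }
}

class InitializerOrderDemo {
    // Creates one Widget and returns the recorded initialization order.
    static List<String> trace(boolean withLabel) {
        Base.LOG.clear();
        if (withLabel) new Widget("x"); else new Widget();
        return new ArrayList<>(Base.LOG);
    }

    public static void main(String[] args) {
        System.out.println(trace(false));
        System.out.println(trace(true));
    }
}
```

None of the calls to Base() or to the initialization block appear in the source code of Widget, which is why a call graph has to create them.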
The main motivation for this change is to adapt the metamodel to the needs of building call graphs.
Call graphs must be able to create the implicit invocations described above and to distinguish between the 3 types of initializers.
Another motivation is to differentiate between initializers and actual methods.
In analyses, we often need to focus on methods, and initializers can add noise when treated as actual methods, especially the <Initializer> method.
We introduce FamixJavaInitializer, a subclass of FamixJavaMethod.
An Initializer has 2 properties:
#isInitializationBlock: boolean, false by default.
#isConstructor: boolean, derived. In Java, a constructor is an initializer with the same name as its parent type and no declared type (or void as declared type).
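To make the idea of a derived property concrete, here is a minimal sketch in Java. InitializerSketch is a hypothetical, simplified stand-in and not the Famix API: it only mirrors the rule stated above, where isConstructor is computed from the name, the parent type and the declared type instead of being stored.

```java
// Hypothetical, simplified stand-in for an initializer entity (not the Famix API).
final class InitializerSketch {
    final String name;
    final String parentTypeName;
    final String declaredType; // null when no type is declared

    InitializerSketch(String name, String parentTypeName, String declaredType) {
        this.name = name;
        this.parentTypeName = parentTypeName;
        this.declaredType = declaredType;
    }

    // Derived property: computed from the other properties, never stored.
    boolean isConstructor() {
        boolean untyped = declaredType == null || declaredType.equals("void");
        return untyped && name.equals(parentTypeName);
    }
}

class InitializerSketchDemo {
    public static void main(String[] args) {
        System.out.println(new InitializerSketch("Point", "Point", null).isConstructor());
        System.out.println(new InitializerSketch("<Initializer>", "Point", null).isConstructor());
    }
}
```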
We do not merge all initializers as we did before, but we still merge similar initializers that are always called together.
In a Java model, a type (TWithMethods) can define a maximum of 4 initializers (besides constructors):
An instance initialization block, that is the merge of all instance initialization blocks.
An instance-side <Initializer> method, similar to the one we created before.
A static initialization block, the merge of all static initialization blocks (isClassSide is true).
A static <Initializer> method (isClassSide is true) for static attribute definitions.
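As an illustration, the Java class below contains source elements for each of the four groups. It is a sketch with hypothetical names (Counter, FourInitializersDemo); the comments say which group each element would fall into according to the description above.

```java
import java.util.ArrayList;
import java.util.List;

class Counter {
    static final List<String> LOG = new ArrayList<>();

    // Static attribute definition: static <Initializer> method.
    static int created = log("static attribute", 0);

    // Two static blocks: merged into one static initialization block.
    static { LOG.add("static block 1"); }
    static { LOG.add("static block 2"); }

    // Instance attribute definition: instance-side <Initializer> method.
    int value = log("instance attribute", 42);

    // Two instance blocks: merged into one instance initialization block.
    { LOG.add("instance block 1"); }
    { LOG.add("instance block 2"); }

    Counter() { LOG.add("constructor"); }

    // Records which initializer is running and passes the value through.
    static int log(String what, int v) { LOG.add(what); return v; }
}

class FourInitializersDemo {
    // Creates one instance and returns everything logged so far.
    static List<String> trace() {
        new Counter();
        return new ArrayList<>(Counter.LOG);
    }

    public static void main(String[] args) {
        System.out.println(trace());
    }
}
```

The static elements run once, when the class is initialized, while the instance elements run for every new instance. This is one reason to keep the class-side and instance-side initializers apart.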
When inspecting a Java model, initializers can now be found under Initializers and Model initializers.
If you’re here, you’re probably interested in creating a new FAST metamodel and expanding Moose to represent the AST (Abstract Syntax Tree) of an additional language.
In this post, we explain to you how to generate a “First version” of a new FAST-Language metamodel using the project Pharo-Tree-Sitter.
To be able to understand that, we assume you are already familiar with:
Tree-Sitter
Pharo-Tree-Sitter
FAST
Metamodel generators
Tree-Sitter is a parser generator tool and an incremental parsing library. It can build a concrete syntax tree for a source file and efficiently update the syntax tree as the source file is edited. It is able to parse a large variety of programming languages such as Java, C++, C#, Python and many others.
Pharo-Tree-Sitter is a project developed in Pharo that integrates the original Tree-Sitter parsers and allows visualizing their results (such as ASTs) directly in Pharo. It relies on the FFI protocol, which requires the corresponding libraries, depending on the OS (.dll, .so, or .dylib), to be present in Pharo’s VM folders.
The project supports parsing several languages, and for some of them (like Python, TypeScript, and C), the library generation is automated. You can find more details in the repository’s README.
This is the project that we will use to generate a new FAST-Language metamodel, so you need to download it into your Pharo image.
FAST means Famix AST. Contrary to Famix, which represents applications at a high abstraction level, FAST uses a low-level representation: the AST.
FAST defines a set of traits that can be used to create new meta-models compatible with Moose tools.
When developing a new FAST-Language metamodel, you will rely on these FAST traits to structure your metamodel. However, this does not apply to the “First version” described in this post, but rather to the upgraded versions when you evolve and refine it.
Metamodel generator is a Pharo library used to create new metamodels such as FAST-Java, Famix-Java, or FAST-Fortran.
The generation of any new version of a FAST-Language metamodel can only be achieved through the metamodel generator.
As you will see in this post, Pharo-Tree-Sitter enables you to define a new metamodel generator. Once executed, it produces the corresponding FAST-Language metamodel. We will explain this process in more detail in the following sections.
Download Pharo-Tree-Sitter and get the corresponding libraries
Once downloaded, you need to make sure that Pharo-Tree-Sitter is able to parse the language that you intend to create the metamodel for.
If it is not included, you need to follow the instructions in the readme file of this repository and add the new language.
For this blog post we will assume that the language is already supported and we will continue with “Python” 🐍🐍🐍.
To continue, if this is the first time you are using Pharo-Tree-Sitter, you need to run the Python tests in the class “TSParserPythonTest” of the package “TreeSitter-Tests”.
Running them downloads the original tree-sitter and tree-sitter-python projects from GitHub, generates the corresponding libraries, and moves them to the VM folder that matches the version of the image you created (for example, Moose 12).
If you create an image of another version, you need to run the tests again so that the libraries are moved to the corresponding folder.
Now that you have the libraries, you can parse Python code and get an AST, but not yet a FAST-Python model.
The next step explains how to make this possible.
Create the first version of the metamodel (FAST-Python in our example)
This package contains two main classes: “TSFASTBuilder” and “TSFASTImporter”.
For our task we will rely on the first one.
The second is used to make the transition between an AST generated by TreeSitter and a FAST-Language model.
“TSFASTBuilder” contains a set of methods responsible for generating a new metamodel generator:
#tsLanguage: is used to set an instance of TSLanguage, which is TSLanguage python in our case.
#createMetamodelGeneratorClass is responsible for creating a new package and a class inside. By default, the class name will be “FASTLanguageNameMetamodelGenerator” which is “FASTPythonMetamodelGenerator” and the package name is “FAST-LanguageName-Model-Generator”.
This method also calls another one, “typesToReify”, which gets all the symbols from the initial TreeSitter project (using an FFI call) and adds them as slots in the class definition. These symbols represent the nodes of the language in question, like “class” for Python.
#addPrefixMethodIn: adds #prefix method on the class side of the metamodel generator class. By default it is FASTLanguage.
#addPackageNameMethodIn: adds #packageName method on the class side of the metamodel generator class. By default it’s ‘FAST-Language-Model’.
#addSubmetamodelsMethodIn: adds #submetamodels method on the class side of the metamodel generator class, and by default it contains FASTMetamodelGenerator.
#addDefineClassIn: adds #defineClasses method. In this method slots are defined, starting with #entity, then all the symbols imported from TreeSitter.
#addDefineTraitsIn: adds #defineTraits method. By default FASTTEntity trait is created.
#addDefineHierarchyIn: adds #defineHierarchy method. By default only #entity relation is defined with FASTTEntity.
#addDefineRelationsIn: adds #defineRelations method. By default only #entity relations are defined with genericChildren and genericParent.
Voilà, now that you understand how it works, we will show you how to generate one for Python:
tsb := TSFASTBuilder new.
tsb languageName: 'Python'.
tsb tsLanguage: TSLanguage python.
tsb build.
This will generate the metamodel generator. Now that the generator is created you can use it to generate the metamodel:
FASTPythonMetamodelGenerator new generate.
Now you can access the packages and classes created: ‘FAST-Python-Model’ and ‘FAST-Python-Model-Generator’.
From now on, you have to maintain the metamodel manually. You have to add the missing traits (including FAST traits), the properties that should be imported from TreeSitter… You can rely on the importer to handle the parsing on the metamodel side. For example, you can create a tools package with a #parse method that does this.
N.B.: We recommend parsing many Python examples (you can find a lot in the main TreeSitter-Python project) with Pharo-Tree-Sitter. Once they are parsed, you can inspect the nodes in Pharo and use #collectFieldNameOfNamedChild to find the properties of each one. Then you can add them in the #defineRelations method of the metamodel generator.
How do we represent the relation between a generic entity, its type parameters and the entities that concretize it? The Famix metamodel has evolved over the years to improve the way we represent these relations. The last increment is described in a previous blogpost.
We present here a new implementation that eases the management of parametric entities in Moose.
The major change between this previous version and the new implementation presented in this post is this:
We do not represent the parameterized entities anymore.
What’s wrong with the previous parametrics implementation?
The major issue with the previous implementation was the difference between parametric and non-parametric entities in practice, particularly when trying to trace the inheritance tree.
Here is a concrete example: getting the superclass of the superclass of a class.
For a non-parametric class, the sequence is straightforward: ask the inheritance for the superclass, repeat.
For a parametric class (see the little code snippet below), there was an additional step, navigating through the concretization:
import java.util.ArrayList; // public class ArrayList<E> { /* ... */ }
This has caused many headaches to developers who wanted to browse a hierarchy: how do we keep track of the full hierarchy when it includes parametric classes? How to manage both situations without knowing if the classes will be parametric or not?
The same problem occurred when browsing the implementations of parametric interfaces and the invocations of generic methods.
Each time there was a concretization, a parametric entity was created. This created duplicates of virtually the same entity: one for the generic entity and one for each parameterized entity.
Let’s see an example:
public class MyClass implements List<Float> {
    public List<Integer> getANumber() {
        List<Number> listA;
        List<Integer> listB;
    }
}
For the interface List<E>, we had 6 parametric interfaces:
One was the generic one: #isGeneric >>> true
3 were the parameterized interfaces implemented by ArrayList<E>, its superclass AbstractList<E>, and MyClass. They were different because the concrete types were different: E from ArrayList<E>, E from AbstractList<E>, and Float.
2 were declared types: List<Number> and List<Integer>.
When deciding on a new implementation, our main goal was to create a situation in which the dependencies would work in the same way for all entities, parametric or not.
That’s where we introduce parametric associations. These associations only differ from standard associations by one property: they trigger a concretization.
Here are the new Famix metamodel traits that represent concretizations:
There is a direct relation between a parametric entity and its type parameters.
A concretization is the association between a type parameter and the type argument that replaces it.
A parametric association triggers one or several concretizations, according to the number of type parameters the parametric entity has. Example: a parametric association that targets Map<K,V> will trigger 2 concretizations.
The parametric entity is the target of the parametric association. It is always generic. As announced, we do not represent parameterized entities anymore.
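The relations listed above can be sketched as a small data structure. This is only an illustration in Java under simplifying assumptions: GenericEntity, ConcretizationSketch and ParametricAssociationSketch are hypothetical names, not the Famix traits, and types are reduced to their names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for the concepts above (not the Famix traits).
final class GenericEntity {
    final String name;
    final List<String> typeParameters; // direct relation to the type parameters

    GenericEntity(String name, String... typeParameters) {
        this.name = name;
        this.typeParameters = List.of(typeParameters);
    }
}

// Links one type parameter to the type argument that replaces it.
final class ConcretizationSketch {
    final String typeParameter;
    final String typeArgument;

    ConcretizationSketch(String typeParameter, String typeArgument) {
        this.typeParameter = typeParameter;
        this.typeArgument = typeArgument;
    }
}

final class ParametricAssociationSketch {
    final String source;
    final GenericEntity target; // always the generic entity
    final List<ConcretizationSketch> concretizations = new ArrayList<>();

    // Triggers one concretization per type parameter of the target.
    ParametricAssociationSketch(String source, GenericEntity target, String... typeArguments) {
        if (typeArguments.length != target.typeParameters.size()) {
            throw new IllegalArgumentException("expected one type argument per type parameter");
        }
        this.source = source;
        this.target = target;
        for (int i = 0; i < typeArguments.length; i++) {
            concretizations.add(
                new ConcretizationSketch(target.typeParameters.get(i), typeArguments[i]));
        }
    }
}

class ParametricAssociationDemo {
    public static void main(String[] args) {
        GenericEntity map = new GenericEntity("Map", "K", "V");
        ParametricAssociationSketch typing =
            new ParametricAssociationSketch("myField", map, "String", "Integer");
        System.out.println(typing.concretizations.size());
    }
}
```

In this sketch, any number of associations can share the same GenericEntity instance, so no parameterized copy of the entity is ever created.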
Coming back to the entities’ duplication example above, we now represent only 1 parametric interface for List<E>, and it is the target of the 5 parametric associations.
This metamodel evolution is the occasion for another major change: the replacement of the direct relation between a typed entity and its type. This new association is called Entity typing.
We chose to replace the existing relation with a reified association to represent the dependency consistently with the rest of the metamodel.
With this new association, we can now add parametric entity typings.
In a case like this:
public ArrayList<String> myAttribute;
we have an “entity typing” association between myAttribute and ArrayList. This association is parametric: it triggers the concretization of E in ArrayList<E> by String.
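Java's own reflection API exposes a comparable split between the generic type and its type argument, which may help to picture the association. The sketch below is an analogy, not the Famix implementation; Holder and EntityTypingDemo are hypothetical names.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.lang.reflect.TypeVariable;
import java.util.ArrayList;

class Holder {
    public ArrayList<String> myAttribute;
}

class EntityTypingDemo {
    // Returns "<generic type>: <type parameter> -> <type argument>" for Holder.myAttribute.
    static String describeTyping() {
        try {
            ParameterizedType declared =
                (ParameterizedType) Holder.class.getField("myAttribute").getGenericType();
            Class<?> target = (Class<?>) declared.getRawType();         // the generic ArrayList
            TypeVariable<?> parameter = target.getTypeParameters()[0];  // its type parameter
            Type argument = declared.getActualTypeArguments()[0];       // the type argument
            return target.getSimpleName() + ": " + parameter.getName()
                + " -> " + ((Class<?>) argument).getSimpleName();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describeTyping());
    }
}
```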
In the previous implementation, the bounds of type parameters were implemented as inheritances: in the example above, Number would be the superclass of T.
Since then, bounds have been introduced for wildcards.
We now have the opportunity to apply them to type parameters as well.
In the new implementation, Number is the upper bound of T.
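Java reflection also reports Number as a bound of T rather than as a superclass. The following minimal sketch uses a hypothetical class NumericBox<T extends Number> as a stand-in for the kind of declaration discussed here.

```java
import java.lang.reflect.Type;

// Hypothetical example class: T is bounded by Number.
class NumericBox<T extends Number> {
    T content;
}

class BoundDemo {
    // Returns the first (upper) bound declared for the type parameter T.
    static Type upperBoundOfT() {
        return NumericBox.class.getTypeParameters()[0].getBounds()[0];
    }

    public static void main(String[] args) {
        System.out.println(upperBoundOfT());
    }
}
```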
This diagram sums up the new parametrics implementation in Famix traits and Java metamodel.
Please note that this is not the full Java metamodel but only a relevant part.
The representation of parametric entities is a challenge that will most likely continue as Famix evolves. The next question will probably be this one: should Concretization really be an association?
An association is the reification of a dependency. Yet, there is no dependency between a type argument and the type parameter it replaces. Each can exist without the other. The dependency is in fact between the source of the parametric association and the type parameter.
MySpecializedList has a superclass (ArrayList<E>) and also depends on String, as a type argument. However, String does not depend on E, nor does E depend on String.
The next iteration of the representation of parametric entities will probably cover this issue. Stay tuned!
Here, ClassA is a generic class because there is one generic type T.
One cannot use ClassA without specifying the generic type.
ClassA<Integer> class1 = new ClassA<Integer>();
ClassA<String> class2 = new ClassA<String>();
class1 and class2 are variables of type ClassA, but this time the type of ClassA is no longer generic: it is Integer or String.
So, how do we represent all that?
We have 5 new traits in our meta-model:
TParametricEntity is used by all parametric entities. It can be a ParametricClass, ParametricMethod, and ParametricInterface.
TConcretisation allows one to have a link between two TParametricEntity. A TParametricEntity can have one or more concretizations with other TParametricEntity. Each TParametricEntity that is a concretization of another TParametricEntity has a genericEntity.
TConcreteParameterType for concrete parameters.
TGenericParameterType for generic parameters.
TParameterConcretisation is the same as TConcretisation, but instead of linking two TParametricEntity it links a TConcreteParameterType and a TGenericParameterType. A TGenericParameterType can have one or more concretisations, and a TConcreteParameterType has generics.
A TParametricEntity knows its concrete and generic parameters.
ParameterType uses the TWithInheritance trait because in Java we can do the following: <T extends Object> and <? super Number>.
For the first, it means that T can be all subclasses of Object and for the second, it means Number and all its superclasses or interfaces (Number, Object, Serializable).
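The two kinds of bounds can be tried out with a few lines of Java. This is a small sketch with hypothetical names (WildcardBoundsDemo, describe, addOne, acceptedSinks): the first method uses an upper bound on a type parameter, the second a lower bound on a wildcard.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class WildcardBoundsDemo {
    // Upper bound: T can be Object or any of its subclasses.
    static <T extends Object> String describe(T value) {
        return value.toString();
    }

    // Lower bound: the list can hold Number or any of its supertypes.
    static void addOne(List<? super Number> sink) {
        sink.add(1);
    }

    // Calls addOne with three accepted list types and returns the number of elements added.
    static int acceptedSinks() {
        List<Number> numbers = new ArrayList<>();
        List<Object> objects = new ArrayList<>();
        List<Serializable> serializables = new ArrayList<>();
        addOne(numbers);
        addOne(objects);
        addOne(serializables);
        return numbers.size() + objects.size() + serializables.size();
    }

    public static void main(String[] args) {
        System.out.println(describe(3));
        System.out.println(acceptedSinks());
    }
}
```

A List<Integer> would be rejected by addOne at compile time, because Integer is a subclass of Number and not one of its supertypes.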
ParameterType also uses the TThrowable trait because we can have a generic exception, so a ParameterType should be treated as one.
If we take the first class, we have a ParametricClass with one ParameterType named T.
For the second class, we have a class that extends a parametric class with one parameter named String.
String here is a class.
It is not a ParameterType anymore.
So, what is the link between the two parametric classes and the parameters T and String?
We have here a Concretisation.
ClassA with the parameter T has one concretization and the parameter T has one parameter Concretisation which is String.
If we take back our first example:
public class ClassA<T> { /* ... */ }
ClassA<Integer> class1 = new ClassA<Integer>();
ClassA<String> class2 = new ClassA<String>();
We have three ParametricClass, one ParameterType and two types (String and Integer).
T is our ParameterType and has two ParameterConcretisations: String and Integer.
We can say that T is generic and String and Integer are concrete because we know what they are: classes.
ClassA with the ParameterType T (ClassA<T>) also has two concretizations.
These are ClassA<Integer> and ClassA<String>.
The three different ClassA entities know their parameters. T is in genericParameters. String and Integer are in concreteParameters.
A class is generic if it has at least one ParameterType.
We can have concretization of a parametric class that is also generic. See the example below:
The second ParametricClass has one ParameterType, so the class is generic.
The four parameters (T, V, K, Z) each have a concretization (String, Integer, Integer, Z), even though Z is still a ParameterType.
ParametricClass2 has for superclass ParametricClass, which has for generic entity ParametricClass with 4 ParameterTypes.
Let’s see what we have here. First of all, we recognize a ParametricClass with one ParameterType. This class has two methods. One is a regular method and the second is a parametricMethod.
The first one isn’t generic because when the class is concretized, the ParameterType T will become String, Integer, Animals… and it will be the same for the method.
The parameter of the first method depends on the class and this is not the case for the second method. That is why the second method is generic, not the first one.
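The difference between the two kinds of methods can be checked in Java itself. The class below is a hypothetical example (Shelter, admit and echo are illustrative names, not the ones from the post): only the method that declares its own type parameter is generic.

```java
import java.lang.reflect.Method;

// Hypothetical example: a parametric class with one regular and one generic method.
class Shelter<T> {
    private T resident;

    // Regular method: T is fixed as soon as the class is concretized.
    void admit(T newResident) {
        this.resident = newResident;
    }

    T resident() {
        return resident;
    }

    // Generic method: V is declared by the method itself, independently of T.
    <V> V echo(V value) {
        return value;
    }
}

class GenericMethodDemo {
    // Number of type parameters declared by the named method itself.
    static int ownTypeParameters(String methodName) {
        for (Method m : Shelter.class.getDeclaredMethods()) {
            if (m.getName().equals(methodName)) {
                return m.getTypeParameters().length;
            }
        }
        throw new IllegalArgumentException("no such method: " + methodName);
    }

    public static void main(String[] args) {
        System.out.println(ownTypeParameters("admit"));
        System.out.println(ownTypeParameters("echo"));
    }
}
```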
Sometimes, a meta-model does not have all the information you want.
Or, you want to connect it with another one.
A classic example is linking a meta-model at an abstract level to a more concrete meta-model.
In this blog post, I will show you how to extend and connect a meta-model with another (or reuse a pre-existing meta-model into your own meta-model).
We will use the Coaster example.
The Coaster meta-model is super great (I know… it is mine 😄 ).
Using it, one can manage one’s collection of coasters.
However, did you notice that there is only one kind of Creator possible: the brewery?
It is not great because some coasters are not created by breweries but for events.
My model is not able to represent this situation.
So, there are two possibilities: I can fix my meta-model, or I can extend it with a new concept.
Here, we will see how I can extend it.
As presented in the above figure, we add the Events concept as a kind of Creator.
As a first step, we need the original Coaster meta-model generator loaded in our image.
We can download it from my Coaster GitHub repository.
You should now have a class named CoasterCollectorMetamodelGenerator in your image.
This is the generator of the original meta-model.
We will now create another generator reusing the original one.
First, we create a new generator for our extended meta-model.
Then, we link our generator with the original one.
To do so, we will use the submetamodels feature of the generator.
We only have to implement the #submetamodels method in the class side of our new generator.
This method should return an array including the generators of the submetamodels that we want to reuse.
Creating new concepts in the new meta-model follows the same approach as for a classic meta-model generator.
In our example, we add the Event class.
Thus, we create the method #defineClasses with the new entity.
To extend the original meta-model, we first need to identify the entities of the original meta-model we will extend.
In our case, we only extend the Creator entity.
Thus, we declare it in the #defineClasses method.
To do so, we use the method #remoteEntity:withPrefix:.
The prefix is used to allow multiple entities coming from different submetamodels but with the same name.
Once the declaration is done, one can use the remote entities like classic entities in the new generator.
In our example, we will create the hierarchy between Creator and Event.
Once everything is defined, we generate the meta-model, as for a classic generator.
To do so, execute in a playground:
CoasterExtendedMetamodelGenerator generate
The generation creates a new package with the Event entity.
It also generates a class named CCEModel used to create an instance of our extended meta-model.
It is now possible to use the new meta-model with the Event concept.
For instance, one can perform the following script in a playground to create a little model.
myExtendedModel := CCEModel new.
myExtendedModel add: (CCBrewery new name: 'Badetitou'; yourself).
myExtendedModel add: (CCEEvent new name: 'Beer party'; yourself)
We saw that one can extend a meta-model by creating a new one based on the pre-existing entities.
It is also possible to connect two existing meta-models together.
To do so, let’s assume we have two existing meta-models to be connected.
As an example, we will connect our coaster meta-model with the world meta-model.
The world meta-model aims to represent the world, with its continents, countries, regions, and cities.
We will not detail how to implement the world meta-model.
But the generator is available in my GitHub repository.
The figure below illustrates the meta-model.
Connecting world meta-model with Coaster meta-model
Before creating the connection, we must declare, in the new meta-model, the entities that will be connected.
To do so, we declare them as remoteEntity.
Once the generator is created, we can generate the connection by generating the new meta-model.
To do so, execute in a playground:
ConnectMetamodelGenerator generate
Then, it is possible to create a model with all the entities and to link the two meta-models.
In the following, we present a script that creates a model.
"create the entities"
coaster1 := CCCoaster new.
coaster2 := CCCoaster new.
coaster3 := CCCoaster new.
coasterFranceCountry := CCCountry new name: #'France'; yourself.
coasterFranceCountry addCoaster: coaster1.
coasterFranceCountry addCoaster: coaster2.
coasterGermanyCountry := CCCountry new name: #'Germany'; yourself.
coasterGermanyCountry addCoaster: coaster3.
wFranceCountry := WCountry new name: #'France'; yourself.
wGermanyCountry := WCountry new name: #'Germany'; yourself.
continent := WContinent new name: #Europe; yourself.
continent addCountry: wFranceCountry.
continent addCountry: wGermanyCountry.
"connect CCCountries to WCountries"
coasterFranceCountry country: wFranceCountry.
coasterGermanyCountry country: wGermanyCountry.
"put all entities into the same model"
connectedModel := CMModel new.
connectedModel addAll:
{ coaster1. coaster2 . coaster3 .
coasterFranceCountry . coasterGermanyCountry .
wFranceCountry . wGermanyCountry . continent }.
Based on the preceding model, it is possible to write queries that span both the coaster and the world meta-models.
For instance, the following snippet counts the number of coasters per country in the continent Europe:
europe := (connectedModel allWithType: WContinent)
    detect: [ :continent | continent name = #Europe ].
(europe countries collect: [ :eCountry |
    eCountry name -> eCountry country coasters size ]) asDictionary
The country: and country methods are accessors that set and retrieve a CCCountry (resp. WCountry) in a WCountry (resp. CCCountry).
The accessor names are the same in both classes and were generated automatically from the declaration of the relationship in defineRelations (this is the normal behaviour of the generator, not specific to using sub-models).
In this post, we saw how one can extend and connect meta-models using Famix Generator.
This feature is very helpful when you need to improve a meta-model without modifying it directly.
If you need more control over the generated entities (e.g., name of the relations, etc.), please have a look at the create meta-model wiki page.
I’m a coaster collector.
I’m not a huge collector but I want to inventory them in one place.
For sure, I can create a PostgreSQL database.
But, at the same time, it appears that I can also design my collection using Moose.
So, you’re going to use a complete software analysis platform to manage your coaster collection?
As for every software system, the first step is to design the model.
In my case, I want to represent a collection of coasters.
Let’s say a coaster is an entity.
It can belong to a brewery or not (for example event coasters).
A coaster also has a form.
It can be round, squared, oval, or others.
A Coaster can also be specific to a country.
Because it is a collection, I can register the coasters I own and those I do not.
Finally, each coaster can have an associated image.
From this description of the problem, I designed my UML schema:
The most complicated part is done.
We just need to implement the meta-model in Moose now 😄.
Ok! Let’s create a generator that will generate for us the meta-model.
We only need to describe the meta-model in the generator.
We will name this generator CoasterCollectorMetamodelGenerator.
A meta-model is composed of entities.
In our case, it corresponds to the entities identified in the UML.
We use the method #defineClasses to define the entities of our meta-model.
Now that we have defined the classes, we define the properties of the entities using the #defineProperties method.
defineProperties
    super defineProperties.
    creator property: #name type: #String.
    country property: #name type: #String.
    coaster property: #image type: #String.
    coaster property: #owned type: #Boolean
In this example, we did not use any of the traits already provided by Moose.
However, it is possible to use the trait TNamedEntity to declare that countries and creators have a name, instead of using properties.
I have created my meta-model.
Now I need to fill my collection.
First of all, I will create a collection of coasters.
To do so, I instantiate a model with: model := CCModel new.
And now I can add the entities of my real collection in my model and I can explore it in Moose.
For example, to add a new brewery I execute: model add: (CCBrewery new name: 'Badetitou'; yourself).