Transformation journey (1/3) : Locating entities and nodes

Sometimes we have to perform several similar edits on our source code, for example to fix a recurring bug, to introduce a new design pattern, or to change the architecture of part of a software system. When many entities are involved, or when the edits are complicated and time-consuming, it is worth considering building a transformation tool to make the task easier. Fortunately, Moose is here to help, with several modeling levels and powerful tools that let us build what we need!

This little transformation journey will be divided into three blog posts. We will see how to build a simple transformation tool, with each post focusing on a different aspect:

  • First post : Locating entities and nodes to transform
  • Second post : Creating AST copies and AST nodes to make a transformation
  • Final post : Viewing and editing our transformation, and applying it to the source files

Throughout those three posts, we will follow a simple transformation scenario, based on the software ArgoUML, an open-source Java project used in this wiki. The first step is to create the model for the software, using the sources and libraries available on that wiki post, but creating the model on the latest stable version of VerveineJ.

Using the available Docker image, this command (on Windows) will create the model and store it in the sources repository :

docker run -v .\src:/src -v .\libs:/dependency badetitou/verveinej -format json -o argouml.json -anchor assoc .

All that remains to do is to create a fresh image, and import this model (with the sources repository used to build the model as root folder) to start making our tool.

Note : As the creation of that tool is divided in three blog posts, keeping that image for all three blog posts is recommended.

The transformation case we will be dealing with in those blog posts is rather simple. In the ArgoUML system, three classes define and use a method named logError. In our scenario, a class ArgoLogger has been defined and contains a static method logError. The transformation task is to add a receiver node to each logError method invocation so that the method is called using the right class.
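
To make the scenario concrete, here is a minimal Java sketch of one call site before and after the transformation. Only the names ArgoLogger and logError come from the scenario; the String parameter, the two Importer classes and the in-memory LOG list are assumptions made up for illustration, not ArgoUML code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the ArgoLogger class of the scenario.
// The signature of logError is an assumption.
class ArgoLogger {
    static final List<String> LOG = new ArrayList<>();

    static void logError(String message) {
        LOG.add("ERROR: " + message);
    }
}

// Before: the class defines its own logError and calls it without a receiver.
class ImporterBefore {
    final List<String> log = new ArrayList<>();

    void logError(String message) {
        log.add("ERROR: " + message);
    }

    void run() {
        logError("cannot parse file"); // invocation without receiver
    }
}

// After: the same invocation, with the receiver node added.
class ImporterAfter {
    void run() {
        ArgoLogger.logError("cannot parse file"); // receiver: ArgoLogger
    }
}

class ReceiverDemo {
    public static void main(String[] args) {
        new ImporterAfter().run();
        System.out.println(ArgoLogger.LOG);
    }
}
```

The only difference between the two run methods is the receiver in front of the invocation, which is exactly the node our tool will have to create and attach.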

For this blog post, we will have to import two tools :

The first one is Carrefour, which lets us build the (F)AST model of an entity, bind it to its Famix counterpart, and access it from there. Loading it will also load the FAST Java metamodel. To load the project, execute this command in a Playground :

Metacello new
    githubUser: 'moosetechnology' project: 'Carrefour' commitish: 'v5' path: 'src';
    baseline: 'Carrefour';
    load

Second, we will use MoTion, an object pattern matcher that will allow us to easily explore the FAST of our methods and find the specific nodes we are looking for. To load the project, execute this command in a Playground :

Metacello new
    baseline: 'MoTion';
    repository: 'github://AlessHosry/MoTion:main';
    load: 'MoTion-Moose'

Finally done with explanations and setup! 😄 Let us start by creating a class with a model instance variable, accessors, and add a class side initializer method for ease of use:

"Creating our class"

onModel: aMooseModel
    ^ self new
        model: aMooseModel;
        yourself

This class will contain our entire code manipulation logic. It will be pretty short, but of course, when working on larger transformations, dividing the logic of our tool will help make it more understandable and maintainable.

In any case, we will start with a basic Famix query to find all the implementations of logError, which will later allow us to easily find their invocations :

fetchLogErrorMethods
    ^ model allModelMethods select: [ :m | m name = 'logError' ]

And then another query using this result, to get all invocations (the exact entities we seek to transform) :

fetchLogErrorMethodInvocations
    ^ self fetchLogErrorMethods flatCollect: [ :m |
        m incomingInvocations ]

Using MooseQuery, you should be able to find any Famix entities you are seeking. From those Famix entities, we want to get the FAST nodes that we need to transform. We will look at two different methods to do so.

In this context, Carrefour is the perfect tool to use to find the nodes we want to transform in the FAST of our methods. Now that we found the entities that we have to transform in the Famix model, all that remains is building and binding the FAST node of every entity within our method, and then fetch the ones from our method invocations.

To do so, we will add two methods to our class. First, a method to fetch the FAST node matching a given method invocation :

fetchFastNodeForFamixInvocation: anInvocation
    "building and binding the FAST of the invoking method"
    anInvocation sender generateFastIfNotDoneAndBind.
    "returning the actual node of the method invocation, our target"
    ^ anInvocation fast

And finally, a method that returns a list with every node we have to transform :

fetchAllFastNodesUsingCarrefour
    ^ self fetchLogErrorMethodInvocations collect: [ :mi |
        self fetchFastNodeForFamixInvocation: mi ]

And just like that, we now have the complete list of FAST nodes to transform!

But before celebrating, we should keep in mind that this transformation is a very simple use case. Here, it is easy to find the entities to transform using Famix, but in other cases it might be much more complex to find the methods that are candidates for a transformation, not to mention every node that must be transformed.

In those cases, a good way to make things easier is to divide the logic of this process, and use separate means to find the methods that are candidates for a transformation and to find the nodes that must be transformed.

Making queries on the Famix model remains a very reliable way to find the candidate methods, but then what about the nodes inside the AST of these methods? Methods can be quite complex (50 lines of code, 100, more…) and the resulting AST is huge. Finding the right node(s) in such an AST is difficult. That’s where MoTion comes in. In order to find the nodes we are looking for, we can define patterns that describe those nodes, and the path to follow through the FAST to reach them.

MoTion is a powerful tool, allowing us to find specific items within a graph through concise patterns describing the objects we are looking for and the path in the graph used to reach them. However, it has a very specific syntax that is worth studying before we start writing our own patterns. Thankfully, everything is well documented, with examples (one of them being another example for FAST Java), on the MoTion repository (look at the README!).

But enough description. Time to code! 😄

motionQueryForFastMethod: aFASTJavaMethodEntity
    | query |
    query := FASTJavaMethodEntity "type of the root node"
        % { (#'children*' <=> FASTJavaMethodInvocation "looking through all children (with *)"
                                                       "until we find method invocation nodes"
            % { (#name <=> 'logError') } as: #logErrorInvocation) }
                "if their name is logError,"
                "we save them under the given key"
        collectBindings: { 'logErrorInvocation' } "at the end, we want all found invocations"
        for: aFASTJavaMethodEntity. "and this is the root entity for our search"
    "the result of the query is a list of dictionaries, with each result in a dictionary"
    "we only have one call to logError per method, so we can do a simple access"
    ^ query first at: 'logErrorInvocation'

And without comments, to get a clearer view of what the pattern looks like :

motionQueryForFastMethod: aFASTJavaMethodEntity
    | query |
    query := FASTJavaMethodEntity
        % { (#'children*' <=> FASTJavaMethodInvocation
            % { (#name <=> 'logError') } as: #logErrorInvocation) }
        collectBindings: { 'logErrorInvocation' }
        for: aFASTJavaMethodEntity.
    ^ query first at: 'logErrorInvocation'

Now, to complete the use of our pattern, let’s make a final method that will fetch every node we need to transform :

fetchAllFastNodesUsingMotion
    ^ self fetchLogErrorMethodInvocations collect: [ :mi |
        mi sender generateFastIfNotDoneAndBind.
        self motionQueryForFastMethod: mi sender fast ]

As you can see, we still use Carrefour even in this context, as it remains the easiest way to get the FAST of our method before looking through it using MoTion. Those two tools can therefore be used together when dealing with complex transformation cases.

Now that our class is done, we are able to locate candidate methods for transformation and the specific nodes to transform using Famix, FAST, Carrefour and MoTion. You can use a Playground to test our class on the model and see for yourself the results of each method :

t := LoggerTransformationTool onModel: (MooseModel root at: 1).
t fetchAllFastNodesUsingMotion

"Testing our class"

The whole source code that was written on this blog post is also available on that repository.

Using Famix, FAST, Carrefour and MoTion, we are able to search for and locate candidate methods and nodes for a given transformation use case. This first step is essential to building a complete transformation tool. In the next blog posts, we will see how to create AST copies and AST nodes to use in a transformation, and finally how to view and edit our transformation before applying it to the source files.

Generate a class diagram visualization for a meta-model

This post is the first in a series dedicated to Famix Tools.

When creating or studying a meta-model, it is often convenient to be able to “see” it as a whole.

UML looks like a natural solution for this.

So in the past we had a tool to create UML diagrams of the meta-models through PlantUML (a small language and a tool to generate UML diagrams). The post Generate a plantUML visualization for a meta-model explained how to use this tool.

But the tool had some limitations, one of which was that it was not easy to add a different backend than PlantUML.

Therefore, inspired by the previous tool, we redesigned a new one, FamixUMLDocumentor, with a simpler API and the possibility to add new backends.

We illustrate the use with the same Coaster example already used previously. You can also experiment with FDModel, a small meta-model used for testing.

You can create a PlantUML script for a UML class diagram of your metamodel with:

FamixUMLDocumentor new
model: CCModel ;
generate ;
exportWith: (FamixUMLPlantUMLBackend new).

The result will be a PlantUML script that you can paste into https://plantuml.org/ to get this UML class diagram:

Generated UML class diagram of the Coaster meta-model

The API for the documentor is as follows:

  • model: — adds a meta-model to export. Several meta-models can be exported jointly by adding them one after the other. By default each meta-model is automatically assigned a color in which its entities will be drawn.
  • model:color: — same as previous but manually assign a Color to the meta-model.
  • onlyClasses: — specifies a list of classes to export. It can replace the use of model:.
  • excludeClasses: — specifies a list of classes to exclude from the export. Typically used with model: to remove from the UML some of the meta-model’s classes. Can also be used to exclude “stub” classes (see beWithStubs).
  • beWithStubs — indicates to also export the super-classes and used traits of exported classes, even if these super-classes/traits are not part of the meta-models. These stubs have an automatically selected color different from the meta-models displayed.
  • beWithoutStubs — opposite of the preceding. This is the default option.
  • generate — creates an internal representation of a UML class diagram according to the configuration created with the preceding messages.
  • exportWith: — exports the internal representation with the “backend” given (for example: FamixUMLPlantUMLBackend in the example above)

The backend is normally called by the FamixUMLDocumentor but can be called manually. For example, the image above can be exported in a PlantUML script with:

documentor := FamixUMLDocumentor new.
documentor
model: CCModel ;
generate.
FamixUMLPlantUMLBackend new export: documentor umlEntities.

(Compare with the example given above)

Backends have only one mandatory method:

  • export: — Exports the collection of umlEntities (internal representation) in the format specific to the backend.

New backends can be created by subclassing FamixUMLAbstractBackend.

There is a FamixUMLRoassalBackend to export the UML diagram in Roassal (visible inside Pharo itself), and a FamixUMLMermaidBackend to export in Mermaid format (similar to PlantUML).

There is a FamixUMLTextBackend that outputs the UML class diagram in a textual form. By default it returns a string but this can be changed:

  • toFile: — Instead of putting the result in a string, will write it to the file whose name is given in argument.
  • outputStream: — specifies a stream on which to write the result of the backend.

FamixUMLPlantUMLBackend and FamixUMLMermaidBackend are subclasses of this FamixUMLTextBackend (therefore they can also export to a file).

Enhancing software analysis with Moose's aggregation

As software systems grow more complex, importing large models into Moose using the conventional process can cause issues with speed, excessive memory usage, and overall performance due to the vast amount of data. To ensure a smoother analysis process, managing the importation of extensive models efficiently is crucial. To overcome these challenges, strategic filtering and aggregation have emerged as powerful techniques.

One feature of Moose is its model import filtering, which provides a practical approach to effectively handle large models. It allows us to selectively choose relevant entities for analysis instead of importing the entire model.

However, filtering has its limitations. By excluding certain entities during importation, we may lose some fine-grained details that could potentially be relevant for certain analyses. Moreover, if our filtering criteria are too aggressive, we might overlook important dependencies that could impact the overall understanding of the software system. To address these limitations, we have adopted a specific approach in this context - not importing methods.

Simplifying the model by not importing methods


Let’s take a look at a real-life example: a massive software model with over 130,000 methods!

"Massive Model"

While method-related information can be crucial for certain analysis tasks, focusing on high-level relationships between classes is often more important than diving into individual method implementations. By avoiding the importation of individual methods, we strike a balance between capturing essential dependency information and simplifying the model.

But how do we preserve crucial dependency information when we’re not importing methods? This is where aggregation comes into play.

Aggregation: an approach to capture dependencies


Aggregation involves creating an aggregated method within each class, serving as a central repository for consolidating dependencies. This approach reduces the need for complex connections between individual methods, leading to improved performance and overall efficiency. The abstraction layer introduced by aggregated methods not only simplifies the model but also enhances its modularity. By adopting this approach, we promote cleaner code design, making the software more maintainable and adaptable.
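
The idea can be illustrated with a small Java sketch. It is only a conceptual illustration under our own assumptions, not the code Moose uses: the MethodDependency class, the aggregate method and the sample class names are all hypothetical. Each method-level dependency is folded into a single per-class set, which plays the role of the aggregated method.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class AggregationSketch {
    /** Hypothetical record of one dependency: a method of ownerClass uses targetClass. */
    static final class MethodDependency {
        final String ownerClass;
        final String method;
        final String targetClass;

        MethodDependency(String ownerClass, String method, String targetClass) {
            this.ownerClass = ownerClass;
            this.method = method;
            this.targetClass = targetClass;
        }
    }

    /** Folds all method-level dependencies of a class into one set per class. */
    static Map<String, Set<String>> aggregate(List<MethodDependency> dependencies) {
        Map<String, Set<String>> byClass = new TreeMap<>();
        for (MethodDependency d : dependencies) {
            byClass.computeIfAbsent(d.ownerClass, k -> new TreeSet<>())
                   .add(d.targetClass);
        }
        return byClass;
    }

    public static void main(String[] args) {
        List<MethodDependency> deps = Arrays.asList(
            new MethodDependency("Order", "total", "Invoice"),
            new MethodDependency("Order", "owner", "Customer"),
            new MethodDependency("Order", "print", "Invoice"),
            new MethodDependency("Invoice", "send", "Customer"));
        // prints {Invoice=[Customer], Order=[Customer, Invoice]}
        System.out.println(aggregate(deps));
    }
}
```

Four method-level dependencies collapse into two class-level entries: the individual methods disappear, but the information about which class depends on which is kept.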

Now, let’s explore the process of importing a software model into Moose using the aggregator approach.

Importing a model in Moose with the aggregator


To import an aggregated model into Moose:

  1. Open Moose’s model browser.
  2. Locate the model file on your computer.
  3. Click “Aggregate Methods.”
  4. Click “Import.”

"Importing Model"

Now, the aggregated model is available for analysis in Moose.

"My Java Model"

Benchmarking aggregation’s impact on model size and analysis


To assess the effectiveness of aggregation in reducing model complexity, we conducted a benchmark using a real-life example. The original software model had a staggering 10,267 methods.

"Source Model Number Of Methods"

After importing the model into Moose using the aggregation approach, the corresponding aggregated model had only 448 methods. This showcases a substantial reduction in complexity achieved through aggregation.

"Aggregated Model Number Of Methods"

In proportion, the aggregated model represents just 4.4% of the original model’s size (448 / 10,267 * 100). This remarkable decrease in the number of methods demonstrates the powerful impact of aggregation in simplifying the model.
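
The ratio is easy to check with a few lines of Java. The two method counts are the ones from the benchmark; the class and method names are ours.

```java
import java.util.Locale;

class AggregationRatio {
    /** Returns part as a percentage of whole. */
    static double percentOf(int part, int whole) {
        return 100.0 * part / whole;
    }

    public static void main(String[] args) {
        double kept = percentOf(448, 10_267);
        // prints kept: 4.4%, removed: 95.6%
        System.out.println(String.format(Locale.ROOT,
            "kept: %.1f%%, removed: %.1f%%", kept, 100 - kept));
    }
}
```

In other words, about 95.6% of the method entities are gone from the aggregated model.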

Our benchmark confirms that aggregation is an invaluable technique for managing large models in Moose. It significantly streamlines the analysis process while preserving essential dependency information. Aggregation empowers software engineers to work with large-scale systems more efficiently and promotes cleaner code design, making the software more maintainable and adaptable.

In summary, aggregation proved to be a highly effective approach for managing large models in Moose. By adopting aggregation, software engineers can work more efficiently with complex systems.

Representation of parametrics

Note that this blog post is rendered obsolete by the new Parametrics next generation blog post.

In Java, generic types allow you to write a general, generic class (or method) that works with different types, allowing code reuse.

But their modeling and how it works can be difficult to understand. Let’s take an example.

public class ClassA<T>

Here, ClassA is a generic class because it has one generic type, T. One cannot use ClassA without specifying the generic type.

ClassA<Integer> class1 = new ClassA<Integer>();
ClassA<String> class2 = new ClassA<String>();

class1 and class2 are variables of type ClassA, but this time ClassA no longer has a generic type: its parameter is Integer or String. So, how do we represent all that?

Modelisation_generic

We have 5 new traits in our meta-model :

  • TParametricEntity is used by all parametric entities. It can be a ParametricClass, ParametricMethod, and ParametricInterface.
  • TConcretisation allows one to have a link between two TParametricEntity. A TParametricEntity can have one or more concretizations with other TParametricEntity. Each TParametricEntity that is a concretization of another TParametricEntity has a genericEntity.
  • TConcreteParameterType for concrete parameters.
  • TGenericParameterType for generic parameters.
  • TParameterConcretisation is the same as TConcretisation, but instead of linking two TParametricEntity it links a TConcreteParameterType and a TGenericParameterType. A TGenericParameterType can have one or more concretisations, and a TConcreteParameterType has generics.

A TParametricEntity knows its concrete and generic parameters.

ParameterType uses the TWithInheritance trait because in Java we can write the following: <T extends Object> and <? super Number>. The first means that T can be Object or any of its subclasses; the second means Number or any of its superclasses or interfaces (Number, Object, Serializable). ParameterType also uses the TThrowable trait because a generic type can stand for an exception, so a ParameterType must be usable as a throwable:

public interface GenericThrower<T extends Throwable> {
public void doThrow() throws T;
}
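
The two kinds of bounds mentioned above can be tried out with a short Java sketch. The class and method names are hypothetical, and we use Number rather than Object as the upper bound so that the bound actually restricts something.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BoundsDemo {
    // Upper bound: T may be Number or any subclass of Number.
    static <T extends Number> double sum(List<T> values) {
        double total = 0;
        for (T value : values) {
            total += value.doubleValue();
        }
        return total;
    }

    // Lower bound: the element type may be Number or any of its supertypes.
    static void addOne(List<? super Number> sink) {
        sink.add(Integer.valueOf(1));
    }

    public static void main(String[] args) {
        List<Number> numbers = new ArrayList<>();
        List<Object> objects = new ArrayList<>();
        List<Serializable> serializables = new ArrayList<>();
        addOne(numbers);
        addOne(objects);
        addOne(serializables);
        // addOne(new ArrayList<Integer>()); // rejected: Integer is not a supertype of Number

        List<Integer> ints = Arrays.asList(1, 2, 3);
        System.out.println(sum(ints)); // prints 6.0
    }
}
```

The three lists accepted by addOne correspond to the three types listed above for <? super Number>: Number, Object and Serializable.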

example

If we take the first class, we have a ParametricClass with one ParameterType named T.

classA<T>

For the second class, we have a class that extends a parametric class with one parameter named String. String here is a class. It is not a ParameterType anymore.

classB extends classA<String>

So, what is the link between the two parametric classes and the parameters T and String?

concretization

We have here a Concretisation. ClassA with the parameter T has one concretization and the parameter T has one parameter Concretisation which is String.

If we take back our first example:

public class ClassA<T>
ClassA<Integer> class1 = new ClassA<Integer>();
ClassA<String> class2 = new ClassA<String>();

We have three ParametricClass, one ParameterType and two types (String and Integer). T is our ParameterType and has two ParameterConcretisations: String and Integer. We can say that T is generic and String and Integer are concrete because we know what they are: classes. ClassA with the ParameterType T (ClassA<T>) also has two concretizations. These are ClassA<Integer> and ClassA<String>. The three different ClassA entities know their parameters. T is in genericParameters. String and Integer are in concreteParameters.

A class is generic if it has at least one ParameterType. We can have concretization of a parametric class that is also generic. See the example below:

public class ParametricClass<T, V, K, Z>
public class ParametricClass2<Z> extends ParametricClass<String, Integer, Integer, Z>

The second ParametricClass has one ParameterType, so the class is generic. Each of the four parameters (T, V, K, Z) has a concretization (String, Integer, Integer, Z), even though Z is still a ParameterType.

The superclass of ParametricClass2 is a concretization of ParametricClass, whose generic entity is ParametricClass with its 4 ParameterTypes.
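
Java itself exposes the same information through reflection, which gives a quick way to see which arguments of the superclass are concrete and which are still generic. This sketch is independent of Moose and VerveineJ; the ConcretisationDemo class and its helper method are our own names, and the class bodies are left empty for brevity.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.lang.reflect.TypeVariable;

class ParametricClass<T, V, K, Z> {}

class ParametricClass2<Z> extends ParametricClass<String, Integer, Integer, Z> {}

class ConcretisationDemo {
    /** Returns the type arguments a class passes to its generic superclass. */
    static Type[] superclassArguments(Class<?> aClass) {
        ParameterizedType superType = (ParameterizedType) aClass.getGenericSuperclass();
        return superType.getActualTypeArguments();
    }

    public static void main(String[] args) {
        for (Type argument : superclassArguments(ParametricClass2.class)) {
            String kind = (argument instanceof TypeVariable)
                ? "still a type parameter"
                : "concrete class";
            System.out.println(argument.getTypeName() + " -> " + kind);
        }
    }
}
```

The first three arguments are reported as concrete classes, while the last one is a type variable, matching the model where Z remains a ParameterType.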

methodParametric

Let’s see what we have here. First of all, we recognize a ParametricClass with one ParameterType. This class has two methods. One is a regular method and the second is a parametricMethod. The first one isn’t generic because when the class is concretized, the ParameterType T will become String, Integer, Animals… and it will be the same for the method. The parameter of the first method depends on the class and this is not the case for the second method. That is why the second method is generic, not the first one.
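
In Java source, the situation described above could look like the following sketch. The names Box, put, get and echo are hypothetical; only the structure (one parametric class, one regular method, one generic method) follows the description.

```java
class Box<T> {
    private T content;

    // Regular method: its parameter type T is fixed by the class.
    void put(T value) {
        content = value;
    }

    T get() {
        return content;
    }

    // Generic method: it declares its own type parameter U,
    // which does not depend on how the class is concretized.
    <U> U echo(U value) {
        return value;
    }
}

class GenericMethodDemo {
    public static void main(String[] args) {
        Box<String> box = new Box<>();
        box.put("hello");            // T is String here, decided by Box<String>
        Integer number = box.echo(42); // U is Integer, decided at the call site
        System.out.println(box.get() + " " + number); // prints hello 42
    }
}
```

Once the class is concretized as Box<String>, put only accepts a String, whereas echo keeps accepting any type, which is why only echo is a parametric method.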

public class ClassA<T>
public class ClassB extends ClassA<String>

This is how we can represent this in Pharo.

classAgen := FamixJavaParametricClass named:'ClassA'.
t := FamixJavaParameterType named:'T'.
classAgen addGenericParameter: t.
classAcon := FamixJavaParametricClass named:'ClassA'.
string := FamixJavaClass named:'String'.
classAcon addConcreteParameter: string.
FamixJavaConcretisation new concreteEntity: classAcon ; genericEntity: classAgen.
FamixJavaParameterConcretisation new concreteParameter: string ; genericParameter: t.
classB := FamixJavaClass named:'ClassB'.
FamixJavaInheritance new subclass: classB ; superclass: classAcon .

In this post, we have seen how generic types are modeled with VerveineJ and Moose for code analysis.