Blog

Some tools on FAST models

Mar 7, 2025

The package FAST-Core-Tools in repository https://github.com/moosetechnology/FAST offers some tools or algorithms that are running on FAST models.

These tools may be usable directly on a specific language FAST meta-model, or might require some adjustements by subtyping them. They are not out-of-the-shelf ready to use stuff, but they can provide good inspiration for whatever you need to do.

Dumping AST

Writing test for FAST can be pretty tedious because you have to build a FAST model in the test corresponding to your need. It often has a lot of nodes that you need to create in the right order with the right properties.

This is where FASTDumpVisitor can help by visiting an existing AST and “dump” it as a string. The goal is that executing this string in Pharo should recreate exactly the same AST.

Dumping an AST can also be useful to debug an AST and checking that it has the right properties.

To use it, you can just call FASTDumpVisitor visit: <yourAST> and print the result. For example:

FASTDumpVisitor visit:
  (FASTJavaUnaryExpression new
    operator: '-' ;
    expression:
      (FASTJavaIntegerLiteral new
        primitiveValue: '5'))

will return the string: FASTJavaUnaryExpression new expression:(FASTJavaIntegerLiteral new primitiveValue:'5');operator:'-' which, if evaluated, in Pharo will recreate the same AST as the original.

Note: Because FAST models are actually Famix models (Famix-AST), the tools works also for Famix models. But Famix entities typically have more properties and the result is not so nice:

FASTDumpVisitor visit:
  (FamixJavaMethod new
    name: 'toto' ;
    parameters: {
      FamixJavaParameter new name: 'x' .
      FamixJavaParameter new name: 'y'} ).

will return the string: FamixJavaMethod new parameters:{FamixJavaParameter new name:'x';isFinal:false;numberOfLinesOfCode:0;isStub:false.FamixJavaParameter new name:'y';isFinal:false;numberOfLinesOfCode:0;isStub:false};isStub:false;isClassSide:false;isFinal:false;numberOfLinesOfCode:-1;isSynchronized:false;numberOfConditionals:-1;isAbstract:false;cyclomaticComplexity:-1;name:'toto'.

Local symbol resolution in an AST

By definition an AST (Abstract Syntax Tree) is a tree (!). So the same variable can appear several time in an AST in different nodes (for example if the same variable is accessed several times).

The idea of the class FASTLocalResolverVisitor is to relate all uses of a symbol in the AST to the node where the symbol is defined. This is mostly useful for parameters and local variables inside a method, because the local resover only looks at the AST itself and we do not build ASTs for entire systems.

This local resolver will look at identifier appearing in an AST and try to link them all together when they correspond to the same entity. There is no complex computation in it. It just looks at names defined or used in the AST.

This is dependant on the programming language because the nodes using or defining a variable are not the same in all languages. For Java, there is FASTJavaLocalResolverVisitor, and for Fortran FASTFortranLocalResolverVisitor.

The tool brings an extra level of detail by managing scopes, so that if the same variable name is defined in different loops (for example), then each use of the name will be related to the correct definition.

The resolution process creates:

In declaration nodes (eg. FASTJavaVariableDeclarator or FASTJavaParameter),a property #localUses will list all referencing nodes for this variable;
In accessing nodes, (eg. FASTJavaVariableExpression), a property #localDeclarations will lists the declaration node corresponding this variable.
If the declaration node was not found a FASTNonLocalDeclaration is used as the declaration node.

Note: That this looks a bit like what Carrefour does (see /blog/2022-06-30-carrefour), because both will bind several FAST nodes to the same entity. But the process is very different:

Carrefour will bind a FAST node to a corresponding Famix node;
The local resolver binds FAST nodes together.

So Carrefour is not local, it look in the entire Famix model to find the entity that matches a FAST node. In Famix, there is only one Famix entity for one software entity and it “knows” all its uses (a FamixVariable has a list of FamixAccess-es). Each FAST declaration node will be related to the Famix entity (the FamixVariable) and the FAST use nodes will be related to the FamixAccess-es.

On the other hand, the local resolver is a much lighter tool. It only needs a FAST model to work on and will only bind FAST nodes between themselves in that FAST model.

Round-trip validation

For round-trip re-engineering, we need to import a program in a model, modify the model, and re-export it as a (modified) program. A lot can go wrong or be fogotten in all these steps and they are not trivial to validate.

First, unless much extra information is added to the AST, the re-export will not be syntactically equivalent: there are formatting issues, indentation, white spaces, blank lines, comments that could make the re-exported program very different (apparently) from the original one.

The class FASTDifferentialValidator helps checking that the round-trip implementation works well. It focuses on the meaning of the program independently of the formatting issues. The process is the follwing:

parse a set of (representative) programs
model them in FAST
re-export the programs
re-import the new programs, and
re-create a new model

Hopefully, the two models (2nd and last steps) should be equivalent This is what this tool checks.

Obviously the validation can easily be circumvented. Trivially, if we create an empty model the 1st time, re-export anything, and create an empty model the second time, then the 2 models are equivalent, yet we did not accomplish anything. This tool is an help for developers to pinpoint small mistakes in the process.

Note that even in the best of conditions, there can still be subtle differences between two equivalent ASTs. For example the AST for “a + b + c” will often differ from that of “a + (b + c)”.

The validator is intended to run on a set of source files and check that they are all parsed and re-exported correctly. It will report differences and will allow to fine tune the comparison or ignore some differences.

It goes through all the files in a directory and uses an importer, an exporter, and a comparator. The importer generates a FAST model from some source code (eg. JavaSmaCCProgramNodeImporterVisitor); the exporter generates source code from a model (eg. FASTJavaExportVisitor); the comparator is a companion class to the DifferentialValidator that handle the differences between the ASTs.

The basic implementation (FamixModelComparator) does a strict comparison (no differences allowed), but it has methods for accepting some differences:

#ast: node1 acceptableDifferenceTo: node2: If for some reason the difference in the nodes is acceptable, this method must return true and the comparison will restart from the parent of the two nodes as if they were the same.
#ast: node1 acceptableDifferenceTo: node2 property: aSymbol. This is for property comparison (eg. the name of an entity), it should return nil if the difference in value is not acceptable and a recovery block if it is acceptable. Instead of resuming from the parent of the nodes, the comparison will resume from an ancestor for which the recovery block evaluates to true.

A real example on using tags

Mar 5, 2025

Nicolas Anquetil

Moose expert

A real example on using tags

Tags can be a powerful tool to visualize things on legacy software and perform analyses. For example, tags can be used to create virtual entities and see how they “interact” with the real entities of the system analyzed. In the article Decomposing God Classes at Siemens we show how tags can be used to create virtual classes and see their dependencies to real classes.

In this post I will show another use of tags: how they can materialize a concept and show its instantiation in a system.

The scenario is that of analysing Corese, a platform to “create, manipulate, parse, serialize, query, reason and validate RDF data.” Corese is an old software that dates back to the early days of Java. Back then, enums did not exist in Java and a good way to implement them was to use a set of constants:

  public static final int MONDAY = 1;
  public static final int TUESDAY = 2;
  public static final int WEDNESDAY = 3;
  public static final int THURSDAY = 4;
  public static final int FRIDAY = 5;
  public static final int SATURDAY = 6;
  public static final int SUNDAY = 7;

Those were the days!

As an effort to restructure and rationalize implementation, the developers of Corese wish to replace these sets of constants by real Java enums. This is not something that can be done in any modern IDE even with the latest refactoring tool.

Let us see how Moose can help in the task.

Where are the constants used?

For an analysis in Moose, we need a model of the system, and this starts with getting the source code (https://github.com/corese-stack/corese-core). The model is created using VerveineJ which can be run using docker:

docker run -rm -v src/main/java/:/src ghcr.io/evref-bl/verveinej:latest -alllocals -o corese-core.json

This will create a file corese-core.json in the directory src/main/java/. The command to create the model as an option -alllocals. This is because VerveineJ by default only tracks the uses of variables with non primitive type (variables containing objects). Here the constants are integers and if we want to know where they are used, we need more details.

Let’s import the model in Moose. This can be done simply by dragging-and-dropping the file in Moose.

"Importing the Corese model"

We will study the use of the constants defined in fr.inria.corese.core.stats.IStats:

public interface IStats {

    public static final int NA = 0;
    public static final int SUBJECT = 1;
    public static final int PREDICATE = 2;
    public static final int OBJECT = 3;
    public static final int TRIPLE = 4;
[...]

To find where the constants are used, we need to find the representation of the constants in the model. For this, we can inspect the model (“Inspect” button in the Model Browser) and look for all “Model Attributes”. The constants are attributes of the interface/class in which they are defined as shown in the listing above). And they are model attributes because they are defined in the source code analysed, as opposed to System.out which may be used in the code but for which we don’t have the source code.

We can then select all the model attributes named PREDICATE: select: [ :each | each name = 'PREDICATE']. (note, the backslash (\) before the square bracket ([) was added by the publishing tool and is not part of the code)

Moose gives us 8 different definitions of PREDICATE (and 9 for OBJECT, and 10 for SUBJECT). The one we are interested in is the 3rd in the list (IStats.PREDICATE).

"All attributes named PREDICATE"

Having the same constants defined multiple times is not good news for the analysis and for the developers. But this kind of thing is fairly common in old systems which evolved during a long time in the hands of many developers. Not all of them had a complete understanding of the system and each had different skills and programming habits.

Looking at the lists of definitions for the 3 main constants (SUBJECT, PREDICATE, OBJECT), we find that there are at least 5 different definitions of these constants:

stats.IStats:

    public static final int NA = 0;
    public static final int SUBJECT = 1;
    public static final int PREDICATE = 2;
    public static final int OBJECT = 3;
    public static final int TRIPLE = 4;

kgram.sorter.core.Const:

    public static final int ALL = 0;
    public static final int SUBJECT = 1;
    public static final int PREDICATE = 2;
    public static final int OBJECT = 3;
    public static final int TRIPLE = 4;
    public static final int NA = -1;

compiler.result.XMLResult

    private static final int TRIPLE = 9;
    private static final int SUBJECT = 10;
    private static final int PREDICATE = 11;
    private static final int OBJECT = 12;

kgram.api.core.ExprType

       public static int TRIPLE    = 88;
       public static int SUBJECT  = 89;
       public static int PREDICATE = 90;
       public static int OBJECT    = 91;

kgram.core.Exp

    public static final int ANY        = -1;
    public static final int SUBJECT    = 0;
    public static final int OBJECT     = 1;
    public static final int PREDICATE  = 2;

So now we need to track the uses of all these constants in the system to understand how they can be replaced by one enum.

Note: Don’t close the Inspector window yet, we are going to need it soon.

Tagging the constants and their uses

Moose can help us here with tags. Tags are (as the name implies) just labels that can be attached to any entity in the model. Additionally, tags have a color that will help us distinguish them in visualizations.

So let’s tag our constants. We will define 5 tags, one for each set of constants, that is to say one for each of the 5 classes that implement these constants. You can choose whatever name and color you prefer for your tags, as long as you remember which is which. Here I named the tags from the name of the classes that define each set of constant.

"The tags that represent each set of constant"

Now we want to tag all the constants in a set with the same tag. Let’s see how to do it for constants in IStats, the ones listed in the previous section and that were our initial focus.

We select the “IStats” tag in the Tag Browser and go back to the Inspector where we have a list of all definitions of PREDICATE. If we click on the 3rd of these PREDICATE (“fr::inria::corese::core::stats::IStats.PREDICATE”), a new pane appears on the right, focusing on this attribute. There, we can click on its “parentType”, giving yet another pane. (The following screenshot shows the inspector right before we click on “parentType”).

"The inspector while navigating to the set of attributes of IStats" .

The right pane now focuses on the IStats Java interface. We can click on “attributes” to get the list of attributes it defines (including PREDICATE from which we started). There are 5 attributes which are the ones listed in the previous section.

So far so good.

To tag these attributes, we will “propagate” them (toolbar button of the Inspector on the right) to all tools that are in “Follow” mode. Note that if you minimized the Tag Browser at some point, it will be in “Freeze” mode like in the screenshot above. You need to put it back in “Follow” (radio toolbar button on the left) before propagating the list of constants.

Once propagated, the list appears in the center pane of the Tag Browser and you can pass it to the right pane with the ”>>>” button. Doing this will effectively tag the entities with the selected tag.

We now have tagged these 5 constants with the “IStats” tag. Ideally we want to find also the usage of these constants. So we would like to also tag the methods that use these constants.

For this you can open a Query Browser, it will start with the same list of 5 attributes that we just propagated. We can create a “Navigation query” and ask for all the “incoming” “accesses” to these attributes as shown below. The result is a list of 6 methods.

"The methods accessing the 5 attributes propagated"

We can now propagate these 6 methods and they will appear in the Tag Browser. We tag them with the same tag as the attributes themselves.

You can repeat the same operations for the 5 sets of constants listed above and the 5 different tags.

Visualizing the result

All this tagging was to be able to visualize where each set of constant is defined and, most importantly, used. We now turn to the “Architectural Map” which is a fine tool to visualize tags. for example, we could show all the top level packages of Corese and the Architectural Map will give visual clues on which ones contain tagged entities, and what tags. The Architectural Map allows to expand the content of entities which will allow us to deep dive into each package containing tagged entities to understand where exactly the entities is used or defined.

To select all the top level packages, we go back one last time to the Inspector to the very first pane on the left (you may also “Inspect” again the model to open a new Inspector). We select the “Model packages” and enter this query in the “script” at the bottom: self select: [ :each | each parentPackage isNotNil and: [each parentPackage name = 'core'] ]. (Again, ignore the backslashes)

The result is a list of 23 packages that we can propagate. Finally we open an Architectural Map that will start with the 23 packages that we just propagated.

In the following screenchot, I restricted the Architectural Map to the only 5 packages that do use our tags: “stats”, “kgram”, “util”, “sparql”, and “query”. This makes it easier to see the results here. I also expanded “kgram” that is small and contains different tags.

"The packages using the 5 attributes"

The single-color square, on the right of each package name, shows that it contains entities having one uniq tag (of this color). In our case it means that it contains the constants and methods accessing them, all with the same tag. For example, “core” and “util” packages contain entities tagged with only the green tag (which corresponds to the kgram.core.Exp class as previously shown in the Tag Browser screenshot).

When the square is multicolored, it means it contains entities with different tags. For example, we see that the package “kgram” contains at least the green (“Exp”) and the yellow (“Const”) tags.

Note that in this particular case, I added another tag for class kgram.api.core.Node which has its own definition of the OBJECT constant. I wanted to see where it was used also. This is the reason for the multicolored square of class StatsBasedEstimation, in package “stats”, which uses OBJECT from Node and the other constants from IStats.

In the end, the visualization allows to conclude that each package sticks pretty much to its own definition of the constants which is rather reassuring. It also shows where one would have to look if we were to replace the constant by a real enum.

This is not the end of it however because the constant values used in these methods can be passed off to other methods as argument. Here Famix alone (the meta-model used in Moose by default) can no longer help us to follow the flow of usage of the constants because they are just integer being passed around. For a finer analysis, a complete AST model should be used. This could be done with the FAST meta-model (Famix-AST), but it is another story that falls outside the scope of this blog-post.

See you later.

Generating a visitor infrastructure for a given meta-model

Feb 26, 2025

Nicolas Anquetil

Moose expert

This post is part of a serie dedicated to Famix Tools

Once we have a model of a program in Famix, we often find ourselves wanting to ¨ go through it” to apply some systematic analysis or modification. For example one could want to export the model as source-code https://github.com/moosetechnology/FAMIX2Java.

The Visitor design pattern is well adapted for these tasks. Note that the double-dispatch mechanism, which is an integral part of the visitor pattern, may also be useful to have entity specific actions even if one does not want to visit the entire model.

The Visitor pattern requires:

an accept: aVisitor method in every entity of the meta-model (eg.: in FamixJavaClass, FamixJavaMethod, FamixJavaAttribute,…)
visitFamixXYZ: aFamixXYZ for all entites to be visited, in the visitor class
the accept: methods invoke the correct visitFamixXYZ: method of the visitor depending on the class it is implemented in
the visitFamixXYZ: by default recursively visits all the “children” of the entity being visited (eg.: visitFamixJavaClass: should trigger the visit of the attributes and methods of the class)

For large meta-models, this can be cumbersome to implement as there may be many kinds of entities (43 for FamixJava) and the work is very repetitive. The tool FamixVisitorCodeGenerator can do all the work for you, automatically.

FamixVisitorCodeGenerator

Taking advantage of the meta-description of the Famix entities and the reflective nature of Pharo, it generates the accept: and visitFamixXYZ: for all entities of a meta-model.

Usage exmaple:

FamixVisitorCodeGenerator new
  package: 'Famix-Java-Entities' visitorClass: FamixJavaVisitor .

or for a FAST meta-model:

FamixVisitorCodeGenerator new
  package: 'FAST-Java-Entities' visitorClass: FASTJavaVisitor.

The tool needs an empty visitor class (or trait) created by the user (FamixJavaVisitor in the example above), and a Pharo package containing the Famix classes of a meta-model (Famix-Java-Entities in the example above). From this it will:

create an accept: method in all the classes in the given Famix package;
the accept: methods are created as extensions made by the package of the visitor;
the accept: methods invoke the correct visitFamixXYZ: depending on the class they are implemented in.
a setter method allows to skip this part: generateAccepts: false
the visitFamixXYZ: methods are created in the visitor class (or trait) for a “maximal visit” (see below).

Visiting methods

For a friendlier visitor, it is convenient that the visitor methods reproduce the inheritance hierarchy of the entities. For example, if FamixJavaAttribute and FamixJavaParameter both inherit from FamixJavaVariable entity, it is convenient that visitFamixJavaAttribute: and visitFamixJavaParameter: both call visitFamixJavaVariable: so that a common behaviour for the visit can be implemented only once.

Since Famix heavily relies on traits to compose meta-models, the same idea applies to used traits. For example many entities implements FamixTSourceEntity to get a sourceAnchor. For all these entities, and if we need a generic behavior based on the source anchor, it is convenient that all the visitXYZ: methods call visitTSourceEntity: where the common behavior can be implemented.

Finally it might be convenient that the visitXYZ: methods, visit all the entites related to the one we are visiting. For example when visiting a FamixJavaVariable, it might be interesting to recursively visit its declaredType.

All these conditions defines what we call a “maxium” visit of the meta-model.

”Maximum” Visit

As described above, the maximum visitXYZ: methods will trigger a resursive visit of many things (super-class visit method, used traits visit method, all entities related to the one being visited).

One down side of this is that a maximum visit is not a viable one in Famix because all relationships are bi-directionnal, causing infinite loops in the visit. For example, if in a FamixJavaVariable we recursively visit its declaredType, then in the FamixJavaTType we will recursively visit the typedEntities including the one we just came from.

There could be various solution to this problem:

implement a memory mechanism in the visitor to remember what entities were already visited and do not go back to them.
do not visit all relationships of an entity, but only its “children” relationship
let the user handle the problem

For now the tool rely on the third solution. If an actual visitor inherits (or use) the generated visitor, it must take care or redefining the visit methods so that it does not enter in an infinite loop (implementing any of the two other solutions)

In this sense, the maximum visit methods can be viewed as cheat sheets, showing all the things that one could do when visiting an entity.

In the future we might implement the second solution (visit only children) as an option of the visitor. For now it is important to remember that the generated visitor cannot be used as is. Methods must be redefined to get a viable visitor.

class vs. trait visitor

The natural action would be to define the visitor as aclass from which all required actual visitors will inherit. However, because visitors are so handy to go through the entire model, we discovered that we needed a lot of them (eg. visitors to create a copy of a model, to re-export a model) and they sometimes need to inherit from some other class.

As such, we rather recommend to create the maximal visitor as a trait that can be used by all actual visitors. This make no difference for the FamixVisitorCodeGenerator and it might prove very useful.

Transformation journey (2/2) : Copying FASTs and creating nodes

Apr 15, 2024

Romain Degrave

Master degree intern in software analysis

Welcome back to the little series of blog posts surrounding code transformation! In the previous post, we started to build a transformation tool using a basic fictional transformation use case on the software ArgoUML. In this post, we will add behavior to the class we created previously, to achieve the actual transformation while making sure that we do not modify the original model in the process. As a reminder, this is the scenario: in the ArgoUML system, three classes define and use a method named logError. A class ArgoLogger has been defined that contains a static method logError. The transformation task is to add a receiver node to each logError method invocation so that the method called is now ArgoLogger.logError.

Preface

As this blog post follows what was discussed and built in the first one of this series, a lot of information (how to build the image and model used for our use case, but also the use case itself) is provided in the previous post. If you haven’t read it, or if you forgot the content of the post, it is recommended to go back to it, as it will not be repeated in this post.

Tools to import

For this blog post, we will have to import a tool library, FAST-Java-Tools, which contains several tools to use on FAST for Java entities. It also contains practical tools to build transformation tools! To load the project, execute this command in a Playground :

Metacello new
  baseline: 'FASTJavaTools';
  repository: 'github://moosetechnology/FAST-Java-Tools:main/src';
  load

Loading it will automatically load the FAST-Java repository. Carrefour and MoTion (which were used in the previous post) are still necessary, so if you are starting from a clean image, you will need them as well.

Copying the (F)ASTs to transform

With our import done, we are ready to get to work! 😄 To create our transformation, we will modify the FAST of each candidate method for our transformation, and this modified FAST will then be generated into source code that we will be able to inject into the actual source files. Therefore, our next step is modifying the FAST of our methods. However, we will begin by making a copy of said FAST, to ensure that our transformation will not modify the actual original model of our method. If that happens, we will still be able to run Carrefour on our method again (using the message generateFastAndBind) but it is better to avoid that scenario. If we have a bug on our tool and the FAST of many methods ends up being modified, then re-calculating and binding every FAST will be a big waste of time. Making a copy allows us to make the same actions, but ensures that our model remains untouched.

To create a copy, we have a visitor that fortunately does all the job for us! Using this simple command will give you a FAST copy :

FASTJavaCopyVisitor new copy: aFASTMethod

However, to make things even easier, we will use an already made wrapper that will contain our candidate method for transformation in every useful “format” for transformation, which are :

The Famix entity of that method
The original FAST
The copied (transformed) FAST

{::comment} This wrapper also has collections to store specific nodes, but patience, we will see how to use these in the next blog post! 😉 {:/comment}

For now, let’s add a method to create that wrapper on a collection of candidate methods, in the class created in the first post :

createWrappersForCandidateMethods: aCollectionOfFamixJavaMethods

  ^ aCollectionOfFamixJavaMethods collect: [ :fmxMeth |
    | fastMethod |
    fmxMeth generateFastIfNotDoneAndBind.
    fastMethod := fmxMeth fast.
    JavaMethodToTransformWrapper
      forFamixMethod: fmxMeth
      andFAST: fastMethod ]

With this, we created wrappers for each method to transform, and each of these wrappers contains a copy of the FAST ready to be transformed. Before getting to the next step, we will define one more method, which will contain the “main behavior” of our tool, so that calling this method is enough to follow each step in order to transform the code.

findAndTransformAllLogErrorInvocations

  | wrappers |
  wrappers := self createWrappersForCandidateMethods:
    self fetchLogErrorMethodInvocationsSenders.

  wrappers do: [ :w | self transformMethod: w ].

If you followed everything so far, you can notice that there are two new methods called here. The first one, fetchLogErrorMethodInvocationsSenders, is just here to make the code easier to read :

fetchLogErrorMethodInvocationsSenders

  ^ self fetchLogErrorMethodInvocations
    collect: [ :mi | mi sender ]

The other one, transformMethod:, is the next step in this transformation journey, creating a new receiver node for our logError invocations and applying it on said invocations.

Create a new node

Therefore, the first thing to do is pretty obvious. We need to build a receiver node, for the ArgoLogger class. This transformation use case is pretty easy, so the node to create is always the same and is a pretty basic node.

We can in this case simply build this node “manually” in a method :

createNewReceiverNode

  ^ FASTJavaIdentifier new
    name: 'ArgoLogger';
    yourself

However, this option is not possible for every use case. Sometimes the node to create will be dependent on the context (the name of an identifier, method or variable might change), or we might have to create more complex nodes, composed of several nodes. In order to create those more complex nodes, one option is to parse the code using the parsing tool from the FAST-Java repository, the JavaSmaCCProgramNodeImporterVisitor, before inspecting the parsed result using the FASTDump view.

"Inspecting with FASTDump"

Using this view of the inspector then enable us to get the node or tree that we seek for our transformation and copy-paste the code to create it in a method for our tool to use it when needed.

Adding the new receiver node to (F)ASTs

Now that we have created our node and saw different means to do so, all that remains to do is setting it as the receiver of every copied FAST method to complete our transformation! Let us do so right away, by finally implementing our transformMethod: from before:

transformMethod: aJavaMethodWrapper

  | methodInvocationNode |
  methodInvocationNode := self motionQueryForFastMethod:
    aJavaMethodWrapper transformedFastMethod.

  methodInvocationNode receiver: self createNewReceiverNode

There are two things worth saying after making this method. First, and hopefully without boring you by repeating this over and over… This is a very simple transformation use case! As you can see, the only thing we do is using the receiver: setter on the appropriate node. Depending on the kind of edit that you want to apply on a method, you might need to experiment a bit and read the comments and method list of the classes of the nodes you are trying to edit. To experiment, one way is inspecting one of the FAST method that you wish to transform, to find the nodes you need to locate and transform, then go and read the class documentation and method list of the types of those nodes to find the appropriate setters you need to use. Another fun way to do so involves the Playground and using once again the parsing tool used in the previous section to use the FASTDump view of the Moose Inspector.

"Parsing a method"

Using this little code snippet, you can parse a Java method and inspect its FAST. So, copy the method you need to transform, inspect it, then transform the code of this method manually, then parse it and inspect it again! It’s an easy way to start the work towards building a transformation tool. 😄

The second thing to notice, is the use of MoTion and not Carrefour. As we now modify a copy of a FAST, it is easier to run the MoTion query on it to get the node we are looking for right away, rather than go through the original Famix invocation. However, it is still doable, as the original and copied FAST nodes both store their counterpart in an attribute slot. We can see this by inspecting one of the fast methods in a wrapper resulting from our transformation (look at the inspected slots and mooseID of each object).

"Inspecting an original and copy"

Try it yourself!

We are now done with the second part of our three blog post journey into code transformation! Our class is now able to create wrappers for each candidate method containing a copy of the FAST to transform, creating the new nodes to insert and finally make the edit on the copied FAST.

Just like with the first post, feel free to now try the tool and methods that we created here in a Playground and experiment with the results, or the methods themselves!

t := LoggerTransformationTool onModel: MooseModel root first.
t findAndTransformAllLogErrorInvocations

"Testing class"

The whole source code that was written on this blog post (and the previous one) is also available on that repository.

Conclusion

With the help of some tools and wrappers for FAST-Java, we managed to copy the FASTs of our candidate methods for our small transformation use case, before creating the necessary node for the transformation. In the next and final blog post on code transformation, we will see how to view and edit our transformation before confirming its content, and applying it to the source files. Until next time! 😄

Transformation journey (1/2) : Locating entities and nodes

Apr 1, 2024

Romain Degrave

Master degree intern in software analysis

Sometimes we have to perform several similar edits on our source code. This can happen in order to fix a recurring bug, to introduce a new design pattern or to change the architecture of a portion of a software. When many entities are concerned or when the edits to perform are complicated and take too much time, it can be interesting to consider building a transformation tool to help us and make that task easier. Fortunately, Moose is here to help with several modeling levels and powerful tools enabling us to build what we need!

Preface

This little transformation journey will be divided into two blog posts. We will see how to build a simple transformation tool, with each post focusing on a different aspect:

First post : Locating entities and nodes to transform
Second post : Creating AST copies and AST nodes to make a transformation {::comment}
Final post : Viewing and editing our transformation, and applying it to the source files {:/comment}

Throughout those two posts, we will follow a simple transformation scenario, based on the software ArgoUML, an open-source Java project used in this wiki. The first step is to create the model for the software, using the sources and libraries available on that wiki post, but creating the model on the latest stable version of VerveineJ.

Using the available Docker image, this command (on Windows) will create the model and store it in the sources repository :

docker run -v .\src:/src -v .\libs:/dependency badetitou/verveinej -format json -o argouml.json -anchor assoc .

All that remains to do is to create a fresh image, and import this model (with the sources repository used to build the model as root folder) to start making our tool.

Note : As the creation of that tool is divided in three blog posts, keeping that image for all three blog posts is recommended.

The scenario

The transformation case we will be dealing with in those blog posts is rather simple. In the ArgoUML system, three classes define and use a method named logError. In our scenario, a class ArgoLogger has been defined and contains a static method logError. The transformation task is to add a receiver node to each logError method invocation so that the method is called using the right class.

Tools to import

For this blog post, we will have to import two tools :

The first one is Carrefour, allowing us to bind and access the (F)AST model of an entity to its Famix counterpart. Loading it will also load the FAST Java metamodel. To load the project, execute this command in a Playground :

Metacello new
  githubUser: 'moosetechnology' project: 'Carrefour' commitish: 'v5' path: 'src';
  baseline: 'Carrefour';
  load

Second, we will use MoTion, an object pattern matcher that will allow us to easily explore the FAST of our methods and find the specific nodes we are looking for. To load the project, execute this command in a Playground :

Metacello new
    baseline: 'MoTion';
    repository: 'github://AlessHosry/MoTion:main';
    load: 'MoTion-Moose'

Querying the model

Finally done with explanations and setup! 😄 Let us start by creating a class with a model instance variable, accessors, and add a class side initializer method for ease of use:

"Creating our class"

onModel: aMooseModel

  ^ self new
      model: aMooseModel;
      yourself

This class will contain our entire code manipulation logic. It will be pretty short, but of course when working on more important transformations dividing the logic of our tool will help to make it more understandable and maintainable.

In any case, we will start with a basic Famix query, in order to find all the implementation of logError methods, which will later allow us to easily find their invocations :

fetchLogErrorMethods

  ^ model allModelMethods select: [ :m | m name = 'logError' ]

And then another query using this result, to get all invocations (the exact entities we seek to transform) :

fetchLogErrorMethodInvocations

  ^ self fetchLogErrorMethods flatCollect: [ :m |
      m incomingInvocations ]

Using MooseQuery, you should be able to find any Famix entities you are seeking. From those Famix entities, we want to get the FAST nodes that we need to transform. We will look at two different methods to do so.

Using Carrefour

In this context, Carrefour is the perfect tool to use to find the nodes we want to transform in the FAST of our methods. Now that we found the entities that we have to transform in the Famix model, all that remains is building and binding the FAST node of every entity within our method, and then fetch the ones from our method invocations.

To do so, we will add two methods to our class. First, a method to fetch the FAST node matching a given method invocation :

fetchFastNodeForFamixInvocation: anInvocation

  "building and binding the FAST of the invocating method"
  anInvocation sender generateFastIfNotDoneAndBind.

  "returning the actual node of the method invocation, our target"
  ^ anInvocation fast

And finally, a method that returns a list with every node we have to transform :

fetchAllFastNodesUsingCarrefour

  ^ self fetchLogErrorMethodInvocations collect: [ :mi |
      self fetchFastNodeForFamixInvocation: mi ]

And just like that, we now have the complete list of FAST nodes to transform!

Using MoTion

But before celebrating, we should keep in mind that this transformation is a very simple use case. Here, it is easy to find the entities to transform using Famix, but in some other cases it might be much more complex to find the methods that are candidates to a transformation, not to mention every node that must be transformed.

In those cases, a good way to make things easier is to divide the logic of this process, and use separate means to find the methods that are candidates to a transformation and to find the nodes that must be transformed.

Making queries on the Famix model remains a very reliable way to find the candidates methods, but then what about the nodes inside the AST of these methods? Methods can be quite complex (50 lines of code, 100, more…) and the resulting AST is huge. Finding the right node(s) in such AST is difficult. That’s where MoTion comes in. In order to find the nodes we are looking for, we can define patterns that describe those nodes, and the path to follow through the FAST to be able to reach those nodes.

MoTion is a powerful tool, allowing us to find specific items within a graph through concise patterns describing the objects we are looking for and the path in the graph used to reach them. However, it does have a very specific syntax that must be looked through before starting making our own patterns. Thankfully, everything is well documented and with examples (one of those being another example for FAST Java) on the repository of MoTion (look at the README!).

But enough description. Time to code! 😄

motionQueryForFastMethod: aFASTJavaMethodEntity

  | query |
  query := FASTJavaMethodEntity "type of the root node"
           % { (#'children*' <=> FASTJavaMethodInvocation "looking through all childrens  (with *)"
                                       "until we find method invocation nodes"
              % { (#name <=> 'logError') } as: #logErrorInvocation) }
                                       "if their name is logError,"
                                       "we save them to the given key"

             collectBindings: { 'logErrorInvocation' }   "at the end, we want all found invocations"
             for: aFASTJavaMethodEntity. "and this the root entity for our search"

  "the result of the query is a list of dictionaries, with each result in a dictionary"
  "we only have one call to logError per method, so we can do a simple access"
  ^ query first at: 'logErrorInvocation'

And without commentaries, to have a clearer view on how the pattern looks :

motionQueryForFastMethod: aFASTJavaMethodEntity

  | query |
  query := FASTJavaMethodEntity
           % { (#'children*' <=> FASTJavaMethodInvocation
              % { (#name <=> 'logError') } as: #logErrorInvocation) }
             collectBindings: { 'logErrorInvocation' }
             for: aFASTJavaMethodEntity.

  ^ query first at: 'logErrorInvocation'

Now, to complete the use of our pattern, let’s make a final method that will fetch every node we need to transform :

fetchAllFastNodesUsingMotion

  ^ self fetchLogErrorMethodInvocations collect: [ :mi |
      mi sender generateFastIfNotDoneAndBind.
      self motionQueryForFastMethod: mi sender fast ]

As you can see, we still use Carrefour even in this context, as it remains the easiest way to get the FAST of our method before looking through it using MoTion. Those two tools can therefore be used together when dealing with complex transformation cases.

Try it yourself!

Now that our class is done, we are able to locate candidates methods for transformation and the specific nodes to transform using Famix, FAST, Carrefour and MoTion. You can use a Playground to test out our class and model and see for yourself the results of each method :

t := LoggerTransformationTool onModel: (MooseModel root at: 1).
t fetchAllFastNodesUsingMotion

"Testing our class"

The whole source code that was written on this blog post is also available on that repository.

Conclusion

Using Famix, FAST, Carrefour and MoTion, we are able to search and locate methods and nodes candidates for a given transformation test case. This first step is primordial to build a fully completed transformation tool. In the next blog posts, we will see how to create AST copies and AST nodes to use in a transformation, and finally how to view and edit our transformation before applying it to the source files.