Skip to content

story

5 posts with the tag “story”

A real example on using tags

Tags can be a powerful tool to visualize things on legacy software and perform analyses. For example, tags can be used to create virtual entities and see how they “interact” with the real entities of the system analyzed. In the article Decomposing God Classes at Siemens we show how tags can be used to create virtual classes and see their dependencies to real classes.

In this post I will show another use of tags: how they can materialize a concept and show its instantiation in a system.

The scenario is that of analysing Corese, a platform to “create, manipulate, parse, serialize, query, reason and validate RDF data.” Corese is an old software that dates back to the early days of Java. Back then, enums did not exist in Java and a good way to implement them was to use a set of constants:

public static final int MONDAY = 1;
public static final int TUESDAY = 2;
public static final int WEDNESDAY = 3;
public static final int THURSDAY = 4;
public static final int FRIDAY = 5;
public static final int SATURDAY = 6;
public static final int SUNDAY = 7;

Those were the days!

As an effort to restructure and rationalize implementation, the developers of Corese wish to replace these sets of constants by real Java enums. This is not something that can be done in any modern IDE even with the latest refactoring tool.

Let us see how Moose can help in the task.

For an analysis in Moose, we need a model of the system, and this starts with getting the source code (https://github.com/corese-stack/corese-core). The model is created using VerveineJ which can be run using docker:

docker run -rm -v src/main/java/:/src ghcr.io/evref-bl/verveinej:latest -alllocals -o corese-core.json

This will create a file corese-core.json in the directory src/main/java/. The command to create the model as an option -alllocals. This is because VerveineJ by default only tracks the uses of variables with non primitive type (variables containing objects). Here the constants are integers and if we want to know where they are used, we need more details.

Let’s import the model in Moose. This can be done simply by dragging-and-dropping the file in Moose.

"Importing the Corese model"

We will study the use of the constants defined in fr.inria.corese.core.stats.IStats:

public interface IStats {
public static final int NA = 0;
public static final int SUBJECT = 1;
public static final int PREDICATE = 2;
public static final int OBJECT = 3;
public static final int TRIPLE = 4;
[...]

To find where the constants are used, we need to find the representation of the constants in the model. For this, we can inspect the model (“Inspect” button in the Model Browser) and look for all “Model Attributes”. The constants are attributes of the interface/class in which they are defined as shown in the listing above). And they are model attributes because they are defined in the source code analysed, as opposed to System.out which may be used in the code but for which we don’t have the source code.

We can then select all the model attributes named PREDICATE: select: [ :each | each name = 'PREDICATE']. (note, the backslash (\) before the square bracket ([) was added by the publishing tool and is not part of the code)

Moose gives us 8 different definitions of PREDICATE (and 9 for OBJECT, and 10 for SUBJECT). The one we are interested in is the 3rd in the list (IStats.PREDICATE).

"All attributes named PREDICATE"

Having the same constants defined multiple times is not good news for the analysis and for the developers. But this kind of thing is fairly common in old systems which evolved during a long time in the hands of many developers. Not all of them had a complete understanding of the system and each had different skills and programming habits.

Looking at the lists of definitions for the 3 main constants (SUBJECT, PREDICATE, OBJECT), we find that there are at least 5 different definitions of these constants:

  • stats.IStats:
public static final int NA = 0;
public static final int SUBJECT = 1;
public static final int PREDICATE = 2;
public static final int OBJECT = 3;
public static final int TRIPLE = 4;
  • kgram.sorter.core.Const:
public static final int ALL = 0;
public static final int SUBJECT = 1;
public static final int PREDICATE = 2;
public static final int OBJECT = 3;
public static final int TRIPLE = 4;
public static final int NA = -1;
  • compiler.result.XMLResult
private static final int TRIPLE = 9;
private static final int SUBJECT = 10;
private static final int PREDICATE = 11;
private static final int OBJECT = 12;
  • kgram.api.core.ExprType
public static int TRIPLE = 88;
public static int SUBJECT = 89;
public static int PREDICATE = 90;
public static int OBJECT = 91;
  • kgram.core.Exp
public static final int ANY = -1;
public static final int SUBJECT = 0;
public static final int OBJECT = 1;
public static final int PREDICATE = 2;

So now we need to track the uses of all these constants in the system to understand how they can be replaced by one enum.

Note: Don’t close the Inspector window yet, we are going to need it soon.

Moose can help us here with tags. Tags are (as the name implies) just labels that can be attached to any entity in the model. Additionally, tags have a color that will help us distinguish them in visualizations.

So let’s tag our constants. We will define 5 tags, one for each set of constants, that is to say one for each of the 5 classes that implement these constants. You can choose whatever name and color you prefer for your tags, as long as you remember which is which. Here I named the tags from the name of the classes that define each set of constant.

"The tags that represent each set of constant"

Now we want to tag all the constants in a set with the same tag. Let’s see how to do it for constants in IStats, the ones listed in the previous section and that were our initial focus.

We select the “IStats” tag in the Tag Browser and go back to the Inspector where we have a list of all definitions of PREDICATE. If we click on the 3rd of these PREDICATE (“fr::inria::corese::core::stats::IStats.PREDICATE”), a new pane appears on the right, focusing on this attribute. There, we can click on its “parentType”, giving yet another pane. (The following screenshot shows the inspector right before we click on “parentType”).

"The inspector while navigating to the set of attributes of IStats".

The right pane now focuses on the IStats Java interface. We can click on “attributes” to get the list of attributes it defines (including PREDICATE from which we started). There are 5 attributes which are the ones listed in the previous section.

So far so good.

To tag these attributes, we will “propagate” them (toolbar button of the Inspector on the right) to all tools that are in “Follow” mode. Note that if you minimized the Tag Browser at some point, it will be in “Freeze” mode like in the screenshot above. You need to put it back in “Follow” (radio toolbar button on the left) before propagating the list of constants.

Once propagated, the list appears in the center pane of the Tag Browser and you can pass it to the right pane with the ”>>>” button. Doing this will effectively tag the entities with the selected tag.

We now have tagged these 5 constants with the “IStats” tag. Ideally we want to find also the usage of these constants. So we would like to also tag the methods that use these constants.

For this you can open a Query Browser, it will start with the same list of 5 attributes that we just propagated. We can create a “Navigation query” and ask for all the “incoming” “accesses” to these attributes as shown below. The result is a list of 6 methods.

"The methods accessing the 5 attributes propagated"

We can now propagate these 6 methods and they will appear in the Tag Browser. We tag them with the same tag as the attributes themselves.

You can repeat the same operations for the 5 sets of constants listed above and the 5 different tags.

All this tagging was to be able to visualize where each set of constant is defined and, most importantly, used. We now turn to the “Architectural Map” which is a fine tool to visualize tags. for example, we could show all the top level packages of Corese and the Architectural Map will give visual clues on which ones contain tagged entities, and what tags. The Architectural Map allows to expand the content of entities which will allow us to deep dive into each package containing tagged entities to understand where exactly the entities is used or defined.

To select all the top level packages, we go back one last time to the Inspector to the very first pane on the left (you may also “Inspect” again the model to open a new Inspector). We select the “Model packages” and enter this query in the “script” at the bottom: self select: [ :each | each parentPackage isNotNil and: [each parentPackage name = 'core'] ]. (Again, ignore the backslashes)

The result is a list of 23 packages that we can propagate. Finally we open an Architectural Map that will start with the 23 packages that we just propagated.

In the following screenchot, I restricted the Architectural Map to the only 5 packages that do use our tags: “stats”, “kgram”, “util”, “sparql”, and “query”. This makes it easier to see the results here. I also expanded “kgram” that is small and contains different tags.

"The packages using the 5 attributes"

The single-color square, on the right of each package name, shows that it contains entities having one uniq tag (of this color). In our case it means that it contains the constants and methods accessing them, all with the same tag. For example, “core” and “util” packages contain entities tagged with only the green tag (which corresponds to the kgram.core.Exp class as previously shown in the Tag Browser screenshot).

When the square is multicolored, it means it contains entities with different tags. For example, we see that the package “kgram” contains at least the green (“Exp”) and the yellow (“Const”) tags.

Note that in this particular case, I added another tag for class kgram.api.core.Node which has its own definition of the OBJECT constant. I wanted to see where it was used also. This is the reason for the multicolored square of class StatsBasedEstimation, in package “stats”, which uses OBJECT from Node and the other constants from IStats.

In the end, the visualization allows to conclude that each package sticks pretty much to its own definition of the constants which is rather reassuring. It also shows where one would have to look if we were to replace the constant by a real enum.

This is not the end of it however because the constant values used in these methods can be passed off to other methods as argument. Here Famix alone (the meta-model used in Moose by default) can no longer help us to follow the flow of usage of the constants because they are just integer being passed around. For a finer analysis, a complete AST model should be used. This could be done with the FAST meta-model (Famix-AST), but it is another story that falls outside the scope of this blog-post.

See you later.

Migrating internationalization files

During my Ph.D. migration project, I considered the migration of several GUI aspects:

  • visual
  • behavioral
  • business

These elements are the main ones. When perfectly considered, you can migrate the front-end of any application. But, we are missing some other stuff 😄 For example, how do you migrate i18N files?

In this post, I’ll present how to build a simple migration tool to migrate I18N files from .properties (used by Java) to .json format (used by Angular).

First, let’s see our source and target.

As a source, I have several .properties files, including I18N for a Java project. Each file has a set of key/value and comments. For example, the EditerMessages_fr.properties is as follow:

##########
# Page : Edit
##########
pageTitle=Editer
classerDemande=Demande
classerDiffusion=Diffusion
classerPar=Classer Par

And it’s Arabic version EditerMessages_ar.properties

#########
# Page : Editer
#########
pageTitle=تحرير
classerDemande=طلب
classerDiffusion=بث
classerPar=تصنيف حسب

As a target, I need only one JSON file per language. Thus, the file for the french translation looks like this:

{
"EditerMessages" : {
"classerDemande" : "Demande",
"classerDiffusion" : "Diffusion",
"classerPar" : "Classer Par",
"pageTitle" : "Editer"
}
}

And the Arabic version:

{
"EditerMessages" : {
"classerDemande" : "طلب",
"classerDiffusion" : "بث",
"classerPar" : "تصنيف حسب",
"pageTitle" : "تحرير"
},
}

To perform the transformation from the .properties file to json, we will use MDE. The approach is divided into three main steps:

  1. Designing a meta-model representing internationalization
  2. Creating an importer of properties files
  3. Creating a JSON exporter

I18N files are simple. They consist of a set of key/values. Each value is associated with a language. And each file can be associated with a namespace.

For example, in the introduction example, the namespace of all entries is “EditerMessages”.

I designed a meta-model to represent all those concepts:

meta-model

Once the meta-model is designed, we must create an importer that takes .properties files as input and produces a model.

To produce a model, I first look for a .properties parser without much success. Thus, I decided to create my own parser. Given a correctly formatted file, the parser provides me the I18N entries. Then, by iterating on this collection, I build an I18N model.

To implement the parser, I used the PetitParser2 project. This project aims to ease the creation of new parsers.

First, I downloaded the last version of Moose, and I installed PetitParser using the command provided in the repository Readme:

Metacello new
baseline: 'PetitParser2';
repository: 'github://kursjan/petitparser2';
load.

In my Moose Image, I created a new parser. To do so, I extended the PP2CompositeNode class.

PP2CompositeNode << #CS18NPropertiesParser
slots: { };
package: 'Casino-18N-Model-PropertyImporter'

Then, I defined the parsing rules. Using PetitParser2, each rule corresponds to a method.

First, start is the entry point.

start
^ pairs end

pairs parses the entries of the .properties files.

pairs
^ comment optional starLazy, pair , ((newline / comment) star , pair ==> [ :token | token second ]) star , (newline/comment) star ==> [ :token |
((OrderedCollection with: token second)
addAll: token third;
yourself) asArray ]

The first part of this method (before ==>) corresponds to the rule parsed. The second part (after ==>), to the production.

The first part tries to parse one or several comment. Then, it parses one pair followed by a list of comment, newline, and pair.

This parser is clearly not perfect and would require some improvement. Nevertheless, it does work for our context.

The second part produces a collection (i.e. a list) of the pair.

Now that we can parse one file, we can build a I18N model. To do so, we will first parse every .properties file. For each file, we extract the language and the namespace based on the file name. Thus, EditerMessages_fr.properties is the file for the fr language and the EditerMessages namespace. Then, for each file entry, we instantiate an entry in our model inside the namespace and with the correct language attached.

importString: aString
(parser parse: aString) do: [ :keyValue |
(self model allWithType: CS18NEntry) asOrderedCollection
detect: [ :entry |
"search for existing key in the file"
entry key name = keyValue key ]
ifOne: [ :entry |
"If an entry already exists (in another language for instance)"
entry addValue: ((self createInModel: CS18NValue)
name: keyValue value;
language: currentLanguage;
yourself) ]
ifNone: [
"If no entry exist"
(self createInModel: CS18NEntry)
namespace: currentNamespace;
key: ((self createInModel: CS18NKey)
name: keyValue key;
yourself);
addValue: ((self createInModel: CS18NValue)
name: keyValue value;
language: currentLanguage;
yourself);
yourself ] ]

After performing the import, we get a model with, for each namespace, several entries. Each entry has a key and several values. Each value is attached to the language.

To perform the JSON export, I used the NeoJSON project. NeoJSON allows one to create a custom encoder.

For the export, we first select a language. Then, we build a dictionary with all the namespaces:

rootDic := Dictionary new.
(model allWithType: CS18NNamespace)
select: [ :namespace | namespace namespace isNil ]
thenDo: [ :namespace | rootDic at: namespace name put: namespace ].

To export a namespace (i.e., a CS18NNamespace), I define a custom encoder:

writter for: CS18NNamespace customDo: [ :mapper |
mapper encoder: [ :namespace | (self constructNamespace: namespace) asDictionary
]
].
constructNamespace: aNamespace
| dic |
dic := Dictionary new.
aNamespace containables do: [ :containable |
(containable isKindOf: CS18NNamespace)
ifTrue: [ dic at: containable name put: (self constructNamespace: containable) ]
ifFalse: [ "should be an CS18NEntry"
dic at: containable key name put: (containable values detect: [ :value | value language = language ] ifOne: [ :value | value name ] ifNone: [ '' ]) ] ].
^ dic

The custom encoder consists on converting a Namespace into a dictionary of entries with the entries keys and their values in the selected language.

Once my importer and exporter are designed, I can perform the migration. To do so, I use a little script. It creates a model of I18N, imports several .properties file entries in the model, and exports the Arabic entries in a JSON file.

"Create a model"
i18nModel := CS18NModel new.
"Create an importer"
importer := CS18NPropertiesImporter new.
importer model: i18nModel.
"Import all entries from the <myProject> folder"
('D:\dev\myProject\' asFileReference allChildrenMatching: '*.properties') do: [ :fileRef |
self record: fileRef absolutePath basename.
importer importFile: fileRef.
].
"export the arabian JSON I18N file"
'D:/myFile-ar.json' asFileReference writeStreamDo: [ :stream |
CS18NPropertiesExporter new
model: importer model;
stream: stream;
language: ((importer model allWithType: CS18NLanguage) detect: [ :lang | lang shortName = 'ar' ]);
export
]

The meta-model, importer, and exporter are freely available in GitHub.

Towards analyzing TypeScript with Moose

TypeScript is a more and more popular programming language, and so it would be great if we could analyze TypeScript projects using Moose. At the time of writing, no meta-model (or importer) exists for the TypeScript language in Moose. First, what are the pieces of the puzzle needed to analyze TypeScript with Moose? Before we consider TypeScript, let’s look at how things work with Java:

Elements of analyzing a Java project

VerveineJ is the importer that can generate models of Java files, allowing us to do analyses in Pharo/Moose.

If we want to do the same thing for TypeScript, we would need:

  • an equivalent of VerveineJ (importer) for TypeScript files,
  • a Famix model of TypeScript.

Creating a parser and importer for TypeScript is no small task, but TypeScript is a popular environment and we can use ts-morph to facilitate the navigation of the TypeScript AST. There’s also a very cool visualization of TypeScript ASTs, which will be useful for understanding and debugging.

Designing a new meta-model for TypeScript is definitely not trivial, because it requires a deep understanding of the language. On the other hand, once a meta-model exists, it’s easy to generate using FamixNG domain-specific language.

Pragmatically speaking, do we need a perfect model of TypeScript to analyze it?

“All models are wrong, but some are useful.” —maybe not George Box

By searching the web for TypeScript and Moose, I discovered a GitHub project called pascalerni/abap2famix. It is an ABAP importer (written in TypeScript) that models ABAP projects using FAMIX 3.0 (compatibility meta-model for Java). Java and ABAP are indeed different languages, but perhaps the differences are not so important if we want to do some static analysis? Seems like a pragmatic approach!

Looking at the node packages used by abap2famix I discovered famix, a TypeScript implementation of Famix, which facilitates creating FAMIX 3.0 entities from TypeScript. Its source is at pascalerni/famix, and I could see that much of it was generated, e.g., in class.ts there’s proof it was not written by hand:

// automatically generated code, please do not change
import {FamixMseExporter} from "../../famix_mse_exporter";
import {Type} from "./../famix/type";
export class Class extends Type {
...
}

How was this code generated? The answer is the fork of FameJava at pascalerni/FameJava, namely the Famix30Codegen.java file. The original FameJava was used to generate the Java API for use with FAMIX 3.0 metamodel. This fork generates (via Java) a TypeScript API. Clever and useful!

So, what if we try to create an importer using ts-morph and the famix packages that will model TypeScript programs in a Java metamodel? As a first try, model only the object-oriented elements of TypeScript, such as classes, methods, attributes, etc.

This is actually the project I proposed to students in an advanced-topics software design course at my university during the winter of 2021.

Several teams set out to achieve this goal, although none of the students had ever done parsing or Pharo before. Many were familiar with node and TypeScript.

I’m happy to say that they all were successful (in varying degrees) in writing an importer for TypeScript that allowed analyses to be done in Moose and Pharo, and their results are all on GitHub:

Team 1 | Team 2 | Team 3 | Team 4

Here are some visualizations produced by Team 4 using Roassal on models loaded into Moose.

The first shows the Weighted Method Count (sum) for classes in several TypeScript projects. The cyclomatic complexity values were calculated using another npm package (ts-complex) in the TypeScript importer:

WMC for various typescript projects

The following chart shows distributions of Cyclomatic Complexities of methods for various classes in the prisma project:

CC for prisma project in typescript

Limits of modeling TypeScript in a Java metamodel

Section titled “Limits of modeling TypeScript in a Java metamodel”

Here are some of the obvious things in TypeScript (Javascript) that don’t quite fit into a Java model:

  • Functions can exist in the global namespace. A workaround proposed by one team was to create a “Global” class in the Java model, and just put functions there as static methods.
  • Functions can exist in methods, but maybe this is possible to model in a newer meta-model for Java that supports lambdas. The API from pascalerni/famix supports an older meta-model for Java.
  • string, number, any are types in TypeScript, but they do not really map to primitive types or classes in Java.
  • TypeScript doesn’t have packages like Java, although it does have ways to specify namespaces and avoid naming conflicts.

Even though a formal model in TypeScript doesn’t (yet) exist in Famix, it’s possible to perform useful analyses of TypeScript using the FAMIX 3.0 (Java) metamodel, thanks to packages, tools and APIs developed and reused in the npm and Moose communities.

Photo credit: “patchwork beads” (CC BY-SA 2.0) by various brennemans

Dependency Structure Matrix for a Java project using Moose

Dependency Structure Matrix for a Java project using Moose

Section titled “Dependency Structure Matrix for a Java project using Moose”

As an extension to Analyzing Java With Moose, in this post I will show how one can create a Design Structure Matrix (DSM) in Moose, in particular from a model of a Java project.

First, you need to generate and load an MSE file into Moose for a Java project. Refer to this post for those steps, which uses the Java code from from Head First Design Patterns.

Roassal (which is a visualization platform that’s part of Moose) has a visualization for DSM called RTDSM. It’s explained here, but with Pharo classes. How to use it with Moose on a Java model?

The key is in the dependency: block, which we define using a Moose Query with allClients. Open a Moose Playground and paste the following Pharo code:

| dsm classes |
dsm := RTDSM new.
classes := (MooseModel root first allModelClasses
select: [ :c |
c mooseName
beginsWith:
'headfirst::designpatterns::combining::decorator' ])
reject: #isAnonymousClass.
"Avoid arbitrary ordering by sorting"
dsm objects: (classes asSortedCollection: [ :a :b | a name < b name]).
"Change the default label from asString which is very long"
dsm labelShapeX label text: #name.
dsm labelShapeY label text: #name.
"Moose Query equivalent to #dependentClasses for a Pharo class"
dsm dependency: #allClients.
dsm rotation: 270.
^dsm

Roassal DSM visualization

This visualization may not be as powerful as IDEA’s DSM analysis or Lattix’s, but it’s open source and can be manipulated in Pharo.

In Moose Query, the opposite to allClients is allProviders.

Analysis of Sindarin API: a modular Moose use case

How is our API used? This is the question the Pharo debugging team came to me with. They needed to know if it is used as they intended to or if there are some improvements to do. To answer that question, we used some of Moose new browsers.

The Pharo debugging team works on Sindarin, an API that eases the expression and automation of debugging tasks. Sindarin is made of two layers:

  • The “technical” API is the minimal API needed to interact with a debugging kernel (the Pharo kernel here).
  • The “scripting” API is here to ease the writing of debugging scripts without having to manipulate low-level concepts.

"Sindarin API"

Now, that’s the theory. In reality, the debugging team realized that these two layers are not really respected: users bypass the scripting API to use technical methods and the debugging kernel methods directly. We used Moose to analyze Sindarin API usage, to help understand if, and how, it should be improved.

Moose has recently been upgraded and now offers more modularity. Users can analyze their model through new specialized browsers that allow users to navigate and visualize their model entities. Entities are propagated from one browser to the others, dynamically updating their contents.

Helping the debugging team was an opportunity to test some of these browsers under operating conditions.

Analysis of the interactions of the users scripts with Sindarin and with the debugging kernel

Section titled “Analysis of the interactions of the users scripts with Sindarin and with the debugging kernel”

To analyze Sindarin, we built a model constituted of 3 key elements:

  • Users debugging scripts: script methods gathered in 4 Classes (green in the figure above)
  • Sindarin: methods from the SindarinDebugger class (both APIs, yellow and blue in the figure)
  • Pharo debugging kernel (pink in the figure)

Our goal was to understand how the users’ scripts use Sindarin and if they interact directly with the debugging kernel.

We imported the model directly from Pharo, using the Model Root browser. This browser is the entry point to modular Moose. It shows the list of installed models and allows to import new ones from MSE files or from Pharo.

"Model Root Browser"

Once the model was imported, it was automatically propagated in modular Moose, to other browsers.

Querying the model to select key elements from the model

Section titled “Querying the model to select key elements from the model”

To explore the debugging scripts, Sindarin and the debugger kernel - and the interactions between them - we used the Query browser.

The purpose of this browser is to query a model or a group of entities. It can filter entities according to their type or the value of their properties. It can also use MooseQuery to navigate associations (inheritances, invocations, accesses, and references) and scopes (containment).

"Queriest Browser"

The left pane of the browser is a visualization of the queries that were made. On the image above, no queries were made yet. The left pane only shows the model on which queries will apply. The right pane of the browser shows the result of the selected query and the code used to build it.

We obtained the 3 elements relevant to our analysis (users scripts, Sindarin, and the debugger kernel) by combining simple queries:

  • A Type query (Classes) gave us all the classes in the model, then
  • a property query (name beginsWith: 'SindarinScript') gave us the 4 scripts classes.

The two images below show how this last query was constructed:

"Query Scripts - Step 1"

"Query Scripts - Step 2"

Queries to get Sindarin API, i.e. methods of the class SindarinDebugger:
Section titled “Queries to get Sindarin API, i.e. methods of the class SindarinDebugger:”
  • A type query (Classes), gave us all classes in the model, then
  • a property query (name = 'SindarinDebugger’), gave us the Sindarin API class.
  • finally, a scope query (scope down to methods) gave us all Sindarin methods, i.e. the API.
Queries to get methods from the debugging kernel:
Section titled “Queries to get methods from the debugging kernel:”

The image below shows the sequence of queries created to obtain methods from the Context class and the Debugger-Model package:

  • Two Type queries (Classes and Packages, green on right), gave us all classes and packages in the model, then
  • two property queries (name = 'Context' and name = 'Debbuger-Model', blue), gave us the Context class and the Debugger-Model package, then
  • we made the union (pink) of the class and package, and finally
  • a scope query (scope down to methods, turquoise, bottom right query) gave us the methods they contain.

"Kernel methods"

Saving queries combinations as custom queries

Section titled “Saving queries combinations as custom queries”

We saved these combinations of simple queries as custom queries, as shown in the image below:

  • SindarinScript classes are the 4 script classes.
  • SindarinDebugger methods are the Sindarin API, i.e. methods from SindarinDebugger class
  • Debugger Kernel methods are methods from the debugging kernel, that should not be called directly from the scripts.

"Saved queries"

How do scripts interact with Sindarin and the debugging kernel?

Section titled “How do scripts interact with Sindarin and the debugging kernel?”

We added a navigation query to obtain methods that were called in the script. We then narrowed this result by selecting methods from Sindarin or from the debugging kernel.

To find which methods are called in the users scripts, we queried outgoing invocations from the scripts classes. We obtained a group of 331 candidate methods whose signatures match messages sent in the scripts:

"Script outgoing"

The next step was to compare this group of candidates with the methods from Sindarin scripting API. We established with the debugging team that scripting methods should have "scripting" as comment. This convention allowed us to get them with a property query (blue query in the figure below).

We queried for the intersection between the scripting API and the group of candidates invoked from the users scripts (salmon query at the bottom left in the figure below). This gave us 26 methods that are called from the scripts, out of 45 in the Sindarin scripting API:

These 26 mehtods, result of the last query performed, are listed in the top right pane of the Query browser.

"API called in scripts"

Sindarin methods that are not used in the scripts

Section titled “Sindarin methods that are not used in the scripts”

It was important for the debugging team to know the other 19 scripting API methods that are not called from the scripts. We obtained them by querying the difference between Sindarin methods and the group of candidates (purple query at the bottom left in the figure below):

"API not called in scripts"

Again the 19 resulting methods are listed in the “Result” pane.

The debugging team now knows which methods from their scripting API have been useful for users and, most importantly, which methods have not.

Debugging kernel methods called from the scripts

Section titled “Debugging kernel methods called from the scripts”

We also compared the group of candidates invoked from the users scripts with the methods of the debugging kernel and obtained 15 methods. These methods are used directly in the users scripts without using Sindarin:

"Kernel in scripts"

Identifying the scripts that use the debugging kernel:

Section titled “Identifying the scripts that use the debugging kernel:”

Let’s get back to the fact that outgoing invocations gave us a group of candidates methods. When several methods share the same signature, we cannot know for certain which method is called.

For example: Sindarin defines #arguments, which is also defined in Context.

Sindarin defines:

arguments
^ self context arguments

that should be used as follows:

sindarin := SindarinDebugger new.
"some steps"
(sindarin arguments at: 1) doSomething.

We wanted to detect cases where the users did not use Sindarin, like in the following:

sindarin := SindarinDebugger new.
"some steps"
(sindarin context arguments at: 1) doSomething.

This is a simple example or wrong practice. In more complex cases, interacting directly with the kernel would force the user to write verbose scripts and increase the risk of bugs. Why not benefit from a tested API when you can ?

To detect these cases, we used a convention of the users scripts: each time a call to Sindarin API is made, the receiver is a variable named sindarin. This means that invocations which receiver is not sindarin are probably cases where the user bypassed Sindarin API, calling the debugging kernel directly.

We inspected the 15 debugging kernel methods that may be used directly in the scripts and selected the ones that are sent from the scripts to a receiver other than sindarin. This was done in Pharo (not the Query browser), using MooseQuery:

self select: [ :method |
method incomingInvocations anySatisfy: [ :invocation |
(invocation sender parentType name beginsWith: 'SindarinScript')
and: [ invocation receiver isNil
or: [ invocation receiver name ~= 'sindarin' ] ] ] ]

We narrowed it down to 11 methods of the debugging kernel that are called from the scripts:

"Kernel called in scripts"

We then collected the scripts where these methods are called:

self flatCollectAsSet: [ :method |
method incomingInvocations
select: [ :invocation |
(invocation sender parentType name beginsWith: 'SindarinScript')
and: [ invocation receiver isNil
or: [ invocation receiver name ~= 'sindarin' ] ] ]
thenCollect: #sender ]

We obtain 15 user script methods that need to be investigated manually. Moose cannot determine which object will receive a message when the script is executed. The debugging team will analyze in the scripts the statements where the probable kernel methods are invoked. They will run the debugging scripts to identify at runtime which objects are the receivers.

Exploring these 15 scripts will help the debugging team to understand whether:

  • Users did not use Sindarin correctly: they interacted directly with the debugging Kernel, even though Sindarin methods are available,
  • Sindarin API is incomplete: users had no choice but to interact with the debugging kernel because Sindarin does not provide the necessary methods,
  • There are other reasons why the API is not used correctly?

"Scripts calling kernel"

With this experiment, we were able to provide valuable information to the debugging team. In return, we got suggestions of improvements to be implemented in modular Moose tools.

To know how many times a method is called, we need the number of invocations of this method. That is querying the size of a query result. The query browser does not allow it yet, and it should.

To get the 15 users scripts that need to be explored, we had to inspect a group of methods and run a selection manually (Pharo code). This should be possible directly in the queries browser: users should be able to build a query that executes a query script.

This experiment will help Pharo debugging team to improve Sindarin. They will be able to make it more complete and documented, so users continue to develop debugging scripts.

In the meantime, we will improve modular Moose to widen the queries possibilities. Then, we will challenge these new features: we will use them to analyze the upgraded Sindarin API to document its evolution.