For Moose, I had to design a number of parsers for various languages (Java, Ada, C/C++, PowerBuilder). If you have already done that, you will know that the Visitor pattern is a faithful ally. To help me in this, I came with the concept of “micro visitor” allowing to modularize visitors.
Parsing source code starts with a grammar of the programming language and an actual parser that creates an Abstract syntax Tree (AST) of the program.
For many programming languages, the AST can contain tens of different nodes.
The way to master this complexity is to use visitors.
A visitor is a class with one method (traditionaly
visitXYZ(XYZ node)) for each possible type of node in the AST.
Each method treats the current node and delegates to other methods treating the nodes below it.
For a parser like VerveineJ (Java to MSE importer) the visitor class reached 2000 lines of code and became difficult to maintain as there are also interactions between visiting methods because the treatment of a node down in the AST may depend on what are its parent nodes.
For example, in Java,
ThisExpression node may be found in different situations:
Therefore the treatment in
visitThisExpression( ThisExpression node) may depend on which method called it.
This makes it more complex to develop and maintain all the “visitXYZ” methods.
On the other hand, a visitor typically has a small state:
As a result, I came up with the notion of micro-visitors specialized for a single task. For example, for VerveineJ, I have 10 (concrete) micro-visitors, 4 to create entities and 6 to create dependencies between them:
VisitorPackageDef, creating Famix packages;
VisitorClassMethodDef, creating Famix classes and methodes;
VisitorVarsDef, creating Famix attribute, parameter, local variable definition;
VisitorComments, creating comments in all Famix entities;
VisitorInheritanceRef, creating inheritances between classes
VisitorTypeRefRef, creating reference to declared types;
VisitorAccessRef, creating accesses to variables;
VisitorInvocRef, creating invocation dependencies between methods;
VisitorAnnotationRef, handling annotations on entities;
VisitorExceptionRef, handling declared/catched/thrown exceptions.
The resulting visitors are much smaller (around 600 lines of code for the three more complex:
VisitorAccessRef ; less than 150 lines of code for
VisitorExceptionRef) and thus easier to define and maintain.
Also, because the visitor is specialized, there are less dependencies between the methods:
VisitorInvocRef only treats
ThisExpression when it is a constructor invocation.
The overhead on the execution is small as each visitor is specialized and does not need to go through all the AST (eg a visitor for function declaration in C would not have to visit the body of these functions since they cannot contain other function declarations).
Micro-visitors can be used independantly one of the other (in sequence) as in VerveineJ where each visitor is called one after the other (by the FamixRequestor class) to visit the full AST. The “orchestrator” object owns the state and pass it to each visitor in turn.
Micro-visitors can also call one another (in delegation).
For example for PowerBuilder, there is one main visitor (
PowerBuilder-Parser-Visitor.PWBCompilationUnitVisitor, visiting the AST for a source file) and 7 (concrete) micro-visitors:
PWBTypeDeclarationVisitor, visiting type declarations;
PWBBehaviouralDeclarationVisitor, visiting function/routine definitions;
PWBVariableDeclarationVisitor, visiting declarations of all kind of variables;
PWBTypeReferenceToIdentifierVisitor, visiting references to type names (for example in variable declarations);
PWBStatementsVisitor, visiting statements in the body of behaviourals;
PWBExpressionsVisitor, visiting expressions in statements;
PWBBehaviouralInvocationVisitor, visiting the invocation of behavioural in expressions.
In this case, the main visitor (
PWBCompilationUnitVisitor) owns the state and its auxiliary visitors get this state from their respective parent visitor:
PWBBehaviouralDeclarationVisitorwhen it encounters a function definition, this one spawns a
PWBStatementsVisitorto visit the body of the function,
PWBExpressionsVisitorto visit expressions found in the statements.
PWBExpressionsVisitorneeds to access the context stack, it asks to its parent
PWBStatementsVisitor, that asks to its parent
PWBBehaviouralDeclarationVisitor, that asks to the