Expression evaluation in Java (4)

Previously we looked at various options how to evaluate expressions, then we implemented our own evaluator in ANTLR v4 and then we complicated it a bit with more types and more operations. But it still doesn’t make sense without variables. What good would any expression based rule engine be if we can’t change input parameters? 🙂

But before that…

All the code related to this blog post is here on GitHub, package name is called expr3 this time as it is our third iteration of the ANTLR solution (even though we are in the 4th post). Tests are here. We will focus on variables, but there are some changes made to expr3 beyond variables themselves.

  • Grammar supports NULL literal and identifiers (for variables) – but we will get to this.
  • ExpressionCalculatorVisitor now accepts even more types and method convertToSupportedType is taking care of this. This is important to support reasonable palette of object types for variables (and will work the same for return values from functions later).
  • Numbers can be represented as Integer or BigDecimal. If the number fits into Integer range (and is not decimal number, of course) it will be represented with Integer, otherwise BigDecimal is used. This does complicate arithmetic and relational (comparisons) operations, we need some promotions here and there, etc. As of now it is coded in a bit crude way – had the implicit conversion rules been more complicated, some more sophisticated solution would be better.
  • Java Date Time API types can be used as variable values, but these will be converted to the ISO extended strings (which still allows comparison!). LocalDate, LocalDateTime and Instant are supported in this demo.

As I said, I’ll not focus on these changes, they are in the code and while they affect how we treat variables, they are not inherently related to introducing them. I’ll also not talk about related tests (like literal resolution into Integer vs BigDecimal) – again, it is in the repo.

Identifiers and null

When we’re working with variables, we need to write them somehow into the expression – and that’s where identifiers come in. As you’d expect, identifiers represent the variables on their respective place in the expression (or rather their value), so they are one kind of elemental expressions, just like various literals. Second thing we may need is NULL value. This is not strictly necessary in all contexts, but null is so common in our Java world that I decided to support it too. Our expr node in its fullness looks like this:

expr: STRING_LITERAL # stringLiteral
    | BOOLEAN_LITERAL # booleanLiteral
    | NUMERIC_LITERAL # numericLiteral
    | NULL_LITERAL # nullLiteral
    | op=('-' | '+') expr # unarySign
    | expr op=(OP_MUL | OP_DIV | OP_MOD) expr # arithmeticOp
    | expr op=(OP_ADD | OP_SUB) expr # arithmeticOp
    | expr op=(OP_LT | OP_GT | OP_EQ | OP_NE | OP_LE | OP_GE) expr # comparisonOp
    | OP_NOT expr # logicNot
    | expr op=(OP_AND | OP_OR) expr # logicOp
    | ID # variable
    | '(' expr ')' # parens
    ;

Null literal is predictably primitive:

NULL_LITERAL : N U L L;

Identifiers are not very complicated either, and I guess they are pretty much similar to Java syntax:

ID: [a-zA-Z$_][a-zA-Z0-9$_.]*;

Various tests for null in the expression (without variables first) may look like this:

    @Test
    public void nullComparison() {
        assertEquals(expr("null == null"), true);
        assertEquals(expr("null != null"), false);
        assertEquals(expr("5 != null"), true);
        assertEquals(expr("5 == null"), false);
        assertEquals(expr("null != 5"), true);
        assertEquals(expr("null == 5"), false);
        assertEquals(expr("null > null"), false);
        assertEquals(expr("null < null"), false);
        assertEquals(expr("null <= null"), false); assertEquals(expr("null >= null"), false);
    }

We don’t need much for this – resolving the NULL literal is particularly simple:

@Override
public Object visitNullLiteral(NullLiteralContext ctx) {
    return null;
}

We also modified visitComparisonOp – now it starts like this:

@Override
public Boolean visitComparisonOp(ExprParser.ComparisonOpContext ctx) {
    Comparable left = (Comparable) visit(ctx.expr(0));
    Comparable right = (Comparable) visit(ctx.expr(1));
    int operator = ctx.op.getType();
    if (left == null || right == null) {
        return left == null && right == null && operator == OP_EQ
            || (left != null || right != null) && operator == OP_NE;
    }
...

The rest is dealing with non-null values, etc. We may also let this method return null when null is involved anywhere except for EQ/NE, now it returns false. Depends on what logic we want.

Variable resolver

Variable resolving inside the calculator class is also quite simple. We need something that resolves them – that’s that variableResolver field initialized in the constructor and used in visitVariable:

private final ExpressionVariableResolver variableResolver;

public ExpressionCalculatorVisitor(ExpressionVariableResolver variableResolver) {
    if (variableResolver == null) {
        throw new IllegalArgumentException("Variable resolver must be provided");
    }
    this.variableResolver = variableResolver;
}

@Override
public Object visitVariable(VariableContext ctx) {
    Object value = variableResolver.resolve(ctx.ID().getText());
    return convertToSupportedType(value);
}

Anything this resolver returns is converted to supported types as mentioned in the introduction. ExpressionVariableResolver is again very simple:

public interface ExpressionVariableResolver {
    Object resolve(String variableName);
}

And how we can implement this? In Java 8 you must just love it – here is piece of test:

private ExpressionVariableResolver variableResolver;

@BeforeMethod
public void init() {
    variableResolver = var -> null;
}

@Test
public void primitiveVariableResolverReturnsTheSameValueForAnyVarName() {
    variableResolver = var -> 5;
    assertEquals(expr("var"), 5);
    assertEquals(expr("anyvarworksnow"), 5);
}

I use field that is set in init method to default “implementation” that always return null. In another test method I change it and it always return 5, regardless of actual variable name (parameter var) as this test clearly demonstrates. Next test is more useful, because this resolver returns value only for specific value:

@Test
public void variableResolverReturnsValueForOneVarName() {
    variableResolver = var -> var.equals("var") ? 5 : null;
    assertEquals(expr("var"), 5);
    assertEquals(expr("var != null"), true);
    assertEquals(expr("var == null"), false);

    assertEquals(expr("anyvarworksnow"), null);
    assertEquals(expr("anyvarworksnow == null"), true);
}

Now the actual name of the variable (identifier) must be “var”, otherwise it returns null again. You might have heard that lambdas may work as super-short test implementations – and yes, they can.

You may wonder why I have the field instead of using it only in the test method itself. This would be better contained and preventing any accidental leaks (although that @BeforeMethod covered me on this). Trouble is that variableResolver is used deeper in expr(…) method and I didn’t want to add it as parameter everywhere, hence the field:

private Object expr(String expression) {
    ParseTree parseTree = ExpressionUtils.createParseTree(expression);
    return new ExpressionCalculatorVisitor(variableResolver)
        .visit(parseTree);
}

Any real-life implementation?

Variable resolvers in the test were obviously very primitive, so let’s try something more realistic. First try is also very simple, but indeed realistic. Remember Bindings in Java’s ScriptEngine? It actually extends Map – so how about resolver that wraps existing Map<String, Object> (mapping variable name to it’s value)? Ok, it – again – may be too primitive:

ExpressionVariableResolver resolver = var -> map.get(var);

Bah, we need some bigger challenge! Let’s say I have a Java Bean or any POJO and I want to explicitly specify my variable names and how they should be resolved with that object. This may be call of method, like getter, so right now we don’t have the values readily available in a collection (or a map).

Important thing to realize here is that resolver will be different from an object to object, because for different objects it needs to provide different values. However, the way how it obtains the values will be the same. And we will wrap this “way” into VariableMapper that knows how to get values from an object of specific type (using generics) – and it will also help us to resolve the value for specific instance. Tests show how I intend to use it:

private VariableMapper<SomeBean> variableMapper;
private ParseTree myNameExpression;
private ParseTree myCountExpression;

@BeforeClass
public void init() {
    variableMapper = new VariableMapper<SomeBean>()
        .set("myName", o -> o.name)
        .set("myCount", SomeBean::getCount);
    myNameExpression = ExpressionUtils.createParseTree("myName <= 'Virgo'"); myCountExpression = ExpressionUtils.createParseTree("myCount * 3"); } @Test public void myNameExpressionTest() { SomeBean bean = new SomeBean(); ExpressionCalculatorVisitor visitor = new ExpressionCalculatorVisitor( var -> variableMapper.resolveVariable(var, bean));

    assertEquals(visitor.visit(myNameExpression), false); // null comparison is false
    bean.name = "Virgo";
    assertEquals(visitor.visit(myNameExpression), true);
    bean.name = "ABBA";
    assertEquals(visitor.visit(myNameExpression), true);
    bean.name = "Virgo47";
    assertEquals(visitor.visit(myNameExpression), false);
}

@Test
public void myCountExpressionTest() {
    SomeBean bean = new SomeBean();
    ExpressionCalculatorVisitor visitor = new ExpressionCalculatorVisitor(
        var -> variableMapper.resolveVariable(var, bean));

// assertEquals(visitor.visit(myCountExpression), false); // NPE!
    bean.setCount(3f);
    assertEquals(visitor.visit(myCountExpression), new BigDecimal("9"));
    bean.setCount(-1.1f);
    assertEquals(visitor.visit(myCountExpression), new BigDecimal("-3.3"));
}

public static class SomeBean {
    public String name;
    private Float count;

    public Float getCount() {
        return count;
    }

    public void setCount(Float count) {
        this.count = count;
    }
}

VariableMapper can live longer, you set it up and then reuse it. Its configuration is its state (methods set), concrete object is merely input parameter. Variable resolver itself works like a closure around the concrete instance. Keep in mind that instantiating calculator visitor is cheap and visiting itself is something you have to do anyway. Creating parsing tree is expensive, but we don’t repeat this between tests – and that’s probably how you want to use it in your application too. Cache the parse trees, create visitors – even with state specific for a single calculation – and then throw them away. This is also safest from threading perspective. You don’t want to use calculator that “closes over” another target object in another thread in the middle of your visiting business. 🙂

Ok, how does that VariableMapper look like?

public class VariableMapper<T> {
    private Map<String, Function<T, Object>> variableValueFunctions = new HashMap<>();

    public VariableMapper<T> set(String variableName, Function<T, Object> valueFunction) {
        variableValueFunctions.put(variableName, valueFunction);
        return this;
    }

    public Object resolveVariable(String variableName, T object) {
        Function<T, Object> valueFunction = variableValueFunctions.get(variableName);
        if (valueFunction == null) {
            throw new ExpressionException("Unknown variable " + variableName);
        }
        return valueFunction.apply(object);
    }
}

As said, it keeps the configuration, but not the state of the object used in concrete calculation – that’s what the variable resolver does (and again, using lambda, one simply can’t resist in this case). Sure, you can combine VariableResolver with the mapping configuration too, but that will either 1) work in a single-threaded environment only, 2) or you have to reconfigure that mapping for each resolver in each thread. It simply doesn’t make sense. Mapper (long-lived) keeps the “way” how to get stuff from an object of some type in a particular computation context while variable resolver (short-lived) merely closes over the concrete instance.

Of course, our mapper can stand some improvements, it would be good if one could “seal” the configuration and no more “set” calls are allowed after that (probably throwing IllegalStateException).

Conclusion

So here we are, supporting even more types (Integer/BigDecimal), but – most importantly – variables! As you can see, now every computation can bring different result. That’s why it’s advisable to rethink how you want to instantiate your visitors, especially in case of multi-threaded environment.

Our ExpressionVariableResolver interface is very simple, it supports only variable name – so if you want to resolve from something stateful (and probably mutable) it’s important to wrap around it somehow. Variable resolver doesn’t know how to get stuff from an object, because there is no such input parameter. That’s why we introduced VariableMapper that supports getting values from an object of some type (generic). And we “implement” variable resolver as lambda to close over the configured variable mapper and an object that is then fed to its resolveVariable method. This method, in contrast to variable resolver’s resolve, takes in the object as a parameter.

It doesn’t have to be an object – you may implement other ways to get variable values in different contexts, you just have to wrap around that context (in our case object) somehow. I dare to say that Java 8 functional programming capabilities make it so much easier…

Still, the main hero here is ANTLR v4, of course. Now our expression evaluator truly makes sense. I’m not promising any continuation of this series, but maybe I’ll talk about functions too. Although I guess you can easily implement them yourselves by now.

Leave a comment