Reflection selector expression

Java::Geci is a code generator that runs during unit test time. If the generated code fits the actual version of the source code then the test does not fail. If any modification is needed then the test modifies the source code and fails. For example, if there is a new field that needs a setter and a getter then the accessor generator will generate the new setter and getter and then the test fails. If there is no new field then the generated code is just the one that is already there and there is no reason to touch the source code: the test that started the generator finishes successfully.
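The "generate, compare, fail only when something changed" idea can be sketched in a few lines. This is only an illustration of the principle and not the code of the framework: the class GenerateOnTestSketch and its in-memory content field are made up for the example and stand in for a real source file.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a source file whose content a generator may rewrite. */
final class GenerateOnTestSketch {
    private String content;

    GenerateOnTestSketch(String content) {
        this.content = content;
    }

    /**
     * Runs the generator and stores its output only when it differs
     * from the current content.
     *
     * @return true when the source had to be modified, i.e. the test should fail
     */
    boolean generate(Supplier<String> generator) {
        final String generated = generator.get();
        if (generated.equals(content)) {
            return false; // up to date, nothing is touched
        }
        content = generated; // the "file" is rewritten
        return true;
    }

    String content() {
        return content;
    }
}
```

A test would assert that generate() returns false. The first run after a change in the source fails and leaves the updated code behind; the second run passes.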

Because Java::Geci generators run as tests, that is, at run time, and because they need access to the Java code structures for which they generate code, Java reflection is key for these generators.

To help the code generators perform their tasks, there are a lot of support methods in the javageci-tools module.

    <dependency>
        <groupId>com.javax0.geci</groupId>
        <artifactId>javageci-tools</artifactId>
        <version>1.1.1</version>
    </dependency>

In this article, I will write about one class in this module: Selector, which can help you select a field, method or class based on a logical expression.

Introduction

The class javax0.geci.tools.reflection.Selector is a bit like the regular expression class Pattern. You can create an instance by invoking the static method compile(String expression). On the instance, you can invoke match(Object x), where the object x can be a Field, a Method or a Class, or something that can be cast to any of those (let’s call these CFoMs). The method match() will return true if x fits the expression that was compiled.

Selector expression

The expression is a Java String. It can be as simple as true, which will match any CFoM. Similarly, false will not match anything. So far trivial. There are other conditions that the expression can contain. public, private, volatile and so on can be used to match a CFoM that has any of those modifiers. If you use something like volatile on a CFoM that cannot be volatile (a class or a method) then you will get an IllegalArgumentException.

For classes you can have the following conditions:

  • interface when the class is an interface
  • primitive when it is a primitive type
  • annotation when it is an annotation
  • anonymous
  • array
  • enum
  • member
  • local

Perhaps you may look up what a member class is and what a local class is. It is never too late to learn a bit of Java. I did not know it was possible to query that a class is a local class in reflection until I developed this tool.
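If you do not want to look it up, the difference can be seen with plain reflection. The sketch below uses only the JDK methods Class.isMemberClass(), Class.isLocalClass() and Class.isAnonymousClass(); the class ClassKindsSketch and its helper kind() are made up for the illustration, and kind() lumps everything else (including primitives and arrays) under "top-level".

```java
final class ClassKindsSketch {
    /** A member class: declared directly inside another class. */
    static class Member {
    }

    /** Returns the Class object of a local class, one declared inside a method body. */
    static Class<?> localClass() {
        class Local {
        }
        return Local.class;
    }

    /** Returns the Class object of an anonymous class. */
    static Class<?> anonymousClass() {
        return new Object() {
        }.getClass();
    }

    /** Names the kind of the class using the JDK reflection queries. */
    static String kind(Class<?> c) {
        if (c.isAnonymousClass()) {
            return "anonymous";
        }
        if (c.isLocalClass()) {
            return "local";
        }
        if (c.isMemberClass()) {
            return "member";
        }
        return "top-level";
    }
}
```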

These conditions are simple words. You can also use pattern matching. If you write extends ~ /regex/ it will match only classes that extend a class that has a name matching the regular expression regex. You can also match the name, simpleName and canonicalName against a regular expression. If our CFoM x is a method or a field then these are checked against its type (the return type in the case of a method). The exception is name, because methods and fields have a name of their own.

Conditions

There are many conditions that can be used; here I list only a subset. The detailed documentation that contains all the words is at https://github.com/verhas/javageci/blob/master/FILTER_EXPRESSIONS.md

Here is an appetizer though:

protected, package, static, public, final, synthetic,
synchronized, native, strict, default, vararg, implements,
overrides, void, transient, volatile, abstract

Expression Structure

Checking one single thing would not be too helpful. And also calling the argument of the method compile() to be an “expression” suggests that there is more.

You can combine the conditions into a full logical expression. You can create a selector Selector.compile("final | volatile") to match all fields that are kind of thread safe, being either final or volatile or both (which is not possible in Java, but the selector expression would not mind). You can also say Selector.compile("public & final & static") to match only those fields that are public, final and static. Or you can say Selector.compile("!public & final & static") to match the final and static fields that are private, protected or package private, in other words “not public”. You can also apply parentheses and with those, you can build up fairly complex logical expressions.
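For comparison, this is roughly what such expressions correspond to when they are written by hand with java.lang.reflect.Modifier and java.util.function.Predicate. It is a sketch of the idea only and says nothing about how Selector is implemented; the Sample class, the predicate constants and the select() helper are made up for the illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class ModifierPredicatesSketch {
    /** Made-up sample class whose fields are selected below. */
    static class Sample {
        public static final int A = 1;
        private static final int B = 2;
        private volatile int c;
        public int d;
    }

    static final Predicate<Field> PUBLIC = f -> Modifier.isPublic(f.getModifiers());
    static final Predicate<Field> FINAL = f -> Modifier.isFinal(f.getModifiers());
    static final Predicate<Field> STATIC = f -> Modifier.isStatic(f.getModifiers());
    static final Predicate<Field> VOLATILE = f -> Modifier.isVolatile(f.getModifiers());

    /** Sorted names of the fields of Sample that the condition accepts. */
    static List<String> select(Predicate<Field> condition) {
        return Arrays.stream(Sample.class.getDeclaredFields())
                .filter(f -> !f.isSynthetic()) // skip fields added by tools
                .filter(condition)
                .map(Field::getName)
                .sorted()
                .collect(Collectors.toList());
    }
}
```

With these, select(PUBLIC.and(FINAL).and(STATIC)) plays the role of "public & final & static", select(PUBLIC.negate().and(FINAL).and(STATIC)) that of "!public & final & static", and select(FINAL.or(VOLATILE)) that of "final | volatile". The string form is shorter and can come from configuration, which is the point of the selector.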

Use

The usage can be any application that heavily relies on reflection. In Java::Geci the expression can be used in the filter parameter of any generator that generates some code for the methods or for the fields of a class. In that case, the filter can select which fields or methods need code generation. For example, the default value for the filter in case of the accessor generator is true: generate setters and getters for all the fields. If you need setters and getters only for the private fields you can specify filter="private". If you also want to exclude final fields you can write filter="!final & private". In that case, you will not get a getter for the final fields. (Setters are not generated for final fields, neither by default nor at all. The generator is clever.)

Using streams it is extremely easy to write expressions, like

Arrays.stream(TestSelector.class.getDeclaredFields())
.filter(Selector.compile("private & primitive")::match)
.collect(Collectors.toSet());

that will return the set of the fields that are private and primitive. Be aware that in that case, you have some selector compilation overhead (only once for the stream, though) and in some cases, the performance may not be acceptable.
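The "only once for the stream" part follows from the way Java evaluates a bound method reference: the expression before the :: is evaluated when the method reference itself is evaluated, not for each element. The sketch below shows the difference from a lambda that calls the factory inside its body. The compile() method here is a made-up stand-in that only counts its invocations; it is not Selector.compile().

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class CompileOnceSketch {
    /** Counts how many times the hypothetical, expensive compile step ran. */
    static final AtomicInteger COMPILATIONS = new AtomicInteger();

    /** Made-up stand-in for an expensive factory method. */
    static Predicate<String> compile(String prefix) {
        COMPILATIONS.incrementAndGet();
        return s -> s.startsWith(prefix);
    }

    /** Bound method reference: compile() runs once, when the reference is created. */
    static int countWithMethodReference() {
        return Stream.of("get", "getA", "setA", "getB")
                .filter(compile("get")::test)
                .collect(Collectors.toList())
                .size();
    }

    /** Lambda: compile() runs again for every element that reaches the filter. */
    static int countWithLambda() {
        return Stream.of("get", "getA", "setA", "getB")
                .filter(s -> compile("get").test(s))
                .collect(Collectors.toList())
                .size();
    }
}
```

Both methods select the same three strings, but the first one increments COMPILATIONS once and the second one four times. If the same selector is used in many places it may be worth storing the compiled instance in a field.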

Experiment and see if it suits your needs.

I just forgot to add: You can also extend the selector during run-time calling the selector(String,Function) and/or selectorRe(String,Function) methods.

Generating setters and getters using Java::Geci

In an earlier article, we created very simple hello-world generators to introduce the framework and to show how to create generators in general. In this article, we will look at the accessor generator, which is defined in the core module of Java::Geci and which is a commercial-grade generator, not a demo-only one. Even though the generator is commercial grade, it uses the services of the framework and therefore has code simple enough to be presented in an article.

What does an accessor generator do

Accessors are setters and getters. When a class has many fields and we want to help encapsulation, we declare these fields to be private and create setters and getters, a pair for each field, that can set the value of the field (the setter) and can get the value of the field (the getter). Note that, contrary to what many juniors think, creating setters and getters is not encapsulation by itself, but it may be a tool to do proper encapsulation. At the same time, note that it also may NOT be a tool for proper encapsulation. You can read more about it in “Joshua Bloch: Effective Java 3rd Edition” Item 16.

Read it with a bit of caution though. The book says that it was updated for Java 9. That version of Java contains the module system. The chapter Item 16 does not mention it and even this edition still says to use private members with setters and getters for public classes, which in case of Java 9 may also mean classes in packages that the module does not export.

Many developers argue that setters and getters are inherently evil and a sign of bad design. Make no mistake! They do not advocate using the raw fields directly. That would be even worse. They argue that you should program with a more object-oriented mindset. In my opinion, they are right, and still in my professional practice I have to use a lot of classes while maintaining legacy applications that use legacy frameworks and contain setters and getters, which are needed by the programming tools around the application. Theory is one thing and real life is another. Different integrated development environments and many other tools generate setters and getters for us, unless we forget to execute them when a new field was added.

A setter is a method that has an argument of the same type as the field and returns void. (In other words, it does not return any value.) The name of the setter, by convention, is set followed by the name of the field with the first letter capitalized. For the field businessOwner the setter is usually setBusinessOwner. The setter sets the value of the field to that of the argument of the setter.

The getter is also a method. It does not have any argument but returns the value of the field, and hence it has the same return type as the type of the field. The name of the getter, by convention, is get followed again by the capitalized name of the field. That way the getter will be getBusinessOwner.

In case of boolean or Boolean type fields the getter may have the is prefix, so isBusinessOwner could also be a valid name in case the field is some boolean type.
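The naming convention fits in a few lines of code. This is a sketch of the convention only, and the class AccessorNamesSketch is made up for the illustration. It always picks the is prefix for boolean types, which is one possible choice; the accessor generator discussed below uses get by default and lets you configure the name.

```java
final class AccessorNamesSketch {
    /** Capitalizes the first character; an empty name is returned unchanged. */
    static String cap(String s) {
        return s.isEmpty() ? s : Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    /** "set" followed by the capitalized field name. */
    static String setterName(String field) {
        return "set" + cap(field);
    }

    /** "is" for boolean and Boolean fields, "get" for everything else. */
    static String getterName(String field, Class<?> type) {
        final boolean isBoolean = type == boolean.class || type == Boolean.class;
        return (isBoolean ? "is" : "get") + cap(field);
    }
}
```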

An accessor generator generates a setter and a getter for every field it has to.

How to generate accessors

The accessor generator has to generate code for some of the fields of the class. This generator is the ideal candidate for a filtered field generator in Java::Geci. A filtered field generator extends the AbstractFilteredFieldsGenerator class and its process() method is invoked once for each filtered field. The method also gets the Field as its last parameter, in addition to the Source, Class and CompoundParams parameters that we already saw in the article a few weeks ago.

The class AbstractFilteredFieldsGenerator uses the configuration parameter filter to filter the fields. That way the selection of which field to take into account is the same for each generator that extends this class and the generators should not care about field filtering: it is done for them.

The major part of the code of the generator is the following:

public class Accessor extends AbstractFilteredFieldsGenerator {

    ...

    @Override
    public void process(Source source, Class<?> klass, 
                        CompoundParams params, 
                        Field field) throws Exception {
        final var id = params.get("id");
        source.init(id);
        var isFinal = Modifier.isFinal(field.getModifiers());
        var name = field.getName();
        var fieldType = GeciReflectionTools.typeAsString(field);
        var access = check(params.get("access", "public"));
        var ucName = cap(name);
        var setter = params.get("setter", "set" + ucName);
        var getter = params.get("getter", "get" + ucName);
        var only = params.get("only");
        try (var segment = source.safeOpen(id)) {
            if (!isFinal && !"getter".equals(only)) {
                writeSetter(name, setter, fieldType, access, segment);
            }
            if (!"setter".equals(only)) {
                writeGetter(name, getter, fieldType, access, segment);
            }
        }
    }
}

The code at the place of the ellipsis contains some more methods, which we will look at later. The first call gets the parameter id. This is a special parameter: in case it is not defined, the default that params.get("id") returns is the mnemonic of the generator. This is the only parameter that has such a global default value.

The call to source.init(id) ensures that the segment will be treated as “touched” even if the generator does not write anything to that segment. It may happen in some cases and when writing a generator it never hurts calling source.init(id) for any segment that the generator intends to write into.

The code looks at the actual field to check if the field is final. If the field is final then it has to get the value by the time the object is created and after that, no setter can modify it. In this case, only a getter will be created for the field.

The next thing the setter/getter generator needs is the name of the field and also the string representation of the type of the field. The static utility method GeciReflectionTools.typeAsString() is a convenience tool in the framework that provides just that.

The optional configuration parameter access will get into the variable of the same name and it will be used in case the access modifier of the setter and the getter needs to be different from public. The default is public and this is defined as the second argument to the method params.get(). The method check() is part of the generator. It checks that the modifier is correct and in most cases prevents the generation of syntactically incorrect code (e.g. creating setters and getters with the access modifier pritected). We will look at that method in a while.

The next thing is the name of the getter and the setter. By default it is set or get followed by the capitalized name of the field, but it can also be defined by the configuration parameters setter and getter. That way you can have isBusinessOwner if that is an absolute need.

The last configuration parameter is the key only. If the code specifies only='setter' or only='getter' then only the setter or only the getter will be generated.

The segment the generator wants to write into is opened in the head of the try-with-resources block, and the body of the block then calls the local writeSetter and writeGetter methods. There are two different methods to open a segment from a source object. One is open(id), the other one is safeOpen(id). The first method will try to open the segment and if the segment with the name is not defined in the class source file then the method will return null. The generator can check for null and it has the possibility to use a different segment name if it is programmed so. On the other hand, safeOpen() throws a GeciException if the segment cannot be opened. This is the safer version, which avoids later null pointer exceptions in the generator. Those are not nice.

Note that the setter is only written if the field is not final and if the only configuration key was NOT configured to be getter (only).

Let’s have a look at these two methods. After all, these are the real core methods of the generators that do actually generate code.

    private static void writeGetter(String name, String getterName,
                                    String type, String access, Segment segment) {
        segment.write_r(access + " " + type + " " + getterName + "(){")
                .write("return " + name + ";")
                .write_l("}")
                .newline();
    }

    private static void writeSetter(String name, String setterName,
                                    String type, String access, Segment segment) {
        segment.write_r(access + " void " + setterName + "(" +
                type + " " + name + "){")
                .write("this." + name + " = " + name + ";")
                .write_l("}")
                .newline();
    }

The methods get the name of the field, the name of the accessor, the type of the field as a string, the access modifier string and the Segment the code has to be written into. The code generators do not write directly into the source files. The segment object provided by the framework is used to send the generated code and the framework inserts the written lines into the source code if that is needed.

The write(), write_l() and write_r() methods of the segment can be used to write code. They work very much like String.format if there is more than one parameter, but they also take care of proper indentation. When the code invokes write_r() the segment will remember that the lines following it have to be indented four more spaces to the right. When the code calls write_l() the segment knows that the indentation has to be decreased by four characters (already for the line actually written). They also handle multi-line strings so that all the lines are properly indented.
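The indentation bookkeeping described above can be sketched with a tiny writer class. This is my illustration of the behavior and not the Segment implementation of the framework: the class IndentingWriterSketch and its methods writeR() and writeL() are made up, and the format-string and multi-line handling are left out.

```java
final class IndentingWriterSketch {
    private static final int TAB = 4;
    private final StringBuilder out = new StringBuilder();
    private int indent = 0;

    /** Writes one line at the current indentation. */
    IndentingWriterSketch write(String line) {
        out.append(" ".repeat(indent)).append(line).append('\n');
        return this;
    }

    /** Writes the line, then indents the following lines one step to the right. */
    IndentingWriterSketch writeR(String line) {
        write(line);
        indent += TAB;
        return this;
    }

    /** Steps back to the left first, so the line itself is already outdented. */
    IndentingWriterSketch writeL(String line) {
        indent = Math.max(0, indent - TAB);
        return write(line);
    }

    @Override
    public String toString() {
        return out.toString();
    }
}
```

Calling writeR("public int getA(){"), then write("return a;"), then writeL("}") produces three lines where only the middle one is indented by four spaces.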

Generated code should also be readable.

The final non-trivial method is the access modifier check.

    private static final Set<String> accessModifiers =
            Set.of("public", "private", "protected", "package");
...

    private String check(final String access) {
        if (!access.endsWith("!") && !accessModifiers.contains(access)) {
            throw new GeciException("'"+access+"' is not a valid access modifier");
        }
        final String modifiedAccess;
        if( access.endsWith("!")){
            modifiedAccess = access.substring(0,access.length()-1);
        }else {
            modifiedAccess = access;
        }
        if( modifiedAccess.equals("package")){
            return "";
        }
        return modifiedAccess;
    }

The purpose of this check is to protect the programmer from mistyping the access modifier. It checks that the access modifier is either private (I do not see a real use case for this one though), protected, public or package. The last one is converted to an empty string, as package private access is the default for class methods. At the same time, using the empty string in the configuration to denote package private access is not really readable.

That way, if the code is configured as pritected, including a typo, the code generator will throw an exception and refuse to generate code that is known to contain a syntax error. On the other hand, the access modifier can also be more complex. In some rare cases, the program may need synchronized getters and setters. We do not try to figure out anything like that automatically, checking if the field is volatile or such, because these are border cases. However, the generator provides a possibility to overcome the limited syntax checking and to provide any string as access modifier. If the access modifier string ends with an exclamation mark then it means the programmer using the generator takes full responsibility for the correctness of the access modifier and the generator will use it as it is (without the exclamation mark of course).

What is left are the methods mnemonic and cap:

    private static String cap(String s) {
        return s.substring(0, 1).toUpperCase() + s.substring(1);
    }

    @Override
    public String mnemonic() {
        return "accessor";
    }

The method mnemonic() is used by the framework to identify the sources that need the service of this generator and also to use it as a default value for the configuration parameter id. All generators should provide this. The other one is cap that capitalizes a string. I will not explain how it works.

Sample use

@Geci("accessor filter='private | protected'")
public class Contained1 {

    public void callMe() {

    }

    private final String apple = "";
    @Geci("accessors only='setter'")
    private int birnen;

    int packge;

    @Geci("accessor access='package' getter='isTrue'")
    protected boolean truth;
    @Geci("accessor filter='false'")
    protected int not_this;

    public Map<String,Set<Map<Integer,Boolean>>> doNothingReally(int a, Map b, Set<Set> set){
        return null;
    }

    //<editor-fold id="accessor" desc="setters">

    //</editor-fold>

}

The class is annotated with the Geci annotation. The parameter is accessor filter='private | protected', which defines the name of the generator to be used on this source file and configures the filter. It says that we need setters and getters for the fields that are private or protected. The logical expression should be read: “filter the field if it is private or protected”.

Some of the fields are also annotated. birnen will get only a setter. The setter and getter of truth will be package private and the getter will be named isTrue(). The field not_this will not get a setter or getter because the filter expression is overridden in the field annotation: it says false, which will never be true, and a field is processed by the generator only when the filter is true.

The field apple is not annotated and will be processed according to the class level configuration. It is private, therefore it will get an accessor, and because it is final it will get only a getter.

The code between the

    //<editor-fold id="accessor" desc="setters">

    //</editor-fold>

will contain the generated code. (You have to run the code to see it, I did not copy it here.)

Summary

In this article, we looked at a generator, which is a real life, commercial grade generator in the Java::Geci framework. Walking through the code we discussed how the code works, but also some other, more general aspects of writing code generators. The next step is to start a project using Java::Geci as a test dependency, use the accessor generator instead of the IDE code generator (which lets you forget to re-execute the setter getter generation) and later, perhaps you can create your own generators for even more complex tasks than just setters and getters.

Box old objects to be autoclosable

Since Java 7 we can use try-with-resources and have any object automatically closed that implements the AutoCloseable interface. That works only if the resource is AutoCloseable. Some classes need some wrap-up but are not AutoCloseable. These are mainly old classes in some legacy framework that still get in our way and make us trip up. Nobody is using Struts any more, but still, there are enough old frameworks lurking in the dark with which we have to live. I recently had that experience and I was so motivated that I created a simple AutoCloser class.

We may have a legacy class (in the example this is a mocking inner class of the testing class)

    public class NotAutoclosable {
        public NotAutoclosable() {
            opened = true;
        }

        public void dispose() {
            opened = false;
        }
    }

which is not auto-closeable as the name also implies. It does not implement the AutoCloseable interface and it does not have a close() method. It has to be disposed of by calling the aptly named method dispose(). (The boolean field opened is used later in the unit test to assert the correct functioning of the AutoCloser class.)

The use of the class looks as follows:

    @Test
    void test() {
        final NotAutoclosable notAu;
        try (final var s = AutoCloser.useResource(new NotAutoclosable())
                .closeWith(sp -> sp.get().dispose())) {
            Assertions.assertTrue(opened);
        }
        Assertions.assertFalse(opened);
    }

We create the resource using the constructor of the inner class and we also define a Consumer that will “close” the resource. This consumer will get the same Supplier that is stored in the variable s.

Side note: this functional argument has to be a consumer and cannot be a Runnable using the variable s, because that variable is not initialized when the lambda expression is evaluated as a lambda expression. When it is going to be used it will already be defined, but that is too late for the Java compiler: it does not trust the programmer that much, and usually with good reason.

The AutoCloser class is the following:

public class AutoCloser<T> {

    private final T resource;

    private AutoCloser(T resource) {
        this.resource = resource;
    }

    public static <T> AutoCloser<T> useResource(T resource) {
        return new AutoCloser<>(resource);
    }

    public AutoClosableSupplier closeWith(Consumer<Supplier<T>> closer){
        return new AutoClosableSupplier(closer);
    }

    public class AutoClosableSupplier implements Supplier<T>, AutoCloseable {
        private final Consumer<Supplier<T>> closer;

        private AutoClosableSupplier(Consumer<Supplier<T>> closer) {
            this.closer = closer;
        }

        @Override
        public T get() {
            return resource;
        }

        @Override
        public void close() {
            closer.accept(this);
        }

    }
}

The inner AutoClosableSupplier class is used because we do not want the programmer to accidentally forget to specify the lambda that will finally close the resource.

This is nothing really serious. It is just a programming style that moves the closing of the resource close to the opening of the resource, a bit like the defer statement in the Go language.
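For comparison, the same "declare the cleanup next to the opening" effect can be had with the JDK alone, because AutoCloseable has a single abstract method and can therefore be the target of a method reference. The sketch below is such an alternative and not the AutoCloser class of this article; the classes Legacy and Closer are made up for the illustration. Closer narrows close() so that the try block does not have to deal with a checked exception.

```java
final class DisposeOnExitSketch {
    /** Made-up legacy class that has dispose() instead of close(). */
    static final class Legacy {
        boolean opened = true;

        void dispose() {
            opened = false;
        }
    }

    /** AutoCloseable narrowed so that close() throws no checked exception. */
    @FunctionalInterface
    interface Closer extends AutoCloseable {
        @Override
        void close();
    }

    /** Returns the state of the resource {inside the block, after the block}. */
    static boolean[] run() {
        final Legacy legacy = new Legacy();
        final boolean[] opened = new boolean[2];
        // The method reference is the resource: leaving the block calls dispose().
        try (Closer ignored = legacy::dispose) {
            opened[0] = legacy.opened;
        }
        opened[1] = legacy.opened;
        return opened;
    }
}
```

The price is that the resource has to be created before the try statement and stays in scope after it, which the AutoCloser class avoids.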

Lazy assignment in Java

Programmers are inherently lazy and, similis simili gaudet, they also like it when programs are lazy. Have you ever heard of lazy loading? Or the lazy singleton? (I personally prefer the single malt version though.) If you are programming in Scala or Kotlin, which are also JVM languages, you can even evaluate expressions in a lazy way.

If you are programming in Scala you can write

lazy val z = "Hello"

and the expression will only be evaluated when z is accessed the first time. If you program in Kotlin you can write something like

val z: String by lazy { "Hello" }

and the expression will only be evaluated when z is accessed the first time.

Java does not support that lazy evaluation per se, but being a powerful language it provides language elements that you can use to have the same result. While Scala and Kotlin give you the fish, Java teaches you to catch your own fish. (Let’s put a pin in this thought.)

What really happens in the background, when you code the above lines in Scala and/or Kotlin, is that the expression is not evaluated and the variable will not hold the result of the expression. Instead, the languages create some virtual “lambda” expressions, a ‘supplier’ that will later be used to calculate the value of the expression.

We can do that ourselves in Java. We can use a simple class, Lazy, that provides the functionality:

import java.util.function.Supplier;

public class Lazy<T> implements Supplier<T> {

    final private Supplier<T> supplier;
    private boolean supplied = false;
    private T value;

    private Lazy(Supplier<T> supplier) {
        this.supplier = supplier;
    }

    public static <T> Lazy<T> let(Supplier<T> supplier) {
        return new Lazy<>(supplier);
    }

    @Override
    public T get() {
        if (supplied) {
            return value;
        }
        supplied = true;
        return value = supplier.get();
    }
}

The class has the public static method let() that can be used to define a supplier and this supplier is invoked the first time the method get() is invoked. With this class, you can write the above examples as

var z = Lazy.let( () -> "Hello" );

By the way, it seems to be even simpler than the Kotlin version. You can use the class from the library:

    <dependency>
        <groupId>com.javax0</groupId>
        <artifactId>lazylet</artifactId>
        <version>1.0.0</version>
    </dependency>

and then you do not need to copy the code into your project. This is a micro library that contains only this class with an inner class that makes Lazy usable in a multi-thread environment.

The use is simple as demonstrated in the unit tests:

private static class TestSupport {
    int count = 0;

    boolean callMe() {
        count++;
        return true;
    }
}

...

final var ts = new TestSupport();
var z = Lazy.let(ts::callMe);
if (false && z.get()) {
    Assertions.fail();
}
Assertions.assertEquals(0, ts.count);
z.get();
Assertions.assertEquals(1, ts.count);
z.get();
Assertions.assertEquals(1, ts.count);

To get the multi-thread safe version you can use the code:

final var ts = new TestSupport();
var z = Lazy.sync(ts::callMe);
if (false && z.get()) {
    Assertions.fail();
}
Assertions.assertEquals(0, ts.count);
z.get();
Assertions.assertEquals(1, ts.count);
z.get();
Assertions.assertEquals(1, ts.count);

and get a Lazy supplier that can be used by multiple threads and it is still guaranteed that the supplier passed as argument is evaluated only once.
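How can "evaluated only once" be guaranteed when several threads call get() at the same time? The simplest way I can think of is to make get() synchronized. The sketch below does that. It is not the code of the library, which may well use a different technique; the class SyncLazySketch and the helper evaluationsUnderContention() are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class SyncLazySketch<T> implements Supplier<T> {
    private final Supplier<T> supplier;
    private boolean supplied = false;
    private T value;

    private SyncLazySketch(Supplier<T> supplier) {
        this.supplier = supplier;
    }

    static <T> SyncLazySketch<T> sync(Supplier<T> supplier) {
        return new SyncLazySketch<>(supplier);
    }

    /** synchronized: only one thread at a time runs the check and the evaluation. */
    @Override
    public synchronized T get() {
        if (!supplied) {
            value = supplier.get();
            supplied = true;
        }
        return value;
    }

    /** Calls get() from several threads and returns how often the supplier ran. */
    static int evaluationsUnderContention(int threadCount) {
        final AtomicInteger evaluations = new AtomicInteger();
        final SyncLazySketch<Integer> lazy = sync(evaluations::incrementAndGet);
        final List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            final Thread thread = new Thread(lazy::get);
            threads.add(thread);
            thread.start();
        }
        for (final Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return evaluations.get();
    }
}
```

The cost of this simple version is that every call to get() takes the lock, even after the value has been calculated.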

Giving you a fish or teaching you to fish

I said to put a pin in the note “While Scala and Kotlin give you the fish, Java teaches you to catch your own fish.” Here comes what I meant by that.

Many programmers write programs without understanding how the programs are executed. They program in Java and they write nice and working code, but they have no idea how the underlying technology works. They have no idea about class loaders or garbage collection. Or they do, but they do not know anything about the machine code that the JIT compiler generates. Or they even know that, but they have no idea about the processor caches, different memory types and hardware architecture. Or they know that but have no knowledge about microelectronics and lithography, what the layout of the integrated circuits is, how the electrons move inside the semiconductor, and how quantum mechanics determines the non-deterministic inner working of the computer.

I do not say that you have to be a physicist and understand the intricate details of quantum mechanics to be a good programmer. I recommend, however, that you understand a few layers below your everyday working tools. If you use Kotlin or Scala it is absolutely okay to use the lazy structures they provide. They give a programming abstraction one level higher than what Java provides in this specific case. But it is vital to know what the implementation probably looks like. If you know how to fish you can buy the packaged fish because then you can tell when the fish is good. If you do not know how to fish you will rely on the mercy of those who give you the fish.

Creating a Java::Geci generator

A few days back I wrote about Java::Geci architecture, code generation philosophy and the possible different ways to generate Java source code.

In this article, I will talk about how simple it is to create a generator in Java::Geci.

Hello, World generator

HelloWorld1

The simplest ever generator is a Hello, World! generator. This will generate a method that prints Hello, World! to the standard output. To create this generator the Java class has to implement the Generator interface. The whole code of the generator is:

package javax0.geci.tutorials.hello;

import javax0.geci.api.GeciException;
import javax0.geci.api.Generator;
import javax0.geci.api.Source;

public class HelloWorldGenerator1 implements Generator {
    public void process(Source source) {
        try {
            final var segment = source.open("hello");
            segment.write_r("public static void hello(){");
            segment.write("System.out.println(\"Hello, World\");");
            segment.write_l("}");
        } catch (Exception e) {
            throw new GeciException(e);
        }
    }
}

This really is the whole generator class. There is no simplification or deleted lines. When the framework finds a file that needs the method hello() then it invokes process().

The method process() queries the segment named “hello”. This refers to the lines

    //<editor-fold id="hello">
    //</editor-fold>

in the source code. The segment object can be used to write lines into the code. The method write() writes a line. The method write_r() also writes a line, but it also signals that the lines following this one have to be indented. The opposite is write_l() which signals that already this line and the consecutive lines should be tabbed back to the previous position.

To use the generator we should have a class that needs it. This is

package javax0.geci.tutorials.hello;

public class HelloWorld1 {
    //<editor-fold id="hello">
    //</editor-fold>
}

We also need a test that will run the code generation every time we compile the code and thus run the unit tests:

package javax0.geci.tutorials.hello;

import javax0.geci.engine.Geci;
import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;

import static javax0.geci.api.Source.maven;

public class TestHelloWorld1 {

    @Test
    @DisplayName("Start code generator for HelloWorld1")
    void testGenerateCode() throws Exception {
        Assertions.assertFalse(new Geci()
                .only("^.*/HelloWorld1.java$")
                .register(new HelloWorldGenerator1()).generate(), Geci.FAILED);
    }
}

When the code has executed, the file HelloWorld1.java will be modified and will get the lines inserted between the editor folds:

package javax0.geci.tutorials.hello;

public class HelloWorld1 {
    //<editor-fold id="hello">
    public static void hello(){
        System.out.println("Hello, World");
    }
    //</editor-fold>
}

This is an extremely simple example that we can develop a bit further.

HelloWorld2

One thing that is sub-par in the example is that the scope of the generator is limited in the test by calling the only() method. It is a much better practice to let the framework scan all the files and select the source files that themselves signal in some way that they need the service of the generator. In the case of the “Hello, World!” generator it can be the existence of the hello segment as an editor fold in the source code. If it is there the code needs the method hello(), otherwise it does not. We can implement the second version of our generator that way. We also modify the implementation so that it does not simply implement the interface Generator but rather extends the abstract class AbstractGeneratorEx. The postfix Ex in the name suggests that this class handles exceptions for us. This abstract class implements the method process() and calls the to-be-defined processEx(), which has the same signature as process() but is allowed to throw an exception. If that happens then the exception is encapsulated in a GeciException just as we did in the first example.
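The pattern behind such an Ex base class is a small template method. The sketch below shows the idea with made-up names: GeneratorException stands in for GeciException and GeneratorEx for AbstractGeneratorEx, and the source is just a String. It is my illustration of the pattern and not the code of the framework.

```java
import java.io.IOException;

final class ExceptionWrappingSketch {
    /** Made-up unchecked exception, standing in for GeciException. */
    static final class GeneratorException extends RuntimeException {
        GeneratorException(Throwable cause) {
            super(cause);
        }
    }

    /** Made-up base class, standing in for AbstractGeneratorEx. */
    abstract static class GeneratorEx {
        /** The entry point the framework calls; it declares no checked exception. */
        public final void process(String source) {
            try {
                processEx(source);
            } catch (Exception e) {
                throw new GeneratorException(e);
            }
        }

        /** Same parameter as process(), but free to throw. */
        protected abstract void processEx(String source) throws Exception;
    }

    /** A generator whose processEx() always fails with a checked exception. */
    static final class Failing extends GeneratorEx {
        @Override
        protected void processEx(String source) throws IOException {
            throw new IOException("cannot read " + source);
        }
    }
}
```

Calling new Failing().process("X.java") throws a GeneratorException whose cause is the original IOException, so the concrete generator needs no try-catch block of its own.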

The code will look like the following:

package javax0.geci.tutorials.hello;

import javax0.geci.api.Source;
import javax0.geci.tools.AbstractGeneratorEx;

import java.io.IOException;

public class HelloWorldGenerator2 extends AbstractGeneratorEx {
    public void processEx(Source source) throws IOException {
        final var segment = source.open("hello");
        if (segment != null) {
            segment.write_r("public static void hello(){");
            segment.write("System.out.println(\"Hello, World\");");
            segment.write_l("}");
        }
    }
}

This is even simpler than the first one, although it checks the existence of the segment. When the code invokes source.open("hello"), the method returns null if there is no segment named hello in the source code. The actual code using the second generator is the same as the first one. When we run both tests in the codebase, they both generate code, fortunately identical.
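The division of labor between process() and processEx() is a template method that wraps checked exceptions. The following is a minimal sketch of that shape only. All type names in it (SketchGenerator, SketchException, SketchGeneratorEx, FailingGenerator) are hypothetical stand-ins, the source is simplified to a String, and none of it is the actual Java::Geci code.

```java
import java.io.IOException;

/** Stand-in for a generator interface whose method cannot throw checked exceptions. */
interface SketchGenerator {
    void process(String source);
}

/** Unchecked wrapper, playing the role that GeciException plays in the text. */
class SketchException extends RuntimeException {
    SketchException(Throwable cause) {
        super(cause);
    }
}

/** Implements process() once and delegates to processEx(), which may throw. */
abstract class SketchGeneratorEx implements SketchGenerator {
    @Override
    public final void process(String source) {
        try {
            processEx(source);
        } catch (Exception e) {
            // the checked exception leaves the generator wrapped into an unchecked one
            throw new SketchException(e);
        }
    }

    protected abstract void processEx(String source) throws Exception;
}

/** A concrete generator can now simply declare the exception it may throw. */
class FailingGenerator extends SketchGeneratorEx {
    @Override
    protected void processEx(String source) throws IOException {
        throw new IOException("cannot read " + source);
    }
}
```

Calling new FailingGenerator().process("HelloWorld2.java") ends with a SketchException whose cause is the original IOException, so the concrete generator needs no try/catch of its own.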

The test that invokes the second generator is

package javax0.geci.tutorials.hello;

import javax0.geci.engine.Geci;
import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;

import static javax0.geci.api.Source.maven;

public class TestHelloWorld2 {

    @Test
    @DisplayName("Start code generator for HelloWorld2")
    void testGenerateCode() throws Exception {
        Assertions.assertFalse(new Geci()
                .register(new HelloWorldGenerator2())
                .generate(), Geci.FAILED);
    }
}

Note that this time we did not need to limit the code scanning by calling the method only(). Also, the documentation of the method only(RegEx x) says that it is in the API of the generator builder only as a last resort.

HelloWorld3

The first and the second version of the generator are working on text files and do not use the fact that the code we modify is actually Java. The third version of the generator will rely on this fact and that way it will be possible to create a generator, which can be configured in the class that needs the code generation.

To do that we can extend the abstract class AbstractJavaGenerator. This abstract class finds the class that corresponds to the source code and also reads the configuration encoded in annotations on the class as we will see. The abstract class implementation of processEx() invokes the process(Source source, Class klass, CompoundParams global) only if the source code is a Java file, there is an already compiled class (sorry compiler, we may modify the source code now so there may be a need to recompile) and the class is annotated appropriately.

The generator code is the following:

package javax0.geci.tutorials.hello;

import javax0.geci.api.Source;
import javax0.geci.tools.AbstractJavaGenerator;
import javax0.geci.tools.CompoundParams;

import java.io.IOException;

public class HelloWorldGenerator3 extends AbstractJavaGenerator {
    public void process(Source source, Class<?> klass, CompoundParams global)
            throws IOException {
        final var segment = source.open(global.get("id"));
        final var methodName = global.get("methodName", "hello");
        segment.write_r("public static void %s(){", methodName);
        segment.write("System.out.println(\"Hello, World\");");
        segment.write_l("}");
    }

    public String mnemonic() {
        return "HelloWorld3";
    }
}

The method process() (an overloaded version of the method defined in the interface) gets three arguments. The first one is the very same Source object as in the first example. The second one is the Class that was created from the Java source file we are working on. The third one is the configuration that the framework was reading from the class annotation. This also needs the support of the method mnemonic(). This identifies the name of the generator. It is a string used as a reference in the configuration. It has to be unique.

A Java class that needs to be modified by a generator has to be annotated using the Geci annotation. The Geci annotation is defined in the library as javax0.geci.annotations.Geci. The source code to be extended with the generated code will look like the following:

package javax0.geci.tutorials.hello;

import javax0.geci.annotations.Geci;

@Geci("HelloWorld3 id='hallo' methodName='hiya'")
public class HelloWorld3 {
    //<editor-fold id="hallo">
    //</editor-fold>
}

Here there is a bit of a nuisance. Java::Geci is a test phase tool and all the dependencies on it are test dependencies. The exception is the annotations library. This library has to be a normal dependency because the classes that use the code generation are annotated with this annotation, and therefore the JVM will look for the annotation class during run-time, even though the annotation has no role during run-time. For the JVM, test execution is just run-time; there is no difference.

To overcome this, Java::Geci lets you use any annotation as long as the name of the annotation interface is Geci and it has a value, which is a String. This way we can use the third hello world generator the following way:

package javax0.geci.tutorials.hello;

import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

@HelloWorld3a.Geci(value = "HelloWorld3 id='hallo'", methodName = "hiyaHuya")
public class HelloWorld3a {
    //<editor-fold id="hallo">
    //</editor-fold>

    @Retention(RetentionPolicy.RUNTIME)
    @interface Geci {
        String value();

        String methodName() default "hello";
    }
}

Note that in the previous example the parameters id and methodName were defined inside the value string (which is the default parameter if you do not define any other parameters in an annotation). In that case, the parameters can easily be misspelled and the IDE does not give you any support for the parameters simply because the IDE does not know anything about the format of the string that configures Java::Geci. On the other hand, if you have your own annotations you are free to define any named parameters. In this example, we defined the method methodName in the interface. Java::Geci is reading the parameters of the annotation as well as parsing the value string for parameters. That way some generators may use their own annotations that help the users with the parameters defined as annotation parameters.
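How a framework can accept any annotation as long as its simple name is Geci can be sketched with plain reflection. The class GeciLikeAnnotationReader and the nested Sample class are hypothetical and exist only for this illustration. The sketch collects the String parameters of the annotation, it does not parse the value string, and it is not the code Java::Geci uses.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical reader: finds annotations by simple name instead of by type. */
final class GeciLikeAnnotationReader {

    /** Collects the String-valued parameters of every annotation whose simple name is "Geci". */
    static Map<String, String> read(Class<?> klass) {
        final Map<String, String> params = new TreeMap<>();
        for (Annotation annotation : klass.getDeclaredAnnotations()) {
            final Class<? extends Annotation> type = annotation.annotationType();
            if (!type.getSimpleName().equals("Geci")) {
                continue;
            }
            for (Method parameter : type.getDeclaredMethods()) {
                if (parameter.getReturnType() == String.class && parameter.getParameterCount() == 0) {
                    try {
                        parameter.setAccessible(true); // the annotation interface need not be public
                        params.put(parameter.getName(), (String) parameter.invoke(annotation));
                    } catch (ReflectiveOperationException e) {
                        throw new IllegalStateException(e);
                    }
                }
            }
        }
        return params;
    }

    /** Sample class carrying its own, locally declared Geci annotation. */
    @Sample.Geci(value = "HelloWorld3 id='hallo'", methodName = "hiyaHuya")
    static class Sample {
        @Retention(RetentionPolicy.RUNTIME)
        @interface Geci {
            String value();

            String methodName() default "hello";
        }
    }
}
```

For the Sample class, read() returns a map with the keys methodName and value. The retention policy has to be RUNTIME, otherwise reflection would not see the annotation at all.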

The last version of our third “Hello, World!” application is perhaps the simplest:

package javax0.geci.tutorials.hello;

import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class HelloWorld3b {
    //<editor-fold id="HelloWorld3" methodName = "hiyaNyunad">
    //</editor-fold>
}

There is no annotation on the class, and there is no comment that would look like an annotation. The only thing that is there is an editor-fold segment that has the id HelloWorld3, which is the mnemonic of the generator. If it exists there, the AbstractJavaGenerator realizes that and reads the parameters from there. (By the way, it reads extra parameters that are not present on the annotation even if the annotation is present.) It not only reads the parameters but also calls the concrete implementation, so the code is generated. This approach is the simplest and can be used for code generators that need only a single segment to generate the code into, and that do not need separate configuration options for the methods and fields in the class.
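Reading parameters from the opening line of an editor-fold can be illustrated with two regular expressions. This is a sketch under the assumption that the parameters are written as name="value" pairs, as in the example above. EditorFoldParamsSketch is a hypothetical name and the parsing in Java::Geci itself may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for the attributes of an editor-fold opening comment. */
final class EditorFoldParamsSketch {
    // the opening comment with everything between "<editor-fold" and ">" captured
    private static final Pattern FOLD = Pattern.compile("//\\s*<editor-fold\\s+([^>]*)>");
    // one name="value" pair, spaces around the equals sign are tolerated
    private static final Pattern ATTRIBUTE = Pattern.compile("(\\w+)\\s*=\\s*\"([^\"]*)\"");

    /** Returns the attributes in the order they appear, or an empty map for any other line. */
    static Map<String, String> parse(String line) {
        final Map<String, String> params = new LinkedHashMap<>();
        final Matcher fold = FOLD.matcher(line);
        if (fold.find()) {
            final Matcher attribute = ATTRIBUTE.matcher(fold.group(1));
            while (attribute.find()) {
                params.put(attribute.group(1), attribute.group(2));
            }
        }
        return params;
    }
}
```

For the fold line of HelloWorld3b, parse() yields id mapped to HelloWorld3 and methodName mapped to hiyaNyunad, while the closing //</editor-fold> line yields an empty map.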

Summary

In this article, I described how you can write your own generator and we also delved into how the annotations can be used to configure the class that needs generated code. Note that some of the features discussed in this article may not be in the release version but you can download and build the (b)leading edge version from https://github.com/verhas/javageci.

Handling exceptions functional style

Java has supported checked exceptions from the very start. With Java 8, the lambda language element and the RT library modifications supporting stream operations introduced the functional programming style to the language. Functional style and exceptions are not really good friends. In this article, I will describe a simple library that handles exceptions somewhat similarly to how null is handled using Optional.

The library works (after all it is a single Class and some inner classes, but really not many). On the other hand, I am not absolutely sure that using the library will not deteriorate the programming style of the average programmer. It may happen that someone having a hammer sees everything as a nail. A hammer is not a good pedicure tool. Have a look at this library more like an idea and not as a final tool that tells you how to create perfect code handling exceptions.

Also, come and listen to the presentation of Michael Feathers about exceptions May 6, 2019, Zürich https://www.jug.ch/html/events/2019/exceptions.html

Handling Checked Exception

Checked exceptions have to be declared or caught like a cold. This is a major difference from null. Evaluating an expression can silently be null but it cannot silently throw a checked exception. When the result is null then we may use that to signal that there is no value or we can check that and use a “default” value instead of null. The code pattern doing that is

var x = expression;
if( x == null ){
  x = default expression that is really never null
}

The pattern topology is the same in case the evaluation of the expression can throw a checked exception, although the Java syntax is a bit different:

Type x; // you cannot use 'var' here
try{
  x = expression
}catch(Exception weHardlyEverUseThisValue){
  x = default expression that does not throw exception
}
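A concrete, compilable instance of this try/catch pattern could look like the sketch below. The class name, the method name and the fallback value are assumptions chosen only for the illustration. The checked exception is the URISyntaxException that the constructor of java.net.URI declares.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical example of "use a default when the expression throws". */
final class DefaultOnExceptionSketch {
    static final URI FALLBACK = URI.create("http://localhost/");

    static URI parseOrDefault(String text) {
        URI uri; // 'var' is not possible here: there is no initializer to infer the type from
        try {
            uri = new URI(text); // may throw the checked URISyntaxException
        } catch (URISyntaxException weHardlyEverUseThisValue) {
            uri = FALLBACK; // the default expression does not throw
        }
        return uri;
    }
}
```

With this, parseOrDefault("not a uri") returns the fallback because a space is not a legal character in a URI, while a well-formed string is parsed normally.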

The structure can be more complex if the second expression can also be null or may throw an exception and we need a third expression, or even more expressions, to evaluate in case the former ones failed. This is especially nasty in the case of exception-throwing expressions because of the deeply nested brackets:

Type x; // you cannot use 'var' here
try{
  try {
    x = expression1
  }catch(Exception e){
  try {
    x = expression2
  }catch(Exception e){
  try {
    x = expression3
  }catch(Exception e){
    x = expression4
  }}}}catch(Exception e){
  x = default expression that does not throw exception
}

In the case of null handling, we have Optional. It is not a perfect fix for the million dollar problem (which is the name given to designing a language that has null, and which is also an underestimation), but it makes life a bit better if used well. (And much worse if used in the wrong way; you are free to say that what I describe in this article is exactly that.)

In the case of null resulting expressions, you can write

var x = Optional.ofNullable(expression)
         .orElse(default expression that is never null);

You can also write

var x = Optional.ofNullable(expression1)
    .or( () -> Optional.ofNullable(expression2))
    .or( () -> Optional.ofNullable(expression3))
    .or( () -> Optional.ofNullable(expression4))
...
    .orElse(default expression that is never null);

when you have many alternatives for the value. But you cannot do the same thing in case the expression throws an exception. Or can you?

Exceptional

The library Exceptional (https://github.com/verhas/exceptional)

<groupId>com.javax0</groupId>
<artifactId>exceptional</artifactId>
<version>1.0.0</version>

implements all the methods that Optional has, plus one more, and implements some of them a bit differently, so that it can be used for exceptions the same way as Optional was used above for null values.

You can create an Exceptional value using Exceptional.of() or Exceptional.ofNullable(). The important difference is that the argument is not the value but rather a supplier that provides the value. This supplier is not the JDK Supplier, because that one cannot throw a checked exception, and that way the whole library would be useless. This supplier has to be an Exceptional.ThrowingSupplier, which is exactly the same as the JDK Supplier except that its method get() may throw an Exception. (Also note that it is only Exception and not Throwable, which you should catch only as often as you catch a red-hot iron ball with bare hands.)

What you can write in this case is

var x = Exceptional.of(() -> expression) // you CAN use 'var' here
    .orElse(default expression that does not throw exception);

It is shorter and shorter is usually more readable. (Or not? That is why APL is so popular? Or is it? What is APL you ask?)

If you have multiple alternatives you can write

var x = Exceptional.of(() -> expression1) // you CAN use 'var' here
    .or(() -> expression2)
    .or(() -> expression3) // these are also ThrowingSupplier expressions
    .or(() -> expression4)
...
    .orElse(default expression that does not throw exception);

In case some of the suppliers may return null instead of throwing an exception, there are ofNullable() and orNullable() variants of the methods. (The method orNullable() does not exist in Optional, but here it makes sense, if the whole library does at all.)
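To make the idea of a throwing supplier and the of/or/orElse chain tangible, here is a deliberately tiny sketch. It is not the code of the Exceptional library: TrySketch is a hypothetical class, it implements only three methods, and for brevity it treats a null result the same way as a thrown exception.

```java
/** A tiny illustration of the idea; NOT the implementation of the Exceptional library. */
final class TrySketch<T> {

    /** Like java.util.function.Supplier, but get() may throw a checked exception. */
    @FunctionalInterface
    interface ThrowingSupplier<T> {
        T get() throws Exception;
    }

    private final T value; // null means: no value, the supplier failed or returned null

    private TrySketch(T value) {
        this.value = value;
    }

    /** Evaluates the supplier and remembers its result, or nothing if it threw. */
    static <T> TrySketch<T> of(ThrowingSupplier<? extends T> supplier) {
        try {
            return new TrySketch<>(supplier.get());
        } catch (Exception e) {
            return new TrySketch<>(null);
        }
    }

    /** Tries the next supplier only when there is no value yet. */
    TrySketch<T> or(ThrowingSupplier<? extends T> supplier) {
        return value != null ? this : of(supplier);
    }

    /** Returns the value, or the given default when every supplier failed. */
    T orElse(T other) {
        return value != null ? value : other;
    }
}
```

A chain such as TrySketch.<String>of(() -> { throw new java.io.IOException(); }).or(() -> "second").orElse("default") evaluates to "second". Only Exception is caught, so an Error still propagates, in line with the remark about Throwable above.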

If you are familiar with Optional and use the more advanced methods like ifPresent(), ifPresentOrElse(), orElseThrow(), stream(), map(), flatMap(), filter() then it will not be difficult to use Exceptional. Similar methods with the same name exist in the class. The difference again is that in case the argument for the method in Optional is a Function then it is ThrowingFunction in case of Exceptional. Using that possibility you can write code like

    private int getEvenAfterOdd(int i) throws Exception {
        if( i % 2 == 0 ){
            throw new Exception();
        }
        return 1;
    }

    @Test
    @DisplayName("some odd example")
    void testToString() {
        Assertions.assertEquals("1",
                Exceptional.of(() -> getEvenAfterOdd(1))
                        .map(i -> getEvenAfterOdd(i+1))
                        .or( () -> getEvenAfterOdd(1))
                .map(i -> i.toString()).orElse("something")
        );
    }

It is also possible to handle the exceptions in functional expressions like in the following example:

    // inc() is assumed here to be a simple increment that is declared
    // to throw a checked exception; the test below expects inc(13) == 14
    private int inc(int i) throws Exception {
        return i + 1;
    }

    @Test
    void avoidExceptionsForSuppliers() {
        Assertions.assertEquals(14,
                (int) Optional.of(13).map(i ->
                        Exceptional.of(() -> inc(i))
                                .orElse(0)).orElse(15));
    }

Last, but not least you can mimic the ?. operator of Groovy writing

a.b.c.d.e.f

expressions, where all the variables/fields may be null and accessing the next field through them causes an NPE. You can, however, write

var x = Exceptional.ofNullable( () -> a.b.c.d.e.f).orElse(null);
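For comparison, the same null-tolerant navigation can be sketched with the JDK alone. The classes Person and Address and the helper orNull() are hypothetical and serve only the illustration. The first method is the usual Optional.map() chain, the second one mimics the approach shown above by turning a NullPointerException into null.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical example classes for navigating a chain that may contain null. */
final class NullSafeChainSketch {

    static class Address {
        final String city;

        Address(String city) {
            this.city = city;
        }
    }

    static class Person {
        final Address address;

        Person(Address address) {
            this.address = address;
        }
    }

    /** Each map() step is skipped as soon as a null has been met. */
    static String cityWithOptional(Person person) {
        return Optional.ofNullable(person)
                .map(p -> p.address)
                .map(a -> a.city)
                .orElse(null);
    }

    /** Evaluates the whole chain at once and turns a NullPointerException into null. */
    static <T> T orNull(Supplier<T> chain) {
        try {
            return chain.get();
        } catch (NullPointerException e) {
            return null;
        }
    }
}
```

The second form is shorter at the call site, for example orNull(() -> person.address.city), but it also hides a NullPointerException that comes from a genuine bug somewhere inside the chain. That is one more reason to handle this hammer with care.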

Summary

Remember what I told you about the hammer. Use with care and for the greater good and other BS.

How to generate source code?

In this article, I will talk about the different phases of software development where the source code can be generated programmatically and I will compare the different approaches. I will also describe the architecture and the ideas (the kind of eureka moment) of a specific tool that generates code at a specific phase.

Manually

This is the answer to the question set in the title. Whenever it is possible for the purpose at hand, you should write the code manually. I wrote an article about code generation a year ago and I have not changed my mind.

You should not generate code unless you really have to.

Weird statement, especially when I promote a FOSS tool that is exactly targeting Java code generation. I know, and still, the statement is that you have to write all the code you can manually. Unfortunately, or for the sake of my little tool, there are enough occasions when manual code generation is not an option, or at least automated code generation seems to be a better option.

Why generate manually

I discussed it already in the referenced article, but here we go again. When the best option is to generate source code then there is something wrong or at least suboptimal in the system.

  • the developer creating the code is sub-par,
  • the programming language is sub-par, or
  • the environment, some framework is sub-par.

Do not feel offended. When I talk about the “sub-par developer” I do not mean you. You are well above the average developer, not least because you are open to and interested in new things, as proven by the fact that you are reading this article. However, when you write code you should also consider the average developer, Joe or Jane, who will maintain your program some time in the future. And there is a very specific feature of average developers: they are not good. They are not bad either, but they are, as the name suggests, average.

Legend of the sub-par developer

It may happen to you what has happened to me a few years back. It went like the following.

Solving a problem, I created a mini-framework. Not really a framework like Spring or Hibernate, because a single developer cannot develop anything like that. (That does not stop some of them from trying, even in a professional environment, which is contradictory, as it is not professional.) You need a team. What I created was a single class that did some reflection “magic”, converting objects to maps and back. Before that, we had toMap() and fromMap() methods in all classes that needed this functionality. They were created and maintained manually.

Luckily I was not alone. I had a team. They told me to scrap the code I wrote and keep creating the toMap() and fromMap() methods manually. The reason is that the code has to be maintained by the developers who come after us. And we do not know them, as they have not even been selected yet. They may still be studying at university or may not even be born yet. We know one thing: they will be average developers, and the code I created needs a tad more than average skills. On the other hand, maintaining the handcrafted toMap() and fromMap() methods does not require more than average skill, though the maintenance is error-prone. But that is only a cost issue: it needs a bit more investment in QA, which costs significantly less than hiring ace senior developers.

You can imagine my ambivalent feelings as my brilliant code was refused but with a cushion that praised my ego. I have to say, they were right.

Sub-par framework

Well, many frameworks are in this sense sub-par. Maybe the expression “sub-par” is not really the best. For example, you generate Java code from a WSDL file. Why does the framework generate source code instead of Java byte-code? There is a good reason.

Generating byte code is complex and needs special knowledge. It has a cost associated with it. It needs a byte-code generation library like Byte Buddy, the result is more difficult to debug for the programmer using the code, and it is somewhat JVM version dependent. When the code is generated as Java source, even if it targets a later version of Java than the lagging version the project uses, the chances are better that the project can somehow downgrade the generated code than if it were byte code.

Sub-par language

Obviously, we are not talking about Java in this case, because Java is the best in the world and there is nothing better. Or is it? If anyone claims about just any programming language that the language is perfect, ignore that person. Every language has strengths and weaknesses. Java is no different. The language was designed more than 20 years ago and, according to its development philosophy, has kept backward compatibility very strictly. This alone implies that there must be some areas that are better in other languages.

Think about the equals() and hashCode() methods that are defined in the class Object and can be overridden in any class. There is not much invention in overriding either of them. The overridden implementations are fairly standard. In fact, they are so standard that the integrated development environments each support generating code for them. Why should we generate code for them? Why are they not part of the language in some declarative way? Those are questions that should have very good answers, because it would really not be a big deal to implement things like that in the language, and still they are not there. There has to be a good reason, but I am not the best person to write about it.

As a summary of this part: if you cannot rely on the manually generated code, you can be sure that something is sub-par. This is not a shame. This is just how our profession generally is. This is how nature goes. There is no ideal solution, we have to live with compromises.

Then the next question is,

When to generate code?

Code generation principally can happen:

  • (BC) before compilation
  • (DC) during compilation
  • (DT) during the test phase
  • (DCL) during class loading
  • (DRT) during run-time

In the following, we will discuss these different cases.

(BC) Before compilation

The conventional phase is before compilation. In that case, the code generator reads some configuration or maybe the source code and generates Java code usually into a specific directory separated from the manual source code.

In this case, the generated source code is not part of the code that gets into the version control system. Code maintenance has to deal with the code generation and it is hardly an option to omit the code generator from the process and go on maintaining the code manually.

The code generator does not have easy access to the Java code structure. If the generated code has to use, extend or supplement in any way the already existing manual code, then it has to analyze the Java source. It can be done line by line or using some parser. Either way, this is a task that will be done again by the Java compiler later, and there is also a slight chance that the Java compiler and the tool used to parse the code for the code generator are not 100% compatible.

(DC) During compilation

Java makes it possible to create so-called Annotation Processors that are invoked by the compiler. These can generate code during the compilation phase and the compiler will compile the generated classes. That way the code generation is part of the compilation phase.

The code generators running in this phase cannot access the compiled code, but they can access the compiled structure through an API that the Java compiler provides for the annotation processors.

It is possible to generate new classes, but it is not possible to modify the existing source code.

(DT) During the test phase

At first, it seems to be a bit off. Why would anyone want to execute code generation during the test phase? However, the FOSS I try to “sell” here does exactly that, and I will detail the possibility, the advantages and, honestly, the disadvantages of code generation in this phase.

(DCL) During class loading

It is also possible to modify the code during the class loading. The programs that do this are called Java Agents. They are not real code generators. They work on the byte code level and modify the already compiled code.

(DRT) During run-time

Some code generators work during run-time. Many of these applications generate Java bytecode directly and load the code into the running application. It is also possible to generate Java source code, compile the code and load the resulting bytes into the JVM.

Generating Code in Test Phase

This is the phase when and where Java::Geci (Java GEnerate Code Inline) generates the code. To help you understand how one arrives at the weird idea of executing code generation during unit tests (when it is already too late: the code is already compiled), let me tell you another story. The story is made up, it never happened, but that does not diminish its explanatory power.

We had code with several data classes, each with several fields. We had to create the equals() and hashCode() methods for each of these classes. This, eventually, meant code redundancy. When a class changed, that is, a field was added or deleted, the methods had to be changed as well. Deleting a field was not a problem: the compiler does not compile an equals() or hashCode() method that refers to a non-existent field. On the other hand, the compiler does not mind a method that does NOT refer to a newly added field.

From time to time we forgot to update these methods and we tried to invent more and more complex and better ways to counteract the error-prone human coding. The weirdest idea was to create an MD5 value of the field names and have this inserted as a comment into the equals() and hashCode() methods. In case there was a change in the fields then a test could check that the value in the source code is different from the one calculated from the names of the fields and then signal an error: unit test fails. We never implemented it.
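Had we implemented it, the fingerprint calculation might have looked something like the following sketch. FieldFingerprintSketch and the two sample classes are hypothetical names. The field names are sorted because getDeclaredFields() does not promise any particular order.

```java
import java.lang.reflect.Field;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.stream.Collectors;

/** Hypothetical helper: an MD5 fingerprint of the field names of a class. */
final class FieldFingerprintSketch {

    static String fingerprint(Class<?> klass) {
        // sorted names make the result independent of the reflection order
        final String names = Arrays.stream(klass.getDeclaredFields())
                .filter(field -> !field.isSynthetic())
                .map(Field::getName)
                .sorted()
                .collect(Collectors.joining(","));
        try {
            final byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(names.getBytes(StandardCharsets.UTF_8));
            final StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // every Java platform implementation has to support MD5
            throw new IllegalStateException(e);
        }
    }

    static class PointV1 {
        int x;
        int y;
    }

    static class PointV2 {
        int x;
        int y;
        int z;
    }
}
```

A test would compare fingerprint(SomeClass.class) with the value stored in the comment of equals() and hashCode(). Adding the field z changes the fingerprint, so the stale methods would be detected.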

The even weirder idea, which turned out to be not that weird and finally led to Java::Geci, is to actually create the text of the expected equals() and hashCode() methods during the test from the fields available via reflection and compare it to what is already in the code. If they do not match, then the methods have to be regenerated. However, at this point the code has already been regenerated. The only issue is that it is in the memory of the JVM and not in the file that contains the source code. Why just signal an error and tell the programmer to regenerate the code? Why does the test not write back the change? After all, we humans should tell the computer what to do and not the other way around!

And this was the epiphany that led to Java::Geci.
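The core of that idea, creating the expected method text from the fields via reflection and comparing it with what is already there, can be sketched in a few lines. EqualsHashCodeSketch, its methods and the sample class Person are hypothetical. The real generators of Java::Geci are more elaborate, and the layout of the generated text is just an assumption.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical illustration of generating equals() and hashCode() text via reflection. */
final class EqualsHashCodeSketch {

    /** Builds the source text of equals() and hashCode() from the instance fields. */
    static String generate(Class<?> klass) {
        // sorted, because the same class must always produce exactly the same text
        final List<String> names = Arrays.stream(klass.getDeclaredFields())
                .filter(f -> !f.isSynthetic() && !Modifier.isStatic(f.getModifiers()))
                .map(Field::getName)
                .sorted()
                .collect(Collectors.toList());
        final String type = klass.getSimpleName();
        final String comparison = names.isEmpty() ? "true" : names.stream()
                .map(n -> "java.util.Objects.equals(" + n + ", that." + n + ")")
                .collect(Collectors.joining(" && "));
        return String.join("\n",
                "@Override",
                "public boolean equals(Object o) {",
                "    if (this == o) return true;",
                "    if (o == null || getClass() != o.getClass()) return false;",
                "    " + type + " that = (" + type + ") o;",
                "    return " + comparison + ";",
                "}",
                "",
                "@Override",
                "public int hashCode() {",
                "    return java.util.Objects.hash(" + String.join(", ", names) + ");",
                "}");
    }

    /** True when the text already in the source is what would be generated now. */
    static boolean isUpToDate(Class<?> klass, String existing) {
        return generate(klass).equals(existing);
    }

    static class Person {
        String name;
        int age;
    }
}
```

When isUpToDate() returns false, the text returned by generate() is exactly the regenerated code that sits in the memory of the JVM, ready to be written back.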

Java::Geci Architecture

Java::Geci generates code in the middle of the compilation, deployment, execution life cycle. Java::Geci is started when the unit tests are running during the build phase.

This means that the manual and previously generated code is already compiled and is available for the code generator via reflection.

Executing code generation during the test phase has another advantage. Any code generation that runs later should generate only code that is orthogonal to the functionality of the manual code. What does that mean? The generated code should not modify or interfere in any way with the existing manually created code in a manner that the unit tests could discover. The reason is that code generation in any later phase happens after the unit tests have run, and thus there is no possibility to test whether the generated code affects the behavior of the code in any undesired way.

Generating code during the test has the possibility to test the code as a whole taking the manual as well as the generated code into consideration. The generated code itself should not be tested, per se, (that is the task of the test of the code generator project) but the behavior of the manual code that the programmers wrote may depend on the generated code and thus the execution of the tests may depend on the generated code.

To ensure that all the tests are OK with the generated code, the compilation and the tests should be executed again in case there was any new code generated. To ensure this the code generation is invoked from a test and the test fails in case new code was generated.

To get this correct, the code generation in Java::Geci is usually invoked from a three-line unit test that has the structure:

Assertions.assertFalse(...generate(...),"code has changed, recompile!");

The call to generate(...) is a chain of method calls configuring the framework and the generators and when executed the framework decides if the generated code is different or not from the already existing code. It writes Java code back to the source code if the code changed but leaves the code intact in case the generated code has not changed.

The method generate(), which is the final call in the chain, returns true if any code was changed and written back to the source code. This will fail the test, but if we run the test again with the already modified sources then the test should run fine.

This structure has some constraints on the generators:

  • Generators should generate exactly the same code if they are executed on the same source and classes. This is usually not a strong requirement, code generators do not tend to generate random source. Some code generators may want to insert timestamps as a comment in the code: they should not.
  • The generated code becomes part of the source and is not a compile-time artifact. This is usually the case for all code generators that generate code into already existing class sources. Java::Geci can generate separate files but it was designed mainly for inline code generation (hence the name).
  • The generated code has to be saved to the repository and the manual source along with the generated code has to be in a state that does not need further code generation. This ensures that the CI server in the development can work with the original workflow: fetch – compile – test – commit artifacts to the repo. The code generation was already done on the developer machine and the code generator on the CI only ensures that it was really done (or else the test fails).

Note that the fact that the code is generated on a developer machine does not violate the rule that the build should be machine independent. If there were any machine dependency, the code generation would result in different code on the CI server and thus the build would break.
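The write-back rule described above (touch the file only when the generated text differs, and report whether anything was written) can be sketched as follows. WriteBackSketch and writeIfChanged() are hypothetical names. The sketch works on whole files, whereas Java::Geci works with segments inside the source.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: writes a file only when its content would change. */
final class WriteBackSketch {

    /** Returns true if the file was created or modified, false if it was already up to date. */
    static boolean writeIfChanged(Path file, String generated) {
        try {
            final String current = Files.exists(file)
                    ? new String(Files.readAllBytes(file), StandardCharsets.UTF_8)
                    : null;
            if (generated.equals(current)) {
                return false; // nothing to do, the source is left intact
            }
            Files.write(file, generated.getBytes(StandardCharsets.UTF_8));
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A test built on it would assert that writeIfChanged() returns false. The first run after a change writes the file and fails, the second run finds the file up to date and passes, which is the same cycle a CI server relies on.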

Code Generation API

The code generator applications should be simple. The framework has to do all the tasks that are the same for most of the code generators, and should provide support or else what is the duty of the framework?

Java::Geci does many things for the code generators:

  • handles the configuration of the file sets used to find the source files,
  • scans the source directories and finds the source code files,
  • reads the files and, if they are Java sources, helps to find the class that corresponds to the source code,
  • supports reflection calls in a way that helps deterministic code generation,
  • provides unified configuration handling,
  • supports Java source code generation in different ways,
  • modifies the source files only when they changed, and writes back the changes,
  • provides fully functional sample code generators, one of which is a full-fledged Fluent API generator that alone could be a whole project,
  • supports Jamal templating and code generation.

Summary

Reading this article, you got a picture of how Java::Geci works. You can start using it by visiting the GitHub home page of Java::Geci. I will also deliver a talk about this topic in Mainz at the JAX conference on Wednesday, May 8, 2019, 18:15 – 19:15.

In the coming weeks, I plan to write more articles about the design considerations and actual solutions I followed in Java::Geci.

You are encouraged to contact me, fork the code, create tickets, and follow me on Twitter, LinkedIn, or whatnot. It is fun.