ValueTypes video, a must-see for juniors

JAX-TV uploaded the recording of the talk I presented this year at W-JAX in Mainz.

The topic is Java ValueTypes. The talk, however, starts with how the CPU perceives time: what are the ratios between the time intervals when we read something from the CPU cache, from memory, from an SSD, or even over the network? This knowledge is needed to understand the importance of value types.

The talk uses the same metaphor (the CPU as a bureaucrat and the clock as the heartbeat) that I also used in my book. It is very graphic and easy to remember; therefore, I think this is a good listen for every junior, and perhaps even for some of the seniors.


Java Projects: Book Review

This article is about the book

Java Projects Second Edition, by Peter Verhas

that I wrote last year. The aim of such an article is usually to boost the sales of the book. It is no different in this case, but since this is a book that I wrote, and I am the person writing the review, it would be extremely awkward to praise the book. So I will not, although I like this book a lot. I think loving your own product, at least at the time it is ready, is a must. You may think about it differently later, as I do about the previous edition of the same book, which could have been better. But then again, that is why there is a second edition, in addition to the fact that Java developed in the meantime from Java 9 to Java 11. But back to the previous thought: you have to love your product when it is finished, otherwise you can just throw it away. If you do not like it, no one else will. It is also important to love your work while you are working on it. And I did: I enjoyed creating this book.

Thus now I will write about the book: what it is, and what I intended it to be. Later in the article, I will also talk about how I worked on the book, some technicalities, and some secrets. (They are not that much of a secret if I publish them here, are they?) But before those, here are the URLs where you can buy my book, at PACKT, Amazon, etc.

Intended Audience and Content of the Book

In agreement with the publisher, I wanted to write a book for those who want to learn Java but already have some programming experience. I did not want to write about the simple notions of variables, loops, and conditional constructs. I wanted to write a book that teaches you Java and a bit of programming. I wanted a book that any PHP, Python, C#, C, C++, Go, etc. programmer fresh out of uni can read, learn some Java programming from, and then decide whether it is for them or not. I wanted to dedicate the last chapter to non-Java programming topics, like what can happen later in your career if you start as a programmer. You can stay a programmer, or become an architect, project manager, or DevOps engineer. There are many possibilities based on opportunities and interest. This intention was met with less agreement from the publisher's side, but they accepted that my hands are the ones that hit the keyboard, and we came to a compromise. So the last chapter is also about some technical topics, like Java agents, polyglot programming, annotation processing, DSLs, SDLC and so on.

Content of the book

The book has ten chapters in a bit more than 500 pages.

  • Chapter ONE

is how you get started. To start you need to install the Java environment and you have to get familiar with the command line tools. This is a bit cumbersome and in the case of Java, it is more complex than it is with other languages. I have some friends who started to learn Java using this book and struggled with it (not because of the book, but because of the complexity of the task). When you start learning Java you have to be patient at this point and you must have a strong belief that it will work.

  • Chapter TWO

is about the supporting tools and the basic language elements. Even though the book is for those who can already program in some programming language, the text has to describe the basic elements of the language like variables, classes, methods, types, expressions, loops and so on. You can see how complex Java is: it takes until the third chapter before we start to program anything real.

  • Chapter THREE

is where we start programming something more complex than a “Hello, World”. The program is a sorting program, and we implement not only the simple bubble sort but also quicksort. Along the way, we also touch on topics like generics, TDD, unit tests, and Java modules. These are advanced topics that were originally planned for later chapters, but I wanted to explain the language less and programming with the language more.

  • Chapter FOUR

introduces a new program and new topics. In this chapter, we (I imagine the reader and I) develop the game Mastermind. The user sitting in front of the computer “hides” the pins and the program finds out what is hidden. The same chapter talks about collections, dependency injection, and integration tests.

  • Chapter FIVE

is the one I am most proud of. It is about concurrent programming. Many books use an example that scales well. You run it on one processor and it runs. You run it on two processors and it runs twice as fast. In real life, tasks are usually not that independent. So I decided to make the Mastermind game concurrent. This needed some refactoring. Honestly: I did not realize that before I started to write chapter 5, when chapter 4 was already finished. I decided not to rewrite chapter 4 (although that would have been less work); rather, I detailed in the chapter the coding decisions and how the code had to be refactored. This is only part of a chapter that is already about a very complex topic, so do not expect a full-blown refactoring tutorial. If you need a good book about refactoring, read Martin Fowler’s Refactoring book.

In addition to that, the chapter details most of the concurrent programming tools: wait, notify, locks, queues. The chapter concludes with an introduction to microbenchmarking that shows how much faster parallel programs run on many CPUs.

  • Chapter SIX

is about creating a simple web interface for the program. Because the main topic of the book is Java and not HTML, CSS, and JavaScript, the front-end is fairly simple. On the other hand, the chapter focuses on IP, TCP, DNS, HTTP, and even HTTP/2. Then it goes on detailing the client/server architecture, mentions JavaServer Pages (a must is a must), and then we develop a servlet running with Jetty.

  • Chapter SEVEN

introduces a new program: here we develop a REST application using Spring MVC and servlet filters, add audit logging with AOP, and we even discuss how dynamic proxies work.

  • Chapter EIGHT

extends the program and touches on subjects like annotations, reflection, functional programming, and scripting in Java.

  • Chapter NINE

is the last coding chapter. Here we create an “accounting” application using a reactive interface. It is a bit of an awkward example, but at the time I could not find anything better. Nevertheless, the principles of reactive programming and how to use the new reactive interfaces in Java are described in this chapter.

  • Chapter TEN

is the last chapter, and that way it is the densest. It talks about topics that all developers should know about but hardly any developer will use. You will probably never create a Java agent or an annotation interface, but you should know what they are, and that is why they are described here. There are also a few words about polyglot programming, which will become more and more prevalent. The majority of the chapter is about how programming in an enterprise setting works.

Motivation

My motivation was to create a programming book that will outlast the current version of Java. A book that teaches whoever reads it a bit of programming and helps them start becoming a better programmer. Maybe my frustration from meeting a lot of job interview candidates who had no clue about some very essential areas but still thought they were senior developers was also a motivating factor.

Technicalities

At the start, I teased that I would tell you some secrets. Here they are.

Packt wanted me to write the book using Microsoft Word or an online WordPress-based WYSIWYG editor. WordPress has a markup editing possibility, but it was switched off. I asked them to switch it on, but they refused. So I decided to use Microsoft Word when I created the first edition of the book. The result was disastrous. The code samples copied from the actual source were reformatted during the editing process, somewhere in the hands of the editors. Some of the formatting changes made the code hard to read. Some of the changes were simply wrong, like removing all the spaces between the word int and the variable name n, resulting in intn.

When I started the second edition, I decided to hack the system. By that time I had been practicing a bit with Python, and I created the Pyama project that can fetch code fragments from the source directories and insert them into Markdown files, overwriting the old versions. I also created a script that converted the special WordPress-flavored HTML into Markdown and back. The first edition of my book was converted by Packt into this WordPress format.

When I opened a chapter with the WYSIWYG editor, I pressed F12 to get to the debug mode, used “edit HTML” on the WYSIWYG form to copy the HTML, and pasted it into a text file. I converted the input HTML to Markdown and worked on the Markdown version. I like to work in a way that I edit the markup and at the same time see the rendered page. When a chapter was ready, I converted it back to HTML and used the same debug mode to paste the code back. It worked. Packt did not know about it.

Summary

I believe that I wrote a book which can be used professionally to learn programming and also a bit of Java 11. As I wrote at the start of the first chapter:

It is like going through a path in a forest. You can focus on the gravel of the road but it is pointless. Instead, you can enjoy the view, the trees, the birds, and the environment around you, which is more enjoyable. This book is similar as I won’t be focusing only on the language. From time to time, I will cover topics that are close to the road and will give you some overview and directions on where you can go further after you finish this book. I will not only teach you the language but also talk a bit about algorithms, object-oriented programming principles, tools that surround Java development, and how professionals work. This will be mixed with the coding examples that we will follow.

Annotation Handling and JPMS

TLDR; Instead of annotation.getClass().getMethod("value") call
annotation.annotationType().getMethod("value").

(If you do not know what JPMS is, then you are not a Java developer. Stop reading, go ahead, buy and read the book by the genius Nicolai Parlog about JPMS, https://www.manning.com/books/the-java-module-system, and become a Java developer again!)

All Java developers have heard about annotations. Annotations have been with us since Java 1.5 (or only 1.6 if you insist). Based on my experience interviewing candidates, I feel that most Java developers know how to use annotations. I mean, most developers know that they look like @Test or @Override, that they come with Java or with some library, and that they have to be written in front of a class, method, or variable.

A few developers know that you can also define an annotation in your code using @interface and that your code can do some metaprogramming using the annotation. Even fewer know that annotations can be processed by annotation processors and some of them can be processed during run time.

I could continue, but long story short: annotations are a mystery for most Java developers. If you think I am wrong about how clueless most Java developers are regarding annotations, then consider that the number of programmers (or coders, generally) has been growing exponentially during the last 30 years, and the number of Java developers, especially, has been doing so during the last 20 years and is still growing exponentially. The exponential function has this feature: if the number of whatnots is growing exponentially, then most of the whatnots are young.
That is the reason why most Java developers are not familiar with annotations.

To be honest, annotation handling is not something simple. It deserves its own article, especially when we want to handle annotations while using module systems.

During the final touches of release 1.2.0 of the Java::Geci code generation framework, I ran into a problem that was caused by my wrong use of annotations and reflection. Then I realized that probably most developers who handle annotations using reflection do it the same wrong way. There was hardly any clue on the net to help me understand the problem. All I found was a GitHub ticket, and based on the information there I had to figure out what was really happening.

So let’s refresh a bit what annotations are and after that let’s have a look at what we may be doing wrong that was okay so far but may cause trouble when JPMS comes into the picture.

What is an annotation?

Annotations are interfaces that are declared using the interface keyword preceded by the @ character. This makes the annotation usable in the code the way we are used to: writing the name of the annotation interface with @ in front of it (e.g., @Example). The most frequently used such annotation is @Override, which the Java compiler uses at compile time.

Many frameworks use annotations during run-time; others hook into the compilation phase by implementing an annotation processor. I wrote about annotation processors and how to create one before. This time we focus on the simpler way: handling annotations during run-time. We do not even implement the annotation interface, which is a rarely used possibility, and it is complex and hard to do, as the article describes.

To use an annotation during run-time, the annotation has to be available during run-time. By default, annotations are available only at compile-time and do not get into the generated byte code. It is a common mistake to forget (I always do) to put the @Retention(RetentionPolicy.RUNTIME) annotation on the annotation interface and then start debugging why I cannot see my annotation when I access it using reflection.

A simple run-time annotation looks like the following:

@Retention(RetentionPolicy.RUNTIME)
@Repeatable(Demos.class)
public @interface Demo {
    String value() default "";
}

Annotations can have parameters when used on classes, methods, or other annotated elements. These parameters are declared as methods in the interface. In the example, there is only one method declared in the interface. It is called value(), and it is a special one: a kind of default parameter. If there are no other parameters in an annotation interface, or even if there are but we do not want to use them and they all have default values, then we can write

@Demo("This is the value")

instead of

@Demo(value="This is the value")

If there are other parameters that we need to use then we do not have this shortcut.

As you can see, annotations were introduced on top of an existing structure. Interfaces and classes are used to represent annotations; it was not something totally new introduced into Java.

Starting with Java 1.8 there can be multiple annotations of the same type on an annotated element. You could have that feature even before Java 1.8. You could define another annotation, for example

@Retention(RetentionPolicy.RUNTIME)
public @interface Demos {
    Demo[] value();
}

and then use this wrapper annotation on the annotated element, like

@Demos(value = {
    @Demo("This is a demo class"),
    @Demo("This is the second annotation")})
public class DemoClassNonAbbreviated {
}

To ease the tendinitis caused by excessive typing, Java 1.8 introduced the annotation Repeatable (as you can see on the annotation interface Demo), and that way the above code can be written simply as

@Demo("This is a demo class")
@Demo("This is the second annotation")
public class DemoClassAbbreviated {
}

How to read the annotation using reflection

Now that we know that an annotation is just an interface, the next question is how we can get information about it. The methods that deliver the information about annotations are in the reflection part of the JDK. If we have an element that can have an annotation (e.g., a Class, Method, or Field object), then we can call getDeclaredAnnotations() on that element to get all the annotations the element has, or getDeclaredAnnotation() in case we know what annotation we need.

The return value is an annotation object (or an array of annotation objects in the first case). Obviously, it is an object, because everything is an object in Java (or a primitive, but annotations are anything but primitive). This object is an instance of a class that implements the annotation interface. If we want to know what string the programmer wrote between the parentheses, we should write something like

final var klass = DemoClass.class;
final var annotation = klass.getDeclaredAnnotation(Demo.class);
final var valueMethod = annotation.getClass().getMethod("value");
final var value = valueMethod.invoke(annotation);
Assertions.assertEquals("This is a demo class", value);

Because value() is a method of the interface, certainly implemented by the class that we have access to through one of its instances, we can call it reflectively and get back the result, which is "This is a demo class" in this case.

What is the problem with this approach

Generally nothing, as long as we are not in the realm of JPMS. We get access to the method of the class and invoke it. We could get access to the method of the interface and invoke it on the object, but in practice it is the same. (Or not, in the case of JPMS.)

I was using this approach in Java::Geci. The framework uses the @Geci annotation to identify which classes need generated code inserted into them. It has a fairly complex algorithm to find the annotations, because it accepts any annotation that has the name Geci, no matter which package it is in, and it also accepts any @interface that is annotated with a Geci annotation (either it is named Geci, or the annotation has an annotation that is Geci, recursively).

This complex annotation handling has its reason: the framework is complex so that its use can be simple. You can either say:

@Geci("fluent definedBy='javax0.geci.buildfluent.TestBuildFluentForSourceBuilder::sourceBuilderGrammar'")

or you can have your own annotations and then say

@Fluent(definedBy="javax0.geci.buildfluent.TestBuildFluentForSourceBuilder::sourceBuilderGrammar")

The code was working fine up until Java 11. When the code was executed using Java 11 I got the following error from one of the tests

java.lang.reflect.InaccessibleObjectException: 
Unable to make public final java.lang.String com.sun.proxy.jdk.proxy1.$Proxy12.value() 
accessible: module jdk.proxy1 does not 
"exports com.sun.proxy.jdk.proxy1" to module geci.tools

(Some line breaks were inserted for readability.)

The protection of JPMS kicks in and does not allow us to access something in the JDK we are not supposed to. The question is: what are we really doing and why?

When running tests with JPMS we have to add a lot of --add-opens command-line arguments, because the test framework wants to access parts of the code using reflection that are not accessible to the library user. But this error is not about a module that is defined inside Java::Geci.

JPMS protects the libraries from bad use. You can specify which packages contain the classes that are usable from the outside. Other packages, even if they contain public interfaces and classes, are available only inside the module. This helps module development: users cannot use the internal classes, so you are free to redesign them as long as the API remains. The file module-info.java declares these packages as

module javax0.jpms.annotation.demo.use {
    exports javax0.demo.jpms.annotation;
}

When a package is exported, the classes and interfaces in the package can be accessed directly or via reflection. There is another way to give access to the classes and interfaces of a package: opening the package. The keyword for this is opens. If the module-info.java only opens the package, then the package is accessible only via reflection.
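For example, a minimal sketch of the same module declaration opening the package instead of exporting it (module and package names reused from the example above) would be:

module javax0.jpms.annotation.demo.use {
    opens javax0.demo.jpms.annotation;
}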

The above error message says that the module jdk.proxy1 does not include in its module-info.java a line that exports com.sun.proxy.jdk.proxy1. You can try to add --add-exports jdk.proxy1/com.sun.proxy.jdk.proxy1=ALL-UNNAMED, but it does not work. I do not know why it does not work, but it does not. And as a matter of fact, it is good that it is not working, because the package com.sun.proxy.jdk.proxy1 is an internal part of the JDK, like Unsafe was, which caused so much headache to Java in the past.

Instead of trying to illegally open the treasure box, let’s focus on why we wanted to open it in the first place and whether we really need access to it at all.

What we want to do is get access to the method of the class and invoke it. We cannot do that, because JPMS forbids it. Why? Because the annotation object's class is not Demo.class (which is obvious, since that is just an interface). Instead, it is a proxy class that implements the Demo interface. That proxy class is internal to the JDK, so we cannot reflectively access the methods of the class that annotation.getClass() returns.
But why would we access the class of the proxy object when we want to call the method of our annotation?

Long story short (I mean a few hours of debugging, investigating, and understanding instead of the mindless Stack Overflow copy/paste that nobody ever does): we must not touch the value() method of the class that implements the annotation interface. We have to use the following code:

final var klass = DemoClass.class;
final var annotation = klass.getDeclaredAnnotation(Demo.class);
final var valueMethod = annotation.annotationType().getMethod("value");
final var value = valueMethod.invoke(annotation);
Assertions.assertEquals("This is a demo class", value);

or alternatively

final var klass = DemoClass.class;
final var annotation = klass.getDeclaredAnnotation(Demo.class);
final var valueMethod = Demo.class.getMethod("value");
final var value = valueMethod.invoke(annotation);
Assertions.assertEquals("This is a demo class", value);

(This is already fixed in Java::Geci 1.2.0.) We have the annotation object, but instead of asking for its class we get access to the annotationType(), which is the interface itself that we coded. That is something the module exports, and thus we can invoke its methods.

Mihály Verhás, my son, who is also a Java developer at EPAM, usually reviews my articles. In this case, the “review” was extended, and he wrote a non-negligible part of the article.

Java hexadecimal floating point literal

How I met hexadecimal floating point numbers

I was developing new functionality for Java::Geci to make it less prone to code reformatting. The current release of the code will overwrite otherwise identical code if it has been reformatted. This is annoying, since it is fairly easy to press the reformatting key shortcut, and many projects even require that developers set their editor to automatically format the code upon save. In those cases Java::Geci cannot be used, because as soon as the code is reformatted, the generator thinks that the code it generates is not the same as the one already in the source file, updates it, and signals the change of the code by failing the unit tests.

The solution I was crafting compares the Java source files after first converting them to a list of lexical elements. That way you can even reformat the code, inserting new-lines, spaces, etc., as long as the code remains the same. To do that I needed a simplified lexical analyzer for Java. Writing a lexical analyzer is not a big deal; I have created several for different reasons since I first read the Dragon Book in 1987. The only thing I really needed was the precise definition of what the string, character, and number literals, the keywords and so on are. In short: what the definition of the Java language is on the lexical level and how it is processed. Fortunately, there is a precise definition for that, the Java Language Specification, which is not only precise but also readable and has examples. So I started to read the corresponding chapters.

To my bewilderment, I saw there that it is possible in the Java language to express a floating point literal in hexadecimal. Strange, isn't it? Since I had never seen it, at first I thought that this was something new introduced in Java 12, but my investigation showed that it was probably introduced in Java 1.5. That was the very first Java version I really liked, although not because of hexadecimal floating points. So this was how I met this beast in the standard, face to face. I started to wonder whether this beast can be found at all in the wild, or is only something that can be seen captive in the confinements of the text of the JLS. So…

I put up a poll on Twitter.


As you can see, nine decent humans answered the question, mostly saying that they had no idea about this feature.

Probably hexadecimal floating points are the least known and used feature of the Java language right after lambdas and streams (just kidding… hexadecimal floating points are important, no?)

Even though I did some scientific study in the past, I cannot see any use of hexadecimal floating point literals.


What is a floating point number?

We will get to hexadecimal floating point numbers, but to understand them we first have to know what a floating point number is in general.

Floating point numbers have a mantissa and an exponent. The mantissa has an integer and a fractional part, like iii.ffff. The exponent is an integer number. For example, 31.415926E-1 is a floating point number and an approximation of the ratio of the circumference and the diameter of a circle.

Java internally stores float numbers on 32 bits and double numbers on 64 bits. The actual bits are used according to the IEEE 754 standard.

That way the bits store a sign on a single bit, then the exponent on 8 or 11 bits, and finally the mantissa on 23 or 52 bits for 32- or 64-bit float/double respectively. The mantissa is a fractional number with a value between 1 and 2. It could be represented with a bit stream where the first bit means 1, the second means 1/2, and so on. However, because the number is always stored normalized, and is therefore always in the interval [1, 2), the first bit is always 1. There is no need to store it. Thus the mantissa is stored so that the most significant bit means 1/2, the next 1/2^2 and so on, but when we need the value we add 1 to it.

The mantissa is unsigned (hence we have a separate sign bit). The exponent is also stored unsigned, but the actual number of bit shifts is calculated by subtracting 127 or 1023 from the stored value to get a signed number. It specifies how many bits the mantissa should virtually be shifted to the left or right. Thus when we write 31.415926E-1f, the exponent will NOT be -1. That -1 belongs to the decimal format of the number.

The actual value is 01000000010010010000111111011010. Breaking it down:

  • 0 is the sign bit: the number is positive. So far so good.
  • 10000000 is the exponent: 128, which means we have to shift the mantissa one bit to the left (multiply the value by two).
  • 10010010000111111011010 is the mantissa: 4788186/2^23 + 1 ≈ 1.570796251296997. The hex representation of this bit stream is 0x490FDA.
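If you want to check this breakdown yourself, here is a minimal sketch using the JDK's Float.floatToIntBits():

int bits = Float.floatToIntBits(31.415926E-1f);
int sign = bits >>> 31;              // 0, the number is positive
int exponent = (bits >>> 23) & 0xFF; // 128
int mantissa = bits & 0x7FFFFF;      // 0x490FDA
// prints 1000000010010010000111111011010 (the leading 0 sign bit is dropped)
System.out.println(Integer.toBinaryString(bits));
System.out.printf("%d %d %X%n", sign, exponent, mantissa); // 0 128 490FDA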

And here comes the

Hexadecimal floating point literal

We can write the same number in Java as 0x0.C90FDAP2f. This is the hexadecimal floating point representation of the same number.

The mantissa 0xC90FDA should look familiar from the hexadecimal representation of the bit stream above, 0x490FDA. The difference is that the first character is C instead of 4. That is the extra one bit, which is always 1 and is not stored in the binary representation. C is 1100 while the original 4 is 0100. The exponent is the signed decimal representation of the actual bit shifts needed to push the number to the proper position.
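A quick sketch to check that the two literals really denote the same number (Float.toHexString() is part of the JDK and prints the normalized form, where the mantissa starts with the implicit 1):

float pi = 0x0.C90FDAP2f;
System.out.println(pi == 31.415926E-1f);   // true, the very same float
System.out.println(Float.toHexString(pi)); // 0x1.921fb4p1, the normalized form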

The format of the literal is not trivial. First of all, you HAVE TO use the exponent part, and the character for the exponent is p or P. This is a major difference from the decimal representation. (UPDATE: if the exponent were optional, you could not tell whether, for example, 0.55 is a decimal floating point or a hexadecimal floating point number. A hexadecimal number can, by accident, contain only decimal characters and still be hexadecimal.)

After a little bit of thinking it becomes obvious that the exponent cannot be denoted using the conventional e or E, since that character is a legitimate hexadecimal digit and would be ambiguous in the case of numbers like 0x2e3. Would this be a hexadecimal integer or 2×2^3? It is an integer, because we use p and not e.

Why the exponent part is mandatory I can only guess. Because developers got used to decimal floating point numbers with e or E as the exponent, it would be very easy to misread 0xC90F.0e+3 as a single floating point number, even though in the case of hexadecimal floating point p is required instead of e. If the exponent were not mandatory, this example would be a legitimate sum of a floating point number and an integer. At the same time it looks like a single number, and that would not be good.

The other interesting thing is that the exponent is decimal. This is because some hexadecimal digits were already in use for other purposes: the float and double suffixes. In case you want to denote that a literal is a float, you can append f or F to the end. If you want to denote that the literal is a double, then you can append d or D to the end. This is the default, so appending D is optional. If the exponent were hexadecimal, we would not know whether 0x32.1P1f is a float literal or a double with a value many orders of magnitude different. This way, since the exponent is decimal, it is a float number: (0x32 + 0x1/16)×2^1 = 100.125.
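A minimal sketch of the suffixes in action:

float f = 0x32.1P1f;   // float: (0x32 + 0x1/16) * 2^1 = 100.125f
double d = 0x32.1P1;   // the same value as a double, the D suffix is optional
System.out.println(f); // 100.125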

Java and IEEE 754

Java implemented the IEEE 754 standard strictly until Java 1.2. The standard defines not only the format of the numbers when stored in memory but also the rules for how calculations should be executed. Starting with release 1.2 (inclusive), the requirements were relaxed to make the implementations more liberal, allowing the use of more bits to store intermediate results. This was, and still is, available on the Intel CPU platforms, and it is used heavily in numeric calculations in other languages like FORTRAN. It was a logical step to allow implementations to use this higher precision.

At the same time, to preserve backward compatibility, the strictfp modifier was added to the language. When this modifier is used on a class, interface, or method, the floating point calculations in that code strictly follow the IEEE 754 standard.
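A minimal sketch (the class name is made up for illustration):

// all floating point calculations inside this class follow IEEE 754
// strictly, even on platforms that could use extended precision for
// intermediate results
public strictfp class StrictCalculator {
    public double multiplyAdd(double a, double b, double c) {
        return a * b + c;
    }
}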

Takeaway

  • There are hexadecimal floating point literals in Java. Remember this, and also what strictfp is, because somebody may ask you about them in a Java interview. They have no practical use in enterprise programming.
  • Do not use them unless it makes the code more readable. I can barely imagine any situation where this would be the case. So, simply put: do not use them just because you can.
  • Follow me on Twitter @verhas to get notifications about new articles.

I think that is it, nothing more. By the time this article is published, I will probably be swimming across Lake Zürich along with ten thousand other people. It is a big event here.

Oh… and yes: if you have ever used hexadecimal floating point literals in Java to make it more readable, please share the knowledge in the comments. I dare say in the name of the readers: we are interested.

UPDATE: Joseph Darcy (engineer, OpenJDK developer at Oracle, marathoner, fast walker, occasional photographer, lots of other things) provided feedback on Twitter. I copied his reply here as it is absolutely valuable and adds to this article for the benefit of the reader:

The mapping between decimal strings and particular settings of binary floating-point values is often non-obvious. Hexadecimal floating-point literals provide a straightforward text to binary fp mapping when needed, such as in tests. See https://blogs.oracle.com/darcy/hexadecimal-floating-point-literals

Inject-able only in test?

This article is about some thoughts on test design and testability. Some questions that I discussed with my son, who is a junior Java developer currently employed by and studying at EPAM Hungary (the same company where I work, but a different subsidiary). Everything in this article is good old knowledge, but still, you may find something interesting in it. If you are a junior, then read it because of that. If you are a senior, then you can get some ideas on how to explain these things. If neither: sorry.

Introduction to the problem

The task was to write a roulette program or some other game simulation. The output of the code was the amount of simulated money lost or won. The simulation used a random number generator. This generator caused a headache when it came to testing. (Yes, you are right: the very basis of the problem was the lack of TDD.) The code behaved randomly. Sometimes the simulated player was winning the game, other times it was losing.

Make it testable: inject mock

How to make this code testable?

The answer should be fairly obvious: mock the random number generator. Make the source of randomness injectable and inject a different, non-random source during the tests. Randomness is not important during testing, and there is no need to test the randomness. We have to believe that the random number generator is good (it is not, it is never good, perhaps good enough, but that is a totally different story) and was tested by its own developers.

Learning #1: Do not test the functionality of your dependency.

We can have a field of type Supplier initialized to something like a () -> rnd() lambda, and in the case of a test it is overwritten using a setter.
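A minimal sketch of this structure (the class and method names are made up for illustration):

import java.util.Random;
import java.util.function.Supplier;

public class RouletteSimulation {
    private final Random rnd = new Random();
    private Supplier<Integer> random = () -> rnd.nextInt(37);

    // package private, used by the test to inject a deterministic source
    void setRandom(Supplier<Integer> random) {
        this.random = random;
    }

    public int spin() {
        return random.get();
    }
}

In the test, calling setRandom(() -> 17) replaces the source of randomness with a deterministic one.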

Is testable good?

Now we changed the structure of the class. We opened a new entry to inject a random number generator. Is this okay?

There is no general yes or no answer to that. It depends on the requirements. Programmers like to make their code configurable and more general than absolutely needed by the current requirements. The reason is… well… I guess it is because many times in the past programmers experienced that requirements change (no kidding!), and in case the code was prepared for the change, the coding work was easier. This is fair enough reasoning, but there are essential flaws in it. Programmers do not know what kind of future requirements may come. Usually nobody really knows, even though everybody has some idea about it.

Programmers usually have the least knowledge. How would they know the future? Business analysts know a bit better, and at the end of the chain, the users and customers know it the best. However, even they do not know the business environment out of their control that may require new features of the program.

Another flaw is that developing for a future requirement now has extra costs that developers many times do not comprehend.

Practice shows that the result of such ‘ahead of time’ thinking is usually complex code and flexibility that’s hardly ever needed. There is even an acronym for that: YAGNI, “You Aren’t Gonna Need It”.

So, is implementing that injectability feature a YAGNI? Not at all.

First of all: a code has many different uses. Executing it is only one. An equally important one is the maintenance of the code. If the code cannot be tested, it cannot be reliably used. If the code cannot be tested, it cannot be reliably refactored, extended: maintained.

A functionality that is only needed for testing is like a roof bridge on a house. You do not use it yourself while you live in the house, but without it, it would be hard and expensive to check the chimneys. Nobody questions the need for those roof bridges. They are needed, they are ugly, and still, they are there. Without them, the house is not testable.

Learning #2: Testable code usually has better structure.

But that is not the only reason. Generally, when you make code testable, the final structure will usually be more usable as well. That is probably because testing mimics the use of the code, and designing the code to be testable drives your thinking towards putting usability first and implementation only second. And, to be honest: nobody really cares about the implementation. Usability is the goal; implementation is only the tool to get there.

Responsibility

Okay, we got that far: testability is good. But then there is a question about responsibility.

The source of randomness should be hard-wired into the code. The code and the developer of the code are responsible for the randomness. Not because this developer implemented it, but because this developer selected the random number generator library. Selecting the underlying libraries is an important task, and it has to be done responsibly. If we open a door to alter this selection of the implementation of randomness, then we lose control over something that is our responsibility. Or don’t we?

Yes and no. If you open the API and provide a possibility to inject a dependency then you are not inherently responsible for the functioning of the injected functionality. Still, the users (your customers) will come to you asking for help and support.

“There is a bug!” they complain. Is it because of your code or something in the special injected implementation the user selected?

You essentially have three choices:

  1. You may examine the bugs in each of those cases and tell them when the error is not your bug and help them select a better (or just the default) implementation of the function. It will cost you precious time either paid or unpaid.
  2. Alternatively, you can exclude the issue and say: you will not even examine any bug that cannot be reproduced using the standard, default implementation.
  3. You technically prevent the use of the feature that is there only for the testability.

The first approach needs good sales support, or else you will end up fixing customers’ problems in your personal time instead of paid customer time. Not professional.

The second approach is professional, but customers do not like it.

The third is a technical solution to drive users from #1 to #2.

Learning #3: Think ahead about users’ expectations.

Whichever solution you choose, the important thing is to do it consciously and not just by accident. Know what your users/customers may come up with and be prepared.

Prevent production injecting

When you open the possibility to inject the randomness generator into the code how do you close that door for the production environment if you really must?

The first solution, which I prefer, is not to open the door wide in the first place. Use the initialized field holding the lambda expression (or some other construct) that makes it injectable, but do not implement injection support. Let the field be private (but not final, because that may cause other problems in this situation) and apply a bit of reflection in the test to alter the content of the private field.
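A minimal sketch, reusing the hypothetical RouletteSimulation class from above, this time without any setter:

import java.lang.reflect.Field;
import java.util.function.Supplier;

public class RouletteSimulationTest {
    void spinIsDeterministic() throws Exception {
        final var simulation = new RouletteSimulation();
        // the field is private, but not final, so the test can overwrite it
        final Field random = RouletteSimulation.class.getDeclaredField("random");
        random.setAccessible(true);
        random.set(simulation, (Supplier<Integer>) () -> 17);
        assert simulation.spin() == 17;
    }
}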

Another solution is to provide a package private setter, or even better, an extra constructor to alter/initialize the value of the field, and throw an exception if it is used in the production environment. You can check that in many different ways:

  • Invoke `Class.forName()` for a test class that is not on the classpath in the production environment.
  • Use `StackWalker` and check that the caller is test code.
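For example, the second check could look like the following sketch (the assumption that test class names end with Test is only an illustration):

// hypothetical test-only constructor, guarded by a StackWalker check
RouletteSimulation(Supplier<Integer> random) {
    final var caller = StackWalker
            .getInstance(StackWalker.Option.RETAIN_CLASS_REFERENCE)
            .getCallerClass();
    if (!caller.getSimpleName().endsWith("Test")) {
        throw new IllegalStateException(
                "the test-only constructor must not be used in production");
    }
    this.random = random;
}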

Why do I prefer the first solution?

Learning #4: Do not use a fancy technical solution just because you can. Boring is usually better.

First of all, because this is the simplest and it puts all testing code into the test. The setter or the special constructor in the application code is essentially testing code, and the byte code for them is there in the production code. Test code should be in test classes; production code should be in production classes.

The second reason is that designing functionality that is deliberately different in the production and the test environment goes against the basic principles of testing. Testing should mimic the production environment as much as economically feasible. How would you know that the code will work properly in the production environment when the test environment is different? You hope. There are already many environmental factors that may alter the behavior in the production environment, letting bugs manifest only there while remaining silently dormant in the test environment. We do not need extra such things to make our testing even riskier.

Summary

There are many more aspects of programming and testing. This article addressed only the small and specific segment that came up in a discussion. The key learnings are also listed in the article:

  • Test the system under test (SUT) and not the dependencies. Be careful, you may think you are testing the SUT when actually you are testing the functionality of some dependencies. Use stupid and simple mocks.
  • Follow TDD. Write the tests before and mixed with the functionality development. If you don’t, just because you don’t, then at least think about the tests before and while you write the code. Testable code is usually better (and not just for the tests).
  • Think about how fellow programmers will use your code. Imagine how a mediocre programmer will use your API, and design the interfaces of your code not only for geniuses like you, who understand your intentions even better than you do.
  • Do not go for a fancy solution when you are a junior just because you can. Use a boring and simple solution. You will know when you are a senior: when you no longer want to use the fancy solution over the boring one.

Converting objects to Map and back

In large enterprise applications we sometimes need to convert data objects to and from Map. Usually, it is an intermediate step towards a special serialization. If it is possible to use something standard, then it is better to use that, but many times the architecture envisioned by some lead architect, the rigid environment, or some similar reason does not make it possible to use JOOQ, Hibernate, Jackson, JAX, or something like that. In such a situation, as happened to me a few years ago, we have to convert the objects to some proprietary format, be it string or binary, and the first step in that direction is to convert the object to a Map.

Of course, the conversion is more complex than just

Map myMap =  (Map)myObject;

because these objects are almost never maps on their own. What we really need from the conversion is a Map where each entry corresponds to a field of the “MyObject” class. The key of the entry is the name of the field, and the value is the actual value of the field, possibly converted to a Map itself.

One solution is to use reflection: read the fields of the object reflectively and create the map from them. The other approach is to create a toMap() method in the class that needs to be converted to a Map, which simply adds each field to the returned map using the name of the field. This is somewhat faster than the reflection-based solution, and the code is much simpler.
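A bare-bones sketch of the reflection-based approach (for illustration only: it handles only the fields declared in the class itself and does not convert field values to Maps recursively):

import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

public class ObjectToMap {
    public static Map<String, Object> toMap(Object object) throws IllegalAccessException {
        final Map<String, Object> map = new HashMap<>();
        for (final Field field : object.getClass().getDeclaredFields()) {
            field.setAccessible(true); // read the private fields as well
            map.put(field.getName(), field.get(object));
        }
        return map;
    }
}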

When I was facing this problem in a real application a few years ago, I was so frustrated writing the primitive but numerous toMap() methods for each data object that I created a simple reflection-based tool that did it for any class we wanted. Did it solve the problem? No.

This was a professional environment where not only the functionality matters but also the quality of the code, and the quality of my code, judged by my fellow programmers, was not up to the standard. They argued that the reflection-based solution is complex, and in case it becomes part of the code base, the average developers joining later will not be able to maintain it. Well, I had to admit that they were correct. In a different situation, I would have said that the developer has to learn reflection and programming in Java on the level that the code requires. In this case, however, we were not speaking about a specific person, but rather about somebody who comes and joins the team in the future, possibly sometime after we have already left the project. This person was assumed to be an average developer, which seemed reasonable as we did not know anything about this person. In that sense, the quality of the code was not good, because it was too complex. The quorum of the developer team decided that maintaining the numerous manually crafted toMap() methods was going to be cheaper than finding senior and experienced developers in the future.

To be honest, I was a bit reluctant to accept their decision, but I accepted it even though I had the possibility to overrule it based simply on my position in the team. I tend to accept the decisions of the team even if I do not agree with them, but only if I can live with those decisions. If a decision is dangerous, terrible, and threatens the future of the project, then we have to keep discussing the details until we come to an agreement.

Years later I started to create Java::Geci as a side project that you can download from http://github.com/verhas/javageci

Java::Geci is a code generation tool that runs during the test phase of the Java development life cycle. Code generation in Java::Geci is a “test”: it runs the code generation, and in case all the generated code stays put, the test was successful. In case anything in the code base changed in a way that causes the code generator to generate different code than before, and thus the source code changes, then the test fails. When a test fails, you have to fix the bug and run the build, including the tests, again. In this case, the test generates the new, by now fixed code, therefore all you have to do is run the build again.

When developing the framework I created some simple generators to generate equals() and hashCode(), setters and getters, and a delegator, and finally I could not resist creating a general purpose toMap() generator. This generator generates code that converts the object to a Map just as we discussed before, and also the fromMap() method that I did not mention before but which is fairly obviously also needed.

Java::Geci generators are classes that implement the Generator interface. The Mapper generator does that by extending the abstract class AbstractJavaGenerator. This lets the generator throw any exception, easing the life of the generator developer, and it also looks up the Java class that was generated from the currently processed source. The generator has access to the actual Class object via the parameter klass, and at the same time to the source code via the parameter source, which represents the source code and provides methods to create Java code to be inserted into it.

The third parameter global is something like a map holding the configuration parameters that the source code annotation @Geci defines.

package javax0.geci.mapper;

import ...

public class Mapper extends AbstractJavaGenerator {

...

    @Override
    public void process(Source source, Class<?> klass, CompoundParams global)
                                                             throws Exception {
        final var gid = global.get("id");
        var segment = source.open(gid);
        generateToMap(source, klass, global);
        generateFromMap(source, klass, global);

        final var factory = global.get("factory", "new {{class}}()");
        final var placeHolders = Map.of(
                "mnemonic", mnemonic(),
                "generatedBy", generatedAnnotation.getCanonicalName(),
                "class", klass.getSimpleName(),
                "factory", factory,
                "Map", "java.util.Map",
                "HashMap", "java.util.HashMap"
        );
        final var rawContent = segment.getContent();
        try {
            segment.setContent(Format.format(rawContent, placeHolders));
        } catch (BadSyntax badSyntax) {
            throw new IOException(badSyntax);
        }
    }

The generator itself only calls the two methods generateToMap() and generateFromMap(), which generate, as the names imply, the toMap() and fromMap() methods into the class.

Both methods use the source generating support provided by the Segment class, and they also use the templating provided by Jamal. It is also worth noting that the fields are collected by calling the reflection tools method getAllFieldsSorted(), which returns all the fields of the class in a deterministic order that does not depend on the actual JVM vendor or version.

    private void generateToMap(Source source, Class<?> klass, CompoundParams global) throws Exception {
        final var fields = GeciReflectionTools.getAllFieldsSorted(klass);
        final var gid = global.get("id");
        var segment = source.open(gid);
        segment.write_r(getResourceString("tomap.jam"));
        for (final var field : fields) {
            final var local = GeciReflectionTools.getParameters(field, mnemonic());
            final var params = new CompoundParams(local, global);
            final var filter = params.get("filter", DEFAULTS);
            if (Selector.compile(filter).match(field)) {
                final var name = field.getName();
                if (hasToMap(field.getType())) {
                    segment.write("map.put(\"%s\", %s == null ? null : %s.toMap0(cache));", field2MapKey(name), name, name);
                } else {
                    segment.write("map.put(\"%s\",%s);", field2MapKey(name), name);
                }
            }
        }
        segment.write("return map;")
                ._l("}\n\n");
    }

The code selects only the fields that are denoted by the filter expression.

Extending abstract classes with abstract classes in Java

The example issue

When I was creating the Java::Geci abstract classes AbstractFieldsGenerator and AbstractFilteredFieldsGenerator, I faced a not too complex design issue. I would like to emphasize that this issue and the design may seem obvious to some of you, but during a recent conversation with a junior developer (my son, Mihály, specifically, who also reviews my articles because his English is way better than mine) I realized that this topic may still be of value.

Anyway, I had these two classes: the fields generator and the filtered fields generator. The second class extends the first one

abstract class AbstractFilteredFieldsGenerator
                  extends AbstractFieldsGenerator {...

adding extra functionality, and at the same time it should provide the same signature for the concrete implementation. What does that mean?

These generators help to generate code for a specific class using reflection. Therefore, the input information they work on is a Class object. The fields generator class has an abstract method process(), which is invoked for every field. It is invoked from an implemented method that loops over the fields and does the invocation separately for each. When a concrete class extends AbstractFieldsGenerator, and thus implements this abstract method, then it will be called for every field. When the same concrete class is changed so that it extends AbstractFilteredFieldsGenerator, then the concrete method will be invoked only for the filtered fields. I wanted a design where the ONLY change needed in the concrete class is to change the name of the class it extends.

Diff between the two versions of the concrete class

Abstract class problem definition

The same problem described in a more abstract way: There are two abstract classes A and F so that F extends A and F provides some extra functionality. Both declare the abstract method m() that a concrete class should implement. When the concrete class C declaration is changed from C extends A to C extends F then the invocation of the method m() should change, but there should be no other change in the class C. The method m() is invoked from method p() defined in class A. How to design F?

What is the problem with this?

Extending A can be done in two significantly different ways:

  • F overrides m(), making it concrete, implements the extra functionality in it, and calls a new abstract method, say mx()
  • F overrides the method p() with a version that provides the extra functionality (filtering in the example above) and calls the still abstract method m()

The first approach does not fulfill the requirement that the signature to be implemented by the concrete class C should remain the same. The second approach throws the already implemented functionality of A into the garbage and reimplements it in a slightly different way. In practice this is possible, but it is definitely going to be some copy/paste programming. This is problematic; let me not explain why.

The root of the problem

In engineering, when we face a problem like this, it usually means that the problem or the structure is not well described and the solution is somewhere in a totally different area. In other words, there are some assumptions driving our way of thinking that are false. In this case, the problem is that we assume that the abstract classes provide ONE extension “API” to extend them. Note that an API is not only something that you can invoke. In the case of an abstract class, the API is what you implement when you extend the abstract class. Just as libraries may provide different APIs for different ways of use (the Java 9 HTTP client can send() and also sendAsync()), abstract (and, for that matter, also non-abstract) classes can provide different ways to be extended for different purposes.

There is no way to code F reaching our design goal without modifying A. We need a version of A that provides one API to create a concrete implementation and another, not necessarily disjunct/orthogonal one, to create a still abstract extension.

The difference between the APIs in this case is that the concrete implementation aims to be at the end of a call-chain, while the abstract extension wants to hook onto the last but one element of the chain. The implementation of A has to provide an API to hook onto the last but one element of the call-chain. This is already the solution.

Solution

We implement the method ma() in the class F, and we want p() to call our ma() instead of directly calling m(). Modifying A, we can do that. We define ma() in A and we call ma() from p(). The version of ma() implemented in A calls m() without further ado, providing the original “API” for the concrete implementations of A. The implementation of ma() in F contains the extra functionality (filtering in the example) and then calls m(). That way any concrete class can extend either A or F and implement m() with exactly the same signature. We also avoided copy/paste coding, with the exception that the call to m() is the same in the two versions of ma().
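In code, a minimal sketch of the resulting design (the class and method names follow the abstract description above, not the actual Java::Geci classes):

abstract class A {
    // p() drives the processing and calls the hook ma() instead of m()
    public void p() {
        ma();
    }

    // the hook: the default implementation simply forwards to m(),
    // providing the original "API" for concrete extensions of A
    void ma() {
        m();
    }

    // the API for concrete implementations
    protected abstract void m();
}

abstract class F extends A {
    @Override
    void ma() {
        if (filter()) { // the extra functionality
            m();
        }
    }

    private boolean filter() {
        return true; // illustration only
    }
}

// C can extend either A or F; the body stays exactly the same
class C extends F {
    @Override
    protected void m() {
        // the concrete functionality
    }
}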

If we want the class F to be extendable with more abstract classes, then the F::ma implementation should not call m() directly but rather a new mf() that calls m(). That way a new abstract class can override mf(), again adding new functionality, and invoke the abstract m().

Takeaway

  1. Programming abstract classes is complex, and sometimes it is difficult to have a clear overview of who is calling whom and which implementation. You can overcome this challenge if you realize that it may be a complex matter. Document, visualize, discuss in whatever way may help you.
  2. When you cannot solve a problem (in the example, how to code F) you should challenge the environment (the class A we implicitly assumed to be unchangeable by the wording of the question: “How to implement F?”).
  3. Avoid copy/paste programming. (Pasta contains a lot of carbohydrates and makes your code fat, the arteries get clogged, and finally, the heart of your application will stop beating.)
  4. Although not detailed in this article, be aware that the deeper the hierarchy of abstraction is the more difficult it is to have a clear overview of who calls whom (see also point number 1).