Comparing Golang and understanding Java Value Types

Slides and the audio recording of the talk at the W-JAX 2018 Mainz conference.


Java getting back to the browser?

Betteridge’s law of headlines applies.


This article talks about WebAssembly and can be read to get a first glimpse of it. At the same time, I articulate my opinions and doubts. The summary is that WebAssembly is an interesting approach, and we will see what it becomes.

Java in the Browser, the past

There was a time when we could run Java applets in the browser. There were a lot of problems with it, although the idea was not total nonsense. Nobody could tell at the time that the future of browser programmability would not be Java. Today we know that JavaScript was the winner: the applet API is deprecated in Java 9 and is going to be removed from later Java versions. This, however, does not mean that JavaScript is without issues, or that it is the only and best possible solution for the purpose that a person can imagine.

JavaScript has language problems; there are a lot of WTFs included in the language. The largest shortcoming, in my opinion, is that it is a single language. Developers are different and like different languages. Projects are different and are best solved by different programming languages. Even Java would not be so immensely successful without the JVM infrastructure, which supports so many different languages. There are a lot of languages that run on the JVM, even such crap as ScriptBasic.

Now you can say that the same is true for the JavaScript infrastructure. There are other languages that are compiled to JavaScript. For example, there is TypeScript, and there is even Java with the GWT toolkit. JavaScript is a target language, especially with asm.js. But still, it is a high-level, object-oriented, memory-managed language. It is nothing like machine code.

Compiling to JavaScript invokes the compiler once; then the JavaScript syntax analyzer, the internal bytecode and the JIT compiler run. Isn’t that a bit too many compilers until we get to the bits that are fed into the CPU? Why should we download JavaScript in textual format to the browser and compile it into bytecode each time a page is opened? The textual format may be larger, though compression technologies are fairly advanced, and the compilation runs millions of times on client computers, emitting a lot of carbon into the air, where we already have enough; no need for more.

(Derail: Somebody told me that he has an advanced compression algorithm that can compress any file into one bit. There is no issue with the compression. Decompression is problematic though.)


Why can’t we have some bytecode-based virtual machine in the browser? Something like what the JVM once was for applets. This is what the WebAssembly guys were thinking about in 2015. They created WebAssembly.

WebAssembly is a standard program format to be executed in the browser nearly as fast as native code. The original idea was to “complement JavaScript to speed up performance-critical parts of web applications and later on to enable web development in other languages than JavaScript.” (Wikipedia)

Today the interpreter runs in Firefox, Chromium, Google Chrome, Microsoft Edge and Safari. You can download a binary program to the browser and you can invoke it from JavaScript. There is also some tooling supporting the development of programs in “assembly” and also in higher-level languages.


The binary WebAssembly file contains blocks. Each block describes some characteristic of the code. I would say that most of the blocks are definition and structure tables, and there is one that is the code itself. There is a block that lists the functions that the code exports and which can be invoked from JavaScript. Also, there is a block that lists the functions that the code wants to invoke from the JavaScript code.

The assembly code is really assembly. When I started to play with it, I had a nostalgic feeling. Working with these hex codes is similar to programming the Sinclair ZX80 in Z80 assembly, when we had to convert the code manually to hex on paper and then “POKE” the codes from BASIC into the memory. (If you understand what I am talking about, you are seasoned. I wanted to write ‘old’ but my editor told me that would be rude. I am just kidding. I have no editor.)

I will not list all the features of the language. If you are interested, visit the WebAssembly page. There is consumable documentation about the binary format.

There are, however, some interesting features that I want to talk about, so that I can later express my opinions.

No Objects

The WebAssembly VM is not an object-oriented VM. It does not know about objects, classes or any similar high-level structures. It really looks like some machine language. It has some primitive types, like i32, i64, f32, f64, and that is it. A compiler that compiles a high-level language has to make do with these.


The memory management is also up to the application. It is assembly. There is no garbage collector. The code works on a (virtually) contiguous memory segment that can grow or shrink via a system call, and it is totally up to the application to decide which code fragment uses which memory address.

Two Stacks

There are two stacks the VM works with. One is the operand stack for arithmetic operations. The other one is the call stack. Functions can call each other and return to the caller, and the call sequence is stored in a stack. This is a very usual approach. The only shortcoming is that there is no possibility to mark the call stack and purge it when an exception happens. The only way to implement a try/catch programming structure is to generate code before and after function calls that checks for exception conditions; if the exception is not caught at the level of the calling function, the code has to return to the caller one level higher. This way the exception handling walks up the call stack with the extra generated code around each function call. This slows down not only the exception handling but also ordinary function calls.
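The consequence can be illustrated with a hypothetical sketch (plain Java standing in for the generated code; the flag and the method names are made up): without stack unwinding, the compiler has to wrap every call in a check that propagates the error one level up.

```java
public class NoUnwindSketch {
    // Hypothetical "pending exception" flag that the generated code maintains.
    static boolean exceptionPending = false;

    static int mayThrow(int x) {
        if (x < 0) {                 // the "throw": set the flag, return a dummy value
            exceptionPending = true;
            return 0;
        }
        return x * 2;
    }

    static int caller(int x) {
        int r = mayThrow(x);
        if (exceptionPending) {      // generated check after every single call:
            return 0;                // propagate by returning to the caller one level up
        }
        return r + 1;
    }

    public static void main(String[] args) {
        System.out.println(caller(5));        // normal path, prints 11
        System.out.println(caller(-1));       // "exception" path, prints 0
        System.out.println(exceptionPending); // prints true
    }
}
```

Every call pays for the extra check, whether an exception ever happens or not, which is exactly why this scheme slows down ordinary function calls too.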

Single Thread

There is no threading in WebAssembly.

Support, Tooling

The fact that most of the browsers support WebAssembly is only half the battle. There also have to be developer tools supporting the concept so that there is code to execute.

There is an LLVM-backed compiler solution, so technically any language that compiles to LLVM should be compilable to WebAssembly and run in the browser. There is a C compiler in the tooling and you can also compile Rust to WebAssembly. There is also a textual format in case you want to program directly at the assembly level.


Security is at least questionable. First of all, WebAssembly is binary; therefore it is not possible, or at least complex, to look at the code and analyze it. The download of the code does not require channel encryption (TLS), therefore it is vulnerable to MITM attacks. Similarly, WebAssembly does not support code signatures that would assert that the code has not been tampered with since it was generated in the (hopefully protected) development environment.

WebAssembly runs in a sandbox, just like JavaScript, or like Flash used to. A fairly questionable architecture from the security point of view.

You can read more on the security questions in this article.


WebAssembly was developed for two years to reach a Minimum Viable Product (MVP) that can be used as a PoC. There are features, like garbage collection, multi-thread support, exception handling support, SIMD-type instructions, and direct DOM access from WebAssembly, which are being developed after the MVP.

Present and Future

I can say, after playing for about a weekend with WebAssembly, that it is an interesting and nice toy. In its current state it is a toy, nothing more. Without the features planned after the MVP, I see only one viable use case: WebAssembly is the perfect tool to deploy malicious mining code on client machines. In addition to that, any implementation flaw in the engine is a security risk. Note that these security risks come from a browser functionality that gives no value to the average user. You can disable WebAssembly in some of the browsers. It is a little worrisome that it is enabled by default, although it is needed only by early adopters for PoCs and not for commercial projects. If I were paranoid, I would say that the browser vendors, like Google, have a hidden agenda with the WebAssembly engine in the browser.

I am afraid that we see no security issues with WebAssembly currently only because the technology is new and IT felons have not learned the tools yet. I am almost certain that security holes are lurking in the current code, waiting to be exploited. Disable WebAssembly in your browser until you actually want to use it. Perhaps in a few years (or decades).

The original aim was to complement JavaScript. With the features coming after the MVP, I strongly believe that WebAssembly will rather aim to replace JavaScript than complement it. There will be a time when we will be able to write applications that run in the browser in Golang, Swift, Java, C, Rust or whatever language we want. So looking at the question in the title, “will Java get back to the browser?”, the answer is definitely NO. But some kind of VM technology with JIT and bytecode definitely will, sometime in the future.

But not yet.

Comparing files in Java

I am creating a series of video tutorials for PACKT about network programming in Java. There is a whole section about Java NIO. One sample program copies a file over a raw socket connection from a client to a server. The client reads the file from the disk, and the server saves the bytes, as they arrive, to disk. Because this is a demo, the server and the client run on the same machine and the file is copied from one directory to the exact same directory under a different name. The proof of the pudding is in the eating: the files have to be compared.

The file I wanted to copy was created to contain random bytes. Transferring only text information can sometimes leave some tricky bug lurking in the code. The random file was created using this simple Java class:


import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Random;

public class SampleMaker {
    public static void main(String[] args) throws IOException {
        byte[] buffer = new byte[1024 * 1024 * 10];
        try (FileOutputStream fos = new FileOutputStream("sample.txt")) {
            Random random = new Random();
            for (int i = 0; i < 16; i++) {
                random.nextBytes(buffer);
                fos.write(buffer);
            }
        }
    }
}
Using IntelliJ, comparing files is fairly easy, but since the files are binary and large, this approach is not really optimal. I decided to write a short program that will not only signal that the files are different but also tell where the difference is. The code is extremely simple:



import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class SampleCompare {
    public static void main(String[] args) throws IOException {
        long start = System.nanoTime();
        BufferedInputStream fis1 = new BufferedInputStream(new FileInputStream("sample.txt"));
        BufferedInputStream fis2 = new BufferedInputStream(new FileInputStream("sample-copy.txt"));
        int b1 = 0, b2 = 0, pos = 1;
        while (b1 != -1 && b2 != -1) {
            if (b1 != b2) {
                System.out.println("Files differ at position " + pos);
                System.exit(1);
            }
            b1 = fis1.read();
            b2 = fis2.read();
            pos++;
        }
        if (b1 != b2) {
            System.out.println("Files have different length");
        } else {
            System.out.println("Files are identical, you can delete one of them.");
        }
        long end = System.nanoTime();
        System.out.print("Execution time: " + (end - start) / 1000000 + "ms");
    }
}

The running time comparing the two 160MB files is around 6 seconds on my SSD-equipped MacBook, and it does not improve significantly if I specify a large, say 10MB, buffer as the second argument to the constructor of BufferedInputStream. (On the other hand, if we do not use BufferedInputStream at all, then the time is approximately ten times more.) This is acceptable, but if I simply issue diff sample.txt sample-copy.txt from the command line, then the response is significantly faster than 6 seconds. It can be many things, like Java startup time, or code interpretation at the start of the while loop until the JIT compiler thinks it is time to start working. My hunch is, however, that the code spends most of the time reading the file into memory. Reading the bytes into the buffer is a complex process. It involves the operating system, the device drivers and the JVM implementation, and they move bytes from one place to the other, while in the end we only compare the bytes, nothing else. It can be done in a simpler way. We can ask the operating system to do it for us and skip most of the Java runtime activities, file buffers, and other glitter.

We can ask the operating system to read the file into memory and then just fetch the bytes one by one from where they are. We do not need a buffer that belongs to a Java object and consumes heap space. We can use memory-mapped files. After all, memory-mapped files use Java NIO, and that is exactly the topic of the part of the tutorial videos that is currently in the making.

Memory-mapped files are read into memory by the operating system and the bytes are available to the Java program. The memory is allocated by the operating system and it does not consume heap memory. If the Java code modifies the content of the mapped memory, then the operating system writes the change to disk in an optimized way, when it thinks it is due. This, however, does not mean that the data is lost if the JVM crashes: when the Java code modifies the mapped memory, it modifies memory that belongs to the operating system, which remains available and valid after the JVM has stopped. There is no guarantee and no 100% protection against power outages and hardware crashes, but those are very low-level concerns. If anyone is afraid of those, then the protection should be at the hardware level; that has nothing to do with Java anyway. With memory-mapped files we can be sure that the data is saved to disk with a very high probability, one that can only be increased by failure-tolerant hardware, clusters, uninterruptible power supplies and so on. These are not Java. If you really have to do something from Java to have the data written to disk, then you can call the MappedByteBuffer.force() method, which asks the operating system to write the changes to disk. Calling this too often and unnecessarily may hinder performance, though. (Simply because it writes the data to disk and returns only when the operating system says the data has been written.)
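As a minimal, self-contained sketch (the temporary file and the values are made up for the demo), writing through a mapped buffer and forcing the change to disk looks like this:

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class ForceDemo {
    // Writes one byte through a mapped buffer, forces it to disk,
    // then reads it back through a regular file read.
    static int writeAndReadBack() throws Exception {
        File f = File.createTempFile("mapped", ".bin");
        f.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw");
             FileChannel ch = raf.getChannel()) {
            // READ_WRITE mapping extends the empty file to the mapped size
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 16);
            buf.put(0, (byte) 42);  // modifies OS-owned mapped memory, not heap
            buf.force();            // ask the OS to flush the change to disk
        }
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            return raf.read();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(writeAndReadBack()); // prints 42
    }
}
```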

Reading and writing data using memory-mapped files is usually much faster for large files. To get the appropriate performance, the machine should have significant memory; otherwise only part of the file is kept in memory and the number of page faults increases. One of the good things is that if the same file is mapped into memory by two or more different processes, then the same memory area is used. That way processes can even communicate with each other.

The comparing application using memory mapped files is the following:


import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class MapCompare {
    public static void main(String[] args) throws IOException {
        long start = System.nanoTime();
        FileChannel ch1 = new RandomAccessFile("sample.txt", "r").getChannel();
        FileChannel ch2 = new RandomAccessFile("sample-copy.txt", "r").getChannel();
        if (ch1.size() != ch2.size()) {
            System.out.println("Files have different length");
            System.exit(1);
        }
        long size = ch1.size();
        ByteBuffer m1 = ch1.map(FileChannel.MapMode.READ_ONLY, 0L, size);
        ByteBuffer m2 = ch2.map(FileChannel.MapMode.READ_ONLY, 0L, size);
        for (int pos = 0; pos < size; pos++) {
            if (m1.get(pos) != m2.get(pos)) {
                System.out.println("Files differ at position " + pos);
                System.exit(1);
            }
        }
        System.out.println("Files are identical, you can delete one of them.");
        long end = System.nanoTime();
        System.out.print("Execution time: " + (end - start) / 1000000 + "ms");
    }
}

To memory-map the files, we have to open them first using the RandomAccessFile class and ask for the channel from that object. The channel can be used to create a MappedByteBuffer, which is the representation of the memory area where the file content is loaded. The method map in the example maps the file in read-only mode, from the start of the file to the end of the file. We try to map the whole file. This works only if the file is not larger than 2GB: the start position is a long, but the size of the area to be mapped is limited by the size of an Integer.

Generally this is it… Oh yes, the running time comparing the 160MB random-content files is around 1 second.

UPDATE: a reader pointed out that this part of the code

        for (int pos = 0; pos < size; pos++) {
            if (m1.get(pos) != m2.get(pos)) {
                System.out.println("Files differ at position " + pos);

can be replaced using the built-in ByteBuffer::mismatch method, available since Java 11. The code is simpler, it does exactly what the example code aims at, and it is probably faster.
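For illustration (the sample bytes are made up), the method returns the index of the first differing byte, or -1 when the buffers are identical:

```java
import java.nio.ByteBuffer;

public class MismatchDemo {
    public static void main(String[] args) {
        ByteBuffer a = ByteBuffer.wrap(new byte[]{1, 2, 3, 4});
        ByteBuffer b = ByteBuffer.wrap(new byte[]{1, 2, 9, 4});
        // index of the first mismatching byte between the two buffers
        System.out.println(a.mismatch(b));             // prints 2
        // -1 means the remaining contents are identical
        System.out.println(a.mismatch(a.duplicate())); // prints -1
    }
}
```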

Rating Articles

My last article received the comment: “Weakest article in a long time, and it shows a prime example of typical inch pincher stuff certain people still like to make a fuzz about.”

To make this type of feedback easier for you, I switched on the rating functionality. (I do not know why I did not do that before.)

You can simply click on the stars at the top of the article to express how much you like or dislike an article. (It is not shown on the opening page; you have to click on the title of the article to get to the article’s own page.) This will help me write better articles, and it will also help the other readers skip articles that are not that good.

Raid, backup and archive

This is a short tutorial about the similarities and differences of redundant storage, backup, and archive functionality. I felt the need to create this short introduction because I realized that many IT professionals do not know the difference between these operations, and many times they mix them up or use the wrong approach for some purpose.

I once personally witnessed a backup at a Hungarian bank that was stored on a partition of a RAID set disk, which also held the operational data. A RAID controller failure happened. The backup was unusable. Technically, it was not a backup. A Digital Equipment Corp. engineer spent two weeks restoring the allocation bits of the RAID set to recover the account data. Although neither the bank, which shall not be named, nor Digital exists anymore, I am more than convinced that similar backups still do.

What these methods are

Redundant storage, backup, and archive all copy operational data. They do that aiming at more stability in operation. The copied data is stored in a redundant way, and in case some event deletes or corrupts the data, the copied version is still available. The differences between these data-redundancy-increasing strategies are

  • (NEED) the type of event that creates the need for the copied data
  • (CAUSE) the type of event that causes the deletion of the data
  • (DISCOVERY) how the data loss or the need for the data is recognized
  • (HOW) how the actual copy is created and stored

Redundant storage

Redundant storage copies the data online and all the time. (HOW) When there is some change in the data, the redundant information is created on some storage medium as soon as the hardware and software make it possible. The copy operation is not batched. It does not wait for a bunch of data to be copied together. The data is copied as soon as possible.

The actual implementation is usually some RAID configuration. A RAID configuration connects two or more same-size disks in parallel. In the case of two disks, anything written to one is written to the other at the same time. When reading, either of the disks can be used, which makes reading twice as fast in terms of data transfer, assuming that the data transfer bus between the disks and the computer is fast enough. Seek time in the case of rotating (non-SSD) disks is not improved.

When there are three or more disks, the writing is a bit different. In this situation, whenever a bit is changed on one disk, a bit is also changed on the last disk of the RAID set. The RAID controller keeps each bit of the last disk of the set as the XOR value of the same bits on the other disks. That way the data is “partially copied”.

In case of a hardware failure, RAID solutions usually allow the faulty disk to be replaced without switching off the disk system. The controller then automatically reconstructs the missing data.
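The reconstruction relies on nothing more than the property that a ^ b ^ b == a. A minimal sketch in Java (the “disks” are just made-up byte arrays):

```java
import java.util.Arrays;

public class XorParity {
    // parity[i] = a[i] XOR b[i]; the very same operation also reconstructs
    // a lost disk from the surviving disk and the parity disk
    static byte[] xor(byte[] a, byte[] b) {
        byte[] r = new byte[a.length];
        for (int i = 0; i < r.length; i++) {
            r[i] = (byte) (a[i] ^ b[i]);
        }
        return r;
    }

    public static void main(String[] args) {
        byte[] disk1 = {1, 2, 3, 4};       // made-up data "disks"
        byte[] disk2 = {5, 6, 7, 8};
        byte[] parity = xor(disk1, disk2); // what the controller keeps on the parity disk

        // disk1 fails: XOR the surviving disk with the parity to get it back
        byte[] recovered = xor(disk2, parity);
        System.out.println(Arrays.equals(recovered, disk1)); // prints "true"
    }
}
```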

(NEED) Redundant storage keeps the data available during normal operation and prevents data loss in case of (CAUSE) hardware failure. All the data is copied all the time, and in case there is a failure, the data recovery causes only a few milliseconds of data access delay. Restoring the redundancy itself may take longer, in the range of a few minutes or hours, but the data remains available unless there are multiple failures.

(DISCOVERY) The data loss is automatically detected because the redundancy is checked upon every read.


Backup

(HOW) Backup copies data, usually to offline media. The copy is started at regular intervals, like every hour, day or week. When a backup is executed, the files that have changed since the last backup are copied to the backup media. A backup can cover the application data or the whole operating system. Many times the operating system is not backed up: when there is a need to restore the information, the OS is installed fresh from the installation media and only the application files are restored from the backup storage. This may allow smaller backup storage and faster backup and restore execution.

There are different techniques called full, incremental and differential backups. Creating backups without purging old data would grow the size of the backup media infinitely. This would not only cost ever-increasing money for buying the media, but the burden of cataloging and keeping the old media would also mean a huge operational cost. To optimize the costs, old backups are deleted following a special strategy. As an example, a strategy can require creating a backup every day and deleting the backups that are older than one week, except those that were created on a Monday. Backups older than a month can also be deleted, except those that were created on the first Monday of the month, and similarly, backups older than a year may be deleted except the backups of January and June.
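Such a retention strategy can be expressed as a simple predicate. The following is a simplified sketch (it implements only the weekly, monthly and first-Monday rules of the example, ignores the January/June refinement, and uses made-up dates):

```java
import java.time.DayOfWeek;
import java.time.LocalDate;

public class Retention {
    // keep daily backups for a week, Monday backups for a month,
    // and first-Monday-of-the-month backups for a year
    static boolean keep(LocalDate backup, LocalDate today) {
        if (!backup.plusDays(7).isBefore(today)) {
            return true; // younger than a week: always kept
        }
        boolean monday = backup.getDayOfWeek() == DayOfWeek.MONDAY;
        if (monday && !backup.plusMonths(1).isBefore(today)) {
            return true; // Monday backup younger than a month
        }
        boolean firstMonday = monday && backup.getDayOfMonth() <= 7;
        return firstMonday && !backup.plusYears(1).isBefore(today);
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2018, 6, 15);
        System.out.println(keep(LocalDate.of(2018, 6, 10), today)); // recent: true
        System.out.println(keep(LocalDate.of(2018, 5, 1), today));  // old Tuesday: false
    }
}
```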

(NEED) The data stored on the backup media is needed when it is discovered that some data has been deleted. (CAUSE) The reason for the deletion may be human error or sabotage. A user of the system may mistype the name of a record to be deleted, or may think that some data is not needed anymore, and only later is it realized that this was a mistake. Sabotage is a deliberate action when somebody having access to the system deletes or alters data as wrongdoing. In either case, the data is ruined by human interaction. It may also be possible that the data is ruined by a disaster (flood, fire, earthquake) or some hardware error that causes much more severe damage than a simple disk error.

The backup media itself can also be the target of sabotage, and a disaster can damage the backup media as well. For this reason, the backup is usually stored offline, disconnected from the main operating system, and many times the media is transferred to a different location.

When data needs to be restored, the backup media has to be copied back to the operational components to restore the information that was deleted or altered. The restore process needs to connect the backup media, or a copy of the backup media, to the operational components and copy the data back. The connecting is usually a manual process, because anything automated can itself become the target of sabotage. Because of the manual nature of the process, restoring a backup usually takes a long time. It may be a few minutes, hours or days. Usually, the older the backup, the more time is needed to get the operational data back.


Archive

(HOW) The creation of an archive is very similar to the creation of a backup. We copy some of the data to some offline media and we store it for a long time. The archive copy is usually made of data that has not been archived yet; an archive is thus usually incremental in nature. (CAUSE) An archive stores data that is deleted from the system deliberately by the normal operational processes, because it is not needed by the operation. The archive does not aim to provide a backup source for data that is found to be deleted accidentally. The data stored in the archive is never needed for normal operation. (DISCOVERY/NEED) The archive data is needed for extraordinary operation.

For example, a mobile company does not need the cell information of individual phones for a long time. It is operational data stored in the HLR and VLR databases, and this information is usually not even backed up. In case of data loss, it is faster to gather the actual information from the GSM network than to restore it from a backup that is probably fairly outdated (mobile phones move in the meantime). On May 9, 2002, robbers killed 8 people in the small Hungarian town of Mor. A few years later, when the investigation got to the point of examining the mobile phone movements in the area, the data was not available as operational data, but it was available in the archives. Analyzing GSM cell data to support a homicide investigation is not a normal operation of a telecom company.

You archive data that you are obligated to store by law, or that you suspect you may need for some unforeseeable future purpose. Records that describe business-level operations and transactions are usually archived.


As you can see from the above, none of these methods can replace the others. They supplement each other, and if you do not implement one of them, then you can expect the operation to be sub-par.

The example in the intro explains clearly why redundant storage does not eliminate the need for a backup. Similarly, archiving cannot be replaced by an otherwise proper backup solution. The error, in this case, will not hit you as harshly and evidently, because of the long-term nature of the archive. Nevertheless, an archive is not the same as a backup.

In some cases, I have seen an archive used as the source for restoring data. This is a forgivable sin only when the data loss has already happened and the archive still has the data you need. On the other hand, the archive does not contain all the operational data, only the data that has long-term business relevance.


This was a short introduction to redundant storage, backup, and archive. Do not think that understanding what is written here makes you an expert in any of these topics. Each of them is a special expert area with tons of literature to learn and loads of exercises to practice and ace. On the other hand, you should now understand the basic roles of these methods, what they are good for and what they are not good for, as well as the most important differences, so you can avoid the mistakes that others have already committed.

There is no need to repeat old mistakes. Commit new ones!

Java 9 Module Services

Wiring and Finding

Java has had a ServiceLoader class for a long time. It was introduced in Java 1.6, but a similar technology had been in use since around Java 1.2. Some software components used it, but the use was not widespread. It can be used to modularize the application (even more) and to provide a means to extend an application using some kind of plug-ins on which the application does not depend at compile time. Also, the configuration of these services is very simple: just put them on the class/module path. We will see the details.

The service loader can locate implementations of some interfaces. In an EE environment there are other methods to configure implementations. In the non-EE environment, Spring became ubiquitous; it has a similar, though not the same, solution to a similar, but not exactly the same, problem. Inversion of Control (IoC) and Dependency Injection (DI) provided by Spring are the solution to configuring the wiring of the different components, and they are the industry best practice for separating the wiring description/code from the actual implementation of the functionality that the classes have to perform.

As a matter of fact, Spring also supports the use of the service loader so you can wire an implementation located and instantiated by the service loader. You can find a short and nicely written article about that here.

ServiceLoader is more about how to find the implementation before we can inject it into the components that need it. Junior programmers sometimes mistakenly mix the two, and not without reason: they are strongly related.

Perhaps because of this, most of the applications, at least those that I have seen, do not separate the wiring and the finding of the implementation. These applications usually use the Spring configuration for both finding and wiring, and this is just OK. Although this is a simplification, we should live with it and be happy with it. We should not separate the two functions just because we can. Most applications do not need to separate them. They sit neatly on a single line of the XML configuration of a Spring application.

We should program at the level of abstraction that is needed, but never at a more abstract one.

Yes, this sentence is a paraphrase of a saying attributed to Einstein. If you think about it, you can also realize that this statement is nothing more than the KISS principle (keep it simple and stupid). The code, not you.

ServiceLoader finds the implementations of a certain interface. Not all the implementations that may be on the classpath; it finds only those that are “advertised”. (I will tell you later what “advertised” means.) A Java program cannot traverse all the classes that are on the classpath. Or can it?

Browsing the classpath

This section is a little detour, but it is important to understand why ServiceLoader works the way it does, even before we discuss how it does.

Java code cannot query the classloader to list all the classes that are on the classpath. You may say I lie, because Spring does browse the classes and finds the implementation candidates automatically. Spring actually cheats; I will tell you how. For now, accept that the classpath cannot be browsed. If you look at the documentation of the class ClassLoader, you do not find any method that would return an array, stream or collection of the classes. You can get the array of the packages, but you cannot get the classes even from the packages.

The reason for this is the level of abstraction at which Java handles classes. The class loader loads the classes into the JVM and the JVM does not care from where. It does not assume that the actual classes are in files. There are a lot of applications that load classes not from a file. As a matter of fact, most applications load some of their classes from some different medium. Your programs do too; you just may not know it. Have you ever used Spring, Hibernate or some other framework? Most of these frameworks create proxy classes during run-time and load them from memory using a special class loader. A class loader cannot tell whether the framework it serves will ever create a new class. The classpath, in this case, is not static. There is not even such a thing as a classpath for these special class loaders: they find the classes dynamically.

Okay. Well said and described in detail. But then again: how does Spring find the classes? Spring actually makes a bold assumption. It assumes that the class loader is a special one: a URLClassLoader. (And as Nicolai Parlog writes in his article, this is not true in Java 9 any more.) It works with a classpath that contains URLs, and it can return the array of those URLs.

ServiceLoader does not make such an assumption and as such it does not browse the classes.

How Does the ServiceLoader Find a Class?

The ServiceLoader can find and instantiate classes that implement a specific interface. When we call the static method ServiceLoader.load(interfaceKlass), it returns a “list” of classes that implement this interface. I put “list” between quotes because technically it returns an instance of ServiceLoader, which itself implements Iterable, so we can iterate over the instances of the classes that implement the interface. The iteration is usually done in an enhanced for loop, with the call to load() standing after the (:) colon.
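The loading and iteration pattern can be sketched like this. GreetingService is a hypothetical stand-in for a real service interface; since this standalone sketch registers no provider, the iteration simply finds nothing:

```java
import java.util.ServiceLoader;

public class ServiceLoaderDemo {

    // hypothetical service interface standing in for a real one
    interface GreetingService {
        String greet();
    }

    public static void main(String[] args) {
        // load() returns a ServiceLoader instance, which is Iterable
        ServiceLoader<GreetingService> services =
                ServiceLoader.load(GreetingService.class);
        int found = 0;
        // the call to load() typically stands after the (:) colon
        for (GreetingService service : services) {
            System.out.println(service.greet());
            found++;
        }
        // no provider is registered in this standalone sketch
        System.out.println("providers found: " + found);
    }
}
```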

For the ServiceLoader to find the instances, the JAR file that contains an implementation has to have a special file in the directory META-INF/services named after the fully qualified name of the interface. Yes, the name has dots in it and no specific file name extension, but nevertheless, it has to be a text file. It has to contain the fully qualified name of the class in that JAR file that implements the interface.
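With the javax0 package names used later in this article, the provider-configuration file would be META-INF/services/javax0.serviceinterface.ServiceInterface, containing the single line

```
javax0.serviceprovider.Provider
```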

The ServiceLoader invokes the ClassLoader method findResources to get the URLs of these files, reads the names of the classes, and then asks the ClassLoader again to load those classes. The classes have to have a public zero-argument constructor so that the ServiceLoader can instantiate each of them.
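In essence, once it has read a class name from the resource, the ServiceLoader does something like the following. This is a simplified sketch with hypothetical names; the real code also verifies that the class implements the requested interface:

```java
public class ReflectiveInstantiation {

    interface Service {
        String name();
    }

    public static class Impl implements Service {
        // public zero-argument constructor, as the ServiceLoader requires
        public Impl() {
        }

        @Override
        public String name() {
            return "impl";
        }
    }

    public static void main(String[] args) throws Exception {
        // the class name would come from the META-INF/services file
        Class<?> klass = Class.forName("ReflectiveInstantiation$Impl");
        Service service = (Service) klass.getDeclaredConstructor().newInstance();
        System.out.println(service.name());
    }
}
```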

Using those files to hold the names of the classes, piggybacking class loading and instantiation on resource loading, works, but it is not too elegant.
Java 9, while keeping the annoying META-INF/services solution, introduced a new approach. With the introduction of Jigsaw we have modules, and modules have module descriptors. A module can declare that it provides a service that a ServiceLoader can load, and a module can also declare which services it needs to load via the ServiceLoader. This way the discovery of the implementations of the service interface moves from textual resources into Java code. The pure advantage is that coding errors related to wrong names can be identified at compile time or module load time, making failing code fail faster.

To make things more flexible, or just uselessly more complex (the future will tell), Java 9 also accepts a class that does not implement the service interface but has a public static provider() method that returns an instance of a class that implements the interface. (Btw: in this case the provider class may even implement the service interface if it wants to, but it is generally a factory, so why would it. Mind SRP.)
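Such a factory may look like the following sketch (the names are hypothetical; the module system finds the class through the provides … with declaration and then calls the static provider() method):

```java
public class ProviderMethodDemo {

    interface Service {
        String name();
    }

    // the provider class does not implement Service; it is only a factory
    public static class ServiceFactory {
        public static Service provider() {
            // return any instance that implements the service interface
            return () -> "created by provider()";
        }
    }

    public static void main(String[] args) {
        // what the ServiceLoader does, in essence, when it sees a provider() method
        Service service = ServiceFactory.provider();
        System.out.println(service.name());
    }
}
```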

Sample Code

You can download a multi-module maven project from

This project contains three modules: Consumer, Provider and ServiceInterface. The consumer calls the ServiceLoader and consumes the service, which is defined by the interface javax0.serviceinterface.ServiceInterface in the module ServiceInterface and implemented in the module Provider. The structure of the code can be seen in the following picture:

The module-info files contain the declarations:

module Provider {
    requires ServiceInterface;
    provides javax0.serviceinterface.ServiceInterface
      with javax0.serviceprovider.Provider;
}

module Consumer {
    requires ServiceInterface;
    uses javax0.serviceinterface.ServiceInterface;
}

module ServiceInterface {
    exports javax0.serviceinterface;
}

Here I will tell you some of the stupid mistakes I made while creating this very simple example, so that you can learn from my mistakes instead of repeating them. First of all, there is a sentence in the Java 9 JDK documentation of the ServiceLoader class that reads

In addition, if the service is not in the application module, then the module declaration must have a requires directive that specifies the module which exports the service.

I do not know what this sentence wants to say, but what it means to me is not true. Maybe I misinterpret it, which is likely.

Looking at our example, the Consumer module uses something that implements the javax0.serviceinterface.ServiceInterface interface. This something is the Provider module and the implementation in it, but that is decided only at run time and can be replaced by any other fitting implementation. Consumer needs only the interface, and thus it has to have a requires directive in the module-info file requiring the ServiceInterface module. It does not have to require the Provider module! The Provider module similarly depends on the ServiceInterface module and has to require it. The ServiceInterface module does not require anything. It only exports the package that contains the interface.

It is also important to note that neither the Provider nor the Consumer module is required to export any package. Provider provides the service declared by the interface and implemented by the class named after the with keyword in the module-info file. It provides this single class for the world and nothing else. Exporting the package containing it would be redundant and could unnecessarily expose other classes that happen to be in the same package but are module internal. Consumer is invoked from the command line using the -m option, and that also does not require the module to export any package.
The command line to start the program is

java -p Consumer/target/Consumer-1.0.0-SNAPSHOT.jar:Provider/target/Provider-1.0.0-SNAPSHOT.jar:ServiceInterface/target/ServiceInterface-1.0.0-SNAPSHOT.jar \
  -m Consumer/javax0.serviceconsumer.Consumer

and it can be executed after a successful mvn install command. Note that the maven compiler plugin has to be at least version 3.6; otherwise, the ServiceInterface-1.0.0-SNAPSHOT.jar will be on the classpath instead of the module path during the compilation, and the compilation will fail because it does not find the module-info.class file.
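The relevant part of the pom.xml may look something like the following (the exact version number here is an assumption; anything from 3.6 up should do):

```
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <version>3.6.1</version>
  <configuration>
    <source>9</source>
    <target>9</target>
  </configuration>
</plugin>
```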

What is the Point?

The ServiceLoader can be used when an application is wired together with some of its modules only at run time. A typical example is an application with plugins. I myself ran into this exercise when I ported ScriptBasic for Java from Java 7 to Java 9. The BASIC language interpreter can be extended with classes containing public static methods annotated with BasicFunction. The previous version required the host application embedding the interpreter to list all the extension classes, calling an API in the code. This is superfluous. The ServiceLoader can locate the implementations of a service interface (ClassSetProvider) defined in the main program, and then the main program can call the service implementations one after the other and register the classes returned in the sets. That way the host application does not need to know anything about the extension classes; it is enough that the extension classes are put on the module path and that each provides the service.
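The wiring in the host application can be sketched like this. The ClassSetProvider interface is reconstructed here from the description above; in this standalone form, with no provider modules on the path, the collected set is simply empty:

```java
import java.util.HashSet;
import java.util.ServiceLoader;
import java.util.Set;

public class ExtensionRegistry {

    // service interface defined in the main program, as described above
    public interface ClassSetProvider {
        Set<Class<?>> provide();
    }

    // collect the extension classes from every provider found on the module path
    public static Set<Class<?>> collectExtensionClasses() {
        Set<Class<?>> extensions = new HashSet<>();
        for (ClassSetProvider provider : ServiceLoader.load(ClassSetProvider.class)) {
            extensions.addAll(provider.provide());
        }
        return extensions;
    }

    public static void main(String[] args) {
        // no provider modules are present in this standalone sketch
        System.out.println("extension classes: " + collectExtensionClasses().size());
    }
}
```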

The JDK itself also uses this mechanism to locate loggers. The Java 9 JDK contains the System.LoggerFinder class, which can be implemented as a service by any module, and if there is an implementation that the ServiceLoader can find, the method System.getLogger() will use it. This way logging is tied neither to the JDK nor to a library at compile time. It is enough to provide the logger at run time, and the application, the libraries the application uses and the JDK will all use the same logging facility.
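Using it from application code is simple; the logger returned by System.getLogger() is backed by whatever LoggerFinder implementation the ServiceLoader discovered, or by the JDK's default logger when there is none:

```java
public class LoggerFinderDemo {

    public static void main(String[] args) {
        // resolved through the System.LoggerFinder service at run time
        System.Logger logger = System.getLogger("demo");
        logger.log(System.Logger.Level.INFO, "hello from the platform logging API");
        System.out.println("logger name: " + logger.getName());
    }
}
```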

With all these changes in the service loading mechanism, making it part of the language instead of being piggybacked on resource loading, one may hope that this type of service discovery will gain momentum and will be used on a broader scale than before.

Do We Need User Acceptance Test?

(Betteridge’s law of headlines does not apply.)

I was wondering for a long time why we do UAT. Do not be mistaken: I do not want to say we should not do that type of testing. I just did not understand why it is called User Acceptance Test. To be more precise, I did not understand why it is User. And if it is User, then why is it Acceptance?

In this article I will ruminate a bit on this, and by the end I will come to the conclusion that UAT really is UAT.

When we develop professional software, we deliver it to the customer. Not to the user. To the customer. Sometimes these two actors are the same person, but that is only a special case. The two roles are different, and in our business model we should have two stick figures, one for each. We want the customer to accept what we deliver, and we want the customer to be happy with what we have achieved. It is not the user. We develop software professionally from start to finish, and that includes being paid for it. The customer is the actor who pays the bill. It is not the user.

The difference is clear in an example: we deliver administrative software that helps a large bank, insurance company or mining company perform its administration with less human resource, replacing manual work with cheaper software labor. In that case the users are the administrators, and the customer is the director of the company. The director never ever touches any administrative screen. The director, a.k.a. the customer, is happy if the users can use the system and the administration goes well. What we really want to achieve is customer satisfaction.

What we need is Customer Satisfaction Assurance Test (CSAT).

The thing is that we cannot measure that directly. The same is true with unit tests: we need functional coverage and we measure code coverage. The two are not the same, but there is a correlation between them. The same is true with technical interviews. What we need to know is whether the candidate will perform well in the position, working for the firm. What we do: we ask simple or complex, but usually annoying and stupid, questions and hopelessly hope that the correctness of the answers will correlate with the future performance the company wants. The same is true with stocks, marriage, health… We fail in all of those, though we do our best. That is UAT.

We cannot do CSAT. What we do is UAT, and we hope that the result will correlate with customer satisfaction: they will pay our bills, future orders will come, and we will live a financially viable and happy life. What a miserable failure most of the time! Never mind. It is not generally true, but we can safely assume as a working hypothesis that the customer is satisfied if the users are okay.

Sometimes there are no users per se, especially in the IoT arena, and that is one reason why nobody cares about, for example, security. That is why we are doomed to use shitty IoT applications which put you in a position that may be obnoxiously familiar if you have ever visited a proctologist or gynecologist. However, this is another story. Usually there are users.
We want the users to be happy.

Wait! No! No! And a third time: no! We actually do not mind if the user is happy, and it is generally good to have the user happy, but we do not explicitly want them to be. We also do not want them not to be. That is not the question. We just do not bother. We want customer satisfaction, which is, ad 1, not the user's and, ad 2, a level less than happy. The customer is satisfied if the business needs are met, and business needs can only be met if the users can work with the system. Users will accept a new system only if they can work with it.

The next question is: why do we not want the user to be happy? The aggressive answer would be that it is a different profession. The professional answer is that the PNL analysis does not justify it (PNL = profit & loss). We have a maximum revenue that depends on many factors. Mostly on how well our sales people can promote. (Side note: did you know that lists the word “promote” as a synonym for “lie”?) However, this time let us ignore this factor, which by the way happens to be the most significant revenue factor, but is irrelephant from the UAT point of view. We focus on user happiness only for now. The formula is:

Income = I(user happiness)

This is a monotonically increasing function. The happier the user is, the more income we generate. The increase may lag behind the increase in user happiness; it may even be zero, but a happier user has never meant less income. Well, maybe if you work as a sexton, but that is far from software development.

If the program works faster, the UI is simpler to understand and the functions work seamlessly, then the users are happier and they will love us. On the other hand, achieving that increases the cost. Faster operation usually needs more hardware; a simpler UI needs more design and analysis work and many times UI refactoring, and these do not come free.

We also have costs that are also dependent on the user happiness level we want to achieve:

Costs = C(user happiness)

The profit in PNL is the difference between the income and the costs. Let’s denote user happiness with UH because this looks more scientific:

P = I(UH) – C(UH)

Both I and C increase with UH. The usual characteristic of these functions is that I increases fast when UH is small. For example, when we have a program with a bug that makes it unusable, the income is fairly low. I mean zero. Nobody except government agencies will pay for software that does not work. With a small investment we can fix the bug and get software that is just usable, and the users accept it. Our income jumps: we will get paid.

If we look at the income on the project level, this is the maximum we can get. Any further investment to increase user happiness is a waste of the shareholders’ money.

We can also look at a broader scale. We believe that creating better than just usable software will generate further revenue. If people have a picture of us as a quality software provider, it will not hurt us. In that case, the income should be calculated considering the future income amounts with their probability factors, discounted to present value. That is when we feel lost. How can you do that? Sorry, you cannot. This is where science turns into art. Still, it is important to know when you develop software: there are features that you unfold to close the project and get paid, and there are features that you develop in the woeful hope of future income.

As we move further along the user happiness axis calculating the income and cost functions, we will see that the income does not grow rapidly any further. On the other hand, after a while the cost starts to grow quite rapidly. When you have developed the main use cases, the happy path execution of the process works and the most frequent exceptional cases are covered, then you have to stop. Nobody should develop into a mobile billing application a functionality that calculates the roaming costs for the phone of a deceased whose phone was accidentally buried with him in the coffin (or with her). Never happens. If, say, the person gets exhumed, moved to another country and the phone still works, then in this single special case handling the situation manually or even by means of pigeon mail is cheaper than developing (and maintaining!!!) the software for the special case. (You see: even sextons are not immune to software problems.) When you have a user interface that can perform the most frequent use cases with one click, the less frequent ones with two clicks and some rare use cases with three or more clicks, then there is no bonus in making everything one click. Enough is enough. 20% of the features will generate 80% of the revenue. That is Pareto.


UAT is only one of the procedures that we regularly perform. It is industry best practice. Everybody does it, so we also do it. That reasoning is bad.

When there is an industry best practice we can follow it, but senior engineers should also know the reason.

In the case of UAT we should know that it is not the “user” whose happiness and satisfaction is primarily important to us. It is the customer's, but one cannot exist without the other.

P.S.: When I said “different profession”, I was implying psychologist.