Java Deep

Pure Java, what else

Knowing the bits

We use complex systems. My mother once said that there could be little leprechauns behind the TV screen redrawing the screen 50 times a second for all she cared. (At least she knew that TV in Europe displayed 50 (half) screens every second.) Most people do not care about the electronics and the software around us. The trend is that this technology penetration is going to become even more dense. Electronics gets cheaper, programming becomes easier, and soon toilet paper will have one-time-use embedded computers on it. (Come up with a good application!) Face recognition is not the privilege of the NSA, CIA, KGB or Mossad, and the spread of the technology does not stop at the level of big corporations like Facebook or Google. Shops have started to install cameras and software that recognize and identify frequent buyers, helping the work of the sales staff. People get used to it, and IT personnel are no different, are we?

Kind of, yes. The difference is that we are interested in the details: how those leprechauns do their job. We know that these days there are liquid crystals in the screen, that they are controlled by low voltage signals (at least compared to the voltages of the former CRT solutions), and that there is a processor in the TV/toaster/toilet paper, programmed in a language called, for example, Java.

We, Java programmers, program these applications, and we use not only the language (including the runtime) but also layered software: frameworks. How does this layered software work? Should we understand it, or should we just use it and hope that it works?

The more you know a framework the better you can use it.

Better means faster, more reliable, and producing code that is more likely to be compatible with future versions. On the other hand, there has to be a reasonable point where you halt learning and start using. There is no point knowing all the details of a framework if you never start using it. You should aim for the value you generate.

At the other end of the line, however, if you do not have enough knowledge of the framework you may end up digging a hole with a hammer instead of a shovel. I usually feel confident when my knowledge reaches the level where I understand how they (the developers of the framework) did it. When I can bravely say:

If I had the time (sometimes perhaps more than the lifetime of a single person) I could develop that framework myself.

Of course, I will not, because I do not have the time and also, more importantly, because there is no point developing something that has already been developed with appropriate quality. Or is there?

I could do it better.

I have heard that many times from junior programmers, and from programmers who considered themselves not that junior. The correct attitude would have been:

I could do it better, but I won’t because it is done and is good enough.

You do not need the best. You just need a solution that is good enough. There is no point investing more if there is no extra leverage. There is no point investing more even when there is leverage, if the return is lower than what the same investment would yield in other areas.

Generally, that is how it works when you are a professional. Face it!

We hate/love lambda

We have the long-awaited lambda feature in Java 8. And we love it. We love to use it in places where we used anonymous classes. We love to use it where we used some looping construct. Now we use functional interfaces instead, and thus we get faster performance using parallel streams and we get more readable code. This is a short period of euphoria, soon to be replaced by the usual low-orbiting WTFs when reviewing others' code. I write nice and readable code, but I continually experience that others write ugly, unreadable and wrong code. (Please feel the delicate odour of irony.) And I expect lambda will make it worse.

There are two main values in a programming language feature:

  1. How well can you express your ideas utilizing the feature.
  2. How badly one can use (or abuse) the feature.

Good language features aid a lot in expressing yourself and can not be abused. They just shine brightly and readably in all their glory, no matter how hard a bad guy may try to use them the wrong way. They end up as properly working and readable code. The only problem with these ideal language features is that they do not exist.

On the other hand, bad language features are hard to use to express your ideas and are easy to misuse. And contrary to the previous kind: they exist.

The reality is that most languages have features that provide great ways to express your ideas, but at the same time can also be misused. Usually the easier it is to express yourself, the easier it is to misuse a feature. The Java language does not shine with brilliant syntax to briefly express your ideas, but at the same time you can not really abuse it. You can abuse it a bit, but there is no Java counterpart of the obfuscated C contest. Java is verbose, dull, boring. It is so dull that the language that wanted to overcome this dullness is named Groovy. Groovy is many steps ahead of Java in implementing shiny features. They have the tools and they are not afraid to use them. Groovy developers are not, and should not be, afraid of unreadable code written by the bad guys, or else they are going to have a bad time. If you are to maintain Groovy code, prepare for the worst. No matter how good and beautiful Groovy code could be, it will just be bad, because it was written by an average programmer.

The Groovy language implements a feature if the feature can be used well and brilliantly; how badly the feature can be abused is not really considered. The designers of the Groovy language assume that the programmers are all brilliant and experienced artists. We all are, aren't we?

Implementing lambda, Java took a step in the Groovy direction. It is a great feature that can replace ugly anonymous class usage, aid a functional style, and so on. But at the same time …
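
To make the point concrete, here is a minimal sketch of the kind of replacement I mean (a made-up Comparator example, not from any particular project):

import java.util.Comparator;

public class LambdaVsAnonymous {
    // Java 7 and before: anonymous class
    Comparator<String> byLength = new Comparator<String>() {
        @Override
        public int compare(String a, String b) {
            return Integer.compare(a.length(), b.length());
        }
    };

    // Java 8: a lambda expression with the very same behavior
    Comparator<String> byLengthLambda = (a, b) -> Integer.compare(a.length(), b.length());
}

The lambda says the same thing with much less ceremony; whether the reader still sees at a glance what it does in more complex cases is exactly where the WTFs start.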

OpenSource License Manager

What is a License Manager?

License managers are used to enforce license rights, or at least to support the enforcement. When you develop an open source program, there is not much you need to, or can, do to enforce license rights. The code is there, and if anyone just wants to abuse the program, there is nothing technical that could stop them. Closed source programs are different. (Are they?) In that case the source code is not available to the client. It is not possible to alter the program so that it circumvents the license enforcement code, and thus there is a real role for license rights enforcement.

But this is not true.

The truth is that there is no fundamental difference between closed and open source code in this respect. Closed source code can also be altered. The ultimate “source” for the execution is there after all: the machine code. There are tools that help to analyze and decode the binary into a more or less human readable format, and thus it is possible to circumvent the license management. It is possible, and there is a great wealth of examples for it. On some sites hosted in some countries you can simply download the cracked version of practically any software. I do not recommend doing that, and not only for ethical reasons. You just never know which of the sites are funded by secret services or criminals (if there is any difference), and you never know whether you install spy software on your machine using the cracked version.

Once I worked for a company where one of the success measurements of their software was the number of days after a release until the cracked version appeared on the different sites, compared to the same value for the competitor. The smaller the number was for their software, the happier they were. Were they crazy? Why were they happy to know that their software was cracked? When the number of days was one single day, why did they not consider applying stronger license enforcement measures, like morphing code, hardware keys and so on?

The answer is the following. This company knows very well that license management is not there to prevent unauthorized use. It can be used that way, but that will have two major effects which will ruin your business:

  • Writing license management code means spending your time on non-productive code.
  • License management (this way) works against your customer.

Never implement license management against your customer.

When your license management solution is too restrictive, you may restrict the software use of your customer. When you deliver your code with a hardware key, you impose inconvenience on your customer. When you bind your license to the Ethernet MAC address of the machine the application is running on, again: you work against your customer.

Set<User> != Set<Customer>

Face the sad truth: there will always be people who use your software without paying for it. They are not your customers. Do they steal from you? Not necessarily. If someone is not buying your software, he is not your customer.

If you knew that there was no way they would pay for the software, and the decision were in your hands whether they should use your software or that of your competitor, what would you choose? I guess you would like your software to be used, to get more feedback and more knowledge, even in the area of non-customers. People using your software are more likely to become your customers than people not using it. This is why big companies give out educational licenses to universities and other academic institutions.

Should we use license management at all in that case? Is license management bad to the core, in all aspects? My answer is that it is not. There is a correct use case for license management, even when the software is open source (but not free, like Atlassian products). To find and understand this use case, there is one major thing to understand:

The software is for the customer, and every line of code has to support the customers in reaching their business goals.

Paying the fee for the software is also in the customers' interest. If nobody finances a software product, the product dies. There is no such thing as a free lunch. Somebody has to pay for it. Becoming a customer and paying for the software used is the most straightforward business model, and it provides the strongest feedback and control for the customer over the vendor to get the features needed.

At the same time, paying for the software use is not the core business of the customer. Paying for the resources used supports their business goals only indirectly. This is where license management comes into the picture. It helps the customers do their duties. It helps them remember their long term needs. This also means that license management should not prevent functionality. No functionality should stop if a license expires. Not to mention functionality that may prevent access to data that actually belongs to the customer.
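
To make this concrete, here is a minimal sketch of what I mean (the License type and its methods are made up for illustration): expiry produces a warning, never a stopped application.

import java.time.LocalDate;

interface License {
    boolean isExpired();
    LocalDate expiryDate();
}

class LicenseCheck {
    // remind the customer, but keep all functionality working
    static void check(License license) {
        if (license.isExpired()) {
            System.err.println("Your license expired on " + license.expiryDate()
                    + ". Please renew it. The application keeps working.");
        }
    }
}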

If you approach license management with this mindset you can see that even open source (but not free) software may need it.

License Management Tool: license3j

Many years ago I was looking for a license management library and I found that there was no open source one. I wanted to create an open source (but not free) application, and that required the license management to be open source as well. What I found was also overpriced, taking into account our budget, which was just zero for a part time start-up software (which actually failed miserably business wise, but that is another story). For this reason I created License3j, which surprisingly became one of the most used libraries among my open source projects.

License3j is very simple in terms of business objects. It uses a simple property file and lets the application check the content of the individual fields. The added value is the handling of the electronic signature and the checking of the authenticity of the license file. Essentially it is hardly more than a single class file. To use it, add the Maven dependency:

<dependency>
	<groupId>com.verhas</groupId>
	<artifactId>license3j</artifactId>
	<version>1.0.4</version>
</dependency>

Feel free to use it if you like.

How not to use Java 8 default methods

Warning: you can not unsee this once you have read it

I was talking about the multiple inheritance of default methods in the last blog article and how they behave at compile time and run time. This week I look at how to use default methods to do real inheritance, which default methods were actually not designed for. For this very reason, please read these lines at your own risk, and do not imply that this is a pattern to be followed, just as well as do not imply the opposite. What I write here are some coding techniques that can be implemented using Java 8, but their usability is questionable, at least for me. I am also a bit afraid of letting some ifrit out of the bottle, but on the other hand those ifrits just do not stay in there anyway. Some day somebody would let them out. At least I attach the warning sign.

Sample problem

A few years ago I worked on an application that used a lot of different types of objects that each had a name. After many classes started to contain

public String getName(){...}
public void setName(String name){...}

methods that were just setters and getters, the copy paste code smell filled the room unbearably. Therefore we created a class

class HasName {
  public String getName(){...}
  public void setName(String name){...}
}

and each of the classes that had a name just extended this class. Actually, it did not work for long. There were classes that already extended other classes. In those cases we just tried to move HasName upward in the inheritance line, but in some cases it simply did not work. As we went up the line, reaching for the top, we realized that those classes and some of their other descendants do not have a name, so why force a name on them? To be honest, in real life it was a bit more complex than just having a name. If it had been only names, we could have lived with it. It was something more complex, which would just make the topic even more complicated, and believe me: it is going to be complex enough.

Summary: we could not pull the name handling of some of the objects up into a common class. But now we can do that using default methods.

HasName interface with default implementation

Default methods just deliver a default functionality. A default method can access the this variable, which is always the object that implements the interface and on whose behalf the method was invoked. If there is an interface I and a class C implements the interface, then when a method is invoked on an object C c, the variable this is actually the object c. How would you implement getName() and setName()?

These are a setter and a getter that access a String variable that is in the object. You can not access that from the interface. But it is not absolutely necessary that the value is stored IN the object. The only requirement is that whatever is set for an object, the same is got back. We can store the value somewhere else, one for each object instance. So we need some value that can be paired to an object, and the lifetime of the value has to be the same as the lifetime of the object. Does it ring a bell?

It is a weak hash map! Yes, it is. And using that you can easily implement the HasName interface.

import java.util.WeakHashMap;

public interface HasName {
    class Extensions {
        // one entry per object implementing HasName; the entry vanishes
        // when the object itself is garbage collected
        private static final WeakHashMap<HasName, String> map = new WeakHashMap<>();
    }
    default void setName(String name) {
        Extensions.map.put(this, name);
    }
    default String getName() {
        return Extensions.map.get(this);
    }
}

All you have to do is append HasName to the end of the list of interfaces your class implements, and it magically has a name.
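
For example (the class name is made up for illustration):

public class Project implements HasName {
    // no name field, no getter/setter code: everything is inherited

    public static void main(String[] args) {
        Project project = new Project();
        project.setName("license3j");
        System.out.println(project.getName()); // prints license3j
    }
}

When a Project instance becomes garbage, the weak hash map lets the name entry go with it.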

In this example the only value stored is a String. However, you can have any class instead of String, and you can implement not only setters and getters but any methods that do something with that class. Presumably these implementations will live in the class, and the default methods will only delegate to it. You can have the class somewhere else, or as an inner class inside the interface. A matter of taste and style.

Conclusion

Interfaces can not have instance fields. Why? Because in that case they would not be interfaces but classes. Java does not have multiple implementation inheritance. Perhaps it has, but in a "please do not use it" kind of way. The default method is a technological mistake. You can call it a compromise. Something that was needed to retain the backward compatibility of the JDK libraries when they were extended with functional methods. Still, you can mimic fields in interfaces using weak hash maps, gaining access to an inherited "vtable" of fields and methods to delegate to. With this you can do real multiple inheritance. The kind your mother always warned you about. I told you, mate!

Another warning: the above implementation is NOT thread safe. If you use it in a multi-thread environment you may get a ConcurrentModificationException, or it may even happen that calling get() on a weak hash map gets into an infinite loop and never returns. I will not tell you how to fix the usage of weak hash maps in this scenario. Or, well, I changed my mind, and I will: use default methods only for what they were designed for.

Java 8 default methods: what they can and can not do

What default method is

With the release of Java 8 you can modify interfaces, adding new methods, so that the interface remains compatible with the classes that implement it. This is very important in case you develop a library that is going to be used by several programmers from Kiev to New York. Until the dawn of Java 8, if you published an interface in a library you could not add a new method without risking that some application implementing the interface would break with the new version.

With Java 8, is this fear gone? No.

Adding a default method to an interface may render some class unusable.

Let’s first see the finer points of default methods.

In Java 8 a method can be implemented in an interface. (Static methods can also be implemented in an interface as of Java 8, but that is another story.) A method implemented in an interface is called a default method and is denoted by the keyword default as a modifier. When a class implements an interface it may, but does not need to, implement a method already implemented in the interface. The class inherits the default implementation. This is why you may not need to touch a class when an interface it implements changes.
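
A minimal illustration (the interface and the class are made up):

public interface Greeter {
    String name(); // abstract method, as in any pre-Java 8 interface

    default String greet() { // implemented right in the interface
        return "Hello, " + name();
    }
}

// a class written against the old, greet()-less version of the interface
// compiles unchanged and inherits greet() for free
class World implements Greeter {
    @Override
    public String name() {
        return "World";
    }
}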

Multiple inheritance?

Things start to get complicated when a concrete class implements more than one (say two) interfaces and the interfaces implement the same default method. Which default method will the class inherit? The answer is: none. In such a case the class has to implement the method itself (either directly or by inheriting it from a higher class).
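
When you do have to implement the method yourself, Java 8 also gives you a way to explicitly pick one of the inherited defaults; a sketch, using the interface names of the sample code later in this article:

public class C implements I1, I2 {
    @Override
    public void m() {
        I1.super.m(); // explicitly delegate to the default implementation in I1
    }
}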

The same is true when only one of the interfaces implements the default method and the other one only declares it as abstract. Java 8 tries to be disciplined and avoid “implicit” things. If the method is declared in more than one interface, then no default implementation is inherited and you get a compile time error.

However, you do not get a compile time error if your class was already compiled. In this respect Java 8 is not consistent. It has its reason, which I do not want to detail here or get into a debate about, for various reasons (e.g.: the release is out, debate time is long over, and it was never on this platform).

  • Say you have two interfaces, and a class implementing the two interfaces.
  • One of the interfaces implements a default method m().
  • You compile all the interfaces and the class.
  • You change the interface not containing the method m() to declare it as an abstract method.
  • Compile the modified interface only.
  • Run the class.

[figure: default method multiple inheritance]
In this case the class runs. You can not compile it again with the modified interfaces, but if it was compiled with the older version, it still runs. Now

  • Modify the interface having the abstract method m() and create a default implementation.
  • Compile the modified interface.
  • Run the class: failure.

When there are two interfaces providing a default implementation for the same method, the method can not be invoked in the implementing class unless the class implements it (again: either directly or inherited from another class).
[figure: invalid multiple inheritance of default methods]
The class is compatible. It can be loaded with the new interface. It can even start executing so long as there is no invocation of the method that has a default implementation in both interfaces.

Sample code

[figure: directory structure of the test]

To demonstrate the above I created a test directory for the class C.java and three subdirectories for the interfaces, in the files I1.java and I2.java. The root directory of the test contains the source code of the class C in the file C.java. The directory base contains the interface versions that are good for both compilation and execution. I1 contains the method m() with a default implementation. The interface I2 does not contain any method for now.

The class contains a main method so that we can execute it in our test. It checks whether there is any command line argument, so that we can easily execute it with and without invoking the method m().

~/github/test$ cat C.java 
public class C implements I1, I2 {
  public static void main(String[] args) {
    C c = new C();
    if( args.length == 0 ){
      c.m();
    }
  }
}
~/github/test$ cat base/I1.java 
public interface I1 {
  default void m(){
    System.out.println("hello interface 1");
  }	
}
~/github/test$ cat base/I2.java 
public interface I2 {
}

We can compile and run the class using the command lines:

~/github/test$ javac -cp .:base C.java
~/github/test$ java -cp .:base C
hello interface 1

The directory compatible contains a version of the interface I2 that declares the method m() as abstract, and, for technical reasons, an unaltered copy of I1.java.

~/github/test$ cat compatible/I2.java 

public interface I2 {
  void m();
}

This can not be used to compile the class C:

~/github/test$ javac -cp .:compatible C.java 
C.java:1: error: C is not abstract and does not override abstract method m() in I2
public class C implements I1, I2 {
       ^
1 error

The error message is very precise. We still have the C.class from the previous compilation, though, and if we compile the interfaces in the directory compatible, we get two interfaces that can still be used to run the class:

~/github/test$ javac compatible/I*.java
~/github/test$ java -cp .:compatible C
hello interface 1

The third directory, wrong, contains a version of I2 that also defines the method m():

~/github/test$ cat wrong/I2.java 
public interface I2 {
  default void m(){
    System.out.println("hello interface 2");
  }
}

We should not even bother trying to compile the class C with it. Even though the method is defined twice, the class can still be executed so long as it does not invoke the method, but it fails as soon as we try to invoke the method m(). This is what we use the command line argument for:

~/github/test$ javac wrong/*.java
~/github/test$ java -cp .:wrong C
Exception in thread "main" java.lang.IncompatibleClassChangeError: Conflicting default methods: I1.m I2.m
	at C.m(C.java)
	at C.main(C.java:5)
~/github/test$ java -cp .:wrong C x
~/github/test$

Conclusion

When you start to move your library to Java 8 and you modify your interfaces adding default implementations, you will probably not have problems. At least that is what the Java 8 library developers hope, adding functional methods to the collections. Applications using your library still rely on Java 7 libraries that do not have default methods. When different libraries are used and modified, there is a slight chance of conflict. What can you do to avoid it?

Design your library APIs as before. Do not get comfortable relying on the possibility of default methods. They are a last resort. Choose names wisely to avoid collisions with other interfaces. We will learn how Java programming develops using this feature.

Documenting API using Concordion

“Concordion is an open source tool for writing automated acceptance tests in Java.” It is a handy little tool, simple to use, and even the source code of the tool is good style. You describe the tests using HTML with special markup, and when you run your special unit tests using the ConcordionRunner it processes the HTML and replaces the special tags with the actual values fetched from the tests. In the end you get an HTML file colored with red and green spots where tests failed or succeeded, respectively. The result is easy on the eyes, good for BAs, and makes it easy to spot any error. In this way it is similar to FitNesse and GreenPepper.

Even though the tool was designed for automated acceptance testing — which, one could argue, is a nonsense term — I wanted to use it to document an API.

The usual way to document an API is JavaDoc. JavaDoc includes the signatures of the methods, and comments. The comments are supposed to describe the way the method should be used. This is a fairly good approach but has some shortcomings:

  • The comments become outdated. Developers change the way the method is to be used, but forget to update the JavaDoc.
  • A method often has to be used in different ways, together with other objects and methods, which a per-method comment can hardly show.

For this reason many developers believe that unit tests are the real documentation of an API. These are two approaches, each with good and bad aspects. How could we leverage the best of both worlds?

I decided to write a small library that can be included in Concordion fixtures using delegation, and which can read the source code of the fixture, or any other source code, cut some lines out of the code and return them as a string. Referencing the method, the Concordion output can include (presumably preformatted) Java code. This way the resulting HTML contains living documentation, including actual code without manual copy pasting, which decreases the danger of the documentation becoming outdated.

The project is available from GitHub and also from the Sonatype repository. The documentation of the project was also created this way.

Object Interning

Java stores the string constants appearing in the source code in a pool. In other words when you have a code like

String a = "I am a string";
String b = "I am a string";

the variables a and b will hold the same value. Not simply two strings that are equal, but the very same string. In Java words, a == b will be true. However, this works only for Strings and small Integer and Long values. Other objects are not interned, so if you create two objects that hold exactly the same values, they are usually not the same. They may, and probably will, be equal, but not the same object. This may sometimes be a nuisance. Probably when you fetch some object from some persistence store. If you happen to fetch the same object more than once, you would probably like to get the same object instead of two copies. In other words, you only want one single copy in memory of a single object in the persistence. Some persistence layers do this for you. For example, JPA implementations follow this pattern. In other cases you may need to perform the caching yourself.
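
A small demonstration of how far this automatic interning goes (the boxing cache for Integer covers -128..127 by default; the upper bound is tunable):

String a = "I am a string";
String b = "I am a string";
System.out.println(a == b);           // true: string literals are pooled

Integer small1 = 127, small2 = 127;
System.out.println(small1 == small2); // true: small boxed values are cached

Integer big1 = 128, big2 = 128;
System.out.println(big1 == big2);     // false by default: outside the cache range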

In this example I will describe a simple intern pool implementation, which can also be viewed in the Stack Overflow topic. In this article I also explain the details and the considerations that led to the solution depicted there (and here as well). This article contains more detailed tutorial information than the original discussion.

Object pool

Interning needs an object pool. When you have an object and you want to intern it, you essentially look in the object pool to see whether there is already an object equal to the one in hand. If there is one, we will use the one already there. If there is no object equal to the actual one, then we put the actual object into the pool and use that one.

There are two major issues we have to face during implementation:

  • Garbage Collection
  • Multi-thread environment

When an object is not needed anymore, it has to be removed from the pool. The removal could be done by the application, but that would be a totally outdated and old-fashioned approach. One of the main advantages of Java over C++ is garbage collection. We can let the GC collect these objects. To do that, we should not have strong references in the object pool to the pooled objects.

Reference

If you know what soft, weak and phantom references are, just jump to the next section.

You may have noticed that I did not simply say “references” but said “strong references”. If you have learned that the GC collects objects when there are no references to the object, then that was not absolutely correct. The fact is that a strong reference is needed for the GC to treat an object as untouchable. To be even more precise, the strong reference has to be reachable, travelling along other strong references, from local variables, static fields and similar ubiquitous locations. In other words: the (strong) references that point from one dead object to another do not count; they will all be removed and collected together.

So if these are strong references, then presumably there are also not-so-strong references, you may think. You are right. There is a class named java.lang.ref.Reference, and there are three other classes that extend it. The classes are

  1. PhantomReference
  2. WeakReference and
  3. SoftReference

in the same package. If you read the documentation you may suspect that what we need is the weak one. A phantom reference is out of the question for use in the pool, because phantom references can not be used to get access to the object. A soft reference is overkill. If there are no strong references to the object, then there is no point keeping it in the pool. If it comes again from some source, we will intern it again. It will certainly be a different instance, but nobody will notice, since there is no reference to the previous one.

Weak references are the ones that can be used to get access to the object but do not alter the behavior of the GC.
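
A short sketch of this behavior (note that System.gc() is only a hint, so the output is typical rather than guaranteed):

import java.lang.ref.WeakReference;

public class WeakDemo {
    public static void main(String[] args) {
        WeakReference<Object> ref = new WeakReference<Object>(new Object());
        System.out.println(ref.get()); // some object: not collected yet
        System.gc();                   // only a hint, but usually honored here
        System.out.println(ref.get()); // typically null: the weak reference did
                                       // not keep the object alive
    }
}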

WeakHashMap

A weak reference is not the class we have to use directly. There is a class named WeakHashMap that refers to the key objects using weak references. This is actually what we need. When we intern an object and want to see whether it is already in the pool, we search all the objects to see if there is one equal to the actual one. A map is just the thing that implements this search capability. Holding the keys in weak references lets the GC collect a key object when nobody else needs it.

So we can search, which is good. Using a map, we also have to have some value. In this case we just want to get the same object back, so we have to put the object into the map when it is not there yet. However, putting the object itself in as the value would ruin what we gained by keeping only weak references to it as a key: the map value would be a strong reference to the very same object, and the entry could never be collected. Therefore we create and put a weak reference to the object as the value.

WeakPool

After this explanation, here is the code. It just says that if there is an object equal to the actual one, then get(actualObject) should return it. If there is none, get(actualObject) will return null. The method put(newObject) puts a new object into the pool, and if there was one equal to the new one, it overwrites the old one's place with the new.

import java.lang.ref.WeakReference;
import java.util.WeakHashMap;

public class WeakPool<T> {
  // the key is held weakly by the map, and the value is a weak reference,
  // so the pool never keeps a pooled object strongly reachable
  private final WeakHashMap<T, WeakReference<T>> pool = new WeakHashMap<T, WeakReference<T>>();
  public T get(T object){
      final T res;
      WeakReference<T> ref = pool.get(object);
      if (ref != null) {
          res = ref.get();
      } else {
          res = null;
      }
      return res;
  }
  public void put(T object){
      pool.put(object, new WeakReference<T>(object));
  }
}

InternPool

The final solution to the problem is an intern pool, which is very easy to implement using the already available WeakPool. The InternPool has a weak pool inside, and there is one single synchronized method in it: intern(T object).

public class InternPool<T> {
  private final WeakPool<T> pool = new WeakPool<T>();
  public synchronized T intern(T object) {
    T res = pool.get(object);
    if (res == null) {
        pool.put(object);
        res = object;
    }
    return res;
  }
}

The method tries to get the object from the pool, and if it is not there, puts it there and returns it. If there is a matching object already in the pool, it returns the one already there.
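
Usage is a one-liner; a hypothetical example, with a made-up Customer value class and loader:

InternPool<Customer> pool = new InternPool<Customer>();
// route every loaded Customer through the pool:
Customer a = pool.intern(loadCustomer("id-42"));
Customer b = pool.intern(loadCustomer("id-42"));
// now a == b, provided Customer properly implements equals() and hashCode()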

Multi-thread

The method has to be synchronized to ensure that the check and the insertion of the new object are atomic. Without the synchronization it could happen that two threads check two equal instances against the pool, both of them find that there is no matching object in it, and then both insert their version into the pool. One of them, the one putting its object in later, will be the winner, overwriting the object already there; but the loser also thinks that it owns the genuine single object. Synchronization solves this problem.

Racing with the Garbage Collector

Even though the different threads of the Java application using the pool can not get into trouble using the pool at the same time, we should still check whether there is any interference with the garbage collector thread.

It may happen that we get back null when the get() method of the weak reference is called. This happens when the key object has been reclaimed by the garbage collector but the weak hash map in the WeakPool implementation has not yet deleted the entry. Even if the weak map implementation checks the existence of the key whenever the map is queried, it can happen. The garbage collector can kick in between the call of get() on the weak hash map and the call of get() on the weak reference returned. The hash map returned a reference to an object that existed at the time it returned, but since the reference is weak, the object was deleted by the time the execution of our Java application got to the next statement.

In this situation the WeakPool implementation returns null. No problem. InternPool does not suffer from this either: it simply treats the object as not yet pooled, puts it in and returns it.

If you look at the other solutions in the aforementioned Stack Overflow topic, you can see code like this:

public class InternPool<T> {

    private WeakHashMap<T, WeakReference<T>> pool = 
        new WeakHashMap<T, WeakReference<T>>();

    public synchronized T intern(T object) {
        T res = null;
        // (The loop is needed to deal with race
        // conditions where the GC runs while we are
        // accessing the 'pool' map or the 'ref' object.)
        do {
            WeakReference<T> ref = pool.get(object);
            if (ref == null) {
                ref = new WeakReference<T>(object);
                pool.put(object, ref);
                res = object;
            } else {
                res = ref.get();
            }
        } while (res == null);
        return res;
    }
}

In this code the author created an (in theory) infinite loop to handle this situation. Not too appealing, but it works. It is not likely that the loop will execute an infinite number of times. Likely not more than twice. Still, the construct is hard to understand, complicated. The moral: the single responsibility principle. Focus on simple things, decompose your application into simple components.

Conclusion

Even though Java does interning only for String constants and some of the objects that primitive types are boxed into, it is possible, and sometimes desirable, to do interning yourself. In that case the interning is not automatic; the application has to perform it explicitly. The two simple classes listed here can be used for that, either by copy pasting them into your code base, or you can

        <dependency>
          <groupId>com.javax0</groupId>
          <artifactId>intern</artifactId>
          <version>1.0.0</version>
        </dependency>

import the library as a dependency from the Maven central repository. The library is minimal, containing only these two classes, and is available under the Apache license. The source code of the library is on GitHub.


Logging or debugging

Debugging is lame. You should use debug logging.

If your code is structured you do not need debug logging.

These are two opinions from the two ends of the line. I am, as usual, standing in the middle, and I will tell you why.

First of all, there is no principal difference between debugging and logging. They are just two different implementations of the same thing: observation of the state of your executing program along the time dimension.

Issue with debugging

When you debug, you step your program forward in time, and at any point where the execution stops you can examine the value of any variable. The shortcoming is that you can not step back in time. At some point you realize that you would like to see what the value of a certain variable was just before some method was called, some object was created, or whatever happened in the system. What you actually do in such a situation is restart the program and, hoping it behaves deterministically, try to catch the execution at the earlier stage you are interested in. And this is the other shortcoming of debugging. You can not effectively debug code that does not behave deterministically. And trust me: most bugs behave non-deterministically.
[figure: debug versus logging]

Issue with logging

With logs the major issue is different. It is not time but the breadth of the state, the set of variables that you can look at, that is the problem. You insert log statements into your code, dumping the values of variables into a log file at certain points of the execution. When you examine the log file you can scroll back and forth. However, if you did not print out the value of a certain variable at a certain execution point, there is no way to get it from the log file. The solution is the same as with debugging: execute the code again, this time extended with the new log statements. If, however, you have enough information in your log files, then you get enough information to track down a bug even if it is not deterministic. Only ‘if you have’ …

Solution: logging all the states, all the time?

The ideal solution would be to dump all variables into a, possibly binary, log file at each state of the execution and examine the content of the file afterwards. The examination would essentially look like a debugger, except that the changes of the variables come from the recorded log file instead of from on-the-fly calculation. It would be like the playback of a recorded execution, and as such you could replay it several times. I do not know whether there is any tool like that for the JVM.

One of the issues is that you just can not effectively define what “each state” means in a multi-thread execution environment like the JVM. The other thing is that if you started dumping the JVM memory after each command (forgetting about the multi-thread issues), it would require an enormous amount of bandwidth and disk space.

Dreaming about an ideal solution that can not be delivered is of little use. What is a solution that can practically be executed?

Practical approach

You can debug when it is appropriate. Full stop. You just did that so far; keep doing it. I tend to use log statements even when I debug some code, and if the environment allows it, I do it on the fly. When I find the root cause of the issue I am hunting, I review the log statements and delete them. They did their job while debugging; they are not needed anymore. At least, that was my practice until I found myself writing log statements that I had already created before. Why? Because fixing one bug does not mean that I have fixed all of them. There is no such thing as all bugs fixed. But the log items littered the log file, and that just increased the work of finding the needed information when hunting the next bug. In other words, the log file was full of noise, and that is why I deleted those items in the first place. But for the same reason I could also delete the unit tests that already pass. It would save a lot of time during compilation, wouldn't it? We do not do that.
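
To show what a "cheap to keep" debug log statement can look like in Java 8 (java.util.logging; the class and the message are made up): the Supplier overload builds the message string only when the level is actually enabled, so leaving the statement in costs close to nothing.

import java.util.logging.Logger;

class OrderService {
    private static final Logger LOG = Logger.getLogger(OrderService.class.getName());

    void save(Object order) {
        // the message is constructed only if FINE logging is enabled
        LOG.fine(() -> "state before save: " + order);
        // ... the actual work ...
    }
}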

Summary in one sentence? Log and debug the way it fits you and the issue you are hunting.

Synthetic and bridge methods

If you have ever played with reflection and executed getDeclaredMethods(), you may have been surprised. You may get methods that are not present in the source code. Or, perhaps, you had a look at the modifiers of some of the methods and saw that some of these special methods are volatile. By the way, this is a nasty question for Java interviews: “What does it mean when a method is volatile?” The proper answer is that a method can not be volatile. At the same time, there can be some method among those returned by getDeclaredMethods() or even getMethods() for which Modifier.isVolatile(method.getModifiers()) is true.

This happened to one of the users of the project immutator. He realized that immutator (which itself digs quite deep into the dark details of Java) generated Java source that was not compilable, using the keyword volatile as a modifier for a method. As a consequence, it did not work either.

What happened there? What are bridge and synthetic methods?

Visibility

When you create a nested or embedded class, the private variables and methods of the nested class are reachable from the top level class. This is used by the immutable embedded builder pattern. This is well defined behavior of Java, defined in the language specification.

JLS7, 6.6.1 Determining Accessibility

… if the member or constructor is declared private, then access is
permitted if and only if it occurs within the body of the top level class (§7.6)
that encloses the declaration of the member or constructor…

package synthetic;

public class SyntheticMethodTest1 {
    private A aObj = new A();

    public class A {
        private int i;
    }

    private class B {
        private int i = aObj.i;
    }

    public static void main(String[] args) {
        SyntheticMethodTest1 me = new SyntheticMethodTest1();
        me.aObj.i = 1;
        B bObj = me.new B();
        System.out.println(bObj.i);
    }
}

How is it handled by the JVM? The JVM does not know about inner or nested classes. For the JVM all classes are top level classes. All classes are compiled into top level classes, and this is the way those nice Outer$Nested.class files are created.

 $ ls -Fart
../                         SyntheticMethodTest2$A.class  MyClass.java  SyntheticMethodTest4.java  SyntheticMethodTest2.java
SyntheticMethodTest2.class  SyntheticMethodTest3.java     ./            MyClassSon.java            SyntheticMethodTest1.java

If you create a nested or inner class, it will be compiled into a full blown top level class.

How will the private fields be available from the outer class? If they get into a top level class and are private, as they really are, then how can they be reachable from the outer class?

The way javac solves this issue is that for any field, method or constructor that is private but used from the top level class, it generates a synthetic method. These synthetic methods are used to reach the original private field/method/constructor. The generation of these methods is done in a clever way: only those that are really needed and used from outside are generated.

package synthetic;

import java.lang.reflect.Constructor;
import java.lang.reflect.Method;

public class SyntheticMethodTest2 {

    public static class A {
        private A(){}
        private int x;
        private void x(){};
    }

    public static void main(String[] args) {
        A a = new A();
        a.x = 2;
        a.x();
        System.out.println(a.x);
        for (Method m : A.class.getDeclaredMethods()) {
            System.out.println(String.format("%08X", m.getModifiers()) + " " + m.getName());
        }
        System.out.println("--------------------------");
        for (Method m : A.class.getMethods()) {
            System.out.println(String.format("%08X", m.getModifiers()) + " " + m.getReturnType().getSimpleName() + " " + m.getName());
        }
        System.out.println("--------------------------");
        for( Constructor<?> c : A.class.getDeclaredConstructors() ){
            System.out.println(String.format("%08X", c.getModifiers()) + " " + c.getName());
        }
    }
}

Since the names of the generated methods depend on the implementation and are not guaranteed, the most I can say about the output of the above program is that on the specific platform where I executed it, it produced the following output:

2
00001008 access$1
00001008 access$2
00001008 access$3
00000002 x
--------------------------
00000111 void wait
00000011 void wait
00000011 void wait
00000001 boolean equals
00000001 String toString
00000101 int hashCode
00000111 Class getClass
00000111 void notify
00000111 void notifyAll
--------------------------
00000002 synthetic.SyntheticMethodTest2$A
00001000 synthetic.SyntheticMethodTest2$A

In the program above we assign a value to the field x and we also call the method of the same name. These are needed to trigger the compiler to generate the synthetic methods. You can see that it generated three methods: presumably the setter and the getter for the field x, and a synthetic method for the method x(). These synthetic methods, however, are not listed in the next list, returned by getMethods(), since synthetic methods are not available for generic invocation. They behave, in this sense, like private methods.

The hexadecimal numbers can be interpreted by looking at the constants defined in the class java.lang.reflect.Modifier:

00001008 SYNTHETIC|STATIC
00000002 PRIVATE
00000111 NATIVE|FINAL|PUBLIC
00000011 FINAL|PUBLIC
00000001 PUBLIC
00001000 SYNTHETIC

There are two constructors in the list: a private one and a synthetic one. The private one exists because we defined it. The synthetic one, on the other hand, exists because we invoked the private one from outside. We have not, however, seen any bridge methods so far.

Generics and inheritance

So far, so good, but we still have not seen any “volatile” methods.

Looking at the source code of java.lang.reflect.Modifier, you can see that the constant 0x00000040 is defined twice: once as VOLATILE and once as BRIDGE (the latter is package private and not for general use).

To have such a method a very simple program will do:

package synthetic;

import java.lang.reflect.Method;
import java.util.LinkedList;

public class SyntheticMethodTest3 {

    public static class MyLink extends LinkedList<String> {
        @Override
        public String get(int i) {
            return "";
        }
    }

    public static void main(String[] args) {

        for (Method m : MyLink.class.getDeclaredMethods()) {
            System.out.println(String.format("%08X", m.getModifiers()) + " " + m.getReturnType().getSimpleName() + " " + m.getName());
        }
    }
}

We have a linked list that has a method get(int) returning String. Let's not discuss the clean code issues. This is sample code to demonstrate the topic. The same issues come up in clean code as well, though in more complex setups where it is harder to get to the point where it causes a problem.

The output says

00000001 String get
00001041 Object get

we have two get() methods: the one that appears in the source code and another one, which is synthetic and bridge. The decompiler javap says that the generated code is:

public java.lang.String get(int);
  Code:
   Stack=1, Locals=2, Args_size=2
   0:   ldc     #2; //String
   2:   areturn
  LineNumberTable:
   line 12: 0


public java.lang.Object get(int);
  Code:
   Stack=2, Locals=2, Args_size=2
   0:   aload_0
   1:   iload_1
   2:   invokevirtual   #3; //Method get:(I)Ljava/lang/String;
   5:   areturn

The interesting thing is that the two methods have the same argument signature and only the return types are different. This is allowed in the JVM, even though it is not possible in the Java language. The bridge method does not do anything else but call the original one.
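
Reflection can also tell these methods apart without decoding the modifier bits: Method has isBridge() and isSynthetic() for exactly this purpose.

for (Method m : MyLink.class.getDeclaredMethods()) {
    System.out.println(m.getReturnType().getSimpleName() + " " + m.getName()
            + " bridge=" + m.isBridge() + " synthetic=" + m.isSynthetic());
}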

Why do we need this synthetic method? Who will use it? For example, code that wants to invoke the method get(int) using a variable that is not of the type MyLink:

        List<?> a = new MyLink();
        Object z = a.get(0);

It can not call the method returning String, because there is no such method in List. To make it more demonstrative, let's override the method add() instead of get():

package synthetic;

import java.util.LinkedList;
import java.util.List;

public class SyntheticMethodTest4 {

    public static class MyLink extends LinkedList<String> {
        @Override
        public boolean add(String s) {
            return true;
        }
    }

    public static void main(String[] args) {
        List a = new MyLink();
        a.add("");
        a.add(13);
    }
}

We can see that the bridge method

public boolean add(java.lang.Object);
  Code:
   Stack=2, Locals=2, Args_size=2
   0:   aload_0
   1:   aload_1
   2:   checkcast       #2; //class java/lang/String
   5:   invokevirtual   #3; //Method add:(Ljava/lang/String;)Z
   8:   ireturn

not only calls the original one, it also checks that the type conversion is OK. This check happens at run time, but it is not done by the JVM itself: the compiler generates it into the bridge method. As you would expect, it blows up at line 18:

Exception in thread "main" java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String
	at synthetic.SyntheticMethodTest4$MyLink.add(SyntheticMethodTest4.java:1)
	at synthetic.SyntheticMethodTest4.main(SyntheticMethodTest4.java:18)

When you get the question about volatile methods at an interview next time, you may know even more than the interviewer.

Blog Service Message about syndication with DZONE and JCG

I have contracted DZone and likewise Java Code Geeks as syndication partners. They have started to select some of my articles for republishing on their sites.

The cooperation between me and the syndication partners is non-profit, and I truly believe that it mutually benefits all parties, including the syndication partners as well as you, the reader. The cooperation will hopefully increase the number of people reading the blog, and belonging to a larger technical community is an advantage.

Feel free to visit these sites and use their services: read articles, express your point of view in the form of comments, whatever you feel appropriate.
