Java Deep

Pure Java, what else

Monthly Archives: September 2014

Autoboxing

Autoboxing is clear for all Java developers since Java 1.5 Well, I may be too optimistic. At least all developers are supposed to be ok with autoboxing. After all there is a good tutorial about it on the page of ORACLE.

Autoboxing is the phenomena when the Java compiler automatically generates code creating an object from a primitive type when it is needed. For example you can write:

Integer a = 42;

and it will automatically generate JVM code that puts the value int 42 into an Integer object. This is so nice of the compiler to do it for us that after a while we, programmers just tend to forget about the complexity behind it and from time to time we run against the wall.

For example we have double.class and Double.class. Both of them are objects (as being a class and each class itself is an object in permgen or just on the heap in post-permgen version of JVM). Both of these objects are of type Class. What is more: since Java 1.5 both of them are of type Class<Double>.

If two objects have the same type, they also have to be assignment compatible aren’t they. Seems to be an obvious statement. If you have object O a and object O b then you can assign a = b.

Looking at the code, however we may realize being oblivious instead of obvious:

public class TypeFun {
    public static void main(String[] args) {
        // public static final Class<Double>   TYPE = (Class<Double>)Class.getPrimitiveClass("double");
        System.out.println("Double.TYPE == double.class: " + (Double.TYPE == double.class));
        System.out.println("Double.TYPE == Double.class: " + (Double.TYPE == Double.class));
        System.out.println("double.class.isAssignableFrom(Double.class): " + (double.class.isAssignableFrom(Double.class)));
        System.out.println("Double.class.isAssignableFrom(double.class): " + (Double.class.isAssignableFrom(double.class)));
    }
}

resulting

Double.TYPE == double.class: true
Double.TYPE == Double.class: false
double.class.isAssignableFrom(Double.class): false
Double.class.isAssignableFrom(double.class): false

This means that the primitive pair of Double is double.class (not surprising). Even though one can not be assigned from the other. We can look at the source at least of the one of the them. The source of the class Double is in the RT.jar and it is open source. There you can see that:

public static final Class<Double>	TYPE = (Class<Double>) Class.getPrimitiveClass("double");

Why does it use that weird Class.getPrimitiveClass("double") instead of double.class? That is the primitive pair of the type Double.

The answer is not trivial and you can dig deep into the details of Java and JVM. Since double is not a class, there is nothing like double.class in reality. You can still use this literal in the Java source code though and this is where the Java language, compiler and the run-time has some strong bondage. The compiler knows that the class Double defines a field named TYPE denoting the primitive type of it. Whenever the compiler sees double.class in the source code it generates JVM code Double.TYPE. (Give it a try and then use javap to decode the generated code!) For this very reason the developer of the RT could not write

public static final Class<Double>	TYPE = double.class;

into the source of the class Double. It would compile to the code equivalent:

public static final Class<Double>	TYPE = TYPE;

How is autoboxing going on then? The source

Double b = (double)1.0;

results

         0: dconst_1      
         1: invokestatic  #2                  // Method java/lang/Double.valueOf:(D)Ljava/lang/Double;
         4: astore_1 

however if we replace the two ‘d’ letters:

double b = (Double)1.0;

then we get

         0: dconst_1      
         1: invokestatic  #2                  // Method java/lang/Double.valueOf:(D)Ljava/lang/Double;
         4: invokevirtual #3                  // Method java/lang/Double.doubleValue:()D
         7: dstore_1    

, which ineed explains a lot of things. The instances of the class double.class the class Double.class are not assign compatible. Autoboxing solves this. Java 4 was a long time ago and we, luckily forgot it.

Your homework: reread what happens related to autoboxing when you have overloaded methods that have arguments of the “class” type and the corresponding primitive type.

A classloading mystery solved

Facing a good old problem

I was struggling with some class loading issue on an application server. The libraries were defined as maven dependencies and therefore packaged into the WAR and EAR file. Some of these were also installed into the application server, unfortunately of different version. When we started the application we faced the various exceptions that were related to these types of problems. There is a good IBM article about these exceptions if you want to dig deeper.

Even though we knew that the error was caused by some double defined libraries on the classpath it took more than two hours to investigate which version we really needed, and what JAR to remove.

Same topic by accident on JUG the same week

A few days later we participated the Do you really get Classloaders? session of Java Users’ Society in Zürich. Simon Maple delivered an extremely good intro about class loaders and went into very deep details from the very start. It was an eye opening session for many. I also have to note that Simon works Zero turnaround and he evangelizes for JRebel. In such a situation a tutorial session is usually biased towards the actual product that is the bread for the tutor. In this case my opinion is that Simon was absolutely gentleman and ethic keeping an appropriate balance.

Creating a tool, to solve mystery

just to create another one

A week later I had some time to hobby program that I did not have time for a couple weeks by now and I decided to create a little tool that lists all the classes and JAR files that are on the classpath so investigation can be easier to find duplicates. I tried to rely on the fact that the classloaders are usually instances of URLClassLoader and thus the method getURLs() can be invoked to get all the directory names and JAR files.

Unit testing in such a situation can be very tricky, since the functionality is strongly tied to the class loader behavior. To be pragmatic I decided to just do some manual testing started from JUnit so long as long the code is experimental. First of all I wanted to see if the concept is worth developing it further. I was planning to execute the test and look at the log statements reporting that there were no duplicate classes and then executing the same run but second time adding some redundant dependencies to the classpath. I was using JUnit 4.10 The version is important in this case.

I executed the unit test from the command line and I saw that there were no duplicate classes, and I was happy. After that I was executing the same test from Eclipse and surprise: I got 21 classes redundantly defined!

12:41:51.670 DEBUG c.j.c.ClassCollector - There are 21 redundantly defined classes.
12:41:51.670 DEBUG c.j.c.ClassCollector - Class org/hamcrest/internal/SelfDescribingValue.class is defined 2 times:
12:41:51.671 DEBUG c.j.c.ClassCollector -   sun.misc.Launcher$AppClassLoader@7ea987ac:file:/Users/verhasp/.m2/repository/junit/junit/4.10/junit-4.10.jar
12:41:51.671 DEBUG c.j.c.ClassCollector -   sun.misc.Launcher$AppClassLoader@7ea987ac:file:/Users/verhasp/.m2/repository/org/hamcrest/hamcrest-core/1.1/hamcrest-core-1.1.jar
...

Googling a bit I could discover easily that JUnit 4.10 has an extra dependency as shown by maven

$ mvn dependency:tree
[INFO] Scanning for projects...
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building clalotils 1.0.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ clalotils ---
[INFO] com.verhas:clalotils:jar:1.0.0-SNAPSHOT
[INFO] +- junit:junit:jar:4.10:test
[INFO] |  \- org.hamcrest:hamcrest-core:jar:1.1:test
[INFO] +- org.slf4j:slf4j-api:jar:1.7.7:compile
[INFO] \- ch.qos.logback:logback-classic:jar:1.1.2:compile
[INFO]    \- ch.qos.logback:logback-core:jar:1.1.2:compile
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2.642s
[INFO] Finished at: Wed Sep 03 12:44:18 CEST 2014
[INFO] Final Memory: 13M/220M
[INFO] ------------------------------------------------------------------------

This is actually fixed in 4.11 so if I change the dependency to JUnit 4.11 I do not face the issue. Ok. Half of the mystery solved. But why maven command line execution does not report the classes double defined?

Extending the logging, logging more and more I could spot out a line:

12:46:19.433 DEBUG c.j.c.ClassCollector - Loading from the jar file /Users/verhasp/github/clalotils/target/surefire/surefirebooter235846110768631567.jar

What is in this file? Let’s unzip it:

$ ls -l /Users/verhasp/github/clalotils/target/surefire/surefirebooter235846110768631567.jar
ls: /Users/verhasp/github/clalotils/target/surefire/surefirebooter235846110768631567.jar: No such file or directory

The file does not exist! Seemingly maven creates this JAR file and then deletes it when the execution of the test is finished. Googling again I found the solution.

Java loads the classes from the classpath. The classpath can be defined on the command line but there are other sources for the application class loaders to fetch files from. One such a source is the manifest file of a JAR. The manifest file of a JAR file can define what other JAR files are needed to execute the classes in the JAR file. Maven creates a JAR file that contains nothing else but the manifest file defining the JARs and directories to list the classpath. These JARs and directories are NOT returned by the method getURLs(), therefore the (first version) of my little tool did not find the duplicates.

For demonstration purposes I was quick enough to make a copy of the file while the mvn test command was running, and got the following output:

$ unzip /Users/verhasp/github/clalotils/target/surefire/surefirebooter5550254534465369201\ copy.jar 
Archive:  /Users/verhasp/github/clalotils/target/surefire/surefirebooter5550254534465369201 copy.jar
  inflating: META-INF/MANIFEST.MF    
$ cat META-INF/MANIFEST.MF 
Manifest-Version: 1.0
Class-Path: file:/Users/verhasp/.m2/repository/org/apache/maven/surefi
 re/surefire-booter/2.8/surefire-booter-2.8.jar file:/Users/verhasp/.m
 2/repository/org/apache/maven/surefire/surefire-api/2.8/surefire-api-
 2.8.jar file:/Users/verhasp/github/clalotils/target/test-classes/ fil
 e:/Users/verhasp/github/clalotils/target/classes/ file:/Users/verhasp
 /.m2/repository/junit/junit/4.10/junit-4.10.jar file:/Users/verhasp/.
 m2/repository/org/hamcrest/hamcrest-core/1.1/hamcrest-core-1.1.jar fi
 le:/Users/verhasp/.m2/repository/org/slf4j/slf4j-api/1.7.7/slf4j-api-
 1.7.7.jar file:/Users/verhasp/.m2/repository/ch/qos/logback/logback-c
 lassic/1.1.2/logback-classic-1.1.2.jar file:/Users/verhasp/.m2/reposi
 tory/ch/qos/logback/logback-core/1.1.2/logback-core-1.1.2.jar
Main-Class: org.apache.maven.surefire.booter.ForkedBooter

$ 

It really is nothing else than the manifest file defining the classpath. But why does maven do it? Sonatype people, some of whom I also know personally are clever people. They don’t do such a thing just for nothing. The reason to create a temporary JAR file to start the tests is that the length of the command line is limited on some of the operating systems that the length of the classpath may exceed. Even though Java (since Java 6) itself resolves wildcard characters in the classpath it is not an option to maven. The JAR files are in different directories in the maven repo each having long name. Wildcard resolution is not recursive, there is a good reason for it, and even if it were you just would not like to have all your local repo on the classpath.

Conclusion

  • Do not use JUnit 4.10! Use something older or newer, or be prepared for surprises.
  • Understand what a classloader is and how it works, what is does.
  • Use an operating system that has huge limit for the maximum size of a command line length.
    Or just live with the limitation.

Something else? Your ideas?

Name of the class

In Java every class has a name. Classes are in packages and this lets us programmers work together avoiding name collision. I can name my class A and you can also name your class A so long as long they are in different packages, they work together fine.

If you looked at the API of the class Class you certainly noticed that there are three different methods that give you the name of a class:

  • getSimpleName() gives you the name of the class without the package.
  • getName() gives you the name of the class with the full package name in front.
  • getCanonicalName() gives you the canonical name of the class.

Simple is it? Well, the first is simple and the second is also meaningful unless there is that disturbing canonical name. That is not evident what that is. And if you do not know what canonical name is, you may feel some disturbance in the force of your Java skills for the second also. What is the difference between the two?

If you want a precise explanation, visit the chapter 6.7 of Java Language Specification. Here we go with something simpler, aimed simpler to understand though not so thorough.

Let’s see some examples:

package pakage.subpackage.evensubberpackage;
import org.junit.Assert;
import org.junit.Test;

public class WhatIsMyName {
	@Test
	public void classHasName() {
		final Class<?> klass = WhatIsMyName.class;
		final String simpleNameExpected = "WhatIsMyName";
		Assert.assertEquals(simpleNameExpected, klass.getSimpleName());
		final String nameExpected = "pakage.subpackage.evensubberpackage.WhatIsMyName";
		Assert.assertEquals(nameExpected, klass.getName());
		Assert.assertEquals(nameExpected, klass.getCanonicalName());		
	}
...

This “unit test” just runs fine. But as you can see there is no difference between name and canonical name in this case. (Note that the name of the package is pakage and not package. To test your java lexical skills answer the question why?)

Let’s have a look at the next example from the same junit test file:

	@Test
	public void arrayHasName() {
		final Class<?> klass = WhatIsMyName[].class;
		final String simpleNameExpected = "WhatIsMyName[]";
		Assert.assertEquals(simpleNameExpected, klass.getSimpleName());
		final String nameExpected = "[Lpakage.subpackage.evensubberpackage.WhatIsMyName;";
		Assert.assertEquals(nameExpected, klass.getName());
		final String canonicalNameExpected = "pakage.subpackage.evensubberpackage.WhatIsMyName[]";
		Assert.assertEquals(canonicalNameExpected, klass.getCanonicalName());		
	}

Now there are differences. When we talk about arrays the simple name signals it appending the opening and closing brackets, just like we would do in Java source code. The “normal” name looks a bit weird. It starts with an L and semicolon is appended. This reflects the internal representation of the class names in the JVM. The canonical name changed similar to the simple name: it is the same as before for the class having all the package names as prefix with the brackets appended. Seems that getName() is more the JVM name of the class and getCanonicalName() is more like the fully qualified name on Java source level.

Let’s go on with still some other example (we are still in the same file):

	class NestedClass{}
	
	@Test
	public void nestedClassHasName() {
		final Class<?> klass = NestedClass.class;
		final String simpleNameExpected = "NestedClass";
		Assert.assertEquals(simpleNameExpected, klass.getSimpleName());
		final String nameExpected = "pakage.subpackage.evensubberpackage.WhatIsMyName$NestedClass";
		Assert.assertEquals(nameExpected, klass.getName());
		final String canonicalNameExpected = "pakage.subpackage.evensubberpackage.WhatIsMyName.NestedClass";
		Assert.assertEquals(canonicalNameExpected, klass.getCanonicalName());		
	}

The difference is the dollar sign in the name of the class. Again the “name” is more what is used by the JVM and canonical name is what is Java source code like. If you compile this code, the Java compiler will generate the files:

  • WhatIsMyName.class and
  • WhatIsMyName$NestedClass.class

Even though the class is named nested class it actually is an inner class. However in the naming there is no difference: a static or non-static class inside another class is just named the same. Now let’s see something even more interesting:

	@Test
	public void methodClassHasName() {
		class MethodClass{};
		final Class<?> klass = MethodClass.class;
		final String simpleNameExpected = "MethodClass";
		Assert.assertEquals(simpleNameExpected, klass.getSimpleName());
		final String nameExpected = "pakage.subpackage.evensubberpackage.WhatIsMyName$1MethodClass";
		Assert.assertEquals(nameExpected, klass.getName());
		final String canonicalNameExpected = null;
		Assert.assertEquals(canonicalNameExpected, klass.getCanonicalName());
	}

This time we have a class inside a method. Not a usual scenario, but valid from the Java language point of view. The simple name of the class is just that: the simple name of the class. No much surprise.

The “normal” name however is interesting. The Java compiler generates a JVM name for the class and this name contains a number in it. Why? Because nothing would stop me having a class with the same name in another method in our test class and inserting a number is the way to prevent name collisions for the JVM. The JVM does not know or care anything about inner and nested classes or classes defined inside a method. A class is just a class. If you compile the code you will probably see the file WhatIsMyName$1MethodClass.class generated by javac. I had to add “probably” not because I count the possibility of you being blind, but rather because this name is actually the internal matter of the Java compiler. It may choose different name collision avoiding strategy, though I know no compiler that differs from the above.

The canonical name is the most interesting. It does not exist! It is null. Why? Because you can not access this class from outside the method defining it. It does not have a canonical name. Let’s go on.

What about anonymous classes. They should not have name. After all, that is why they are called anonymous.

	@Test
	public void anonymousClassHasName() {
		final Class<?> klass = new Object(){}.getClass();
		final String simpleNameExpected = "";
		Assert.assertEquals(simpleNameExpected, klass.getSimpleName());
		final String nameExpected = "pakage.subpackage.evensubberpackage.WhatIsMyName$1";
		Assert.assertEquals(nameExpected, klass.getName());
		final String canonicalNameExpected = null;
		Assert.assertEquals(canonicalNameExpected, klass.getCanonicalName());
	}

Actually they do not have simple name. The simple name is empty string. They do, however have name, made up by the compiler. Poor javac does not have other choice. It has to make up some name even for the unnamed classes. It has to generate the code for the JVM and it has to save it to some file. Canonical name is again null.

Are we ready with the examples? No. We have something simple (a.k.a. primitive) at the end. Java primitives.

	@Test
	public void intClassHasName() {
		final Class<?> klass = int.class;
		final String intNameExpected = "int";
		Assert.assertEquals(intNameExpected, klass.getSimpleName());
		Assert.assertEquals(intNameExpected, klass.getName());
		Assert.assertEquals(intNameExpected, klass.getCanonicalName());
	}

If the class represents a primitive, like int (what can be simpler than an int?) then the simple name, “the” name and the canonical names are all int the name of the primitive.

Just as well an array of a primitive is very simple is it?

	@Test
	public void intArrayClassHasName() {
		final Class<?> klass = int[].class;
		final String simpleNameExpected = "int[]";
		Assert.assertEquals(simpleNameExpected, klass.getSimpleName());
		final String nameExpected = "[I";
		Assert.assertEquals(nameExpected, klass.getName());
		final String canonicalNameExpected = "int[]";
		Assert.assertEquals(canonicalNameExpected, klass.getCanonicalName());
	}

Well, it is not simple. The name is [I, which is a bit mysterious unless you read the respective chapter of the JVM specification. Perhaps I talk about that another time.

Conclusion

The simple name of the class is simple. The “name” returned by getName() is the one interesting for JVM level things. The getCanonicalName() is the one that looks most like Java source.

You can get the full source code of the example above from the gist e789d700d3c9abc6afa0 from GitHub.