Java Deep

Pure Java, what else

Why to use String

Recently I was tutoring juniors during a training session. One of the task was to write a class that can dwarwle maps based on some string key. The result one of the juniors created contained the following method:

	void dwarwle(HashMap<String,Dwarwable> mapToDwarwle, String dwarwleKey){
		for( final Entry<String, Dwarwable> entry : mapToDwarwle.entrySet()){
			dwarwle(entry.getKey(),entry.getValue(),dwarwleKey);
		}
	}

The code was generally ok. The method to dwarwle an individual dwarwable entry using the actual key it is assigned to in the hash map and the dwarwle key is factored to a separate method. It is so simple that I do not list here. The variable names are also meaningful so long as long you know what actually dwarwling is. The method is short and readable, but the argument list expects a HashMap instead of a Map. Why do we want to restrict the caller to use a HashMap? What if the caller has a TreeMap and for good reason. Do we want to have separate method that can dwarwle TreeMap? Certainly not.

Expect the interface, pass the implementation.

The junior changed the code replacing HashMap to Map but after five minutes or so this clever lady raised her hand and had the following question:

“If we changed HashMap to Map, why did not we change String to CharSequence?”

It is not so easy to answer a question like that when it comes out of the blue. The first thing that came up in my mind is that the reason is that we usually do it that way and that is why. But that is not a real argument, at least I would not accept anything like that and I also except my students not to accept such answer. It would be very dictator style anyway.

The real answer is that the parameter is used as a key in a map and the key of a map should be immutable (at least mutation should be resilient to equals and hashcode calculation). CharSequence is an interface and an interface in Java (unfortunately) can not guarantee immutability. Only implementation can. String is a good, widely known and well tested implementation of this interface and therefore can be a good choice. There is a good discussion about it on stackoverflow.

In this special case we expect the implementation because we need something immutable and we “can not” trust the caller to pass a character sequence implementation that is immutable. Or: we can, but it has a price. If a StringBuilder is passed and modified afterwards our dwarwling library may not work and a blame war may start. When we design an API and a library we should also think about not only the possible but also about the actual, average use.

A library is as good as it is used and not as it could be used.

This can also be applied to other products, not only libraries but this may lead too far (physics and weapon).

Autoboxing

Autoboxing is clear for all Java developers since Java 1.5 Well, I may be too optimistic. At least all developers are supposed to be ok with autoboxing. After all there is a good tutorial about it on the page of ORACLE.

Autoboxing is the phenomena when the Java compiler automatically generates code creating an object from a primitive type when it is needed. For example you can write:

Integer a = 42;

and it will automatically generate JVM code that puts the value int 42 into an Integer object. This is so nice of the compiler to do it for us that after a while we, programmers just tend to forget about the complexity behind it and from time to time we run against the wall.

For example we have double.class and Double.class. Both of them are objects (as being a class and each class itself is an object in permgen or just on the heap in post-permgen version of JVM). Both of these objects are of type Class. What is more: since Java 1.5 both of them are of type Class<Double>.

If two objects have the same type, they also have to be assignment compatible aren’t they. Seems to be an obvious statement. If you have object O a and object O b then you can assign a = b.

Looking at the code, however we may realize being oblivious instead of obvious:

public class TypeFun {
    public static void main(String[] args) {
        // public static final Class<Double>   TYPE = (Class<Double>)Class.getPrimitiveClass("double");
        System.out.println("Double.TYPE == double.class: " + (Double.TYPE == double.class));
        System.out.println("Double.TYPE == Double.class: " + (Double.TYPE == Double.class));
        System.out.println("double.class.isAssignableFrom(Double.class): " + (double.class.isAssignableFrom(Double.class)));
        System.out.println("Double.class.isAssignableFrom(double.class): " + (Double.class.isAssignableFrom(double.class)));
    }
}

resulting

Double.TYPE == double.class: true
Double.TYPE == Double.class: false
double.class.isAssignableFrom(Double.class): false
Double.class.isAssignableFrom(double.class): false

This means that the primitive pair of Double is double.class (not surprising). Even though one can not be assigned from the other. We can look at the source at least of the one of the them. The source of the class Double is in the RT.jar and it is open source. There you can see that:

public static final Class<Double>	TYPE = (Class<Double>) Class.getPrimitiveClass("double");

Why does it use that weird Class.getPrimitiveClass("double") instead of double.class? That is the primitive pair of the type Double.

The answer is not trivial and you can dig deep into the details of Java and JVM. Since double is not a class, there is nothing like double.class in reality. You can still use this literal in the Java source code though and this is where the Java language, compiler and the run-time has some strong bondage. The compiler knows that the class Double defines a field named TYPE denoting the primitive type of it. Whenever the compiler sees double.class in the source code it generates JVM code Double.TYPE. (Give it a try and then use javap to decode the generated code!) For this very reason the developer of the RT could not write

public static final Class<Double>	TYPE = double.class;

into the source of the class Double. It would compile to the code equivalent:

public static final Class<Double>	TYPE = TYPE;

How is autoboxing going on then? The source

Double b = (double)1.0;

results

         0: dconst_1      
         1: invokestatic  #2                  // Method java/lang/Double.valueOf:(D)Ljava/lang/Double;
         4: astore_1 

however if we replace the two ‘d’ letters:

double b = (Double)1.0;

then we get

         0: dconst_1      
         1: invokestatic  #2                  // Method java/lang/Double.valueOf:(D)Ljava/lang/Double;
         4: invokevirtual #3                  // Method java/lang/Double.doubleValue:()D
         7: dstore_1    

, which ineed explains a lot of things. The instances of the class double.class the class Double.class are not assign compatible. Autoboxing solves this. Java 4 was a long time ago and we, luckily forgot it.

Your homework: reread what happens related to autoboxing when you have overloaded methods that have arguments of the “class” type and the corresponding primitive type.

A classloading mystery solved

Facing a good old problem

I was struggling with some class loading issue on an application server. The libraries were defined as maven dependencies and therefore packaged into the WAR and EAR file. Some of these were also installed into the application server, unfortunately of different version. When we started the application we faced the various exceptions that were related to these types of problems. There is a good IBM article about these exceptions if you want to dig deeper.

Even though we knew that the error was caused by some double defined libraries on the classpath it took more than two hours to investigate which version we really needed, and what JAR to remove.

Same topic by accident on JUG the same week

A few days later we participated the Do you really get Classloaders? session of Java Users’ Society in Zürich. Simon Maple delivered an extremely good intro about class loaders and went into very deep details from the very start. It was an eye opening session for many. I also have to note that Simon works Zero turnaround and he evangelizes for JRebel. In such a situation a tutorial session is usually biased towards the actual product that is the bread for the tutor. In this case my opinion is that Simon was absolutely gentleman and ethic keeping an appropriate balance.

Creating a tool, to solve mystery

just to create another one

A week later I had some time to hobby program that I did not have time for a couple weeks by now and I decided to create a little tool that lists all the classes and JAR files that are on the classpath so investigation can be easier to find duplicates. I tried to rely on the fact that the classloaders are usually instances of URLClassLoader and thus the method getURLs() can be invoked to get all the directory names and JAR files.

Unit testing in such a situation can be very tricky, since the functionality is strongly tied to the class loader behavior. To be pragmatic I decided to just do some manual testing started from JUnit so long as long the code is experimental. First of all I wanted to see if the concept is worth developing it further. I was planning to execute the test and look at the log statements reporting that there were no duplicate classes and then executing the same run but second time adding some redundant dependencies to the classpath. I was using JUnit 4.10 The version is important in this case.

I executed the unit test from the command line and I saw that there were no duplicate classes, and I was happy. After that I was executing the same test from Eclipse and surprise: I got 21 classes redundantly defined!

12:41:51.670 DEBUG c.j.c.ClassCollector - There are 21 redundantly defined classes.
12:41:51.670 DEBUG c.j.c.ClassCollector - Class org/hamcrest/internal/SelfDescribingValue.class is defined 2 times:
12:41:51.671 DEBUG c.j.c.ClassCollector -   sun.misc.Launcher$AppClassLoader@7ea987ac:file:/Users/verhasp/.m2/repository/junit/junit/4.10/junit-4.10.jar
12:41:51.671 DEBUG c.j.c.ClassCollector -   sun.misc.Launcher$AppClassLoader@7ea987ac:file:/Users/verhasp/.m2/repository/org/hamcrest/hamcrest-core/1.1/hamcrest-core-1.1.jar
...

Googling a bit I could discover easily that JUnit 4.10 has an extra dependency as shown by maven

$ mvn dependency:tree
[INFO] Scanning for projects...
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building clalotils 1.0.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ clalotils ---
[INFO] com.verhas:clalotils:jar:1.0.0-SNAPSHOT
[INFO] +- junit:junit:jar:4.10:test
[INFO] |  \- org.hamcrest:hamcrest-core:jar:1.1:test
[INFO] +- org.slf4j:slf4j-api:jar:1.7.7:compile
[INFO] \- ch.qos.logback:logback-classic:jar:1.1.2:compile
[INFO]    \- ch.qos.logback:logback-core:jar:1.1.2:compile
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2.642s
[INFO] Finished at: Wed Sep 03 12:44:18 CEST 2014
[INFO] Final Memory: 13M/220M
[INFO] ------------------------------------------------------------------------

This is actually fixed in 4.11 so if I change the dependency to JUnit 4.11 I do not face the issue. Ok. Half of the mystery solved. But why maven command line execution does not report the classes double defined?

Extending the logging, logging more and more I could spot out a line:

12:46:19.433 DEBUG c.j.c.ClassCollector - Loading from the jar file /Users/verhasp/github/clalotils/target/surefire/surefirebooter235846110768631567.jar

What is in this file? Let’s unzip it:

$ ls -l /Users/verhasp/github/clalotils/target/surefire/surefirebooter235846110768631567.jar
ls: /Users/verhasp/github/clalotils/target/surefire/surefirebooter235846110768631567.jar: No such file or directory

The file does not exit! Seemingly maven creates this JAR file and then deletes it when the execution of the test is finished. Googling again I found the solution.

Java loads the classes from the classpath. The classpath can be defined on the command line but there are other sources for the application class loaders to fetch files from. One such a source is the manifest file of a JAR. The manifest file of a JAR file can define what other JAR files are needed to execute the classes in the JAR file. Maven creates a JAR file that contains nothing else but the manifest file defining the JARs and directories to list the classpath. These JARs and directories are NOT returned by the method getURLs(), therefore the (first version) of my little tool did not find the duplicates.

For demonstration purposes I was quick enough to make a copy of the file while the mvn test command was running, and got the following output:

$ unzip /Users/verhasp/github/clalotils/target/surefire/surefirebooter5550254534465369201\ copy.jar 
Archive:  /Users/verhasp/github/clalotils/target/surefire/surefirebooter5550254534465369201 copy.jar
  inflating: META-INF/MANIFEST.MF    
$ cat META-INF/MANIFEST.MF 
Manifest-Version: 1.0
Class-Path: file:/Users/verhasp/.m2/repository/org/apache/maven/surefi
 re/surefire-booter/2.8/surefire-booter-2.8.jar file:/Users/verhasp/.m
 2/repository/org/apache/maven/surefire/surefire-api/2.8/surefire-api-
 2.8.jar file:/Users/verhasp/github/clalotils/target/test-classes/ fil
 e:/Users/verhasp/github/clalotils/target/classes/ file:/Users/verhasp
 /.m2/repository/junit/junit/4.10/junit-4.10.jar file:/Users/verhasp/.
 m2/repository/org/hamcrest/hamcrest-core/1.1/hamcrest-core-1.1.jar fi
 le:/Users/verhasp/.m2/repository/org/slf4j/slf4j-api/1.7.7/slf4j-api-
 1.7.7.jar file:/Users/verhasp/.m2/repository/ch/qos/logback/logback-c
 lassic/1.1.2/logback-classic-1.1.2.jar file:/Users/verhasp/.m2/reposi
 tory/ch/qos/logback/logback-core/1.1.2/logback-core-1.1.2.jar
Main-Class: org.apache.maven.surefire.booter.ForkedBooter

$ 

It really is nothing else than the manifest file defining the classpath. But why does maven do it? Sonatype people, some of whom I also know personally are clever people. They don’t do such a thing just for nothing. The reason to create a temporary JAR file to start the tests is that the length of the command line is limited on some of the operating systems that the length of the classpath may exceed. Even though Java (since Java 6) itself resolves wildcard characters in the classpath it is not an option to maven. The JAR files are in different directories in the maven repo each having long name. Wildcard resolution is not recursive, there is a good reason for it, and even if it were you just would not like to have all your local repo on the classpath.

Conclusion

  • Do not use JUnit 4.10! Use something older or newer, or be prepared for surprises.
  • Understand what a classloader is and how it works, what is does.
  • Use an operating system that has huge limit for the maximum size of a command line length.
    Or just live with the limitation.

Something else? Your ideas?

Name of the class

In Java every class has a name. Classes are in packages and this lets us programmers work together avoiding name collision. I can name my class A and you can also name your class A so long as long they are in different packages, they work together fine.

If you looked at the API of the class Class you certainly noticed that there are three different methods that give you the name of a class:

  • getSimpleName() gives you the name of the class without the package.
  • getName() gives you the name of the class with the full package name in front.
  • getCanonicalName() gives you the canonical name of the class.

Simple is it? Well, the first is simple and the second is also meaningful unless there is that disturbing canonical name. That is not evident what that is. And if you do not know what canonical name is, you may feel some disturbance in the force of your Java skills for the second also. What is the difference between the two?

If you want a precise explanation, visit the chapter 6.7 of Java Language Specification. Here we go with something simpler, aimed simpler to understand though not so thorough.

Let’s see some examples:

package pakage.subpackage.evensubberpackage;
import org.junit.Assert;
import org.junit.Test;

public class WhatIsMyName {
	@Test
	public void classHasName() {
		final Class<?> klass = WhatIsMyName.class;
		final String simpleNameExpected = "WhatIsMyName";
		Assert.assertEquals(simpleNameExpected, klass.getSimpleName());
		final String nameExpected = "pakage.subpackage.evensubberpackage.WhatIsMyName";
		Assert.assertEquals(nameExpected, klass.getName());
		Assert.assertEquals(nameExpected, klass.getCanonicalName());		
	}
...

This “unit test” just runs fine. But as you can see there is no difference between name and canonical name in this case. (Note that the name of the package is pakage and not package. To test your java lexical skills answer the question why?)

Let’s have a look at the next example from the same junit test file:

	@Test
	public void arrayHasName() {
		final Class<?> klass = WhatIsMyName[].class;
		final String simpleNameExpected = "WhatIsMyName[]";
		Assert.assertEquals(simpleNameExpected, klass.getSimpleName());
		final String nameExpected = "[Lpakage.subpackage.evensubberpackage.WhatIsMyName;";
		Assert.assertEquals(nameExpected, klass.getName());
		final String canonicalNameExpected = "pakage.subpackage.evensubberpackage.WhatIsMyName[]";
		Assert.assertEquals(canonicalNameExpected, klass.getCanonicalName());		
	}

Now there are differences. When we talk about arrays the simple name signals it appending the opening and closing brackets, just like we would do in Java source code. The “normal” name looks a bit weird. It starts with an L and semicolon is appended. This reflects the internal representation of the class names in the JVM. The canonical name changed similar to the simple name: it is the same as before for the class having all the package names as prefix with the brackets appended. Seems that getName() is more the JVM name of the class and getCanonicalName() is more like the fully qualified name on Java source level.

Let’s go on with still some other example (we are still in the same file):

	class NestedClass{}
	
	@Test
	public void nestedClassHasName() {
		final Class<?> klass = NestedClass.class;
		final String simpleNameExpected = "NestedClass";
		Assert.assertEquals(simpleNameExpected, klass.getSimpleName());
		final String nameExpected = "pakage.subpackage.evensubberpackage.WhatIsMyName$NestedClass";
		Assert.assertEquals(nameExpected, klass.getName());
		final String canonicalNameExpected = "pakage.subpackage.evensubberpackage.WhatIsMyName.NestedClass";
		Assert.assertEquals(canonicalNameExpected, klass.getCanonicalName());		
	}

The difference is the dollar sign in the name of the class. Again the “name” is more what is used by the JVM and canonical name is what is Java source code like. If you compile this code, the Java compiler will generate the files:

  • WhatIsMyName.class and
  • WhatIsMyName$NestedClass.class

Even though the class is named nested class it actually is an inner class. However in the naming there is no difference: a static or non-static class inside another class is just named the same. Now let’s see something even more interesting:

	@Test
	public void methodClassHasName() {
		class MethodClass{};
		final Class<?> klass = MethodClass.class;
		final String simpleNameExpected = "MethodClass";
		Assert.assertEquals(simpleNameExpected, klass.getSimpleName());
		final String nameExpected = "pakage.subpackage.evensubberpackage.WhatIsMyName$1MethodClass";
		Assert.assertEquals(nameExpected, klass.getName());
		final String canonicalNameExpected = null;
		Assert.assertEquals(canonicalNameExpected, klass.getCanonicalName());
	}

This time we have a class inside a method. Not a usual scenario, but valid from the Java language point of view. The simple name of the class is just that: the simple name of the class. No much surprise.

The “normal” name however is interesting. The Java compiler generates a JVM name for the class and this name contains a number in it. Why? Because nothing would stop me having a class with the same name in another method in our test class and inserting a number is the way to prevent name collisions for the JVM. The JVM does not know or care anything about inner and nested classes or classes defined inside a method. A class is just a class. If you compile the code you will probably see the file WhatIsMyName$1MethodClass.class generated by javac. I had to add “probably” not because I count the possibility of you being blind, but rather because this name is actually the internal matter of the Java compiler. It may choose different name collision avoiding strategy, though I know no compiler that differs from the above.

The canonical name is the most interesting. It does not exist! It is null. Why? Because you can not access this class from outside the method defining it. It does not have a canonical name. Let’s go on.

What about anonymous classes. They should not have name. After all, that is why they are called anonymous.

	@Test
	public void anonymousClassHasName() {
		final Class<?> klass = new Object(){}.getClass();
		final String simpleNameExpected = "";
		Assert.assertEquals(simpleNameExpected, klass.getSimpleName());
		final String nameExpected = "pakage.subpackage.evensubberpackage.WhatIsMyName$1";
		Assert.assertEquals(nameExpected, klass.getName());
		final String canonicalNameExpected = null;
		Assert.assertEquals(canonicalNameExpected, klass.getCanonicalName());
	}

Actually they do not have simple name. The simple name is empty string. They do, however have name, made up by the compiler. Poor javac does not have other choice. It has to make up some name even for the unnamed classes. It has to generate the code for the JVM and it has to save it to some file. Canonical name is again null.

Are we ready with the examples? No. We have something simple (a.k.a. primitive) at the end. Java primitives.

	@Test
	public void intClassHasName() {
		final Class<?> klass = int.class;
		final String intNameExpected = "int";
		Assert.assertEquals(intNameExpected, klass.getSimpleName());
		Assert.assertEquals(intNameExpected, klass.getName());
		Assert.assertEquals(intNameExpected, klass.getCanonicalName());
	}

If the class represents a primitive, like int (what can be simpler than an int?) then the simple name, “the” name and the canonical names are all int the name of the primitive.

Just as well an array of a primitive is very simple is it?

	@Test
	public void intArrayClassHasName() {
		final Class<?> klass = int[].class;
		final String simpleNameExpected = "int[]";
		Assert.assertEquals(simpleNameExpected, klass.getSimpleName());
		final String nameExpected = "[I";
		Assert.assertEquals(nameExpected, klass.getName());
		final String canonicalNameExpected = "int[]";
		Assert.assertEquals(canonicalNameExpected, klass.getCanonicalName());
	}

Well, it is not simple. The name is [I, which is a bit mysterious unless you read the respective chapter of the JVM specification. Perhaps I talk about that another time.

Conclusion

The simple name of the class is simple. The “name” returned by getName() is the one interesting for JVM level things. The getCanonicalName() is the one that looks most like Java source.

You can get the full source code of the example above from the gist e789d700d3c9abc6afa0 from GitHub.

Named parameters in Java

Creating a method that has many parameters is a major sin. Whenever there is need to create such a method, sniff in the air: it is code smell. Harden your unit tests and then refactor. No excuse, no buts. Refactor! Use builder pattern or even better use Fluent API. For the latter the annotation processor fluflu may be of great help.

Having all that said we may come to a point in our life when we face real life and not the idealistic pattern that we can follow in our hobby projects. There comes the legacy enterprise library monster that has the method of thousands parameters and you do not have the authority, time, courage or interest (bad for you) to modify … ops… refactor it. You could create a builder as a facade that hides the ugly API behind it if you had the time. Creating a builder is still code that you have to unit test even before you write (you know: TDD) and you just may not have the time. The code that calls the monstrous method is also there already, you just maintain it.

You can still do some little trick. It may not be perfect, but still something.

Assume that there is a method

public void monster(String contactName, String contactId, String street, String district,
                    ...
                    Long pT){
...
}

The first thing is to select your local variables at the location of the caller wisely. Pity the names are already chosen and you may not want to change it. There can be some reason for that, for example there is an application wide naming convention followed that may make sense even if not your style. So the call

monster(nm, "05300" + dI, getStrt(), d, ... , z+g % 3L );

is not exactly what I was talking about. That is what you have and you can live with it, or just insert new variables into the code:

String contactName = nm;
String contactId = "05300" + dI;
String street = getStrt();
Street district = d;
...
Long pT = z+g % 3L;
monster(contactName, contactId, street, district, ... ,pT );

or you can even write it in a way that is not usual in Java, though perfectly legal:

String contactName, contactId, street, district;
...
Long pT;
monster(contactName = nm, contactId = "05300" + dI, street = getStrt(), district = d, ... ,pT = z+g % 3L );

Tasty is it? Depends. I would not argue on taste. If you do not like that, there is an alternative way. You can define auxiliary and very simple static methods:

static <T> T contactName(T t){ return T;}
static <T> T contactId(T t){ return T;}
static <T> T street(T t){ return T;}
static <T> T district(T t){ return T;}
...
static <T> T pT(T t){ return T;}

monster(contactName(nm), contactId("05300" + dI), street(getStrt()(, district(d), ... ,pT(z+g % 3L) );

The code is still ugly but a bit more readable at the place of the caller. You can even collect static methods into a utility class, or to an interface in case of Java 8 named like with, using, to and so on. You can statically import them to your code and have some method call as nice as

doSomething(using(someParameter), with(someOtherParameter), to(resultStore));

When all that is there you can feel honky dory if you answer the final question: what the blessed whatever* is parameter pT.

(* “whatever” you can replace with some other words, whichever you like)

Logical thinking…

“The fact that logical thinking is part of the job description of a programmer does not imply that others should not practice that.”

This was a very witty comment on a Hungarian newsletter focusing on Java. The actual issue was about how to handle a SOAP message that is 1.8GB and has to be handled once a day. The issue was around checking the correctness of the message against some predefined XSD and then parsing the content and do some functionality controlled by the content.

This is a nice task and though I had no practical experience with a SOAP message of that huge size I recommended to do some benchmark on a machine which fits more or less the size of the memory and CPU of the production machine no matter what software stack is selected. These days a machine with 16GB or more memory is not so rare and one may be able to handle the 1.8GB SOAP in memory even if the overhead of JVM and Java were huge. (Which I do not say is, but it could be. If you are interested: you can measure and publish an article about that, different story.)

Some of the commenters followed a different pattern. They, the cleverer ones, suggested that perhaps the developer has to ask the business analyst (BA) about the details. It may not necessarily be the best solution from the business point of view to transfer such huge beasts over SOAP. What was the business reason to use SOAP? What was the business reason to use XML? What is the business benefit? What are the business goals? Business goals are rarely related to SOAP or XML. They are tools one (several) level lower in the solution chain.

When the business analyst gets the requirements from the business people, she should not just blindly pass it on. We, developers expect them to think a bit of technology. They are the bridge between the business people and the developers. Some of the BAs are very experienced technically and are eager to learn. Probably they are the ones that are also eager to learn on the other side: how the business work. Some BAs are less technical but still do their job. A SOAP message of 1.8GB should ring the warning bell even for a BA? Or not?

Don not hit sysadmins with NPE!

My opinion is that having a null pointer exception and getting it into the log without catching, handling and re-throwing it in another exception is not inherently bad. If we can do nothing better then it should not be a problem. The thing is that in practice almost always there is a better way to handle the situation.

Recently I was pair programming and we debugged some web code. We could not get through the authentication filter on the development environment and the authentication was not really in scope for the debugging so we decided to switch that totally off for the time. The next thing was an NPE. We looked at the code and we saw at the line something like

principal = SecurityContextHolder.getContext().getAuthentication().getPrincipal();

It was obvious: since the authentication was switched off the result of getAuthentication() was null, and therefore calling getPrincipal() on null caused the NPE. Should we modify the code to check if there is authentication information and throw a different exception? What would be the benefit to do that? The code runs slower, gets bigger (and bigger code is harder to maintain). And the NPE and the code source together are obvious. There is no need to change.

On second thought the NPE may not be that obvious for a support guy operating the code somewhere at the other side of this rotating ball. He may not have handy access to the source code and may not understand easily that the root cause for the NPE was the misconfiguration of the authentication layer. He/she has to start the server, it does not work and calls the support, raises a ticket. You as a responsible developer being at the farthest end of the support line may woke up during your finest sleep just to tell him/her that the authentication layer was misconfigured. Than you regret that you could have told it in the logs and in the exception.

Authentication auth = SecurityContextHolder.getContext().getAuthentication();
if( auth == null ){
  throw new BadConfigurationException("The authentication layer is not properly configured SecurityContextHolder.getContext().getAuthentication() returned null."
}
principal = auth.getPrincipal();

Not that big deal and pays back on the long run.

Moral: What seems to be obvious for the developer during development may not be that easy for the system administrator. Admins are not familiar with the code, may not have access to source code, have less experience in programming. On the other hand they have great experience setting up and running systems. It is a hard work, they deserve proper log messages and talkative exceptions.

What DSLs are not for

Domain specific languages are special programming languages. Each fits some special “domain” and makes the business code simpler. Using a DSL the business level problem can be implemented higher level and therefore the resulting code is simpler, it is created faster, presumably contains less errors. Some DSLs in some areas make it even possible to develop business functionality by the domain experts who have limited programming experience. There are many great books on DSLs Martin Fowler’s one being at least one of, if not the best of the topic.

Many times the decision to use DSL is to shorten release cycles. A mature software in a rapidly changing business domain may change frequently but many times the change is small. If it requires the change of the code then the whole release cycle is to be repeated. Code is modified, unit tested, release candidate is created, QA tests the new version and finally the release is ready after weeks the new business need arose. The obvious approach is to embed some DSL into the application and develop some business function that is likely to be changed in the future in this DSL. The “script” written in DSL may not be part of the real release and therefore the change can go through the system faster. Developers have less obvious coding, which developers usually do not like, business is happy getting the modified functions faster. Right?

WRONG!

But not so obviously at the first time, perhaps. The DSL functions fine, the new behavior is delivered faster and there is no problem. Some time later, however, there come a new feature that can not be implemented in the DSL and needs the change of the code application code. Why not extend the DSL and implement the new functionality in the new version of the DSL? This approach is very lucrative but it is very dangerous.

DSL are like alcohol. They can have a purpose and can serve good. A cup of quality wine after a nice summer evening supper should not harm. Too much of it regularly will ruin your life. A DSL that has too many features may be dangerous. Some may use it for the good, but there is a possibility for abuse. The release process was examined and engineered when the DSL was introduced but may not be reviewed as the DSL became more and more powerful and suddenly you may face a situation when new features are developed into the software out of the release cycle. At some point the release process and the most crucial part of it, quality assurance may be ruined.

DSL should be simple. Modification of the application scripting should also follow some release management. There may not be release management at all. I have heard of software projects where the software was released to public without any significant testing. If there was an error, the users complained about it and a new release came out an hour later. Fixing one bug, creating a new one. No problem if the business can stand that. The actual software was a facebook like application where new feature was more important for the users than uninterrupted use. Other applications in telecom, banking should be tested a bit more rigorous. Regulation may even demand all releases to be archived. In that case scripting out of the release cycle is out of question.

And there may be something in the middle. Some part, some features of the application may need strict release management, while other may not demand that. Some part can be scripted using some DSL, other core functions need strong QA and release management. Some features may mix the both: scripted and still part of the cycle.

The important message is:

Application scripting in DSL does not ease release management and/or QA. If the release management cycle can be releases for some part of the application feature, DSL may be a tool to aid that, but DSL is never the reason.

Java private, protected, public and default

You are a Java programmer, so you know what I am talking about. public modifiers make a method or field accessible from anywhere in the application. That is the simple part. But can you tell me the difference between protected and package private? (Hint: package private is the protection of a method or a field when you do not write any access modifier in front of it. Be aware! I lie!) My interview experience is that many do not know. Do I consider that as a no go for a Java developer? Not really. You may still be a good Java developer even if you do not know that. Perhaps now you will look it up somewhere. Perhaps the Java spec is a good document to start.

I tell you something more interesting.

Literally, none of the candidates know what private is. And you, reading this article, also do not know.

Ok, this is very provocative. You may be one of the few who happen to fill his brain with such a useless information and you may even have read the Java specification.

Most Java programmers think that private methods and fields are accessible only from within the class. Some even think that only from within the object instance. They believe that

public class PrivateAccessOtherObject {
    public PrivateAccessOtherObject(int i) {
        this.i = i;
    }
    private int i;
    void copyiTo(PrivateAccessOtherObject other){
        other.i = i;
    }
}

is not possible. (It is.)

So what is private?

The recent JLS says that A private class member or constructor is accessible only within the body of the top level class (§7.6) that encloses the declaration of the member or constructor.

The example in the Java specification is not the best describing the rule. Perhaps that is just a simple example. Something like this may be better explaining the concept:

public class PrivateFieldsContainingClass {
    private static class NestedClass {
        private int i;
    }
    private NestedClass nestedClassInstance = new NestedClass();
    void set(int i) {
        nestedClassInstance.i = i;
    }
    int get() {
        return nestedClassInstance.i;
    }
}

The field i is accessible from the enclosing class as well as from inside the NestedClass. This example is also simple but more to the point that the specification example misses. Is there any real use of this possibility? Not really.

Bonus question: why did I say I was lying?

How We Chose Framework

When you develop your application most of the time you are writing code that deals with some of the resources. Code lines that open database connection, allocate memory and alikes. The lower level you code the more code is dealing with the computational environment. This is cumbersome and though may be enjoyable for some of the programmers the less such code is needed the better. The real effort delivering business value is when you write code lines that implement business function. It is obvious that you just can not make a simple decision to write only business function implementing code. The other types of code lines are also needed to execute the code and it is also true that the border between infrastructure code and business code is sometimes blurry. You just can not tell sometime whether the code you type is infrastructure or business.

What you really can do is to select a framework that fits the business problem the best. Something that is easy to configure, does not need boiler plate code and easy to learn. That way you can focus more on the business code. Well, easy to say, hard to do. How could you tell which framework will be the best on the long run when the project has so many uncertainties? You can not tell precisely. But you can try and strive for more precision. And a model does to follow does not hurt. So what is the model in this case?

During the lifetime of a project there will be a constant effort to develop the business logic. If the business logic is fixed the number of the code lines to develop that can not change much. There may be some difference because some programming language is more verbose than the other, but this is not significant. The major difference is framework supporting code. There is also an effort to learn the framework, however that may be negligible for a longer project. This effort is needed at the start of the project, say sprint 1 and 2 and after that this fixed cost diminishes compared to the total cost of development. For the model I setup I will neglect this effort not at least because I can not measure a-priori how much effort an average programmer needs to learn a specific framework.

So the final, very simplified model is to compare the amount of code delivering business value compared to the amount of code configuring and supporting the selected framework. How to measure this?

I usually… Well, not usually. Selecting a framework is not an everyday practice. What we did in our team last time to perform a selection was the following:

We pre selected five possible frameworks. We ruled out one of them in the first run as not being widely known and used. We did not want to be on the bleeding edge. Another was filtered out as closer examination showed that the framework is a total misfit for our purpose. There remained three. After that we looked up projects on GitHub that utilized one of the framework, at least two for each framework (and not more than three). We looked at 8 projects total and we counted the lines categorizing each as business versus framework code lines. And then we realized that this just can not be done during the lifetime of a human, therefore we made it simpler. We started to categorize the classes based on their names. There were business classes related to some business data and also classes named after some business functions. The rest was treated as framework supporting, configuration class.

The final outcome was to sculptured into a good old ppt presentation and we added the two slides to the other slides that qualitatively analyzed the three frameworks listing pros and the cons. The final outcome, no surprise, was coherent: the calculation showed that the framework requiring the less configuration and supporting code was the one we favored anyway.

What was the added value then?

Making the measurement we had to review projects and we learnt a lot about the frameworks. Not as much as one coding in it, but more than just staring at marketing materials. We touched real code that programmers created while facing the real problems and the real features of the frameworks. This also helps the evaluator to gain more knowledge, gives a rail to grab on and lead us where to look, what to try when piloting the framework.

It is also an extremely important result that the decision process left less doubt in us. If the outcome were just opposite then we would have been in trouble and it would have made us thinking hard: why did we favor a framework that needs more business irrelevant code. But it did not. The result was concise with common sense.

Would I recommend this calculation to be the sole source for framework selection? Definitely no. But it can be a good addition that you can perform burning two or three days of your scrum team and it also helps your team to get the tip of their fingers into new technologies.

Follow

Get every new post delivered to your Inbox.

Join 931 other followers