Java Deep

Pure Java, what else

Named parameters in Java

Creating a method that has many parameters is a major sin. Whenever there is a need to create such a method, sniff the air: it is a code smell. Harden your unit tests and then refactor. No excuses, no buts. Refactor! Use the builder pattern or, even better, a fluent API. For the latter the annotation processor fluflu may be of great help.

Having said all that, we may come to a point when we face real life and not the idealistic patterns that we can follow in our hobby projects. There comes the legacy enterprise library monster that has a method with thousands of parameters, and you do not have the authority, time, courage or interest (bad for you) to modify … oops … refactor it. You could create a builder as a facade that hides the ugly API behind it, if you had the time. A builder is still code that you have to unit test, even before you write it (you know: TDD), and you just may not have the time. The code that calls the monstrous method is also there already; you just maintain it.

You can still play a little trick. It may not be perfect, but it is still something.

Assume that there is a method

public void monster(String contactName, String contactId, String street, String district,
                    ...
                    Long pT){
...
}

The first thing is to choose your local variable names at the location of the caller wisely. Pity the names are already chosen and you may not want to change them. There can be some reason for that: for example, there may be an application-wide naming convention that makes sense even if it is not your style. So the call

monster(nm, "05300" + dI, getStrt(), d, ... , z+g % 3L );

is not exactly what I was talking about. That is what you have and you can live with it, or just insert new variables into the code:

String contactName = nm;
String contactId = "05300" + dI;
String street = getStrt();
String district = d;
...
Long pT = z+g % 3L;
monster(contactName, contactId, street, district, ... ,pT );

or you can even write it in a way that is not usual in Java, though perfectly legal:

String contactName, contactId, street, district;
...
Long pT;
monster(contactName = nm, contactId = "05300" + dI, street = getStrt(), district = d, ... ,pT = z+g % 3L );

Tasty, is it? Depends. I would not argue about taste. If you do not like that, there is an alternative. You can define auxiliary and very simple static methods:

static <T> T contactName(T t){ return t;}
static <T> T contactId(T t){ return t;}
static <T> T street(T t){ return t;}
static <T> T district(T t){ return t;}
...
static <T> T pT(T t){ return t;}

monster(contactName(nm), contactId("05300" + dI), street(getStrt()), district(d), ... ,pT(z+g % 3L) );

The code is still ugly but a bit more readable at the place of the caller. You can even collect such static methods, named with, using, to and so on, into a utility class, or into an interface in the case of Java 8. You can statically import them into your code and have a method call as nice as

doSomething(using(someParameter), with(someOtherParameter), to(resultStore));
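To make the trick concrete, here is a minimal, self-contained sketch. The class NamedArgsDemo, the method describe and the argument values are made up for illustration; describe only stands in for the legacy monster. Note that the compiler does not check that a helper name matches the parameter it is passed to, so this is documentation at the call site and nothing more.

```java
// Hypothetical demo: NamedArgsDemo, describe(...) and the sample values are illustrative only.
final class NamedArgsDemo {

    // Identity helpers: each returns its argument unchanged and only labels the call site.
    static <T> T contactName(T value) { return value; }
    static <T> T contactId(T value) { return value; }
    static <T> T street(T value) { return value; }

    // Stand-in for the legacy method with the long parameter list.
    static String describe(String contactName, String contactId, String street) {
        return contactName + "/" + contactId + "/" + street;
    }

    public static void main(String[] args) {
        String nm = "Jane";
        String dI = "42";
        // The helper names tell the reader which argument is which.
        String result = describe(contactName(nm), contactId("05300" + dI), street("Main St"));
        System.out.println(result);
    }
}
```

Swapping two helpers of the same type still compiles, so the names are only as reliable as the person who typed them.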

When all that is there you can feel hunky-dory if you can answer the final question: what the blessed whatever* is parameter pT.

(* “whatever” you can replace with some other words, whichever you like)

Logical thinking…

“The fact that logical thinking is part of the job description of a programmer does not imply that others should not practice that.”

This was a very witty comment in a Hungarian newsletter focusing on Java. The actual issue was how to handle a SOAP message that is 1.8GB in size and has to be processed once a day. The task was to check the correctness of the message against some predefined XSD, then parse the content and perform some functionality controlled by the content.

This is a nice task, and though I had no practical experience with a SOAP message of that size, I recommended running some benchmarks on a machine that more or less matches the memory and CPU of the production machine, no matter what software stack is selected. These days a machine with 16GB or more memory is not so rare, and one may be able to handle the 1.8GB SOAP message in memory even if the overhead of the JVM and Java were huge. (I do not say it is, but it could be. If you are interested, you can measure it and publish an article about that; different story.)

Some of the commenters followed a different pattern. They, the cleverer ones, suggested that perhaps the developer should ask the business analyst (BA) about the details. It may not necessarily be the best solution from the business point of view to transfer such huge beasts over SOAP. What was the business reason to use SOAP? What was the business reason to use XML? What is the business benefit? What are the business goals? Business goals are rarely related to SOAP or XML. Those are tools one (or several) levels lower in the solution chain.

When the business analyst gets the requirements from the business people, she should not just blindly pass them on. We developers expect BAs to think a bit about technology. They are the bridge between the business people and the developers. Some of the BAs are very experienced technically and are eager to learn. Probably they are the ones who are also eager to learn on the other side: how the business works. Some BAs are less technical but still do their job. Should a SOAP message of 1.8GB ring the warning bell even for a BA? Or not?

Do not hit sysadmins with NPE!

My opinion is that having a null pointer exception and getting it into the log without catching, handling and re-throwing it in another exception is not inherently bad. If we can do nothing better then it should not be a problem. The thing is that in practice almost always there is a better way to handle the situation.

Recently I was pair programming and we debugged some web code. We could not get through the authentication filter in the development environment, and authentication was not really in scope for the debugging, so we decided to switch it off totally for the time being. The next thing was an NPE. We looked at the code and at that line we saw something like

principal = SecurityContextHolder.getContext().getAuthentication().getPrincipal();

It was obvious: since the authentication was switched off the result of getAuthentication() was null, and therefore calling getPrincipal() on null caused the NPE. Should we modify the code to check if there is authentication information and throw a different exception? What would be the benefit to do that? The code runs slower, gets bigger (and bigger code is harder to maintain). And the NPE and the code source together are obvious. There is no need to change.

On second thought, the NPE may not be that obvious for a support person operating the code somewhere on the other side of this rotating ball. He or she may not have handy access to the source code and may not easily understand that the root cause of the NPE was the misconfiguration of the authentication layer. He or she starts the server, it does not work, and so calls support and raises a ticket. You, as a responsible developer at the farthest end of the support line, may be woken up during your finest sleep just to tell him or her that the authentication layer was misconfigured. Then you regret that you did not say so in the logs and in the exception.

Authentication auth = SecurityContextHolder.getContext().getAuthentication();
if( auth == null ){
  throw new BadConfigurationException("The authentication layer is not properly configured: SecurityContextHolder.getContext().getAuthentication() returned null.");
}
principal = auth.getPrincipal();

Not that big a deal, and it pays back in the long run.
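If such checks show up in more than one place, the idea can be pulled into a small helper. The following is only a sketch: BadConfigurationException is not a class of Spring Security or of any other library, so it is defined here, and Configured.require is a name I made up. A plain Object stands in for the result of getAuthentication().

```java
// Hypothetical sketch: BadConfigurationException and Configured are illustrative names,
// they are not part of Spring Security or of any other library.
final class BadConfigurationException extends RuntimeException {
    BadConfigurationException(String message) {
        super(message);
    }
}

final class Configured {
    // Returns the value unchanged, or fails with a message written for the person operating the system.
    static <T> T require(T value, String hintForOperator) {
        if (value == null) {
            throw new BadConfigurationException(hintForOperator);
        }
        return value;
    }

    public static void main(String[] args) {
        Object authentication = null; // stands in for the result of getAuthentication()
        try {
            require(authentication,
                    "The authentication layer is not properly configured: no authentication object is available.");
        } catch (BadConfigurationException e) {
            // The operator sees the probable cause, not just a bare NPE.
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the helper is only that the message names the probable misconfiguration; the JDK's Objects.requireNonNull(value, message) would do similar work if throwing a NullPointerException with a message is acceptable.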

Moral: What seems obvious to the developer during development may not be that easy for the system administrator. Admins are not familiar with the code, may not have access to the source code, and have less experience in programming. On the other hand they have great experience in setting up and running systems. It is hard work; they deserve proper log messages and talkative exceptions.

What DSLs are not for

Domain specific languages are special programming languages. Each fits some special “domain” and makes the business code simpler. Using a DSL, the business level problem can be implemented at a higher level, and therefore the resulting code is simpler, is created faster, and presumably contains fewer errors. Some DSLs in some areas even make it possible for domain experts with limited programming experience to develop business functionality. There are many great books on DSLs, Martin Fowler’s being one of the best on the topic, if not the best.

Many times the decision to use a DSL is made to shorten release cycles. Mature software in a rapidly changing business domain may change frequently, but many times the change is small. If it requires a change of the code, then the whole release cycle has to be repeated: code is modified and unit tested, a release candidate is created, QA tests the new version, and finally the release is ready weeks after the new business need arose. The obvious approach is to embed some DSL into the application and to develop the business functions that are likely to change in the future in this DSL. The “script” written in the DSL may not be part of the real release, and therefore the change can go through the system faster. Developers have less of the obvious coding that they usually do not like, and business is happy to get the modified functions faster. Right?

WRONG!

But perhaps not so obviously at first. The DSL functions fine, the new behavior is delivered faster and there is no problem. Some time later, however, there comes a new feature that cannot be implemented in the DSL and needs a change of the application code. Why not extend the DSL and implement the new functionality in the new version of the DSL? This approach is very tempting, but it is very dangerous.

DSLs are like alcohol. They can have a purpose and can serve a good cause. A glass of quality wine after a nice summer evening supper should not harm. Too much of it regularly will ruin your life. A DSL that has too many features may be dangerous. Some may use it for good, but there is a possibility of abuse. The release process was examined and engineered when the DSL was introduced, but it may not have been reviewed as the DSL became more and more powerful, and suddenly you may face a situation where new features are developed into the software outside the release cycle. At some point the release process and its most crucial part, quality assurance, may be ruined.

A DSL should be simple. Modification of the application scripting should also follow some release management. There may not be release management at all. I have heard of software projects where the software was released to the public without any significant testing. If there was an error, the users complained about it and a new release came out an hour later, fixing one bug, creating a new one. No problem if the business can stand that. The actual software was a Facebook-like application where new features were more important for the users than uninterrupted use. Other applications, in telecom or banking, should be tested a bit more rigorously. Regulation may even demand that all releases be archived. In that case scripting outside the release cycle is out of the question.

And there may be something in the middle. Some parts, some features of the application may need strict release management, while others may not demand that. Some parts can be scripted using some DSL; other core functions need strong QA and release management. Some features may mix both: scripted and still part of the cycle.

The important message is:

Application scripting in a DSL does not ease release management and/or QA. If the release management cycle can be relaxed for some part of the application features, a DSL may be a tool to aid that, but the DSL is never the reason.

Java private, protected, public and default

You are a Java programmer, so you know what I am talking about. public modifiers make a method or field accessible from anywhere in the application. That is the simple part. But can you tell me the difference between protected and package private? (Hint: package private is the protection of a method or a field when you do not write any access modifier in front of it. Be aware! I lie!) My interview experience is that many do not know. Do I consider that as a no go for a Java developer? Not really. You may still be a good Java developer even if you do not know that. Perhaps now you will look it up somewhere. Perhaps the Java spec is a good document to start.

I tell you something more interesting.

Literally, none of the candidates know what private is. And you, reading this article, also do not know.

Ok, this is very provocative. You may be one of the few who happen to have filled their brain with such useless information, and you may even have read the Java specification.

Most Java programmers think that private methods and fields are accessible only from within the class. Some even think that only from within the object instance. They believe that

public class PrivateAccessOtherObject {
    public PrivateAccessOtherObject(int i) {
        this.i = i;
    }
    private int i;
    void copyiTo(PrivateAccessOtherObject other){
        other.i = i;
    }
}

is not possible. (It is.)

So what is private?

The recent JLS says that “A private class member or constructor is accessible only within the body of the top level class (§7.6) that encloses the declaration of the member or constructor.”

The example in the Java specification is not the best one to describe the rule. Perhaps it is just meant to be a simple example. Something like this may explain the concept better:

public class PrivateFieldsContainingClass {
    private static class NestedClass {
        private int i;
    }
    private NestedClass nestedClassInstance = new NestedClass();
    void set(int i) {
        nestedClassInstance.i = i;
    }
    int get() {
        return nestedClassInstance.i;
    }
}

The field i is accessible from the enclosing class as well as from inside NestedClass. This example is also simple, but it makes the point that the specification example misses. Is there any real use of this possibility? Not really.
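The rule also works in the other directions, which the following sketch tries to show: a nested class reaching a private member of its sibling and a private method of the enclosing class. All the names (PrivateScopeDemo, First, Second, secret) are made up for this illustration.

```java
// Hypothetical demo of "private means private to the top level class".
final class PrivateScopeDemo {

    // Private member of the top level class itself.
    private static int secret() {
        return 42;
    }

    static final class First {
        private int value = 1;
    }

    static final class Second {
        // A sibling nested class reads First's private field and calls
        // the private method of the enclosing class: both are legal.
        static int sum(First first) {
            return first.value + secret();
        }
    }

    public static void main(String[] args) {
        System.out.println(Second.sum(new First())); // prints 43
    }
}
```

Everything here lives inside the body of one top level class, and that is the only boundary that private cares about.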

Bonus question: why did I say I was lying?

How We Chose Framework

When you develop your application, most of the time you are writing code that deals with some resources: code lines that open database connections, allocate memory and the like. The lower the level you code at, the more code deals with the computational environment. This is cumbersome, and though it may be enjoyable for some programmers, the less such code is needed the better. The real effort delivering business value is writing the code lines that implement business functions. Obviously you cannot simply decide to write only code that implements business functions. The other types of code lines are also needed to execute the code, and it is also true that the border between infrastructure code and business code is sometimes blurry. Sometimes you just cannot tell whether the code you type is infrastructure or business.

What you really can do is select a framework that fits the business problem best: something that is easy to configure, does not need boilerplate code and is easy to learn. That way you can focus more on the business code. Well, easy to say, hard to do. How could you tell which framework will be the best in the long run when the project has so many uncertainties? You cannot tell precisely. But you can try and strive for more precision. And a model to follow does not hurt. So what is the model in this case?

During the lifetime of a project there will be a constant effort to develop the business logic. If the business logic is fixed, the number of code lines needed to develop it cannot change much. There may be some difference because some programming languages are more verbose than others, but this is not significant. The major difference is the framework supporting code. There is also an effort to learn the framework, but that may be negligible for a longer project. This effort is needed at the start of the project, say in sprints 1 and 2, and after that this fixed cost diminishes compared to the total cost of development. For the model I set up I will neglect this effort, not least because I cannot measure a priori how much effort an average programmer needs to learn a specific framework.

So the final, very simplified model is to compare the amount of code delivering business value with the amount of code configuring and supporting the selected framework. How to measure this?

I usually… Well, not usually. Selecting a framework is not an everyday practice. What we did in our team last time to perform a selection was the following:

We preselected five possible frameworks. We ruled out one of them in the first round as not being widely known and used. We did not want to be on the bleeding edge. Another was filtered out when closer examination showed that the framework was a total misfit for our purpose. Three remained. After that we looked up projects on GitHub that used one of the frameworks, at least two for each framework (and not more than three). We looked at 8 projects in total and started to count the lines, categorizing each as a business or a framework code line. Then we realized that this just cannot be done during the lifetime of a human, so we made it simpler. We started to categorize the classes based on their names. There were business classes related to some business data and also classes named after some business functions. The rest were treated as framework supporting or configuration classes.

The final outcome was sculpted into a good old ppt presentation, and we added the two slides to the other slides that qualitatively analyzed the three frameworks, listing the pros and the cons. The final outcome, no surprise, was coherent: the calculation showed that the framework requiring the least configuration and supporting code was the one we favored anyway.
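We did the categorization by hand, but the arithmetic behind it is simple enough to sketch. The code below is not what we ran; the class ClassNameRatio, the list of business terms and the sample class names are all assumptions made up for the illustration. A class counts as business if its simple name contains one of the business terms; everything else counts as framework support or configuration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the name-based categorization; not the tool we actually used.
final class ClassNameRatio {

    // Assumed business vocabulary of the surveyed project; a real survey needs its own list.
    private static final List<String> BUSINESS_TERMS = Arrays.asList("Customer", "Order", "Invoice");

    // A class is treated as business code if its simple name mentions a business term.
    static boolean isBusinessClass(String simpleName) {
        return BUSINESS_TERMS.stream().anyMatch(simpleName::contains);
    }

    // Share of business classes among all classes, between 0.0 and 1.0.
    static double businessShare(List<String> classNames) {
        if (classNames.isEmpty()) {
            return 0.0;
        }
        long business = classNames.stream().filter(ClassNameRatio::isBusinessClass).count();
        return (double) business / classNames.size();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "CustomerService", "OrderRepository", "WebSecurityConfig", "AppInitializer");
        System.out.println(businessShare(names)); // 2 of 4 classes are business classes
    }
}
```

A higher share means less framework supporting code per business class, which is the number the model compares across frameworks.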

What was the added value then?

Making the measurement, we had to review projects and we learnt a lot about the frameworks. Not as much as one who codes in them, but more than by just staring at marketing materials. We touched real code that programmers created while facing real problems and the real features of the frameworks. This also helps the evaluators gain more knowledge, gives a rail to hold on to, and shows where to look and what to try when piloting the framework.

It is also an extremely important result that the decision process left less doubt in us. If the outcome had been just the opposite, we would have been in trouble, and it would have made us think hard: why did we favor a framework that needs more business-irrelevant code? But it was not. The result was consistent with common sense.

Would I recommend this calculation to be the sole source for framework selection? Definitely no. But it can be a good addition that you can perform burning two or three days of your scrum team and it also helps your team to get the tip of their fingers into new technologies.

Logging or Commenting?

When my recent article was republished on dzone Jonathan Fisher added a valuable comment stating:


I think I have something else you should write an article on: “Logging or Commenting?” I see debug statements as “living comments”, one that explain the execution of the program to the next guy, but also provides valuable intel in production.

I have seen that practice only once. There was a coding environment with strict coding rules that forbade commenting. They said you should write Javadoc comments on the interfaces but nothing else. If your code is not understandable from reading the code and the unit tests, then your code has to be changed.

In this environment programmers who wanted to comment soon started using logs. But before getting to that let’s see why.

Is Commenting Bad?

Generally commenting is bad. It was not bad when assembly and FORTRAN were the programming languages. Those languages were there to generate executable code and not to express ideas. Languages today focus more on expressiveness and ease of coding, and the compiler does much more of the translation from ideas to executable code than was possible in the era of FORTRAN. Now we have the CPU and memory to compile languages like Java, Groovy, Scala. When these languages are at hand and you feel like you need comments, you have to think about two things:

  1. Is your code really readable? Could it not just be rewritten to express the ideas carved into the comment?
  2. Is the information you want to type as a comment really a comment? Or should it be part of the documentation and not be put into the code?

If you think you can not write the code more expressive because the business domain is just complex and does not fit the code, please visit the article The Golden Rules of Code Documentation and the rant to the challenge.

Comments are to be read by programmers, who maintain your code. If the words would rather fit the documentation in a Wiki, do not distract the programmers following you.

Are logs bad?

Generally logging is good. However, if logging is used to get around commenting restrictions, then a good weapon is used for bad. Don’t. This is a more general concept than logging; as a matter of fact, it is even broader than programming. Don’t use a tool for something it was not designed for.

There is a fear that logging decreases performance. With modern logging libraries and solutions this should not be a factor to seriously consider, except in edge cases. Even if it is, first measure and then tune. Do not prejudge or make a priori assumptions about performance.

Conclusion

Logging should add to readability. Since it is a separate aspect, different from the original business aspect the code was developed for, there is a danger that inserting the logging statements will decrease readability. When you develop code you should pay attention to this. For example I recommend that you never externalize or use fields/variables to store the logging strings. Logging texts should make sense where the statements are.
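A short sketch of what I mean, using only java.util.logging from the JDK. The class TransferService and its logic are invented for the example. The log texts are written right where the statements are, not hidden in constants, and the debug level message is passed as a supplier so the string is built only when that level is enabled.

```java
import java.util.logging.Logger;

// Hypothetical example class; only the logging style matters here.
final class TransferService {

    private static final Logger LOG = Logger.getLogger(TransferService.class.getName());

    // Returns the balance after the transfer, or the unchanged balance if the transfer is rejected.
    static long transfer(long balance, long amount) {
        // The text sits next to the code it describes; the supplier defers
        // string building until the FINE level is actually enabled.
        LOG.fine(() -> "transferring " + amount + " from a balance of " + balance);
        if (amount > balance) {
            LOG.warning("transfer rejected: amount " + amount + " exceeds balance " + balance);
            return balance;
        }
        long remaining = balance - amount;
        LOG.info("transfer done, remaining balance " + remaining);
        return remaining;
    }
}
```

Reading the method top to bottom, the log statements tell the same story as the code, which is the readability I am after.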

Insert logging statements bravely, never hesitate. When you do a code review, take into account that there are two sequences of logging statements. One sequence is linear as you read the code. The other sequence is how the printouts get into the log file following the program execution. Perhaps a log review involving production and support people can be a good practice. It can be similar to a code review: a fresh eye looks at the generated logs and gives feedback on readability. I have experienced this ad hoc but never as a planned activity. If anyone has tried it, give feedback.

And generally: do logging for logging. Use commenting for commenting. If you have to dig a grave, use a shovel. If you have a nail: use a hammer. Use the appropriate tool and do not mix usage.

Unit test deprecated methods

Deprecated methods have to be treated differently. At least in my opinion. The question I did not discuss in that article is whether we have to unit test deprecated methods or not. For the impatient, here is my statement:

Deprecated methods have to be unit tested the same way as other methods.

Probably this is not a question when there is already a unit test for the method. In that case you just leave it there and keep it running each time the CI server fires. The question may come up in your mind when you inherit some legacy code and you yourself deprecate some methods, or just find them deprecated with no appropriate unit test. Why bother to invest time writing unit tests when the method will no longer be in use?

The answer to this why lies in the difference between a deprecated and a deleted method. The deprecated method is still in use. It may happen that no one uses the method, but that is not guaranteed. If it were, you could just delete the method. A deprecated method is still in the published API with a slight comment: you had better not use it. Clear?
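As an illustration, here is a deprecated method that still has its own check. PriceCalculator and its methods are invented for the example, and the check is written as a plain static method to keep the sketch free of dependencies; in a real project it would be an ordinary test in whatever unit test framework you use.

```java
// Hypothetical API with an old, deprecated entry point that delegates to the new one.
final class PriceCalculator {

    // New API: amounts in cents, VAT in whole percents.
    static long grossCents(long netCents, int vatPercent) {
        return netCents + netCents * vatPercent / 100;
    }

    /** @deprecated use {@link #grossCents(long, int)} instead. */
    @Deprecated
    static double gross(double net, int vatPercent) {
        return grossCents(Math.round(net * 100), vatPercent) / 100.0;
    }
}

final class PriceCalculatorCheck {

    // The deprecated method is still published, so it is still verified.
    @SuppressWarnings("deprecation")
    static void deprecatedGrossStillWorks() {
        double result = PriceCalculator.gross(10.00, 20);
        if (Math.abs(result - 12.00) > 1e-9) {
            throw new AssertionError("expected 12.00 but got " + result);
        }
    }
}
```

Because the old method delegates to the new one, the check also guards against a change in grossCents silently breaking the callers that have not migrated yet.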

What if there is no time to write the unit tests?

If there is no time (treat this precondition as hypothetical and beyond question; what there should be time for is another topic) then there is no question.

Unit tests do not write themselves during the night while you sleep.

What if you have some time, but it is kind of short? In that case, if nothing else prevails, you can defer the tests for the deprecated methods to the end of the task list. If nothing else prevails. Being deprecated does not necessarily mean not important. Many may still use it. It means: deprecated.

You can program bug free

Money Spent Wise on QA

You can not. This is a lie, just like the cake. You can lower the number of bugs. The more you spend wisely on QA the fewer bugs you will have. The magic word is “wisely”. You can spend an unlimited amount without increasing the quality. Old truth in just any area: you can waste money if you are not wise.

On the other side of the line there is no free lunch. If you do not spend enough on QA you can only dream about properly working software. You won’t have fewer bugs if you do not spend on QA. And you will never be bug free.

Bug free software is a contradiction. We have bugs, and we do not like them. We have to zap them.

Ad-hoc bug zap

Most of the bugs are zapped in an ad-hoc manner. The developer writes some test, develops some code. The code does not work, has bugs and the developer fixes the code until it runs fine with the unit tests crafted beforehand. Integration tests and end-to-end tests may also discover some bugs. These are usually reported in a bug tracking system and the developers fix them and they are eliminated in the next release.

Sometimes, however, bugs are not that easy to handle.

Classified bugs

Sometimes bugs are not that simple to handle. Sometimes it takes a lot of time to fix a bug. It may need analysis of how and when the bug manifests. Sometimes bugs magically disappear, as if computers were nondeterministic. Other times fixing a bug requires significant code modification. This is, by the way, a clear sign of a design shortcoming on the technical or business level (or both).

If a bug is difficult, developers may be reluctant to hunt for it and fix it. If it needs the alteration of a lot of code, they may tend to declare the behavior to be a feature rather than a bug. Fortunately we know who has the last word in such a debate: money. Feature is what business uses to make value. Everything else is just behavior. And here comes the other aspect of bug classification: does business care about a certain behavior? If yes, the behavior is a bug and is a target for a fix. If not, then it is just behavior.

Bug Classification Quadrant

Looking at that in a quadrant we have four different types of bugs.

  1. There are bugs that are easy to fix and have high impact on the business. They are the easy picks. Developers are usually eager to fix those bugs and become the hero of the project.
  2. Bugs hard to fix and having no business impact will never be fixed. There is no point.
  3. Bugs that are easy to fix and have low business impact are often fixed when a developer has some time (minutes) to spare. This is the hobby area.
  4. There are bugs that are hard to fix and have high impact on the business. These are the critical bugs that get most of the attention. These bugs will not be fixed from one day to the other and therefore they have to be assessed, budgeted, scheduled and eventually fixed.
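For those who think better in code, the four cases can be written down as a tiny decision table. This is only a sketch of the list above; the enum BugQuadrant and its constant names are my own labels, and the two boolean inputs are of course a crude simplification of "cost" and "value".

```java
// Hypothetical encoding of the quadrant; the names are illustrative labels only.
enum BugQuadrant {
    EASY_PICK,                   // easy to fix, high business impact
    NEVER_FIXED,                 // hard to fix, no business impact
    HOBBY,                       // easy to fix, low business impact
    ASSESS_BUDGET_SCHEDULE_FIX;  // hard to fix, high business impact

    static BugQuadrant classify(boolean hardToFix, boolean highBusinessImpact) {
        if (highBusinessImpact) {
            return hardToFix ? ASSESS_BUDGET_SCHEDULE_FIX : EASY_PICK;
        }
        return hardToFix ? NEVER_FIXED : HOBBY;
    }
}
```

In real life neither coordinate is a yes or no value, and that is exactly where the debates start.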

Cost of bug fix

Business needs these bugs fixed, and many times the cost of fixing them does not represent any extra value. Usually business feels that these bugs just have to be fixed, but not at its cost. They have already paid for the feature, which eventually fails. If the development organization is a separate company, then the vendor should have the budget to do the fix; it had to be included in the contractual price. If the development is in-house, the cost may not be discussed or may be T&M (time and materials) based. In the latter case the developers “charge” only the hours they spent developing the feature instead of a fixed project price, but when there is some bug, the hours to fix it are also charged. This is something not clarified well enough among the players, and it is many times a source of interdepartmental stress.

“Easy picks” are fixed without budgeting, and business people press the developers to fix the bugs “in their free time” without a separate budget. The driver for this may be to lessen the burden of costs that business people are usually measured on, and also, many times, to hide the errors that were in the specification or in the communication on the business side. On the other hand, developers want even the easier-to-fix bugs to be assessed, since they are usually measured on billable time. They are also reluctant to burn their so-called “free time”. This time is not really “free time”. It is covered by work hours and is usually used for self-education. And developers (at least those that deserve the title) love to educate themselves (for example by reading blogs like this).

For this reason there is no clear and precise border between the “Easy Pick” and the “Assess, Budget, Schedule, Fix” areas. Business people want to pull the border to the east, leaving more and more bugs in the easy pick area, while developers want the border more to the west, and the fight area is in between. The real problem arises when the project members spend a significant amount of time and effort in that area.

Conclusion

When you find yourself in a debate about bugs and features with the business people, try to bring up this quadrant in your mind. Many times the cost/value coordinates of the bugs are not discovered. Think about them, even if only qualitatively.

Knowing the bits

We use complex systems. My mother once said that, for all she cared, there could be little leprechauns behind the TV screen redrawing the picture 50 times a second. (At least she knew that the TV in Europe had 50 (half) frames every second.) Most people do not care about the electronics and the software around us. The trend is that this technology penetration is going to be even more dense. Electronics gets cheaper, programming becomes easier, and soon toilet paper will have one-time-use embedded computers in it. (Come up with a good application!) Face recognition is not the privilege of the NSA, CIA, KGB or Mossad, and the spread of the technology does not stop at the level of big corporations like FB or Google. Shops start to install cameras and software that recognize and identify frequent buyers, helping the work of the sales staff. People get used to it, and IT personnel are no different, are we?

Kind of yes. The difference is that we are interested in the details of how those leprechauns do their job. We know that these days there are liquid crystals in the screen, they are controlled by low voltage signals (at least compared to the voltages of the former CRT solutions) and that there is a processor in the TV/toaster/toilet paper and it is programmed in a language called e.g. Java.

We Java programmers program these applications, and we use not only the language (including the RT) but also layered software and frameworks. How does this layered software work? Should we understand it, or should we just use it and hope that it works?

The more you know a framework the better you can use it.

Better means faster, more reliable, creating code that is more likely to be compatible with future versions. On the other hand there should be a reasonable stop when you have to halt learning and start using. There is no point to know all the details of a framework, if you never start using it. You should aim for the value you generate.

On the other end of the line, however, if you do not have enough knowledge of the framework you may end up using a hammer instead of a shovel to dig a hole. I usually feel confident when my knowledge reaches the level where I understand how they (the developers of the framework) did it. When I can bravely say:

If I had time (sometimes perhaps more than lifetime of a single person) I could develop that framework myself.

Of course, I will not, because I do not have the time and also, more importantly because there is no point developing something that is already developed with appropriate quality. Or is there?

I could do it better.

I have heard that many times from junior programmers and from programmers, who considered themselves not that junior. The correct attitude would have been:

I could do it better, but I won’t because it is done and is good enough.

You do not need the best. You just need a solution that is good enough. There is no point in investing more if there is no extra leverage. There is no point in investing more even if there is leverage, as long as the same investment would yield more in other areas.

Generally that is it when you are professional. Face it!
