Author Archives: Peter Verhas

About Peter Verhas

Java expert, developer, architect, author, speaker, teacher, mentor and he very much likes to be alive.

Docs as Code is not enough

Docs as Code

it is the first step to the right direction

The approach to treating your documentation the same way as program code is a step in the right direction, but it is far from state-of-the-art. The practice is detailed on many websites that advocate the use of docs-as-code (DAC). For example the Write the Docs community has a great article on docs-as-code. The article lists

Issue Trackers
Version Control (Git)
Plain Text Markup (Markdown, reStructuredText, Asciidoc)
Code Reviews
Automated Tests

as required tools to this approach. Another example is docs-as-code, which is a toolset for documentation maintenance. They write

With docs-as-code, you treat your documentation the same way as your code.

You use…

your IDE to write it
your version control system to store and version it
your test-runner to test it
your build system to build and deploy it”

This is very much the same as the approach of Write the Docs.

You have to have your documents in a format supported by the DAC tools. Use version control, document review, automated build, issue tracker, and automated tests. It is very much the same approach we use in code development.

Essentially it is a copy of the professional software development process’ coding part. Documentation, however, is not coding. While it is a good idea to reuse some parts of the coding methodology technics, there is more to it.

What docs-as-code Ignores

Coding is a transformation process converting documentation, namely the requirement documentation, into code. The requirement documentation may not be documentation in the classical sense. It may be some note, a list of wishes on a jot of paper. Still, the essence is to convert some human affine into machine affine. Some techniques try to support this process, but most of these techniques die when in production. For example, creating the documentation as UML and making it so precise that the code generation is automatic afterward is not feasible. You could do it in principle, but the cost of the effort is too high. It is cheaper to create the code than documentation that defines the functionality with mathematical precision.

Documenting an application is precisely the opposite direction.

Something that failed in one direction does not necessarily fail when we try to go the other direction. You may not be able to jump from the river to the bridge, but the other way is very much possible.

When creating the program documentation, our source is precisely describing the functionality we want to document. After all, the code is the most precise documentation of the application functionality. We already have the precision, which was not feasible to have for the other way around.

The docs-as-code approach, as described by most articles, ignores it. However, it can be amended, and it should be. We can look at it as the next step in the docs-as-code evolution.

The next step

We can categorize documentation into two categories.

Explanatory, and
Reference

Sometimes a document belongs to one of the categories, but documents are a mix of the two most of the time. It may even happen that you cannot even tell if a sentence in a document belongs to one category or the other.

Creating an explanatory style text cannot be automated. It needs human effort to create sentences that are clear and easily understandable. The importance is demonstrated by the lack of them in this article, as you may have noticed.

However, creating the reference text is more or less a mechanical task. The documentarian (a term from write-the-docs) copies the key’s name to explain how to configure the system and writes a sentence around it. Copies some sample code from the unit tests into a code segment into the Asciidoc documentation and adds explanation. The reference is taken from the code verbatim in these examples, and the explanatory part is added.

You can automate the copy of the reference information. Most of the time, it is not automated.

The catch is that automation, just as in the case of tests, is more expensive than doing manual work once. It pays back when the actual operation (copy and paste) happens many times.

And it usually does. To be more precise, it is supposed to happen. However, the documentation maintenance misses the task in practice, and the document becomes stale. It is where the docs-as-code automated test may help. In principle, it is possible to create a test checking the documentation and find discrepancies between the names in the code and the documentation. It can be heuristic, or it can be exact. To do it the exact way, the documentation and/or the code needs meta-information helping the test to perform the consistency check.

Such a test can signal that the documentation may be outdated and need change. For example, it may give a warning, like

“The name of the field XYZ is not the same as in the documentation ZZZ. Change ZZZ in the documentation to XYZ”.

It is a foolish and outrageous error message. I immediately know that the program architecture is messed up when I see such an error message. If the test can tell me what to do with such precision, it could fix the problem with the same effort.

It is much better if we let the automated build copy the actual name instead of checking that the human did it correctly. To do that, the documentarian should put the meta-information into the documentation instead of the copied value. The meta-information is read by the automated build tool, and using that; it copies the actual value or values.

If the value changes, the build process will automatically change it.

Another advantage is the lesser possibility for error. If the documentarian makes a mistake copying the field’s name, the text will not complain. If he writes XXY instead of XYZ, the documentation will contain the wrong name unless some human review process discovers and fixes the bug. If the documentarian inserts the meta information and makes a mistake, the build process will likely fail. If instead of XYZ, I have to write {java:field com.javax0.jamal.api#XYZ} any simple typo will be detected. If there is a field com.javax0.jamal.api#XYZ it is unlikely to have also com.javax0.jamal.api#XXY.

With this approach, the docs-as-code workflow is extended. The documentation’s ” source code ” starts behaving as a source code. The automated build is no longer simply formatting and executing language checks. The goal is to automate everything that you can automate. It may not be cheaper than doing the work manually, but certainly less error-prone.

Tools

All the above theory is pleasant and attractive but worth nothing unless there are tools to implement them. My motivation writing this article is partly to advocate the use of the open-source tool Jamal. Although Jamal is a general-purpose macro language and can be used in many areas, its primary purpose is document maintenance support. It is a simple to write, non-intrusive macro language. Using it, you can insert meta-information into the documentation to be processed by the automated build. You can use it with any plain text document format, like Asciidoc, markdown, apt, etc. The latest releases also support the DOCX format to use it even with Microsoft Word.

The set of the macros is quite extensive, and it is easy to add your own. The documentation support module can gather information from the application’s source code as snippets. Snippets can then be transformed, extracted, and inserted into the documentation. Information from the code can be extracted using text tools using the source code text. However, in the case of Java applications, the document transformation may also collect information using reflection. It can be done because Jamal itself is a Java application.

It can be started on the command line as a maven plugin and a maven extension. It is also embedded as a doclet and a taglet to allow Jamal macros in the JavaDoc documentation.

You can use macros to check the consistency of the documentation and the code. You can mark some part of the code as a snippet, and the documentation related to the specific region may contain the hash code of the piece. When the part changes in the source code, the macro evaluation will automatically signal an error.

The application of Jamal is independent of build automation. It can be antora, jBake, or simply a maven project with different plugins. The application of Jamal is also independent of the documentation format. It can be Asciidoc, markdown, apt, etc., as long as the documentation format is text. Using the Word extension included in the command line version, it can even be Microsoft DOCX Word format.

Conclusion

Treating documentation as source code is a good idea and a good start. It can, and should, however, be extended to include more features. When you treat your documentation as a source code, you should not stop simply using built automation, automated testing, review processes, and versioning. You should also apply techniques like Don’t Repeat Yourself (DRY). Extra tools exist and seamlessly integrate with the already existing build and formatting tools to do that.

Lambda and final variables

5 Replies

Introduction

Lambda expressions can use the variables in the scope of the lambda expression, but only if they are final or effectively final. What is the reason for that? Why is that? It is an interesting question because the answer is not apparent and opinionated.

There is only one ultimate answer, though: because that is what the Java Language Specification says. But saying that is boring. True, but boring. I prefer the answer that says lambdas can only use final and effectively final local variables because lambdas are not closures.

In the following, I will discuss what final and effectively final mean, the differences between closures and lambdas, and finally, how we can create closures in Java using lambda expressions. I am not advocating the creation of lambda expression-based closures in Java, nor the abandonment of the idea.

`final` and effectively final

When declaring it, a local variable is final if we use the final keyword. The compiler will also require that the variable get a value only once. This value assignment may happen at the location of the declaration but can be a bit later. There can be multiple lines that assign value to the final variable so long as long only one of them can execute for each method invocation. The typical case is when you declare a final variable without assigning value to it, and then you have an if statement giving different values in the “then” and the “else” branch.

Needless to say that the variable has to be initialized before the lambda expression is created.

A variable is effectively final if not final, but it could be. It gets an assigned value at the declaration or can get a given value only once.

Life of a Lambda

A lambda expression is a kind of anonymous class. The JVM handles it differently, and it is more efficient than an anonymous class, not to mention that it is more readable. However, from our point of view, we can think of it as an inner class.

public class Anon {

    public static Function<Integer, Integer> incrementer(final int step) {
        return (Integer i) -> i + step;
    }
    public static Function<Integer, Integer> anonIncrementer(final int step) {
        return new Function<Integer, Integer>() {
            @Override
            public Integer apply(Integer i) {
                return i + step;
            }
        };
    }
}

When the lambda expression is created, the JVM makes an instance of the lambda class that implements the Function interface.

var inc = Anon.incrementer(5);
assertThat(inc.getClass().getName()).startsWith("javax0.blog.lambdas.Anon$$Lambda$");
assertThat(inc.getClass().getSuperclass().getName()).isEqualTo("java.lang.Object");
assertThat(inc.getClass().getInterfaces()).hasSize(1);
assertThat(inc.getClass().getInterfaces()[0]).isEqualTo(java.util.function.Function.class);

The JVM will place this object on the heap. In some cases, the compiler may realize that the object cannot get out of the method’s scope, and in this case, it may store it in the stack. It is called local variable escape analysis, which can just put any object on the stack, which cannot escape from the method and may die together with the method return. However, for our discussion, we can forget this advanced feature of the Java environment.

The lambda is created in the method and stored in the heap. It is alive so long as long there is a hard reference to this object and is not collected. If a lambda expression could reference and use a local variable, which lives in the stack, it would need access to something gone after the method returns. It is not possible.

There are two solutions to overcome this discrepancy. One is what Java follows, creating a copy of the variable’s value. The other one is creating a closure.

Closure and Groovy

We will look at Groovy examples when talking about closures. The reason to select Groovy is that it is very close to Java. We will look at some Groovy examples, and for the matter of demonstration, we will use Java-style as much as possible. Groovy is more or less compatible with Java; any Java code can be compiled as a Groovy source. The actual semantic may, however, be different slightly.

Groovy solved the issue of local variable accessibility creating closures. The closure closes the functionality and the environment into a single object. For example, the following Groovy code:

class MyClosure {
    static incrementer() {
        Integer z = 0
        return { Integer x -> z++; x + z }
    }
}

creates a closure, similar to our lambda expression, but it also uses the local variable z. This local variable is not final and not effectively final. What happens here is that the compiler creates a new class that contains a field for each local variable used in the closure. A new local variable references an instance of this new class, and the local variable uses all references to this object and its fields. This object, along with the “lambda expression” code, is the closure.

Since the object is on the heap, it stays alive as long as there is a hard reference. The object, which holds the described function has one, so this object will be available so long as long the closure is alive.

def inc = MyClosure.incrementer();
assert inc(1) == 2
assert inc(1) == 3
assert inc(1) == 4

It is clearly shown in the test execution where the closure increases the z amount at each execution.

Closures are lambdas with state.

Lambda in Java

Java approaches this problem differently. Instead of creating a new synthetic object to hold the referenced local variables, it simply uses the values of the variables. Lambdas seem to use the variables, but they don’t. They use only constants copying the value of the variables.

When designing lambdas, there were two options. I was not part of the team making the decisions, so what I write here is only my opinion, guessing, but it may help you understand why the decision was made. One option could be to copy the variable’s value when the lambda is created, not caring about the later value change of the local variable. Could it work? Inevitably. Would it be readable? In many cases, it would not be. What if the variable changes later? Will the lambda use the changed value? No, it will use the copied, frozen value. It is different from how variables work usually.

Java requires the variable to be final or effectively final to solve this discrepancy. The disturbing case having the different variable value when the lambda is used is avoided.

When designing language elements, there are always tradeoffs. On one end, some constructs provide great power to the hands of the developers. However, great power requires great responsibility. Most of the developers are not mature enough to take on the responsibility.

On the other side of the scale are the simple constructs providing less functionality. It may not solve some problems so elegantly, but you also cannot create unreadable code so easily. Java is usually going this way. There has been an obfuscated C contest almost since the language C started. Who can write less readable code in that programming language? Since then, almost all languages started the contest, except two. Java and Perl. In the case of Java, the contest would be dull, as you cannot write obfuscated code in Java. In the case of Perl, the contest is pointless.

Closure in Java

If you want to have a closure in Java, you can create one yourself. The good old way is to use anonymous, or for that matter, regular classes. The other is to mimic the behavior of the Groovy compiler and create a class that encapsulates the closure data.

The Groovy compiler creates the class for you to enclose the local variables, but nothing stops you from making it manually if you want it in Java. You have to do the same thing. Move every local variable that the closure uses into a class as an instance field.

public static Function<Integer, Integer> incrementer() {
    AtomicInteger z = new AtomicInteger(0);
    return x -> {
        z.set(z.get() + 1);
        return x + z.get();
    };
}

We only had one local variable, int z, in our example. We need a class that can hold an int. The class for that is AtomicInteger. It does many other things, and it is usually used when concurrent execution is an issue. Because of that, some overhead may slightly affect the performance, which I abjectly ignore for now.

If there are more than one local variables, we need to craft a class for them.

public static Function<Integer, Integer> incrementer() {
    class DataHolder{int z; int m;}
    final var dh = new DataHolder();
    return x -> {
        dh.z++;
        dh.m++;
        return x + dh.z*dh.m;
    };
}

As you can see in this example, we can declare a class even inside the method, and for the cohesion of the code, it is the right place. Eventually, it is easy to see that this approach is working.

final var inc = LambdaComplexClosure.incrementer();
assertThat(inc.apply(1)).isEqualTo(2);
assertThat(inc.apply(1)).isEqualTo(5);
assertThat(inc.apply(1)).isEqualTo(10);

It is, however, questionable if you want to use this approach. Lambdas generally should be stateless. When you need a state that a lambda uses, in other words, when you need a closure, which the language does not directly support, you should use a class.

Summary

This article discussed why a lambda expression can access only final and effectively final local variables.
We also discussed the reason and how different languages approach this issue.
Finally, we looked at a Groovy example and how Java can mimic this.

Therefore, if anyone asks you the interview question, why a lambda expression can access only final and effectively final local variables, you will know the answer. Because the Java Language Specification says so. Everything else is speculation.

You an find the code for this article along with the article text source code at https://github.com/verhas/demo/tree/master/LambdaFinal

Why and how to do technical interviews?

Leave a reply

It is a personal blog. The views and opinions expressed in this article are those of the author. They do not represent people, institutions, or organizations that the author may or may not be associated with in a professional or a personal capacity. All information is provided on an as-is basis.

Technology companies are growing and need new personnel. In addition, there is natural attrition in the companies. In a highly competitive market, people are leaving for various reasons, and these needs also have to be met through hiring new employees. Therefore, searching for, selecting, and hiring new co-workers are always a must – it is a standard business for every company.

Companies usually conduct interviews to assess and select their future colleagues from the pool of aspiring candidates. Even though it is the standard practice, there are a lot of controversies with this approach. You can see many social media posts about harmful practices, wrong questions, and ill-treatment of the candidate. One infamous example was when Google asked candidates in their interviews to estimate the number of golf balls that could fit in a bus.

Most of the people having a voice on social media express their opinion that this was total misuse. I tend to see some merit in using such questions, but, as often, personal opinions are irrelevant. So instead of starting a debate about this particular question or similar questionable practices, I will focus on the purpose and the practical approaches we apply when conducting interviews.

I mainly rely on the experiences I gained when completing interviews on behalf of my current employer, but I believe there is nothing company-specific in this article.

When I write ‘we’, I refer to the whole industry or at least to a large group of companies that follow good practice and not specifically to my employer.

What is the purpose of conducting interviews?

Any company could hire a candidate without any prior filtering, but this could cost them a lot. Suppose the candidate does not fit the company or meet the criteria to be successful in a position. In that case, the company would have to pay the salary for the probation period, colleagues guiding the new hire during the onboarding process would invest significant time and effort, and other resources, like office space, infrastructure, heating, network, and so on, that also costs money. It is not a good practice for a company.

However, the money the company loses is not the main issue. Companies have profit and loss, and you can consider the cost of selecting the right candidate as an investment. The highest cost is not monetary, and it is not on the company side. The candidate is the one who would pay the real price for such a practice.

Most of the candidates have a safe job and solid position when looking for a new one. However, losing the place at the new job, getting sent away is a substantial personal burden. As a result, candidates may find themselves “in the street” looking for a new position. Not having a current job is hard to explain during the HR interviews. At the same time, the financial burden and the time pressure may also put the candidate into a hard-to-negotiate corner during the salary discussions.
No company should do this to anyone. If a company wants to hire you without proper assessment: run. Fast and far away.

The thorough assessments of the candidates’ skills, experience, and knowledge are at the other end of the spectrum. Some companies do that by giving out homework, completing full-day assessments filled with role plays, coding tasks, and using other similar techniques to evaluate candidates. The simplest and cheapest way to do an evaluation, however, is an interview.

A full-day assessment almost certainly gives a more reliable result, but it requires significant resources from the company. So, as usual in business, we should follow the Pareto principle and shoot for the cheapest good-enough solution. I will talk a bit later about what ‘good enough’ is.

Overcomplicated hiring processes may distract candidates. Imagine a senior developer who is looking for a new position. How many full-day assessments will they attend? To participate in such a selection, candidates may need to use a vacation day from their holiday budget, and they have to keep it secret at the actual workplace. If your competitor offers you a late afternoon interview instead, you will most probably choose that option.

There are pros and cons. We cannot tell what the best approach is, and certainly, doing interviews as a selection tool is not the imaginable best, but probably the best existing, and indeed the best we know. Nevertheless, it is the industry standard practice.

What is a good interview?

We do not need to complete the best interview in the world, as I wrote above. We have to complete one that is good enough. To say that, we have to know what we consider to be a good interview. We should have a metric that can tell us which interview is “better”.

Candidates often tell me: “This was the best interview of my life.”, even when my conclusion is not to recommend them for hire. Although a happy candidate is essential in bringing your company a good image, it is not the metric we usually look for. A good interview does not need to be enjoyable for the candidate. That is just an extra, a possible byproduct of a good interview.

An interview’s most crucial quality measure is to differentiate a fitting candidate from a non-fitting one. Of course, there are other criteria, like proper communication, politeness, non-disclosure, and conduct. These are all very important. Nevertheless, the primary goal of the interview is the selection.

When doing an interview, there are four possible outcomes. The candidate can be fitting or non-fitting, and at the same time, the interviewer recommends or does not recommend the candidate for hire. These are two dimensions with two values each. Each pair is possible, resulting in the four possible outcomes.

The recommended fitting candidate and the non-recommended non-fitting candidates are the most uncomplicated cases. These are the happy paths.

The remaining two cases, false positive and false negative, are a bit more tricky. The case when the interviewer does not recommend the candidate, although they are fitting, is theoretical. Those candidates do not get employed, and none will discover their fitness. In other words, we will never know when a candidate and/or an interview fall into this category. This case is theoretical in the sense that though it certainly exists, we will never see it.

When a candidate is recommended but not fitting is the costly situation we already discussed. When it happens, it will be clear for many people in the company who will manage the consequences and deal with the problem.

The solution for the situation is often to find a better, more suitable position for the person inside the company. It is done falling into the trap of the sunk cost fallacy. The people involved subconsciously feel the relative cost and burden of finding a new position without an existing need and actual vacancy. This cost is born to the candidate. Feeling responsible for the situation, they do not want to put that on them.

When the company has a good hiring and interviewing practice, it rarely happens. We cannot avoid such situations, however. It is not because of the unique nature of the interviews. It is a general measurement theory. Any decision can have four outcomes: true-positive, true-negative, false-positive, and false-negative. No decision system could avoid the false parts. They exist by principle. The only thing we can tune is to push the scale between the false outcomes. What do we want to have less? Is it the false positive or the false negative result which is less desirable?

At this point, you can tell that I am advocating against the false positive cases, which means that we have to design the interview decisions to avoid those even if we get more false-negative results.

This advocation is not general, though. It is only for the interview decisions. For example, a cancer screening system should be scaled towards favoring false-positive cases. I would rather choose a few days of panic until the repeated test annuls the false-positive result than die because of a false-negative result not detecting the tumor at an early stage.

The fact that we should favor the false-negative cases means that the technical interviewers should recommend hiring only those candidates they are absolutely sure about. When there is any doubt that the candidate is bad, they are better not to be hired.

Note that by doing so, you will filter out some candidates who are good enough but are not very good. You have your doubts not without reason. The potential loss is insignificant in sending away some of the candidates who would fit but are not “really good”.

Do we judge the candidate?

Avoid judging the candidate is extremely important. In the previous section, I deliberately used “good candidate” and “bad candidate”. In addition, I used an example (medical screening) that subconsciously compared candidates to cancer. If you felt inappropriate when you first read that, you are on the right track. If not, you have to think about why you did not.

We must respect the candidates.

Technical interviewers have to be humble. Maybe non-trivial at first, but we also must not evaluate the person, and we should not use expressions that may even unintentionally imply that. You cannot do that if you look down on candidates and you do not respect them. The respect has to be authentic. If it is not, you cannot hide it. So the first thing is that you should feel and show genuine care and then work on your communication.

It is why I prefer to use the word “assess” instead of “judge”. We assess the knowledge, skills, and experience of the candidate. We do not “judge” these, even though linguistically, it would mean the same. For the same reason, I usually talk about the position fitting the candidate and not the candidate fitting the position. Thus, when I say that a position is not good for a specific candidate, nobody will think that it is generally bad, even less that it is stupid or dumb.

On the other hand, the sentence “The candidate is not good for the position.” is heard and interpreted as “The candidate is not good…” The end of the sentence often gets lost in the communication or during the interpretation. It has to be carefully avoided.

Sometimes, I meet lead developers, senior, or even architect candidates who lack even basic skills in their current employment. Even though I feel the temptation to doubt whether their current status is well justified, I don’t. If a candidate’s current position seems to be a lie in the CV, it does not matter. Companies are different, and they need different types of people. There is no such person who is generally not fitting a role. To assess a person’s fitness for a position, you have to compare the person’s qualities to the role. Otherwise, you could plainly say that the candidate is ok but can not tell us for what.

Work with the Candidate

When conducting the interview, you work with the candidate. The candidate helps you, and you help the candidate. To get a clear picture and understand whether the position is really the dream position for the candidate is in your mutual interest.

It means that you can be absolutely honest with the candidate. You can tell them all the things that I wrote in this article. You can explain the aim of the interview, what the possible outcomes are, the recommended and not recommended decisions, and so on.

I usually devote 7 minutes at the start of the interview explaining the above. Of course, it is a bit boring after several hundreds of interviews, but every job has its downsides and upsides, and it is crucial for each candidate.

You can even explain that when candidates are lying or cheating candidates, it might be harmful. It helps when a candidate gets a coding exercise that is too familiar to them. A few times, the candidate proactively warned me that they had already met the task beforehand. So we chose a different one.

Coding Exercise

The above paragraphs are generally valid for all types of interviews and not specific for software development. For example, doing a coding exercise is specific to technical software developer interviews. However, most of the debates on social media are related to this practice. The reason for that is simple. It is very easy to do it wrong.

I would never recommend a candidate who cannot demonstrate the coding skills in an interview. After all, what is the value a developer can deliver who cannot code? It is more questionable if a solution architect needs to code, and I would not get into that this time. I have my personal opinion about it, but it is irrelevant. Maybe I will discuss it in a different article.

I have met some developers hired from different vendors working in the same team for our clients who could not code. We never complained, and we did the extra work instead of them. The client personnel could see who did what and came to their conclusions most of the time. I will also not name the vendor ever. Let’s just say that these developers stay afloat in the industry until they find a different job and become BAs, PMs, or car salesmen. I accept them as a fact of life, but I do not accept hiring one in my workplace. In conclusion, we should agree that some performance measures are needed to assess the coding skills as a work theory.

An excellent coding exercise helps assess three things:

The algorithmic thinking of the candidate.
Coding skills and the muscle memory of the language we test. In my case, it is Java.
Communication skills.

Each of these can easily go wrong, and hence negative stories quickly get to social media.

It is challenging to assess algorithmic thinking. It is much easier to test if the candidate can solve one specific problem or complete a task. That way, the assessment quickly degrades to testing if the candidate knows the particular algorithm. Even though I believe that learning and understanding the most important algorithms and data structures (quick sort, balanced trees, graph traversing) is vital for a developer, many developers do not possess even the fundamental computer science theory. I can also accept that there is no value in knowing many algorithms by heart. It is better to have the skillset to create the algorithm when needed.

To avoid testing the candidate knowing the task instead of solving it, I have several of them you cannot find on the internet. (Fun story about that at the end of the article.) We also discuss the solution while the candidate forges the code step by step. I realize if the candidate has known the algorithm beforehand.

You can test the coding skills easily. Many typical coding practices show off an inexperienced coder.

You can spot old coding constructs that we are not using anymore as the language (in my case, Java) develops. I sometimes see explicit type boxing, which we do not use since Java 1.4 Junior developers tend to compare a boolean value with ‘== true’ or write an ‘if’ statement and return ‘true’ and ‘false’ literal values from the execution branches. Some developers make mistakes, like indexing a ‘String’ as if it was an array.

As an interviewer, you should interpret those with a pinch of salt. The interview is not a normal coding environment. It is much more stressful, and such mistakes are many times caused by stress. The technical tools are usually less advanced than the usual IDE, with less support for code completion, syntax checking, and so on. Do not expect the candidate to know all the JDK API calls from the top of their head.

You can also check communication skills. For example, some candidates blamed me for presenting unprecise, even sloppy task descriptions. They were surprised when I told them that I was aware of that. It is to test if they clarify the task before making bold assumptions and just immediately start coding. Most of them do.

The coding exercise is the most challenging part of the interview. Not for the candidate, though. It is for the interviewer. It is a task that the candidate has to do together with you. If you, as an interviewer, see that the candidate is working on the coding task alone, you are doing it wrong. If you work together, then it is good. It may not be perfect, but most of the usual pitfalls you have already avoided.

Giving Feedback

At the end of the interview, you will know whether to recommend or not to recommend the candidate. If you don’t know, if you are not absolutely sure, then you should not recommend the candidate. I wrote that you must not recommend someone you are not sure about.

The recommendation, usually along with detailed analysis, is the primary outcome of the interview. There can be, however, another valuable byproduct. You can give valuable feedback to the candidate.

Interviewers seldom give feedback about the interview to the candidates, and this is not a good practice. I do not advocate giving feedback no matter what because it is a double-edged sword. If you provide feedback in the wrong way, it may cause a lot of harm to the candidate and the company. Providing valuable, thoughtful, and relevant feedback required some special skills.

Good feedback emphasizes the candidate’s strong points that they can build on and highlights the things that they can improve and that may result in enormous benefits.

The most benefit is evidently for the candidate, but it is also valuable for the company. Getting detailed feedback is always an invaluable help to better ourselves. Good feedback, however, is also beneficial for the company. Even if rejecting a candidate is the correct conclusion, a blatant and unexplained refusal may induce bad feelings towards the company. Feedback can mitigate this risk. Feedback explains the reasons so that the candidate can learn the reasons along with suggestions for improvement. Again, you can emphasize that the refusal is not a judgment; it solely recognizes the incompatibility between the candidate’s skills, experience, or knowledge and what the company requires in a specific role.

You do not know each other. Thus, you have to put a lot of emphasis on the good things that the candidate can build on. You can also explain that the feedback is limited as it is based on a 60-minute interview only.

Some candidates challenge some of my statements during the feedback. It is pointless from the feedback point of view. If I made a mistake, I misread the candidate in some aspect; they can ignore that part of the feedback. Some of the comments may likely be wrong due to the limited nature of the session. At the same time, I give feedback after the decision. It would be best if you did not change the decision based on any feedback debate. Even though I am usually lenient with candidates arguing about some points of the feedback. It reveals a lot about their personality that I can include in the subjective part of the interview record, and at the same time, it helps them vent their feelings.

I had candidates referred to our company by his friend I rejected but sent away with friendly but honest feedback.

Summary and Fun Story

Navigare necesse est. Doing interviews is unavoidable. Vivere no est necesse. Doing good interviews is difficult. In this article, I wrote about some aspects of the interviewing. There are other aspects that I did not discuss. Those I may address in a later article. I also know that many aspects of this topic are opinionated. You are welcome to comment, rant, criticize and tell the truth as you feel fit.

I promised you a fun story, so here it is.

Once I interviewed a candidate who was not outstanding. He had several knowledge gaps related to basic Java. He knew a few things wrong and was a bit stubborn. His coding skills were also less than what we required. When I ended the interview, I asked him if he wanted feedback. He said no, and disconnected the communication. (We usually do remote interviews using IP communication tools, like Zoom, Teams, Skype, etc.)

He immediately wrote an eMail to the talent acquisition team claiming that I was asking him wrong; I did not accept his correct answers and stating that I did not know Java. He also wrote that I was giving him a coding task that anyone can find on the internet, and I did not accept his correct solution because I did not like him. Even though he did not agree to video recording, the coding exercise does get recorded to crosscheck. I did not doubt that the solution was wrong, but his statement that I allegedly copied the exercise from the internet bothered me. So I googled some of the sentences of the task. I could find it on a site along with a wrong solution he also provided. It was word by word the same, including a typo. So you can guess who was copying from whom.

Your coding exercise tasks leak out. So you have to replace them frequently.

Creating a JUnit 5 ExecutionCondition

Leave a reply

Introduction

JUnit 5 has a lot of underutilized features. Developers have learned how to use JUnit 4, and they utilize the same feature set when using JUnit5. The sexy DisplayName annotation is used more and more, but the majority of the new features developers skip. In this article, I describe a particular situation I was facing and how I solved the issue by creating a custom ExecutionCondition.

My Special Testing Need

I am developing Jamal, which is a general-purpose transpiler, text macro language. It converts from an input text to an output text, resolving and executing macros in the text. Sometimes macros can be overcomplicated, and it may not be trivial why the output is what we get. The first approach to this issue is not to use overcomplicated structures, but this is not how developers work. Good developers tend to use the tools they have in their hands to the total capacity.

In the case of Jamal, it needs debugging. Jamal supported debugging for a long time, dumping each atomic step into an XML file that the developer can later examine. It is, however, not as effective as interactive debugging.

To support interactive debugging, I developed a debugger interface into release 1.7.4 accompanied by a Rest.js client application. Jamal starts in debug mode if it sees an environment variable JAMAL_DEBUG or system property JAMAL_DEBUG_SYS. When this variable is defined, Jamal pauses whenever it starts processing a new input and listening on a port configured by the variable. It goes on with processing only when it gets a command through the TCP channel.

The important thing for this article is: Jamal pauses and starts to listen on a TCP port in this mode.

The big question is, how to debug the debugger? The obvious answer is: Start Jamal in debug mode in a JVM started in debug mode. The easiest way in IntelliJ is to start it from a JUnit test by clicking on the debug button. So I had the test:

@Test
@DisplayName("Used to debug the debugger UI")
void testDebugger() throws Exception {
    System.setProperty(Debugger.JAMAL_DEBUG_SYS, "http:8081?cors=*");
    TestThat.theInput(
        "hahóóó\n".repeat(2) +
            "{@define a=1}{@define b(x)=x2x}{b{a}}"
    ).results("hahóóó\n" +
        "hahóóó\n" +
        "121");
    System.clearProperty(Debugger.JAMAL_DEBUG_SYS);
}

You have to //@Test the code before committing to your repo. Forgetting that will break the build because when it starts, it pauses and waits. I forget to comment out the annotation because I am such a forgetful person. Maybe age, maybe something else. However, my experience is that every developer has age, and every developer forgets to comment out such a thing. I needed something that realizes that the test is started from IntelliJ and lets it run but aborts it otherwise.

How to Recognize it is IntelliJ?

When you run a unit test from IntelliJ, IntelliJ will invoke your code from IntelliJ. Not directly. It goes through a few method calls in the stack, but there should be some class that belongs to IntelliJ towards the top of the stack. If the method and the class belong to IntelliJ, then the name of the class should undoubtedly have something specific in it we can check. Generally, this is the idea.

No specifications guarantee it. The name of the classes IntelliJ uses may change from time to time. Like Maven or Gradle, a different execution environment can also use some class names that may be similar to that of IntelliJ. But this is a solution that eventually works. No guarantee, but as for now, it works.

boolean isIntelliJStarted = false;
final var st = new Exception().getStackTrace();
for (final var s : st) {
    if (s.getClassName().contains("Idea")) {
        isIntelliJStarted = true;
        break;
    }
}

The selection of the string Idea to check is more or less arbitrary. It is a string that is not likely to happen in the stack trace of some other application, and at the same time, there is only a tiny chance that it disappears from later IntelliJ versions. It is also to note that creating the stack trace this way is time-consuming. When the code runs from IntelliJ, it is not a problem at all. The time it needs is way less than a fraction of a second, and the next step I have to do after I started the application is opening a browser and the debugger web page. By the time I am finished with that, Java could have analyzed the stack trace a few million times. I, as a human, am much slower than the stack trace gathering.

When the code runs on the CI/CD or Maven on the command line, the delay is considerable. It is not tremendous or really significant, but it should be considered. It adds to the compile time.

I would not use such a solution in a performance-sensitive production code.

Separation of Concern

I could insert this code into the test and return it from the test if it is not executed from IntelliJ. I did that as a first try, but I was aware that this is not an amicable solution. To make a decision separating the environments is not the responsibility of the test.

I was sure that JUnit 5 has a better solution for this. I asked @RealityInUse (Twitter handle) to help me. I was in a lucky situation because we share an office, which happens to be our living room during the pandemic. He is an active contributor of JUnit Pioneer https://junit-pioneer.org project of “@nipafx`, he knows a lot about JUnit 5 extensions. (And he is my son.)

He told me that what I needed was an ExecutionCondition.

ExecutionCondition is an interface. It defines one single method with a direct signature:

ConditionEvaluationResult evaluateExecutionCondition(ExtensionContext ctx);

The implementation should have a method overriding this interface method, and after doing the above stack examination, it has to

return isIntelliJStarted ?
    ConditionEvaluationResult.enabled("started from IntelliJ") :
    ConditionEvaluationResult.disabled("not started from IntelliJ");

It is almost all the work to be done. There is one little thing left: tell JUnit to use this condition for this test.

To do that, we created an abjectly named annotation: @IntelliJOnly. With this, the class we developed was the following (without imports):

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@ExtendWith(IntelliJOnly.IntelliJOnlyCondition.class)
public @interface IntelliJOnly {

    class IntelliJOnlyCondition implements ExecutionCondition {
        @Override
        public ConditionEvaluationResult evaluateExecutionCondition(ExtensionContext context) {
            final Method method = context.getRequiredTestMethod();
            final var annotation = method.getDeclaredAnnotation(IntelliJOnly.class);
            if (annotation == null) {
                throw new ExtensionConfigurationException("Could not find @" + IntelliJOnly.class + " annotation on the method " + method);
            }
            boolean isIntelliJStarted = false;
            final var st = new Exception().getStackTrace();
            for (final var s : st) {
                if (s.getClassName().contains("Idea")) {
                    isIntelliJStarted = true;
                    break;
                }
            }
            return isIntelliJStarted ? ConditionEvaluationResult.enabled("started from IntelliJ") : ConditionEvaluationResult.disabled("not started from IntelliJ");
        }
    }
}

The test with this annotation is the following:

@Test
@DisplayName("Used to debug the debugger UI")
@IntelliJOnly
void testDebugger() throws Exception {
    System.setProperty(Debugger.JAMAL_DEBUG_SYS, "http:8081?cors=*");
    TestThat.theInput(
        "hahóóó\n".repeat(2) +
            "{@define a=1}{@define b(x)=x2x}{b{a}}"
    ).results("hahóóó\n" +
        "hahóóó\n" +
        "121");
    System.clearProperty(Debugger.JAMAL_DEBUG_SYS);
}

Notes

The implementation of the condition checks that the test method is annotated by @IntelliJOnly. The annotation may not be there if the user (developer using the annotation) makes some mistake, invokes the condition in the wrong way. This extra check may save a few surprises for the developer using this condition.

Summary

In this article, I described a situation that needed conditional test execution with a particular condition. After that, I described how the condition could be evaluated. Finally, we created a JUnit 5 execution condition to separate the Hamletian “run or not to run” dilemma from the test code.

As a takeaway, you should remember that JUnit is way better than JUnit 4. Utilizing only the features, which were already available in version 4, is a waste of resources. Your tests can be much simpler, more expressive, and easier to maintain if you learn and utilize the programming features of JUnit 5. Do!

Automate Technical Documentation using Jamal

2 Replies

Introduction

Writing good technical documentation is an art.

An art is the expression or application of human creative skill and imagination, … to be appreciated primarily for their beauty or emotional power.

But every art, like sculpting, has a craft part. You need chisels, hammers to form the sculpture out of the blob of marble. You need to learn the craft to master the art. Technical documentation writing is similar.

Writing sentences that are easy to read, entertaining for the reader is the art part. Fixing typos and grammatical errors is more like a craft. Making the documentation precise, to the point, and well structured is also the craft part. Crafts can be learned and aided with the right tool.

In technical documentation writing, the tools help address those tasks that are often performed manually though they could be automated. In this article, I will write about a tool that helps in that manner and which I used successfully to write documentation, many articles — also this one –and books.

What can be automated

Many things can be automated for technical document writing. I tried to gather a list from my experience, but it may not be complete. The list is the following:

Eliminate manual text repetition.
Transclude information from the documented system.
Checks internal consistency of the documentation.
Check the consistency of the documentation with the documented system.

In the following, I will talk shortly about these tasks, and then I will explain the tool that I use to address these.

DRY in Documentation

The DRY (Don’t Repeat Yourself) is a fundamental and old principle in programming. If there are the same lines in the source, they should be singled out, moving the common code into a separate method, class, or other coding structure. Copy/Paste programming is evil and must not be done. It does not mean that there is no repeated code in the compiled binary code. Code generators are free to repeat code if they think that is better than in some way eliminating it. One famous example is when a short loop is extended, and the code is repeated instead of creating a binary looping construct. It may consume more memory, but at the same time, optimization may find it faster.

The same should happen when you write the documentation. Except, you do not have methods or classes in the documents. You can reorganize your document into smaller sections, and then you can refer to the areas. It may have an impact on the readability. If the reader has to turn the pages instead of linear reading, comprehending the document becomes challenging. Using non-printed, non-linear documentation, a.k.a. hypertext eases a bit the page-turning, but the reader still can get mentally lost in the maze of the non-linear documentation. The ideal solution would be to have documentation, which is linear and contains all the interesting text for the particular user, reading it in the order as they want to read it.

Eventually, it is impossible. With today’s technology, you cannot create a document that contains precisely what the reader wants to read at the very moment and changes for each reader and even for each reading. The best approach we have is repeating some of the text in the documentation. Some readers may find it boring, while others will just get what they need. Your document “source” should be DRY, and the repeating of the text, the copy-paste operation has to be automated. The advantage is: any change in the text is consistently propagated to every occurrence of the text.

Information Transclusion

A living document has to follow the change of the system it documents. In the case of software, this can partially be automated. A lot of data that may need to be included in the document is available in the source code. For example, the current version of the application, a numeric value, may be included in the documentation at different locations. Updating it to the latest version manually is almost always some error. Sooner or later, one or more references may skip the update and become stale. The solution is partial when we use the technique that eliminates DRY. We define the version in the document in one place, and it will be referred to in other places. It still needs that one place to be updated. Fetching the version number from the source code automatically is one level more automation.

Usage samples are also an excellent example for transclusion. When the usage samples are automatically fetched from the unit tests, they are guaranteed to run during the test execution.

Internal Consistency

Ensuring internal consistency of the document can also be automated to some level. It is such an essential factor that many documentation systems support it related to cross-references. The examples can be various.

You may create a document with use cases. In the use cases, you use actors in the description. A document management system can ensure that all the actors used in the document are also defined. A similar check can be done for abbreviations and other things. Consistency, if it can be formally defined, can be checked by automated tools.

External Consistency

Just as well as the different parts of the document should be consistent and without contradiction, the documentation should also be consistent with the system it documents. It is similar to transcluding information from the source. The difference is that the information, in this case, is mainly existence only. For example, you reference a file, a directory, a method in a Java class. A tool can check that the directory, file, or method exists; it was not renamed nor deleted. Similarly, other consistency checks can be programmed.

Document Writing is Programming

There may be some other cases where some automation may come into the picture. The general approach, however, should be to manage the document similar to the program source. Technical documents need maintenance. Documents have a source, and they should be structured. One change in the documented system should be followed by a single change in the document. Every other occurrence in the output should be created automatically.

It is very much similar to programming. The programmers write source code in a high-level programming language, and the compiler generates the machine code. Sometimes the compilation process is a long chain involving many tools. Programming in machine code is an art of the past. The advantages of using a high-level language fairly compensate for the extra effort using the compiler chain.

In technical documentation, the advantages, at least in the short run, are not so appealing. Creating a document using some WYSIWYG editor is easy as opposed to programming in assembly. It is easy to lure the technical writer to avoid some extra work at the start and avoid the document source code creation.

A work to be done tomorrow is always cheaper today than the avoidable work of now.

The same will not be true tomorrow. Creating the more complex but less redundant documentation source almost always payback, especially if we consider document quality coming from consistency and up-to-date-ness.

The Tool: Java Macro Language

In the rest of this article, I will describe a tool that can automate document management tasks. The tool is the Java version of the text processor Jamal. Originally the name was standing for Just Another Macro Language, and it was created in the late 1990-ies in Perl. A few years ago, I rewrote the implementation in Java, with the original functionality enhanced. Since the application is based on Java, it is now named Java Macro Language, abbreviated as Jamal.

The basic concept of Jamal is that the input text containing free text and macros mixed is processed. The output is a text with all the macros executed and evaluated. The syntax of the macros is free. The only requirement is that each of them starts and ends with a specific string. The start and end string can be defined when the macro processor is initialized. It can also be changed on the fly in the input text. When I document Java programs, then I usually use {% as start string and %} as end string. That way, a simple macro definition will be

{%@define lastUpdated=2021-02-17 16:00%}

Later you can refer to this macro as

{%lastUpdated%}

and it will be replaced by the value 2021-02-17 16:00 for each use.

Jamal distinguishes user-defined and built-in macros. The example above, named lastUpdated is a user-defined macro, as it is defined in the input text. The macro defining it, named define is built-in. It is implemented as a Java class implementing the Jamal Macro interface. The built-in, Java implemented macros are provided in JAR files, in libraries. The core package contains the essential macros, like define, import, begin, end, options, comment, and a few others. These macros are not task-specific. They are needed generally.

Other libraries, like the jamal-snippet library, contain macros that support some specific task. The mentioned jamal-snippet library supports document management.

Snippet Handling, Transclude

The original idea of the snippets is not new. The basic approach to use the source code as part of the documentation originates from D. Knuth with Web and Tangle as early as 1984. https://en.wikipedia.org/wiki/CWEB Creating a program that contains the documentation and the execution code did not become popular as it needed a lot of extra work from the developers and an additional compilation step. The current trend includes the documentation into the source code as a comment. In the case of Java programs, this is JavaDoc. It is also a trend to use unit tests as a form of documentation.

The two are separate, and both lack the aspect that the other provides. JavaDoc does not show sample use unless someone copies some sample code into it manually. The unit test does not contain a proper explanation unless someone copies fragments or the whole from the JavaDoc to the unit test comments. JavaDoc is converted to navigable HTML pages. Unit tests are source code. Although the best documentation is the source code, it would be nice to have a better, more document-like format.

When we talk about snippets, then we copy code fragments automatically into the documentation. In practice, the documentation format is Asciidoc or MarkDown these days. Both formats allow code samples in the document.

using Jamal, the snippets can be marked in the Java source code or any other source code with

    snippet snippetName
    end snippet

lines. The snippetName should be replaced by a unique name that identifies the snippet, and all the lines between the snippet and end snippet lines will be the snippet itself. The snippets are gathered using the {%@snip:collect directory%} macro. Here directory is either a directory or a single file. The collection process reads each file and collects the snippets. After this the snippets can be referenced using the {%@snip snippetName%} macro. When Jamal runs, the macro is replaced with the actual value of the snippet. It ensures that the code sample in the documentation is up-to-date.

Other macros can trim the content, replace some strings in the samples, number the lines, skip some lines, and so on. With these, you can include any code sample.

Snippets are suitable for code samples, but not only for code samples. As JavaDoc is included in the source code, some parts of the documentation can also be included in the code as comments.

For example, the implementation of the macro directory contains the following lines:

    // snippet dirMacroFormatPlaceholders
    "$name", name, // gives the name of the directory as was specified on the macro
    "$absolutePath", dir.getAbsolutePath(), // gives the name of the directory as was specified on the macro
    "$parent", dir.getParent() // the parent directory
).and(
    "$canonicalPath", dir::getCanonicalPath // the canonical path
    //end snippet

These lines list the different placeholders and their values that the built-in template handler knows. The documentation includes this snippet with the following lines:

{%@define replace=|^.*?"(.*?)"|* `$1`!|!.*?//||%}
{%@define pattern=\)\.and\(%}
{%#replaceLines{%#killLines{%@snip dirMacroFormatPlaceholders %}%}%}

(Note: the actual version is a bit more complicated, as you will see later.) It inserts the content of the snippet evaluating the snip macro. The content of the sippet is then passed to the macro killLines. This macro will delete all the lines that match the regular expression defined in the macro pattern. The result is still further modified by the replaceLines macro. It executes the Java String replaceAll() method on each line with the arguments defined in the macro replace. The final result, inserted into the output is:

* `$name` gives the name of the file as was specified on the macro
* `$absolutePath` the absolute path to the file
* `$parent` the parent directory where the file is
* `$canonicalPath` the canonical path

This way, the document is much easier to maintain. The documentation of the parameters is along with the code, and that way, it is harder to forget to update the documentation. Also, the name of the placeholder is taken directly from the source code. Even if the developer makes a typo naming the placeholder in the example above, the documentation will contain the name as it is in the code and the characters it has to be used.

Snippets can come from other sources, not only from file snippet fragments. The built-in macro snip:xml reads a while XML file and assigns it to a macro name. This macro is similar to the built-in core macro define. It also defines a user-defined macro. In this case, however, the macro is not a constant string with argument placeholders as those defined, calling the macro define. In this case, the content is a whole parsed XML file, and the one argument the macro can and should have when invoked must be an XPath. As you can guess, the result of the macro call is the value in the XML found by the XPath.

As an example, the module documentation README.adoc.jam for jamal-plantuml contains the following lines close to the start of the file:

{%@snip:xml pom=pom.xml%}\
{%#define PLANTUML_VERSION={%pom /project/dependencies/dependency/artifactId[text()="plantuml"]/following-sibling::version/text()%}%}\
{%#define VERSION={%pom /project/version/text()%}%}\

It reads the pom.xml file of the macro and defines the PLANTUML_VERSION and VERSION macros to hold the current version of the used PlantUml library and the version of the project, respectively. Later in the documentation, both {%PLANTUML_VERSION%} and {%VERSION%} can be used and will be replaced in the output with the up-to-date version.

We have seen that snippet texts can be fetched from arbitrary source files and XML files. In addition to that, snippets can also be defined in .properties files (even XML format properties file) and can also be defined as a macro. The snippet definition as a macro using the snip:define built-in has a particular use that we will discuss later with the snip:update macro.

File, Directory, Class, Method => Consistency

The macros file, directory, java:class, and java:method are macros that can keep the code consistent with the system. These macros add barely any formatting to the output; therefore, their use needs discipline. They check that the argument file, directory, class, or method exists. If the entity does not exist, then the macro throws an exception. If the entity was renamed, moved, or deleted, the documentation has to be updated, or else it does not compile.

The use of the macros file and directory is straightforward. They check the existence of the file and directory specified as the argument. The name can either be absolute or relative to the input document.

Checking the existence of a class or method is not that straightforward. It needs a Java environment that has the class on the classpath. It is recommended to invoke Jamal from a unit test to convert the document from the input to output. This article is also written using Jamal as a preprocessor, and it is converted from a unit test of the module jamal-snippet using the following code:

private static void generateDoc(final String directory, final String fileName, final String ext) throws Exception {
    final var in = FileTools.getInput(directory + "/" + fileName + "." + ext + ".jam");
    final var processor = new Processor("{%", "%}");
    final var result = processor.process(in);
    FileTools.writeFileContent(directory + "/" + fileName + "." + ext, result);
}

@Test
void convertSnippetArticle() throws Exception {
    generateDoc(".", "ARTICLE", "wp");
}

During the unit test’s execution, the classes of the documented system are on the classpath or on the module path, and that way, these macros, java:class and java:method work.

Updating the Input

The jamal-snippet library has a particular macro, snip:update, which does something exceptional.

Built-in macro implementations get the part of the input, which is between the opening and closing string. It is the part of the input that they are supposed to work on. What they get is the input object containing not only the character but also a position coordinate. This coordinate contains the file name and the line/column position of the input in the file. Some macros use this coordinate to report the position of some error. Other macros, like include or import, use the file name to calculate the imported or included file path relative to the one that contains the macro.

The macro snip:update uses the file name to access the file and modify it physically. The macro scans the file and looks for lines that look like

{%@snip id
   ...
%}

When the lines with that pattern are found, then the lines between the first and last line, practically the lines denoted with ... above, are replaced with the snippet’s actual content. It will help the maintenance of the input document. When you write the document, it is easier to see the actual snippet and not only the reference to the snippet. It is also easier to debug the line killing, character replacement, and other snippet formatting transformations.

The macro snip is not disturbed by these lines. The syntax of the snip macro is like snip id ... anything treated as a comment... to allow this particular use case.

The invocation of the macro updating of the input should occur at the end of the document when all snippets are already defined. It is also essential to save the input to the version control before converting. The use of this possibility makes it possible to include into the document the formatted snippets. It is done, for example, in the documentation of the macro directory. The sample presented before was a simplified one. Here you can see the real one making use of updates.

{%#snip:define dirMacroFormatPlaceholdersFormatted=
{%#replaceLines{%#killLines{%@snip dirMacroFormatPlaceholders %}%}%}%}

{%@snip dirMacroFormatPlaceholdersFormatted
* `$name` gives the name of the directory as was specified on the macro
* `$absolutePath` gives the name of the directory as was specified on the macro
* `$parent` the parent directory
* `$canonicalPath` the canonical path
%}

This structure includes the snippet dirMacroFormatPlaceholders and converts enclosing it into macros killLines and replaceLines. The final formatted result, however, does not get directly into the output. It is assigned to a new snippet using the macro snip:define. The name of the new snippet is dirMacroFormatPlaceholdersFormatted.

After this, when this new, already formatted snippet is defined, it is referenced using the snip macro to be included in the output. When the macro snip:update is used at the end of the file, this second use of the snip macro is updated, and the formatted lines are inserted there, as you can see.

The first use of the macro snip is not updated because there are extra characters before using the macro. Also, there are extra characters after the snippet identifier.

Creating Diagrams

Using diagrams are very important in the documentation. As the saying goes, a picture is worth a thousand words, especially if your readers are non-native and do not even know a thousand words. An excellent tool to create diagrams is PlantUml. The source for the diagrams in this tool is a text that describes the UML diagram structurally. A simple sequence diagram can look like the following:

@startuml
Aladdin -> Jasmine : I love you
Jasmine -> Rajah : Aladdin loves me
Rajah --> Aladdin : wtf buddy?
@enduml

sample.svg

Putting this text into the macro

{%@plantuml sample.svg
Aladdin -> Jasmine : I love you
Jasmine -> Rajah : Aladdin loves me
Rajah --> Aladdin : wtf buddy?
%}

will create the image, and it can then be referenced in the document to get

PlantUml is a widely used tool, and it has integration with many document processors. That way, it is integrated with Markdown and Asciidoc as well. Using Jamal as a preprocessor instead of the PlantUml direct integration has a few advantages, however.

You do not need to have the integration for PlantUml installed on the environment where the document rendering executes. You do not have it, for example, on GitHub or GitLab. Using Jamal, the PlantUml processing is done in your local environment, and after that, you just have a standard Markdown, Asciidoc, or whatever format you use. For example, this document uses WordPress markup, which does not have PlantUml integration, but it does not matter. The source named ARTICLE.wp.jam is processed by Jamal generating ARTICLE.wp, and it has everything it needs. Pictures are generated.

The Jamal preprocessing has other advantages. In this article, as an example, the text of the UML diagram appears three times. Once when I display for the example of how a UML digram is defined in PlantUml. The second time when I show how it is integrated using a Jamal macro. The third time it appears as an image.

The source input contains it only once before the first use. The user-defined macro, named alilove, contains the actual UML, and the latter only references this macro to get the same text. If there is a need to update the structure, it must be done only in one place.

Another advantage is that the macros can access the running Java environment. It is already used when we check the existence and the naming of specific classes and methods. I also plan to extend the PlantUml integration with macros that can leverage the Java environment when we document our code. Running the conversion of the Jamal input during the unit tests reflection can get access to the classes. Using those, I plan to develop macros that need only the listing of the classes you want to be shown on a class diagram. The macro will discover all the relations between the classes and create a UML source to be converted to a diagram using PlantUml. Should your class structure change, the diagrams will also change automatically.

Summary and Takeaway

You should approach technical documentation like programming. Document writers should maintain document source code and compile the document from the source code.

You should automate document content management as much as possible. Use automated tools to transclude information from the documented system. Use automated tools to check the consistency of the documentation. Document everything close to the system documented. Use automated tools to integrate your document source.

Give it a try and use Jamal.

Unit testing private methods

1 Reply

Introduction

In this article, I will contemplate the testing of private methods in unit tests. After that, I will propose a way or pattern to do it, if you must. Finally, I will show how you can generate this pattern automatically.

And yes, I will also write a takeaway section to know what you have read.

Test or not to Test Private Methods

Unit testing is usually not black-box testing. It is debatable if it ought to be or not. Practice shows that it rarely is. When we equip the tested unit with different mocks, we play around with the implementation and not the defined functionality that a black-box test should only deal with.

After setting up and injecting the mock objects, we invoke the tested methods, and these methods are usually public. In other words, the invocation of the tested system is more like a black-box test. You can say that the test setup is not a black-box test, but the actual test is.

The advantage of black-box testing is that it does not need to change if the tested module changes’ internal working. If the functionality changes, it is another story. It is easier to refactor, optimize, simplify, beautify your code if there are clean unit tests that do not depend on the implementation. If the unit tests depend on the implementation, then you cannot reliably refactor your code. As soon as you change the implementation, the test has to follow the change.

I do not particularly appreciate when the unit test cannot be black-box, but there are cases when it is unavoidable. An unusual and frequent case is when we want to test a private method. If you want to, or even God forgive, have to test a private method, it is a code smell. The method may be simple, and you can achieve the coverage of its functionality by invoking only the public API of the tested unit. You do not have to test the private method, and if you do not have to, you must not want.

Another possibility is that that the private method is so complicated that it deserves its own test. In that case, the functionality deserves a separate utility class.

Still, there is a third possibility. After all the contemplating, we decide that the private method remains inside the unit, and we want to test it.

It is a small, insignificant problem that you cannot invoke from outside, and the test is inevitably out of the unit. Some developers remove the private modifier changing the access level from private to “test-private”.

No kidding! After more than 500 technical interviews over the past ten years, I have heard many things. I regret that I did not start recording these. As I heard a few times, one of these lovely things: “test private” as a terminology instead of package-private. Two or three candidates out of the 500 said that the accessibility is test private when there is no access modifier in front of the class member. It means they said that the member can also be accessible from the unit tests. From other classes in the same package? Not so sure.

What this story suggests is that many developers struggle to test private methods. I have also seen this in many other projects.

I am not too fond of this approach because we weaken the access protection of a class member to ease testing.

A different approach is when the tests use reflection to access the class members. There are two issues with this approach. One is the suboptimal performance. The other is the bloated code. The fact that the access to the class members via reflection is slower than the direct access is usually not significant. We are talking about tests. If the test execution needs significant time, then the tests are wrong, or the project is large or has some particular testing need. Even in these cases, the reason for the slow speed is usually not the reflective access.

The bloated code, on the other hand, hinders readability. It is also cumbersome to write every time things like

Field f = sut.getClass().getDeclaredField("counter");
f.setAccessible(true);
f.set(sut, z);

when we want to set a private field, or

Method m = sut.getClass().getDeclaredMethod("increment");
m.setAccessible(true);
m.invoke(sut);

when we want to invoke a private method. The maintenance of such tests is also questionable. If the name of the method or field changes, the test has to follow. There is no significant risk of forgetting because the test will fail, but still, it is a manual editing functionality. Most of the IDEs support renaming. Whenever I rename a method or field, the IDE renames all the references to it. Not when the reference is part of a string.

There is no real solution to this issue, except when you write code that does not need the testing of private methods and fields. Still, some approaches have advantages.

Doing it with a Style

One approach is to declare a private static delegating inner class with the same name as the tested class. This class has to implement the same methods as the original tested class, and these implementations should delegate to the original methods. The class also has to implement setters and getters to all the fields.

If we instantiate this class instead of the original one, then we can invoke any method or set any field without reflective access in the test code. The inner class hides the reflective access.

The reason to name the class with the same simple name as the tested class is that the tests do not need to change this way. If a test has a code that instantiated the tested class calling new Sut() and now we start to have an inner class named Sut, then the constructor all of a sudden will refer to the inner class.

Let’s see an example. The following class is a simple example that has one public method and a private one. The complexity of the methods barely reaches the level that would rectify extensive testing, but this makes it suitable for demonstration purposes.

public class SystemUnderTest {

private int counter = 0;

public int count(int z) {
while (z > 0) {
z--;
increment();
}
return counter;
}

private void increment(){
counter++;
}

}

This file, along with the other samples, can be found in full at https://github.com/verhas/javageci/tree/1.6.1/javageci-jamal/src/test/java/javax0/geci/jamal/sample

The test itself is also very simple:

@Test
void testCounter() throws Exception {
final var sut = new SystemUnderTest();
sut.setCounter(0);
sut.increment();
Assertions.assertEquals(1, sut.getCounter());
}

The only problem with this solution that the system under test does not contain the setter, and the method increment() is private. The code, as it is now, does not compile. We have to provide an implementation of the delegating static inner class named SystemUnderTest.

The following code shows an implementation of this class, which I created manually.

private static class SystemUnderTest {
private javax0.geci.jamal.sample.SystemUnderTest sut = new javax0.geci.jamal.sample.SystemUnderTest();

private void setCounter(int z) throws NoSuchFieldException, IllegalAccessException {
Field f = sut.getClass().getDeclaredField("counter");
f.setAccessible(true);
f.set(sut, z);
}

private int getCounter() throws NoSuchFieldException, IllegalAccessException {
Field f = sut.getClass().getDeclaredField("counter");
f.setAccessible(true);
return (int) f.get(sut);
}

private void increment() throws NoSuchMethodException, InvocationTargetException, IllegalAccessException {
Method m = sut.getClass().getDeclaredMethod("increment");
m.setAccessible(true);
m.invoke(sut);
}

private int count(int z) {
return sut.count(z);
}
}

It is already an achievement because we could separate the messy reflective access from the test code. The test, this way, is more readable. Since we cannot avoid the reflective code, it will not get better than this as per the readability. The other issue, maintainability, however, can still be improved.

Doing it Automated

Creating the delegating inner class is relatively straightforward. It does not need much innovation. If you specify the task precisely, any cheaply hired junior could create the inner class. It is so simple that even a program can create it. It does not need the human brain.

If you tried to write a Java program from scratch that generates this code, it would be, well, not simple. Fortunately (ha ha ha), we have Java::Geci, and even more, we have the Jamal module. Jav::Geci is a code generation framework that you can use to generate Java code. The framework contains readily available code generators, but it is also open and pluggable, providing a clean API for new code generators. It does all the tasks needed for most of the code generators and lets the code generator program focus on its core business.

Code generation.

For simpler applications, when the code generation is straightforward and does not need a lot of algorithm implementation, the module Jamal can be used. Jamal is a text-based templating language, which can be extended with Java classes implementing macros. The Java::Geci Jamal module includes a code generator that parses the source files and looks for code that has the following structure:


/*!Jamal

TEMPLATE

*/

CODE HERE

//__END__

When it sees one, it evaluates the code that is written on the lines TEMPLATE using Jamal, and then it replaces the lines of CODE HERE with the result. It generates code, and if there was a generated code but is stale, it updates the code.

The code generation runs during the test execution time, which has advantages and disadvantages.

One disadvantage is that the empty code or stale code should also compile. The compilation should not depend on the up-to-date-ness of the generated code. In practice, we usually (well, not usually, rather always) can cope with it.

The advantage is that the code generation can access the Java code structures via reflection. That way, for example, the code generators can get a list of all declared fields or methods and can generate some delegating methods for them.

The Jamal module contains Java classes implementing macros that can do that. The fact that you can express the generation of the unit test delegating inner class as Jamal macros shows the tool’s power. On the other hand, I have to note that this task is somewhere at the edge of the tool’s complexity. Nevertheless, I decided to use this task as a sample because generating setter and getters is boring. I also want to avoid lazy readers asking me why to have another setter/getter generator, as it happened at some conferences where I talked about Java::Geci. Setter and getter generator is not a good example, as it does not show you the advantage. You can do that with the IDE or using Lombok or some other tool. Perhaps after reading this article, you can try and implement the setter/getter generation using Jamal just for fun and to practice.

The previous code snippets were from the class ManualTestSystemUnderTest. This class contains the manually created delegating inner class. I created this class for demonstration purposes. The other testing class, GeneratedTestSystemUnderTest contains the generated sample code. We will look at the code in this file and how Java::Geci generates it automatically.

Before looking at the code, however, I have to make two notes:

The example code uses a simplified version of the macros. These macros do not cover all the possible causes.
On the other hand, the code includes all the macros in the source file. Professional code does not need to have these macros in the source. All they need is an import from a resource file and then the invocation of a single macro. Two lines. The macros generating the delegating inner class are defined in a resource file. It is written once, you do not need to write them all the time. I will show you at the end of this article how it is invoked.

Let’s have a look at the class GeneratedTestSystemUnderTest! This class contains the following Jamal template in a Java comment:

/*!jamal
{%@import res:geci.jim%}\
{%beginCode SystemUnderTest proxy generated%}
private static class SystemUnderTest {
private javax0.geci.jamal.sample.SystemUnderTest sut = new javax0.geci.jamal.sample.SystemUnderTest();
{%!#for ($name,$type,$args) in
({%#methods
{%class javax0.geci.jamal.sample.SystemUnderTest%}
{%selector private %}
{%format/$name|$type|$args%}
%}) =
{%@options skipForEmpty%}
private $type $name({%`@argList $args%}) throws Exception {
Method m = sut.getClass().getDeclaredMethod("$name"{%`#classList ,$args%});
m.setAccessible(true);
m.invoke(sut{%`#callArgs ,$args%});
}
%}
{%!#for ($name,$type,$args) in
({%#methods
{%class javax0.geci.jamal.sample.SystemUnderTest%}
{%selector/ !private & declaringClass -> ( ! canonicalName ~ /java.lang.Object/ )%}
{%format/$name|$type|$args%}
%}) =
{%@options skipForEmpty%}
private $type $name({%`@argList $args%}) {
{%`#ifNotVoid $type return %}sut.$name({%`#callArgs $args%});
}
%}
{%!#for ($name,$type) in
({%#fields
{%class javax0.geci.jamal.sample.SystemUnderTest%}
{%selector/ private %}
{%format/$name|$type%}
%}) =
{%@options skipForEmpty%}
private void {%setter=$name%}($type $name) throws Exception {
Field f = sut.getClass().getDeclaredField("$name");
f.setAccessible(true);
f.set(sut,$name);
}

private $type {%getter/$name/$type%}() throws Exception {
Field f = sut.getClass().getDeclaredField("$name");
f.setAccessible(true);
return ($type)f.get(sut);
}
%}
{%!#for ($name,$type) in
({%#fields
{%class javax0.geci.jamal.sample.SystemUnderTest%}
{%selector/ !private %}
{%format/$name|$type%}
%}) =
{%@options skipForEmpty%}
private void {%setter/$name%}($type $name) {
sut.$name = $name;
}

private $type {%getter/$name/$type%}() {
return sut.$name;
}
%}
}
{%endCode%}
*/

In this code the macro start string is {% and the macro closing string is %}. It is the default setting when Java::Geci starts Jamal to process a source file. This way, the macro enhanced template can freely contain standalone { and } characters, which is very common in Java. Macros implemented as Java code use the @ or the # character in front of the macro name. If there is no such character in front of the macro name, then the macro is user-defined from a @define ... macro.

The text of the template contains three parts:

the start of the code,
four loops, and
the end of the generated code in the template (this is just a closing } character).

The start of the template

{%@import res:geci.jim%}\
{%beginCode SystemUnderTest proxy generated%}
private static class SystemUnderTest {
private javax0.geci.jamal.sample.SystemUnderTest sut = new javax0.geci.jamal.sample.SystemUnderTest();

imports the macro definitions from the resource file geci.jim. The file itself is part of the library. If you have the dependency on the classpath when the code generator and the Jamal processor runs, you can import the definition from this resource file. The macro definitions in this file are simple Jamal macros defined as text. You can have a look at them at the URL

https://github.com/verhas/javageci/blob/1.6.1/javageci-jamal/src/main/resources/geci.jim

The next line uses the beginCode user-defined macro, which is defined in geci.jim as the following:

{%@define beginCode(:x)=//<editor-fold desc=":x">%}

When this macro is used it will result the start of an editor fold that helps to keep the generated code non-intrusive when the file is opened in the IDE. When this macro is evaluated, it will be

//<editor-fold desc="SystemUnderTest proxy generated">

The next two lines start the private static inner class. It is just plain text; there is no macro in it.

Now we get to the four loops that generate proxy codes for

Delegating proxy methods for the private methods of the tested class.
Delegating proxy methods for the non-private methods declared in the class or inherited, except those inherited from the Object class.
Setter and getter methods for the private fields of the tested class.
Setter and getter methods for the non-private fields of the tested class.

Since these are very similar, I will discuss here only the first in detail.

{%!#for ($name,$type,$args) in
({%#methods
{%class javax0.geci.jamal.sample.SystemUnderTest%}
{%selector private %}
{%format/$name|$type|$args%}
%}) =
{%@options skipForEmpty%}
private $type $name({%`@argList $args%}) throws Exception {
Method m = sut.getClass().getDeclaredMethod("$name"{%`#classList ,$args%});
m.setAccessible(true);
m.invoke(sut{%`#callArgs ,$args%});
}
%}

The loop is constructed using a for macro, a Java-implemented, built-in macro of Jamal from the core package. This macro is always available for any Jamal processing. This macro iterates through a comma-separated list and repeats its contents for each list element replacing the loop variables with the actual values. There can be more than one loop variable. In such a case, like in our example, the actual value is split up along the | characters. The comma used as a list separator, and the values separator | can be redefined. In the above case, the for loop uses three-loop variables, $name, $type`, and$args. The start with a$` sign has no significance. Any string can be used as a loop variable.

The list of values is between the () characters after the in keyword. This list is the result of the evaluation of the methods built-in macro. This macro is implemented in Java and is part of the Java::Geci Jamal module. It is not a generally available Jamal macro, but when we run the code generation of Java::Geci, this JAR file is on the classpath, and thus this macro is available.

The methods macro lists the methods of a class.

The class name is taken from the user-defined macro $class, which can be defined using the user-defined macro class. The listing also considers a selector expression that can be used to filter out some of the methods. It is also provided in a user-defined macro, and there is also a helper macro in geci.jim to define it, named selector. In the example above, the selector expression is private, which will select only the private methods.

When the list is collected, the macro methods must convert it to a comma-separated list. To do that, it uses a formatting string that can contain placeholders. In our case, the placeholders are $name, $type, and $args. Every element in the list for the for loop will contain these three strings for the listed methods separated by two | characters as indicated by the format string.

The part after the = sign in the for loop is repeated for each method. It will declare a private method that invokes the same method of the tested method. To do that, it uses the help of the Java::Geci Jamal module provided built-in macros argList, classList, and callArgs. These help generating code that declares the arguments, lists the classes of the argument types or lists the arguments for the actual call.

Since this is just an article and not a full-blown documentation of Java::Geci and Jamal, I skip some details. For example, why the macro for uses the # character in front of it instead of @, why there is a backtick character in front of the macros in the loop’s body, and why the for loop uses a ! character. These details control the macro evaluation order. The list of the methods needs to be created before the for loop starts because it requires the method list. On the other hand, the macros in the loop’s body have to be evaluated after the loop generated the text for every listed method.

Also, note that this implementation is for demonstration purposes only. It simplifies the problem and does not cover all the corner cases. For example, it will generate a setter for a final field.

If you want to use this code generation, you can use the macro proxy(KLASS) defined in the resource file res:unittestproxy.jim.

You can have a look at the class UnitTestWithGeneratedUnitTestProxy, which is a tad more complex than the sample and tests these macros. The start of the generated code is the following:

/*!jamal
{%@import res:unittestproxy.jim%}\

{%beginCode SystemUnderTest proxy generated%}
{%proxy javax0.geci.jamal.unittestproxy.TestSystemUnderTest%}
{%endCode%}
*/

It merely imports the res:unittestproxy.jim file, which imports geci.jim and then uses the macro proxy to generate all the needed code covering all the corner cases.

If you want to use the code generator in your code, you have to do two things:

A. Include the dependency in your pom.xml file:

<dependency>
<groupId>com.javax0.geci</groupId>
<artifactId>javageci-jamal</artifactId>
<version>1.6.1</version>
<scope>test</scope>
</dependency>

B. Create a small unit test that runs the code generator:

@Test
@DisplayName("run the Jamal generator")
public void testRunJamalGenerator() throws Exception {
Geci geci = new Geci();
Assertions.assertFalse(
geci.register(new JamalGenerator())
.generate()
, geci.failed()
);
}

The generator runs during the unit test. During the test run, it has access to the structure of the Java code via reflection. The Jamal macros like methods, fields can query the different classes and provide the list of the methods and fields. The test fails if there was any new code generated. It only happens when the code generator runs the first time or when the tested system has changed. In this case, the test fails because the compiled code during the execution is not the final one. In such a case, start Maven again, and the second time the compilation already runs fine. Do not forget to commit the changed code. There is no risk of failing to update the generated code, like in IDE provided code generation that you have to invoke manually.

Takeaway

What you should remember from this article:

Try not to test private methods. If you feel the need, you did something wrong. Probably. Possibly not.
If you test private methods arrange the reflective code into a private static class that delegates the call to the original class. This will remove the implementation of the reflective access from the test and the test remains what it has to be: functionality test.
If you are a lazy person, and as a good programmer you have to be, use a Java::Geci and Jamal to generate these inner classes for your tests.
Master Java::Geci and Jamal and use them to generate code for your other, specific needs.

Start a Java app without installing it

Leave a reply

This article describes how you can use JShell to download and execute a Java application. It will eliminate the need for the installation of the application.

Do not install, just run!

The first obstacle that you have to overcome to make people use your app is the installation. You want people to use the app, try it out. To do that, they first have to install it. At least they have to download it and type in a command line to start it up. If your program is excellent and valuable for the users, they will use it after this step. They will see that it works and can do what it is supposed to do. Before the installation, however, this is a bit different. Users will install the program if they really, really need it. Installation is undoubtedly an entry threshold.

Jamal as Example

My example is Jamal that you can download at https://github.com/verhas/jamal. I wrote the first version twenty years ago, and I named it Jamal. The name stands for Just Another Macro language, which I intended to be sarcastic. I thought it was sarcastic because it was so much more different than any other text macro applications. It seems the name was not interpreted as sarcastic but rather literally. Users saw it really as “just another” and it did not become widespread. A few people bothered to install it.
Now I have a Java version, which is even more versatile and powerful than the previous version. However, if you wanted to use it, you had to install it and start it up with a relatively complex java -cp ... command line. My first attempt to overcome this was to create a Maven plugin. A maven plugin executes without installing it. If you have installed Maven, all you need to run a plugin is a Maven command line. A kind of complex one, though. Or it would help if you had a pom.xml.

I also created the Maven plugin version because I used Jamal to maintain the pom.xml files with Jamal preprocessed. That way, as you can see in an earlier article, I can write

{dependencyManagement|{dependencies|
    {@for MODULE in (testsupport,api,core,tools,engine,extensions)={dependency :com.javax0.jamal:jamal-MODULE:{VERSION}}}
    {@for MODULE in (api,engine,params)={dependency :org.junit.jupiter:junit-jupiter-MODULE:5.2.0:test}}
    }}

instead of a much longer and redundant XML fragment. This source, pom.xml.jam is then converted to pom.xml, and Maven runs fine.

The solution can still be better because many people do not use Maven. Jamal is not a Maven dependent tool.

I also use a different project to *.md.jam files to edit my next book. A book, as a project, does not require Maven. This book is not a Java book. I happen to have Maven on my machine, but the project does not need that.

There is no reason to require installed Maven as a precondition.

There is one precondition that I have to require, and that is an installed Java JDK. I cannot skip that because Jamal is written in Java.

You can also miss this precondition if you have docker, but then you need docker.

However, if you have the JDK installed (at least Java 9), you can quickly start a JShell. JShell executes the Java code from some input file that you specify on the command line. If you want to start Jamal, then the command is:

jshell https://git.io/jamal

The command file is on GitHub, and JShell can download it from there. This command file downloads the JAR files needed to run Jamal, and then it starts Jamal in a separate process.

The actual script splits into separate parts, and the jamal.jsh content is

/open scripts/version.jsh
/open scripts/jarfetcher.jsh
/open scripts/executor.jsh
/open scripts/optionloader.jsh
/open scripts/defaultoptions.jsh

download("01engine/jamal-engine")
download("02api/jamal-api")
download("03tools/jamal-tools")
download("04core/jamal-core")
download("08cmd/jamal-cmd")

loadOptions()

for(String jarUrl:extraJars){
    LOCAL_CACHE.mkdirs();
    downloadUrl(jarUrl,LOCAL_CACHE);
    }

execute()

/exit

As you can see, the JShell commands and the Java snippets are mixed. The script loads other scripts using the JShell /open command. These snippets define the method download(), loadOption() and downloadUrl().

The script version.jsh defines the global variable VERSION:

String VERSION="1.2.0";

Downloading and Caching the Program

The next script, jarfetcher.jsh is a bit more complicated. As of now, it is 100 lines. If you want to look at the whole code, it is available on GitHub. You can calculate the URL from the argument of the /open statement and from the URL above used to start Jamal.

The core functionality implemented in this script is the one that downloads the JAR files. This is the following:

void downloadUrl(String urlString,File cacheRootDirectory) throws IOException {
    final URL url = new URL(urlString);
    File jar = new File(cacheRootDirectory.getAbsolutePath() + "/" + getFile(url));
    classPath.add(jar.getAbsolutePath());
    if (jar.exists()) {
        return;
    }
    System.out.println("downloading " + url);
    System.out.println("saving to file " + jar.getAbsolutePath());
    HttpURLConnection con = (HttpURLConnection) url.openConnection();
    con.setRequestMethod("GET");
    con.setConnectTimeout(CONNECT_TIMEOUT);
    con.setReadTimeout(READ_TIMEOUT);
    con.setInstanceFollowRedirects(true);
    final int status = con.getResponseCode();
    if (status != 200) {
        throw new IOException("GET url '" + url.toString() + "' returned " + status);
    }
    InputStream is = con.getInputStream();
    try (OutputStream outStream = new FileOutputStream(jar)) {
        byte[] buffer = new byte[8 * 1024];
        int bytesRead;
        while ((bytesRead = is.read(buffer)) != -1) {
            outStream.write(buffer, 0, bytesRead);
        }
    }
}

The method caches the downloaded files into a directory. Environment variables can configure the directory. The default location is ~/.jamal/cache/.jar/.

If the file exists, then it does not download it again. The code assumes that the files we are using are released JAR files that do not ever change. If this file was never downloaded before, it downloads the file and stores it in the cache directory.

Executing the macro processor

When all the files are there, then the script executed Jamal. It is coded in the executor.jsh. The method execute.jsh contains the following method:

void execute() throws IOException, InterruptedException {
    ProcessBuilder builder = new ProcessBuilder();
    String sep = System.getProperty("path.separator");
    String cp = String.join(sep,classPath);
    List<String> arguments = new ArrayList<>();
    arguments.addAll(List.of("java", "-cp", cp, "javax0.jamal.cmd.JamalMain"));
    arguments.addAll(commandLineOptions.entrySet().stream().map(e -> "" + e.getKey() + "=" + e.getValue()).collect( Collectors.toSet()));
    System.out.println("EXECUTING");
    for( String a : arguments){
        System.out.println(a);
    }
    builder.command(arguments.toArray(String[]::new))
        .directory(new File("."));
    Process process = builder.start();
    process.getInputStream().transferTo(System.out);
    int exitCode = process.waitFor();
}

As you can see, this script is using the standard Java ProcessBuilder to create a new process and then executes Jamal in it.

Extra details

The actual operation is a bit more complex. Many options can control Jamal. In the Maven plugin version, these options are in the pom.xml file. The command-line version uses, eventually, command-line options. JShell does not handle command-line options that would pass to the executing JShell engine. There are some tricks, like using system properties or environment variables. I find those cumbersome and tricky to use. You usually execute Jamal using the same configuration in a single project. The best way is to have the options in a file. The Jamal startup JShell script reads the file ./jamal.options. The file has a simple key value format. It can contain values for the command line options as keys and extra jar and cp keys. Jamal is extensible. Extra classes on the classpath may contain macro implementations in Java, and they are used from the text files. Every jar defines a URL from where a JAR file downloads. The cp key defines local files to be added to the classpath.

These files are project-specific; therefore, these will cache in the current working directory. The cache directory will be ./.jamal/cache/.jar/.

If the jamal.options file does not exist, then the script’s first execution will create. The auto-created file will contain the default values and also some documentation.

Summary

A Java application can start without downloading it first using JShell. The startup JShell script can be located on the net and downloaded on the fly. It can also fetch other scripts, and you can mix the Java snippets with JShell commands. I recommend having some caching strategy for the JAR files to avoid repetitive downloads. The JShell script can start your application in a new process. You cannot pass command line parameters to a JShell script, but you can use an options file or something else.

Happy scripting.

All you wanted to know about Throwable

2 Replies

This article is a tutorial about exceptions. But not the usual one. There are many of those that tell you what exceptions are for, how you can throw one, catch one, the difference between checked and runtime exceptions, and so on. There is no need for another. It would also be boring for you. If not, then go and read one of those and come back when you have learned what they teach. This article starts where those tutorials end. We dive a bit deeper into Java exceptions, what you can do with them, what you should do with them, and what features they have that you may not have heard about. If setStackTrace(), getCause() and getSuppressed() are the methods you eat for breakfast then you can skip this article. But if not, and you want to know a bit about these, then go on. This article is long. It took a long time to write, and it will take a long time to read. It is needed.

Introduction

In this article, we will talk about exceptions and what we can and should do with Java exceptions. The simplest case is to throw one and then catch it, but there are more complex situations, like setting a cause or suppressed exceptions. We will look at these possibilities, and a bit more. To discover the possibilities we will develop a simple application and step-by-step we will create four versions developing the application further and further using more and more exception handling possibilities. The source code is available in the repository:

https://github.com/verhas/BLOG/tree/master/exception_no_stack

The different versions are in different Java packages. Some classes that did not change in the different versions are one package higher, and they are not versioned.

The first version v1 simply throws en exception, and it is not handled by the application. The test code expects the test setup to throw the exception. This version is the baseline to demonstrate why we need more complex solutions. We will experience that there is not enough information in the exception to see where the actual issue has happened.
The second version v2 catches the exception at higher levels and throws a new exception with more information about the exceptional case, and the new exception has the original one embedded as cause. This approach gives enough information to track the location of the issue, but it can even be enhanced so that it is easier to read and recognize the actual problem.
The third version v3 will demonstrate how we can modify the creation of the new exceptions so that the stack trace of the higher level exceptions will not point to the location where the original exception was caught, but rather where the original exception was thrown.
Finally, the fourth version v4 will demonstrate how we can suppress expressions when it is possible to go on with the processing in case of en exceptional case even if the operation cannot be finished successfully. This “going further” makes it possible to have an exception at the end that collects the information about all discovered exceptional cases and not only the first occurrence.

If you look at the code, you will also find there the original text of this article, and the setup that helps to maintain the code snippets copying them into the article from the source keeping all of them up-to-date. The tool that does it for us is Java::Geci.

Sample Application

We use exceptions to handle something that is outside of the normal flow of the program. When an exception is thrown the normal flow of the program is interrupted, and the execution stops dumping the exception to some output. These exceptions can also be caught using the try and catch command pair built into the language.

    try {
        ... some code ...
        ... even calling methods
                      several level deep     ...
        ...    where exception may be thrown ...
      }catch(SomeException e){
        ... code having access to the exception object 'e'
            and doing someting with it (handling) ....
      }

The exception itself is an object in Java and can contain a lot of information. When we catch an exception in our code, we have access to the exception object, and the code can act upon the exceptional situation also having access to the parameters that the exception object is carrying. It is possible to implement our own exceptions extending the Java
java.lang.Throwable class or some of the classes that directly, or transitively extend Throwable. (Usually, we extend the class Exception.) Our own implementation can hold many parameters that describe the nature of the exceptional situation. We use object fields for the purpose.

Although there is no limit for the data an exception can carry, it usually does not contain more than a message and the stack trace. There is room – as defined in the class Throwable – for other parameters, like the exception that was causing the current one (getCause()), or an array of suppressed exceptions (getSuppressed()). They are rarely used, presumably because developers are not aware of these features and because most cases are simple and do not need these possibilities. We will have a look at these possibilities in this article so that you will not belong to the group of ignorant developers who do not use these methods only because they are not aware of them.

We have a sample application. It is a bit more than just throwing, catching, and handling an exception in the catch branch that lets the code to continue. That is simple and is explained in the tutorial you have read when learning to program in Java the first time.

Our sample application will be a bit more complex. We will list the files in a directory, read the lines, and count the number of wtf strings. This way we automate the code review process quality measurement (joking). It is said that the code quality is reverse proportional to the number of the WTFs during the code review.

The solution contains

a FileLister that can list the files,
a FileReader that can read a file,
a LineWtfCounter that will count the wtfs in a single line,
a FileWtfCounter that will use the previous class to count all the wtfs in the whole file listing the lines, and finally,
a ProjectWtfCounter that counts the wtfs in the whole project using the file level counter, listing all the files.

Version 1, throw and catch

The application functionality is fairly simple and because we focus on the exception handling the implementation is also trivial. For example, the file listing class is as simple as the following:

package javax0.blog.demo.throwable;

import java.util.List;

public class FileLister {

    public FileLister() {
    }

    public List<String> list() {
        return List.of("a.txt", "b.txt", "c.txt");
    }
}

We have three files in the file system, a.txt, b.txt, and c.txt. This is a mock, of course, but in this case, we do not need anything more complex to demonstrate the exception handling. Similarly, the FileReader is also a kind of mock implementation that serves demonstration purposes only:

package javax0.blog.demo.throwable.v1;

import java.util.List;

public class FileReader {
    final String fileName;

    public FileReader(String fileName) {
        this.fileName = fileName;
    }

    public List<String> list() {
        if (fileName.equals("a.txt")) {
            return List.of("wtf wtf", "wtf something", "nothing");
        }
        if (fileName.equals("b.txt")) {
            return List.of("wtf wtf wtf", "wtf something wtf", "nothing wtf");
        }
        if (fileName.equals("c.txt")) {
            return List.of("wtf wtf wtf", "wtf something wtf", "nothing wtf", "");
        }
        throw new RuntimeException("File is not found: "+ fileName);
    }

}

The counter, which counts the number of wtf occurrences in a line is

package javax0.blog.demo.throwable.v1;

public class LineWtfCounter {
    private final String line;

    public LineWtfCounter(String line) {
        this.line = line;
    }

    public static final String WTF = "wtf";
    public static final int WTF_LEN = WTF.length();

    public int count() {
        if (line.length() == 0) {
            throw new LineEmpty();
        }
        // the actual lines are removed from the documentation snippet
    }

}

To save space and focus on our topic the snippet does not display the actual logic (was automatically removed by Java::Geci). The reader can create a code that actually counts the number of wtf substrings in a string, or else simply “wtf”. Even if the reader cannot write such a code it is available from the repository mentioned at the start of the article.

The logic in our application says that this is an exceptional situation if one of the lines in the file has zero length. In that case, we throw an exception.

Usually, such a situation does not verify to be an exception, and I acknowledge that this is a bit contrived example, but we needed something simple. If the length of the line is zero then we throw a LineEmpty exception. (We do not list the code of LineEmpty exception. It is in the code repo, and it is simple, nothing special. It extends RuntimeException, no need to declare where we throw it.) If you look at the mock implementation of FileReader then you can see that we planted an empty line in the file c.txt.

The counter on the file level using the line level counter is the following:

package javax0.blog.demo.throwable.v1;

public class FileWtfCounter {
    // fileReader injection is omitted for brevity
    public int count() {
        final var lines = fileReader.list();
        int sum = 0;
        for (final var line : lines) {
            sum += new LineWtfCounter(line).count();
        }
        return sum;
    }

}

(Again, some trivial lines are skipped from the printout.)

This is the first version of the application. It does not have any special exception handling. It just sums up the values that the line counters return and in case there is an exception on the lower level, in the line wtf counter then this will automatically propagate up. We do not handle that exception in any way on this level.

The project level counter is very similar. It uses the file counter and sums up the results.

package javax0.blog.demo.throwable.v1;

import javax0.blog.demo.throwable.FileLister;

public class ProjectWftCounter {
    // fileLister injection is omitted for brevity
    public int count() {
        final var fileNames = fileLister.list();
        int sum = 0;
        for (final var fileName : fileNames) {
            sum += new FileWtfCounter(new FileReader(fileName)).count();
        }
        return sum;
    }
}

We test it using the simple test code:

package javax0.blog.demo.throwable.v1;

import javax0.blog.demo.throwable.FileLister;
import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.catchThrowable;

public class TestWtfCounter {

    @Test
    @DisplayName("Throws up for a zero length line")
    void testThrowing() {
        Throwable thrown = catchThrowable(() ->
                new ProjectWftCounter(new FileLister())
                        .count());
        assertThat(thrown).isInstanceOf(LineEmpty.class);
        thrown.printStackTrace();
    }

}

A unit test usually should not have a stack trace print. In this case we have it to demonstrate what is thrown. The stack trace in the error will show us the error as the following:

javax0.blog.demo.throwable.v1.LineEmpty: There is a zero length line
    at javax0.blog.demo.throwable.v1.LineWtfCounter.count(LineWtfCounter.java:18)
    at javax0.blog.demo.throwable.v1.FileWtfCounter.count(FileWtfCounter.java:19)
    at javax0.blog.demo.throwable.v1.ProjectWftCounter.count(ProjectWftCounter.java:22)
    at javax0.blog.demo.throwable.v1.TestWtfCounter.lambda$testThrowing$0(TestWtfCounter.java:18)
    at org.assertj.core.api.ThrowableAssert.catchThrowable(ThrowableAssert.java:62)
    ...
    at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)

There is a little problem with this exception. When we use this code it does not tell us anything about the actual file and line that is problematic. We have to examine all the files and all the lines if there is an empty one. It is not too difficult to write an application for that, but we do not want to work instead of the programmer who created the application. When there is an exception we expect the exception to give us enough information to successfully tackle the situation. The application has to tell me which file and which line is faulty.

Version 2, setting cause

To provide the information in the exception we have to gather it and insert it into the exception. This is what we do in the second version of the application.

The exception in the first version does not contain the name of the file, or the line number because the code does not put it there. The code has a good reason to do that. The code at the location of the exception throwing does not have the information and thus it cannot insert into the exception what it does not have.

A lucrative approach could be to pass this information along with the other parameters so that when an exception happens the code can insert this information into the exception. I do not recommend that approach. If you look at the source codes I published on GitHub you may find examples of this practice. I am not proud of them, and I am sorry.
Generally, I recommend that the exception handling should not interfere with the main data flow of the application. It has to be separated as it is a separate concern.

The solution is to handle the exception on several levels, on each level adding the information, which is available at the actual level. To do that we modify the classes FileWtfCounter and ProjectWftCounter.

The code of ProjectWftCounter becomes the following:

package javax0.blog.demo.throwable.v2;

public class FileWtfCounter {
    // some lines deleted ...
    public int count() {
        final var lines = fileReader.list();
        int sum = 0;
        int lineNr = 1;
        for (final var line : lines) {
            try {
                sum += new LineWtfCounter(line).count();
            }catch(LineEmpty le){
                throw new NumberedLineEmpty(lineNr,le);
            }
            lineNr ++;
        }
        return sum;
    }

}

The code catches the exception that signals the empty line and throws a new one, which already has a parameter: the serial number of the line.

The code for this exception is not so trivial as in the case of LineEmpty, thus it is listed here:

package javax0.blog.demo.throwable.v2;

public class NumberedLineEmpty extends LineEmpty {
    final protected int lineNr;

    public NumberedLineEmpty(int lineNr, LineEmpty cause) {
        super(cause);
        this.lineNr = lineNr;
    }

    @Override
    public String getMessage() {
        return "line " + lineNr + ". has zero length";
    }
}

We store the line number in an int field, which is final. We do it because

use final variables if possible
use primitives over objects if possible
store the information in its original form as long as possible so that the use of it is not limited

The first two criteria are general. The last one is special in this case, although it is not specific to exception handling. When we are handling exceptions, however, it is very lucrative to just generate a message that contains the line number instead of complicating the structure of the exception class. After all, the reasoning that we will never
use the exception for anything else than printing it to the screen is valid. Or not? It depends. First of all, never say never. Second thought: if we encode the line number into the message then it is certain that we will not ever use it for anything else than printing it to the user. That is because we cannot use it for anything else. We limit ourselves. The today programmer limits the future programmer to do something meaningful with the data.

You may argue that this is YAGNI. We should care about storing the line number as an integer when we want to use it and caring about it at the very moment is too early and is just a waste of time. You are right! At the same time, the person who is creating the extra field and the getMessage() method that calculates the text version of the exception information is also right. Sometimes there is a very thin line between YAGNI and careful and good style programming. YAGNI is to avoid complex code that later you will not need (except that when you create it, you think that you will need). In this example, I have the opinion that the above exception with that one extra int field is not “complex”.

We have a similar code on the “project” level, where we handle all the files. The code of ProjectWftCounter will be

package javax0.blog.demo.throwable.v2;

import javax0.blog.demo.throwable.FileLister;

public class ProjectWftCounter {
    // some lines deleted ...
    public int count() {
        final var fileNames = fileLister.list();
        int sum = 0;
        for (final var fileName : fileNames) {
            try {
                sum += new FileWtfCounter(new FileReader(fileName)).count();
            } catch (NumberedLineEmpty nle) {
                throw new FileNumberedLineEmpty(fileName, nle);
            }
        }
        return sum;
    }
}

Here we know the name of the file and thus we can extend the information adding it to the exception.

The exception FileNumberedLineEmpty is also similar to the code of NumberedLineEmpty. Here is the code of FileNumberedLineEmpty:

package javax0.blog.demo.throwable.v2;

public class FileNumberedLineEmpty extends NumberedLineEmpty {
    final protected String fileName;

    public FileNumberedLineEmpty(String fileName, NumberedLineEmpty cause) {
        super(cause.lineNr, cause);
        this.fileName = fileName;
    }

    @Override
    public String getMessage() {
        return fileName + ":" + lineNr + " is empty";
    }
}

At this moment I would draw your focus to the fact that the exceptions that we created are also in inheritance hierarchy. They extend the other as the information we gather and store is extended, thus:

FileNumberedLineEmpty - extends -> NumberedLineEmpty - extends -> LineEmpty

If the code using these methods expects and tries to handle a LineEmpty exception then it can do even if we throw a more detailed and specialized exception. If a code wants to use the extra information then it, eventually, has to know that the actual instance is not LineEmpty rather something more specialized as NumberedLineEmpty or FileNumberedLineEmpty. However, if it only wants to print it out, get the message then it is absolutely fine to handle the exception as an instance of LineEmpty. Even doing so the message will contain the extra information in human-readable form thanks to OO programming polymorphism.

The proof of the pudding is in the eating. We can run our code with the simple test. The test code is the same as it was in the previous version with the only exception that the expected exception type is FileNumberedLineEmpty instead of LineEmpty. The printout, however, is interesting:

javax0.blog.demo.throwable.v2.FileNumberedLineEmpty: c.txt:4 is empty
    at javax0.blog.demo.throwable.v2.ProjectWftCounter.count(ProjectWftCounter.java:22)
    at javax0.blog.demo.throwable.v2.TestWtfCounter.lambda$testThrowing$0(TestWtfCounter.java:17)
    at org.assertj.core.api.ThrowableAssert.catchThrowable(ThrowableAssert.java:62)
...
    at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)
Caused by: javax0.blog.demo.throwable.v2.NumberedLineEmpty: line 4. has zero length
    at javax0.blog.demo.throwable.v2.FileWtfCounter.count(FileWtfCounter.java:21)
    at javax0.blog.demo.throwable.v2.ProjectWftCounter.count(ProjectWftCounter.java:20)
    ... 68 more
Caused by: javax0.blog.demo.throwable.v2.LineEmpty: There is a zero length line
    at javax0.blog.demo.throwable.v2.LineWtfCounter.count(LineWtfCounter.java:15)
    at javax0.blog.demo.throwable.v2.FileWtfCounter.count(FileWtfCounter.java:19)
    ... 69 more

We can be happy with this result as we immediately see that the file, which is causing the problem is c.txt and the fourth line is the one, which is the culprit. On the other hand, we cannot be happy when we want to have a look at the code that was throwing the exception. Sometime in the future, we may not remember why a line must not have zero length. In that case, we want to look at the code. There we will only see that an exception is caught and rethrown. Luckily there is the cause, but it is actually three steps till we get to the code that is the real problem at LineWtfCounter.java:15.

Will anyone ever be interested in the code that is catching and rethrowing an exception? Maybe yes. Maybe no. In our case, we decide that there will not be anyone interested in that code and instead of handling a long chain of exception listing the causation of the guilty we change the stack trace of the exception that we throw to that of the causing
exception.

Version 3, setting the stack trace

In this version, we only change the code of the two exceptions: NumberedLineEmpty and FileNumberedLineEmpty. Now they not only extend one the other and the other one LineEmpty but they also set their own stack trace to the value that the causing exception was holding.

Here is the new version of NumberedLineEmpty:

package javax0.blog.demo.throwable.v3;

public class NumberedLineEmpty extends LineEmpty {
    final protected int lineNr;

    public NumberedLineEmpty(int lineNr, LineEmpty cause) {
        super(cause);
        this.setStackTrace(cause.getStackTrace());
        this.lineNr = lineNr;
    }

    // getMessage() same as in v2

    @Override
    public Throwable fillInStackTrace() {
        return this;
    }
}

Here is the new version of FileNumberedLineEmpty:

package javax0.blog.demo.throwable.v3;

public class FileNumberedLineEmpty extends NumberedLineEmpty {
    final protected String fileName;

    public FileNumberedLineEmpty(String fileName, NumberedLineEmpty cause) {
        super(cause.lineNr, cause);
        this.setStackTrace(cause.getStackTrace());
        this.fileName = fileName;
    }

    // getMessage(), same as in v2

    @Override
    public Throwable fillInStackTrace() {
        return this;
    }
}

There is a public setStackTrace() method that can be used to set the stack trace of an exception. The interesting thing is that this method is really public and not protected. The fact that this method is public means that the stack trace of any exception can be set from outside. Doing that is (probably) against encapsulation rules.
Nevertheless, it is there and if it is there then we can use it to set the stack trace of the exception to be the same as it is that of the causing exception.

There is another interesting piece of code in these exception classes. This is the public fillInStackTrace() method. If we implement this, like the above then we can save the time the exception spends during the object construction collecting its own original stack trace that we replace and throw away anyway.

When we create a new exception the constructor calls a native method to fill in the stack trace. If you look at the default constructor of the class java.lang.Throwable you can see that actually this is all it does (Java 14 OpenJDK):

public Throwable() {
    fillInStackTrace();
}

The method fillInStackTrace() is not native but this is the method that actually invokes the native fillInStackTrace(int) method that does the work. Here is how it is done:

public synchronized Throwable fillInStackTrace() {
    if (stackTrace != null ||
        backtrace != null /* Out of protocol state */ ) {
        fillInStackTrace(0);
        stackTrace = UNASSIGNED_STACK;
    }
    return this;
}

There is some “magic” in it, how it sets the field stackTrace but that is not really important as for now. It is important, however, to note that the method fillInStackTrace() is public. This means that it can be overridden. (For that, protected would have been enough, but public is even more permitting.)

We also set the causing exception, which, in this case will have the same stack trace. Running the test (similar to the previous tests that we listed only one of), we get the stack print out:

javax0.blog.demo.throwable.v3.FileNumberedLineEmpty: c.txt:4 is empty
    at javax0.blog.demo.throwable.v3.LineWtfCounter.count(LineWtfCounter.java:15)
    at javax0.blog.demo.throwable.v3.FileWtfCounter.count(FileWtfCounter.java:16)
    at javax0.blog.demo.throwable.v3.ProjectWftCounter.count(ProjectWftCounter.java:19)
    at javax0.blog.demo.throwable.v3.TestWtfCounter.lambda$testThrowing$0(TestWtfCounter.java:17)
    at org.assertj.core.api.ThrowableAssert.catchThrowable(ThrowableAssert.java:62)
...
    at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
    at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)
Caused by: javax0.blog.demo.throwable.v3.NumberedLineEmpty: line 4. has zero length
    ... 71 more
Caused by: javax0.blog.demo.throwable.v3.LineEmpty: There is a zero length line
    ... 71 more

It should be no surprise that we have a FileNumberedLineEmpty with a stack trace that starts on a code line LineWtfCounter.java:15 that does not throw that exception. When we see this there can be some debate about:

Why do we need the causing exceptions attached to the original when we overwrite the stack trace? (We do not.)
Is this a clean solution? It may be confusing that the stack trace originates from a line that does not throw that exception.

Let’s answer these concerns with, yes, they are needed for the demonstration purpose, and in a real application every programmer may decide if they want to use a solution like that.

Is this the best solution we can get? Probably no, because, as I promised, we have a fourth version of the application.

Version 4, suppressing exceptions

When we created the mock FileReader we were optimistic a lot. We assumed that there is only one line that has zero length. What if there are more than one lines like that? In that case, the application stops at the first one. The user fixes the error either adding some characters to the line, so that this is not an empty one, or deleting it altogether so that this is not a line anymore. Then the user runs the application again to get the second location in the exception. If there are many such lines to correct then this process can be cumbersome. You can also imagine that the code in a real application may run for long minutes let alone for hours. To execute the application just to get the next location of the problem is a waste of human time, waste of CPU clock, energy, and thus clean oxygen generating CO2 unnecessarily.

What we can do is, alter the application so that it goes on processing when there is an empty line, and it throws an exception listing all the lines that were empty and discovered during the process only after all the files and all the lines were processed. There are two ways. One is to create some data structure and store the information in there and at the end of the processing, the application can have a look at that and throw an exception if there is any information about some empty lines there. The other one is to use the structures provided by the exception classes to store the information.

The advantage is to use the structures provided by the exception classes are

the structure is already there and there is no need to reinvent the wheel,

it is well-designed by many seasoned developers and used for decades, probably is the right structure,

the structure is general enough to accommodate other types of exceptions, not only those that we have currently, and the data structure does not need any change.

Let’s discuss the last bullet point a bit. It may happen that later we decide that lines that contain WTF all capital are also exceptional and should throw an exception. In that case, we may need to modify our data structures that store these error cases if we decided to craft these structures by hand. If we use the suppressed exceptions of the Throwable class then there is nothing extra to do. There is an exception, we catch it (as you will see in the example soon), store it, and then attach it at the end of the summary exception as a suppressed exception. Is it YAGNI that we think about this future possibility when it is extremely unlikely that this demo application will ever be extended? Yes, and no, and generally it does not matter. YAGNI is usually a problem when you devote time and effort to develop something too early. It is an extra cost in the development and later in the maintenance. When we are just using something simpler that is already there then it is not YAGNI to use it. It is simply clever and knowledgable about the tool we use.

Let’s have a look at the modified FileReader that this time already returns many empty lines in many files:

package javax0.blog.demo.throwable.v4;

import java.io.FileNotFoundException;
import java.util.List;

public class FileReader {
    final String fileName;

    public FileReader(String fileName) {
        this.fileName = fileName;
    }

    public List<String> list() {
        if (fileName.equals("a.txt")) {
            return List.of("wtf wtf", "wtf something", "", "nothing");
        }
        if (fileName.equals("b.txt")) {
            return List.of("wtf wtf wtf", "", "wtf something wtf", "nothing wtf", "");
        }
        if (fileName.equals("c.txt")) {
            return List.of("wtf wtf wtf", "", "wtf something wtf", "nothing wtf", "");
        }
        throw new RuntimeException("File is not found: "+ fileName);
    }

}

Now all three files contain lines that are empty. We do not need to modify the LineWtfCounter counter. When there is an empty line, we throw an exception. On this level, there is no way to suppress this exception. We cannot collect here any exception list. We focus on one single line that may be empty.

The case is different in FileWtfCounter:

package javax0.blog.demo.throwable.v4;

public class FileWtfCounter {
    private final FileReader fileReader;

    public FileWtfCounter(FileReader fileReader) {
        this.fileReader = fileReader;
    }

    public int count() {
        final var lines = fileReader.list();
        NumberedLinesAreEmpty exceptionCollector = null;
        int sum = 0;
        int lineNr = 1;
        for (final var line : lines) {
            try {
                sum += new LineWtfCounter(line).count();
            }catch(LineEmpty le){
                final var nle = new NumberedLineEmpty(lineNr,le);
                if( exceptionCollector == null ){
                    exceptionCollector = new NumberedLinesAreEmpty();
                }
                exceptionCollector.addSuppressed(nle);
            }
            lineNr ++;
        }
        if( exceptionCollector != null ){
            throw exceptionCollector;
        }
        return sum;
    }

}

When we catch a LineEmpty exception we store it in an aggregate exception referenced by the local variable exceptionCollector. If there is not exceptionCollector then we create one before adding the caught exception to it to avoid NPE. At the end of the processing when we processed all the lines we may have many exceptions added to the summary exception exceptionCollector. If it exists then we throw this one.

Similarly, the ProjectWftCounter collects all the exceptions that are thrown by the different FileWtfCounter instances and at the end of the processing it throws the summary exception as you can see in the following code lines:

package javax0.blog.demo.throwable.v4;

import javax0.blog.demo.throwable.FileLister;

public class ProjectWftCounter {

    private final FileLister fileLister;

    public ProjectWftCounter(FileLister fileLister) {
        this.fileLister = fileLister;
    }


    public int count() {
        final var fileNames = fileLister.list();
        FileNumberedLinesAreEmpty exceptionCollector = null;
        int sum = 0;
        for (final var fileName : fileNames) {
            try {
                sum += new FileWtfCounter(new FileReader(fileName)).count();
            } catch (NumberedLinesAreEmpty nle) {
                if( exceptionCollector == null ){
                    exceptionCollector = new FileNumberedLinesAreEmpty();
                }
                exceptionCollector.addSuppressed(nle);
            }
        }
        if( exceptionCollector != null ){
            throw exceptionCollector;
        }
        return sum;
    }
}

Now that we have collected all the problematic lines into a huge exception structure we get a stack trace that we deserve:

javax0.blog.demo.throwable.v4.FileNumberedLinesAreEmpty: There are empty lines
    at javax0.blog.demo.throwable.v4.ProjectWftCounter.count(ProjectWftCounter.java:24)
    at javax0.blog.demo.throwable.v4.TestWtfCounter.lambda$testThrowing$0(TestWtfCounter.java:17)
    at org.assertj.core.api.ThrowableAssert.catchThrowable(ThrowableAssert.java:62)
    at org.assertj.core.api.AssertionsForClassTypes.catchThrowable(AssertionsForClassTypes.java:750)
    at org.assertj.core.api.Assertions.catchThrowable(Assertions.java:1179)
    at javax0.blog.demo.throwable.v4.TestWtfCounter.testThrowing(TestWtfCounter.java:15)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:564)
    at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:686)
    at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
    at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
    at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149)
    at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140)
    at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:84)
    at org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(ExecutableInvoker.java:115)
    at org.junit.jupiter.engine.execution.ExecutableInvoker.lambda$invoke$0(ExecutableInvoker.java:105)
    at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
    at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
    at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
    at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
    at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:104)
    at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:98)
    at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$6(TestMethodTestDescriptor.java:205)
    at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
    at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:201)
    at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:137)
    at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:71)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$5(NodeTestTask.java:135)
    at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$7(NodeTestTask.java:125)
    at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:135)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:123)
    at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:122)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:80)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1510)
    at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:38)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$5(NodeTestTask.java:139)
    at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$7(NodeTestTask.java:125)
    at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:135)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:123)
    at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:122)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:80)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1510)
    at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:38)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$5(NodeTestTask.java:139)
    at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$7(NodeTestTask.java:125)
    at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:135)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:123)
    at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:122)
    at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:80)
    at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.submit(SameThreadHierarchicalTestExecutorService.java:32)
    at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:57)
    at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:51)
    at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:248)
    at org.junit.platform.launcher.core.DefaultLauncher.lambda$execute$5(DefaultLauncher.java:211)
    at org.junit.platform.launcher.core.DefaultLauncher.withInterceptedStreams(DefaultLauncher.java:226)
    at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:199)
    at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:132)
    at com.intellij.junit5.JUnit5IdeaTestRunner.startRunnerWithArgs(JUnit5IdeaTestRunner.java:69)
    at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
    at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
    at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)
    Suppressed: javax0.blog.demo.throwable.v4.NumberedLinesAreEmpty
        at javax0.blog.demo.throwable.v4.FileWtfCounter.count(FileWtfCounter.java:22)
        at javax0.blog.demo.throwable.v4.ProjectWftCounter.count(ProjectWftCounter.java:21)
        ... 68 more
        Suppressed: javax0.blog.demo.throwable.v4.NumberedLineEmpty: line 3.
            at javax0.blog.demo.throwable.v4.LineWtfCounter.count(LineWtfCounter.java:15)
            at javax0.blog.demo.throwable.v4.FileWtfCounter.count(FileWtfCounter.java:18)
            ... 69 more
        Caused by: javax0.blog.demo.throwable.v4.LineEmpty: There is a zero length line
    Suppressed: javax0.blog.demo.throwable.v4.NumberedLinesAreEmpty
        at javax0.blog.demo.throwable.v4.FileWtfCounter.count(FileWtfCounter.java:22)
        at javax0.blog.demo.throwable.v4.ProjectWftCounter.count(ProjectWftCounter.java:21)
        ... 68 more
        Suppressed: javax0.blog.demo.throwable.v4.NumberedLineEmpty: line 2.
            at javax0.blog.demo.throwable.v4.LineWtfCounter.count(LineWtfCounter.java:15)
            at javax0.blog.demo.throwable.v4.FileWtfCounter.count(FileWtfCounter.java:18)
            ... 69 more
        Caused by: javax0.blog.demo.throwable.v4.LineEmpty: There is a zero length line
        Suppressed: javax0.blog.demo.throwable.v4.NumberedLineEmpty: line 5.
            at javax0.blog.demo.throwable.v4.LineWtfCounter.count(LineWtfCounter.java:15)
            at javax0.blog.demo.throwable.v4.FileWtfCounter.count(FileWtfCounter.java:18)
            ... 69 more
        Caused by: javax0.blog.demo.throwable.v4.LineEmpty: There is a zero length line
    Suppressed: javax0.blog.demo.throwable.v4.NumberedLinesAreEmpty
        at javax0.blog.demo.throwable.v4.FileWtfCounter.count(FileWtfCounter.java:22)
        at javax0.blog.demo.throwable.v4.ProjectWftCounter.count(ProjectWftCounter.java:21)
        ... 68 more
        Suppressed: javax0.blog.demo.throwable.v4.NumberedLineEmpty: line 2.
            at javax0.blog.demo.throwable.v4.LineWtfCounter.count(LineWtfCounter.java:15)
            at javax0.blog.demo.throwable.v4.FileWtfCounter.count(FileWtfCounter.java:18)
            ... 69 more
        Caused by: javax0.blog.demo.throwable.v4.LineEmpty: There is a zero length line
        Suppressed: javax0.blog.demo.throwable.v4.NumberedLineEmpty: line 5.
            at javax0.blog.demo.throwable.v4.LineWtfCounter.count(LineWtfCounter.java:15)
            at javax0.blog.demo.throwable.v4.FileWtfCounter.count(FileWtfCounter.java:18)
            ... 69 more
        Caused by: javax0.blog.demo.throwable.v4.LineEmpty: There is a zero length line

This time I did not delete any line to make you feel the weight of it on your shoulder. Now you may start to think if it was really worth using the exception structure instead of some neat, slim special-purpose data structure that contains only the very information that we need. If you start to think that, then stop it. Don’t do it. The problem, if any, is not that we have too much information. The problem is the way we represent it. To overcome it the solution is not to throw out the baby with the bathwater… the excess information but rather to represent it in a more readable way. If the application rarely meets many empty lines, then reading through the stack trace may not be an unbearable burden for the user. If it is a frequent problem, and you want to be nice to your users (customers, who pay your bills) then, perhaps, a nice exception structure printer is a nice solution.

We actually have one for you in the project

javax0.blog.demo.throwable.v4.ExceptionStructurePrettyPrinter

that you can use and even modify at your will. With this the printout of the previous “horrendous” stack trace will print out as:

FileNumberedLinesAreEmpty("There are empty lines")
    Suppressed: NumberedLineEmpty("line 3.")
      Caused by:LineEmpty("There is a zero length line")
    Suppressed: NumberedLineEmpty("line 2.")
      Caused by:LineEmpty("There is a zero length line")
    Suppressed: NumberedLineEmpty("line 5.")
      Caused by:LineEmpty("There is a zero length line")
    Suppressed: NumberedLineEmpty("line 2.")
      Caused by:LineEmpty("There is a zero length line")
    Suppressed: NumberedLineEmpty("line 5.")
      Caused by:LineEmpty("There is a zero length line")

With this, we got to the end of the exercise. We stepped through the steps from v1 simply throwing and catching and exception, v2 setting causing exceptions matryoshka style, v3 altering the stack trace of the embedding exception, and finally v4 storing all the suppressed exceptions that we collected during our process. What you can do now is download the project, play around with it, examine the stack traces, modify the code, and so on. Or read on, we have some extra info about exceptions that are rarely discussed by basic level tutorials, and it is also worth reading the final takeaway section.

Other things to know about exceptions

In this section, we will tell you some information that is not well known and is usually missing from the basic Java tutorials that talk about exceptions.

There is no such thing as checked exception in the JVM

Checked exceptions cannot be thrown from a Java method unless the method declaration explicitly says that this may happen. The interesting thing is that the notion of checked exceptions is not known for the JVM. This is something handled by the Java compiler, but when the code gets into the JVM there is no check about that.

Throwable (checked) <-- Exception (checked) <-- RuntimeException (unchecked)
                                            <-- Other Exceptions (checked)
                    <-- Error (unchecked)

The structure of the exception classes is as described above. The root class for the exceptions is the Throwable. Any object that is an instance of a class, which extends directly or indirectly the Throwable class can be thrown. The root class Throwable is checked, thus if an instance of it is thrown from a method, then it has to be declared.
If any class extends this class directly and is thrown from a method then, again it has to be declared. Except if the object is also an instance of RuntimeException or Error. In that case the exception or error is not checked and can be thrown without declaring on the throwing method.

The idea of checked exception is controversial. There are advantages of its use but there are many languages that do not have the notion of it. This is the reason why the JVM does not enforce the declaration of checked exceptions. If it did it would not be possible reasonably to generate JVM code from languages that do not require exceptions declared and want to interoperate with the Java exceptions. Checked exceptions also cause a lot of headaches when we are using streams in Java.

It is possible to overcome of checked exceptions. A method created with some hack, or simply in a JVM language other than Java can throw a checked exception even if the method does not declare the exception to be thrown. The hacky way uses a simple static utility method, as listed in the following code snippet:

package javax0.blog.demo.throwable.sneaky;

public class SneakyThrower {
    public static <E extends Throwable> E throwSneaky(Throwable e) throws E {
        throw (E) e;
    }
}

When a code throws a checked exception, for example Exception then passing it to throwSneaky() will fool the compiler. The compiler will look at the declaration of the static method and cannot decide if the Throwable it throws is checked or not. That way it will not require the declaration of the exception in the throwing method.

The use of this method is very simple and is demonstrated with the following unit test code:

package javax0.blog.demo.throwable.sneaky;

import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;

import static javax0.blog.demo.throwable.sneaky.SneakyThrower.throwSneaky;
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.catchThrowable;

public class TestSneaky {

    @DisplayName("Can throw checked exception without declaring it")
    @Test
    void canThrowChecked() {
        class FlameThrower {
            void throwExceptionDeclared() throws Exception {
                throw new Exception();
            }

            void throwExceptionSecretly() {
                throwSneaky(new Exception());
            }
        }
        final var sut = new FlameThrower();
        assertThat(catchThrowable(() -> sut.throwExceptionDeclared())).isInstanceOf(Exception.class);
        assertThat(catchThrowable(() -> sut.throwExceptionSecretly())).isInstanceOf(Exception.class);
    }

    int doesNotReturn(){
        throw throwSneaky(new Exception());
        // no need for a return command
    }

}

The two methods throwExceptionDeclared() and throwExceptionSecretly() demonstrate the difference between normal and sneaky throwing.

The method throwSneaky() never returns, and it still has a declared return value. The reason for that is to allow the pattern that can be seen in the method doesNotReturn() towards the end of the text code. We know that the method throwSneaky() never returns, but the compiler does not know. If we simply call it then the compiler will still require some return statement in our method. In more complex code flow it may complain about uninitialized variables. On the other hand if we “throw” the return value in the code then it gives the compiler a hint about the execution flow. The actual throwing on this level will never happen actually, but it does not matter.

Never catch `Throwable`, `...Error` or `COVID`

When we catch an exception we can catch checked exception, RuntimeException or just anything that is Throwable. However, there are other things that are Throwable but are not exceptions and are also not checked. These are errors.

Story:

I do a lot of technical interviews where candidates come and answer my questions. I have a lot of reservations and bad feelings about this. I do not like to play “God”. On the other hand, I enjoy a lot when I meet clever people, even if they are not fit for a given work position. I usually try to conduct the interviews that the value from it is not only the evaluation of the candidate but also something that the candidate can learn about Java, the profession, or just about themselves. There is a coding task that can be solved using a loop, but it lures inexperienced developers to have a solution that is recursive. Many of the developers who create the recursive solution realize that there is no exit condition in their code for some type of the input parameters. (Unless there is because they do it in the clever way. However, when they are experienced enough, they do not go for the recursive solution instead of a simple loop. So when it is a recursive solution they almost never have an exit condition.) What will happen if we run that code with an input parameter that never ends the recursive loop? We get a StackOverflowException. Under the pressure and stress of the interview, many of them craft some code that catches this exception. This is problematic. This is a trap!

Why is it a trap? Because the code will not ever throw a StackOverflowException. There is no such thing in the JDK as StackOverflowException. It is StackOverflowError. It is not an exception, and the rule is that

YOUR CODE MUST NEVER CATCH AN ERROR

The StackOverflowError (not exception) extends the class VirtualMachineError which says in the JavaDoc:

Thrown to indicate that the Java Virtual Machine is broken

When something is broken you can glue it together, mend, fix, but you can never make it unbroken. If you catch a Throwable which is also an instance of Error then the code executing in the catch part is run in a broken VM. What can happen there? Anything and the continuation of the execution may not be reliable.

Never catch an Error!

Summary and Takeaway

In this article we discussed exceptions, specifically:

how to throw meaningful exceptions by adding information when it becomes available,

how to replace the stack trace of an exception with setTrackTrace() when it makes sense,

how to collect exceptions with addSuppressed() when your application can throw exceptions multiple times We also discussed some interesting bits about how the JVM does not know about checked exceptions and why you should never catch an Error.

Don’t just (re)throw exceptions when they happen. Think about why and how they happen and handle them appropriately.

Use the information in this article to make your code exceptional 😉

(Code and article were reviewed and proofread by Mihaly Verhas. He also wrote the takeaway section including the last
sentence.)

JDK14 instanceof EA issue

4 Replies

Tagir Valeev recently had a tweet about the preview feature of the soon coming JDK14 release of Java:

#Java14 pattern matching brings name shadowing to the next level of craziness. Here I add or remove the `final` modifier for the `FLAG` field, which is accessed only in unreachable `if` branch. This actually changes the program semantics! #ProgrammingIsFun. pic.twitter.com/UToRY3mpW9

— Tagir Valeev (@tagir_valeev) December 27, 2019

https://platform.twitter.com/widgets.js

The issue is that there is a planned and in the EA release already available new feature of Java that introduces pattern variables and the current version of the proposed new standard leaves room for some really spooky coding issue.

Following the tweet, the details were discussed in detail enough to understand the actual problem. In this article, however, I will summarize what all this is about so that you do not need to dig yourself through the tweets and the standards.

What is a pattern variable

Before getting into the deep detail of the issue outlines in the tweet above, let’s discuss a bit, what a pattern variable is. (Maybe a bit sloppy, more explanatory than precise and complete, but here it comes.)

Programming many times we need to check the type of some objects. The operator instanceof does that for us. A typical sample code can be something like this:

// HOW THIS IS TODAY, JAVA < 14

Object z = "alma";
if (!(z instanceof String)){
    throw new IllegalArgumentException();
}
System.out.println(((String)z).length());

In real life, the variable z may come from somewhere else, in which case it is not so obvious that this is a string. When we want to print out the length of the string using println we already know that the object referenced by z is a string. The compiler, on the other hand, does not.We have to cast the variable to a String and then we can use the length() method. Other languages do it better. Ideally, I could write:

// HOW IT WOULD BE THE SIMPLEST

Object z = "alma";
if (!(z instanceof String)){
    throw new IllegalArgumentException();
}
System.out.println(z.length());

That is not the Java way and also that is not the way JDK14 simplifies this programming pattern. Instead, the proposed feature introduces a new syntax for the instanceof operator that introduces a new variable: a pattern variable.

To make a long story short, the above example will look the following:

// HOW IT IS IN JDK14-EA / OpenJDK (build 14-ea+28-1366)

Object z = "alma";
if (!(z instanceof String s)){
    throw new IllegalArgumentException();
}
System.out.println(s.length());

It introduces a new variable s that is in scope only when the referenced object is a String. A simpler version of the code without the exception throwing part would be

Object z = "alma";
if (z instanceof String s){
    // we have here 's' and it is a String
    System.out.println(s.length());
}

// we do not have 's' here

When the condition is true, the object is a string thus we have ‘s’. If the condition is false then we jump over the then_statement, and there we do not have ‘s’ as we do not have a string. ‘s’ is available in the code which only runs when the object is a string. That way the variable scope of a pattern variable is determined and constrained not only by the syntactical scope of the variable but also by the possible control flow. Only the control flow that can be analyzed with certainty is taken into account.

Such control-flow analysis is not unparalleled in the Java compiler. A Java program will not compile, for example, if there is an unreachable code that the compiler can detect.

So far it seems to be simple and we are all happy to get the new feature in Java 14.

The JSL14 standard

The precise scope calculation is defined in the JLS14 (Java Language Specification 14) standard. At the time of this article, the spec is only available as a preview.

http://cr.openjdk.java.net/~gbierman/jep305/jep305-20191021/specs/patterns-instanceof-jls.html#jls-6.3.2.2

As the execution flow of a Java program can be controlled by many different language-constructs the scope of a pattern variable is defined for each of these structures. There are separate sections for the different logical operators that evaluate short-circuit, ‘if’ statement, ‘while’ statement and so on. I do not want to discuss the different cases extensively. I will focus here only on the case of the ‘if’ statement without the ‘else’ part. The standard cited above says:

The following rules apply to a statement `if (e) S` (14.9.1):

* A pattern variable introduced by e when true is definitely matched at `S`.

It is a compile-time error if any pattern variable introduced by `e` when true is already in scope at `S`.

* `V` is introduced by `if (e) S` if and only if `V` is introduced by `e` when `false` and `S` cannot complete normally.

It is a compile-time error if any pattern variable introduced by the `if` statement is already in scope.

The interesting part is the “cannot complete normally”. A good example of this is our example above: we create a so-called guarding if statement. When the variable z is not a String then we throw an exception, return or do something else that will always prevent the execution to reach the code after the if statement when the variable is not a String.

In the case of a throw or return statement, it is usually very straightforward and easy to see that the code “cannot complete normally”. In case of an infinite loop, this is not always so evident.

The Problem

Let’s have a look at the following code fragment:

private static boolean FLAG = true;
static String variable = "Hello from field";

public static void main() {
    Object z = "Hello from pattern matching";
    if (!(z instanceof String variable)){
        while (FLAG) {
            System.out.println("We are in an endless loop");
        }
    }
    System.out.println(variable);
}

In this case, we have a loop, which is infinite or not. It depends on the other part of the code that may alter the value of the class field FLAG from true to false. This part of the code “can complete normally”.

If we modify the above code just a little making the field FLAG to be final, as

private static final boolean FLAG = true;
static String variable = "Hello from field";

public static void main() {
    Object z = "Hello from pattern matching";
    if (!(z instanceof String variable)){
        while (FLAG) {
            System.out.println("We are in an endless loop");
        }
    }
    System.out.println(variable);
}

then the compiler will see that the loop is infinite and cannot complete normally. The program will print out Hello from field in the first case, and it will print Hello from pattern matching. The pattern variable in the second case hides the field variable because of the scope of the pattern variable is extended to the commands following the if statement because the then-part cannot complete normally.

This is really a problem with this preview feature as it is. The readability of the code, in this case, is very questionable. The scope of the pattern variable and if it is hiding a field or not depends on the final modifier of the field, which is not there. When we look at some code the actual execution and the result of the code should be simple and should not really depend on some code that is far away and may skip our attention reading the code locally.

This is not the only situation in Java that has this anomaly. You can have a class named String for example in your codebase. The code of the classes, which are in the same package will use that class when they refer to the type String. If we delete the String class from the user code then the meaning of the String type becomes java.lang.String. The actual meaning of the code depends on some other code that is “far”.

This second example, however, is a hack and it is not likely that a Java programmer who has not lost their mind names a class String (seriously https://github.com/verhas/jScriptBasic/blob/master/src/main/java/com/scriptbasic/classification/String.java?) or some other name that also exists in the JDK in the java.lang package. Maybe it is pure luck, maybe it was well considered during the decision making to avoid the mandatory import of the classes from the java.lang package. This is history.

The variable name shadowing and the situation above is, on the other hand, does not seem to be so weird and something that surely will not accidentally happen in some Java code.

Fortunately, this is only a preview feature. It will be in the JDK14 as it is, but as a preview feature it is only available when the javac compiler and the java execution uses the --enable-preview flag and the preview feature may change in the future in an incompatible way.

Solution

I cannot tell how it will change. I cannot even tell that it will change at all. It is only my personal opinion that it would be very sad if it remained like that. With this feature, Java would be a better language so long as long we count how brilliantly and readable a seasoned Java programmer can program. But it will worse if we look at how a non-seasoned, fresh junior can fuck the code up. In my humble opinion, this second is the more important and Java has a very strong point in this. Java is not a hacker language, and you should be very desperate to write a very unreadable code. I would not like it changing.

After having said that we can look at the technical possibilities. One is to abandon the feature, which would not really be a good solution. It would not actually be a solution.

Another possibility is to limit the scope of the pattern variables to the then statement or to the else statement.

personally, I would prefer that binding variable scope would only apply to explicitly declared else blocks, and not an implicit like in this case.

— Michael Rasmussen (@jmichaelras) December 27, 2019

That way we do not rely on the “cannot complete normally” feature of the code. The else guarantees that the else branch is executed only when the condition of the if statement is false. This will make the solution less elegant.

Again another possibility is to forbid for the pattern variables to shadow any field variable. It would solve the problem outlined above but would introduce a different one. With this restriction, it could happen that an existing class with methods and pattern variable V stops compiling when we introduce a new field variable named V. At least this issue is compile-time and not some code that is buggy during run-time.

I rather have 100 compile time error than one run-time error.

Still another possibility is to abandon the pattern variable and just to use the original variable with extended type information where the current preview solution uses the pattern variable. Kotlin fans would love this solution. This would also elegantly eliminate the shadowing issue because the local variable already shadows (or does not) the field variable. The drawback of this solution is that the variable type re-scoped would have different types in different places in the code. Let’s have a look at the following code:

package javax0.jdk14.instanceof0;

public class Sample2 {
    public static class A {
        public static void m(){
            System.out.println("A");
        }
    }
    public static class B extends A {
        public static void m(){
            System.out.println("B");
        }
    }
    public static void main(String[] args) {
        A a = new B();
        if( a instanceof B b){
            b.m();
        }
        a.m();
    }
}

This code will print out B then A because the call to b.m() is the same as B.m() based on the declared type of the variable b and the same way a.m() is the same as A.m() based on the declared type of the variable a. Omitting the pattern variable and using the original variable could make confusion:

// NOT ACTUAL CODE
    public static void main(String[] args) {
        A a = new B();
        if( a instanceof B){
            a.m();
        }
        a.m();
    }

Would a.m() call different methods on different lines?

As you can see there is no known good or best solution to this issue… except one. Call your representative in the JDK “parliament” and tell them that it is not good that way. (Psst: they already know it from the original tweet.)

Takeaway

This is a special article because this is not about some well established Java feature or some good programming tool or style, pattern, methodology. We discussed a preview feature. A preview feature that, perhaps, proves why we need preview features in Java.

Use the latest LTS version for long-running commercial projects that will need long term support from you.

Use the latest released Java version for your experiments and opensource projects and be prepared to support older Java versions if the users need it.

Do not use the preview features in your projects or be prepared to have a new release from your code in case they change in the next Java releases when they become non-preview but normal features.

Experiment with the pre-view features to embrace them and to have a kind of muscle memory when they become real features. And also to give feedback to the Java community in case you feel they are not really perfect.

Be part of the community!

Repeated code

Leave a reply

Introduction

It is usually not good to have copy/paste code in our Java application but sometimes it is unavoidable. For example the project License3j provides a method isXXX in the Feature class for each XXX type it supports. In that case, we can do no better than write

    public boolean isBinary() {
        return type == Type.BINARY;
    }

    public boolean isString() {
        return type == Type.STRING;
    }

    public boolean isByte() {
        return type == Type.BYTE;
    }

    public boolean isShort() {
        return type == Type.SHORT;
    }

and so on

for each and every feature type the application supports. And there are some types there: Binary, String, Byte, Short, Int, Long, Float, Double, BigInteger, BigDecimal, Date, UUID. It is not only a boring task to type all the very similar methods, but it is also error-prone. A very few humans are good at doing such a repetitive task. To avoid that we can use the Java::Geci framework and as the simplest solution we can use the generator Iterate.

POM dependency

To use the generator we have to add the dependency

<dependency>
    <groupId>com.javax0.geci</groupId>
    <artifactId>javageci-core</artifactId>
    <scope>test</scope>
    <version>1.4.0</version>
</dependency>

The library is executed only during when the tests run, therefore the use of it does not imply any extra dependency. Whoever wants to use the library License3j does not need to use Java::Geci. This is only a development tool used in test scope.

Unit Test running it

The dependency will not run by itself. After all the dependency is not a program. It is a bunch of class files packaged into a JAR to be available on the classpath. We have to execute the generator and it has to be done through the framework creating a unit test:

    @Test
    @DisplayName("run Iterate on the sources")
    void runIterate() throws IOException {
        Geci geci = new Geci();
        Assertions.assertFalse(
            geci.register(Iterate.builder()
                              .define(ctx -> ctx.segment().param("TYPE", ctx.segment().getParam("Type").orElse("").toUpperCase()))
                              .build())
                .generate()
            , geci.failed()
        );
    }

It creates a Geci object, instantiates the generator using a builder and then invokes generate() on the configured framework Geci object. The define() call seems a bit cryptic as for now. We will shed light on that later.

Source Code Preparation

The final step before executing the build is to define a template and the values to insert into the template. Instead of writing all the methods all we have to do is to write a template and an editor fold segment:

    /* TEMPLATE
    LOOP Type=Binary|String|Byte|Short|Int|Long|Float|Double|BigInteger|BigDecimal|Date|UUID
    public boolean is{{Type}}() {
        return type == Type.{{TYPE}};
    }

     */
    //<editor-fold id="iterate">
    //</editor-fold>

When we execute the generator through the framework it will evaluate the template for each value of the placeholder Type and it will replace each {{Type}} with the actual value. The resulting code will be inserted into the editor-fold segment with the id “iterate”.

Looking at the template you can see that there is a placeholder {{TYPE}}, which is not defined in the list. This is where the unite test define() comes into the picture. It defines a consumer that consumes a context and using that context it reads the actual value of Type, creates the uppercased version of the value and assigns it to the segment parameter named TYPE.

Generally, that is it. There are other functionalities using the generator, like defining multiple values per iteration assigned to different placeholders, escaping or skipping lines and so on. About those here is an excerpt from the documentation that you can read up-to-date and full az https://github.com/verhas/javageci/blob/master/ITERATE.adoc

Documentation Excerpt

In the Java source files where you want to use the generator you have to annotate the class with the annotation @Geci("iterate"). You can also use the @Iterate annotation instead, which is defined in the javageci-core-annotations module. This will instruct the Geci framework that you want to use the iterate generator in the given class.

`TEMPLATE`

A template starts after a line that is /\*TEMPLATE or TEMPLATE. There can be spaces before and after and between the /* and the word TEMPLATE but there should not be anything else on the line. When the generator sees such a line it starts to collect the following lines as the content of the template.

The end of the template is signaled by a line that has */ on it and nothing else (except spaces).

The content of the template can contain parameters between {{ and }} characters similarly as it is used by the mustache template program. (The generator is not using mustache, template handling is simpler.)

`LOOP`

While collecting the lines of the template some of the lines are recognized as parameter definitions for the template. These lines do not get into the trunk of the template. (Command names on these lines are always capital.)

As you could see in the introduction the line

    LOOP type=int|long|short

is not part of the template text. It instructs the generator to iterate through the types and set the parameter {{type}} in the text to int first, long the second and short the last. That way you can iterate over multiple values of a single parameter.

A more complex template may need more than one parameter. In that case, you can list them in the LOOP line as

    LOOP type,var=int,aInt|long,aLong|short,aShort

This will tell the generator to set the parameter {{type}} the same way as before for the three iterations but the same time also set the parameter {{var}} to aInt in the first loop, to aLong in the second loop and aShort in the last loop.

If the list of the values is too long it is possible to split the list into multiple LOOP lines. In this case, however, the variables have to be repeated in the second, third and so on LOOP lines. Their order may be different, but if there is a variable undefined in some of the LOOP lines then the placeholder referring to it will be resolved and remains in the {{placeholder}} form.

The above example can also be written

    LOOP type,var=int,aInt
    LOOP var,type=aLong,long
    LOOP type,var=short,aShort

and it will result in the same values as the above LOOP repeated here:

    LOOP type,var=int,aInt|long,aLong|short,aShort

Default `editor-fold`

The templates are processed from the start of the file towards the end and the code generated is also prepared in this order. The content of the generated code will be inserted into the editor-fold segment that follows the template directly. Although this way the id of the
editor-fold segment is not really interesting you must specify a unique id for each segment. This is a restriction of the the Java::Geci framework.

Advanced Use

`EDITOR-FOLD-ID`

It may happen that you have multiple templates looping over different values and you want the result to go into the same editor-fold segment. It is possible using the EDITOR_FOLD_ID.

In the following example

package javax0.geci.iterate.sutclasses;

public class IterateOverMultipleValues {
    /* TEMPLATE
    {{type}} get_{{type}}Value(){
      {{type}} {{variable}} = 0;
      return {{variable}};
    }

    LOOP type,variable=int,i|long,l|short,s
    EDITOR-FOLD-ID getters
     */
    //
            // nothing gets here
    //

    //
    int get_intValue(){
      int i = 0;
      return i;
    }

    long get_longValue(){
      long l = 0;
      return l;
    }

    short get_shortValue(){
      short s = 0;
      return s;
    }

    //
}

the generated code gets into the editor-fold that has the id name getters even though this is not the one that follows the template definition.

Use this feature to send the generated code into a single segment from multiple iterating templates. Usually, it is a good practice to keep the template and the segment together.

`ESCAPE` and `SKIP`

The end of the template is signaled by a line that is */. This is essentially the end of a comment. What happens if you want to include a comment, like a JavaDoc into the template. You can write the */ characters at the end of the comment lines that still have some characters in it. This solution is not elegant and it essentially is a workaround.

To have a line that is exactly a comment closing or just any line that would be interpreted by the template processing, like a LOOP line you should have a line containing nothing else but an ESCAPE on the previous line. This will tell the template processing to include the next line into the template text and continue the normal processing on the line after.

Similarly, you can have a line SKIP to ignore the following line altogether. Using these two commands you can include anything into a template.

An example shows how you can include a JavaDoc comment into the template:

package javax0.geci.iterate.sutclasses;

public class SkippedLines {
    /* TEMPLATE
    /**
     * A simple zero getter serving as a test example
     * @return zero in the type {{type}}
    ESCAPE
     */
    // SKIP
    /*
    {{type}} get_{{type}}Value(){
      {{type}} {{variable}} = 0;
      return {{variable}};
    }
    LOOP type,variable=int,i|long,l|short,s
    EDITOR-FOLD-ID getters
     */
    //
    /**
     * A simple zero getter serving as a test example
     * @return zero in the type int
     */
    int get_intValue(){
      int i = 0;
      return i;
    }
    /**
     * A simple zero getter serving as a test example
     * @return zero in the type long
     */
    long get_longValue(){
      long l = 0;
      return l;
    }
    /**
     * A simple zero getter serving as a test example
     * @return zero in the type short
     */
    short get_shortValue(){
      short s = 0;
      return s;
    }
    //
}

The template starts with the comment and a comment can actually contain any other comment starting. Java comments are not nested. The end of the template is, however the line that contains the */ string. We want this line to be part of the template thus we precede it with the line ESCAPE so it will not be interpreted as the end of the template. On the other hand, for Java, this ends the comment. To continue the template we have to get “back” into comment mode since we do not want the Java compiler to process the template as code. (Last but not least because the template using placeholders is probably not a syntactically correct Java code fragment.) We need a new /* line, which we do not want to get into the template. This line is, therefore, preceded with a line containing // SKIP. (Skip lines can have optional // before the command.)

The result you can see in the generated code. All methods have the proper JavaDoc documentation.

`SEP1` and `SEP2`

Looping over the values you have to separate the names of the placeholders with , and | the list of the values. For example, the sample above contains

    LOOP type,variable=int,i|long,l|short,s

two placeholder names type and variable and three values for each. Placeholders do not need to contain special characters and it is the best if they are standard identifiers. The values, however, may contain a comma or a vertical bar. In that case, you can redefine the string (not only a single character) that the template LOOP command can use instead of the single character strings , and |.

For example the line

    SEP1 /

says that the names and the values should be separated by / instead of only one and

    SEP2 &

the list of the values should be separated by one character & string. The SEP1 and SEP2 will have effect only if they precede the LOOP command and they are effective only for the template they are used in. Following the above commands, the LOOP example would look like

    LOOP type/variable=int/i&long/l&short/s

That way there is nothing to prevent us to add another value list

    LOOP type/variable=int/i&long/l&short/s&byte,int/z

which eventually will result in a syntax error with the example template, but demonstrates the point redefining the name and the value list separators.

Configuration

The generator is implemented the configuration tools supported by the Geci framework and all the parameters are configurable. You can redefine the regular expressions that match the template start, end, skip and so on lines in the unit test where the generator object is created, in the annotation of the class or in the editor-fold parameters.

Takeaway

The iterate generator is an extremely easy to use generator to create code that is repetitive. This is also the major danger: you should be strong enough to find a better solution and use it only when it is the best solution.

About Peter Verhas

Docs as Code

it is the first step to the right direction

What docs-as-code Ignores

The next step

Tools

Conclusion

Introduction

final and effectively final

Life of a Lambda

Closure and Groovy

Lambda in Java

Closure in Java

Summary

What is the purpose of conducting interviews?

What is a good interview?

Do we judge the candidate?

Work with the Candidate

Coding Exercise

Giving Feedback

Summary and Fun Story

Introduction

My Special Testing Need

How to Recognize it is IntelliJ?

Separation of Concern

Notes

Summary

Introduction

What can be automated

DRY in Documentation

Information Transclusion

Internal Consistency

External Consistency

Document Writing is Programming

The Tool: Java Macro Language

Snippet Handling, Transclude

File, Directory, Class, Method => Consistency

Updating the Input

Creating Diagrams

Summary and Takeaway

Introduction

Test or not to Test Private Methods

Doing it with a Style

Doing it Automated

Takeaway

Do not install, just run!

Jamal as Example

Downloading and Caching the Program

Executing the macro processor

Extra details

Summary

Introduction

Sample Application

Version 1, throw and catch

Version 2, setting cause

Version 3, setting the stack trace

Version 4, suppressing exceptions

Other things to know about exceptions

There is no such thing as checked exception in the JVM

Never catch Throwable, ...Error or COVID

Summary and Takeaway

What is a pattern variable

The JSL14 standard

The Problem

Solution

Takeaway

Introduction

POM dependency

Unit Test running it

Source Code Preparation

Documentation Excerpt

TEMPLATE

LOOP

Default editor-fold

Advanced Use

EDITOR-FOLD-ID

ESCAPE and SKIP

SEP1 and SEP2

Configuration

Takeaway

`final` and effectively final

Never catch `Throwable`, `...Error` or `COVID`

`TEMPLATE`

`LOOP`

Default `editor-fold`

`EDITOR-FOLD-ID`

`ESCAPE` and `SKIP`

`SEP1` and `SEP2`