Use copy paste programming!
January 14, 2015
Posted by on
Copy paste is bad
We hate copy paste. Why? Because the result code is unmaintainable. I get a bug reported from QA, I analyze the code, look at logs, debug, drink a lot of coffee and finally I get to the code that is the root cause of the bug. I fix it, test the use case, release new code to learn next day that a very similar bug appears in a similar use case. In that case a different code runs looking very similar to the one I was mending the day before and I just start to wonder how many more copies of the same code will I face and have to change.
There are things worse than copy paste
Now imagine a schnitt, like in movies. Let’s jump to another time. I get a bug reported from QA, I analyze the code. I do not understand. There are a lot of small interfaces, abstract classes, deep hierarchies. Many of the classes have no direct relation to business. I ask for help from the developer who created the code, and he starts to explain. In two days I start to understand his idea of coding structure and how he was implementing the code in a strict object oriented way, avoiding the slightest copy/paste. After three days I find the place where a single line has to be modified. Before doing that I plan to create the unit test to fail first, then fix the code and re-run the unit test to see that the same error never comes again. So I open the unit test class testing the one I am going to modify and I do not understand how it works. It is complex and extends another class that uses another and another. This time it is a bit easier to understand since I already know the mindset of the programmer, who created it but it is still a whole day to create the new test. We are already in day #4 following the bug report customers banging on the door for fix.
To copy or not to copy…
Which is the better approach? Have some copy paste and face that some of the bugs will just appear in other areas or have an extremely strict but deep hierarchy OO design in the code that avoids bug reappearing but has a steep learning curve?
There is no one and only one best ever answer for this question. Neither of them is a good approach. Sometimes some copy paste may be forgivable sin. Deep inheritance structure is difficult to understand. It is usually recommended not to have more than three levels. One may also argue that in the above example the code could have been created with less levels of inheritance without actual copy paste. (Except that the above is not an actual but an imagined example distilled from many years of experience.) The level of repetition may conflict with OO structure. When you have the OO structure you make abstraction. Abstract code is harder to understand. When you copy-paste-modify the modified code will be on the same abstraction level as the one you copied. It may be easier to understand.
Copy unit test code
When it is about unit tests, I tend to forgive copy paste and verbosity to get simpler structures and readability. But that is because unit tests are more documentation than code. They have to be expressive as you look at it immediately. It should not require investigation and understanding of code structures defined somewhere else to get the point what the test is doing. I tend to agree with a unit test that copies one test and then contains slightly modified code. It still has one of the drawbacks on maintenance: if you change the code the change has to be propagated to all other places as well where the code was copied. But in this case if you forget to propagate the change you will get test errors or failures. This way you can see copy paste as an advantage: when you change the code you are forced to look at, refactor and think through all the test cases that are affected.
Do not copy production code
These effects that turn drawbacks to advantages in case unit test code may become a nightmare in production code. If you have doubt then do not copy. Do not be afraid to create too steep hierarchy structure. Programmers fall more often into the copy-paste trap than into steep hierarchies. Unless you are a senior programmer I recommend that you avoid copy-paste in production code at all costs as a rule of thumb. If you are senior you do not need my recommendation: you will avoid copy-paste on your own.
Just a story: Some time ago I wrote some mail about copy-paste in a code and I created a typo writing copy-pasta. In a few minutes I got a reply: “Pasta? You are referring to spaghetti code?” Nomen est omen.