Tony Marston's Blog About software development, PHP and OOP

The Fallacy Of ReUse

Posted on 20th August 2024 by Tony Marston
Introduction
The Fallacy of ReUse
Summary
References
Comments

Introduction

I have been immersed in the world of Object Oriented Programming for over 20 years now, and I am amazed at the number of misleading opinions which are still being circulated. Some people simply echo what they have been taught, while others offer their own interpretation of what they have been taught, but which turns out to be a misinterpretation due either to a lack of understanding or to confusion about the terminology being used. You can see some examples of this confusion and misunderstanding in the articles listed in my References section.


The Fallacy of ReUse

I recently came across an article called The Fallacy Of ReUse by Udi Dahan which, in my personal opinion, is based on a series of misunderstandings and misinterpretations. In the sections below I shall highlight some of his dubious statements and explain why I think he is wrong.

  1. This industry is pre-occupied with reuse.

    There's this belief that if we just reused more code, everything would be better.

    Some even go so far as saying that the whole point of object-orientation was reuse - it wasn't, encapsulation was the big thing.
    I disagree. Encapsulation was not the only thing that OOP brought to the world, as on its own it does not have much value. Its true value appears when it is coupled with Inheritance and Polymorphism. Encapsulation enables you to create classes containing both state (properties or variables) and behaviour (methods or operations) which can be instantiated into objects. Inheritance allows the contents of one class to be shared by a subclass with a single extends keyword. Polymorphism exists when several classes contain the same method signatures, as the code which calls one of those methods can then be reused, via the technique known as Dependency Injection, with an object created from any one of those classes. These three features together provide ways to reuse code that are not available in non-OO languages. This is emphasised in the following definition of OOP, which is the only one that I find useful:
    Object Oriented Programming is programming which is oriented around objects, thus taking advantage of Encapsulation, Inheritance and Polymorphism to increase code reuse and decrease code maintenance.
    This sentiment is echoed in the paper Designing Reusable Classes, which was published in 1988 by Ralph E. Johnson & Brian Foote, where it states the following:
    Since a major motivation for object-oriented programming is software reuse, this paper describes how classes are developed so that they will be reusable.
    You clearly do not understand that reusable software is better than the alternative - duplicated code appearing in multiple places inside your software. Duplication means that if you need to change that code, either to fix a bug or make an enhancement, then you have to make the same change to every copy of that code. The Don't Repeat Yourself (DRY) principle states that it is better to replace all those duplicated copies with a central reusable version, as you then only need to change that one central version, and every place which references it picks up the change automatically.
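    The three features described above can be sketched in a few lines of PHP. This is a minimal illustration with hypothetical class names, not code taken from any real framework:

```php
<?php
// Minimal sketch of Encapsulation, Inheritance and Polymorphism.
// All class names here are invented for this illustration.

// Encapsulation: state and behaviour bundled together in one class.
abstract class DatabaseTable
{
    public function __construct(protected string $tableName) {}

    // behaviour inherited by every subclass
    public function getData(): string
    {
        return "SELECT * FROM {$this->tableName}";
    }
}

// Inheritance: each subclass reuses the parent's code with a single "extends".
class Customer extends DatabaseTable
{
    public function __construct() { parent::__construct('customer'); }
}

class Product extends DatabaseTable
{
    public function __construct() { parent::__construct('product'); }
}

// Polymorphism: this code is reusable with an object created from ANY
// subclass, supplied via Dependency Injection.
function listContents(DatabaseTable $table): string
{
    return $table->getData();
}

echo listContents(new Customer()) . "\n";  // SELECT * FROM customer
echo listContents(new Product()) . "\n";   // SELECT * FROM product
```

    The point is that listContents() never needs to be touched when a new table class is added; the new class simply plugs in.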
  2. Entire books of patterns have been written on how to achieve reuse with the orientation of the day. Services have been classified every which way in trying to achieve this, from entity services and activity services, through process services and orchestration services. Composing services has been touted as the key to reusing, and creating reusable services.

    I'm afraid that there is no such thing as an entity service - an object is either an entity or a service. The distinction between the two is explained in When to inject: the distinction between newables and injectables and How to write testable code.
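    As a rough sketch of that distinction (with class names invented for the example): an entity is a "newable" which you create with new wherever you need it, while a service is an "injectable" which is supplied from outside:

```php
<?php
// Hypothetical sketch of the newable/injectable distinction referred to above.

// A service (injectable): has no identity of its own and is supplied from outside.
class EmailService
{
    public function send(string $to, string $body): string
    {
        return "sending to $to: $body";   // real code would talk to an SMTP server
    }
}

// An entity (newable): holds state with identity and is created with "new".
class Invoice
{
    public function __construct(
        private string $customerEmail,
        private string $amount
    ) {}

    // the service is injected into the entity, never the other way around
    public function sendTo(EmailService $mailer): string
    {
        return $mailer->send($this->customerEmail, "Amount due: {$this->amount}");
    }
}

$invoice = new Invoice('fred@example.com', '99.95');   // newable
echo $invoice->sendTo(new EmailService()) . "\n";      // injectable
```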

  3. Reuse is a fallacy

    the actual goal of reuse was: getting done faster.

    And here's how reuse fits in to the picture:
    • If we were to write all the code of a system, we'd write a certain amount of code.
    • If we could reuse some code from somewhere else that was written before, we could write less code.
    • The more code we can reuse, the less code we write.
    • The less code we write, the sooner we'll be done!
    I agree. The more code you can reuse then the less code you have to write and the quicker you'll get the job done. Duplicating the same block of code over and over again takes more time than writing a single call to a central version of that code.
  4. Fallacy: All code takes the same amount of time to write
    I disagree. Programmers do not write code at the same speed as a typist working from a dictaphone. There is a lot of thinking involved, mainly about how to translate the human-readable specification into machine-readable instructions, and preferably the most efficient machine-readable instructions. Simply writing down the first thing that comes into your head may be neither the most efficient nor the most effective approach. The code has to be arranged into a proper structure for ease of maintenance.
  5. Fallacy: Writing code is the primary activity in getting a system done
    Writing code is what happens after the analysis and design phases and before the testing, training, documentation and release phases. It is just one activity among many, and just as vital as all the others, so if it is not done, or done badly, then the whole exercise will be an expensive waste of time.
  6. Writing code is actually the least of our worries. We actually spend less time writing code than ...
    Rebugging code.
    Also known as bug regressions.
    This is where we fix one piece of code, and in the process break another piece of code.
    Reducing duplicated code to a single source which can be reused many times is the essence of the DRY principle and is far superior to the alternative. If you take a block of duplicated code and put it into a reusable module which turns out to have bugs then either it was buggy to begin with, in which case the same bugs will exist in every duplicated copy, or you introduced a bug in your reusable module. That is down to poor programming, not the act of creating a reusable module.
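    The DRY refactoring described above can be illustrated with a trivial PHP sketch (the function name is invented for the example):

```php
<?php
// Sketch of the DRY principle: replace N duplicated copies of a piece of
// logic with one central, reusable version.

// BEFORE: the same formatting logic was copy-pasted into every screen.
// AFTER: one central version, called from every place that needs it.
function formatCurrency(float $amount): string
{
    return '£' . number_format($amount, 2);
}

// A bug fix (e.g. wrong number of decimal places) now needs changing in
// exactly one place, and every caller picks it up automatically.
echo formatCurrency(1234.5) . "\n";   // £1,234.50
```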
  7. It's not like we do it on purpose. It's all those dependencies between the various bits of code. The more dependencies there are, the more likely something's gonna break.

    Dependencies multiply by reuse

    It's to be expected. If you wrote the code all in one place, there are no dependencies. By reusing code, you've created a dependency. The more you reuse, the more dependencies you have. The more dependencies, the more rebugging.
    When writing a large enterprise application which has to deal with hundreds of transactions, hundreds of database tables and hundreds of screens, none of today's professional programmers would write monolithic single-tier programs. A programmer would be regarded as inept if he did not utilise some sort of layered architecture such as the 3-Tier Architecture or the Model-View-Controller design pattern, as shown in Figure 1 below. Note that each of the boxes in the diagram is a hyperlink which will take you to a detailed description of that component.

    Figure 1 - MVC and 3 Tier Architecture combined

    [Diagram: the Model, View, Controller and Data Access Object components mapped onto the Presentation, Business and Data Access layers]

    Each module has a single responsibility, thus following the principle of cohesion. This means that you will always have calls from one module to another, which means you will always have coupling and dependencies. The aim is to produce software which exhibits high cohesion and low/loose coupling. Each user transaction will require one of each of the four modules shown above, and it is possible to create unique versions of each without a single line of reusable code. However, a skilled programmer would never do such a thing, as part of his skill lies in identifying repeating patterns, where code is duplicated, and replacing all that duplicated code with references to a central shared version. Once the shared module has been written, thoroughly tested and debugged, it should be able to be reused any number of times without any problems. The fact that a piece of code is reusable does not increase the likelihood of bugs suddenly appearing. That is usually down to giving it either bad or unexpected data.
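    The four modules in Figure 1 can be sketched as follows. This is a deliberately minimal illustration of high cohesion and loose coupling, with stubbed-out classes that are not taken from any real framework:

```php
<?php
// Minimal sketch of the four reusable modules in Figure 1: each has a
// single responsibility, and they are coupled only by method calls.

class DAO                      // Data Access layer: builds and runs the SQL
{
    public function select(string $table): array
    {
        return [['id' => 1, 'name' => 'example row from ' . $table]];  // stub
    }
}

class Model                    // Business layer: owns the business rules
{
    public function __construct(private string $table, private DAO $dao) {}

    public function getData(): array
    {
        return $this->dao->select($this->table);
    }
}

class View                     // Presentation layer: turns data into output
{
    public function render(array $rows): string
    {
        return json_encode($rows);
    }
}

class Controller               // Presentation layer: handles the user request
{
    public function run(Model $model, View $view): string
    {
        return $view->render($model->getData());
    }
}

// Every user transaction reuses the same generic modules;
// only the table name (and any business rules) differs.
echo (new Controller())->run(new Model('customer', new DAO()), new View()) . "\n";
```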

  8. The value of (re)use

    If we were to (re)use a piece of code in only one part of our system, it would be safe to say that we would get less value than if we could (re)use it in more places.

    So, what characterizes the code we use in many places? Well, it's very generic. Actually, the more generic a piece of code, the less likely it is that we'll be changing something in it when fixing a bug in the system.

    However, when looking at the kind of code we reuse, and the reasons around it, we tend to see very non-generic code - something that deals with the domain-specific behaviors of the system. Thus, the likelihood of a bug fix needing to touch that code is higher than in the generic/use-not-reuse case, often much higher.
    A block of code does not increase in value just because you reuse it in more places. The value is the time saved in NOT having to write code that has already been written somewhere else. There may only be a subtle difference between those two viewpoints, but there is a difference.

    I agree that in a software system there are basically only two categories of code - either generic (boilerplate) code or unique business logic. It is only the business logic which has value to the end user, the paying customer, as the generic code has no purpose other than to find a path to where the business logic is located so that it can be processed. This means that it is much more likely that the generic code can be placed in reusable components as writing the same code over and over again would not be very efficient. Higher productivity can only be achieved by reducing the amount of generic/boilerplate code that has to be written.

    In the structure diagram shown in Figure 1 it should be obvious that the business logic should only exist in one place - the business layer. This means that the components in the other layers should consist of generic/boilerplate code and therefore ripe for large amounts of reusability.

  9. This doesn't mean you shouldn't use generic code / frameworks where applicable - absolutely, you should. Just watch the number and kind of dependencies you introduce.

    While reducing the number of dependencies is a good idea - a dependency being where one module is coupled to another by virtue of a call from one to the other - an equally important factor is the strength of that coupling.

  10. So, if we follow the above advice with services, we wouldn't want domain specific services reusing each other. If we could get away with it, we probably wouldn't even want them using each other either.
    Any object which is domain specific must exist in the business/domain layer, therefore it is an entity, not a service. I do not agree that a domain object should not call a method on another domain object. In a database application, especially an enterprise application, each table in the database is a separate entity with its own structure and business rules, therefore it should have its own concrete class with all shared code inherited from an abstract class. It is perfectly normal for a user transaction to access more than one table, and each of these accesses should be performed through the object which is responsible for that table.
  11. As use and reuse go down, we can see that service autonomy goes up. And vice-versa.
    Use and reuse have nothing to do with service autonomy, which is described in Wikipedia as follows:
    Service autonomy is a design principle that is applied within the service-orientation design paradigm, to provide services with improved independence from their execution environments.
    Whether or not a module is independent from its execution environment has no bearing on whether its code is shared or non-shared.
  12. Luckily, we have service interaction mechanisms from Event-Driven Architecture that enable use without breaking autonomy.
    According to this Wikipedia article an event is nothing more than "a significant change in state". In every application that I have written an object does not change its state unless it has received a request to do so. In a web application that is an HTTP request, which can be either a GET or a POST. In a database application that change in state can only be in response to a request to perform an insert, update or delete operation. Provided that the code for each of those operations has been properly written, and regardless of whether that code is reused or not, the operation should be successful regardless of where the request originated.

Summary

The idea that the volume of reusability in your codebase is directly proportional to the volume of bugs is the opposite of what I have experienced in my career in software, which has spanned four decades. If you take a block of code that was duplicated in many places, put it into a reusable module, and bugs appear, then that, in my mind, can only mean one of two things - either the bug was there to begin with, or you added it yourself with your clumsy coding skills. Any junior programmer can write code that performs lots of functions without realising that some of those functions have already been coded somewhere else. In that case it is the team leader's failure to provide a well-documented repository of these common functions which is to blame.

Even worse than not being aware that other programmers have written functions that you could reuse is you yourself writing blocks of code which appear time and time again in other programs that you write. When I found myself in this situation I could not add this code to a central library of reusable functions, as the team leader did not maintain, or even know how to maintain, such a library. Instead I found myself copying blocks of code from one program's source file to another. It got to the point where, instead of copying in several small blocks of common code, I would start by copying an entire program and just work on the code that was different. This is when I began to structure my code to separate the similar from the different so that I could more easily isolate the blocks of code that needed to be changed. When I became team leader I found out how to create a library of reusable components that I could share with my team, and later with other teams. In 1985 I extended this library into a framework which controlled the running of different applications by building my first Menu and Security System (now called a Role Based Access Control (RBAC) system).

I found out many years later when I came across a paper called Designing Reusable Classes, which was published in 1988 by Ralph E. Johnson & Brian Foote, that this process of separating the similar from the different is called abstraction, and leads to a style of software development known as programming-by-difference. The main point of the paper was that simply using encapsulation, inheritance and polymorphism in your software is not good enough to guarantee that it will contain lots of reusable components, or as it says:

Object-oriented programming is not a panacea. Program components must be designed for reusability. There is a set of design techniques that makes object-oriented software more reusable.

Before I rewrote my development framework in PHP I built a small sample application to demonstrate (to myself at first) that I could write effective software in PHP using this new-fangled buzzword called object oriented programming. I did not go on any training courses; instead I read the PHP Manual, found some code examples on the interweb thingy and in some books, combined this with my decades of previous programming experience, and simply followed my nose and wrote code that worked, then refactored it until it worked better. By "better" I mean replacing large swathes of generic/boilerplate code with reusable modules that can be called many times instead of being duplicated many times. This meant that I had only one central version of that code, which was easier to debug, and that bug-free version was then instantly available wherever it was referenced.

I started off by creating a separate class for each database table to carry out the CRUD operations that could be performed on that table. After creating my second class I moved all the duplicated code into an abstract table class so that it could be shared using inheritance. When I found that I needed to insert some custom code into some concrete table classes, my use of an abstract class with its existing set of fixed/invariant methods made this easy: I could implement the Template Method Pattern by adding some customisable "hook" methods which do nothing unless they are given implementations in a concrete subclass. This means that in every concrete table class the generic/boilerplate code is inherited from the abstract class, so the only code required is that for the unique business rules.
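A minimal sketch of the Template Method pattern as just described, with invented class and method names (this is not RADICORE's actual API):

```php
<?php
// Template Method pattern: the abstract class fixes the sequence of steps,
// and "hook" methods default to doing nothing until a subclass overrides them.

abstract class AbstractTable
{
    // the invariant "template method": its sequence of steps never changes
    public function insertRecord(array $row): array
    {
        $row = $this->validatePrimary($row);  // fixed, generic step
        $row = $this->validateCustom($row);   // customisable "hook"
        return $row;                          // real code would now run the INSERT
    }

    protected function validatePrimary(array $row): array
    {
        // generic validation shared by every table class: drop null columns
        return array_filter($row, fn($v) => $v !== null);
    }

    // hook method: does nothing unless a subclass supplies an implementation
    protected function validateCustom(array $row): array
    {
        return $row;
    }
}

class Customer extends AbstractTable
{
    // the ONLY code in the concrete class is its unique business rule
    protected function validateCustom(array $row): array
    {
        $row['name'] = strtoupper($row['name']);
        return $row;
    }
}

$result = (new Customer())->insertRecord(['name' => 'fred', 'phone' => null]);
// $result is ['name' => 'FRED']: the null column was removed by the generic
// step, and the name was upper-cased by the hook
```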

When you look at all the components in the structure diagram in Figure 1 above, and consider that thanks to the RADICORE framework every user transaction (use case) with basic functionality can be generated at the touch of a few buttons without writing any code whatsoever - no PHP, no HTML and no SQL - the amount of generic/boilerplate code which is pre-built and reusable is as follows:

The overall effect of all this reusability is that my levels of productivity have increased by a huge amount. If you take the creation of a standard family of forms as an example, this is how long it took in each of my main development languages:

If you think you can do better then I suggest you take this challenge.


References

