No sooner have I commented on Back to Basics - Three or Four OOP Pillars?, which erroneously states that data abstraction is the fourth pillar of OOP (which it is not) and that data abstraction means using Data Transfer Objects (DTO) (which it does not), than I come across other articles which encourage programmers to use DTOs. These erroneous articles are:
A DTO is an object that holds state but no behaviour, which is why it has been described as an anemic domain model, an anti-pattern which is described by Martin Fowler as follows:
The fundamental horror of this anti-pattern is that it's so contrary to the basic idea of object-oriented designing; which is to combine data and process together. The anemic domain model is just a procedural style design, exactly the kind of thing that object bigots like me ... have been fighting since our early days in Smalltalk. What's worse, many people think that anemic objects are real objects, and thus completely miss the point of what object-oriented design is all about.
I do not use DTOs in my multi-award winning framework for the simple reason that Encapsulation was described as "The act of placing data and the operations that perform on that data in the same class."
The notion that I should split this up and put behaviour in one class and data in another class never entered my head. If it had I would have dismissed it immediately as a stupid idea.
When I built my RADICORE framework I started by basing it on the 3-Tier Architecture which I had encountered in a previous language. This has a Presentation layer which deals with HTML, a Data Access layer which deals with SQL, and a Business/Domain layer which deals with the business rules. The first thing that I noticed with PHP was that it did not use predefined structs (also known as composite data types, records or data buffers); it used untyped arrays. I noticed straight away that regardless of what data was submitted on an HTML form it appeared in a single $_POST array, and regardless of what data was being read from the database it appeared as another untyped array. Having a logical mind and being a pragmatist I asked myself a simple question: if data in the two outer layers appears as an array, is there any reason why I cannot leave it in that array when it passes through the business/domain layer?
The answer was "No", so I decided not to waste any time (or keystrokes) in transforming that array into another format so that it could be processed by my domain (Model) objects. I have been using this method for over 20 years, and it has proved to be both simple and effective.
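As a rough illustration of the idea (a sketch only, not code taken from RADICORE; the Customer class, its insertRecord() signature and the field names are hypothetical), an array that looks like $_POST can be handed to a domain object and come back out still as an array:

```php
<?php
// Hypothetical names throughout: Customer, insertRecord() and the field
// names are illustrative assumptions, not taken from the framework source.

class Customer
{
    /** @var array the record currently held by this object */
    private array $fieldarray = [];

    // The business layer receives the array exactly as the presentation
    // layer supplied it; no conversion into another structure takes place.
    public function insertRecord(array $fieldarray): array
    {
        // A business rule applied directly to an array element.
        $fieldarray['name'] = trim($fieldarray['name'] ?? '');
        $this->fieldarray = $fieldarray;

        // The same array is handed back, ready for the data access layer
        // or for the view.
        return $this->fieldarray;
    }
}

// Stand-in for $_POST: every value arrives as a string.
$post = ['name' => '  Alice  ', 'credit_limit' => '500'];

$customer = new Customer();
$result = $customer->insertRecord($post);
```

The point of the sketch is that the data enters and leaves the domain object in the same shape, so no mapping code is needed on either side.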
I shall now step through those articles and argue against every claim that DTOs are a Good Thing ™.
The first article under the microscope is Transform Data into Type-safe DTOs with this PHP Package.
This PHP Data Model package provides a lightweight, non-invasive way to hydrate type-safe PHP objects recursively. It uses reflection and PHP attributes to hydrate objects and instantiate them based on type hints:
Whoa! Anybody who says that it is necessary to put data into a type-safe PHP object is not singing from the same hymn sheet as the designers of PHP. Data appears in untyped arrays, and the idea of strict typing is alien to PHP, as described in RFC: Strict and weak parameter type checking, where it says:
PHP's type system was designed from the ground up so that scalars auto-convert depending on the context.
The idea of wasting time and effort in doing something which is totally unnecessary is also a violation of YAGNI. It is only something that a dogmatist would do. A pragmatist wouldn't touch it with a barge pole.
The main features of this argument are as follows:
Every statement made here is irrelevant and unnecessary. There is no boilerplate code associated with values in an array, the data is just data. Passing data around in untyped arrays is as easy as falling off a log. I have been doing it this way for over 20 years without any problems. All you are doing is providing solutions to problems which don't exist, problems which are figments of your imagination. In my framework data is transformed or validated using standard pre-written components. It is processed by custom business rules which can be placed in any of the available "hook" methods.
The statement "Type safety is enforced by PHP itself" is totally wrong. All data starts off by being presented in untyped arrays. You do not need to convert any data from a string to a specific type before using it in a function as PHP uses coercive typing to deal with any differences automatically. Inserting extra code to do manually what is already done automatically is therefore a waste of effort and a violation of YAGNI.
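A minimal sketch of coercive typing in action, assuming the script does not declare strict_types (the addTax() function and the field names are made up for this example):

```php
<?php
// No declare(strict_types=1) here, so PHP runs in its default coercive
// typing mode. addTax() and the field names are hypothetical.

function addTax(float $amount, int $percent): float
{
    return $amount + ($amount * $percent / 100);
}

// Values as they would arrive from an HTML form: all strings.
$input = ['amount' => '200', 'percent' => '10'];

// The numeric strings are converted to float and int automatically.
$total = addTax($input['amount'], $input['percent']);
var_dump($total); // float(220)
```

Note that the conversion only applies to numeric strings. A value such as 'abc' passed to an int parameter raises a TypeError, which is why validation has to happen before the data reaches the business rules.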
Instead of writing defensive code to check and sanitise your data you should do what clever people do and include a validation object which is built into the framework and called automatically without the need for any extra code from the developer.
I don't need to use the #[Describe()] attribute, or any other attribute for that matter, as I devised a far simpler solution 20 years ago which allows all primary validation to be carried out by standard code in the framework, not custom code created by the developer.
I do not need to add anything to any of my Model classes as all the necessary processing is carried out within the abstract table class which is inherited by every concrete table class.
Most PHP developers are used to working with arrays when they are at the start or years into their programming career because they seem to be very simple to use.
Arrays however fall short when things get complex. It may seem like they are fine, but they do lack types and visibility.
PHP arrays have always lacked types for one simple reason - PHP was designed from the very start to handle untyped arrays and to use coercive typing when a function argument should be a specific type. Anyone who does not understand that should not be using the language. As for visibility, this was never intended to be applied to individual elements within an array, only to the array as a whole.
There's no way to know what the data inside of the array looks like or what each of the keys contain, what type the values are etc.
This is a silly argument. When an HTML form is submitted all its data appears in the PHP script in a single $_POST array. You cannot tell just by looking at the source code what is in the array, you have to run the script. You have to deal with the contents of the array at run time. The code which processes that array must ensure that its contents are valid for the operation which is being performed. If you are clever you should be performing all primary validation using a standard validation object which automatically ensures that each piece of data is valid for the database column to which it is bound. This eliminates the need for a great deal of boilerplate code.
The sample code also shows separate methods for load(), authorize() and store() which again is the sign of a novice programmer. I learned as long ago as the 1980s that when a group of functions always has to be called in the same order then it is more efficient to create a wrapper function for that group. In that way you make a single call to the wrapper function instead of a separate call to each member of that group. If you examine the contents of the insertRecord() and updateRecord() methods in my common table methods which reside in my abstract table class you will see what I mean. If I want to make changes to this group of functions all I need do is update the wrapper function.
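The wrapper idea can be sketched as follows. This is an illustration under stated assumptions, not the actual contents of the abstract table class: the class names, the hook names preInsert() and postInsert(), and the in-memory store() are all hypothetical.

```php
<?php
// Hypothetical sketch: class, method and hook names are assumptions made
// for illustration, not the framework's real API.

abstract class AbstractTable
{
    public array $errors = [];
    public array $log = [];

    // Wrapper: a single call runs the whole group of steps in a fixed order.
    public function insertRecord(array $fieldarray): array
    {
        $this->errors = [];
        $fieldarray = $this->preInsert($fieldarray);      // custom hook
        $this->validate($fieldarray);                     // standard validation
        if ($this->errors === []) {
            $fieldarray = $this->store($fieldarray);      // data access step
            $fieldarray = $this->postInsert($fieldarray); // custom hook
        }
        return $fieldarray;
    }

    // Hooks do nothing unless a concrete table class overrides them.
    protected function preInsert(array $fieldarray): array { return $fieldarray; }
    protected function postInsert(array $fieldarray): array { return $fieldarray; }

    protected function validate(array $fieldarray): void
    {
        if (($fieldarray['name'] ?? '') === '') {
            $this->errors['name'] = 'This field is required';
        }
    }

    protected function store(array $fieldarray): array
    {
        // Stand-in for the data access layer; a real version would issue SQL.
        $this->log[] = 'stored ' . $fieldarray['name'];
        return $fieldarray;
    }
}

class Product extends AbstractTable
{
    // Custom business rule placed in a hook method.
    protected function preInsert(array $fieldarray): array
    {
        $fieldarray['name'] = ucfirst(trim($fieldarray['name'] ?? ''));
        return $fieldarray;
    }
}

$product = new Product();
$row = $product->insertRecord(['name' => ' widget ']);
```

The calling code makes one call, and any change to the sequence of steps is made in one place, inside the wrapper.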
This becomes a problem where in order to see all of the values and their types, you would have to debug the array that is being returned by the '$request->validated()' method call.
You don't need to see all the values and their types. All you need to know is that once the data has passed through the validation() phase all values conform to the specifications defined in the $fieldspec array.
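To show what checking an array against a set of field specifications might look like, here is a simplified sketch. The layout of the specification array and the validateFields() function are assumptions made for this example; they are not the framework's actual validation class or its real $fieldspec format.

```php
<?php
// Hypothetical sketch: the spec layout and validateFields() are assumptions
// for illustration only.

function validateFields(array $fieldarray, array $fieldspec): array
{
    $errors = [];
    foreach ($fieldspec as $field => $spec) {
        $value = $fieldarray[$field] ?? '';

        // Empty values are only an error when the field is required.
        if ($value === '') {
            if (!empty($spec['required'])) {
                $errors[$field] = 'This field is required';
            }
            continue;
        }

        if ($spec['type'] === 'integer'
            && filter_var($value, FILTER_VALIDATE_INT) === false) {
            $errors[$field] = 'Value is not an integer';
        } elseif ($spec['type'] === 'string'
            && isset($spec['size'])
            && strlen($value) > $spec['size']) {
            $errors[$field] = 'Value is too long';
        }
    }
    return $errors;
}

// One specification array per table drives all the primary validation.
$fieldspec = [
    'name' => ['type' => 'string', 'size' => 10, 'required' => true],
    'age'  => ['type' => 'integer'],
];

$errors = validateFields(['name' => 'Alice', 'age' => 'abc'], $fieldspec);
```

Because the rules live in data rather than in code, the same function serves every table; only the specification array differs.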
This is one of the main reasons why data transfer objects are a great option to use instead of arrays.
As I can do what needs to be done with a plain array I consider that any time spent in defining a DTO and then filling it with data is time wasted and therefore a violation of YAGNI.
The main benefits that DTOs bring are:
- it brings structure to the data
- It shows what data is being transferred and its type
- It improves the intellisense support without any additional IDE or editor plugins
All these are either unnecessary or can be provided in ways which are far less complicated.
It may seem like DTOs are unnecessary at first because it will require you to create additional files but they do help in more complex projects that need to be maintained for years, especially for new people on the team to see what the code does and what data is being used.
The idea that DTOs are a help in more complex projects is a figment of your imagination. I used my framework to build an ERP application in 2007 which is still going strong today. It has grown from 6 subsystems to over 16 and contains many areas of code which I would call complicated, and all without the use of a single DTO.
The main reason why people may find DTOs complex is that either they are just not used to creating additional classes, thinking that it will complicate the project but on the other hand, most of the examples (including this one in this blog post) is very simple.
In my framework each entity (database table) requires only a single class. Replacing a standard array with a separate DTO would have zero benefit but would add costs in the form of time and unnecessary complexity. I would never entertain such an idea.
DTOs shine in situations when you have 5 or more values to work with in a single request / API endpoint as an example. This doesn't mean that DTOs are too much or unnecessary complexity when you have 2 values to work with.
My ERP application often has to work with way more than 5 values at a time, and I do not have any issues.
Having a consistent codebase is very important, makes your and your team's life easier to work with so it's better to use them everywhere if you choose to use them.
I prefer to have a consistent codebase without the unnecessary complexity of DTOs.
Arrays may seem like a good idea when you use them but they have many flaws and are not as useful besides them being the easiest approach to use when working with data.
In my humble opinion PHP arrays are the best thing since sliced bread. As for them having many flaws and not being as useful, I heartily disagree.
As soon as I started working with PHP I knew that it was capable of doing everything that I needed. I was particularly impressed with the way it could handle data in flexible arrays instead of static structures. As an avid follower of the KISS principle I just love the way it lets me get the job done with the minimum of effort.
Then along comes a bunch of dunderheads who seem to favour the KICK principle instead. They take something which is simple and add unnecessary complexity just to prove how clever they are. These clueless newbies don't realise that what they are doing is proving the exact opposite.
Here endeth the lesson. Don't applaud, just throw money.
These are reasons why I consider some ideas on how to do OOP "properly" to be complete rubbish: