Tony Marston's Blog About software development, PHP and OOP

Strict typing is for stick-in-the-muds

Posted on 24th June 2024 by Tony Marston

Amended on 24th November 2024

Introduction
Definitions
Type safety
Static typing
Dynamic typing
Strong typing
Weak typing
Duck typing
Type juggling (coercion)
Type casting
Typed structures have been superseded by untyped arrays
PHP's type system
Original type handling
Example of weak typing
The handling of NULL values
Introduction of type hinting
Introduction of strict typing
Pedantic typing
A new inconsistency
Validating user input
The hidden cost of strict typing
What is a "stick-in-the-mud"?
Conclusion
References
Amendment History
Comments

Introduction

The debate about which type system is better, static or dynamic, is a religious war which will never produce a clear winner. Too many people are entrenched in their views and they will never change, whatever facts and figures you may throw at them. Mind you, few studies have compared the two styles scientifically, and most of what has been published is anecdotal opinion.

Having spent 20 years working with strictly typed languages and another 20 years working with a dynamically typed language I feel that I am more qualified than most to offer an opinion. Those who have only ever worked with statically typed languages have probably had it drilled into them that it is the best, and either they refuse to even try an alternative, or if they do they struggle to make the mental leap from one system to the other.

Definitions

Before we get started here are some definitions:

Type safety

In computer science, type safety and type soundness are the extent to which a programming language discourages or prevents type errors. The behaviors classified as type errors by a given programming language are usually those that result from attempts to perform operations on values that are not of the appropriate data type, e.g., adding a string to an integer when there's no definition on how to handle this case. This classification is partly based on opinion.

Type enforcement can be static, catching potential errors at compile time, or dynamic, associating type information with values at run-time and consulting them as needed to detect imminent errors, or a combination of both.

Static typing

Static typing is a typing system where variables are bound to a data type during compilation. Once a variable is assigned a data type it remains unchanged throughout the program's execution. This binding promotes type safety and detects errors at an early stage. Since the compiler knows the data types during the development process, it can catch errors before runtime, resulting in more reliable software.

Dynamic typing

Dynamically typed languages check the types and look for type errors during runtime. You can change a variable's type at runtime simply by assigning a value of a different type.
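
As a minimal sketch of that second sentence, the snippet below reassigns one variable and inspects its type before and after. The variable names are purely illustrative.

```php
<?php
// A variable's type follows whatever value it currently holds.
$value = 42;
$typeBefore = gettype($value);   // "integer"

$value = 'forty-two';            // same variable, now holding a string
$typeAfter = gettype($value);    // "string"

echo "$typeBefore -> $typeAfter", PHP_EOL;
```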

A Dynamic programming language has the following characteristics:

In computer science, a dynamic programming language is a class of high-level programming languages which at runtime execute many common programming behaviours that static programming languages perform during compilation. These behaviors could include an extension of the program, by adding new code, by extending objects and definitions, or by modifying the type system. Although similar behaviors can be emulated in nearly any language, with varying degrees of difficulty, complexity and performance costs, dynamic languages provide direct tools to make use of them. Many of these features were first implemented as native features in the Lisp programming language.

Most dynamic languages are also dynamically typed, but not all are. Dynamic languages are frequently (but not always) referred to as scripting languages, although that term in its narrowest sense refers to languages specific to a given run-time environment.

Strong typing

Strong typing: the type of an object can't change.
Strong typing means that variables are bound to specific data types, and will result in type errors if types do not match up as expected in the expression - regardless of when type checking occurs.
Strong typing prevents mixing operations between mismatched types. In order to mix types, you must use an explicit conversion.

Weak typing

Weak typing: the type of an object can change.
Weak typing means that variables are not bound to a specific data type.
Weak typing means that you can mix types without an explicit conversion.
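
A short illustration of mixing types without an explicit conversion, using PHP as the example language. The values are made up for the purpose of the sketch.

```php
<?php
// Weak typing: operands of different types are mixed with no explicit conversion.
$sum   = '5' + 3;       // the numeric string becomes the integer 5
$ratio = '1.5' + 1;     // the numeric string becomes the float 1.5
$label = 'Item ' . 7;   // the integer becomes the string "7"

var_dump($sum, $ratio, $label); // int(8), float(2.5), string(6) "Item 7"
```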

Duck typing

It means that rather than checking a tag to see whether a value has the correct general type to be used in some way, the runtime system merely checks that it supports all of the operations performed on it. Those operations may be implemented differently by different types.

The name Duck Typing comes from the expression "If it walks like a duck and it quacks like a duck, then it must be a duck".
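
A minimal sketch of the idea in PHP. The class and function names are hypothetical; the two classes share no base class or interface, only a method name.

```php
<?php
// Two unrelated classes which both happen to support quack().
class Duck
{
    public function quack(): string { return 'Quack!'; }
}

class Robot
{
    public function quack(): string { return 'Beep-quack.'; }
}

// No type declaration on $thing: any object that supports quack() will do.
function makeItQuack($thing): string
{
    return $thing->quack();
}

echo makeItQuack(new Duck()), PHP_EOL;   // Quack!
echo makeItQuack(new Robot()), PHP_EOL;  // Beep-quack.
```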

Type juggling (coercion)

When an expression expects a variable to be of a particular type but is given a value of a different type then that value will be implicitly converted. See the PHP manual at Type Juggling.

This is where the value is treated as if it were the correct type without any intervention by the programmer.

In HTML the absence of a value is denoted with an empty string, while in SQL it is denoted with the value NULL. This is why PHP started out by being perfectly capable of coercing these two values without any intervention from the programmer.

This ability was immensely useful to PHP which was specifically designed to help programmers create web pages with dynamic content. These programs have HTML at the front end and SQL at the back end, with PHP dealing with the business rules in the middle. Programmers who are familiar with HTML and SQL should recognise immediately that neither of those technologies uses data types other than strings for both input and output. Note that HTML input allows blank strings while SQL allows NULL values. PHP's inbuilt ability to coerce each string, including empty strings and NULLs, into a value of a specific type meant that programmers could stop wasting time on trivial type conversions and concentrate on getting the job done. This was a pragmatic solution.

The only time that type juggling would fail was if a value contained characters that were inconsistent with the expected type. Thus the string "123.45" can be coerced into a number while "10 green bottles" cannot. It is therefore incumbent on the programmer to check the validity of all user input at the earliest possible opportunity. Anyone who fails to do so has only themselves to blame. Their incompetence is not the fault of the language, it is theirs and theirs alone. All the tools are there, they just have to learn to use them.
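
One way of checking input before it is used might look like the following sketch. The field names and the error format are assumptions made for the example, not part of any particular framework.

```php
<?php
// Hypothetical form values; everything arriving from HTML is a string.
$input = ['price' => '123.45', 'quantity' => '10 green bottles'];

// Collect a message for every value that is not a well-formed number.
$errors = [];
foreach ($input as $field => $value) {
    if (!is_numeric($value)) {
        $errors[$field] = "'$value' is not a number";
    }
}

print_r($errors); // only 'quantity' is reported
```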

Type casting

This statement converts a variable to a different type. See the PHP manual at Type Casting.

This requires the programmer to insert code to explicitly convert the value to the required type, using a different variable if necessary, before the call to the function will be accepted. This adds to code bloat.
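
For comparison with type juggling, here is a small sketch of explicit casts. The variable names are illustrative only.

```php
<?php
// Explicit casts: the programmer states the target type in the code.
$asInt    = (int) '123.45';    // 123 (the fraction is truncated)
$asFloat  = (float) '123.45';  // 123.45
$asString = (string) 123;      // "123"
$asBool   = (bool) '';         // false

var_dump($asInt, $asFloat, $asString, $asBool);
```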


Typed structures have been superseded by untyped arrays

Strict typing is only necessary when you are using pre-defined structures as the entire structure is passed around as a single argument.

Before PHP came along, earlier languages such as COBOL were strictly typed for the simple reason that they used predefined structs (also known as records or composite data types). All input/output operations required a pre-defined and pre-compiled record structure which identified precisely the type and size of every piece of data that was passed in that operation. The entire structure was passed as a single argument. It was imperative that the receiving structure matched the sending structure, otherwise the receiving program would not be able to see the correct values. Here is an example:

01  customer-record.
    05  cust-key            PIC X(10).
    05  cust-name.
        10  cust-first-name PIC X(30).
        10  cust-last-name  PIC X(30).
    05  cust-dob            PIC 9(8).
    05  cust-balance        PIC 9(7)V99.

The total length of this record is 87 bytes. Numbers are stored as ASCII digits unless the PICTURE clause is followed by COMP (short for USAGE IS COMPUTATIONAL), in which case PIC 9(4) COMP would be a 2 byte integer and PIC 9(7)V99 COMP would be a 4 byte integer.
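
To show why position and size matter so much in such a record, here is a rough PHP sketch which slices an 87-byte string using the field widths from the PICTURE clauses above. The function name unpackRecord() and the sample data are invented for the illustration; real COBOL programs do this through the compiled record definition, not through code like this.

```php
<?php
// Field widths taken from the PICTURE clauses above (V is an implied
// decimal point and occupies no storage).
$layout = [
    'cust-key'        => 10,  // PIC X(10)
    'cust-first-name' => 30,  // PIC X(30)
    'cust-last-name'  => 30,  // PIC X(30)
    'cust-dob'        => 8,   // PIC 9(8)
    'cust-balance'    => 9,   // PIC 9(7)V99
];

// Slice a fixed-width record into named fields purely by position.
function unpackRecord(string $record, array $layout): array
{
    $fields = [];
    $offset = 0;
    foreach ($layout as $name => $width) {
        $fields[$name] = substr($record, $offset, $width);
        $offset += $width;
    }
    return $fields;
}

// Made-up sample data, padded to the declared widths.
$record = str_pad('C000000001', 10)
        . str_pad('Ada', 30)
        . str_pad('Lovelace', 30)
        . '18151210'
        . '000012345';

$fields = unpackRecord($record, $layout);
echo array_sum($layout), PHP_EOL;                 // 87
echo rtrim($fields['cust-last-name']), PHP_EOL;   // Lovelace
echo $fields['cust-balance'] / 100, PHP_EOL;      // 123.45
```

If the receiving layout listed the fields in a different order, or with different widths, every substr() call after the first mismatch would return the wrong bytes.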

The structure used in both the sending and receiving procedures must be identical otherwise the retrieved data will be corrupted.

Each form/screen had to be built and compiled in order to define all the data items on that screen. Each database access had to define every column used in that access. The application code that dealt with that I/O operation had to use the exact same data structure, otherwise the wrong value would be retrieved and chaos would ensue. The process of defining the record structure for each operation also required that each data item be listed in the correct order. Trying to access a string as an integer, or an integer as a string, would result in corrupt data. If the sending and receiving structures had the data items defined in different orders this would result in corrupt data. Arguments passed into and out of subroutine calls were also structures, so again the structures used by the calling and the called components had to match precisely, otherwise this would result in corrupt data.

It was physically impossible to change an item's type from its original definition in its structure.

Because every input operation in these early languages, whether from a screen or a database, had to use a pre-defined structure, the compiler could detect and report any movement of data between incompatible types. It was therefore physically impossible to change an item's type from its original definition in its structure. This behaviour was a by-product of using a pre-defined structure.

Instead of creating the structures manually I created a utility which extracted the structure details directly from the compiled form or the database schema and created COPYLIB entries.

In my COBOL days I helped ensure that the structures used within each program were always synchronised with the structures used by the forms and the database by creating a program called COPYGEN which would read the structures directly from the forms file or database and write them as text files which could be imported into a copy library. When a program required one of these structures it would read it from the copy library instead of having to be hard-coded by the developer. Thus when any structure changed all that was necessary was to rerun the COPYGEN program and recompile all those programs which referenced that structure. This simple piece of automation cut out a lot of developer-induced program bugs and helped with programmer productivity.

PHP does not use predefined and typed structures for any input operation; it uses dynamic arrays which are untyped. All values appear as strings, and the language was designed to auto-convert each string into the desired type depending on the context. These arrays, either from an HTML form or the database, can be regarded as dynamic structures as the contents of an array do not need to be defined before it is filled with data. The receiving program will use whatever array is passed to it. When the contents of an HTML form are submitted to a PHP script they appear in the $_POST array, whose contents are not defined in advance. It is up to the receiving program to detect what values are present.

Arrays produced from HTML forms and SQL databases are always untyped - the values are all strings which are automatically coerced into the relevant types as and when necessary.

$_POST is an associative array which contains a list of name => value pairs where the value is always a string. At runtime values are coerced into the relevant types as and when necessary. This is because the HTML document does not contain any type information for each of its fields. When data is retrieved from the database the result appears as an indexed array of associative arrays. The first level is indexed by row number, which always starts at zero. Each row is an associative array with a separate field for each item specified in the SELECT query. Again each column of data appears as a string as the SQL output is designed to be sent in human-readable format. Again it is not necessary to define the structure of the array before running the query as the array is built dynamically by the DBMS when the query is executed.
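
The two array shapes described above can be pictured with the following sketch. The data is hypothetical and simply stands in for $_POST and for a fetched result set; the column names are invented.

```php
<?php
// Stand-in for $_POST: name => value pairs where every value is a string.
$post = ['cust_key' => 'C001', 'cust_balance' => '123.45', 'cust_dob' => ''];

// Stand-in for a query result: rows indexed from zero, each row associative.
$rows = [
    0 => ['cust_key' => 'C001', 'cust_balance' => '123.45', 'cust_dob' => null],
    1 => ['cust_key' => 'C002', 'cust_balance' => '10.00',  'cust_dob' => '1970-01-01'],
];

// Values are coerced only at the point of use.
$total = 0;
foreach ($rows as $row) {
    $total += $row['cust_balance'];
}
echo $total, PHP_EOL; // 133.45

// A missing value is '' in the form data but null in the database row.
var_dump($post['cust_dob'], $rows[0]['cust_dob']);
```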

The only difference between HTML data and SQL data is for empty values - in the $_POST array they appear as empty strings while in the SQL array they appear as NULLs. This did not matter in PHP 4 as any variable containing an empty string, NULL or FALSE was regarded as being empty() and was always coerced successfully into an empty value for the relevant type. This standard behavior has now been removed by those idiot core developers, thus proving that they do not understand the roots of PHP and are seeking to change it to suit their own perverse interpretations of how a "proper" language should behave.

Because the structure of an array does not have to be defined before it can be filled with data it provides a dynamic mechanism which is both powerful and flexible for the sending and receiving of data. I can construct methods which send and receive data as single array arguments and these arrays can contain any number of columns from any number of sources - either HTML or SQL - without having to be amended. This is why I use a standard $fieldarray variable as the input and output arguments in my common table methods which are defined in my abstract table class and inherited by every concrete table class. This provides the polymorphism which allows me to access any Model class from any Controller, using that mechanism called Dependency Injection, which contributes directly to the power of my Transaction Patterns and my high rates of productivity.

It would appear that many developers learned about OOP in one of those early compiled languages which used structs and which required strict typing. They assumed that this was the way it should be done, and they cannot adjust their thinking to deal with the new breed of languages which are not compiled and which do not require strict typing. PHP does not use structs which have to be defined for each input/output operation, it uses untyped arrays which are infinitely more flexible. Being dynamic it means that I can define a function with an array argument, such as my $fieldarray, and at runtime I can fill it with any amount of data from any source. I can change the contents of this array at any time without having to modify any method signatures, thus proving that my software is as loosely coupled as it could possibly be. This directly contributes to the increase in polymorphism which is available in my framework, which leads to an increase in the volume of reusable code, which leads to a decrease in the code which I have to write, and this contributes to my high rate of productivity.
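
The flexibility of a single array argument can be sketched as follows. The function describeRecord() is a hypothetical stand-in written for this illustration; it is not code from the framework, and only the $fieldarray name is taken from the text above.

```php
<?php
// One signature, any number of columns from any source.
function describeRecord(array $fieldarray): string
{
    $parts = [];
    foreach ($fieldarray as $name => $value) {
        $parts[] = "$name=" . var_export($value, true);
    }
    return implode(', ', $parts);
}

// Different columns, same method signature, no changes required.
echo describeRecord(['cust_key' => 'C001', 'cust_name' => 'Ada']), PHP_EOL;
echo describeRecord(['order_id' => '42', 'order_date' => null, 'total' => '9.99']), PHP_EOL;
```

Adding or removing a column changes the contents of the array but not the signature, so no caller has to be amended.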


PHP's type system

PHP's default type system is dynamically typed and weakly typed. It is based on untyped arrays and not typed structures.

Original type handling

While many new developers are quick to complain about how PHP works they should realise that it was initially built to provide wrappers for the functions found in the C language, as explained in History of PHP:

The language was deliberately designed to resemble C in structure, making it an easy adoption for developers familiar with C, Perl, and similar languages.

This also meant that C's type system was carried forward into PHP, as explained in RFC: Strict and weak parameter type checking, which was written in 2009:

PHP's type system was designed from the ground up so that scalars auto-convert depending on the context. That feature became an inherent property of the language, and other than a couple of exceptions the internal type of a scalar value is not exposed to end users. The most important exception is the === operator - however, this operator is used in very specific situations, and obviously only in the context of comparisons. While there are other exceptions (e.g. gettype()) - in the vast majority of scenarios in PHP, scalar types auto-convert to the necessary type depending on the context.

For that reason, developers - even seasoned ones - will feel very comfortable sending the string "123" to a function that semantically expects an integer. If they know how PHP works internally - they rely on the fact the function will auto-convert the type to an integer. If they don't (and many don't) - they don't even think about the fact that their "123" is a string. It's a meaningless implementation detail.

For these reasons - strict type checking is an alien concept to PHP. It goes against PHP's type system by making the implementation detail (zval.type) become much more of a front-stage actor.

In addition, strict type checking puts the burden of validating input on the callers of an API, instead of the API itself. Since typically functions are designed so that they're called numerous times - requiring the user to do necessary conversions on the input before calling the function is counterintuitive and inefficient. It makes much more sense, and it's also much more efficient - to move the conversions to be the responsibility of the called function instead. It's also more likely that the author of the function, the one choosing to use scalar type hints in the first place - would be more knowledgeable about PHP's types than those using his API.

Finally, strict type checking is inconsistent with the way internal (C-based) functions typically behave. For example, strlen(123) returns 3, exactly like strlen('123'). sqrt('9') also return 3, exactly like sqrt(9). Why would userland functions (PHP-based) behave any different?

Proponents of strict type hinting often argue that input coming from end users (forms) should be filtered and sanitized anyway, and that this makes for a great opportunity to do necessary type conversions. While that may be true, it covers a small subset of type checking scenarios. For example, it doesn't cover input coming from 'trusted' sources like a database or files. It also doesn't account for the many developers who are simply unaware of PHP's internal type system, or that presently don't see the need to explicitly do type conversions even if they do sanitize their input. Not to mention those that don't sanitize their input at all...

Introducing 'weak' or auto-converting type hinting

The proposed solution implements a 'weaker' kind of type hinting - which arguably is more consistent with the rest of PHP's type system. Instead of validating the zval.type property only - it uses rules in line with the spirit of PHP and it's auto-conversion system to look into the value in question, and determine whether it 'makes sense' in the required context. If it does - it will be converted to the required type (if it isn't already of that type); If it doesn't - an error will be generated.

For example, consider a function getUserById() that expects an integer value. With strict type hinting, if you feed it with $id, which happens to hold a piece of data from the database with the string value "42", it will be rejected. With auto-converting type hinting, PHP will determine that $id is a string that has an integer format - and it is therefore suitable to be fed into getUserById(). It will then convert the value it to an integer, and pass it on to getUserById(). That means that getUserById() can rely that it will always get its input as an integer - but the caller will still have the luxury of sending non-integer but integer-formatted input to it.

The key advantages of the proposed solutions are that there's less burden on those calling APIs (fail only when really necessary). It should be noted that most of the time coding is spend consuming existing API's and not creating new ones. Furthermore it's consistent with the rest of PHP in the sense that most of PHP does not care about exact matching zval types, and perhaps most importantly - it does not require everyone to become intimately familiar with PHP's type system.

Note the following statements:

PHP has its roots in the C language which had static typing, but it was weakly enforced and implicit conversions were possible. It was developed specifically to make it easy to create web pages that were dynamic instead of static by providing the ability to create HTML templates whose content could be supplied from a database.

Developers familiar with HTML and SQL should be instantly aware that neither has typed variables - everything is a string. Just as C had implicit conversions so did PHP, so if it expected a number but was given a string it would only fail if that string could not reasonably be converted into a number. Thus the values "123" and "123.45" can both be accepted as numbers but "123x" cannot. Unacceptable values could easily be detected using the is_numeric() function. Note that is_int() and is_float() test the type of a variable rather than the content of a string, so they are only useful once a value has been converted. Note also that NULLs would be coerced to 0 (zero).
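
A short sketch of how these checks behave on string input. The sample values are illustrative, and filter_var() is shown only as one possible alternative for validating a string and converting it in a single step.

```php
<?php
// is_numeric() examines the content of the string.
var_dump(is_numeric('123'));      // true
var_dump(is_numeric('123.45'));   // true
var_dump(is_numeric('123x'));     // false

// is_int() examines the type of the variable, not the characters in a string.
var_dump(is_int('123'));          // false
var_dump(is_int(123));            // true

// filter_var() validates a string and returns the converted value, or false.
var_dump(filter_var('123', FILTER_VALIDATE_INT));       // int(123)
var_dump(filter_var('123x', FILTER_VALIDATE_INT));      // false
var_dump(filter_var('123.45', FILTER_VALIDATE_FLOAT));  // float(123.45)
```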

Note that in HTML the absence of a value is denoted with an empty string, while in SQL it is denoted with the value NULL. This is why the language up to and including version 5 was perfectly capable of coercing these two values without any intervention from the programmer. Then the purist S-O-Bs took over and f*cked things up.

Note also that any implicit conversion was carried out within the function to avoid forcing the developer to perform an explicit conversion before calling the function as this would be both counterintuitive and inefficient.

I have this to say on the following statement:

Proponents of strict type hinting often argue that input coming from end users (forms) should be filtered and sanitized anyway, and that this makes for a great opportunity to do necessary type conversions.

The first part of this statement - input coming from end users (forms) should be filtered and sanitized - is perfectly true. I have catered for this in my framework by automatically invoking my validation object on all input and update operations.

The second part of this statement - this makes for a great opportunity to do necessary type conversions - identifies an option, not a requirement. It does not matter if all the values are left as strings as PHP has the ability to automatically coerce strings into the expected types. Adding code to perform this type juggling is therefore NOT necessary, so as far as I am concerned the act of adding code to perform manually what is already performed automatically is a violation of YAGNI.

Example of weak typing (without type hints)

It is possible to declare a function with argument names but no types, as in:

function add($x, $y)
{
    return $x + $y;
}

$result = add(1,2);
echo $result; // 3

$result = add(1.0,2.5);
echo $result; // 3.5 (both values treated as floats)

$result = add(1,'2');
echo $result; // 3 (the string '2' is coerced into the number 2)

$result = add(1,null);
echo $result; // 1 (null is coerced into the value zero)

$result = add(1,'');
echo $result; // BEFORE V8.0: 1 ('' is coerced into the value zero, with a warning from V7.1)
// FROM V8.0: Uncaught TypeError: Unsupported operand types: int + string in ...

$result = add('Hi','There');
// Fatal error: Uncaught TypeError: Unsupported operand types: string + string in ...

Notice here that type checking and type coercion are performed by statements within the internal function. The only way for this behaviour to be duplicated in user-defined functions is for the developer to insert the code necessary to carry out any type juggling.

The handling of NULL values

Since PHP was created it was standard behaviour for internal functions and PHP extensions to silently accept null values for non-nullable arguments in coercive typing mode.

It was clearly stated in this RFC that "Internal functions (defined by PHP or PHP extensions) currently silently accept null values for non-nullable arguments in coercive typing mode". This standard behaviour was not documented in any type hints in the PHP manual, simply because it was not possible to mark an argument as nullable until version 7.1 introduced nullable types. The manual section on Nullable type syntactic sugar states the following:

A single base type declaration can be marked nullable by prefixing the type with a question mark (?). Thus ?T and T|null are identical.

That was 5 years earlier than version 8.1, but the manual was never updated to ensure that the documentation matched the behaviour.

To say that the treatment of non-nullable arguments had been incorrect for the past 20 years was blatantly incorrect. Although it was not specified in any method signatures it was described in the following sections of the manual:

Converting to string
String conversion is automatically done in the scope of an expression where a string is needed.
null is always converted to an empty string.
Converting to integer
To explicitly convert a value to int, use either the (int) or (integer) casts. However, in most cases the cast is not needed, since a value will be automatically converted if an operator, function or control structure requires an int argument.
null is always converted to zero (0).
Converting to boolean
When converting to bool, the values considered false include the special type NULL (including unset variables). Every other value is considered true.
Converting to array
Although it was not explicitly stated in the manual, if the value NULL was passed to a function which expected an array, in PHP4 it was implicitly treated as an empty array. This behaviour was changed in PHP5 by some pedantic S-O-B without being supported by an RFC, as shown in this bug report. This now generates the following message:
Fatal error: Uncaught TypeError: array_merge(): Argument #2 must be of type array, null given

If $arr doesn't exist yet or is set to null or false, it will be created, so this is also an alternative way to create an array.

While all PHP's internal functions would accept null values in any arguments, the only way to duplicate this behaviour in any user-defined function was for the developer to insert the necessary code to carry out any type juggling.
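
The kind of code that a developer has to insert might look like this sketch. The function name length() is hypothetical; the parameter is declared nullable (available from version 7.1) and the null is converted by hand, following the "null is always converted to an empty string" rule quoted above.

```php
<?php
// A user-defined wrapper which accepts null and converts it manually.
function length(?string $text): int
{
    return strlen($text ?? '');
}

echo length('abc'), PHP_EOL; // 3
echo length(null), PHP_EOL;  // 0
```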

Introduction of type hinting

Scalar type hinting was introduced in PHP7 following the adoption of RFC: Scalar Type Declarations v5, which made it possible to "hint" at an argument's type, as in:

function add(int $x, int $y)
{
    return $x + $y;
}

$result = add(1,2);
echo $result; // 3

$result = add(1.0,2.5);
// Deprecated: Implicit conversion from float 2.5 to int loses precision in ...
echo $result; // 3 (both floats treated as integers)

$result = add(1,'2');
echo $result; // 3 (the string '2' is coerced into the number 2)

$result = add(1,null);
// Fatal error: Uncaught TypeError: add(): Argument #2 ($y) must be of type int, null given

$result = add('Hi','There');
// Fatal error: Uncaught TypeError: add(): Argument #1 ($x) must be of type int, string given, called in ... and defined in ...

Note that RFC: Scalar Type Declarations v5 contained the following statements:

Summary
These type declarations would behave identically to the existing mechanisms that built-in PHP functions use.
Behaviour of weak type checks
A weakly type-checked call to an extension or built-in PHP function has exactly the same behaviour as it did in previous PHP versions.
Behaviour of strict type checks
These strict type checking rules are used for userland scalar type hints, and for extension and built-in PHP functions.
Why both?
So far, most advocates of scalar type hints have asked for either strict type checking, or weak type checking. Rather than picking one approach or the other, this RFC instead makes weak type checking the default, and adds an optional directive to use strict type checking within a file.
Nullable and union types
Interest has been expressed in a system to allow for union-types: int|float or nullable-types: int?.
As both of these affect more than just scalar typing, both are considered outside of scope for this proposal.
[NOTE: The idea of nullable types was discussed in a different RFC and implemented in version 7.1]
Backward Incompatible Changes
Since the strict type-checking mode is off by default and must be explicitly used, it does not break backwards-compatibility.
Unaffected PHP Functionality
When the strict type-checking mode isn't in use (which is the default), function calls to built-in and extension PHP functions behave identically to previous PHP versions.

Introduction of strict typing

Strict typing was added to the language in version 7.0, but without it being turned on PHP would still operate in its coercive typing mode. This changed following PHP RFC: Deprecate passing null to non-nullable arguments of internal functions. The logic given for this change was as follows:

Internal functions (defined by PHP or PHP extensions) currently silently accept null values for non-nullable arguments in coercive typing mode. This is contrary to the behavior of user-defined functions, which only accept null for nullable arguments. This RFC aims to resolve this inconsistency.
...
After the changes in PHP 8.0, this is the only remaining fundamental difference in behavior between user-defined and internal functions.

The statement contrary to the behavior of user-defined functions, which only accept null for nullable arguments is based on the fact that none of the type hints for arguments of internal functions were marked as nullable in the PHP manual despite them behaving as if they were nullable. The ability to mark any argument as nullable was not available until version 7.1 with the introduction of nullable types, but the function signatures in the PHP manual were never updated to reflect this fact even though version 8.1 did not appear until 5 years later.

The only inconsistency which actually existed was therefore between the function signatures in the manual and the way that those functions actually behaved. The core developers could resolve this "inconsistency" in one of two ways:
  1. Update the documentation to reflect the current behaviour, which is that all types are nullable.
  2. Change the behaviour of the functions to agree with the documentation.

One of these options would be dogmatic and pedantic while the other would be pragmatic.

One of these options would produce a BC break which would affect virtually every PHP script on the planet, while the other would produce no BC breaks at all. In my humble opinion the core developers chose the wrong option, the anti-pragmatic option, which is why I called them lazy, incompetent idiots.

It is possible to enable strict mode on a per-file basis by inserting declare(strict_types=1) at the start of a file. In strict mode, when a function is called only a value corresponding exactly to the type declaration will be accepted, otherwise a TypeError will be thrown. The only exception to this rule is that an int value will pass a float type declaration.

declare(strict_types=1);
function add(int $x, int $y)
{
    return $x + $y;
}

$result = add(1,2);
echo $result; // 3

$result = add(1.0,2.5);
// Fatal error: Uncaught TypeError: add(): Argument #1 ($x) must be of type int, float given, called in ... and defined in ...

$result = add(1,'2');
// Fatal error: Uncaught TypeError: add(): Argument #2 ($y) must be of type int, string given, called in ... and defined in ...

$result = add('Hi','There');
// Fatal error: Uncaught TypeError: add(): Argument #1 ($x) must be of type int, string given, called in ... and defined in ...

If an application contains a mixture of files with and without the declare(strict_types=1) directive the following warning applies:

NOTE
Strict typing applies to function calls made from within the file with strict typing enabled, not to the functions declared within that file. If a file without strict typing enabled makes a call to a function that was defined in a file with strict typing, the caller's preference (coercive typing) will be respected, and the value will be coerced.

Note that strict typing has no effect unless function signatures are defined with type hints. Those without type hints are not affected.
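
For comparison, here is a minimal sketch of the same add() function called from a file which does not contain the declare(strict_types=1) directive, so the default coercive mode applies. The try/catch wrapper is an addition for illustration only, so that the script can carry on after the error.

```php
<?php
// No declare(strict_types=1) here, so calls made from this file use coercive typing.
function add(int $x, int $y)
{
    return $x + $y;
}

echo add(1, '2'), "\n";    // the numeric string '2' is coerced to int 2, giving 3
echo add('5', '6'), "\n";  // both numeric strings are coerced, giving 11

try {
    add('Hi', 'There');    // a non-numeric string cannot be coerced to int
} catch (TypeError $e) {
    echo get_class($e), "\n"; // TypeError
}
```

Numeric strings are accepted and converted, while a string which cannot be read as a number is still rejected even in coercive mode.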

Pedantic typing

This is what I call the taking of strict typing to a higher level of silliness. The word pedant has the following meanings:

Pedant
Synonyms: dogmatist, purist, quibbler, hair-splitter, nit-picker

The PHP core developers have a history of being pedantic S-O-Bs who love to enforce their personal ideas of how things should be done on the rest of the community. They even like to sneak things in under the radar and break backwards compatibility without the decency of going through the RFC process, just like they did with the array_merge() function in a PHP5 release, as shown in this bug report. This function had always treated a NULL argument as an empty array, just like it did for several other types, but it was changed for no good reason other than to implement someone's idea of purity. Pragmatism had given way to dogmatism and pedantry.

The PHP manual at Strict typing clearly states the following:

By default, PHP will coerce values of the wrong type into the expected scalar type declaration if possible. For example, a function that is given an int for a parameter that expects a string will get a variable of type string.

It is possible to enable strict mode on a per-file basis. In strict mode, only a value corresponding exactly to the type declaration will be accepted, otherwise a TypeError will be thrown. The only exception to this rule is that an int value will pass a float type declaration.


This quite clearly states that the existing behaviour of coercive typing will remain in play (i.e. strict typing is turned OFF) unless it is specifically turned ON.

THIS IS NOT TRUE!

When PHP RFC: Deprecate passing null to non-nullable arguments of internal functions was implemented they turned on strict typing for internal functions whether you wanted it or not. This means that any function in the PHP manual which expects a non-nullable argument will no longer tolerate a NULL value and will generate one of the following messages instead:

Passing null to parameter #1 ($string) of type string is deprecated
Passing null to parameter #1 ($num) of type int|float is deprecated

As stated in introduction of strict typing this change was implemented to resolve an inconsistency between the documented behaviour of internal functions and their actual behaviour. This inconsistency could have been resolved in one of two ways, but the core developers chose the option which introduced a massive BC break which will affect virtually every PHP script on the planet.

This inconsistency did not actually exist until Nullable Types were introduced into version 7.1 which made it possible for a type hint to indicate that an argument was nullable instead of always being non-nullable. This is shown in the following:

function say(?string $msg) {
    if ($msg) {
        echo $msg;
    }
}
 
say('hello'); // ok -- prints hello
say(null); // ok -- does not print
say(); // error -- missing parameter
say(new stdclass); // error -- bad type

This was also subject to Nullable type syntactic sugar which stated the following:

A single base type declaration can be marked nullable by prefixing the type with a question mark (?). Thus ?T and T|null are identical.

A new inconsistency

Because the core developers chose to fix the inconsistency in the wrong way all they did was create a totally new inconsistency. While the internal functions can no longer silently coerce nulls into empty numeric or string arguments this is now inconsistent with how other parts of the language operate.

For example, the string operators do not object to arguments containing null, as shown in the following example:

$a = 'a';
$b = null;
$c = 'c';
$result = $a . $b . $c; // 'ac'
$result = "{$a}{$b}{$c}"; // 'ac'

Neither do arithmetic operators:

$a = 1;
$b = null;
$result = $a + $b; // 1

$foo = null;
$foo++; // 1
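
The contrast can be seen side by side in the following sketch, which assumes PHP 8.1 or later and a file without strict typing. The error handler is only there to collect any deprecation messages so that they can be inspected; strlen() is used as a representative internal function.

```php
<?php
// Collect deprecation messages instead of letting them be displayed.
$messages = [];
set_error_handler(function (int $errno, string $errstr) use (&$messages) {
    $messages[] = $errstr;
    return true; // mark the error as handled
}, E_DEPRECATED);

$value = null;

$joined = 'a' . $value . 'c';  // operator: null is treated as an empty string
$sum    = 1 + $value;          // operator: null is treated as zero
$length = strlen($value);      // internal function: deprecated from PHP 8.1

restore_error_handler();

var_dump($joined, $sum, $length);
print_r($messages); // on PHP 8.1+ this should hold the message raised by strlen()
```

The two operators accept the null without comment, while the internal function still returns a result but raises a deprecation on the way.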

Validating user input

A common complaint I have seen in several blog posts is that PHP's handling of numeric strings is BAD and can cause all sorts of errors. Take the following code snippet as an example:

$foo = 1 + "10 Small Pigs";   // PHP7: $foo is integer (11) and an E_NOTICE is raised
                              // PHP8: $foo is integer (11) and an E_WARNING is raised

If the value "10 Small Pigs" has come from user input, and that value is then used in an SQL INSERT query, then that query will fail. In this case the fault lies with the programmer and not PHP. Every programmer should be aware that when values from an HTML form are sent to your PHP script on the server they are presented in the $_POST array which is an array of strings. The developer must always treat all user input with suspicion as it is always possible, whether by accident or malicious design, to enter a value which could cause unexpected results. While all values in an SQL query are written as strings, the DBMS will check that each column's value is consistent with that column's specifications, and if it isn't the query will fail. The programmer can avoid this by validating every piece of input before it is processed so that if a field has an invalid value it can be sent back to the user with a suitable error message so that the mistake can be corrected. This is preferable to the program aborting and having to be restarted.

So how do you validate user input? Simply put you have to know how each column is specified in the database (type, size, et cetera), then sanitise the data before it is sent to the database. This is done by comparing each column's value with its specifications. This validation can be done using any of the following functions:

If any value fails to match its specifications then it should be sent back to the user with a suitable error message so that the error can be corrected and resubmitted.
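
As a minimal sketch of this idea, the following compares each submitted value with a set of column specifications. The $specs array, the validateInput() function and the error messages are all hypothetical, and the choice of filter_var(), is_numeric() and strlen() is an assumption made for illustration, not a definitive list.

```php
<?php
// Hypothetical column specifications; in a real application these would
// mirror the database schema.
$specs = [
    'name'     => ['type' => 'string',  'size' => 20, 'required' => true],
    'quantity' => ['type' => 'integer', 'required' => true],
    'price'    => ['type' => 'numeric', 'required' => false],
];

// Returns an array of error messages keyed by field name (empty if all is well).
function validateInput(array $input, array $specs): array
{
    $errors = [];
    foreach ($specs as $field => $spec) {
        $value = trim((string) ($input[$field] ?? ''));
        if ($value === '') {
            if (!empty($spec['required'])) {
                $errors[$field] = 'A value is required';
            }
            continue; // nothing more to check on an empty field
        }
        switch ($spec['type']) {
            case 'integer':
                if (filter_var($value, FILTER_VALIDATE_INT) === false) {
                    $errors[$field] = 'Must be an integer';
                }
                break;
            case 'numeric':
                if (!is_numeric($value)) {
                    $errors[$field] = 'Must be a number';
                }
                break;
            case 'string':
                if (strlen($value) > $spec['size']) {
                    $errors[$field] = "Must not exceed {$spec['size']} characters";
                }
                break;
        }
    }
    return $errors;
}

// Values arrive from an HTML form as strings, as they would in $_POST.
$post = ['name' => 'Widget', 'quantity' => '10 Small Pigs', 'price' => ''];
print_r(validateInput($post, $specs)); // reports an error for 'quantity' only
```

Every value stays as a string throughout; the check only confirms that the database would accept it.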

Validating all user input can be done in two ways:

As I use the second method (obviously) I never have to write any code to perform primary validation, only secondary validation.


The hidden cost of strict typing

PHP was specifically designed to help create dynamic web pages, and so uses HTML at the front-end and SQL at the back-end while it sits in the middle to move data back and forth between the two ends. Neither HTML nor SQL deals with typed data as both their input and output values are nothing but strings or arrays of strings. An HTML document and an SQL query are both strings, the input from an HTML document and the result of an SQL query are both arrays of strings. Any field left blank in an HTML document will be returned as an empty string even if it is supposed to be treated as a number. Any field in an SQL database can be marked as nullable, which means that string, date and numeric fields can be returned with NULL values.

This means that if strict typing is turned on then your PHP code must contain statements to explicitly convert these NULL values into the empty versions of their expected types. If you don't then you are in serious danger of having your code fall over with a TypeError.

Note that even if you do not have strict typing turned ON it does not matter to any internal functions as, from PHP8.1 onwards, they will always operate in strict mode.
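
The following sketch shows the kind of extra statements this involves. The $row array and its column names are hypothetical, standing in for a record read from a database in which two columns are NULL.

```php
<?php
// A row as it might come back from a database query: nullable columns hold NULL.
// The column names are hypothetical.
$row = ['name' => 'Widget', 'description' => null, 'discount' => null];

// Convert each NULL into the empty version of the type which the function expects.
$description = strtoupper($row['description'] ?? '');  // '' instead of null
$discount    = round((float) $row['discount'], 2);      // 0.0 instead of null
$length      = strlen((string) $row['description']);    // 0

var_dump($description, $discount, $length);
```

Each call needs its own conversion, either with the null coalescing operator or with a cast, before the internal function will accept the value without complaint.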

In other languages where strict typing was built in from the start it has led to peculiar solutions when trying to deal with function arguments which could be present with different data types, but without the need to cast each value to the correct type before calling the function. This method is called function overloading and operates as shown in the following C++ example:

int Add(int a, int b) {  // add two integers.
  return a + b;
}

double Add(double a, double b) {  // add two doubles.
  return a + b;
}

This will not work in PHP as you cannot have duplicate function names.
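
One possible way to cover both cases in PHP with a single function is sketched below. It assumes PHP 8.0 or later for the union type syntax; leaving the type hints off altogether would work in any version.

```php
<?php
// One function covers both cases; union types require PHP 8.0 or later.
function add(int|float $a, int|float $b): int|float
{
    return $a + $b;
}

var_dump(add(1, 2));      // int(3)
var_dump(add(1.5, 2.25)); // float(3.75)
```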


What is a "stick-in-the-mud"?

The dictionary definition of such a person is as follows:

My critics (of whom there are many) often accuse me of being a stick-in-the-mud simply because I refuse to adopt all the latest features which have been added to PHP, such as namespaces, autoloaders, interfaces and attributes. What they fail to understand is that, as a follower of the YAGNI principle, I only use a feature if I can find a genuine need for it. I'm not like George Mallory who, when asked why he wanted to climb a mountain, replied with "because it's there".

It is wrong to say that I have resisted change as, since the turn of the century, I have done the following:

I have successfully made the switch between three different languages, and in none of those cases did I complain that the new language was different from the old one. Instead I examined the differences and learned how to take advantage of them in order to write software that was more cost effective. With all these languages there were often several different ways of achieving the same result, so I always chose the way that involved the best combination of the simplest, least amount and most effective code. In my multi-decade career that has always been shown to be the best approach for producing software that can be maintained by myself and, more importantly, by others.

By embracing this totally different way of developing software I have seen my productivity go up by leaps and bounds. What used to take me a week to do in COBOL and one day to do in UNIFACE I can now do in 5 minutes with PHP.

My critics tell me that I am writing legacy software. This just proves that, just like most of their mis-interpretations of basic OO principles, they don't know what they are talking about. In 2002 I was using PHP4 and MySQL3, and if I had not maintained and updated that software then it would still require those ancient versions to run. That is not the case - I have upgraded my versions of PHP and MySQL regularly, and I am currently on PHP8.3 and MySQL8.4. I have also added database drivers for PostgreSQL, Oracle and SQL Server.

My critics tell me that I am still writing PHP4 code. If they bothered to look they would see that the vast majority of functions that existed in PHP4 still exist in PHP8, so there is nothing wrong in using them. Any harmful features have been removed from the language, and as I did not use most of them in the first place their loss had minimal effect. They say that PHP4's object model was inadequate, but I disagree. It supported encapsulation, inheritance and polymorphism, and anybody who cannot write effective software with those is not an OOP (Object Oriented Programmer), they are a PPP (Piss-Poor Programmer). The only features which were added in later versions which I readily embraced were abstract classes, the DateTime class and, fairly recently, traits. Everything else I ignore as I can't find a use for it.


Conclusion

Having spent 20 years working with compiled and strictly typed languages and another 20 years working with a dynamically typed language I much prefer the freedom, flexibility and speed of development that a dynamically typed language provides. Anyone who was initially trained to use a strictly typed language and switches to something like PHP and who doesn't have the mental agility to deal with the differences should stop complaining as all the deficiencies are with themselves and not the language. Anyone who continues to use it and tries to convert it to suit their own tastes and f*ck things up for the rest of us has a special place in Hell reserved just for them.

PHP was designed to act as an interface between an HTML front end and an SQL back end, and neither of these technologies, while they both deal with data, has any sort of type system - everything is either a string or an array of strings. The only difference between the two is that where HTML has empty strings SQL has NULL values. PHP was designed from the ground up to auto-convert scalars depending on the context. This meant that in order to avoid type errors all the programmer had to do was check each string for invalid characters and the language would safely coerce each string into its expected type. Note that NULL values and empty strings had never, at least until version 8.1 was released, been treated as invalid characters in any type coercion.


This all changed when strict typing was introduced as it forces every programmer to manually convert each value into the correct type before calling an API. This may satisfy our dogmatic and pedantic brethren, but from a pragmatic viewpoint it is counterintuitive and inefficient. An API is written once but can be used thousands or even millions of times, so anyone with more than two brain cells to rub together should be able to see that it is much more efficient to perform the type checking within the API instead of forcing it upon the users of that API. The fact that the core developers did not see that is the reason why I called them lazy, incompetent idiots.
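
As a rough sketch of what checking within the API could look like, the hypothetical addQuantity() function below accepts whatever the caller has, validates it once, and converts it internally. The function name and its rules (null and the empty string count as zero) are assumptions made for illustration.

```php
<?php
// A hypothetical API function which accepts whatever the caller has,
// checks it once, and converts it internally.
function addQuantity($current, $extra): int
{
    foreach ([$current, $extra] as $value) {
        // null and the empty string are treated as zero; anything else
        // must be a whole number.
        if ($value !== null && $value !== ''
            && filter_var($value, FILTER_VALIDATE_INT) === false) {
            throw new InvalidArgumentException('Not an integer: ' . var_export($value, true));
        }
    }
    return (int) $current + (int) $extra;
}

var_dump(addQuantity('10', 5));   // int(15)
var_dump(addQuantity(null, '7')); // int(7)
```

The check is written once inside the function, so none of its callers has to cast anything beforehand.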

It is wrong to say that static typing is "better" than dynamic typing. It is nothing but a matter of personal opinion, and some bigots do not like their opinion being questioned. They fail to understand that all type errors can be detected and avoided with basic validation which every programmer should be performing anyway. While it is true that static languages can detect type errors at compile time while with dynamic languages you have to wait until runtime, unbiased studies have indicated that the number of type errors detected is about the same. But here's the rub - the absence of type errors does not prove that the program is correct as it could still be filled with logic errors, and those don't appear until you run the software. Instead of waiting until the software runs in a production environment it should be thoroughly tested before it is released. This is why some experienced people are in favour of dynamic languages as they become more productive and any errors are detected during the testing phase. In his blog Type Wars Uncle Bob states the following:

When a Java programmer gets used to TDD, they start asking themselves a very important question: Why am I wasting time satisfying the type constraints of Java when my unit tests are already checking everything? Those programmers begin to realize that they could become much more productive by switching to a dynamically typed language like Ruby or Python.

PHP was designed to assist in the development of web pages with dynamic content, which means that it has to interface with HTML as well as SQL, and both of these technologies are typeless as every value they deal with, both as input and output, is a string. This means that each value can exist in one of three states:


As stated earlier all user input should be validated before it is processed in order to ensure that each column's value is consistent with that column's specifications in the database. No value has to be converted to a particular type as it is always a string within the SQL query. All data read from the database appears as a string, but it does not have to be validated (that was done when it was input) or converted as it is always shown as a string when added to the HTML output. This leads to one simple conclusion - provided that each column's value will not be rejected by the database it is perfectly acceptable to leave it as a string throughout its passage through the software.

PHP's type juggling could cope with null values until version 8.1. Since then, even when strict typing is turned OFF, internal functions will complain about a null value for any argument which is not marked as nullable.

This is how PHP originally behaved, and its implicit type juggling never caused me any issues in the last two decades during the development of the RADICORE framework and the large ERP application which was written with it. That was until the core developers threw a spanner in the works and stopped the coercion of nulls and empty strings even if strict typing had not been turned on.

I regard strict typing as a crutch for the mentally crippled. It is like the training wheels on a bicycle - they are OK when you are a child as they stop you from falling over and hurting yourself, but when you are an adult they simply get in the way. To say that statically-typed languages are better and more popular than dynamically-typed languages is not supported by the fact that 60% of the world's programming languages support dynamic typing and only 40% support static typing. This would indicate to me that those who support static typing are in the minority, and that the newer languages are more likely to support dynamic typing.

It should be noted that the notions of type safety and static typing seemed to be important in the earliest OO languages in the 1970s and 1980s when processors were very slow, very expensive and very large while programmers were, by comparison, very cheap. These notions were seen as ways to make it easier to optimise the code so that it would run faster. These arguments no longer hold water. Processors are now incredibly small, incredibly fast and incredibly cheap while programmers, by comparison, are incredibly expensive. The emphasis now is on the ability to write code faster as even supposedly inefficient code can still run faster than the blink of an eye. It is often more cost-effective to throw more hardware at the problem than to get a programmer to perform micro-optimisations that are barely visible.

All the early programs were compiled, and these checks were performed at compile-time as run-time checking was deemed to be problematic. Modern languages do not have the same problem. If you have a value which starts off as a string, such as input from an HTML form or an SQL database, but you want to use it as an integer in a function, then the language must provide you with a mechanism to convert a value in one type to another type, and to throw an error only if that conversion cannot be safely made. This is called type juggling or coercion. So if a language has the ability to perform this coercion manually before a function is called then there is no practical reason why the same ability cannot be used within the function after it has been called. After all, this was the way that PHP was designed to behave from the very start, so the removal of this ability can only be seen as a retrograde step forced into the language by a bunch of dogmatic, pedantic neanderthals who are still living in the past.

The article What to know before debating type systems contains this interesting observation:

Here endeth the lesson. Don't applaud, just throw money.


References

Here are some articles on strong and weak typing:

Here are some articles by people who actually like dynamic typing:

These are reasons why I consider some ideas on how to do OOP "properly" to be complete rubbish:


Amendment History

24 Nov 2024 Added Typed structures have been superseded by untyped arrays
