Tony Marston's Blog About software development, PHP and OOP

The concept is OK but your implementation is not

Posted on 18th March 2018 by Tony Marston
Introduction
Control Values
Extra Values
Conclusion
Comments

Introduction

I recently came across a blog post at Database Modelization Anti-Patterns in which he described the use of Entity Attribute Values (EAV) in a relational database. He called this an anti-pattern as it presents the following problems:

  1. The value attribute is of type text so as to be able to host about anything, where some parameters are going to be integer, interval, inet or boolean values.
  2. The entity and parameter fields are likewise free-text, meaning that any typo will actually create new entries, which might not even be used anywhere in the application.
  3. When fetching all the parameters of an entity to set up your application's object, the parameter names are a value in each row rather than the name of the column where to find them, meaning extra work and loops.
  4. When you need to process parameter in SQL queries, you need to add a join to the params table for each parameter you are interested in.

Because of these problems the author reaches the following conclusion:

Never implement an EAV model, this anti-pattern makes everything more complex than it should for a very small gain at modeling time.

He suggests that the need for EAV in your database points to a fault in its design, but there may be a genuine reason for its use. For example, I have developed an ERP application which is a package used by many different organisations, and anybody who has ever worked on a software package will instantly tell you that while the package will cover the most common needs of a large number of users in a standard fashion there will ALWAYS be a customer who does something differently and wants the package to be customised to suit his specific needs. This may be because they have a non-standard requirement, or that they deal with a common situation in a non-standard way. Dealing with each customisation may require some different program logic, or it may require some additional fields in the database. It is generally not a good idea to modify the core code of the application for each customisation, which is why I built into my package the ability to deal with each customisation as a plugin which is kept separate from the core code. In this way the core code is kept pristine, and each customer only gets the customised plugins which are relevant to him.

When it comes to changes in the data requirements I have two approaches:

In this article I shall ignore the building of a custom database and concentrate on the two different ways in which I have implemented EAV - Control Values and Extra Values.

Control Values

This deals with a problem which I first wrote about in 2001 for UNIFACE and updated in 2006 for PHP in which the solution is to have a separate record in the database for each value instead of a separate column on a single control record. As my application is broken down into a series of separate yet interconnected subsystems this allows a different set of control values to be maintained for each subsystem. It makes it easy to add a new control value as all I do is add a new record to the control table instead of a new column to the control record. The database schema is as follows:

CREATE TABLE `mnu_control` (
  `record_id` VARCHAR(16) NOT NULL ,
  `field_id` VARCHAR(32) NOT NULL,
  `field_value` VARCHAR(255) NULL,
  PRIMARY KEY (`record_id`, `field_id`)
);

Here the record_id field holds the subsystem identity, so this single table can be used by any number of subsystems.

I have a standard class file called mnu_control.class.inc which encapsulates all the logic required to access the MNU_CONTROL table in the MENU database. This enables me to treat each value as it it were contained in a separate column in a single table instead of being in a separate record in that table. All the other subsystems in the application reference the same table, but this is how I handle the differences:

This particular type of transaction has its own Transaction Pattern.

Extra Values

Sometimes there may be a need to hold additional pieces of data for different objects within a subsystem instead of just once for the entire subsystem, so this is where I create a pair of tables for each of those objects.

First there is an XXX_EXTRA_VALUES table which is associated with each object, where XXX can be something like 'party', 'product' or 'order'. Note that the specifications for XXX_id should match the primary key for that object.

CREATE TABLE `XXX_extra_values` (
  `XXX_id` int NOT NULL,
  `extra_id` VARCHAR(40) NOT NULL,
  `extra_value` VARCHAR(80) NOT NULL,
  PRIMARY KEY (`<object>_id`, `extra_id`)
);

Then there is an XXX_EXTRA_NAMES table which provides the specifications for each extra value.

CREATE TABLE `XXX_extra_names` (
  `extra_id` VARCHAR(40) NOT NULL,
  `extra_name` VARCHAR(80) NOT NULL,
  `extra_desc` VARCHAR(255) NULL,
  `extra_type` VARCHAR(255) NOT NULL,
  PRIMARY KEY (`extra_id`)
);

On this table the extra_type can be set to one of the following:

When a value is added to the XXX_EXTRA_VALUES table it is then possible to validate the following:

This then takes care of problems #1 and #2 identified in the introduction.

The contents of these two tables can be maintained using the standard LIST1 forms family.

It is also possible to include data from these extra fields in the output from a standard call to getData() on the main table by modifying its _cm_pre_getData() method using code similar to the following:

function _cm_pre_getData ($where, $where_array, $parent_data=null)
{
    ....
		
    // create mapping details: 'external_name = internal_name'
    $mapping['length']     = 'LENGTH';
    $mapping['width']      = 'WIDTH';
    $mapping['height']     = 'HEIGHT';
    $mapping['weight']     = 'WEIGHT';
    $mapping['weight_uom'] = 'WEIGHT-UOM';
    $mapping['plan_date']  = 'PLAN-DATE';
		
    foreach ($mapping as $ext_name => $int_name) {
        // obtain values from PRODUCT_EXTRA_VALUES table
        $pattern = '/\bAS[ ]+'.$ext_name.'\b/i';
        if (!preg_match($pattern, $this->sql_select)) {
            $this->sql_select .= ", (SELECT extra_value FROM product_extra_values AS pev"
                                  ." WHERE pev.product_id=product.product_id"
                                    ." AND pev.extra_id='$int_name'"
                                    ." LIMIT 1"
                                  .") AS $ext_name";
        } // if
    } // foreach
    
    return $where;
		
} // _cm_pre_getData

It is also possible to have this extra data included in the standard call to updateRecord() by putting code into the _cm_post_updateRecord() method similar to the following:

function _cm_post_updateRecord ($fieldarray, $old_data)
{
    $product_id = $fieldarray['product_id'];
		
    // create mapping details: 'external_name = internal_name'
    $mapping['length']     = 'LENGTH';
    $mapping['width']      = 'WIDTH';
    $mapping['height']     = 'HEIGHT';
    $mapping['weight']     = 'WEIGHT';
    $mapping['weight_uom'] = 'WEIGHT-UOM';
    $mapping['plan_date']  = 'PLAN-DATE';
		
    // convert field names from 'external' to 'internal' and add/update/remove from the PRODUCT_EXTRA_VALUES table
    foreach ($mapping as $ext_name => $int_name) {
        if (array_key_exists($ext_name, $fieldarray) AND array_key_exists($ext_name, $old_data)) {
            if ($fieldarray[$ext_name] != $old_data[$ext_name]) {
                if (!$this->updateExtraData($product_id, $int_name, $fieldarray[$ext_name])) {
                    return $fieldarray;
                } // if
            } // if
        } // if
    } // foreach
		
    return $fieldarray;
		
} // cm_post_updateRecord

// ****************************************************************************
function updateExtraData ($product_id, $extra_id, $extra_value)
// add, update or remove a record from the PRODUCT_EXTRA_VALUES table.
{
    $dbobject = RDCsingleton::getInstance('product_extra_values');

    if (strlen($extra_value) > 0) {
        $data['product_id']  = $product_id;
        $data['extra_id']    = $extra_id;
        $data['extra_value'] = $extra_value;
        $data = $dbobject->insertOrUpdate($data);
    } else {
        $data['product_id']  = $product_id;
        $data['extra_id']    = $extra_id;
        $data = $dbobject->deleteRecord($data);
    } // if
    if ($dbobject->errors) {
        $this->errors[$dbobject->getClassName()] = $dbobject->errors;
        return false;
    } // if

    return true;

} // updateExtraData

Note here that a blank value will result in the record being deleted.

Conclusion

As you should see problems 1 and 2 identified in the introduction can be eliminated by a small amount of extra coding. Problems 3 and 4, while requiring another bit of extra coding, need only be defined once within a separate customised class file, and when dealing with a software package which is used by multiple customers this is the safest way to keep each customer's customisations outside of the core package so that they have no impact on other customers.

Just because you discover problems with a particular implementation of an idea does not necessarily mean that the entire idea is bad. You should be aware that an idea could be implemented in several ways, and that the problems could be alleviated by making slight adjustments to your implementation. I have written about this idea before - in Singletons are NOT evil I showed that each of the arguments against using a singleton was in fact due to a particular implementation. By using a different implementation I achieved the objective of using a singleton but without those nasty side effects.

When you come across a particular problem you should remember the following piece of ancient wisdom:

You cannot move mountains if you believe them to be mountains.
You must think of them as collections of small stones,
Which can be moved one at a time, and then reassembled.

-- The Tao of Meow

If you can break the huge problem (the mountain) into a collection of much smaller problems (individual stones) then you have a much better chance of solving each of these small problems. Solve enough of these and, in time, the huge problem simply disappears. This is the approach that I have used successfully for over 30 years, so I advise you to try it.

Here endeth the lesson. Don't applaud, just throw money.


counter