The DRY principle: its cost explained with examples

I know what you are thinking: “Again a boring article on DRY? Don’t we have enough already?”

You might be right. However, I see too many developers (junior or senior) applying DRY as if they were on a witch hunt: totally at random, or everywhere they can.

So apparently we can never have enough DRY principle articles on the Internet.

A little reminder for those in the back who don’t follow: DRY stands for “Don’t Repeat Yourself”, and the principle was first introduced in The Pragmatic Programmer.

The principle itself was known and applied before this book came to life. However, The Pragmatic Programmer defined it precisely and put a name on it.

Without further ado, let’s dive into the wonderful land of DRY! If you feel you want to burn this article or to cover it with praise, feel free to leave a lot of comments to increase my glory.

Don’t repeat knowledge

Even if the sentence “don’t repeat yourself” sounds simple enough, it also sounds a bit too broad.

In The Pragmatic Programmer, DRY is defined as “every piece of knowledge must have a single, unambiguous, authoritative representation within a system”.

That’s great but… what’s a piece of knowledge?

I would define it as any part of your business domain or an algorithm.

To take an overused e-commerce example, a shipment class and its behavior would be part of the business domain of your application. A shipment is something real your company uses to send products to its customers. It’s part of the business model of your company.

Therefore the logic of this shipment should only appear once in the application.

The reason is obvious: imagine that you need to send shipments to a warehouse. You need to trigger this logic in 76 different places in your application.

No problem: you repeat the logic 76 times.

After a while your boss comes to you and asks you to change the logic. Instead of sending shipments to one warehouse, you need to send them to three different ones.

The result? You will spend a lot of time changing the logic since you will have to change it in 76 places! This is a pure waste of time, a good way to produce bugs and the best method to piss your boss off.

The solution: create a single representation of your knowledge. Put the logic to send the shipment in one place and then use the representation of this knowledge anywhere you need it. In OOP, sending a shipment could be a method of the class Shipment you can reuse at will.
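As a minimal sketch of that single representation: the WarehouseGateway interface, the in-memory gateway and the warehouse identifiers below are all hypothetical, invented only to make the idea runnable.

```php
<?php

/** Hypothetical port to whatever really talks to a warehouse. */
interface WarehouseGateway
{
    public function receive(string $warehouse, string $shipmentId): void;
}

/** In-memory gateway, only here to make the sketch runnable. */
class InMemoryWarehouseGateway implements WarehouseGateway
{
    /** @var string[] */
    public $received = [];

    public function receive(string $warehouse, string $shipmentId): void
    {
        $this->received[] = sprintf('%s:%s', $warehouse, $shipmentId);
    }
}

class Shipment
{
    /** @var string */
    private $id;

    /** @var string[] The only place that knows where shipments go. */
    private $warehouses = ['warehouse-a', 'warehouse-b', 'warehouse-c'];

    public function __construct(string $id)
    {
        $this->id = $id;
    }

    /** The single representation of the "send a shipment" knowledge. */
    public function send(WarehouseGateway $gateway): void
    {
        foreach ($this->warehouses as $warehouse) {
            $gateway->receive($warehouse, $this->id);
        }
    }
}

// Each of the 76 call sites only needs this:
$gateway = new InMemoryWarehouseGateway();
(new Shipment('42'))->send($gateway);
```

With this shape, going from one warehouse to three means editing the $warehouses list once; the call sites stay untouched.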

Another quick example: imagine you coded a fancy class to parse B-trees. This can be considered knowledge as well: it’s an algorithm which should be defined once. The representation of that knowledge should be used everywhere, without repeating the knowledge itself.

DRY and code duplication

So DRY is all about knowledge? All about business logic?

Let’s begin with the obvious:

<?php

interface Product
{
    public function displayPrice();
}

class PlasticDuck implements Product
{
    /** @var int */
    private $price;

    public function __construct(int $price)
    {
        $this->price = $price;
    }

    public function displayPrice()
    {
        echo sprintf("The price of this plastic duck is %d euros!", $this->price);
    }
}

$plasticDuck = new PlasticDuck(2);
$plasticDuck->displayPrice();

This code doesn’t look that bad, does it? Dave, your colleague developer, doesn’t agree though. When Dave sees this code, he comes to your desk and screams:

  1. The word price is repeated 6 times.
  2. displayPrice() method is repeated in the interface, the implementation and called at runtime.

Dave, the expert beginner of your company, has no clue about OOP.

I can see you, wonderful developer, looking at Dave like an experienced gardener looks at a slug, and answering:

  1. It’s a variable (and a property) and you need to repeat them in your code.
  2. However, the logic (displaying the price) is only present once, in the method itself. Neither knowledge nor algorithm is repeated.

No DRY violation here. Dave is speechless feeling your powerful aura illuminating the whole room.

But Dave is a true expert beginner: you attacked his expertise, and now he’s angry. He googles the DRY principle, looks at another piece of code you’ve written and comes back to your desk, slapping it in your face:

<?php

class CsvValidation
{
    public function validateProduct(array $product)
    {
        if (!isset($product['color'])) {
            throw new \Exception('Import fail: the product attribute color is missing');
        }

        if (!isset($product['size'])) {
            throw new \Exception('Import fail: the product attribute size is missing');
        }

        if (!isset($product['type'])) {
            throw new \Exception('Import fail: the product attribute type is missing');
        }
    }
}

Dave, full of himself, claims: “You stupid! This code is not DRY!” And you, having read this article, answer: “But the business logic, the knowledge, is still not repeated!”

Again, you’re right. The method validates some CSV parsing output in only one place (validateProduct()). This is the knowledge, and it’s not repeated.

Dave is not ready to accept it though. “What about all those if everywhere? Isn’t it an obvious DRY violation?”

You answer in a deep voice, pronouncing every word perfectly, your knowledge bouncing off the walls to create an infinite echo of awareness:

“Well… no. It’s not. I would call that unnecessary code duplication, but not a DRY principle violation”.

Suddenly your fingers type on your keyboard at the speed of light the following code:

<?php

class CsvValidation
{
    private $productAttributes = [
        'color',
        'size',
        'type',
    ];

    public function validateProduct(array $product)
    {
        foreach ($this->productAttributes as $attribute) {
            if (!isset($product[$attribute])) {
                throw new \Exception(sprintf('Import fail: the product attribute %s is missing', $attribute));
            }
        }
    }
}

Looks better, doesn’t it?

To summarize:

  1. Knowledge duplication is always a DRY principle violation. I can’t think of an example where it’s not (leave a comment if you have some! wink wink).

  2. Code duplication doesn’t necessarily mean violation of the DRY principle.

Dave is still not convinced though. With a serenity rivaling the highest spiritual masters through the ages, you deliver the final blow.

“Most people take DRY to mean you shouldn’t duplicate code. That’s not its intention. The idea behind DRY is far grander than that.”

Who said that? Dave Thomas, one of the authors of The Pragmatic Programmer, the very same book that defined the DRY principle itself!

DRY everything: the recipe for disaster

Useless abstractions

Let’s take a more interesting, real-life example:

I’m currently working on an application for filmmakers. They can easily upload their movies and their metadata (title, description, cast and crew of the movie…) on it. This information is then displayed on a VOD platform.

This is an MVC application looking like this:

[Figure: basic project filetree]

However, filmmakers are not the only ones who can create content via the application. The content team of my company can use it as well. The reason? Some filmmakers don’t want to bother doing it themselves.

The filmmakers and the content team have very different needs. The content team is used to working with a CMS; the filmmakers are not.

Therefore we decided to create two interfaces. The first one, for the content team, has no guidance or explanation but lets you enter content as fast as you can. The second one, for the filmmakers, explains everything you should do with a more user-friendly approach.

First we created the interface for the filmmakers and then we duplicated it for the content team. Here is the result:

[Figure: basic project filetree]

At that point, if you plan to kidnap my friends and family to force me to change this code, please don’t. I need them. This looks like an obvious and ugly violation of the DRY principle: views and controllers repeated all over the place.

What were the other solutions? We could have grouped the common logic by using something like a template method. However, this would have coupled the controllers together. Change the abstract class and every single one of your controllers needs to support the change.

In a lot of cases, we knew that those views would be displayed very differently depending on the interface. It would have created a lot of if statements in the controllers’ actions, which is not something we want. The code would have been way more complex.

Moreover the controllers shouldn’t contain any business logic. If you recall the definition of the DRY principle, it’s this knowledge which should not be duplicated.

In short, trying to apply DRY everywhere can have two consequences:

  1. Unnecessary coupling
  2. Unnecessary complexity

Obviously you don’t want either of these in your application.
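To make both costs concrete, here is a rough sketch of the template method we decided against. Every name in it (AbstractMovieController, the two subclasses, the hook methods, the HTML) is hypothetical and does not come from the real application.

```php
<?php

abstract class AbstractMovieController
{
    /** Template method shared by both interfaces. */
    public function editAction(array $movie): string
    {
        $html = sprintf('<h1>%s</h1>', $movie['title']);

        // Each interface-specific need ends up as a hook or a condition here.
        if ($this->showsGuidance()) {
            $html .= '<p>' . $this->guidanceText() . '</p>';
        }

        return $html . $this->renderForm($movie);
    }

    abstract protected function showsGuidance(): bool;

    abstract protected function guidanceText(): string;

    abstract protected function renderForm(array $movie): string;
}

class FilmmakerMovieController extends AbstractMovieController
{
    protected function showsGuidance(): bool { return true; }

    protected function guidanceText(): string { return 'Fill in the title first.'; }

    protected function renderForm(array $movie): string { return '<form class="guided"></form>'; }
}

class ContentTeamMovieController extends AbstractMovieController
{
    protected function showsGuidance(): bool { return false; }

    // Forced to exist by the parent, even though this interface never uses it.
    protected function guidanceText(): string { return ''; }

    protected function renderForm(array $movie): string { return '<form class="fast"></form>'; }
}

$movie = ['title' => 'My movie'];
$guided = (new FilmmakerMovieController())->editAction($movie);
$fast = (new ContentTeamMovieController())->editAction($movie);
```

Any new abstract method or any change to editAction() ripples into both controllers (the coupling), and every difference between the two interfaces becomes one more condition in the parent (the complexity).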

Premature optimization

You shouldn’t apply the DRY principle if your business logic doesn’t have any duplication yet. Again, “it depends”, but, as a rule of thumb, trying to apply DRY to something which is used only once can lead to premature optimization.

If you begin to abstract something because “it could be useful later”, please don’t. Why?

  1. You will spend time abstracting something which might never be reused. Business needs can change very quickly and drastically.
  2. Again, you will possibly introduce complexity and coupling in your code for… nothing.

Code reuse and code duplication are two different things. DRY states that you shouldn’t duplicate knowledge, not that you should code to be able to reuse everything.

Code first, make it work, and then keep in mind all these principles you know (DRY, SOLID and so on) to refactor efficiently.

A DRY principle violation should be handled once the knowledge is actually duplicated, not before.

Duplication of knowledge?

You remember when I stated above that repetition of business logic is always a violation of the DRY principle? Obviously this applies when the same business logic is repeated.

An example:

<?php

/** Shipment from the warehouse to the customer */
class Shipment
{
    public $deliveryTime = 4; // in days

    public function calculateDeliveryDay(): \DateTime
    {
        return new \DateTime("now +{$this->deliveryTime} day");
    }
}

/** Order return of a customer */
class OrderReturn
{
    public $returnLimit = 4; // in days

    public function calculateLastReturnDay(): \DateTime
    {
        return new \DateTime("now +{$this->returnLimit} day");
    }
}

You can hear Dave your colleague developer gently screaming in your ears once again: “This is an obvious violation of everything I believe in! What about the DRY principle? My heart is bleeding!”.

However, Dave is wrong again. From an e-commerce perspective, the delivery time of a shipment to a customer (Shipment::calculateDeliveryDay()) has nothing to do with the last day the customer can return their ordered products (OrderReturn::calculateLastReturnDay()).

These are two different functionalities. What appears to be a code duplication is just a pure coincidence.

What can happen if you combine those two methods into one? If your company decides that the customer now has one month to return their products, you will have to split the method again. If you don’t, the shipment delivery will take one month as well!

This is not the best way to please your customers.
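Here is a sketch of that trap, assuming the two methods had been merged into a hypothetical SharedDelay helper. The helper is invented for the illustration; the methods also take an explicit start date instead of “now” so the result is reproducible, and “one month” is approximated as 30 days.

```php
<?php

/** Hypothetical "DRY" extraction of two coincidentally identical methods. */
class SharedDelay
{
    /** @var int In days; used for the delivery time AND the return limit. */
    public static $days = 4;

    public static function dayAfter(\DateTimeImmutable $from): \DateTimeImmutable
    {
        return $from->modify(sprintf('+%d day', self::$days));
    }
}

class Shipment
{
    public function calculateDeliveryDay(\DateTimeImmutable $from): \DateTimeImmutable
    {
        return SharedDelay::dayAfter($from);
    }
}

class OrderReturn
{
    public function calculateLastReturnDay(\DateTimeImmutable $from): \DateTimeImmutable
    {
        return SharedDelay::dayAfter($from);
    }
}

$orderedOn = new \DateTimeImmutable('2024-01-01');

// The business extends the return period to 30 days...
SharedDelay::$days = 30;

$lastReturnDay = (new OrderReturn())->calculateLastReturnDay($orderedOn);

// ...and the delivery now silently takes 30 days too.
$deliveryDay = (new Shipment())->calculateDeliveryDay($orderedOn);
```

Both dates move together because the code treats two unrelated business rules as one piece of knowledge. Keeping $deliveryTime and $returnLimit separate, as in the original classes, avoids this.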

DRY is not only a principle for coding nerds

[Figure: Even the Gin can be DRY nowadays!]

DRY is not something you should only respect in your code. You shouldn’t repeat knowledge in any aspect of your project.

To quote Dave Thomas again: “A system’s knowledge is far broader than just its code. It refers to database schemas, test plans, the build system, even documentation.”

The idea of DRY is simple in theory: you shouldn’t need to update multiple things in parallel when one change occurs.

If your knowledge is repeated twice in your code and you forget to update one representation, it will create bugs. In your documentation, it will lead to misconceptions, confusion and ultimately a wrong implementation.
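One way to avoid that parallel update, sketched under assumptions: the hypothetical ProductAttributes class below is the only place listing the required CSV attributes, and both the validation and the documentation text are derived from it.

```php
<?php

class ProductAttributes
{
    /** Single representation of which CSV attributes are required. */
    const REQUIRED = ['color', 'size', 'type'];

    /** Used by the import: returns the required attributes absent from a row. */
    public static function missingFrom(array $product): array
    {
        return array_values(array_diff(self::REQUIRED, array_keys($product)));
    }

    /** Used to generate the import documentation instead of typing the list by hand. */
    public static function describe(): string
    {
        return 'Required product attributes: ' . implode(', ', self::REQUIRED) . '.';
    }
}

$missing = ProductAttributes::missingFrom(['color' => 'yellow']);
$documentation = ProductAttributes::describe();
```

Adding a required attribute then changes the validation and the documentation in one go, so they cannot drift apart.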

And so on and so forth.

DRY is a principle

At the beginning of my career, I was often a victim of analysis paralysis. All those principles were holding me back from being productive and efficient. It was too complex and I didn’t want to screw everything up.

I wanted to follow principles to the letter to be a good developer.

However, principles are not rules. They are just tools to help you go in the right direction.

Everything has a cost in development. DRY is no exception. Obviously you shouldn’t repeat your business logic all over the place, but you shouldn’t tightly couple everything using too many abstractions either. In short: be careful not to extract every bit of duplicated code and make everything depend on it.

Don’t get me wrong: extracting code to make it available in one place can be useful. However, you need to find the right way to do it depending on your application. It must be a thoughtful decision.

Of course, this article is meant to evolve with your experience and your vision of the DRY principle. It is a very broad (and philosophical) subject, so don’t hesitate to leave a comment.