Cleaner Database Queries with Repositories

March 29, 2016

The traditional way to query something from the database is just to execute a bunch of SQL statements from the controller and get the results. Or maybe you use some package or a framework component to abstract the database access away. But as you can see with this approach, your code becomes less flexible as your project gets larger.

So, what should we do instead? Well, there are many options, and the Repository Pattern is a very good one.

Let's start first with an example, then I'll explain the main concept of this pattern and when you should use it.

Example

Here's a very simple example to get all published posts from the database. For simplicity, we'll use PDO with a sqlite database connection.

In the database I have a posts table, which has (id, title, body, published_at) fields.

To get all published posts we need to compare the published_at field with the current date. If it's less than or equal the current date, then it's published, so we fetch it.

First let's see how it's done without repositories.

// The database connection
try {
    $pdo = new PDO('sqlite:database.sqlite');
    $pdo->setAttribute(PDO::ATTR_DEFAULT_FETCH_MODE, PDO::FETCH_OBJ);
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
} catch (PDOException $e) {
    echo $e->getMessage();
    die();
}

$posts = $pdo->query('SELECT * FROM posts WHERE published_at <= CURRENT_DATE')->fetchAll();

// Display all post titles
foreach ($posts as $post) {
    echo "# $post->title";
}

With repositories:

// index.php

require 'vendor/autoload.php';

use Repositories\PostRepository;

try {
    $pdo = new PDO('sqlite:database.sqlite');
    $pdo->setAttribute(PDO::ATTR_DEFAULT_FETCH_MODE, PDO::FETCH_OBJ);
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
} catch (PDOException $e) {
    echo $e->getMessage();
    die();
}

$posts = (new PostRepository($pdo))->getPublished();

foreach ($posts as $post) {
    echo "# $post->title";
}

// Repositories/PostRepository.php

namespace Repositories;

class PostRepository
{
    protected $pdo;

    public function __construct(\PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    public function getPublished()
    {
        return $this->pdo->query('SELECT * FROM posts WHERE published_at <= CURRENT_DATE')->fetchAll();
    }

    public function findById($id)
    {
        // For simplicity, no worries about SQL injection
        return $this->pdo->query('SELECT * FROM posts WHERE id = ' .  $id)->fetch();
    }
}

Before we continue

At first, it would seem that without repositories the code is simpler and easier. That's true! You won't use the repository pattern in those simple cases.

The repository pattern is usually used in those “Enterprise-level applications”.

And to be honest, I don't use them that much in my projects — I always look for the simplest and cleanest ways to accomplish what I have. But nevertheless, it's still something worth learning.

What's the difference in the code?

Later in this article, I'll tell you what improvements can this pattern provide. But for now, let's talk about the changes in the code above.

First, I created a new class called PostRepository in the Repositories directory.

Then, I moved the query that fetches the published posts into a method called getPublished(). And for that to work, I needed to pass an instance of $pdo into the constructor of the Repository class.

I know, things are not so clear now; keep reading for more explanation.

What is a Repository?

You can think of repositories as a simple in-memory collection of domain objects. Or in simple terms, it's just a place where we get data from or put data into — without thinking how the database queries were performed.

So, it's a class that contains a set of methods used to separate the logic of accessing the database from the business logic of the application.

Usually each repository represents a table in the database. That means, you would use it as a mediator between you (the client code) and the data source (the database).

Why to Use a Repository?

The key to understanding repositories is to know why to use them. Here are the top 4 reasons.

1. Minimize duplicate queries

Moving the logic of querying the database into a single place, helps minimize duplicate query logic.

And this is always a good thing, because as you know duplication is almost always a hurt for the design of your application. When you need to change how a query is executed, just go to that single place and all calls to that query will change accordingly — It's the DRY (Don't Repeat Yourself) principle.

2. Separate domain (business logic) from persistence layer (Database)

One important design principle is SRP (single Responsibility Principle), which says that each module should only have one and only one responsibility.

In other words, why should your domain layer know how to query the database? This part should be given to another layer.

Software architecture and layering is a really big topic on its own, and it's beyond the scope of this tutorial.

3. Hide query logic behind readable method names

Using repositories improves the readability of your code significantly; it's easier to understand what a method call does, than trying to understand what a piece of SQL does — imagine if there are multiple table joins with extra conditions.

4. Makes switching between data sources a very easy task.

You can think of a data source as a component or a place from which we retrieve or store the data into. Database is an example of a data source. For other projects the database may not be the only needed data source. Sometimes it is the file system, or some external service — AWS for example.

Imagine a scenario where you want to switch from the database to the file system. Without repositories this would be a very painful task — too many places to change the code.

However, if you're using repositories, you can make this task as easy as a one-line change.

This is possible with the help of polymorphism and coding to interface. All you have to do is to define an interface for each repository, and provide an implementation for each data source.

For example:

namespace Repositories;

interface PostRepository
{
    public function getPublished();
}

class DbPostRepository implements PostRepository
{
    public function getPublished()
    {
        // from database
    }
}

class FsPostRepository implements PostRepository
{
    public function getPublished()
    {
        // from file system
    }
}

This approach also makes unit testing even easier; that's by providing another implementation for in-memory data store. This means, you won't need to touch the database in your tests.

This point may seem really useful for some people and worthless for others because it depends on how large the application is.

Final notes

This is the basic idea of the repository pattern. But before I end up this tutorial, I'd like to give you some final notes.

The repository pattern is not just limited to querying (selecting from) the database, you can also use it to store or update data in the database. For instance, you might have a method called save($title, $body, $publishedAt) to save a new post.
As I mentioned in the last reason, repositories are not exclusive to databases, you can use them for any other data source.
As I said at the beginning, repositories aren't the only option. You have other good ones, such as the Query Object. Definitely, check it out.

OOP Database PHP

About Taha Shashtari

I'm a freelance web developer. Laravel & VueJS are my main tools these days and I love building stuff using them. I write constantly on this blog to share my knowledge and thoughts on things related to web development... Let's be friends on twitter.