Repositories

How to use repositories to store and fetch data.

Collection Oriented
Persistence Oriented
Query Methods
Query Classes
Implementations

The repository was originally introduced as one of the building blocks of tactical Domain-Driven Design. It has since become common in all business applications, mainly thanks to Spring Data. A repository is a persistent container of entities that attempts to abstract the underlying data storage mechanism. At a minimum, it provides methods for basic CRUD operations (i.e., Creating, Retrieving, Updating, and Deleting entities).

Collection Oriented

Collection oriented repositories try to mimic an in-memory collection, such as Map or List. Once an entity has been added to the repository, any changes persist until it has been deleted from the repository. There’s no need for a save or update method.

A collection oriented, generic repository interface could look like this:

public interface Repository<ID, E> {
    Optional<E> get(ID id); 1
    void put(E entity); 2
    void remove(ID id); 3
}

You can retrieve entities by their IDs.
You can store entities in the repository.
You can remove entities from the repository.

Creating and storing new entities could look like this:

var customer = new Customer();
customer.setName("Acme Incorporated");
repository.put(customer);

Retrieving and updating an entity might look like this:

repository.get(CustomerId.of("XRxY2r9P")).ifPresent(customer -> {
    customer.setEmail("info@acme.com");
    customer.setPhoneNumber("123-456-789");
});

Deleting an entity could look like this:

repository.remove(CustomerId.of("XRxY2r9P"));

Collection oriented repositories can be quite difficult to implement. The repository implementation would have to know when an entity has been changed, so that it can write it to the underlying storage. Handling transactions and errors would also be non-trivial. This is a telling example of the underlying storage mechanism leaking into the repository abstraction.

Persistence Oriented

Persistence oriented repositories don’t try to hide the fact that the data has to be written to, and read from, some kind of external storage. They have separate methods for inserting, updating, and deleting the entity. If the repository is able to deduce whether any given entity has been persisted, the insert and update methods can be combined into a single save method. No changes to an entity are ever written to the storage without an explicit call to save.

A persistence oriented, generic repository interface could look like this:

public interface Repository<ID, E> {
    Optional<E> findById(ID id); 1
    E save(E entity); 2
    void delete(ID id); 3
}

You can retrieve entities by their IDs.
You can save entities in the repository.
You can delete entities from the repository.

The method names resemble database operations, instead of in-memory collection operations.

Creating and storing new entities might look like this:

var customer = new Customer();
customer.setName("Acme Incorporated");
customer = repository.save(customer);

The save method returns an entity, which can be the same instance as the one that was passed to the method, or a new one. This makes it easier to implement the repository, as some persistence frameworks (e.g., JPA) work in this way. This is another example of the underlying technology leaking into the repository abstraction.

Retrieving and updating an entity would look like this:

repository.get(CustomerId.of("XRxY2r9P")).ifPresent(customer -> {
    customer.setEmail("info@acme.com");
    customer.setPhoneNumber("123-456-789");
    repository.save(customer);
});

Deleting an entity could look like this:

repository.delete(CustomerId.of("XRxY2r9P"));

Persistence oriented repositories are easier to implement than collection oriented repositories. This is because their API aligns with the read and write operations of the underlying storage. Regardless of whether you’re using a relational database, a flat file, or some external web service, you can write persistence oriented repositories for them all.

Unless you have a good reason for choosing collection-based repositories, you should use persistence oriented repositories in your Vaadin applications.

Query Methods

Although retrieving an entity by its ID is an important operation, it is not enough in most business applications. You need to be able to retrieve more than one entity at the same time, based on different criteria. If the dataset is large, you need to be able to split it into smaller pages and load them one at a time.

You can add query methods to your repositories. Here is an example of a repository with two query methods:

public interface CustomerRepository extends Repository<CustomerId, Customer> {
    List<Customer> findByName(String searchTerm, int maxResult);
    Page<Customer> findAll(PageRequest pageRequest);
}

The first method finds all customers whose names match the given search term. Always limit the size of your query results. This is why the method shown also has a maxResult parameter. It protects the application from running out of memory in case the query result is much larger than anticipated. If too many customers are returned, the user has to tweak the search term and try again.

The second method finds all customers in the data storage, but splits the results into pages. The PageRequest object contains information about where to start retrieving data, how many customers to retrieve, how to sort them, and so on.

As long as you only have a handful of query methods, keeping them in the repository is fine. However, once the number of query methods starts to grow, you may have problems. As the query methods become more specific, they also become more difficult to reuse. Over time, your repository may be full of query methods that are almost similar. When a new, similar use case appears, it’s easier to add a new query method than to determine which of the old ones to reuse.

One way of solving this problem is to introduce query specifications. A query specification is an object that explains which entities should be included in the query result. In the earlier example, you can replace all query methods with a single one:

public interface CustomerRepository extends Repository<CustomerId, Customer> {
    Page<Customer> findBySpecification(CustomerSpecification specification,
        PageRequest pageRequest);
}

You would then use the query method like this:

var result = customerRepository.findBySpecification(
    CustomerSpecification.nameEquals("ACME")
        .and(CustomerSpecification.countryEquals(Country.US)
            .or(CustomerSpecification.countryEquals(Country.FI))
        ),
    PageRequest.ofSize(10)
);
...

This query method would return the first ten customers whose names match the "ACME" query string and who are located in either the U.S. or Finland.

The challenge with this approach is that it’s difficult to build specification objects that aren’t coupled to the technology used to implement the repository. However, most business applications don’t change their databases, nor do they have to support multiple repository implementations. Since the repositories are already a leaky abstraction, you can customize the specifications implementation to make things easier.

You’ll find examples of how to implement specification queries on the JPA and jOOQ documentation pages.

Query Classes

Query specifications are useful when you’re interested in fetching whole entities. However, you often may need to write queries that only include a small part of the entity. For example, if you’re building a customer list view that only shows the customers' names and email addresses, there is no point in fetching the complete Customer-entity. The repository now looks like this:

public interface CustomerRepository extends Repository<CustomerId, Customer> {
    Page<Customer> findBySpecification(CustomerSpecification specification,
        PageRequest pageRequest);

    Page<CustomerListItem> findListItemsBySpecification(
        CustomerSpecification specification,
        PageRequest pageRequest);

    record CustomerListItem(CustomerId id, String name, EmailAddress email) {}
}

Again, if you only have a handful of these queries, you can add them to the repository interface. However, if you have many different views, and every view needs its own query, the repository interface again risks becoming unstructured and difficult to maintain.

To address this issue, you should move all query methods that don’t return entities to their own query classes. After moving the query method from the example above to its own class, you get something like this:

public interface CustomerListQuery {
    Page<CustomerListItem> findBySpecification(
        CustomerSpecification specification,
        PageRequest pageRequest);

    public record CustomerListItem(CustomerId id, String name, EmailAddress email) {}
}

Queries read from the same data source as the repositories. You can create as many query classes as you need without cluttering your repositories.

The queries don’t have to be tied to a particular entity. Summary views, for example, often need complex queries that join data from different types of entities. Putting queries like that in repositories can be difficult. Either you can’t find a single repository that seems like a good candidate, or you have multiple candidates from which to choose. Creating a separate query class solves this problem.

Implementations

JPA Repositories: How to implement repositories with JPA and Spring Data.
jOOQ Repositories: How to implement repositories with jOOQ.

Docs