Docs

Documentation versions (currently viewingVaadin 24)

Repositories

How to use repositories to store and fetch data.

The repository was originally introduced as one of the building blocks of tactical Domain-Driven Design, but has since then become common in all business applications, mainly thanks to Spring Data. A repository is a persistent container of entities that attempts to abstract away the underlying data storage mechanism. At its minimum, it provides methods for basic CRUD operations: Creating, Retrieving, Updating, and Deleting entities.

Collection Oriented

Collection oriented repositories try to mimic an in-memory collection, such as Map or List. Once an entity has been added to the repository, any changes made to it are automatically persisted until it has been deleted from the repository. In other words, there is no need for a save or update method.

A collection oriented, generic repository interface could look like this:

public interface Repository<ID, E> {
    Optional<E> get(ID id); // (1)
    void put(E entity); // (2)
    void remove(ID id); // (3)
}
  1. You can retrieve entities by their IDs.

  2. You can store entities in the repository.

  3. You can remove entities from the repository.

Creating and storing new entities could look like this:

var customer = new Customer();
customer.setName("Acme Incorporated");
repository.put(customer);

Retrieving and updating an entity could look like this:

repository.get(CustomerId.of("XRxY2r9P")).ifPresent(customer -> {
    customer.setEmail("info@acme.com");
    customer.setPhoneNumber("123-456-789");
});

Deleting an entity could look like this:

repository.remove(CustomerId.of("XRxY2r9P"));

Collection oriented repositories can be quite difficult to implement. The repository implementation would have to know when an entity has been changed, so that it can write it to the underlying storage. Handling transactions and errors would also be non-trivial. This is a telling example of the underlying storage mechanism leaking into the repository abstraction.

Persistence Oriented

Persistence oriented repositories do not try to hide the fact that the data has to be written to, and read from, some kind of external storage. They have separate methods for inserting, updating, and deleting the entity. If the repository is able to deduce whether any given entity has been persisted or not, the insert and update methods can be combined into a single save method. No changes to an entity are ever written to the storage without an explicit call to save.

A persistence oriented, generic repository interface could look like this:

public interface Repository<ID, E> {
    Optional<E> findById(ID id); // (1)
    E save(E entity); // (2)
    void delete(ID id); // (3)
}
  1. You can retrieve entities by their IDs.

  2. You can save entities in the repository.

  3. You can delete entities from the repository.

Note how the method names resemble database operations, instead of in-memory collection operations.

Creating and storing new entities could look like this:

var customer = new Customer();
customer.setName("Acme Incorporated");
customer = repository.save(customer);

The save method returns an entity, which can be the same instance as the one that was passed to the method, or a new one. This makes it easier to implement the repository, as some persistence frameworks, such as JPA, work in this way. This is another example of the underlying technology leaking into the repository abstraction.

Retrieving and updating an entity could look like this:

repository.get(CustomerId.of("XRxY2r9P")).ifPresent(customer -> {
    customer.setEmail("info@acme.com");
    customer.setPhoneNumber("123-456-789");
    repository.save(customer);
});

Deleting an entity could look like this:

repository.delete(CustomerId.of("XRxY2r9P"));

Persistence oriented repositories are easier to implement than collection oriented repositories, because their API aligns with the read and write operations of the underlying storage. Regardless of whether you are using a relational database, a flat file, or some external web service, you can write persistence oriented repositories for them all.

Unless you have a good reason for choosing collection-based repositories, you should use persistence oriented repositories in your Vaadin applications.

Query Methods

Although retrieving an entity by its ID is an important operation, it is not enough in most business applications. You need to be able to retrieve more than one entity at the same time, based on different criteria. If the dataset is big, you need to be able to split it into smaller pages and load them one at a time.

You can add query methods to your repositories. For example, here is a repository with two query methods:

public interface CustomerRepository extends Repository<CustomerId, Customer> {
    List<Customer> findByName(String searchTerm, int maxResult);
    Page<Customer> findAll(PageRequest pageRequest);
}

The first method finds all customers whose names match the given search term. It is good practice to always limit the size of your query results, which is why the method also has a maxResult parameter. This protects your application from running out of memory in case the query result turns out to be much larger than anticipated. If too many customers are returned, the user has to tweak the search term and try again.

The second method finds all the customers in the data storage, but splits it up into pages. The PageRequest object contains information about where to start retrieving data, how many customers to retrieve, how to sort them, and so on.

As long as you only have a handful of query methods, keeping them in the repository is fine. However, once the number of query methods starts to grow, you may run into problems. As the query methods become more specific, they also become more difficult to reuse. Over time, your repository may be full of query methods that are almost similar. When a new, similar use case shows up, it is easier to add a new query method than figure out which of the old ones to reuse.

One way of solving this problem is to introduce query specifications. A query specification is an object that explains which entities should be included in the query result. In the earlier example, you can replace all the query methods with a single one:

public interface CustomerRepository extends Repository<CustomerId, Customer> {
    Page<Customer> findBySpecification(CustomerSpecification specification,
        PageRequest pageRequest);
}

You would then use the query method like this:

var result = customerRepository.findBySpecification(
    CustomerSpecification.nameEquals("ACME")
        .and(CustomerSpecification.countryEquals(Country.US)
            .or(CustomerSpecificaiton.countryEquals(Country.FI))
        ),
    PageRequest.ofSize(10)
);
...

This query method would return the 10 first customers whose names match the "ACME" query string and who are located in either the U.S. or Finland.

The challenge with this approach is that it is difficult, but not impossible, to build specification objects that are not coupled to the technology used to implement the repository. However, most business applications do not change their databases, nor do they have to support multiple repository implementations. Since the repositories are already a leaky abstraction, you can make the specifications implementation specific to make things easier.

You can find examples of how to implement specification queries on the JPA and jOOQ documentation pages.

Query Objects

Query specifications are useful when you are interested in fetching whole entities. However, you often need to write queries that only include a small part of the entity. For example, if you are building a customer list view that only shows the customers' names and email addresses, there is no point in fetching the complete Customer-entity. The repository now looks like this:

public interface CustomerRepository extends Repository<CustomerId, Customer> {
    Page<Customer> findBySpecification(CustomerSpecification specification,
        PageRequest pageRequest);

// tag::snippet[]
    Page<CustomerListItem> findListItemsBySpecification(
        CustomerSpecification specification,
        PageRequest pageRequest);

    record CustomerListItem(CustomerId id, String name, EmailAddress email) {}
// end::snippet[]
}

Again, if you only have a handful of these queries, you can add them to the repository interface. However, if you have many different views, and every view needs its own query, the repository interface again risks becoming unstructured and difficult to maintain.

To address this issue, you should move all query methods that don’t return entities to their own query objects. After moving the query method from the example above to its own query object, you end up with something like this:

public interface CustomerListQuery {
    Page<CustomerListItem> findBySpecification(
        CustomerSpecification specification,
        PageRequest pageRequest);

    public record CustomerListItem(CustomerId id, String name, EmailAddress email) {}
}

Query objects read from the same data source as the repositories. You can create as many query objects as you need without cluttering your repositories.

The query objects do not have to be tied to a particular entity. Summary views, for example, often need complex queries that join data from different types of entities together. Putting queries like that in repositories can be difficult. Either you can’t find a single repository that feels like a good candidate, or you have multiple candidates to choose from. Creating a separate query object solves this problem.

Note
If you know the Command Query Responsibility Segregation (CQRS) architectural pattern, the idea of query objects may sound familiar. However, there is a big difference: Whereas CQRS uses different data models for writing and reading, query objects and repositories operate on the same data model, using the same data source.

Building

JPA Repositories
How to implement repositories with JPA and Spring Data.
jOOQ Repositories
How to implement repositories with jOOQ.
Pluggable Persistence
How to support multiple persistence solutions with a single SPI.