Motivation:
Committing the database transaction after each batch execution avoids long-running transactions and, in case of a failure, rolls back only the failed batch, so the previously committed batches are not lost.
Description:
Let's suppose that we have a one-to-many relationship between Author and Book entities:
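A minimal mapping sketch of the two entities might look like the following (field names, cascade settings, and the use of jakarta.persistence annotations are illustrative assumptions, not the repository's exact code; note that the IDENTITY generator would disable JDBC insert batching, so a SEQUENCE-style generator is assumed):

```java
import jakarta.persistence.*;
import java.util.ArrayList;
import java.util.List;

@Entity
public class Author {

    @Id
    // IDENTITY disables Hibernate's JDBC insert batching, so use SEQUENCE
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;

    private String name;

    @OneToMany(mappedBy = "author", cascade = CascadeType.ALL, orphanRemoval = true)
    private List<Book> books = new ArrayList<>();

    // Keep both sides of the association in sync
    public void addBook(Book book) {
        books.add(book);
        book.setAuthor(this);
    }
}

@Entity
class Book {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private Long id;

    private String title;

    @ManyToOne(fetch = FetchType.LAZY)
    private Author author;

    void setAuthor(Author author) {
        this.author = author;
    }
}
```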
We want to create a bunch of authors with books and save them in the database (e.g., a MySQL database) using the batch technique. By default, the inserts are batched per author: one small batch for an author, one batch for that author's books, another batch for the next author, another for their books, and so on. To batch all authors together and all books together, we need to order the inserts, as this application does.
Moreover, this example commits the database transaction after each batch execution. This way we avoid long-running transactions and, in case of a failure, we roll back only the failed batch without losing the previously committed batches. For each batch, the Persistence Context is flushed and cleared, so we maintain a thin Persistence Context. This way the code is not prone to memory errors or to the performance penalties caused by slow flushes.
Key points:
Besides the settings specific to batching inserts in MySQL (you can use any database; do not conclude that this code works only for MySQL), we set the spring.jpa.properties.hibernate.order_inserts property in application.properties:
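A sketch of the relevant configuration might look like this (the batch size value and the MySQL-specific JDBC URL flag are typical companions of insert ordering, shown here as illustrative assumptions):

```properties
# Order inserts by entity type so that authors and books form large batches
spring.jpa.properties.hibernate.order_inserts=true
# Enable JDBC batching with an assumed batch size of 30
spring.jpa.properties.hibernate.jdbc.batch_size=30
# MySQL-specific: rewrite a batch of INSERTs into a single multi-row INSERT
# (flag goes on the JDBC URL; the database name here is illustrative)
# spring.datasource.url=jdbc:mysql://localhost:3306/bookstoredb?rewriteBatchedStatements=true
```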
In your DAO layer, commit the database transaction after each batch execution:
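A framework-free sketch of that loop is shown below. In the real DAO the commit action would, for each chunk, begin a transaction, persist the entities, flush() and clear() the EntityManager, and commit before moving on; here that action is abstracted as a Consumer so the chunking logic stands on its own (class and method names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Framework-free sketch of the per-batch commit loop used in the DAO layer.
public final class BatchExecutor {

    // Splits items into chunks of batchSize and hands each chunk to
    // commitAction; returns the number of batches executed.
    public static <T> int executeInBatches(List<T> items, int batchSize,
                                           Consumer<List<T>> commitAction) {
        int batches = 0;
        for (int i = 0; i < items.size(); i += batchSize) {
            List<T> chunk = items.subList(i, Math.min(i + batchSize, items.size()));
            // Each chunk is committed independently: if this call fails,
            // only the current batch is rolled back; earlier batches
            // remain committed in the database.
            commitAction.accept(new ArrayList<>(chunk));
            batches++;
        }
        return batches;
    }
}
```

Because each batch commits on its own, the persistence-side state (the flushed-and-cleared Persistence Context) never grows beyond one batch.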
Testing time:
Expose the DAO layer in Spring style via repository interface:
Call saveInBatch() from a service-method:
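A self-contained sketch of such a service follows (class, interface, and method names are illustrative). One design point worth noting: the service method is deliberately not annotated with @Transactional, because each batch must open and commit its own transaction inside saveInBatch() rather than run inside one enclosing service transaction:

```java
import java.util.List;

// Hypothetical service delegating bulk creation to the batching repository.
public class BookstoreService {

    // Stand-in for the repository contract; String stands in for Author.
    public interface AuthorRepository {
        void saveInBatch(List<String> authors);
    }

    private final AuthorRepository authorRepository;

    public BookstoreService(AuthorRepository authorRepository) {
        this.authorRepository = authorRepository;
    }

    // Intentionally not transactional: transactions are per batch, in the DAO
    public void bulkCreateAuthors(List<String> authors) {
        authorRepository.saveInBatch(authors);
    }
}
```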
Tam Ta Da Dam! :) The complete application is available on GitHub.
If you need a deep dive into the performance recipes covered in this repository, then I am sure that you will love my book "Spring Boot Persistence Best Practices".