Motivation:
In delete batching, the DELETE statements are not sorted at all and this causes more batches than needed for this job. Using orphanRemoval=true and manually dissociating the children from parents help us to obtain an optim number of batches.
150+ PERSISTENCE PERFORMANCE ITEMS
THAT WILL ROCK YOUR APPS
Description:
This application is an example of batching deletes (in MySQL, but other database can be used as well) via orphanRemoval=true.
Note: Spring deleteAllInBatch() and deleteInBatch() don't use delete batching and don't take advantage of automatic optimstic locking mechanism to prevent lost updates (e.g., @Version is ignored). They rely on Query.executeUpdate() to trigger bulk operations. These operations are fast, but Hibernate doesn’t know which entities are removed, therefore, the Persistence Context is not updated accordingly (it's up to you to flush (before delete) and close/clear (after delete) the Persistence Context accordingly to avoid issues created by unflushed (if any) or outdated (if any) entities). The first one (deleteAllInBatch()) simply triggers a delete from entity_name statement and is very useful for deleting all records. The second one (deleteInBatch()) triggers a delete from entity_name where id=? or id=? or id=? ... statement, therefore, is prone to cause issues if the generated DELETE statement exceedes the maximum accepted size. This issue can be controlled by deleting the data in chunks, relying on IN operator, and so on. Bulk operations are faster than batching which can be achieved via the deleteAll(), deleteAll(Iterable<? extends T> entities) or delete() method. Behind the scene, the two flavors of deleteAll() relies on delete(). The delete()/deleteAll() methods rely on EntityManager.remove() therefore the Persistence Context is synchronized accordingly. If automatic optimstic locking mechanism (to prevent lost updates) is enabled then it will be used. Moreover, cascading removals and orphanRemoval works as well.
Key points for using deleteAll()/delete():
In this example, we have the Author entity and each author can have several Book (one-to-many)
First, we use orphanRemoval=true and CascadeType.ALL
Second, we explicitily dissociate all Book from the corresponding Author
Third, we explicitly (manually) flush the Persistence Context; it is time for orphanRemoval=true to enter into the scene; thanks to this setting, all disassociated books will be deleted; the generated DELETE statements are batched (if orphanRemoval is set to false, a bunch of updates will be executed instead of deletes)
Forth, we delete all Author via the deleteAll() or delete() method (since we have dissaciated all Book, the Author deletion will take advantage of batching as well):
Tam Ta Da Dam! :) The complete application is available on GitHub.
If you need a deep dive into the performance recipes exposed in this repository then I am sure that you will love my book "Spring Boot Persistence Best Practices".
Comments