Motivation:
Having an efficient solution for chunking a huge List is useful for processing the resulting chunks concurrently. For example, we can implement concurrent batching.
Description:
It is a common scenario to have a big List and to need to split it into multiple smaller Lists of a given size. For example, if we want to employ a concurrent batch implementation, we need to give each thread a sublist of items. Chunking a list can be done via Google Guava's Lists.partition(List list, int size) method or Apache Commons Collections' ListUtils.partition(List list, int size) method. But it can be implemented in plain Java as well. This application exposes 6 ways to do it. The trade-off is between the speed of implementation and the speed of execution. For example, while the implementation relying on grouping collectors does not perform very well, it is quite simple and fast to write (see the sketch below).
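As an illustration of the grouping-collectors approach, here is a minimal sketch (the class name GroupingChunker and the exact shape of the code are mine, not necessarily identical to the implementation in this application):

```java
import java.util.Collection;
import java.util.List;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class GroupingChunker {

    // Splits 'list' into chunks of at most 'size' elements via a grouping collector.
    // Simple to write, but slower than subList()-based chunking for large lists.
    public static <T> Collection<List<T>> chunk(List<T> list, int size) {
        AtomicInteger counter = new AtomicInteger();
        return list.stream()
                .collect(Collectors.groupingBy(
                        item -> counter.getAndIncrement() / size, // chunk index
                        TreeMap::new,                             // keep chunks in order
                        Collectors.toList()))
                .values();
    }

    public static void main(String[] args) {
        List<Integer> items = IntStream.range(0, 12).boxed().collect(Collectors.toList());

        // Prints: [0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11]
        chunk(items, 5).forEach(System.out::println);
    }
}
```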
Key points:
The fastest execution is provided by the Chunk.java class, which relies on the built-in List.subList() method.
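For comparison, a minimal sketch of subList()-based chunking is shown below (this is my own simplified illustration of the idea, not a copy of the repository's Chunk.java):

```java
import java.util.ArrayList;
import java.util.List;

public class SubListChunker {

    // Splits 'list' into views of at most 'size' elements using List.subList().
    // subList() returns views backed by the original list, so no elements are copied.
    public static <T> List<List<T>> chunk(List<T> list, int size) {
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < list.size(); start += size) {
            int end = Math.min(start + size, list.size());
            chunks.add(list.subList(start, end));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 12; i++) {
            items.add(i);
        }

        // Prints: [0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11]
        chunk(items, 5).forEach(System.out::println);
    }
}
```

Note that the returned sublists are views of the original list, so this approach avoids copying elements; each thread in a concurrent batching scenario can simply be handed one of these views.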
Testing time:
Time-performance trend graphic for chunking 500, 1_000_000, 10_000_000 and 20_000_000 items in lists of 5 items:
Notice the results of the last bar, for 20_000_000 items!
Tam Ta Da Dam! :) The complete application is available on GitHub.
If you need a deep dive into the performance recipes exposed in this repository then I am sure that you will love my book "Spring Boot Persistence Best Practices".