I recently had a Java feature to implement which required some thought around read/write thread synchronization and performance. I don’t pretend to be an expert on synchronization; in fact, it’s an area of programming I have A LOT to learn about. It’s the kind of complex topic that often triggers the panicky, overzealous and irrational behaviour you see in developers just before yet another prematurely optimized solution gets dumped on the world. When I approached this particular feature, I thought I wouldn’t fall into this trap. I was wrong.
The crux of the synchronization problem was that I wanted to handle frequent concurrent reads and writes to a Java ArrayList in a thread-safe manner. It’s not that I didn’t know of a solution, but that I didn’t know of the best solution. After initially over-thinking the problem, I decided to post a question on stackoverflow.com just to see how developers with more experience on the subject would approach it. Fortunately, within minutes the community was asking me questions in the comments that made me realise I didn’t even have adequate data on the load this feature would have to endure; yes, someday it may have to scale big time… but right now, it doesn’t!
Regardless of the process taken, I still managed to learn quite a bit about different synchronization techniques:
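- Collections.synchronizedList(): Wraps a list so that each individual method call (add, remove, get, etc.) is synchronized automatically. Compound actions such as iterating over the list still need a manual synchronized block on the returned list. Easy solution and likely suitable for most purposes.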
- CopyOnWriteArrayList: Makes a fresh copy of the underlying array each time the list is modified, so threads that are traversing the list keep reading their own snapshot and never hit a ConcurrentModificationException. I couldn’t go with this option due to the high number of mutations required in the feature.
- ReadWriteLock: Allows one to manually apply read and write locks to the parts of the code requiring synchronization. The great thing with this solution is that multiple threads can read concurrently while writes remain exclusive, with the downside being more manual effort and synchronization code.
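To make the three options above a bit more concrete, here is a minimal sketch of each. It is not the code from the feature itself: the class names `GuardedList` and `SyncOptionsDemo`, and the sample values, are made up purely for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical wrapper (illustration only): a plain ArrayList guarded by a ReadWriteLock.
class GuardedList<T> {
    private final List<T> items = new ArrayList<T>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    void add(T item) {
        lock.writeLock().lock();      // exclusive: blocks readers and other writers
        try {
            items.add(item);
        } finally {
            lock.writeLock().unlock();
        }
    }

    T get(int index) {
        lock.readLock().lock();       // shared: many readers can hold this at once
        try {
            return items.get(index);
        } finally {
            lock.readLock().unlock();
        }
    }

    int size() {
        lock.readLock().lock();
        try {
            return items.size();
        } finally {
            lock.readLock().unlock();
        }
    }
}

// Hypothetical demo class (illustration only).
class SyncOptionsDemo {
    // Collections.synchronizedList(): single calls are safe, iteration needs a manual lock.
    static int sumSynchronized() {
        List<Integer> list = Collections.synchronizedList(new ArrayList<Integer>());
        list.add(1);
        list.add(2);
        list.add(3);
        int sum = 0;
        synchronized (list) {         // required while iterating
            for (int n : list) {
                sum += n;
            }
        }
        return sum;
    }

    // CopyOnWriteArrayList: the iterator walks a snapshot, so mutating mid-loop is safe.
    static int elementsSeenWhileMutating() {
        List<String> list = new CopyOnWriteArrayList<String>();
        list.add("a");
        list.add("b");
        int seen = 0;
        for (String s : list) {
            list.add(s + "!");        // copies the backing array; the iterator is unaffected
            seen++;
        }
        return seen;                  // only the two elements present when iteration began
    }
}
```

The try/finally around each lock is the "more manual effort" part: forget an unlock and every other thread stalls. The CopyOnWriteArrayList method also hints at why it is a poor fit for write-heavy code, since every `add` copies the whole backing array.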
The funny thing is that as development of the feature continued, requirements changed and it turned out that random access to the data wasn’t even necessary. Since the only operations still needed were traversals and mutations, the end solution changed to using the ConcurrentLinkedQueue data structure… meaning all that initial over-thinking was, well, lame.
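For anyone curious what "traversals and mutations" on a ConcurrentLinkedQueue look like, here is a rough sketch under made-up assumptions. The `QueueDemo` class, the thread count and the element values are hypothetical and not taken from the actual feature.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical demo class (illustration only).
class QueueDemo {
    // Several threads add to the same queue with no explicit locking,
    // then the caller traverses it and counts what it finds.
    static int fillAndCount(final Queue<Integer> queue, int threads, final int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int i = 0; i < perThread; i++) {
                        queue.offer(i);   // thread-safe, non-blocking insert at the tail
                    }
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();                 // wait until every writer has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        int count = 0;
        for (Integer ignored : queue) {   // traversal via a weakly consistent iterator
            count++;
        }
        return count;
    }

    static int demo() {
        return fillAndCount(new ConcurrentLinkedQueue<Integer>(), 4, 1000);
    }
}
```

The iterator never throws ConcurrentModificationException, even if other threads are still adding or removing elements, which is what makes the queue a comfortable fit when all you need is to walk the data and mutate it. One thing to keep in mind is that `size()` on a ConcurrentLinkedQueue walks the whole queue, so it is not a constant-time call.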