@schancel Almost. Initial sync is already pretty well parallelized. After that, however, block verification isn’t parallelized much yet. This commit would be the first step toward making that verification embarrassingly parallel.
I already have working code based on this commit that parallelizes the output-insertion step under the protection of locks, and I have verified that this implementation is correct. I am about halfway through doing the same for the inputs step. Because of lock overhead and lock contention, this implementation is currently slower than serial processing, but that was expected.
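To make the locked approach concrete, here is a minimal sketch, not the code from the branch. `Output`, `LockedUtxoSet`, and `insert_outputs_parallel` are hypothetical names, and a plain `std::unordered_map` behind a single `std::mutex` stands in for the real UTXO storage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified output: an id standing in for the outpoint, plus a value.
struct Output {
    std::uint64_t id;
    std::int64_t value;
};

// Assumption: one global mutex guards the whole map. Every insert serializes
// on it, which is where the lock overhead and contention come from.
class LockedUtxoSet {
public:
    // Returns true if the output was newly inserted, false if the id already existed.
    bool insert(const Output& o) {
        std::lock_guard<std::mutex> guard(mu_);
        return map_.emplace(o.id, o.value).second;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(mu_);
        return map_.size();
    }

private:
    mutable std::mutex mu_;
    std::unordered_map<std::uint64_t, std::int64_t> map_;
};

// Splits the outputs across n_threads workers in a strided pattern:
// worker t handles indices t, t + n_threads, t + 2 * n_threads, ...
void insert_outputs_parallel(LockedUtxoSet& set,
                             const std::vector<Output>& outs,
                             unsigned n_threads) {
    assert(n_threads > 0);
    std::vector<std::thread> workers;
    workers.reserve(n_threads);
    for (unsigned t = 0; t < n_threads; ++t) {
        workers.emplace_back([&set, &outs, t, n_threads] {
            for (std::size_t i = t; i < outs.size(); i += n_threads) {
                set.insert(outs[i]);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
}
```

The work is spread over several threads, but every insert still waits on the same mutex, so a version like this being slower than the serial loop is unsurprising.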
The next step after the parallelization will be switching to an atomics-based concurrent hashtable or trie for UTXO storage. This will eliminate the locks, and should allow performance to far exceed what is possible serially.
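As a rough illustration of the atomics-based idea, here is a minimal sketch of an insert-only, fixed-capacity set that claims slots with compare-and-swap instead of taking a lock. It is an assumption-laden toy, not a proposed design: `AtomicIdSet` and `race_inserts` are hypothetical names, ids are plain 64-bit integers with 0 reserved as the empty marker, and there is no resizing, deletion, or value storage.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Insert-only open-addressing set with linear probing.
// Assumption: id 0 is reserved to mean "empty slot".
class AtomicIdSet {
public:
    explicit AtomicIdSet(std::size_t capacity) : slots_(capacity) {
        assert(capacity > 0);
        for (auto& s : slots_) {
            s.store(kEmpty);
        }
    }

    // Returns true if this call inserted the id; false if the id was
    // already present or the table is full.
    bool insert(std::uint64_t id) {
        assert(id != kEmpty);
        const std::size_t n = slots_.size();
        const std::size_t start = static_cast<std::size_t>(id % n);
        for (std::size_t probe = 0; probe < n; ++probe) {
            std::atomic<std::uint64_t>& slot = slots_[(start + probe) % n];
            std::uint64_t expected = kEmpty;
            // Try to claim an empty slot; on failure, expected holds the occupant.
            if (slot.compare_exchange_strong(expected, id)) {
                return true;
            }
            if (expected == id) {
                return false; // another thread (or an earlier call) inserted it
            }
        }
        return false; // table full
    }

    bool contains(std::uint64_t id) const {
        const std::size_t n = slots_.size();
        const std::size_t start = static_cast<std::size_t>(id % n);
        for (std::size_t probe = 0; probe < n; ++probe) {
            const std::uint64_t seen = slots_[(start + probe) % n].load();
            if (seen == id) return true;
            if (seen == kEmpty) return false;
        }
        return false;
    }

private:
    static constexpr std::uint64_t kEmpty = 0;
    std::vector<std::atomic<std::uint64_t>> slots_;
};

// Every worker tries to insert every id in [1, n_ids]; returns how many
// insert calls succeeded across all workers.
std::size_t race_inserts(AtomicIdSet& set, std::uint64_t n_ids, unsigned n_threads) {
    std::atomic<std::size_t> wins{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < n_threads; ++t) {
        workers.emplace_back([&set, &wins, n_ids] {
            for (std::uint64_t id = 1; id <= n_ids; ++id) {
                if (set.insert(id)) ++wins;
            }
        });
    }
    for (auto& w : workers) w.join();
    return wins.load();
}
```

Because slots are written once and never cleared, exactly one thread wins each id even when all threads race on the same ids. A real UTXO store would also need removal (spending) and growth, which this sketch leaves out.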
Incidentally, I also found a modest and simple optimization to the serial method last night. Pulling the coinbase transaction processing out of the main loops removes an unnecessary if statement from the inner loop and improves performance by about 9%.
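The shape of that change can be sketched as follows. This is a simplified illustration under assumptions, not the actual patch: `Tx`, `Tally`, `process_with_branch`, and `process_hoisted` are hypothetical names, the real per-transaction work is reduced to counting, and the coinbase is assumed to be the first transaction in the block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified transaction: only what the sketch needs.
struct Tx {
    bool is_coinbase;
    std::vector<std::uint64_t> spent_ids;   // ids of outputs this tx spends
    std::vector<std::uint64_t> created_ids; // ids of outputs this tx creates
};

struct Tally {
    std::size_t inputs_spent = 0;
    std::size_t outputs_added = 0;
};

// Before: the coinbase check runs on every iteration of the loop.
Tally process_with_branch(const std::vector<Tx>& block) {
    Tally t;
    for (const Tx& tx : block) {
        if (!tx.is_coinbase) {
            t.inputs_spent += tx.spent_ids.size();
        }
        t.outputs_added += tx.created_ids.size();
    }
    return t;
}

// After: the coinbase is handled once, ahead of the loop, so the loop
// body no longer needs the branch.
Tally process_hoisted(const std::vector<Tx>& block) {
    assert(!block.empty() && block[0].is_coinbase);
    Tally t;
    t.outputs_added += block[0].created_ids.size(); // coinbase spends nothing
    for (std::size_t i = 1; i < block.size(); ++i) {
        t.inputs_spent += block[i].spent_ids.size();
        t.outputs_added += block[i].created_ids.size();
    }
    return t;
}
```

Both functions produce the same tallies for a well-formed block; any speedup from the second form depends on the real per-transaction work and is not demonstrated by this toy.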