[bitcoin/bitcoin] Add POWER8 ASM for 4-way SHA256 (#13203)

The comparable change made a big impact on x86 many months ago, but who knows now. Synchronization primitives are more expensive on POWER, so it also wouldn't be surprising if our glib use of atomics all over the place were shifting the performance bottlenecks on POWER relative to x86.