I found a way to squeeze in two (!) integer square root calculations in addition to the division, without any rounding problems and without any additional slowdown! This is starting to look very interesting; I'll test it on GPU tomorrow. Both division and square roots are good for fighting ASICs because in hardware they are implemented either as iterative logic (slow, many clock cycles) or as pipelined logic (fast, but it occupies a lot of chip area).
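To illustrate what "iterative logic" means for a square root, here is a minimal sketch of the textbook bit-by-bit integer square root, which produces one result bit per step. This is not the code from the post, just an illustration: the name `isqrt_bitwise` and the `bits` parameter are made up for this example. It uses only integer operations, so there is no floating-point rounding to worry about.

```python
import math


def isqrt_bitwise(n: int, bits: int = 64) -> int:
    """Floor of sqrt(n) for an unsigned `bits`-bit integer.

    Hypothetical helper for illustration only. Each loop iteration
    settles one bit of the result, which is roughly how an iterative
    hardware unit spends one step per result bit.
    """
    if bits <= 0 or bits % 2:
        raise ValueError("bits must be a positive even number")
    if n < 0 or n >= 1 << bits:
        raise ValueError("n does not fit in the given width")

    root = 0
    # Highest power of four representable in `bits` bits.
    bit = 1 << (bits - 2)
    # Skip the powers of four that are larger than n.
    while bit > n:
        bit >>= 2

    while bit:
        if n >= root + bit:
            # This result bit is 1: subtract the trial value.
            n -= root + bit
            root = (root >> 1) + bit
        else:
            # This result bit is 0.
            root >>= 1
        bit >>= 2
    return root


if __name__ == "__main__":
    # Compare against the standard library on a few values.
    for value in (0, 1, 15, 16, 17, 10**12, 2**64 - 1):
        assert isqrt_bitwise(value) == math.isqrt(value)
    print("all checks passed")
```

A 64-bit input needs up to 32 such steps when done iteratively. A pipelined design would unroll those steps into separate stages, which is where the extra chip area goes.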
This post was last modified on June 15, 2018, 12:05 am