Vector Class Discussion

Intrinsics vs Assembly
Author: Agner Date: 2013-04-06 02:27
It's a matter of how the compiler optimizes. The compiler will often reuse a value it has already loaded from memory in order to minimize cache contention. But you are right, there are situations where it is better to reload the same value in order to reduce the load on other execution ports or to make dependency chains shorter. In many cases, the best compilers (GNU, Intel) optimize better than a decent assembly programmer does. But there are also many cases where compilers do incredibly silly things. If you have a very critical hotspot and the compiler is not optimizing it well enough, then the only alternative is to use assembly.

The tradeoff between copying a previously loaded value and loading it again disappears with the AVX instruction set, where you have non-destructive three-operand instructions.
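
To make the copy-versus-reload point concrete, here is a minimal sketch using SSE intrinsics. The function name prod_minus_sum is made up for this illustration, and the instruction sequences in the comments are only what a compiler might plausibly emit; the actual output depends on the compiler, version, and options. Both inputs are needed again after the multiply, so with the two-operand SSE encoding one of them has to be copied or reloaded first.

```cpp
#include <xmmintrin.h>  // SSE intrinsics, x86/x86-64 only

// Hypothetical example: out = x*y - (x + y) for four floats.
// Both x and y are used twice, so neither may be overwritten by the multiply.
void prod_minus_sum(const float* px, const float* py, float* out) {
    __m128 x = _mm_loadu_ps(px);       // load x once
    __m128 y = _mm_loadu_ps(py);       // load y once
    __m128 prod = _mm_mul_ps(x, y);    // x*y, x and y still needed below
    __m128 sum  = _mm_add_ps(x, y);    // x+y
    _mm_storeu_ps(out, _mm_sub_ps(prod, sum));
}

// Possible code without AVX (destructive two-operand form):
//   movups xmm0, [px]
//   movups xmm1, [py]
//   movaps xmm2, xmm0        ; extra copy, or alternatively: movups xmm2, [px]
//   mulps  xmm2, xmm1        ; xmm2 = x*y
//   addps  xmm0, xmm1        ; xmm0 = x+y
//   subps  xmm2, xmm0
//
// Possible code with AVX enabled (non-destructive three-operand form):
//   vmovups xmm0, [px]
//   vmovups xmm1, [py]
//   vmulps  xmm2, xmm0, xmm1 ; xmm0 and xmm1 are left intact
//   vaddps  xmm0, xmm0, xmm1
//   vsubps  xmm0, xmm2, xmm0
```

The source is identical in both cases. Whether the compiler picks the register copy or the reload in the first version is exactly the kind of decision you cannot control from intrinsics, and compiling with AVX enabled (for example -mavx with GNU compilers) makes the question go away.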

 
Thread: Intrinsics vs Assembly - Nathan Kurz - 2013-04-05
Last reply: Intrinsics vs Assembly - Agner - 2013-04-06