Search found 2 matches

by Elhardt
2021-11-01, 4:18:56
Forum: Agner's CPU blog
Topic: Intel Floating Point Executing 3 to 4 Times Faster Than it Should. MAKES NO SENSE
Replies: 4
Views: 74425

Re: Intel Floating Point Executing 3 to 4 Times Faster Than it Should. MAKES NO SENSE

Quite the opposite: adding the movss breaks the dependency chain on xmm0, allowing multiple loop iterations to execute in parallel. The first example, on the other hand, carries a dependency on xmm0 from one loop iteration to the next, which prevents iterations from overlapping. I've been exper...

by Elhardt
2021-10-04, 5:09:12
Forum: Agner's CPU blog
Topic: Intel Floating Point Executing 3 to 4 Times Faster Than it Should. MAKES NO SENSE
Replies: 4
Views: 74425

Intel Floating Point Executing 3 to 4 Times Faster Than it Should. MAKES NO SENSE

Hello Agner. I would sure like an explanation (if you have one) for the insanely fast floating-point speeds I'm getting on Ivy Bridge and Haswell processors, which seem to defy the laws of physics. Last year I was benchmarking code I was writing to calculate gangs of sine waves for additive synthesis ...
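
The reply in the first result attributes the speed-up to breaking a dependency chain on xmm0 so that loop iterations can overlap. The posts' actual assembly is not shown in these previews, so the following is only a rough C analogy of that idea, not the code under discussion. The function names `sum_chained` and `sum_split` are made up for illustration. It assumes a compiler that does not reassociate float additions by default (no `-ffast-math`), so the single-accumulator loop really is one serial chain.

```c
#include <stddef.h>

/* Hypothetical example: one accumulator, so every add must wait for the
   previous add to finish (a loop-carried dependency chain). */
float sum_chained(const float *x, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += x[i];
    return acc;
}

/* Hypothetical example: four accumulators form four independent chains,
   so an out-of-order core is free to overlap adds from different chains. */
float sum_split(const float *x, size_t n) {
    float a0 = 0.0f, a1 = 0.0f, a2 = 0.0f, a3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += x[i];
        a1 += x[i + 1];
        a2 += x[i + 2];
        a3 += x[i + 3];
    }
    for (; i < n; i++)      /* leftover elements when n is not a multiple of 4 */
        a0 += x[i];
    return (a0 + a1) + (a2 + a3);
}
```

Both functions add the same values, but in a different order, so results can differ slightly in the last bits for general float input. Whether the split version is actually faster depends on the CPU and on what the compiler emits, so it needs to be measured rather than assumed.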