AVX or SSE code performance question

Posted: 2022-04-30, 8:18:00
by enuinc
Hi

My work involves a lot of numerical computation, so I regularly use SSE or AVX to speed up numerically intensive code. So far the results have been quite satisfactory, until today.

Basically, I used SSE to rewrite a simple vector-vector multiplication. When I measured the speedup, I got approximately a 3x reduction in run time, as expected.
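
For readers following the thread, here is a minimal sketch of what such a rewrite might look like. The post does not show the actual code, so this assumes "vector-vector multiplication" means element-wise multiplication of single-precision arrays. The function names mul_scalar and mul_sse are hypothetical and not taken from the poster's code.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics: _mm_loadu_ps, _mm_mul_ps, _mm_storeu_ps */

/* Scalar reference (hypothetical name): out[i] = a[i] * b[i]. */
void mul_scalar(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] * b[i];
}

/* SSE version (hypothetical name): processes four floats per iteration.
 * Unaligned loads and stores are used, so the arrays need no special alignment. */
void mul_sse(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_mul_ps(va, vb));
    }
    /* Handle the remaining 0 to 3 elements with scalar code. */
    for (; i < n; ++i)
        out[i] = a[i] * b[i];
}
```

A loop like this does one multiply per two loads and one store, so for arrays larger than the caches its run time may depend more on memory traffic than on arithmetic.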

However, once I checked in the code, my colleague did an experiment by running the compiled binary 56 times (with identical input) on a Dell machine with dual Xeon Gold 6238R CPUs (so the machine has 56 physical cores). I was surprised to see that the code now runs about 5-6 times slower than when I run the binary just once.

I suspect this is due to cache misses, but I don't know how to confirm it. And probably more importantly, is there a way to mitigate the issue if that is indeed the root cause?

Thanks
en

Re: AVX or SSE code performance question

Posted: 2022-04-30, 12:02:34
by agner
I suggest that you post your question on https://stackoverflow.com with details about your code. Your results may in fact reflect cache effects, compiler optimization, or thread effects.