Agner's CPU blog


AVX512 performance
Author: Armando Alaminos Bouza  Date: 2018-03-29 14:26
A few days ago I was able to verify that ICL 2018 generates better AVX512 code than the 2017 version. For a long radiotherapy dose distribution modeling project (*), the code generated by ICL 2017 takes 286 seconds, while the code generated by the 2018 version takes 276 seconds, about 3.5% faster. Tested on an i7 7820X.
Important: this improvement is only achieved with the new option: /Qopt-zmm-usage:high
I have not compared code generation for SSEx or AVX2 yet.
(*) Using the "Collapsed Cone Superposition" method.
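
For readers who want to try the /Qopt-zmm-usage:high option mentioned above, here is a minimal sketch of the kind of loop it can affect. The function scale_add and the file name kernel.c are hypothetical and have nothing to do with the dose calculation code. The command lines in the comment assume the Intel compiler targeting AVX512 with /QxCORE-AVX512 (Windows) or -xCORE-AVX512 (Linux); whether 512-bit vectors actually help depends on the code and the CPU, so measure both settings.

```c
#include <stddef.h>

/*
 * Hypothetical kernel (not from the original project): y[i] += a * x[i].
 * A simple loop with independent iterations is a typical candidate for
 * auto-vectorization.
 *
 * Assumed build commands for comparing the two settings:
 *   Windows: icl /O3 /QxCORE-AVX512 /Qopt-zmm-usage:high kernel.c
 *   Linux:   icc -O3 -xCORE-AVX512 -qopt-zmm-usage=high kernel.c
 * Replace "high" with "low" to let the compiler prefer 256-bit vectors.
 */
void scale_add(float *y, const float *x, float a, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

Timing the same build with both settings on the real workload, as done above with the 286 s and 276 s runs, is the only reliable way to see whether the option pays off.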
 
thread AVX512 performance - Armando Alaminos Bouza - 2018-03-29
last reply AVX512 performance new - Agner - 2018-04-04