| Hi Agner, I noticed that the Intel Intrinsics Guide at https://software.intel.com/sites/landingpage/IntrinsicsGuide/ shows "VPTEST ymm, ymm" as having a latency of 4 cycles on Haswell, up from 2 on Sandy Bridge and Ivy Bridge. It also lists "PTEST xmm, xmm" as having a latency of 2 on all platforms. Your current guide shows a latency of 1 for "PTEST x,x" on Sandy Bridge, and 2 for "PTEST v,v" on Haswell. Are you confident in these measurements, or is it possible that the Intel guide is correct here? Or is this just a terminology difference between PTEST and VPTEST? Thanks! |