Agner's CPU blog


Micro-fusion limited to 1-reg addressing modes
Author: Peter Cordes  Date: 2016-05-24 07:15
Agner wrote:
The results show that instructions with three input dependencies are fusing alright and use only a single entry in the micro-operation cache

I found official confirmation in Intel's optimization manual that we're both right; we were just measuring different things (see Section 2.3.2.4, "Micro-op Queue and the Loop Stream Detector (LSD)"). SnB-family CPUs still micro-fuse such instructions in the decoders and the uop cache, but "un-laminate" micro-fused uops that use an indexed addressing mode before the issue/rename stage. So the uop format used in the ROB must be different from the format in the uop cache. The unfused-domain scheduler (RS) must still handle uops with indexed addressing modes, though, because pure loads with complex addressing modes are still a single p23 uop.

Tacit Murky's earlier post says that the un-lamination happens as uops are written to the IDQ, so the loop-buffer size is measured in un-laminated uops.

So for the purposes of pipeline width and tight loops, instructions with indexed addressing modes effectively don't micro-fuse: the 4-wide issue limit applies to the uop count after un-lamination.
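To make the two ways of counting concrete, here is a toy Python model of that rule. It is only a sketch under my own assumptions, not a measurement: the loop is a made-up `c[i] = a[i] + b[i]` example, every instruction is assumed to be one uop in the uop cache (with `cmp`/`jne` treated as a single macro-fused uop), and the names `Instr`, `uop_cache_uops`, `issued_uops` and `issue_bound` are hypothetical. It ignores every other bottleneck (ports, latency, loop-buffer effects).

```python
from dataclasses import dataclass

ISSUE_WIDTH = 4  # fused-domain uops issued per cycle, as stated in the post


@dataclass(frozen=True)
class Instr:
    text: str          # assembly text, for display only
    micro_fused: bool  # load+ALU or store held as one uop in the uop cache
    indexed: bool      # memory operand uses an index register


def uop_cache_uops(loop):
    """Uops as seen by the decoders / uop cache (assumed: one per instruction)."""
    return len(loop)


def issued_uops(loop):
    """Uops after un-lamination: a micro-fused uop with an index splits in two."""
    return sum(2 if (i.micro_fused and i.indexed) else 1 for i in loop)


def issue_bound(loop):
    """Front-end lower bound on cycles per iteration from issue width alone."""
    return issued_uops(loop) / ISSUE_WIDTH


# Hypothetical loop body for c[i] = a[i] + b[i], all with indexed addressing.
loop = [
    Instr("mov  eax, [rsi + rdi*4]", micro_fused=False, indexed=True),  # pure load
    Instr("add  eax, [rdx + rdi*4]", micro_fused=True,  indexed=True),  # load + ALU
    Instr("mov  [r8 + rdi*4], eax",  micro_fused=True,  indexed=True),  # store
    Instr("add  rdi, 1",             micro_fused=False, indexed=False),
    Instr("cmp  rdi, rcx / jne",     micro_fused=False, indexed=False),
]

print(f"uop cache: {uop_cache_uops(loop)} uops, "
      f"issued: {issued_uops(loop)} uops, "
      f"issue bound: {issue_bound(loop)} cycles/iter")
```

In this model the pure load stays a single uop even though it is indexed, while the load+ALU and the store each count twice at issue, which is why a uop-cache measurement and a throughput measurement of the same loop can disagree.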

For the record, un-laminate is not a normal English word, but delaminate is. I want to put quotes around it every time I type it. >.<

BTW, I updated my answer on Stack Overflow with this info.

 
replythread Micro-fusion limited to 1-reg addressing modes new - Peter Cordes - 2015-07-11
replythread Micro-fusion limited to 1-reg addressing modes new - Agner - 2015-07-12
last reply Micro-fusion limited to 1-reg addressing modes new - Tacit Murky - 2015-11-15
last replythread Micro-fusion limited to 1-reg addressing modes new - Agner - 2015-12-01
reply Micro-fusion limited to 1-reg addressing modes new - Peter Cordes - 2015-12-15
last reply Micro-fusion limited to 1-reg addressing modes - Peter Cordes - 2016-05-24