Agner's CPU blog

Micro-fusion limited to 1-reg addressing modes

Author: Peter Cordes

Date: 2016-05-24 07:15

Agner wrote:

The results show that instructions with three input dependencies are fusing alright and use only a single entry in the micro-operation cache

I found official confirmation in Intel's optimization manual that we're both right (see Section 2.3.2.4: "Micro-op Queue and the Loop Stream Detector (LSD)"); we were just measuring different things. SnB-family still micro-fuses such instructions in the decoders and uop-cache, but "un-laminates" uops with an indexed addressing mode before the issue/rename stage. So the uop format used in the ROB must be different from the format in the uop cache. The unfused-domain scheduler (RS) must still handle uops with indexed addressing modes, because pure loads with complex addressing modes are still a single p23 uop.

Tacit Murky's earlier post says that the un-lamination happens as uops are written to the IDQ, so the loop-buffer size is measured in un-laminated uops.

For the purposes of pipeline width and tight loops, indexed addressing modes effectively don't micro-fuse: the 4-wide issue limit is counted after un-lamination.
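
To make the counting concrete, here is a toy model in Python. It is only a sketch of the behaviour described above for SnB-family, not a simulator: the names (Insn, uop_cache_entries, issue_slots, min_issue_cycles) are made up for illustration, the two instruction sequences are not meant to be equivalent programs, and it ignores execution ports, latency, branches and macro-fusion. Later microarchitectures may behave differently and are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insn:
    text: str          # assembly text, for display only
    micro_fused: bool  # load+op: two unfused uops held as one fused uop
    indexed: bool      # memory operand uses an index register, e.g. [rsi + rdx*4]


def uop_cache_entries(insn: Insn) -> int:
    # Decoders and uop cache: micro-fusion holds regardless of addressing mode,
    # so every instruction in this toy model occupies one entry.
    return 1


def issue_slots(insn: Insn) -> int:
    # Before issue/rename, a micro-fused uop with an indexed addressing mode
    # is un-laminated into its two parts. A pure load stays a single uop.
    return 2 if (insn.micro_fused and insn.indexed) else 1


def min_issue_cycles(insns, width: int = 4) -> float:
    # Lower bound from issue width alone (assumed 4-wide, as in the post).
    return sum(issue_slots(i) for i in insns) / width


indexed = [
    Insn("add eax, [rsi + rdx*4]", micro_fused=True, indexed=True),
    Insn("add ebx, [rdi + rdx*4]", micro_fused=True, indexed=True),
    Insn("mov ecx, [r8 + rdx*4]", micro_fused=False, indexed=True),  # pure load
    Insn("add rdx, 1", micro_fused=False, indexed=False),
]

one_reg = [
    Insn("add eax, [rsi]", micro_fused=True, indexed=False),
    Insn("add ebx, [rdi]", micro_fused=True, indexed=False),
    Insn("mov ecx, [r8]", micro_fused=False, indexed=False),  # pure load
    Insn("add rsi, 4", micro_fused=False, indexed=False),
]

for name, seq in (("indexed", indexed), ("one-reg", one_reg)):
    print(name,
          sum(map(uop_cache_entries, seq)),
          sum(map(issue_slots, seq)),
          min_issue_cycles(seq))
```

In this model both sequences take 4 uop-cache entries, but the indexed one needs 6 issue slots (at least 1.5 cycles at 4-wide) while the one-register version needs 4 (at least 1 cycle). That is the "measuring different things" point: counting uop-cache entries shows fusion, counting issue slots does not.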

For the record, un-laminate is not a normal English word, but delaminate is. I want to put quotes around it every time I type it. >.<

BTW, I updated my answer on StackOverflow with this info.

Micro-fusion limited to 1-reg addressing modes - Peter Cordes - 2015-07-11

Micro-fusion limited to 1-reg addressing modes - Agner - 2015-07-12

Micro-fusion limited to 1-reg addressing modes - Tacit Murky - 2015-11-15

Micro-fusion limited to 1-reg addressing modes - Agner - 2015-12-01

Micro-fusion limited to 1-reg addressing modes - Peter Cordes - 2015-12-15

Micro-fusion limited to 1-reg addressing modes - Peter Cordes - 2016-05-24
