Agner's CPU blog

Test results for AMD Bulldozer processor
Author: Agner Date: 2012-03-17 12:41
avk wrote:
What about the other decoding schemes: 1-1-2, 1-2-1? Can BD work with them too?
I don't think so. Haven't tried.
BTW, what do you think: is there any chance that AMD will soon implement register renaming for YMM? If so, will it help to implement the 2-2 scheme? Or should we wait for the 128 -> 256 bit widening sometime in a 22/20 nm derivative of BD?
All YMM instructions generate two mops, regardless of renaming. A sequence of YMM instructions cannot decode as 2-2, so it is less efficient than the corresponding sequence of XMM instructions. They will probably fix this somehow in a later version when YMM instructions become more common. It would be much cheaper to improve the decoders than to make a full 256-bit data bus and execution unit.

Massimo wrote:

Do you mean the first decoder can actually fetch either one 2-MOP instruction or two 1-MOP instructions and decode them?
I guess it works like this:
There are 4 parallel decode lines. When the decoder finds a double instruction in the first of the 4 lines and single instructions in the next two lines, it generates 2-1-1. If it finds a double instruction in any of the other lines, it delays that instruction to the next clock cycle and puts it in line 1.
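The guess above can be written as a small simulation. This is only a sketch of the guessed behavior, not a confirmed description of the hardware. The function name `decode_groups` and the representation of each instruction as its mop count (1 or 2) are made up for illustration, and microcoded instructions with more than two mops are left out.

```python
def decode_groups(mops):
    """Group instructions into per-clock decode groups under the guessed rule.

    `mops` is a list of mop counts per instruction, each 1 (single) or 2 (double).
    Assumption: a double instruction is accepted only in line 1, so the only
    patterns per clock are 1-1-1-1 and 2-1-1 (or shorter prefixes of these).
    """
    if any(m not in (1, 2) for m in mops):
        raise ValueError("this sketch only models 1-mop and 2-mop instructions")

    groups = []
    i = 0
    while i < len(mops):
        # The first instruction always goes in line 1, single or double.
        group = [mops[i]]
        # A double in line 1 leaves room for two singles; otherwise four lines.
        limit = 3 if mops[i] == 2 else 4
        i += 1
        # Fill the remaining lines with singles; a double ends the group
        # and is delayed to line 1 of the next clock cycle.
        while i < len(mops) and len(group) < limit and mops[i] == 1:
            group.append(mops[i])
            i += 1
        groups.append(group)
    return groups


if __name__ == "__main__":
    for name, seq in [("XMM-like", [1, 1, 1, 1]),
                      ("mixed", [2, 1, 1, 1]),
                      ("YMM-like", [2, 2, 2, 2])]:
        groups = decode_groups(seq)
        print(name, groups, "clocks:", len(groups))
```

Under this model, four single-mop instructions fit in one clock, while four double instructions take four clocks because each one has to wait for line 1. That matches the statement that a sequence of YMM instructions cannot decode as 2-2.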
 