avk wrote:
What about the other decoding schemes: 1-1-2, 1-2-1?
Can BD work with them too?
I don't think so. Haven't tried.
BTW, what do you think: is there any chance that AMD
will soon implement register renaming for YMM? If
yes, will it help to implement the 2-2 scheme? Or
should we wait for the 128->256 bit broadening
sometime in a 22/20 nm BD derivative?
All YMM instructions generate two mops, regardless of renaming. A sequence of YMM instructions cannot decode as 2-2, so it is less efficient than the corresponding sequence of XMM instructions. They will probably fix this somehow in a later version when YMM instructions become more common. It would be much cheaper to improve the decoders than to make a full 256-bit data bus and execution unit.
Massimo wrote:
Do you mean the
first decoder can actually fetch one 2-MOP or two 1-MOP
instructions and decode them?
I guess it works like this:
There are 4 parallel decode lines.
When it finds a double instruction in the first of the 4 lines, and single instructions in the next two lines, it will generate 2-1-1. If it finds a double instruction in any of the other lines, it will delay it to the next clock cycle and put it in line 1.
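To make the guess above concrete, here is a small Python sketch of that decode model. It is only an illustration of the guess, not a description of the actual hardware: the function name decode_cycles and the cap of 4 mops per clock cycle are my own assumptions (the cap is what turns a double followed by singles into 2-1-1 rather than 2-1-1-1).

```python
def decode_cycles(mops_per_instr):
    """Group an instruction stream into decode cycles under the guessed model.

    mops_per_instr: list with 1 for a single instruction, 2 for a double.
    Returns one list per clock cycle, e.g. [[2, 1, 1], [1]].
    """
    LINES = 4      # parallel decode lines, as described above
    MAX_MOPS = 4   # ASSUMPTION: at most 4 mops leave the decoders per cycle

    for m in mops_per_instr:
        if m not in (1, 2):
            raise ValueError("only single (1) and double (2) instructions are modelled")

    groups = []
    i = 0
    n = len(mops_per_instr)
    while i < n:
        group = []
        total = 0
        while i < n and len(group) < LINES:
            m = mops_per_instr[i]
            if m == 2 and group:
                # A double in any line other than line 1 is delayed
                # to line 1 of the next clock cycle.
                break
            if total + m > MAX_MOPS:
                break
            group.append(m)
            total += m
            i += 1
        groups.append(group)
    return groups


if __name__ == "__main__":
    xmm = [1, 1, 1, 1, 1, 1, 1, 1]   # eight single instructions
    ymm = [2, 2, 2, 2]               # the same work as four double instructions
    print(len(decode_cycles(xmm)), "cycles for XMM:", decode_cycles(xmm))
    print(len(decode_cycles(ymm)), "cycles for YMM:", decode_cycles(ymm))
```

Under these assumptions, eight single XMM instructions decode in two cycles as 1-1-1-1, while the four double YMM instructions that do the same work take four cycles, because each double has to wait for line 1. That is the inefficiency a 2-2 decode pattern would remove.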