Agner's CPU blog


Store forwarding stalls of Piledriver
Author: A-11  Date: 2014-09-07 10:11
I found a blog post that says store forwarding on Piledriver is improved over Bulldozer:
blog.stuffedcow.net/2014/01/x86-memory-disambiguation/
The example on page 186 of "microarchitecture.pdf" stores 32 bits and then loads the upper 16 bits of those 32 bits. The author of the blog post instead stores 64 bits and then loads the upper 32 bits of those 64 bits.
He reports that loading the upper 32 bits causes no stall.
I reproduced this on an FX-8350 with the code below (GNU as).

.L5:
movq %rbx, b(%rip) #store 64 bits to b
movl b+4(%rip), %eax #load the upper 32 bits of b
addq $1, a(%rip) #increase counter
jmp .L5
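
For anyone who wants to try the same access pattern from C, here is a minimal sketch of the two cases discussed above: the 64-bit store followed by a load of the upper 32 bits, and the 32-bit store followed by a load of the upper 16 bits. It assumes a little-endian target such as x86-64 and a compiler that emits each volatile access as a separate store or load instruction. The names buf, store_then_load_upper32, store_then_load_upper16, and self_check are hypothetical and not from the post. The sketch only checks the values returned; it does not measure whether a stall occurs, which needs a cycle counter or performance counters around a loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical buffer standing in for the symbol b in the assembly above.
   A union is used so the narrower loads are valid C type punning. */
static volatile union {
    uint64_t q;     /* 64-bit view */
    uint32_t d[2];  /* 32-bit halves; d[1] is the upper half on little-endian */
    uint16_t w[4];  /* 16-bit quarters; w[1] is the upper half of d[0] */
} buf;

/* Store 64 bits, then load the upper 32 bits (byte offset 4). */
uint32_t store_then_load_upper32(uint64_t value)
{
    buf.q = value;      /* corresponds to: movq %rbx, b(%rip) */
    return buf.d[1];    /* corresponds to: movl b+4(%rip), %eax */
}

/* Store 32 bits, then load the upper 16 bits (byte offset 2). */
uint16_t store_then_load_upper16(uint32_t value)
{
    buf.d[0] = value;   /* 32-bit store */
    return buf.w[1];    /* 16-bit load from the upper half of that store */
}

/* Usage example: verifies the loaded values only, not the timing. */
void self_check(void)
{
    assert(store_then_load_upper32(0x1122334455667788ULL) == 0x11223344u);
    assert(store_then_load_upper16(0xAABBCCDDu) == 0xAABBu);
}
```

To compare timings, each function could be called in a long loop and the cycles per iteration measured separately for the two cases.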

 
last reply Store forwarding stalls of piledriver - A-11 - 2014-09-07