Agner's CPU blog


Haswell upper128 power gating
Author: John D. McCalpin  Date: 2016-01-29 13:51
The IDF Skylake presentation seems to be saying something quite different from powering down the upper 128-bit lanes. The slide says the AVX2 infrastructure is powered down when not in use -- it says nothing about lanes or about 128 bits -- and the presenter was pretty clear, saying that the whole AVX2 "area" was powered off. This does lead to some problems of interpretation, since it is not clear whether this means only the AVX2 extensions (and not AVX v1, which is also 256 bits wide) or whether the processor keeps (at least) one 64-bit FP pipeline powered up. One can imagine that the number of applications that use no AVX2 instructions is quite large, and the number that use no 256-bit registers is also quite large, but the number that use no floating-point at all is not nearly as large. Of course no hint is provided about the cost of the transition.

So it looks like Sandy Bridge, Haswell, and Skylake (client) all turn off the upper 128 bits of the SIMD pipelines, but only Haswell pays the ~10 microsecond stall when the upper lanes are turned on. It may not be a coincidence that of these processors, only Haswell uses in-package voltage regulators. One might speculate that the smaller in-package voltage regulators are unable to hold the voltage steady under large load increases, so powering up the upper 128-bit SIMD lanes requires a stall while the voltage recovers. The 10 microsecond stall is similar in magnitude to the stalls on p-state changes on earlier processors. I don't think that I have seen good measurements of the overhead of p-state changes in Xeon E5 v3 processors.
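
To put the ~10 microsecond figure in perspective, here is a rough back-of-the-envelope conversion into core clock cycles. The stall length is the approximate value discussed in this thread; the clock frequencies are just assumed example values, not measurements of any particular part, and the helper name `stall_cycles` is made up for this sketch.

```python
def stall_cycles(stall_us: float, core_ghz: float) -> float:
    """Core clock cycles that elapse during a stall of `stall_us` microseconds."""
    # 1 GHz = 1000 cycles per microsecond
    return stall_us * core_ghz * 1e3


if __name__ == "__main__":
    STALL_US = 10.0  # approximate stall length quoted in the thread
    # Assumed example core frequencies (illustrative only).
    for ghz in (2.0, 3.0, 3.5):
        cycles = stall_cycles(STALL_US, ghz)
        print(f"{ghz:.1f} GHz: roughly {cycles:,.0f} cycles per {STALL_US:.0f} us stall")
```

So under these assumptions a single stall costs on the order of tens of thousands of cycles, which is why it is easy to see in a carefully timed loop but easy to miss in a long-running benchmark.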

Of course we know nothing about the nature of the power-saving modes in either Sandy Bridge or Haswell. For example, one might speculate that turning off the clocks to the upper 128-bit SIMD lanes (but leaving the power on) would produce less power saving, but also less voltage drop when the clocks are re-enabled.

There are still some anomalies. Re-reading Agner's comments leaves me with the impression that he has not seen the ~10 microsecond stall on any processors tested. Is this correct? If so, which Haswell models were tested?
On 2015-12-24, Robert noted that he sees the ~10 microsecond stall on Core i7-4770K and Core i7-4700MQ (both Haswell "client" parts), but not on the Core i7-5820K (a "Haswell E" part -- basically a Haswell EP server part in a client configuration). Although experimental error is always a possibility, it is conceivable that some products (perhaps specifically those without all cores enabled) might not suffer enough voltage drop to require this stall.

I imagine a 10 microsecond stall could be very upsetting to some people working in real-time signal processing, so it would be nice to know which processors show this behavior and which do not. One would guess that Skylake will also experience a large stall when it needs to enable the AVX2 "area", but it is not clear how Intel is managing this transition. Looking forward, one would imagine that the power implications of enabling/disabling the 512-bit SIMD units in AVX-512 could lead to even larger disruptions.
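
As a rough illustration of why a stall of this size could matter for real-time signal processing, the sketch below compares a 10 microsecond stall against the time available per processing buffer. The 48 kHz sample rate and the buffer sizes are assumed example values, and `stall_fraction_of_buffer` is a hypothetical helper written only for this comparison.

```python
def stall_fraction_of_buffer(stall_us: float, sample_rate_hz: float,
                             buffer_samples: int) -> float:
    """Fraction of one buffer period consumed by a single stall."""
    buffer_us = buffer_samples / sample_rate_hz * 1e6  # buffer duration in microseconds
    return stall_us / buffer_us


if __name__ == "__main__":
    STALL_US = 10.0          # approximate stall length discussed in the thread
    SAMPLE_RATE_HZ = 48_000  # assumed example sample rate
    # Assumed example buffer sizes, from sample-by-sample up to a modest block.
    for n in (1, 32, 256):
        frac = stall_fraction_of_buffer(STALL_US, SAMPLE_RATE_HZ, n)
        print(f"{n:4d} samples per buffer: stall uses {frac:.1%} of the time budget")
```

With these assumed numbers the stall is nearly half the budget when processing sample by sample, but only a small fraction of a larger block, so the impact would depend heavily on how tight the deadline is and on how often the upper lanes get powered down and up again.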

 
thread Test results for Intel's Sandy Bridge processor new - Agner - 2011-01-30
reply Test results for Intel's Sandy Bridge processor new - PaulR - 2011-02-15
replythread AVX2 new - phis - 2011-06-23
last reply AVX2 new - Agner - 2011-06-23
replythread Test results for Intel's Sandy Bridge processor new - anon - 2013-08-01
last replythread Test results for Intel's Sandy Bridge processor new - Agner - 2013-08-06
last replythread Test results for Intel's Sandy Bridge processor new - anon - 2013-08-07
last replythread Test results for Intel's Sandy Bridge processor new - Agner - 2013-08-07
last replythread Test results for Intel's Sandy Bridge processor new - anon - 2013-08-07
last replythread Test results for Intel's Sandy Bridge processor new - Agner - 2013-08-08
last replythread Test results for Intel's Sandy Bridge processor new - anon - 2013-08-08
last replythread Test results for Intel's Sandy Bridge processor new - Agner - 2013-08-09
last replythread Test results for Intel's Sandy Bridge processor new - anon - 2013-08-09
last replythread Test results for Intel's Sandy Bridge processor new - Agner - 2013-08-10
last reply Test results for Intel's Sandy Bridge processor new - Agner - 2013-08-10
replythread Test results for Intel's Sandy Bridge processor new - John D. McCalpin - 2013-10-09
last replythread Test results for Intel's Sandy Bridge processor new - Agner - 2013-10-10
last replythread Test results for Intel's Sandy Bridge processor new - John D. McCalpin - 2013-10-11
last replythread SB's L1D banks new - Tacit Murky - 2013-11-03
last reply SB's L1D banks new - John D. McCalpin - 2013-11-07
replythread Test results for Intel's Sandy Bridge processor new - John D. McCalpin - 2015-08-18
replythread Test results for Intel's Sandy Bridge processor new - Agner - 2015-08-18
last replythread Test results for Intel's Sandy Bridge processor new - John D. McCalpin - 2015-08-24
last replythread Test results for Intel's Sandy Bridge processor new - Agner - 2015-08-25
last reply Test results for Intel's Sandy Bridge processor new - John D. McCalpin - 2015-08-25
replythread Haswell upper128 power gating new - Peter Cordes - 2015-08-28
last replythread Haswell upper128 power gating new - Agner - 2016-01-16
last replythread Haswell upper128 power gating - John D. McCalpin - 2016-01-29
last reply Haswell upper128 power gating new - Agner - 2016-01-30
last replythread Test results for Intel's Sandy Bridge processor new - Agner - 2015-12-20
last replythread Test results for Intel's Sandy Bridge processor new - John D. McCalpin - 2015-12-21
last replythread Test results for Intel's Sandy Bridge processor new - Agner - 2015-12-22
reply Test results for Intel's Sandy Bridge processor new - Robert - 2015-12-24
last replythread Test results for Intel's Sandy Bridge processor new - Just_Coder - 2015-12-25
last reply Test results for Intel's Sandy Bridge processor new - Agner - 2015-12-26
last replythread Test results for Intel's Sandy Bridge processor new - Just_Coder - 2015-08-23
last reply Test results for Intel's Sandy Bridge processor new - Agner - 2015-08-25