Agner`s CPU blog

Software optimization resources | E-mail subscription to this blog | www.agner.org

Merging with first operand
Author: Agner Date: 2016-08-19 12:00
Sylvain Collange wrote:
Rixner et al., Register Organization for Media Processing, HPCA 2000.
www.ai.mit.edu/projects/aries/papers/imagine/register.pdf
evaluates the tradeoffs using analytical models and considers various distributed and SIMD RF organizations.
Thanks for the reference

Which code is generated when you vectorize :
    for all i {
      int x = a[i];
      int y = b[i];
      if(x > y) {
        y = x + 1;
      }
      c[i] = y;
      d[i] = x;
    }
    
This is straightforward: A compare instruction to generate the mask, an addition instruction, and a masked move instruction to conditionally replace y by x+1.
(And the compiler has to assume that arithmetic exceptions might be enabled.)
If you have enabled exceptions for integer overflow and you want to avoid the hypothetical overflow exception in a not-taken branch then you have to put the mask on the addition instuction as well. In other words, you will have two masked instructions using the same mask.

This is the reason why exceptions are not good in a vector processor. In floating point code you can disable exceptions and rely on the propagation of NAN and INF values to detect numeric errors. But there is no easy way of detecting integer overflow. I have discussed this in a previous post, but there seems to be no good solution. Many compilers and programming languages simply have no easy way of detecting integer overflow, but this may be a security problem, so I thought that we must provide some method. The following methods for detecting integer overflow may be considered:

  1. Use a flag output. This gives every ALU instruction an extra output dependency, and also an extra input dependency if you want the overflow bit to be propagated through a series of calculations. The extra input and output dependencies hamper out-of-order processing.
  2. Add an extra overflow bit for each vector element in a vector register. This adds an extra complication for saving the overflow bits when a vector register is spilled to memory.
  3. Use exception traps (interrupts) to catch integer overflow. This requires a minimum of complexity to the instruction set, but it has a problem with vector code.
  4. Use the even-numbered elements in a vector register for calculation and the odd-numbered elements for propagating overflow information. This is relatively easy to implement but wasteful because only half of the ALUs are used.
Method 1 has been rejected. The other three methods are still being considered for use in ForwardCom.
 
thread Proposal for instruction set - now on Github new - Agner - 2016-06-26
replythread Proposal for instruction set - now on Github new - Joe Duarte - 2016-07-04
last replythread Proposal for instruction set - now on Github new - Agner - 2016-07-04
replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-07-06
last replythread Proposal for instruction set - now on Github new - Agner - 2016-07-06
last replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-07-07
last reply Proposal for instruction set - now on Github new - Agner - 2016-07-07
replythread Whole-function vectorization and conditionals new - Sylvain Collange - 2016-08-15
last replythread Whole-function vectorization and conditionals new - Agner - 2016-08-15
last replythread Whole-function vectorization and conditionals new - Sylvain Collange - 2016-08-15
last replythread Whole-function vectorization and conditionals new - Agner - 2016-08-15
last replythread Whole-function vectorization and conditionals new - Sylvain Collange - 2016-08-15
last replythread Whole-function vectorization and conditionals new - Agner - 2016-08-15
reply Number of input dependencies new - Agner - 2016-08-16
last replythread Whole-function vectorization and conditionals new - Sylvain Collange - 2016-08-16
last replythread Whole-function vectorization and conditionals new - Agner - 2016-08-17
last replythread Merging with first operand new - Sylvain Collange - 2016-08-18
last replythread Merging with first operand - Agner - 2016-08-19
replythread SIMD exceptions are fine with masking new - Sylvain Collange - 2016-08-19
last replythread SIMD exceptions are fine with masking new - Agner - 2016-08-20
reply SIMD exceptions are fine with masking new - Hubert Lamontagne - 2016-08-20
last reply SIMD exceptions are fine with masking new - Sylvain Collange - 2016-08-25
last reply Merging with first operand new - Hubert Lamontagne - 2016-08-19
last replythread Proposal for instruction set - now on Github new - Joe Duarte - 2016-08-17
last replythread Proposal for instruction set - now on Github new - Agner - 2016-08-18
last replythread Proposal for instruction set - now on Github new - Joe Duarte - 2016-08-31
reply Proposal for instruction set - now on Github new - Agner - 2016-08-31
last reply Proposal for instruction set - now on Github new - Jorcy Neto - 2016-09-01
replythread Proposal for instruction set - now on Github new - Yuhong Bao - 2016-07-12
last reply Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-07-12
replythread Things from MIPS (and novel things) new - Anonymous - 2016-07-28
replythread Things from MIPS (and novel things) new - Agner - 2016-07-28
last reply Things from MIPS (and novel things) new - Hubert Lamontagne - 2016-07-28
last replythread Matrix multiplication new - Agner - 2016-07-29
reply Matrix multiplication new - Hubert Lamontagne - 2016-07-29
last replythread Matrix multiplication new - John D. McCalpin - 2016-07-29
last reply Matrix multiplication new - Agner - 2016-07-29
replythread Introduction website new - Agner - 2016-08-01
last replythread Introduction website new - EricTL - 2017-07-17
last replythread Introduction website new - Agner - 2017-07-18
last replythread Introduction website new - EricTL - 2017-07-20
last reply Introduction website new - Agner - 2017-07-20
replythread Proposal for instruction set - now on Github new - Joe Duarte - 2016-08-04
last replythread Proposal for instruction set - now on Github new - Agner - 2016-08-04
last replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-08-05
replythread Proposal for instruction set - now on Github new - Agner - 2016-08-06
last replythread Proposal for instruction set - now on Github new - fanoI - 2016-08-08
last replythread Proposal for instruction set - now on Github new - Agner - 2016-08-08
last reply Proposal for instruction set - now on Github new - fanoI - 2016-08-09
last replythread Proposal for instruction set - now on Github new - Joe Duarte - 2016-08-08
last replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-08-09
last replythread Proposal for instruction set - now on Github new - Joe Duarte - 2016-08-11
last replythread Proposal for instruction set - now on Github new - Agner - 2016-08-12
last reply Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-08-12
replythread Proposal for instruction set - now on Github new - grant galitz - 2016-08-22
reply Proposal for instruction set - now on Github new - Agner - 2016-08-22
last reply Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-08-24
replythread ARM with scalable vector extensions new - Agner - 2016-08-23
replythread ARM with scalable vector extensions new - Jorcy Neto - 2016-08-23
last reply ARM with scalable vector extensions new - Hubert Lamontagne - 2016-08-26
last reply ARM with scalable vector extensions new - Jorcy Neto - 2016-12-20
replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-09-05
replythread Proposal for instruction set - now on Github new - Agner - 2016-09-05
replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-09-05
last replythread Proposal for instruction set - now on Github new - Agner - 2016-09-06
reply Proposal for instruction set - now on Github new - Bigos - 2016-09-06
last replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-09-06
last replythread Proposal for instruction set - now on Github new - Agner - 2016-09-07
last replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-09-07
last replythread Proposal for instruction set - now on Github new - Agner - 2016-09-08
last reply Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-09-08
last replythread Proposal for instruction set - now on Github new - Commenter - 2016-09-07
last reply Proposal for instruction set - now on Github new - Bigos - 2016-09-08
last replythread Paging new - Kurt Baumgardner - 2016-09-09
replythread Paging new - Agner - 2016-09-10
reply Paging new - Hubert Lamontagne - 2016-09-11
last replythread Paging new - Kurt Baumgardner - 2016-09-13
replythread Paging new - Agner - 2016-09-13
last reply Paging new - Kurt Baumgardner - 2016-09-13
last replythread Paging new - Hubert Lamontagne - 2016-09-13
last reply Paging new - Kurt Baumgardner - 2016-09-14
replythread Paging new - Hubert Lamontagne - 2016-09-11
last reply Paging new - Kurt Baumgardner - 2016-09-13
last replythread Paging new - Agner - 2016-09-14
last reply Paging new - Jorcy Neto - 2016-09-18
replythread A null register? new - csdt - 2016-09-23
last replythread A null register? new - Agner - 2016-09-24
last replythread A null register? new - Hubert Lamontagne - 2016-09-24
replythread A null register? new - csdt - 2016-09-26
last reply A null register? new - Agner - 2016-09-27
last replythread Indexed registers new - Kurt Baumgardner - 2016-09-26
last replythread Indexed registers new - Agner - 2016-09-27
replythread Indexed registers new - Kurt Baumgardner - 2016-09-27
last reply Indexed registers new - Agner - 2016-09-28
last replythread Indexed registers new - Hubert Lamontagne - 2016-09-28
last replythread Indexed registers new - Kurt Baumgardner - 2016-10-03
reply Indexed registers new - Agner - 2016-10-03
last replythread Indexed registers new - Hubert Lamontagne - 2016-10-04
last replythread Bilinear Interpolation new - Hubert Lamontagne - 2016-10-28
last replythread Bilinear Interpolation new - Agner - 2016-10-29
last replythread Bilinear Interpolation new - Hubert Lamontagne - 2016-10-29
last replythread Bilinear Interpolation new - Agner - 2016-10-30
last reply Bilinear Interpolation new - Hubert Lamontagne - 2016-10-30
replythread ForwardCom version 1.04 new - Agner - 2016-12-08
replythread ForwardCom version 1.04 new - Matthias Bentrup - 2016-12-12
last replythread ForwardCom version 1.04 new - Agner - 2016-12-12
last reply ForwardCom version 1.04 new - Matthias Bentrup - 2016-12-14
last replythread Async system calls; horizontal packing instruction new - Joe Duarte - 2016-12-14
reply Async system calls; horizontal packing instruction new - Agner - 2016-12-15
last replythread Comparison of instruction sets new - Agner - 2016-12-17
replythread Comparison of instruction sets new - Joe Duarte - 2016-12-28
reply Comparison of instruction sets new - Agner - 2016-12-29
last reply Comparison of instruction sets new - Hubert Lamontagne - 2016-12-30
last reply Comparison of instruction sets new - Hubert Lamontagne - 2017-01-05
replythread ForwardCom version 1.05 new - Agner - 2017-01-22
replythread Syscall/ISR acceleration new - Jonathan Brandmeyer - 2017-01-22
last replythread Syscall/ISR acceleration new - Agner - 2017-01-23
last replythread Syscall/ISR acceleration new - Jonathan Brandmeyer - 2017-01-25
last reply Syscall/ISR acceleration new - Agner - 2017-01-25
replythread ForwardCom version 1.05 new - Jiří Moravec - 2017-01-23
last reply ForwardCom version 1.05 new - Agner - 2017-01-24
last replythread Jump prefetch? new - csdt - 2017-01-27
last replythread Jump prefetch? new - Agner - 2017-01-27
last replythread Jump prefetch? new - csdt - 2017-01-30
last replythread Jump prefetch? new - Agner - 2017-01-30
last replythread Jump prefetch? new - csdt - 2017-01-30
replythread Jump prefetch? new - Agner - 2017-01-31
reply Jump prefetch? new - csdt - 2017-01-31
last replythread Jump prefetch? new - Hubert Lamontagne - 2017-02-01
last replythread Jump prefetch? new - Agner - 2017-02-01
last replythread Jump prefetch? new - Hubert Lamontagne - 2017-02-01
last replythread Jump prefetch? new - Agner - 2017-02-02
last reply Jump prefetch? new - Agner - 2017-02-14
last replythread Jump prefetch? new - Hubert Lamontagne - 2017-01-31
last replythread High precision arithmetic new - fanoI - 2017-03-21
last reply High precision arithmetic new - Agner - 2017-03-21
replythread Intel's Control-flow Enforcement Technology new - Joe Duarte - 2017-04-13
last reply Intel's Control-flow Enforcement Technology new - Agner - 2017-04-14
reply Proposal for instruction set - now on Github new - Agner - 2017-04-27
replythread Assembler with metaprogramming features new - Agner - 2017-07-27
last replythread Assembler with metaprogramming features new - Kai Rese - 2017-08-11
last replythread Assembler with metaprogramming features new - Agner - 2017-08-11
last replythread Assembler with metaprogramming features new - Kai Rese - 2017-08-14
last replythread Assembler with metaprogramming features new - Agner - 2017-08-14
last reply Assembler with metaprogramming features new - Kai Rese - 2017-08-15
replythread Number of register file ports in implementations new - Hubert Lamontagne - 2017-08-22
last replythread Number of register file ports in implementations new - Agner - 2017-08-23
last replythread Number of register file ports in implementations new - Hubert Lamontagne - 2017-08-27
last replythread Number of register file ports in implementations new - Agner - 2017-08-28
reply Number of register file ports in implementations new - Bigos - 2017-08-28
last reply Number of register file ports in implementations new - Hubert Lamontagne - 2017-08-28
last replythread Proposal for instruction set - now on Github new - yeengief - 2017-09-20
replythread Proposal for instruction set - now on Github new - Agner - 2017-09-20
last replythread Proposal for instruction set - now on Github new - yeengief - 2017-09-20
last replythread Proposal for instruction set - now on Github new - Agner - 2017-09-20
last replythread Proposal for instruction set - now on Github new - yeengief - 2017-09-21
last replythread Proposal for instruction set - now on Github new - Agner - 2017-09-21
last replythread Proposal for instruction set - now on Github new - yeengief - 2017-09-21
last reply Proposal for instruction set - now on Github new - Agner - 2017-09-23
replythread Proposal for instruction set - now on Github new - - - 2017-09-22
last reply Proposal for instruction set - now on Github new - Agner - 2017-09-23
last replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2017-09-25
last replythread Proposal for instruction set - now on Github new - Agner - 2017-09-26
last reply Proposal for instruction set - now on Github new - Hubert Lamontagne - 2017-09-26