Agner`s CPU blog

Software optimization resources | E-mail subscription to this blog | www.agner.org

Proposal for an ideal extensible instruction set
Author: Agner Date: 2016-02-23 02:12
Hubert Lamontagne wrote:
Add with Carry:
It's true that if you want to do lots of BIGNUM computation, then you'll definitely want a flags registers [...] Or perhaps you could use scalar integer registers as flags on SIMD operations designed for BIGNUMs.

Most CPUs today have a flags register which is updated at every ALU instruction, even if it is rarely used. So most ALU instructions have two result registers in current designs. The flags register also needs renaming. My proposal has fewer two-result instructions than most current designs because you only have a flags output when you explicitly ask for it, and you don't need flags for conditional jumps.

If you want to eliminate two-result instructions completely, then we have a problem with add-with-carry. I am not sure I understand your proposal. Do you want to do a bignum computation using the whole vector register as one huge integer? This would be tempting indeed, but it can't be done in a single clock cycle unless you implement some very heavy carry-look-ahead circuitry. And you will still have a two-result instruction.

Some kind of software pipelining might be the most efficient solution to the add-with-carry problem, but I am not sure we have found a software pipelining model that is sufficiently flexible and not too complicated. Maybe the vector registers can be used as software pipelines so that the data are shifted one vector position at each clock tick.

Regarding 16-bit instruction size:
Instruction length decoding is a serious bottleneck in x86 processors. We should have as few different instruction lengths as possible.

Separate Register Files:
The vector registers cannot have callee-save status because the length is variable. This is a problem if you want to reserve the scalar registers for integer instructions only and do all floating point operations in vector registers. Then you have no floating point register with callee-save status, unless you define callee-save status for part of the register only (x64 Windows does this).

Exceptions:
The x86 BOUND instruction was removed in 64-bit mode, apparently because it was inefficient and rarely used (the code byte has been reused for something else). Checking array bounds can be done quite simple with a compare-and-conditional-jump instruction, which is already included in my proposed instruction set. If you are using unsigned compare then you don't need to check the lower bound (assuming that it is zero).

Checking for integer overflow is very complicated in high-level languages (see http://stackoverflow.com/questions/199333/how-to-detect-integer-overflow-in-c-c). We could make it easier by improving support for checking integer overflow in the instruction set. I have already proposed to use the predicate/mask register for flags also. We could add two more flags bits which are accumulating, i.e. the signed and unsigned overflow condition is ORed with the previous value of the flag bit. The program can then check these accumulating overflow flags after a series of instructions. The compiler may use multiple registers for overflow flags in cases where the extra dependencies would prevent out-of-order execution. These flags registers are then ORed together in the end when you need to check for overflow. This mechanism might also be used for floating point errors as an alternative to the NAN-propagation mechanism. High level language support can easily be implemented with try-catch statements:

try {
  ... a long series of calculations ...
}
catch (signed_integer_overflow e) {
  ... error message ...
}
I don't know if this is sufficiently useful to justify the cost of having more two-result instructions.
 
thread Proposal for an ideal extensible instruction set new - Agner - 2015-12-27
replythread Itanium new - Ethan - 2015-12-28
last reply Itanium new - Agner - 2015-12-28
replythread Proposal for an ideal extensible instruction set new - hagbardCeline - 2015-12-28
last replythread Proposal for an ideal extensible instruction set new - Agner - 2015-12-28
reply Proposal for an ideal extensible instruction set new - Adrian Bocaniciu - 2016-01-04
reply Proposal for an ideal extensible instruction set new - Adrian Bocaniciu - 2016-01-04
reply Proposal for an ideal extensible instruction set new - Adrian Bocaniciu - 2016-01-04
replythread Proposal for an ideal extensible instruction set new - Adrian Bocaniciu - 2016-01-04
reply Proposal for an ideal extensible instruction set new - Adrian Bocaniciu - 2016-01-05
replythread Proposal for an ideal extensible instruction set new - John D. McCalpin - 2016-01-05
last reply Proposal for an ideal extensible instruction set new - Adrian Bocaniciu - 2016-01-06
last reply Proposal for an ideal extensible instruction set new - Ook - 2016-01-05
last reply Proposal for an ideal extensible instruction set new - acppcoder - 2016-03-27
reply Proposal for an ideal extensible instruction set new - Jake Stine - 2016-01-11
replythread Proposal for an ideal extensible instruction set new - Agner - 2016-01-12
last replythread Proposal for an ideal extensible instruction set new - Jonathan Morton - 2016-02-02
last replythread Proposal for an ideal extensible instruction set new - Agner - 2016-02-03
last replythread Proposal for an ideal extensible instruction set new - Jonathan Morton - 2016-02-12
last replythread Proposal for an ideal extensible instruction set new - Hubert Lamontagne - 2016-02-18
last replythread Proposal for an ideal extensible instruction set new - Agner - 2016-02-21
last replythread Proposal for an ideal extensible instruction set new - Hubert Lamontagne - 2016-02-22
last replythread Proposal for an ideal extensible instruction set - Agner - 2016-02-23
replythread Proposal for an ideal extensible instruction set new - Hubert Lamontagne - 2016-02-23
last replythread Proposal for an ideal extensible instruction set new - Agner - 2016-02-24
last replythread Proposal for an ideal extensible instruction set new - asdf - 2016-02-24
last reply Proposal for an ideal extensible instruction set new - Agner - 2016-02-24
last reply Proposal for an ideal extensible instruction set new - Agner - 2016-02-25
replythread limit instruction length to power of 2 new - A-11 - 2016-02-24
last replythread limit instruction length to power of 2 new - Agner - 2016-02-24
replythread Any techniques for more than 2 loads per cycle? new - Hubert Lamontagne - 2016-02-24
last reply Any techniques for more than 2 loads per cycle? new - Agner - 2016-02-25
last replythread limit instruction length to power of 2 new - A-11 - 2016-02-25
last reply limit instruction length to power of 2 new - Hubert Lamontagne - 2016-02-25
replythread More ideas new - Agner - 2016-03-04
replythread More ideas new - Hubert Lamontagne - 2016-03-07
last reply More ideas new - Agner - 2016-03-08
last reply More ideas new - Agner - 2016-03-09
replythread Proposal for an ideal extensible instruction set new - Joe Duarte - 2016-03-07
reply Proposal for an ideal extensible instruction set new - Agner - 2016-03-08
last replythread Proposal for an ideal extensible instruction set new - Hubert Lamontagne - 2016-03-08
last replythread Proposal for an ideal extensible instruction set new - Joe Duarte - 2016-03-09
last replythread Proposal for an ideal extensible instruction set new - Agner - 2016-03-10
last replythread Proposal for an ideal extensible instruction set new - Hubert Lamontagne - 2016-03-11
last replythread Proposal for an ideal extensible instruction set new - Agner - 2016-03-11
last replythread Proposal for an ideal extensible instruction set new - anon2718 - 2016-03-13
last reply Proposal for an ideal extensible instruction set new - Agner - 2016-03-14
replythread A design without a TLB new - Agner - 2016-03-11
replythread A design without a TLB new - Hubert Lamontagne - 2016-03-11
reply A design without a TLB new - Agner - 2016-03-11
last reply A design without a TLB new - Agner - 2016-03-12
reply A design without a TLB new - Bigos - 2016-03-13
last reply A design without a TLB new - Agner - 2016-03-28
replythread Proposal now published new - Agner - 2016-03-22
last replythread Proposal now published new - Hubert Lamontagne - 2016-03-23
last replythread Proposal now published new - Agner - 2016-03-24
last replythread Proposal now published new - Hubert Lamontagne - 2016-03-24
last replythread Proposal now published new - Agner - 2016-03-24
last replythread Proposal now published new - Hubert Lamontagne - 2016-03-24
last replythread Proposal now published new - Agner - 2016-03-25
last replythread Proposal now published new - Hubert Lamontagne - 2016-03-28
last replythread Proposal now published new - Agner - 2016-03-29
last replythread Proposal now published new - Hubert Lamontagne - 2016-03-30
last replythread Proposal now published new - Agner - 2016-03-30
last replythread Do we need instructions with two outputs? new - Agner - 2016-03-31
last replythread Do we need instructions with two outputs? new - Hubert Lamontagne - 2016-04-01
reply Do we need instructions with two outputs? new - Agner - 2016-04-01
replythread Do we need instructions with two outputs? new - Joe Duarte - 2016-04-02
last replythread Do we need instructions with two outputs? new - Agner - 2016-04-02
last reply Do we need instructions with two outputs? new - Joe Duarte - 2016-04-02
last replythread Do we need instructions with two outputs? new - Agner - 2016-04-02
last replythread Do we need instructions with two outputs? new - Hubert Lamontagne - 2016-04-02
last replythread Do we need instructions with two outputs? new - Agner - 2016-04-03
reply Do we need instructions with two outputs? new - Joe Duarte - 2016-04-03
last replythread Do we need instructions with two outputs? new - Hubert Lamontagne - 2016-04-03
last replythread Do we need instructions with two outputs? new - Agner - 2016-04-04
reply Do we need instructions with two outputs? new - Hubert Lamontagne - 2016-04-04
last replythread Do we need instructions with two outputs? new - Joe Duarte - 2016-04-06
last replythread Do we need instructions with two outputs? new - Hubert Lamontagne - 2016-04-07
last replythread Do we need instructions with two outputs? new - HarryDev - 2016-04-08
last reply Do we need instructions with two outputs? new - Hubert Lamontagne - 2016-04-09
replythread How about stack machine ISA? new - A-11 - 2016-04-10
last replythread treating stack ISA as CISC architecure new - A-11 - 2016-04-14
last replythread treating stack ISA as CISC architecure new - Agner - 2016-04-14
last replythread treating stack ISA as CISC architecure new - A-11 - 2016-04-17
replythread treating stack ISA as CISC architecure new - Hubert Lamontagne - 2016-04-17
last replythread stack ISA versus long vectors new - Agner - 2016-04-18
last replythread stack ISA versus long vectors new - Hubert Lamontagne - 2016-04-19
last reply stack ISA versus long vectors new - Agner - 2016-04-20
last reply treating stack ISA as CISC architecure new - A-11 - 2016-04-18
replythread Proposal for an ideal extensible instruction set new - zboson - 2016-04-11
last replythread Proposal for an ideal extensible instruction set new - Agner - 2016-04-11
last replythread Proposal for an ideal extensible instruction set new - Hubert Lamontagne - 2016-04-11
last replythread Proposal for an ideal extensible instruction set new - Agner - 2016-04-12
last reply Proposal for an ideal extensible instruction set new - Hubert Lamontagne - 2016-04-12
replythread Version 1.01 new - Agner - 2016-05-10
last replythread Version 1.01 new - Hubert Lamontagne - 2016-05-13
last replythread Version 1.01 new - Agner - 2016-05-14
last replythread Version 1.01 new - Harry - 2016-06-02
replythread Public repository new - Agner - 2016-06-02
reply Public repository new - Harry - 2016-06-02
last reply Public repository new - Harry - 2016-06-02
last reply Public repository new - Agner - 2016-06-09
replythread Rethinking DLLs and shared objects new - Agner - 2016-05-20
replythread Rethinking DLLs and shared objects new - cv - 2016-05-20
last reply Rethinking DLLs and shared objects new - Agner - 2016-05-20
replythread Rethinking DLLs and shared objects new - Peter Cordes - 2016-05-30
last replythread Rethinking DLLs and shared objects new - Agner - 2016-05-30
last replythread Rethinking DLLs and shared objects new - Joe Duarte - 2016-06-17
last replythread Rethinking DLLs and shared objects new - Agner - 2016-06-18
last reply Rethinking DLLs and shared objects new - Bigos - 2016-06-18
last replythread Rethinking DLLs and shared objects new - Freddie Witherden - 2016-06-02
last replythread Rethinking DLLs and shared objects new - Agner - 2016-06-04
last replythread Rethinking DLLs and shared objects new - Freddie Witherden - 2016-06-04
last reply Rethinking DLLs and shared objects new - Agner - 2016-06-06
replythread Is it better to have two stacks? new - Agner - 2016-06-05
reply Is it better to have two stacks? new - Hubert Lamontagne - 2016-06-07
replythread Is it better to have two stacks? new - Eden Segal - 2016-06-13
last replythread Is it better to have two stacks? new - Agner - 2016-06-13
last replythread Is it better to have two stacks? new - Hubert Lamontagne - 2016-06-14
last replythread Is it better to have two stacks? new - Agner - 2016-06-14
last replythread Is it better to have two stacks? new - Hubert Lamontagne - 2016-06-15
last replythread Is it better to have two stacks? new - Agner - 2016-06-15
last replythread Is it better to have two stacks? new - Hubert Lamontagne - 2016-06-16
last replythread Is it better to have two stacks? new - Agner - 2016-06-16
last reply Is it better to have two stacks? new - Hubert Lamontagne - 2016-06-17
last reply Is it better to have two stacks? new - Freddie Witherden - 2016-06-22
last reply Now on Github new - Agner - 2016-06-26