Agner's CPU blog

Introduction website
Author: EricTL  Date: 2017-07-20 19:19
I thought I had replied already. Maybe I did not send it.

Agner wrote:
Instruction truth_tab2 is a two-input LUT.
Instruction truth_tab3 is a three-input LUT.

These instructions implement arbitrary functions for boolean vectors. However, the function is applied only to the least significant bit of each vector element. There are two reasons for this limitation. (1) This is the way boolean vectors are represented in ForwardCom. The boolean value is stored in the least significant bit. The remaining bits are option bits, which can be propagated to the result and used for specifying options to e.g. a floating point instruction using the LUT result as a mask. (2) The 3-input LUT can be implemented in hardware by using an 8-bit barrel shifter where the 8 bits of the LUT are shifted right by the count defined by the three input bits. The hardware already has an 8-bit barrel shifter for each 8-bit vector element so that the implementation is cheap. A LUT function that applies to all bits of a vector will require one 8-bit barrel shifter for each vector bit, which would be more expensive in terms of silicon space.

I am not sure I understand what you want to do with the msb, lsb inputs?
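
The shifter formulation in the quote above can be sketched in a few lines of Python. This is only an illustration: the function names, the order in which the three input bits form the shift count, and the choice to copy the option bits from the first operand are all assumptions, not taken from the ForwardCom specification.

```python
def truth_tab3_element(a: int, b: int, c: int, lut: int) -> int:
    """Apply an 8-bit LUT to the least significant bits of a, b and c."""
    # Assumption: a supplies bit 0 of the shift count, b bit 1, c bit 2.
    count = (a & 1) | ((b & 1) << 1) | ((c & 1) << 2)
    # Shift the 8 LUT bits right by the count and keep bit 0.
    result_bit = (lut >> count) & 1
    # Assumption: the option bits (everything above bit 0) come from a.
    return (a & ~1) | result_bit


def truth_tab3(va, vb, vc, lut):
    """Element-wise version for three boolean vectors of equal length."""
    return [truth_tab3_element(a, b, c, lut) for a, b, c in zip(va, vb, vc)]


# 0xE8 is the majority function under the assumed bit order.
print(truth_tab3([1, 0, 1], [1, 0, 0], [0, 1, 1], 0xE8))
```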

Let's put aside the shift left or right by one (the immediate lsb and msb were there to feed the shift at both ends of the register).

A 4-bit LUT can be made with three 2:1 multiplexers (mux) per bit.

I contend that the logical part of the ALU will need at least three mux per bit anyway, to choose among A | B, A & B, A & ~B and A ^ B.
So in essence, for the logical instructions with a fixed encoding the two mux control bits come from the instruction, while for the LUT-based version the two control bits come from the registers.


In the simplest implementation, the four logical ops are computed first and then the mux4:1 chooses which result to keep.


In the LUT case, while the registers are being read, the four immediate bits of the LUT are amplified and broadcast to the data inputs of the mux4:1.
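
To make the mux picture concrete, here is a small Python model of a two-input LUT applied to every bit of a 64-bit word, built from three 2:1 muxes per bit. It is a sketch only: the helper names and the bit-index convention (index = a_bit + 2*b_bit) are assumptions chosen for the illustration.

```python
MASK64 = (1 << 64) - 1


def mux2(sel, x0, x1):
    """Bitwise 2:1 mux: each result bit is x1 where sel is 1, else x0."""
    return ((sel & x1) | (~sel & x0)) & MASK64


def broadcast(bit):
    """Replicate one LUT bit to all 64 bit positions."""
    return MASK64 if bit & 1 else 0


def lut2(a, b, lut):
    """Apply a 4-bit LUT at every bit position of a and b.

    Assumed convention: LUT bit index = a_bit + 2*b_bit.
    The register bits drive the mux selects, the LUT bits are the data.
    """
    t = [broadcast(lut >> i) for i in range(4)]
    low = mux2(a, t[0], t[1])    # used where b_bit == 0
    high = mux2(a, t[2], t[3])   # used where b_bit == 1
    return mux2(b, low, high)    # third mux picks between the two


a, b = 0x0123456789ABCDEF, 0xFEDCBA9876543210
print(hex(lut2(a, b, 0b0110)), hex(a ^ b))
```

Under this convention the four usual operations are LUT values 0b1110 (A | B), 0b1000 (A & B), 0b0010 (A & ~B) and 0b0110 (A ^ B).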


In modern CMOS on SOI a mux2:1 is just 2 transistors. So if one counts two inverters (2 Tr x 2) per LUT bit for every 8 bits => 4 x 4 x (64 / 8) = 128 Tr for the broadcast,
plus 64 x 6 Tr = 384 Tr for the mux4:1.

Total: 512 Tr for the universal solution vs. a minimum of 64 x (6+1) = 448 Tr for the mux4:1 part plus the cost of the four logical ops...
Granted, it is oversimplified, but I guess anyone can get the gist of it.
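
The back-of-the-envelope count can be reproduced with a few lines of Python. The per-gate costs are the ones given in the post (2 Tr per mux2:1, 2 Tr per inverter, two inverters per LUT bit for each 8-bit group); the function and parameter names are made up for this sketch.

```python
def lut_transistors(inputs, width=64, group=8, tr_per_mux2=2, tr_per_inverter=2):
    """Rough transistor count for an n-input LUT applied to every bit.

    Returns (broadcast, mux, total).
    """
    lut_bits = 2 ** inputs
    groups = width // group
    # Two inverters per LUT bit for every group of bits.
    broadcast = lut_bits * 2 * tr_per_inverter * groups
    # A (2**n):1 mux is a tree of 2**n - 1 two-input muxes.
    mux = width * (lut_bits - 1) * tr_per_mux2
    return broadcast, mux, broadcast + mux


def fixed_op_mux_floor(width=64):
    """Figure from the post for the fixed-opcode mux part: 64 x (6+1).

    It excludes the gates that compute the four logical ops.
    """
    return width * (6 + 1)


print(lut_transistors(2))      # two-input LUT
print(fixed_op_mux_floor())    # lower bound for the fixed-opcode version
```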


Moreover, with the LUT solution one also gets Not, Clear, Id (move), plus any other less common logical expression,
and in a two-register instruction setting (Rd = Rd ops Rs) the LUT enables both (Rd := Rd ops Rs) and (Rd := Rs ops Rd).
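
As an illustration of the extra operations, the table below lists a few LUT values and shows that swapping the operands amounts to swapping two bits of the LUT. The names and the index convention (index = a_bit + 2*b_bit, with a = Rd and b = Rs) are assumptions made for this sketch.

```python
def lut2_bit(a, b, lut):
    """Single-bit two-input LUT; assumed index = a + 2*b."""
    return (lut >> (a + 2 * b)) & 1


def swap_operands(lut):
    """Return the LUT computing f(b, a) when `lut` computes f(a, b)."""
    b0, b1, b2, b3 = [(lut >> i) & 1 for i in range(4)]
    # Entries for (a=1, b=0) and (a=0, b=1) trade places.
    return b0 | (b2 << 1) | (b1 << 2) | (b3 << 3)


NAMED = {
    "clear":       0b0000,
    "a_and_not_b": 0b0010,
    "not_b":       0b0011,
    "not_a":       0b0101,
    "xor":         0b0110,
    "and":         0b1000,
    "id_a":        0b1010,
    "id_b":        0b1100,   # move
    "or":          0b1110,
}

# A & ~B becomes B & ~A with the same instruction, only the LUT changes.
print(bin(swap_operands(NAMED["a_and_not_b"])))
```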

The case for an 8-bit LUT (universal three-input logical operation) is less clear. The mux is now 8:1 => 14 Tr per bit, and the broadcast doubles to 256 Tr => 64 x 14 + 256 = 1152 Tr.
I still think that instruction can be leveraged by a compiler to, at the very least, make the code smaller.
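
To illustrate the code-size argument, here is a sketch of how a compiler could fold a three-input boolean expression into a single 8-bit LUT immediate. The helper names and the index convention (index = a_bit + 2*b_bit + 4*c_bit) are assumptions. The model evaluates the LUT as a sum of minterms, which gives the same result as an 8:1 mux per bit.

```python
def lut3_immediate(f):
    """Build the 8-bit truth table of a three-input boolean function f."""
    lut = 0
    for index in range(8):
        a, b, c = index & 1, (index >> 1) & 1, (index >> 2) & 1
        lut |= (f(a, b, c) & 1) << index
    return lut


def lut3(a, b, c, lut, width=64):
    """Apply an 8-bit LUT at every bit position of a, b and c."""
    mask = (1 << width) - 1
    result = 0
    for index in range(8):
        if (lut >> index) & 1:
            ta = a if index & 1 else ~a
            tb = b if index & 2 else ~b
            tc = c if index & 4 else ~c
            result |= ta & tb & tc   # minterm selected by this LUT bit
    return result & mask


# (a & b) | c needs two ordinary instructions but only one LUT instruction.
imm = lut3_immediate(lambda a, b, c: (a & b) | c)
print(hex(imm))
```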

To finish on a rather hackish and somewhat silly note: in the case of the tiny instruction format, the 4-bit LUT can be implemented by using the code for Rs as both the register and the LUT. In that form Clear can be implemented as LUT R0, thereby reclaiming its opcode.

The Not opcode can be reclaimed as well with LUT R5 (LUT convention b3,b2,b1,b0 -> 0101),
freeing an opcode for something else. A clever compiler should be able to work around the limitation somewhat and get some further optimization.
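
A toy model of this trick, assuming a 4-bit Rs field whose value doubles as the truth table and the index convention index = Rd_bit + 2*Rs_bit (chosen here so that 0101 means Not Rd). The function names and the register file layout are hypothetical.

```python
MASK64 = (1 << 64) - 1


def lut2(a, b, lut):
    """Bitwise 4-bit LUT; assumed index = a_bit + 2*b_bit."""
    result = 0
    for i in range(4):
        if (lut >> i) & 1:
            sel_a = a if i & 1 else ~a
            sel_b = b if i & 2 else ~b
            result |= sel_a & sel_b
    return result & MASK64


def tiny_lut(regs, rd, rs):
    """Hypothetical tiny-format LUT: the Rs code is also the truth table."""
    regs[rd] = lut2(regs[rd], regs[rs], rs & 0xF)


regs = [0x1111 * i for i in range(16)]
regs[3] = 0xF0
tiny_lut(regs, 3, 5)          # LUT R5 -> 0101 -> Rd := ~Rd
print(hex(regs[3]))
tiny_lut(regs, 3, 0)          # LUT R0 -> 0000 -> Rd := 0
print(hex(regs[3]))
```

Under the same assumed convention, the other codes tie an operation to a specific register (code 6 would be XOR with R6, code 8 AND with R8, code 14 OR with R14), which is the kind of limitation a register allocator would have to work around.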

 