Agner`s CPU blog

Software optimization resources | E-mail subscription to this blog | www.agner.org

Bilinear Interpolation
Author:  Date: 2016-10-30 21:47
Agner wrote:
I don't think I understand your problem. You have four RGBA points in each their vector register. All of these should be multiplied by a factor and then it should all be added together. It's just multiplication and addition of 8-bit integers. You may zero-extend all to 16 bits to avoid loss of precision; then shift right and compress back to 8 bits.
In pseudo-ASM it looks somewhat like this:

loop:

; generate memory addresses
shr r0, u_coord, 24
shr r1, v_coord, 24
shl r1, 8
add r4, r0, r1
add r2, r0, 1
and r2, 255
add r5, r2, r1
add r3, r1, 256
and r3, 65535
add r6, r0, r3
add r7, r2, r3

; load the 4 texels
mov v0, int32 [texture_adr + r4*4]
mov v1, int32 [texture_adr + r4*5]
mov v2, int32 [texture_adr + r4*6]
mov v3, int32 [texture_adr + r4*7]

; expand components from 8bits to 16bits to avoid wrap-around problems
expand.u8.u16 v0
expand.u8.u16 v1
expand.u8.u16 v2
expand.u8.u16 v3

; generate interpolation factors
shr r8, u_coord, 16
and r8, 255
mov broadcast.u16x4 v4, r8
shr r9, v_coord, 16
and r9, 255
mov broadcast.u16x4 v5, r9

; linear interpolate
sub.s16 v1, v0
sub.s16 v3, v2
mul.s16 v1, v4
mul.s16 v3, v4
shr.s16 v1, 8
shr.s16 v3, 8
add.s16 v1, v0
add.s16 v3, v2
sub.s16 v3, v1
mul.s16 v3, v5
shr.s16 v3, 8
add.s16 v3, v1

; alpha blend
mov v6, u32 [screen_ptr]
expand.u8.u16 v6
mov broadcast.s16x4 v7, v3[3]
sub.s16 v3, v6
mul.s16 v3, v7
shr.s16 v3, 8
add.s16 v3, v6
shrink.u16.u8 v6
mov v6, u32 [screen_ptr]

; increment and loop
add screen_ptr, 4
add u_coord, u_delta
add v_coord, v_delta
cmp r10, screen_ptr, span_end_ptr
jz r10 loop


So the problem I'm seeing is not as much the linear interpolation as having to do ~50 instructions per pixel to do your addressing, interpolation, alpha blending and so forth. I guess you could compute multiple pixels together to soften the blow a bit - which I guess would be good for the interpolation and alpha blending parts, although it doesn't help too much for the addressing part.

 
thread Proposal for instruction set - now on Github new - Agner - 2016-06-26
replythread Proposal for instruction set - now on Github new - Joe Duarte - 2016-07-04
last replythread Proposal for instruction set - now on Github new - Agner - 2016-07-04
replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-07-06
last replythread Proposal for instruction set - now on Github new - Agner - 2016-07-06
last replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-07-07
last reply Proposal for instruction set - now on Github new - Agner - 2016-07-07
replythread Whole-function vectorization and conditionals new - Sylvain Collange - 2016-08-15
last replythread Whole-function vectorization and conditionals new - Agner - 2016-08-15
last replythread Whole-function vectorization and conditionals new - Sylvain Collange - 2016-08-15
last replythread Whole-function vectorization and conditionals new - Agner - 2016-08-15
last replythread Whole-function vectorization and conditionals new - Sylvain Collange - 2016-08-15
last replythread Whole-function vectorization and conditionals new - Agner - 2016-08-15
reply Number of input dependencies new - Agner - 2016-08-16
last replythread Whole-function vectorization and conditionals new - Sylvain Collange - 2016-08-16
last replythread Whole-function vectorization and conditionals new - Agner - 2016-08-17
last replythread Merging with first operand new - Sylvain Collange - 2016-08-18
last replythread Merging with first operand new - Agner - 2016-08-19
replythread SIMD exceptions are fine with masking new - Sylvain Collange - 2016-08-19
last replythread SIMD exceptions are fine with masking new - Agner - 2016-08-20
reply SIMD exceptions are fine with masking new - Hubert Lamontagne - 2016-08-20
last reply SIMD exceptions are fine with masking new - Sylvain Collange - 2016-08-25
last reply Merging with first operand new - Hubert Lamontagne - 2016-08-19
last replythread Proposal for instruction set - now on Github new - Joe Duarte - 2016-08-17
last replythread Proposal for instruction set - now on Github new - Agner - 2016-08-18
last replythread Proposal for instruction set - now on Github new - Joe Duarte - 2016-08-31
reply Proposal for instruction set - now on Github new - Agner - 2016-08-31
last reply Proposal for instruction set - now on Github new - Jorcy Neto - 2016-09-01
replythread Proposal for instruction set - now on Github new - Yuhong Bao - 2016-07-12
last reply Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-07-12
replythread Things from MIPS (and novel things) new - Anonymous - 2016-07-28
replythread Things from MIPS (and novel things) new - Agner - 2016-07-28
last reply Things from MIPS (and novel things) new - Hubert Lamontagne - 2016-07-28
last replythread Matrix multiplication new - Agner - 2016-07-29
reply Matrix multiplication new - Hubert Lamontagne - 2016-07-29
last replythread Matrix multiplication new - John D. McCalpin - 2016-07-29
last reply Matrix multiplication new - Agner - 2016-07-29
replythread Introduction website new - Agner - 2016-08-01
last replythread Introduction website new - EricTL - 2017-07-17
last replythread Introduction website new - Agner - 2017-07-18
last replythread Introduction website new - EricTL - 2017-07-20
last reply Introduction website new - Agner - 2017-07-20
replythread Proposal for instruction set - now on Github new - Joe Duarte - 2016-08-04
last replythread Proposal for instruction set - now on Github new - Agner - 2016-08-04
last replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-08-05
replythread Proposal for instruction set - now on Github new - Agner - 2016-08-06
last replythread Proposal for instruction set - now on Github new - fanoI - 2016-08-08
last replythread Proposal for instruction set - now on Github new - Agner - 2016-08-08
last reply Proposal for instruction set - now on Github new - fanoI - 2016-08-09
last replythread Proposal for instruction set - now on Github new - Joe Duarte - 2016-08-08
last replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-08-09
last replythread Proposal for instruction set - now on Github new - Joe Duarte - 2016-08-11
last replythread Proposal for instruction set - now on Github new - Agner - 2016-08-12
last reply Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-08-12
replythread Proposal for instruction set - now on Github new - grant galitz - 2016-08-22
reply Proposal for instruction set - now on Github new - Agner - 2016-08-22
last reply Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-08-24
replythread ARM with scalable vector extensions new - Agner - 2016-08-23
replythread ARM with scalable vector extensions new - Jorcy Neto - 2016-08-23
last reply ARM with scalable vector extensions new - Hubert Lamontagne - 2016-08-26
last reply ARM with scalable vector extensions new - Jorcy Neto - 2016-12-20
replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-09-05
replythread Proposal for instruction set - now on Github new - Agner - 2016-09-05
replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-09-05
last replythread Proposal for instruction set - now on Github new - Agner - 2016-09-06
reply Proposal for instruction set - now on Github new - Bigos - 2016-09-06
last replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-09-06
last replythread Proposal for instruction set - now on Github new - Agner - 2016-09-07
last replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-09-07
last replythread Proposal for instruction set - now on Github new - Agner - 2016-09-08
last reply Proposal for instruction set - now on Github new - Hubert Lamontagne - 2016-09-08
last replythread Proposal for instruction set - now on Github new - Commenter - 2016-09-07
last reply Proposal for instruction set - now on Github new - Bigos - 2016-09-08
last replythread Paging new - Kurt Baumgardner - 2016-09-09
replythread Paging new - Agner - 2016-09-10
reply Paging new - Hubert Lamontagne - 2016-09-11
last replythread Paging new - Kurt Baumgardner - 2016-09-13
replythread Paging new - Agner - 2016-09-13
last reply Paging new - Kurt Baumgardner - 2016-09-13
last replythread Paging new - Hubert Lamontagne - 2016-09-13
last reply Paging new - Kurt Baumgardner - 2016-09-14
replythread Paging new - Hubert Lamontagne - 2016-09-11
last reply Paging new - Kurt Baumgardner - 2016-09-13
last replythread Paging new - Agner - 2016-09-14
last reply Paging new - Jorcy Neto - 2016-09-18
replythread A null register? new - csdt - 2016-09-23
last replythread A null register? new - Agner - 2016-09-24
last replythread A null register? new - Hubert Lamontagne - 2016-09-24
replythread A null register? new - csdt - 2016-09-26
last reply A null register? new - Agner - 2016-09-27
last replythread Indexed registers new - Kurt Baumgardner - 2016-09-26
last replythread Indexed registers new - Agner - 2016-09-27
replythread Indexed registers new - Kurt Baumgardner - 2016-09-27
last reply Indexed registers new - Agner - 2016-09-28
last replythread Indexed registers new - Hubert Lamontagne - 2016-09-28
last replythread Indexed registers new - Kurt Baumgardner - 2016-10-03
reply Indexed registers new - Agner - 2016-10-03
last replythread Indexed registers new - Hubert Lamontagne - 2016-10-04
last replythread Bilinear Interpolation new - Hubert Lamontagne - 2016-10-28
last replythread Bilinear Interpolation new - Agner - 2016-10-29
last replythread Bilinear Interpolation new - Hubert Lamontagne - 2016-10-29
last replythread Bilinear Interpolation new - Agner - 2016-10-30
last reply Bilinear Interpolation - Hubert Lamontagne - 2016-10-30
replythread ForwardCom version 1.04 new - Agner - 2016-12-08
replythread ForwardCom version 1.04 new - Matthias Bentrup - 2016-12-12
last replythread ForwardCom version 1.04 new - Agner - 2016-12-12
last reply ForwardCom version 1.04 new - Matthias Bentrup - 2016-12-14
last replythread Async system calls; horizontal packing instruction new - Joe Duarte - 2016-12-14
reply Async system calls; horizontal packing instruction new - Agner - 2016-12-15
last replythread Comparison of instruction sets new - Agner - 2016-12-17
replythread Comparison of instruction sets new - Joe Duarte - 2016-12-28
reply Comparison of instruction sets new - Agner - 2016-12-29
last reply Comparison of instruction sets new - Hubert Lamontagne - 2016-12-30
last reply Comparison of instruction sets new - Hubert Lamontagne - 2017-01-05
replythread ForwardCom version 1.05 new - Agner - 2017-01-22
replythread Syscall/ISR acceleration new - Jonathan Brandmeyer - 2017-01-22
last replythread Syscall/ISR acceleration new - Agner - 2017-01-23
last replythread Syscall/ISR acceleration new - Jonathan Brandmeyer - 2017-01-25
last reply Syscall/ISR acceleration new - Agner - 2017-01-25
replythread ForwardCom version 1.05 new - Jiří Moravec - 2017-01-23
last reply ForwardCom version 1.05 new - Agner - 2017-01-24
last replythread Jump prefetch? new - csdt - 2017-01-27
last replythread Jump prefetch? new - Agner - 2017-01-27
last replythread Jump prefetch? new - csdt - 2017-01-30
last replythread Jump prefetch? new - Agner - 2017-01-30
last replythread Jump prefetch? new - csdt - 2017-01-30
replythread Jump prefetch? new - Agner - 2017-01-31
reply Jump prefetch? new - csdt - 2017-01-31
last replythread Jump prefetch? new - Hubert Lamontagne - 2017-02-01
last replythread Jump prefetch? new - Agner - 2017-02-01
last replythread Jump prefetch? new - Hubert Lamontagne - 2017-02-01
last replythread Jump prefetch? new - Agner - 2017-02-02
last reply Jump prefetch? new - Agner - 2017-02-14
last replythread Jump prefetch? new - Hubert Lamontagne - 2017-01-31
last replythread High precision arithmetic new - fanoI - 2017-03-21
last reply High precision arithmetic new - Agner - 2017-03-21
replythread Intel's Control-flow Enforcement Technology new - Joe Duarte - 2017-04-13
last reply Intel's Control-flow Enforcement Technology new - Agner - 2017-04-14
reply Proposal for instruction set - now on Github new - Agner - 2017-04-27
replythread Assembler with metaprogramming features new - Agner - 2017-07-27
last replythread Assembler with metaprogramming features new - Kai Rese - 2017-08-11
last replythread Assembler with metaprogramming features new - Agner - 2017-08-11
last replythread Assembler with metaprogramming features new - Kai Rese - 2017-08-14
last replythread Assembler with metaprogramming features new - Agner - 2017-08-14
last reply Assembler with metaprogramming features new - Kai Rese - 2017-08-15
replythread Number of register file ports in implementations new - Hubert Lamontagne - 2017-08-22
last replythread Number of register file ports in implementations new - Agner - 2017-08-23
last replythread Number of register file ports in implementations new - Hubert Lamontagne - 2017-08-27
last replythread Number of register file ports in implementations new - Agner - 2017-08-28
reply Number of register file ports in implementations new - Bigos - 2017-08-28
last reply Number of register file ports in implementations new - Hubert Lamontagne - 2017-08-28
last replythread Proposal for instruction set - now on Github new - yeengief - 2017-09-20
replythread Proposal for instruction set - now on Github new - Agner - 2017-09-20
last replythread Proposal for instruction set - now on Github new - yeengief - 2017-09-20
last replythread Proposal for instruction set - now on Github new - Agner - 2017-09-20
last replythread Proposal for instruction set - now on Github new - yeengief - 2017-09-21
last replythread Proposal for instruction set - now on Github new - Agner - 2017-09-21
last replythread Proposal for instruction set - now on Github new - yeengief - 2017-09-21
last reply Proposal for instruction set - now on Github new - Agner - 2017-09-23
replythread Proposal for instruction set - now on Github new - - - 2017-09-22
last reply Proposal for instruction set - now on Github new - Agner - 2017-09-23
last replythread Proposal for instruction set - now on Github new - Hubert Lamontagne - 2017-09-25
last replythread Proposal for instruction set - now on Github new - Agner - 2017-09-26
last reply Proposal for instruction set - now on Github new - Hubert Lamontagne - 2017-09-26