Agner's CPU blog

Bilinear Interpolation

Author: Hubert Lamontagne

Date: 2016-10-30 21:47

Agner wrote:

I don't think I understand your problem. You have four RGBA points, each in its own vector register. Each of these should be multiplied by a factor and the results added together. It's just multiplication and addition of 8-bit integers. You may zero-extend them all to 16 bits to avoid loss of precision, then shift right and compress back to 8 bits.

In pseudo-ASM it looks somewhat like this:

loop:

; generate memory addresses
shr r0, u_coord, 24
shr r1, v_coord, 24
shl r1, 8
add r4, r0, r1
add r2, r0, 1
and r2, 255
add r5, r2, r1
add r3, r1, 256
and r3, 65535
add r6, r0, r3
add r7, r2, r3

; load the 4 texels
mov v0, int32 [texture_adr + r4*4]
mov v1, int32 [texture_adr + r5*4]
mov v2, int32 [texture_adr + r6*4]
mov v3, int32 [texture_adr + r7*4]

; expand components from 8bits to 16bits to avoid wrap-around problems
expand.u8.u16 v0
expand.u8.u16 v1
expand.u8.u16 v2
expand.u8.u16 v3

; generate interpolation factors
shr r8, u_coord, 16
and r8, 255
mov broadcast.u16x4 v4, r8
shr r9, v_coord, 16
and r9, 255
mov broadcast.u16x4 v5, r9

; linear interpolate
sub.s16 v1, v0
sub.s16 v3, v2
mul.s16 v1, v4
mul.s16 v3, v4
shr.s16 v1, 8
shr.s16 v3, 8
add.s16 v1, v0
add.s16 v3, v2
sub.s16 v3, v1
mul.s16 v3, v5
shr.s16 v3, 8
add.s16 v3, v1

; alpha blend
mov v6, u32 [screen_ptr]
expand.u8.u16 v6
mov broadcast.s16x4 v7, v3[3]
sub.s16 v3, v6
mul.s16 v3, v7
shr.s16 v3, 8
add.s16 v3, v6
shrink.u16.u8 v3
mov u32 [screen_ptr], v3

; increment and loop
add screen_ptr, 4
add u_coord, u_delta
add v_coord, v_delta
cmp r10, screen_ptr, span_end_ptr
jz r10 loop
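
For reference, the same per-pixel work can be sketched in plain C. This is only a scalar illustration of the loop above, not a tuned implementation. It assumes a 256x256 texture of 32-bit RGBA texels and 32-bit coordinates with the texel index in the top 8 bits and the fraction in the next 8 bits, as the shifts and masks in the listing imply. The names lerp8, sample_bilinear, blend_over and draw_span are made up for this sketch.

```c
#include <stdint.h>

/* Interpolate one 8-bit channel with a factor f in 0..255.
   (a*(256-f) + b*f) >> 8 equals a + (((b-a)*f) >> 8) with an arithmetic
   shift, but never goes negative. The product can reach 255*255, so it
   is kept in 32 bits here. */
uint32_t lerp8(uint32_t a, uint32_t b, uint32_t f)
{
    return (a * (256u - f) + b * f) >> 8;
}

/* Fetch the 2x2 texel neighbourhood and filter it, wrapping at the edges. */
uint32_t sample_bilinear(const uint32_t *texture, uint32_t u, uint32_t v)
{
    uint32_t x0 = u >> 24;                  /* column, 0..255 */
    uint32_t y0 = (v >> 24) << 8;           /* row offset, row * 256 */
    uint32_t x1 = (x0 + 1) & 255;           /* next column, wrapped */
    uint32_t y1 = (y0 + 256) & 65535;       /* next row, wrapped */
    uint32_t fu = (u >> 16) & 255;          /* horizontal fraction */
    uint32_t fv = (v >> 16) & 255;          /* vertical fraction */

    uint32_t t00 = texture[y0 + x0], t10 = texture[y0 + x1];
    uint32_t t01 = texture[y1 + x0], t11 = texture[y1 + x1];

    uint32_t out = 0;
    for (int s = 0; s < 32; s += 8) {       /* one pass per channel */
        uint32_t top = lerp8((t00 >> s) & 255, (t10 >> s) & 255, fu);
        uint32_t bot = lerp8((t01 >> s) & 255, (t11 >> s) & 255, fu);
        out |= lerp8(top, bot, fv) << s;
    }
    return out;
}

/* Blend src over dst, using the top byte of src as alpha for all channels. */
uint32_t blend_over(uint32_t dst, uint32_t src)
{
    uint32_t alpha = src >> 24;
    uint32_t out = 0;
    for (int s = 0; s < 32; s += 8)
        out |= lerp8((dst >> s) & 255, (src >> s) & 255, alpha) << s;
    return out;
}

/* Walk one horizontal span, stepping the texture coordinates per pixel. */
void draw_span(uint32_t *screen, const uint32_t *span_end,
               const uint32_t *texture,
               uint32_t u, uint32_t v, uint32_t du, uint32_t dv)
{
    for (; screen != span_end; ++screen) {
        *screen = blend_over(*screen, sample_bilinear(texture, u, v));
        u += du;
        v += dv;
    }
}
```

Each channel goes through three interpolations for the filter and one for the blend, which matches the three sub/mul/shr/add groups plus the alpha blend in the listing.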

So the problem I'm seeing is not so much the linear interpolation as having to execute ~50 instructions per pixel for the addressing, interpolation, alpha blending and so forth. You could compute multiple pixels together to soften the blow a bit. That should help the interpolation and alpha blending parts, although it doesn't help much with the addressing part.
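
One way to picture the multi-pixel idea is to split the work into two phases, sketched below in C. The first phase does the addressing and loads one pixel at a time. The second phase runs the arithmetic over a small batch with no data-dependent addresses, so it could in principle be mapped onto wider vectors. The batch size of 4, the TexelBatch struct and the function names are assumptions made for illustration, and the texture layout is the same 256x256 RGBA layout assumed by the listing above.

```c
#include <stdint.h>

enum { BATCH = 4 };   /* hypothetical number of pixels handled together */

typedef struct {
    uint32_t t00[BATCH], t10[BATCH], t01[BATCH], t11[BATCH]; /* 2x2 texels */
    uint8_t  fu[BATCH], fv[BATCH];                           /* fractions  */
} TexelBatch;

/* Phase 1: addressing. Every pixel needs its own four loads, so this
   part stays scalar no matter how wide the vectors are. */
void gather_texels(const uint32_t *texture, uint32_t u, uint32_t v,
                   uint32_t du, uint32_t dv, TexelBatch *b)
{
    for (int i = 0; i < BATCH; ++i, u += du, v += dv) {
        uint32_t x0 = u >> 24, x1 = (x0 + 1) & 255;
        uint32_t y0 = (v >> 24) << 8, y1 = (y0 + 256) & 65535;
        b->t00[i] = texture[y0 + x0];
        b->t10[i] = texture[y0 + x1];
        b->t01[i] = texture[y1 + x0];
        b->t11[i] = texture[y1 + x1];
        b->fu[i] = (uint8_t)(u >> 16);   /* keeps bits 16..23 */
        b->fv[i] = (uint8_t)(v >> 16);
    }
}

/* Phase 2: interpolation. Pure arithmetic on the gathered data; the same
   operations are applied to every pixel and channel in the batch. */
void filter_batch(const TexelBatch *b, uint32_t out[BATCH])
{
    for (int i = 0; i < BATCH; ++i) {
        uint32_t fu = b->fu[i], fv = b->fv[i], px = 0;
        for (int s = 0; s < 32; s += 8) {
            uint32_t a = (b->t00[i] >> s) & 255, c = (b->t10[i] >> s) & 255;
            uint32_t d = (b->t01[i] >> s) & 255, e = (b->t11[i] >> s) & 255;
            uint32_t top = (a * (256u - fu) + c * fu) >> 8;
            uint32_t bot = (d * (256u - fu) + e * fu) >> 8;
            px |= ((top * (256u - fv) + bot * fv) >> 8) << s;
        }
        out[i] = px;
    }
}
```

In this split, gather_texels still costs roughly the same per pixel as the address generation and loads in the listing, which is the part that batching does not shrink.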

