x86-64 (and x86-32) machine code, 13 bytes (earlier versions: 13, then 15)
changelog:
Bugfix: the first version was only checking G=0xff, not requiring R and B to be 0. I changed to modifying the background in place so I could use lodsd on the foreground to have fg pixels in eax for the short-form cmp eax, imm32 encoding (5 bytes), instead of cmp dh,0xff (3 bytes).
Save 2 bytes: noticed that modifying the bg in place allowed using a memory operand for cmov, saving a 2-byte mov load (and saving a register, in case that matters).
This is a function following the x86-64 System V calling convention, callable directly from C or C++ (on x86-64 non-Windows systems) with this signature:
void chromakey_blend_RGB32(uint32_t *background /*rdi*/,
const uint32_t *foreground /*rsi*/,
int dummy, size_t pixel_count /*rcx*/);
The image format is RGB0 32bpp, with the green component at the 2nd lowest memory address within each pixel. The background image is modified in place. pixel_count is rows*columns. The function doesn't care about rows/columns; it just chroma-key blends however many dwords of memory you specify.
RGBA (with A required to be 0xFF) would require using a different constant, but no change in function size. Foreground DWORDs are compared for exact equality against an arbitrary 32-bit constant stored in 4 bytes, so any pixel-order or chroma-key colour can be easily supported.
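To make the choice of constant concrete, here is a small sketch in C. The helper name key_from_bytes is hypothetical (it is not part of the answer); it just packs four bytes, given in memory order, into the dword that a little-endian load such as lodsd would place in eax.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only: pack four bytes given in
   memory order (lowest address first) into the 32-bit value that a
   little-endian dword load would see. */
uint32_t key_from_bytes(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
    return (uint32_t)b0
         | ((uint32_t)b1 << 8)
         | ((uint32_t)b2 << 16)
         | ((uint32_t)b3 << 24);
}
```

With this, RGB0 pure green (bytes 00 ff 00 00) gives 0x0000ff00, the constant used in the function, and RGBA pure green with A=0xFF (bytes 00 ff 00 ff) gives 0xff00ff00. Either way the constant still occupies the same 4 bytes of the cmp instruction.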
The same machine code also works in 32-bit mode. To assemble as 32-bit, change rdi to edi in the source. All other registers that become 64-bit are implicit (lodsd/stosd, and loop), and the other explicit regs stay 32-bit. But note that you'll need a wrapper to call from 32-bit C, because none of the standard x86-32 calling conventions use the same regs as x86-64 SysV.
NASM listing (machine-code + source), commented for asm beginners with descriptions of what the more complex instructions do. (Duplicating the instruction reference manual is bad style in normal usage.)
 1                       ;; inputs:
 2                       ;; Background image pointed to by RDI, RGB0 format (32bpp)
 3                       ;; Foreground image pointed to by RSI, RGBA or RGBx (32bpp)
 4          machine      ;; Pixel count in RCX
 5          code         global chromakey_blend_RGB32
 6          bytes        chromakey_blend_RGB32:
 7 address               .loop:                          ;do {
 8 00000000 AD               lodsd                       ; eax=[rsi], esi+=4. load fg++
 9 00000001 3D00FF0000       cmp    eax, 0x0000ff00      ; check for chromakey
10 00000006 0F4407           cmove  eax, [rdi]           ; eax = (fg==key) ? bg : fg
11 00000009 AB               stosd                       ; [rdi]=eax, edi+=4. store into bg++
12 0000000A E2F4             loop .loop                  ;} while(--rcx)
13
14 0000000C C3               ret
## next byte starts at 0x0D, function length is 0xD = 13 bytes
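For readers more comfortable with C than asm, the loop can be modelled roughly as below. This is only a sketch of my reading of the listing, not something assembled from it: the function name chromakey_blend_model is made up, the variable names simply mirror the registers, and the unused dummy argument is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C model of the asm loop; names mirror the registers. */
void chromakey_blend_model(uint32_t *rdi, const uint32_t *rsi, size_t rcx)
{
    do {
        uint32_t eax = *rsi++;      /* lodsd: load fg pixel, advance fg pointer */
        if (eax == 0x0000ff00u)     /* cmp eax, 0x0000ff00 */
            eax = *rdi;             /* cmove eax, [rdi]: take the bg pixel instead */
        *rdi++ = eax;               /* stosd: store into bg, advance bg pointer */
    } while (--rcx);                /* loop .loop */
}
```

Like the asm, this is a do-while, so it assumes a pixel count of at least 1: loop decrements the counter before testing it, so a count of 0 would wrap around instead of returning immediately.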
To get the original NASM source out of this listing, strip the leading 26 characters of each line with <chromakey.lst cut -b 26- > chromakey.asm. I generated this with
nasm -felf64 chromakey-blend.asm -l /dev/stdout | cut -b -28,$((28+12))-
because NASM listings leave more blank columns than I want between the machine code and the source. To build an object file you can link with C or C++, use nasm -felf64 chromakey.asm. (Or yasm -felf64 chromakey.asm.)
Untested, but I'm pretty confident that the basic idea of load / load / cmov / store is sound, because it's so simple.
I could save 3 bytes if I could require the caller to pass the chroma-key constant (0x00ff00) as an extra arg, instead of hard-coding the constant into the function. I don't think the usual rules allow writing a more generic function that has the caller set up constants for it. But if they did, the 3rd arg (currently dummy) is passed in edx in the x86-64 SysV ABI. Just change cmp eax, 0x0000ff00 (5B) to cmp eax, edx (2B).
With SSE4 or AVX, you might do this faster (but larger code size) with pcmpeqd and blendvps to do a 32-bit element size variable-blend controlled by the compare mask. (With pand, you could ignore the high byte). For packed RGB24, you might use pcmpeqb and then 2x pshufb+pand to get TRUE in bytes where all 3 components of that pixel match, then pblendvb.
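The compare-mask-then-blend idea for the 32-bit case can be sketched in portable scalar C, one pixel per iteration instead of one vector. This is deliberately not SIMD code and uses no intrinsics; it only models what each 32-bit lane would compute. The function name and the key and care_mask parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a per-lane compare mask followed by a blend.
   Hypothetical names; care_mask selects which bits must match the key
   (0x00ffffff ignores the high byte, as pand would). */
void chromakey_blend_masked(uint32_t *background, const uint32_t *foreground,
                            size_t pixel_count, uint32_t key, uint32_t care_mask)
{
    for (size_t i = 0; i < pixel_count; i++) {
        uint32_t fg = foreground[i];
        /* Compare step: all-ones if the selected bits equal the key, else zero. */
        uint32_t is_key = (uint32_t)0 - (uint32_t)((fg & care_mask) == (key & care_mask));
        /* Blend step: keep bg where the mask is set, take fg elsewhere. */
        background[i] = (background[i] & is_key) | (fg & ~is_key);
    }
}
```

Called with key 0x0000ff00 and care_mask 0x00ffffff, this treats both 0x0000ff00 and 0xff00ff00 as the key colour, so RGB0 and RGBA inputs behave the same.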
(I know this is code-golf, but I did consider trying MMX before going with scalar integer.)
May we take an image object in the native format of the language/library as input, or do we have to read the image via filename? – notjagan – 2017-07-16T19:20:12.907
@notjagan You may take image objects as input. – ckjbgames – 2017-07-16T19:21:42.530
Can you add an example? – ovs – 2017-07-16T19:22:23.697
@ovs i need someone else to provide something – ckjbgames – 2017-07-16T19:23:15.833
Is I/O of arrays of arrays of integers acceptable or are we actually restricted to some other set of image I/O? – Jonathan Allan – 2017-07-16T21:26:04.460
Can we take the chroma-key (0x00ff00) as a function arg? I'm assuming the constant has to be hard-coded into the function, but I could save bytes by having the caller put it in a register for me, making the function more flexible/generic. – Peter Cordes – 2017-07-17T16:58:33.953
@PeterCordes I will allow that. – ckjbgames – 2017-07-17T19:27:26.383
@JonathanAllan Yes. Just explain the format used. – ckjbgames – 2017-07-17T19:27:43.073
@ckjbgames: Are you sure that's a good idea? Several existing answers could probably be smaller if they got the caller to put the colorkey into a variable in the form they want it (e.g. a list like [0,255,0]). I'd actually recommend that you don't change the rules at this point, from what people were assuming. Unless it's normal to allow that kind of thing for codegolf, and the other answers should have thought of that :P – Peter Cordes – 2017-07-17T19:33:40.077
@PeterCordes ok – ckjbgames – 2017-07-17T19:35:12.487
@PeterCordes Done – ckjbgames – 2017-07-17T19:36:14.893
Oh, I think you were misunderstanding what I was asking. I was talking about the color value to match, not the array of image pixels. I was asking about replacing the 0xff00 32-bit constant with a one-byte reference to an argument passed by the caller (which on second thought I decided would be a bad change, because it's basically cheating by offloading part of the function to the caller, even if you could justify it in a chromakey function). Your edit to the question to allow taking images as arrays of pixels was good, but nothing to do with what I was talking about. – Peter Cordes – 2017-07-17T19:40:32.767
@PeterCordes Oh. Undo. – ckjbgames – 2017-07-17T19:44:52.297