How to reduce EXE size of x86 ASM compiled with FASM?

As an exercise, I've written a simple solution to this challenge in x86 assembly language. I'm assembling it with FASM on Windows. Here's my source code:

format PE console
entry start

include 'WIN32A.inc'

section '.text' code executable
start:
    push    char            ; Start at 'A'
    call    [printf]        ; Print the current letter 4 times
    call    [printf]
    call    [printf]
    call    [printf]
    inc     [char]          ; Increment the letter
    cmp     [char], 'Z'     ; Compare to 'Z'
    jle     start           ; char <= 'Z' --> goto start

section 'r.data' data readable writeable
    char    db  'A', 10, 0  ; Stores the current letter

section '.idata' data readable import
    library  msvcrt,   'msvcrt.dll'
    import   msvcrt, printf, 'printf'

When I compile this, I get an executable larger than I expected. Here's a hexdump:

https://pastebin.com/W5sUTvTe

I notice there's a lot of empty space between the code section and the data and library import sections, as well as a message saying "This program cannot be run in DOS mode" embedded in the code. How can I assemble my source code to a small file, suitable for Code Golf?

As a side note, suggestions for better ways to print to stdout without importing msvcrt and calling printf are welcome.

vasilescur

Posted 2017-11-21T13:18:39.450

@iBug I'm sorry to hear that. Could you please suggest a more suitable place for me to ask? – vasilescur – 2017-11-21T13:22:15.720

@iBug Tips questions asking for golfing help in specific cases are most definitely not off-topic here. – AdmBorkBork – 2017-11-21T13:22:50.007

Relevant meta – AdmBorkBork – 2017-11-21T13:25:18.173

Machine assembly code contains a lot of boilerplate, so often people submit functions instead of programs. – user202729 – 2017-11-21T13:51:06.977

Although, in this case, the question explicitly requires programs, so golfing the generated program is necessary. – user202729 – 2017-11-21T13:53:28.670

Check out this answer: https://codegolf.stackexchange.com/a/86106/51429 How did he do that? – vasilescur – 2017-11-21T13:54:37.010

@user3284178 MS-DOS full program boilerplate is often smaller. So, just use MS-DOS instead. – user202729 – 2017-11-21T13:55:59.833

It has to be:
start: push char
Lb: call [printf]
call [printf]
call [printf]
call [printf]
inc [char]
cmp [char], 'Z'
jle Lb
because otherwise the loop could consume the stack; one also has to check whether each call to printf must be followed by an instruction that adjusts esp. – RosLuP – 2017-11-21T14:55:15.777

Not sure this still works, but you might want to try targeting Linux instead and put part of your code in the ELF header itself, as described here. – Felix Palmen – 2017-11-21T15:12:21.093

@user202729 It has to be: start: push char Lb: call [printf] pop eax call [printf] pop eax call [printf] pop eax call [printf] pop eax inc [char] cmp [char], 'Z' jle Lb because otherwise it would exhaust the stack memory; one has to check whether each call to printf must be followed by an instruction that adjusts esp. Is that better? – RosLuP – 2017-11-22T14:17:49.733

Golfing the file format is quite different from golfing the code, though. Now we're getting into specifics of the operating-system version, because restrictions changed between 32-bit and 64-bit, and even between versions of the same. See http://pferrie.host22.com/misc/tiny/pehdr.htm for tips on building your own file header, and Crinkler has tiny code to resolve imports at runtime. – peter ferrie – 2017-11-24T17:52:21.297

Instead of printf, you can call WriteFile(stdout), which needs no imports other than kernel32 (which is present by default; you just need to determine the address). – peter ferrie – 2017-11-24T20:13:13.240

Answers

This is a fairly general tip, but:

Use COM file format instead of PE EXE.

PE EXE has a few flaws that make the format nearly useless for code golf. The first is image alignment (Windows won't run an EXE that isn't aligned properly), and the second is the header size. Less important factors, such as dividing the executable into sections, add further overhead.

Advantages of the COM file format (which is essentially a flat binary) are:

  • No header at all, and the file isn't divided into sections.
  • No image alignment, so the file size need not be a multiple of some fixed power of two. The file has to stay just under 64 KiB, but that hardly matters: if your submission is larger than that, you're doing something wrong.
  • You can't use external libraries. This is actually a plus, because you certainly have other ways to perform I/O. That's where DOS and BIOS interrupts (INT 21h, INT 10h) come in handy.
  • You have direct control over the memory and the devices attached to the system, so there is no paging, no access violations, no memory protection, no concurrency, and so on. These properties make it easier to golf really creative programs.
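
To illustrate the I/O point, here is a minimal sketch in NASM syntax (not part of the original code; the characters printed are arbitrary). It writes one character through the BIOS teletype service, INT 10h with AH=0Eh, and another through DOS function 02h of INT 21h:

```asm
ORG 100H                ; COM programs are loaded at offset 100h

    MOV AH, 0EH         ; BIOS video service 0Eh: teletype output
    MOV AL, 'A'         ; Character to print
    MOV BH, 0           ; Display page 0
    INT 10H

    MOV AH, 2           ; DOS function 02h: write character in DL to stdout
    MOV DL, 'B'         ; Character to print
    INT 21H

    RET                 ; Return to DOS through the INT 20h at PSP offset 0
```

The DOS call goes through standard output and can therefore be redirected, while the BIOS call writes straight to the screen.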

I've revised your code to work as a flat binary. It's dead simple:

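ORG 100H                ; COM programs are loaded at offset 100h

    MOV DX, P           ; DS:DX -> '$'-terminated string
    MOV AH, 9           ; DOS function 09h: print string

L:
    INT 21H             ; Print the current letter 4 times
    INT 21H
    INT 21H
    INT 21H

    INC BYTE [P]        ; Increment the letter
    CMP BYTE [P], 'Z'   ; Compare to 'Z'
    JLE L               ; letter <= 'Z' --> loop again

    MOV AX, 4C00H       ; DOS function 4Ch: exit with return code 0
    INT 21H

P   DB "A", 10, "$"     ; Current letter, newline, string terminator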

The output binary is just 32 bytes. I believe it's possible to reduce the size even further, but this is a starting point.
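
One possible way to shave a few bytes, offered as a sketch rather than a definitive golf: address the letter through BX instead of an absolute address, and leave with a plain RET instead of calling DOS function 4Ch. This assumes that DOS function 09h preserves AH, BX and DX (the version above already relies on AH and DX surviving), and that DOS pushes a zero word on the stack before starting a COM program, so RET lands on the INT 20h at the start of the PSP.

```asm
ORG 100H                ; COM programs are loaded at offset 100h

    MOV DX, P           ; DS:DX -> '$'-terminated string
    MOV BX, DX          ; BX also points at the letter (shorter encodings below)
    MOV AH, 9           ; DOS function 09h: print string

L:
    INT 21H             ; Print the current letter 4 times
    INT 21H
    INT 21H
    INT 21H

    INC BYTE [BX]       ; Increment the letter
    CMP BYTE [BX], 'Z'  ; Compare to 'Z'
    JLE L               ; letter <= 'Z' --> loop again

    RET                 ; Pops the zero word and jumps to INT 20h at PSP:0000

P   DB "A", 10, "$"     ; Current letter, newline, string terminator
```

Counting the instruction encodings by hand, this should come to 26 bytes: the register-indirect INC and CMP save two bytes each over the absolute-address forms, MOV BX, DX costs two, and RET replaces five bytes of exit code with one.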

Assemble with nasm -fbin file.asm -o file.com. Note that this example was written for NASM, but it translates to FASM with little or no change.
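
Since the question uses FASM, a possible translation is sketched below. It assumes FASM's default output, which is a 16-bit flat binary when no format directive is given; the file names are only examples.

```asm
org 100h                ; COM programs are loaded at offset 100h

    mov dx, P           ; DS:DX -> '$'-terminated string
    mov ah, 9           ; DOS function 09h: print string

L:
    int 21h             ; Print the current letter 4 times
    int 21h
    int 21h
    int 21h

    inc byte [P]        ; Increment the letter
    cmp byte [P], 'Z'   ; Compare to 'Z'
    jle L               ; letter <= 'Z' --> loop again

    mov ax, 4C00h       ; DOS function 4Ch: exit with return code 0
    int 21h

P   db 'A', 10, '$'     ; Current letter, newline, string terminator
```

Assemble with fasm file.asm file.com. Apart from letter case and quoting, the only real difference from the NASM version is that no output-format switch is needed on the command line.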

Krzysztof Szewczyk

Posted 2017-11-21T13:18:39.450

I can't believe I answered this question and then came back to it from Google – Krzysztof Szewczyk – 2019-11-08T19:29:56.037