I've been practicing shellcoding on Linux for a while, using a VM to develop and test my shellcode. The VM runs inside VMware Workstation 12 Pro on Windows 10. Everything was fine, and I could use the usual int 0x80 to switch to kernel mode, until now. I recently upgraded Windows 10 from 1607 to 1703 and kept the same VMware installation and the same Linux virtual machine. However, int 0x80 no longer appears anywhere in the syscall invocation. The same source code that showed int 0x80 when I compiled it earlier no longer shows it. For instance, the gdb disassembly of the exit function contains these instructions:

<Address> jmp DWORD PTR ds: <address>
<Address> push 0x8
<Address> jmp  <Address>

I wonder what the reason could be.

EDITS:
1. I haven't updated my Linux.
2. I haven't updated my VMware.
3. I use a 32-bit Linux VM on a 64-bit machine.

julian
user148898
  • How about sharing the source code, the binary, and the compiler + flags used? – julian Jun 10 '17 at 19:17
  • @SYS_V Hi. Sorry for the late response; I've had some connectivity issues and got busy lately. I don't think Stack Exchange has an option to share files, so I'll upload them to Drive and post a link. – user148898 Jul 02 '17 at 15:25
  • @SYS_V Here is the link to a folder I am sharing, so that you can see the binary, the source, and the details about compiler flags and version information. Looking forward to your answer. Thanks in advance. :) – user148898 Jul 02 '17 at 19:52
  • I don't see a link anywhere – julian Jul 03 '17 at 03:20
  • That's strange; I did put up a link. Anyway, here it is again. Sorry about that: https://drive.google.com/folderview?id=0B_KJRT1eAlexSzlQMEN5dkF4dDQ – user148898 Jul 04 '17 at 17:25

1 Answer

int $0x80 does appear in some of the C library functions in the statically linked binary, but in far fewer of them than one would expect given how many library functions the binary includes:

$ objdump -dj .text binary | grep "cd 80"
 8049401:   cd 80                   int    $0x80
 806c465:   cd 80                   int    $0x80
 806ea5e:   cd 80                   int    $0x80
 806f040:   cd 80                   int    $0x80
 807b5c5:   cd 80                   int    $0x80
 807b5ce:   cd 80                   int    $0x80
 8092651:   cd 80                   int    $0x80
 8094092:   cd 80                   int    $0x80
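
The same search can be sketched in Python as a raw byte scan. This is only an illustration, not what objdump does internally: it looks for the two-byte encoding cd 80 anywhere in a buffer, so on a whole binary it could also report matches that fall inside other instructions or data. The function name find_int80 is hypothetical, and the sample bytes are copied from the _exit listing in this answer.

```python
def find_int80(code, base=0):
    """Return the address of every cd 80 byte pair found in code."""
    hits = []
    pos = code.find(b"\xcd\x80")
    while pos != -1:
        hits.append(base + pos)
        pos = code.find(b"\xcd\x80", pos + 1)
    return hits


# Bytes of _exit up to the hlt instruction, as listed by objdump.
# The function starts at 0x806c451.
exit_bytes = bytes.fromhex(
    "8b5c2404" "b8fc000000" "ff15f0a90e08" "b801000000" "cd80" "f4"
)

print([hex(addr) for addr in find_int80(exit_bytes, base=0x0806C451)])
```

For the _exit bytes this reports 0x806c465, the same address objdump shows for the int $0x80 in that function.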

Let us examine a few of the library functions to see what is going on:

_exit:

$ objdump -dj .text binary | grep -A10 "<_exit>:"
0806c451 <_exit>:
 806c451:   8b 5c 24 04             mov    0x4(%esp),%ebx
 806c455:   b8 fc 00 00 00          mov    $0xfc,%eax
 806c45a:   ff 15 f0 a9 0e 08       call   *0x80ea9f0
 806c460:   b8 01 00 00 00          mov    $0x1,%eax
 806c465:   cd 80                   int    $0x80            <---
 806c467:   f4                      hlt    
 806c468:   66 90                   xchg   %ax,%ax
 806c46a:   66 90                   xchg   %ax,%ax
 806c46c:   66 90                   xchg   %ax,%ax
 806c46e:   66 90                   xchg   %ax,%ax

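No issues here: int $0x80 appears directly, as a fallback exit (eax = 1) after exit_group (eax = 0xfc) has been attempted through call *0x80ea9f0. Note that exit is different from _exit.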

__execve:

$ objdump -dj .text binary | grep -A10 "<__execve>:"
0806c470 <__execve>:
 806c470:   53                      push   %ebx
 806c471:   8b 54 24 10             mov    0x10(%esp),%edx
 806c475:   8b 4c 24 0c             mov    0xc(%esp),%ecx
 806c479:   8b 5c 24 08             mov    0x8(%esp),%ebx
 806c47d:   b8 0b 00 00 00          mov    $0xb,%eax     <-- 11 in eax == sys_execve
 806c482:   ff 15 f0 a9 0e 08       call   *0x80ea9f0
 806c488:   3d 00 f0 ff ff          cmp    $0xfffff000,%eax
 806c48d:   77 02                   ja     806c491 <__execve+0x21>
 806c48f:   5b                      pop    %ebx
 806c490:   c3                      ret

__libc_open and __open_nocancel:

0806cdb0 <__libc_open>:
 806cdb0:   65 83 3d 0c 00 00 00    cmpl   $0x0,%gs:0xc
 806cdb7:   00 
 806cdb8:   75 25                   jne    806cddf <__open_nocancel+0x25>

0806cdba <__open_nocancel>:
 806cdba:   53                      push   %ebx
 806cdbb:   8b 54 24 10             mov    0x10(%esp),%edx
 806cdbf:   8b 4c 24 0c             mov    0xc(%esp),%ecx
 806cdc3:   8b 5c 24 08             mov    0x8(%esp),%ebx
 806cdc7:   b8 05 00 00 00          mov    $0x5,%eax      <-- 5 in eax == sys_open
 806cdcc:   ff 15 f0 a9 0e 08       call   *0x80ea9f0
 806cdd2:   5b                      pop    %ebx
 806cdd3:   3d 01 f0 ff ff          cmp    $0xfffff001,%eax
 806cdd8:   0f 83 82 32 00 00       jae    8070060 <__syscall_error>
 806cdde:   c3                      ret    
 806cddf:   e8 cc 1b 00 00          call   806e9b0 <__libc_enable_asynccancel>
 806cde4:   50                      push   %eax
 806cde5:   53                      push   %ebx
 806cde6:   8b 54 24 14             mov    0x14(%esp),%edx
 806cdea:   8b 4c 24 10             mov    0x10(%esp),%ecx
 806cdee:   8b 5c 24 0c             mov    0xc(%esp),%ebx
 806cdf2:   b8 05 00 00 00          mov    $0x5,%eax      <-- 5 in eax == sys_open
 806cdf7:   ff 15 f0 a9 0e 08       call   *0x80ea9f0

It looks like many of these library functions make system calls through an indirect call whose target address is stored at location 0x80ea9f0, which is in the .data section.
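
The values loaded into eax in these listings are i386 system call numbers. A small lookup, limited to the four numbers that appear above, makes them easier to read. The dictionary and the helper name syscall_name are illustrative assumptions, and numbers not seen in the listings are deliberately left out.

```python
# i386 system call numbers that appear in the disassembly above.
I386_SYSCALLS = {
    0x01: "exit",
    0x05: "open",
    0x0B: "execve",
    0xFC: "exit_group",
}


def syscall_name(eax):
    """Map the value placed in eax to a syscall name, if it is in the table."""
    return I386_SYSCALLS.get(eax, "unknown (0x%x)" % eax)


for value in (0xFC, 0x01, 0x0B, 0x05):
    print(hex(value), syscall_name(value))
```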

$ readelf -SW binary

[23] .got.plt          PROGBITS        080ea000 0a1000 000044 04  WA  0   0  4
[24] .data             PROGBITS        080ea060 0a1060 000f20 00  WA  0   0 32
[25] .bss              NOBITS          080eaf80 0a1f80 00136c 00  WA  0   0 32
[26] __libc_freeres_ptrs NOBITS        080ec2ec 0a1f80 000018 00  WA  0   0  4
[27] .comment          PROGBITS        00000000 0a1f80 00002b 01  MS  0   0  1
[28] .shstrtab         STRTAB          00000000 0a1fab 00014c 00      0   0  1
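
To confirm by arithmetic that 0x80ea9f0 falls inside .data, the section table can be checked with a short sketch. The address, file offset, and size values are copied from the readelf output above; the helper name locate is hypothetical.

```python
# (name, virtual address, file offset, size) copied from the readelf output.
SECTIONS = [
    (".got.plt", 0x080EA000, 0x0A1000, 0x000044),
    (".data",    0x080EA060, 0x0A1060, 0x000F20),
    (".bss",     0x080EAF80, 0x0A1F80, 0x00136C),
]


def locate(addr):
    """Return (section name, offset into section, file offset), or None."""
    for name, vaddr, offset, size in SECTIONS:
        if vaddr <= addr < vaddr + size:
            delta = addr - vaddr
            # The file offset is meaningless for NOBITS sections such as
            # .bss, which occupy no bytes in the file.
            return name, delta, offset + delta
    return None


name, delta, file_off = locate(0x080EA9F0)
print(name, hex(delta), hex(file_off))
```

Under these values the slot sits 0x990 bytes into .data, which corresponds to file offset 0xa19f0.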

Let's take a look inside the .data section:

$ readelf -x .data binary | less

0x080ea9d0 00000000 00000000 00000000 00000000 ................
0x080ea9e0 00000000 01000000 00000000 00000000 ................
0x080ea9f0 40f00608 b0ad0908 07000000 7f030000 @...............
0x080eaa00 03000000 02000000 00100000 107a0908 .............z..

Remembering that x86 is little-endian, we read 40f00608 as 0806f040.
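
The byte swap can be checked mechanically. A minimal sketch using Python's struct module, where the variable names are illustrative and the input is the first four bytes of the row at 0x080ea9f0 in the hex dump:

```python
import struct

# First four bytes stored at 0x080ea9f0, as shown in the hex dump.
raw = bytes.fromhex("40f00608")

# "<I" means little-endian, unsigned 32-bit integer.
(target,) = struct.unpack("<I", raw)

print(hex(target))
```

int.from_bytes(raw, "little") gives the same value.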

Good news: 0x0806f040 is in the .text section of the binary:

[ 3] .rel.plt          REL             08048138 000138 000070 08   A  0   5  4
[ 4] .init             PROGBITS        080481a8 0001a8 000023 00  AX  0   0  4
[ 5] .plt              PROGBITS        080481d0 0001d0 0000e0 00  AX  0   0 16
[ 6] .text             PROGBITS        080482b0 0002b0 075b64 00  AX  0   0 16
[ 7] __libc_freeres_fn PROGBITS        080bde20 075e20 000b36 00  AX  0   0 16

Let's take a peek at the code at that address:

$ objdump -dj .text binary | grep -A5 0806f040
0806f040 <_dl_sysinfo_int80>:
 806f040:   cd 80                   int    $0x80      <--- !!!!!!!!!
 806f042:   c3                      ret    
 806f043:   8d b6 00 00 00 00       lea    0x0(%esi),%esi
 806f049:   8d bc 27 00 00 00 00    lea    0x0(%edi,%eiz,1),%edi

The address 0x0806f040 holds the first instruction of the function _dl_sysinfo_int80: an int instruction with interrupt vector 0x80 (128), immediately followed by ret.

In conclusion, every library function that calls *0x80ea9f0 is making a system call via the function _dl_sysinfo_int80. No dynamic analysis necessary.
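
Such call sites can also be located by their encoding. In the listings above, call *0x80ea9f0 is encoded as ff 15 followed by the address in little-endian order. The sketch below searches for that pattern; the function name find_indirect_calls is hypothetical, the sample bytes are copied from the __execve listing, and like any raw byte scan it could report false positives on a whole binary.

```python
import struct

# .data slot that holds the address of _dl_sysinfo_int80 in this binary.
SYSINFO_SLOT = 0x080EA9F0


def find_indirect_calls(code, base, slot=SYSINFO_SLOT):
    """Return addresses of `call *slot` (ff 15 + 32-bit little-endian address)."""
    pattern = b"\xff\x15" + struct.pack("<I", slot)
    hits = []
    pos = code.find(pattern)
    while pos != -1:
        hits.append(base + pos)
        pos = code.find(pattern, pos + 1)
    return hits


# Bytes of __execve as listed by objdump; the function starts at 0x806c470.
execve_bytes = bytes.fromhex(
    "53" "8b542410" "8b4c240c" "8b5c2408" "b80b000000"
    "ff15f0a90e08" "3d00f0ffff" "7702" "5b" "c3"
)

print([hex(a) for a in find_indirect_calls(execve_bytes, base=0x0806C470)])
```

For the __execve bytes this reports 0x806c482, matching the call instruction in the listing.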

julian
  • That's a very satisfying answer. It took me some time to walk through the binary myself, following the commands you used. I really appreciate the answer. One question still lingers in my mind, though: I've been reading the book "The Shellcoder's Handbook", and in the text the disassembly calls int 0x80 directly. Could this be due to some modification in GCC? – user148898 Jul 08 '17 at 16:47
  • It may be done to ensure compatibility when creating a statically linked binary. See [What's the purpose of _dl_sysinfo_int80?](https://stackoverflow.com/questions/30251944/whats-the-purpose-of-dl-sysinfo-int80) for more info. Note that the vast majority of ELF binaries are dynamically linked; the book probably uses statically linked binaries for teaching purposes as it simplifies certain things. See [MBE](https://github.com/RPISEC/MBE) and [CSCI 1951H – Software Security and Exploitation](https://cs.brown.edu/courses/csci1951-h/lectures.html) for a more modern treatment of the subject. – julian Jul 08 '17 at 17:16
  • Take a look at the comments starting on line 36: https://github.com/lattera/glibc/blob/master/nptl/sysdeps/unix/sysv/linux/i386/i686/dl-sysdep.h – julian Jul 08 '17 at 17:22
  • That was a very interesting read. It took me time to go through them. Thank you very much. I'll get back if I have any more questions. – user148898 Jul 18 '17 at 11:36