I agree with the answer suggesting that deciding what is, and is not, a good system call may be difficult by design. Indeed, a single system call taken in isolation might not carry enough information.
In terms of the way Windows works, there are actually two levels of operation:
- NT-level calls. These are the typical system calls you'd expect: stubs in `ntdll.dll` use `sysenter` to reach the kernel-mode routines.
- The Win32 subsystem level. These are the functions implemented in `advapi32.dll`, `kernel32.dll`, `user32.dll` and others. Some of these act directly, others may call the above functions themselves depending on what is needed.
In addition to this, as a broad survey there are two abstraction layers on top of the Win32 subsystem:
- COM and friends: COM, COM+, DCOM, ActiveX, SuperDuper (okay, joking) and others.
- .NET and friends. .NET executables are modified PE binaries that load `mscoree.dll`, amongst other things, and may use APIs in other .NET binaries, in Win32, or via COM interfaces.
Now, for hooking techniques, you have:
- Kernel. These days, ISR hooking is tantamount to stability suicide unless you want to reverse engineer PatchGuard and disable it, or are permanently running your OS under a debugger. However, Windows itself provides supported options for filtering, in the form of filter drivers. This is typically how an antivirus product might scan network activity and implement a firewall.
- User. You have a few options here:
- Patching the binary on launch. @Poly talked about code caves and redirects - given a target binary, one option is to overwrite code in its functions, or the functions of interesting APIs to redirect to your code. Typically, you don't have enough space to patch a jump to any old place in the address space, so you need a "code cave" near to the function to put your call/full size jump.
- Alternatives to this include IAT Hooking, a way to modify a binary's import table to pull in your functions rather than the original target.
- Another alternative is to use DLL redirection. Under Linux, you'd know this as `LD_PRELOAD`. In plain English, you'd load your DLL of the same name and with matching symbols before the desired API DLL, and steal the calls that way. It's even a feature of MSVC that you can produce redirected link functions in DLLs rather than having to implement your own stubs for things you don't care about!
- You can patch COM calls by overwriting the relevant entry in the vtable of the resultant interface class.
- There are other techniques, e.g. DLL Injection etc.
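To make the space constraint from the inline-patching bullet concrete, here is a minimal Python sketch of how the two common x86 relative jumps are encoded. The function names and all addresses are made up for illustration; only the opcodes (`EB` for a short jump, `E9` for a near jump) are standard x86. It only builds the bytes; it does not write them into any process.

```python
import struct


def encode_jmp_rel8(src: int, dst: int) -> bytes:
    """Encode a 2-byte short jump (EB rel8) placed at src that lands on dst."""
    rel = dst - (src + 2)  # displacement is relative to the next instruction
    if not -128 <= rel <= 127:
        raise ValueError("target too far for a short jump; a nearby cave is needed")
    return b"\xEB" + struct.pack("<b", rel)


def encode_jmp_rel32(src: int, dst: int) -> bytes:
    """Encode a 5-byte near jump (E9 rel32) placed at src that lands on dst."""
    rel = dst - (src + 5)
    if not -(2**31) <= rel <= 2**31 - 1:
        raise ValueError("target out of rel32 range")
    return b"\xE9" + struct.pack("<i", rel)


if __name__ == "__main__":
    func = 0x401000    # hypothetical address of the function being patched
    cave = func - 5    # hypothetical 5-byte code cave just before it
    hook = 0x10001000  # hypothetical address of the replacement code
    # 2 bytes overwrite the function start and bounce into the cave...
    print(encode_jmp_rel8(func, cave).hex())
    # ...and the cave holds the full-size jump to the hook.
    print(encode_jmp_rel32(cave, hook).hex())
```

The short form costs only two bytes of the original function but reaches roughly 127 bytes in either direction, which is why the cave has to be close by.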
So if you are looking to implement a sandbox meeting your criteria, then there are a few considerations to take into account:
- First up, where and what you hook. If you hook all of `kernel32.dll`, it is still feasible that API calls may get around your hooks if they never go through those functions. I don't know any COM or .NET code like this, but I'd bet there is some.
- You need to be careful with kernel patching because of PatchGuard. The filter manager provides most of what you could ever want antivirus-wise (and therefore probably sandbox-wise). However, if you go outside the filter manager, you're in unstable land pretty quickly, and there may be limits to what you can do.
- Then, whatever hooks you place in userland are almost definitely detectable by the sandboxed code, assuming you are not using binary translation. For example, let's pick on patching `CreateFile`. Untrusted code could simply read the memory of this function and check that the prologue is as expected. Likewise, there are ways to detect injected DLLs, IAT hooks etc.
- The same applies to sandboxing kernel mode code. If you let the untrusted code install drivers, you are again on a level playing field with it.
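As a rough illustration of the prologue check mentioned above, here is a Python sketch of what untrusted code might do. All function names are my own, and the list of jump opcodes is a deliberately small assumption rather than a complete catalogue of hook styles. The reader function is Windows-only and uses `ctypes`; where a trusted "clean" copy of the bytes comes from (for example, the DLL file on disk) is left out.

```python
import sys

# Opcodes that commonly begin an inline hook on x86/x64 (not exhaustive).
JUMP_PATTERNS = (
    b"\xE9",      # jmp rel32
    b"\xEB",      # jmp rel8
    b"\xFF\x25",  # jmp [mem]
)


def starts_with_jump(live: bytes) -> bool:
    """Heuristic: does the function begin with a jump instruction?"""
    return live.startswith(JUMP_PATTERNS)


def prologue_differs(live: bytes, clean: bytes) -> bool:
    """Compare in-memory bytes against a trusted copy of the prologue."""
    return live[: len(clean)] != clean


def read_export_prologue(dll: str, name: str, count: int = 16) -> bytes:
    """Windows only: read the first bytes of an exported function."""
    import ctypes

    func = getattr(ctypes.WinDLL(dll), name)
    address = ctypes.cast(func, ctypes.c_void_p).value
    return ctypes.string_at(address, count)


if __name__ == "__main__" and sys.platform == "win32":
    live = read_export_prologue("kernel32.dll", "CreateFileW")
    print(live.hex(), starts_with_jump(live))
```

The jump heuristic can give false positives, since some legitimate exports are themselves just a jump to another module, so comparing against a known-good copy is probably the more reliable of the two checks.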
This kind of work has been done - Sandboxie is one product that attempts to implement a sandbox at this kind of level, although it likely has more kernel level stuff than you'd necessarily like. I've only briefly looked at Sandboxie back when I knew less about this sort of thing, but I suspect a core component is kernel level because almost any amount of user mode hooking can be undone if you know to expect it. The only defence is to have a greater level of permission.
You've mentioned not having binary translation - I've not covered the various debugging APIs in Windows for this reason. Using these, you could essentially run a program as if you were stepping through it, and analyse any calls or jumps before they were made. However, this is basically binary translation, and debugging affects performance in any case. Not only that, but if I were writing a program to resist your sandbox, there are a number of known tricks for detecting debuggers which would render the program unusable in the sandbox. One is that a process can only have one debugger attached at a time, so if you relaunch yourself, try to debug that new process and can't, you're being debugged. This definitely applies to `ptrace`.
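For a sense of how little effort debugger detection takes, here is a small Python sketch. Note that it is not the relaunch-and-attach trick described above but a simpler check swapped in for brevity: on Linux it reads the `TracerPid` field of `/proc/self/status`, which is non-zero while a `ptrace` tracer is attached, and on Windows it calls `IsDebuggerPresent` from `kernel32.dll`. The function names are my own.

```python
import sys


def tracer_pid_from_status(status_text: str) -> int:
    """Extract TracerPid from the text of /proc/<pid>/status (0 = not traced)."""
    for line in status_text.splitlines():
        if line.startswith("TracerPid:"):
            return int(line.split(":", 1)[1].strip())
    return 0  # field missing: assume no tracer


def debugger_attached() -> bool:
    """Best-effort check for an attached debugger on Linux or Windows."""
    if sys.platform == "win32":
        import ctypes

        return bool(ctypes.windll.kernel32.IsDebuggerPresent())
    if sys.platform.startswith("linux"):
        with open("/proc/self/status") as fh:
            return tracer_pid_from_status(fh.read()) != 0
    raise NotImplementedError("no check sketched for this platform")


if __name__ == "__main__":
    print("debugger attached:", debugger_attached())
```

A sandbox built on the debugging APIs would have to hide from checks like this one as well as from the attach-based trick, and both are only a few lines of code for the program being sandboxed.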
On a positive note, most non-malware products would likely play well with whichever set of techniques you used, since they're probably not trying to resist running under these environments. This can be quite a useful fact.
Last but not least, I apologise for the excessive use of bullet points! And, my list of techniques for hooking calls is not exhaustive - there are definitely others.