60

It seems like a relatively obvious way to prevent (software) keylogging would be to allow only the current (in-focus) app to receive keystrokes.

There could be a way to make explicit exceptions for macro apps etc. Querying the exception list would make finding a keylogger trivial.

Is there any reason operating systems don't enforce this policy by default?
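To make the proposal concrete, here is a minimal sketch of the policy in Python. It is a toy model, not a real OS API: `InputRouter`, `grant_exception`, and `audit` are hypothetical names, and it only models delivery to user-level apps (it says nothing about anything sitting below the router).

```python
from dataclasses import dataclass, field


@dataclass
class InputRouter:
    """Toy model of an OS input router (hypothetical, not a real OS API)."""
    focused_app: str
    # Apps the user explicitly allowed to see keystrokes while unfocused.
    exceptions: set[str] = field(default_factory=set)
    # Per-app inbox of delivered keys, standing in for real event queues.
    inboxes: dict[str, list[str]] = field(default_factory=dict)

    def grant_exception(self, app: str) -> None:
        self.exceptions.add(app)

    def deliver(self, key: str) -> None:
        # Only the focused app and the listed exceptions receive the key.
        for app in {self.focused_app} | self.exceptions:
            self.inboxes.setdefault(app, []).append(key)

    def audit(self) -> list[str]:
        # The "query the exception list" step: who else can see keystrokes?
        return sorted(self.exceptions)


router = InputRouter(focused_app="editor")
router.grant_exception("macro_tool")
for key in "hi":
    router.deliver(key)

print(router.inboxes.get("editor"))      # keys seen by the focused app
print(router.inboxes.get("background"))  # an app that was never granted access
print(router.audit())                    # the list a user would inspect
```

In this model an unlisted background app never gets an inbox at all, and anything that can see keys shows up in `audit()`.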

user66309
  • 679
  • 1
  • 5
  • 5
  • 25
    The ability to hit Ctrl-Alt-Del? Ctrl-C? Alt-Tab? – schroeder Oct 11 '15 at 19:17
  • 59
    The OS kernel needs to intercept key input. Put a keylogger at the kernel level and you have everything. – schroeder Oct 11 '15 at 19:18
  • 7
    Most mobile OSes do enforce this. With desktop OSes it's a matter of history. They were never designed with malware in mind - apps run with the full privileges of the user that runs them. – paj28 Oct 11 '15 at 19:21
  • 3
    Key loggers are generally installed with root privilege. By definition, root can do anything. Key loggers that don't require root are much more limited in functionality. – Lie Ryan Oct 12 '15 at 07:07
  • 4
    @schroeder: Ctrl+C? Why shouldn't that be handled by the current program only? – Thomas Weller Oct 12 '15 at 08:25
  • @schroeder In operating systems designed from the ground up for security (that means iOS and Android currently), you can't do that. – user253751 Oct 12 '15 at 08:41
  • To see what is lost by sandboxing applications, look at all the things that are possible in Windows or GNU/Linux but not on Android or iOS. Classic Shell and f.lux are the first things that come to mind. You could also imagine an app adding StickyKeys. – user253751 Oct 12 '15 at 08:42
  • @immibis - with sandboxing, you can have a special permission like "permanent keyboard access". This is effectively what apps like third-party keyboards do. It's still much more secure than giving this permission to every app – paj28 Oct 12 '15 at 09:59
  • 5
    @Thomas, Regarding CTRL + C, it doesn't make sense to expect each individual application to implement copying when it could be implemented once for every application. Multiple implementations also lead to feature fragmentation, and inconsistent UI. – Chris Murray Oct 12 '15 at 15:28
  • 2
    @Thomas what happens when the current program is in a loop, or crashed? – schroeder Oct 12 '15 at 16:24
  • I remember some malware that opens an identical window on top of the real one as soon as a specific application (which happens to be hidden in the taskbar) is run. Maybe they can do something even easier: just replace the original application. – user23013 Oct 12 '15 at 17:45
  • @paj28 Current systems seem to avoid having broad permissions like that, perhaps because they know how easy it would be to get into a situation where every app requests full control, and users always accept it because otherwise they can't use any apps. – user253751 Oct 12 '15 at 23:13
  • Similar but different: I find it infuriating when I am typing, and another app comes up and "steals" the keyboard away. **If I am actively typing, this should not be allowed to happen!** (This occurs when my PC is slow and I am trying to open multiple things and log in to more than one place at a time, for example.) –  Oct 13 '15 at 18:00
  • 2
    Not only does the OS kernel need to intercept key input, the OS kernel _handles_ the I/O for incoming key input. There's no way to stop the OS from seeing the input without stopping the input from happening at all. I don't think that's what you want. – dave Oct 14 '15 at 02:18
  • 1
    I've seen some badly designed programs somehow prevent Ctrl+C from working, so the only way to copy text is right click > Copy – phuclv Oct 14 '15 at 04:08
  • I don't think this was about the shortcut for copying text. It was about the Linux terminal shortcut for stopping the execution of the current program. – Zaibis Oct 14 '15 at 10:07
  • @schroeder, you switch/open another program using either the mouse or a global combination (e.g. Alt+Tab or Ctrl+Alt+Del), and use that to stop the misbehaving program, like the Task Manager or the classic [kill(1)](http://manpages.ubuntu.com/manpages/vivid/en/man1/kill.1.html). For a console program, it makes sense that the terminal emulator can catch some keystrokes, or your DE, but not other programs. – Ángel Oct 14 '15 at 21:14
  • @LưuVĩnhPhúc: Have you tried Ctrl+Ins (Copy) and Shift+Ins (Paste)? Those are the CUA (Common User Access) keys as used in OS/2 and other CUA-compatible operating systems, such as Windows... – HeartWare Oct 15 '15 at 13:46

10 Answers

81

Because it wouldn't help.

Most keyloggers are installed at the operating-system level, and the operating system needs to have access to the keystrokes. Alt-Tab program switching, using Ctrl-Alt-Del to terminate malfunctioning programs, and detecting keyboard activity to keep your screensaver from activating all require the OS to see keystrokes.

There's also the minor matter that if you eliminated OS access to the keyboard, every application would need to have a complete set of keyboard drivers built into it.

Mark
  • 34,390
  • 9
  • 85
  • 134
  • What do you mean by "installed at the operating-system level"? Like drivers? – user66309 Oct 11 '15 at 23:20
  • Typically in a driver-like way, yes. – Mark Oct 12 '15 at 02:09
  • 7
    Actually, there is [specific handling for Ctrl+Alt+Delete](https://en.wikipedia.org/wiki/Secure_attention_key) which makes it harder to catch with a keylogger. – Philipp Oct 12 '15 at 08:52
  • 8
    @Philipp, true, Ctrl+Alt+Delete can *only* be caught by an OS-level (or hardware) keylogger. – Mark Oct 12 '15 at 09:00
  • 6
    Mark, do you have any links or further reading about most key loggers being drivers? – paj28 Oct 12 '15 at 09:52
  • 9
    Where are you getting your information about "most" keyloggers? Taken literally, if 51% are at the kernel level, and 49% are at the user level, cutting out 49% would be very useful. Hell, even if 10% are userland loggers, eliminating 10% would be progress. I don't know of any data on what keyloggers are kernel level, but your answer begs the question about what that number really is. – Steve Sether Oct 12 '15 at 14:27
  • 1
    @SteveSether I don't have the number to hand, but "The vast, vast majority" are at the OS level (apart from anything else, anything above that level tends to be obvious to the user and will be picked up very quickly) – Jon Story Oct 12 '15 at 14:44
  • 4
    @JonStory Hi Jon. What's the source of your information? – Steve Sether Oct 12 '15 at 14:47
  • 2
    The security modules of my Computer Science degree - unfortunately as mentioned I don't have the numbers to hand (it was a few years ago now and I didn't note sources at the time!), but I had no reason to doubt the information and was presented by an academic with good familiarity with the field. – Jon Story Oct 12 '15 at 14:51
  • If you're targeting the leading desktop OS at the time of writing, you won't need to go to kernel level; the implementation is trivial in userspace, as you can see from the many example programs that log keystrokes. I just can't see why most keyloggers would ever be written as kernel modules, even if it is true. – Sampo Sarrala - codidact.org Oct 14 '15 at 18:03
  • @SampoSarrala, kernel modules are easier to hide. – Mark Oct 14 '15 at 18:23
  • 2
    I doubt the accuracy of this. I agree that most keyloggers will use the OS facilities (otherwise they would need to inject themselves into every process, which some do), but I don't think they will be kernel modules. Those are harder to code, and not being an administrator would keep you safe. – Ángel Oct 14 '15 at 21:18
31

The keyboard-to-application interface goes through several phases, some of which the OS has little control over, and some of which it provides explicit hooks into for additional functionality. The basic design goes like this: hardware events are received by driver chains, which pass messages to the kernel, which then dispatches them to a global hotkey chain, and finally to the intended application (if not cancelled by any prior step in the chain).
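As a rough illustration of that chain, here is a small Python simulation. It is only a sketch: `dispatch`, `filter_driver`, and `hotkey_chain` are made-up names, the kernel is reduced to the `dispatch` function, and real input stacks are far more involved.

```python
from typing import Callable, Optional

# A stage takes a key event and returns it (possibly changed) or None to cancel.
Stage = Callable[[str], Optional[str]]


def dispatch(key: str, stages: list[Stage], focused_app: Callable[[str], None]) -> bool:
    """Run the key through each stage in order; deliver it only if no stage cancels."""
    for stage in stages:
        result = stage(key)
        if result is None:
            return False  # consumed earlier in the chain
        key = result
    focused_app(key)
    return True


seen_by_driver: list[str] = []
received_by_app: list[str] = []


def filter_driver(key: str) -> Optional[str]:
    # A driver sees every event before anything later in the chain does.
    seen_by_driver.append(key)
    return key


def hotkey_chain(key: str) -> Optional[str]:
    # Global hot keys are consumed here and never reach the focused app.
    return None if key == "Alt+Tab" else key


for event in ["a", "Alt+Tab", "b"]:
    dispatch(event, [filter_driver, hotkey_chain], received_by_app.append)

print(seen_by_driver)   # everything that passed through the driver stage
print(received_by_app)  # what the focused app actually got
```

The point of the sketch is the ordering: anything placed early in the chain sees every event, whether or not the focused app ever receives it.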

The driver chain allows the kernel to not care about "how" the keystrokes are generated, only that they are. They could be from a keyboard, from an IR device, or any other source that could send a signal designed to be interpreted as a keyboard. A hardware keyboard logger, for example, is a dongle that has a USB or PS/2 input on one end, and a USB or PS/2 port on the other, such that the keyboard passes data through this device and is intercepted. The OS literally cannot detect that such logging is going on.

The other common kind of logging happens in software, which can happen either before the OS has a chance to see the keyboard messages, or after. Drivers can do pretty much whatever they want, and the OS can't strictly detect that a driver is diverting messages for nefarious purposes, because they get to inspect messages before the OS does. This is the nature of the hardware abstraction layer (HAL) that the drivers are a part of. Fortunately, since they are in memory, anti-malware software can detect and disable such behavior.

Finally, you have an intentional "hole" in the OS, usually referred to as "global hot keys", that allows any application to request that the keyboard messages are passed to them before the in-focus application. This allows not just Alt-Tab to work (the window manager intercepts these messages to switch apps), but also most media programs request handlers to support multi-media keys like play, rewind, and fast forward, and other user-land apps for volume control, etc. Without all of these global hot keys, the OS would be very annoying to use, and apps would be far more complex as a result. However, just as this is a great feature to have, it can also be abused by a program.

However, you should note that not "all" programs get a copy of a keyboard message, only drivers, global hot key handlers, and the in-focus application. The problem has nothing to do with the fact that every program gets a copy of a keyboard event, but the fact that the HAL needs to be able to transform messages from hardware to kernel messages, and global hot keys are necessary to provide features to the user without each program needing to be built to provide the same features.

There have been advances to lock down the process though, such as requiring "signed drivers," which reduces the likelihood of malicious drivers getting into the driver chain, and anti-viruses that can detect bad behavior by apps. However, until the remaining weaknesses, such as hardware-level keyboard logging and insecure global hot key registration, are addressed, loggers will still have an opportunity to log keystrokes. Even though normal keystrokes appear to go simply from hardware to an app, there are actually several intervening steps, and these steps are required for basic compatibility (drivers) and functionality (global hot keys).

phyrfox
  • 5,724
  • 20
  • 24
  • 7
    Also, IME software has to fit in to the chain somewhere to allow typing in certain languages, primarily Chinese, Japanese and Korean. – alex.forencich Oct 12 '15 at 01:13
  • 1
    Drivers could be locked down preventing them to make a network connection (except if they register as a network driver, in which case they should not be able to access devices registering as keyboards). True, it would be hard to set such a system up, but your story makes it sound as if that would be impossible. And of course installing arbitrary drivers in the first place can be disabled especially on devices like laptops (e.g. as done on Chromebooks). Additionally hardware key loggers are extremely uncommon, so not sure why you give those so much attention. – David Mulder Oct 12 '15 at 04:53
  • 3
    @DavidMulder A driver framework where all drivers are essentially untrusted user applications would be complex at best, but not necessarily impossible. The problem is balancing security with usability. Of course, nobody's saying we want to install arbitrary drivers (hence, why "signed" drivers are now commonplace), but we do want users to be able to actually use the computer. The only point I was trying to make about hardware loggers is that the OS can't trust its own hardware-- devices can pretend to be whatever they want. There's no way to make sure that a keyboard is just a keyboard. – phyrfox Oct 12 '15 at 12:49
  • 1
    *Drivers could be locked down preventing them to make a network connection* could easily be circumvented by e.g. writing the logged keys to disk, or simply creating a new system API for handing the keystrokes to a usermode component. – pjc50 Oct 12 '15 at 15:15
  • "Fortunately, since they are in memory, anti-malware software can detect and disable such behavior." What do you mean? Drivers live in kernel space, which antivirus as any other user space process cannot access. – edmz Oct 12 '15 at 15:25
  • 2
    @black: It's a poor antivirus that runs (only) in user space. – Ben Voigt Oct 12 '15 at 17:18
  • A keylogger driver wouldn't even need a backdoor to send the keys via the network or similar stealthy methods. It can just save the keystrokes, and when the companion application starts (say a screensaver), that companion will request the keystrokes and the driver will just replay them using the default OS interface. Remember, you already needed root-level access to install that driver, adding another screensaver is trivial at that point. – MSalters Oct 14 '15 at 14:56
  • It's important to note that this is mostly a solved problem on mobile and browser-based operating systems. Only older operating systems like OS X, Linux, and Windows that need to support legacy software still have this problem. – Ajedi32 Oct 15 '15 at 13:25
  • From reading through various elements of your answer (particularly, "The driver chain allows the kernel to not care about "how" the keystrokes are generated... They could be from... a signal designed to be interpreted as a keyboard"), I theorized that some keyloggers might have no trouble intercepting keystrokes entered using Windows' **On-Screen Keyboard** accessibility feature. Is that accurate? – Dan Henderson Oct 15 '15 at 16:52
  • @DanHenderson Yes, by the nature of how they work (simulating key presses through the event API), they can be tracked as well. There are some secure web sites, such as what the military uses, that do not trigger key events by using a randomized keyboard and JavaScript, but most native apps that don't roll their own faux keyboard, as just mentioned, would be susceptible to key loggers. Other forms of input, such as Dash, can also be logged by key loggers, despite using mouse-based input. – phyrfox Oct 15 '15 at 19:54
  • @MSalters Obviously proper sandboxing is a part of such a setup where applications can't blindly speak to each other (or specifically a driver shouldn't have writing privileges to the same places as normal applications have). Considering sandboxing of applications has been an increasingly important concern this is not impossible (just hard, as there are a lot of ways for applications to communicate to each other, but that's what sandboxing is all about). – David Mulder Mar 20 '16 at 22:01
24

The reason this isn't done by default is that previous-generation operating system designs didn't have a huge focus on sandboxing and the like, so right now it would require big architectural changes to make this work. Mark touches upon those to some extent in his answer, but it boils down to the fact that you can't allow applications to blindly run with OS privileges.

It is, however, far more interesting to note that modern OSes such as Google's Android, Apple's iOS and even Google's (desktop) ChromeOS all do limit keystrokes to the current application only¹. Focusing just on ChromeOS, as it's the only desktop OS in the list, it's also important to note that global shortcuts create no problems in such a case. An application can 'simply' tell the OS that they wish to bind to a specific shortcut which then can be configured in the OS by the user. The relevant specification can be found here for those of you who're curious how this looks.
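A minimal sketch of what such a shortcut registry could look like, in Python. Everything here is hypothetical (`ShortcutRegistry`, `register`, `route`), and the rule that every chord must contain Ctrl or Alt is an assumption made for the sketch, not a description of ChromeOS's actual API.

```python
from typing import Optional

MODIFIERS = {"Ctrl", "Alt"}  # assumed rule: every chord needs one of these


class ShortcutRegistry:
    """Hypothetical OS-side registry; apps can only suggest chords, never bare keys."""

    def __init__(self) -> None:
        self.bindings: dict[str, tuple[str, str]] = {}  # chord -> (app, command)

    def register(self, app: str, command: str, chord: str) -> bool:
        parts = chord.split("+")
        # Reject anything without a modifier, so plain typing cannot be captured.
        if len(parts) < 2 or not MODIFIERS.intersection(parts[:-1]):
            return False
        # Reject chords that are already taken.
        if chord in self.bindings:
            return False
        self.bindings[chord] = (app, command)
        return True

    def route(self, chord: str) -> Optional[tuple[str, str]]:
        # Only registered chords are routed; everything else stays with the focused app.
        return self.bindings.get(chord)


registry = ShortcutRegistry()
accepted = registry.register("media_player", "play_pause", "Ctrl+Shift+P")
rejected = registry.register("suspicious_app", "capture", "a")
print(accepted, rejected, registry.route("a"))
```

Because the OS owns the table, an app can ask for a chord but has no way to subscribe to ordinary typing, and the user can inspect or change the table in one place.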

Similarly, by taking a look at Android, we can find that accessibility software that requires global keyboard access can still be written in such modern environments by exposing such information if and only if the application is explicitly granted such permissions in the accessibility settings panel. This makes setting up such software a bit of a pain, but it does prevent keyloggers from being distributed.

In conclusion, this isn't done by default not because it doesn't help or is impossible, but because, due to historical priorities, it's just taking a bit longer to get there. We will get there in time, and in the meantime it can make sense to use a more modern, locked-down OS in secure environments.


1 I know that Mac OS X has recently been expanding their sandboxing effort of applications. I presume that a properly sandboxed application (one not requiring administrator privileges) will also be unable to act as a keylogger, however I have spent very little time reading up on how their sandboxing really works. If anybody knows for sure, do share!

Peter Mortensen
  • 877
  • 5
  • 10
David Mulder
  • 1,349
  • 1
  • 8
  • 16
  • 2
    *"an application can 'simply' tell the OS that they wish to bind to a specific shortcut which then can be configured in the OS by the user"* And the OS prevents applications from 'self-registering' such a binding... how? It's software that registers the binding; in principle, what's keeping some other software from doing the same thing? (And I'm not sure I'd want to have to set up Alt+Tab, Ctrl+Esc, Windows+L, Windows+R, Windows+Shift+2, and so on from scratch on every new system...) – user Oct 12 '15 at 09:53
  • 2
    @MichaelKjörling Not sure what you're trying to say, you would have OS-level bindings which could or could not be overwritten, but there would be no way to just bind 'a' for example, so there is no danger of key loggers. – David Mulder Oct 12 '15 at 14:11
  • 1
    @MichaelKjörling Like this: http://i.imgur.com/Y9EHvJs.png – Ajedi32 Oct 15 '15 at 13:29
5

On Windows, there is very little protection between applications running as the same user. If you try to take away SetWindowsHookEx, then malware writers will switch to DLL injection and a whole set of other techniques. You could even just draw a transparent window over the targeted application which would have focus and receive keystrokes, then pass on those keystrokes by sending Windows messages. Fundamentally Windows was not designed with the possibility of malicious executables in mind.

A system designed for running untrusted applications can sandbox them from one another. But it's still vulnerable to (much rarer) exploits that punch out of the sandbox into the kernel.

There's also another technique that could be used: arbitrary code execution in the browser gives an exploit the ability to record every keystroke the browser sees, even if it can't escape a sandbox.

pjc50
  • 2,986
  • 12
  • 17
  • 1
    This is no longer true, Windows Vista added "User Interface Privilege Isolation" and most browsers use it (running web pages in low UI privilege). You are correct that this doesn't protect against capturing keystrokes typed into the web browser on a different window/tab. – Ben Voigt Oct 12 '15 at 17:16
  • This discussion https://security.stackexchange.com/questions/3759/how-does-the-windows-secure-desktop-mode-work claims that DLL injection is still a viable attack. – pjc50 Oct 12 '15 at 17:48
  • You linked an answer that admits the author does not know how DLL injection works. [This documentation](https://msdn.microsoft.com/en-us/library/bb625963.aspx) states that UIPI blocks DLL injection (at least using conventional means such as `CreateRemoteThread`... if code is running from a user-writable directory then there is still a risk of hijacking through insecure DLL search path) – Ben Voigt Oct 12 '15 at 19:01
  • X on Linux isn't any better in this regard. http://unix.stackexchange.com/questions/129159/record-every-keystroke-and-store-in-a-file – PSkocik Oct 13 '15 at 00:45
2

There are very valid reasons to want keystrokes visible to applications other than the currently running foreground process. Unless every programmer implements cut/copy/paste, for example, the OS must monitor for certain keystrokes. Let's say you have a program for taking screenshots and you want it activated by a certain keystroke: that program must be able to monitor keyboard activity to detect when its keystroke is pressed.

It would make more sense to run everything in a sandbox and, if a background program needs to monitor input, make that a permission that must be explicitly granted. But no matter what is done, there is always a way around it. If someone wants to know what you're typing, they will find a way. All we can do is try to make it harder. Obfuscation could be an interesting method: have the running program inject a lot of false input messages into the OS to fool anything outside of it?
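Here is a toy Python simulation of that decoy idea. It is only a sketch under strong assumptions: `SharedStream` and `DecoyingApp` are invented for illustration, and the app is assumed to get its own decoys back, in order, before the next real key arrives.

```python
import random
import string


class SharedStream:
    """Stand-in for the OS input queue: everything posted here is visible to observers."""

    def __init__(self) -> None:
        self.observed: list[str] = []  # what a logger outside the app would record
        self.listener = None

    def post(self, key: str) -> None:
        self.observed.append(key)
        if self.listener is not None:
            self.listener(key)


class DecoyingApp:
    """Hypothetical app that follows each real key with a few random decoy keys."""

    def __init__(self, stream: SharedStream, seed=None) -> None:
        self.stream = stream
        self.rng = random.Random(seed)
        self.skip = 0   # number of upcoming events known to be our own decoys
        self.text = ""  # the real input as reconstructed by the app
        stream.listener = self.on_key

    def on_key(self, key: str) -> None:
        if self.skip > 0:
            self.skip -= 1  # one of our own decoys coming back; ignore it
            return
        self.text += key
        count = self.rng.randint(1, 3)
        self.skip = count
        for _ in range(count):
            self.stream.post(self.rng.choice(string.ascii_lowercase))


stream = SharedStream()
app = DecoyingApp(stream, seed=1)
for key in "pw":
    stream.post(key)

print(app.text)         # what the app keeps
print(stream.observed)  # what an outside observer sees, decoys included
```

This only raises the cost for a logger; it doesn't remove the problem. On Windows, for example, synthesized key events carry an "injected" flag that low-level keyboard hooks can read, so a logger could filter naive decoys out.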

David-
  • 434
  • 2
  • 8
  • 1
    This is not wrong, but in fact Windows applications *are* obliged to implement cut/copy/paste themselves (and Ctrl-Z/X/C/V/B are just a convention). – zwol Oct 12 '15 at 21:08
  • Even in Windows though, cut, copy, and paste will always work in most applications. Should we implement it ourselves? Yes, but even if we don't, it is still available through Windows. Create a simple window with a textbox and Windows will take care of the rest. – David- Oct 12 '15 at 21:17
  • 4
    The textbox is an OS-provided library, but as far as the global event queue is concerned, it's part of your application. What I'm saying is that the standard editing keystrokes are different from things like Alt-Tab (handled by the window manager, application never sees it) and Ctrl-Alt-Del (which doesn't even get out of the kernel IIUC). – zwol Oct 12 '15 at 21:30
2

Several have observed that other legitimate applications may need to transform the original keyboard input (which comes in the form of keycap X was pressed/released) into some kind of text, for instance for people who have lost the use of one hand.

There are also keyboard mapping tools that allow people to type in languages for which the OS does not supply a keyboard map for that language (or the user doesn't want to use the OS-defined keyboard map--for instance, a Dvorak-like layout for some language other than English). One such app is Keyman (keyman.com).

And finally, there are people like me (or maybe I'm the only one) who have keyboard mappers that map things like -H to the left cursor arrow, -B to -, and so forth. So we lazy people (or just me) don't have to move our hands from the alphabetic part of the keyboard to move around.

Mike Maxwell
  • 221
  • 1
  • 2
1

There are in fact some operating systems that allow this, namely Mac OS X. Key presses that aren't 'reserved' system key combinations or 'modifier' keys (alt, ctrl, shift etc) are only sent to the application currently in focus.

Of course, that would make it very annoying for some applications designed for accessibility and programs that use VoIP that need push-to-talk keys. Because of this, there's a part of System Preferences where you can enable this functionality for specific applications.

As for Windows, I would imagine that it's not done because the current method of handling keypresses is quite complex and is designed to work with all of the current drivers and programs.

JamEngulfer
  • 233
  • 1
  • 4
1

One of the reasons is that for some software, being able to read keystrokes while being active but not currently selected is vital to the working of the software.

  • For example, push-to-talk keys on VoIP software like Teamspeak or Ventrilo allow gamers to chat at the time they want to chat so you don't have a torrent of noise.
  • People who make screen recordings for a living (like Youtubers/Streamers, computer tutorialists and software reviewers) rely on shortcuts from software like Fraps, OBS, Xsplit,... to ensure that their recordings work properly.
  • Software developers often benefit from being able to focus their project while stepping through their code on their other monitor with the function keys.
  • I personally like that I can have my World of Warcraft focused while I'm using my mousewheel to scroll in Chrome and don't need to switch between both programs constantly.
Nzall
  • 7,313
  • 6
  • 29
  • 45
0

Let’s give some practical examples.

Some disabled people find typing hard and are very slow at typing. They often use “word prediction” software that shows a list of the words that match what they have typed so far, allowing a word from the list to be picked with a single keystroke.

Blind people use software that reads the text displayed on the screen (screen readers). This software can also speak out each letter as it is pressed on the keyboard. The software has a few hot keys that read the current line, etc.

In the UK, there is software that allows you to type an address into a letter by typing the postcode, then hitting a hotkey to get the postcode expanded into nearly the complete address. This software is often used with customer databases, etc. that don’t support “quick” address entry themselves.

Testers wish to be able to record all keystrokes and mouse moves they make while testing software, so they can then play them back to repeat the test.

In all the above cases the software DEPENDS on being able to see what keystrokes are being sent to other programs.


Smartphones and tablets limit how applications can interact with each other, to the extent that most people still turn to a “PC” or Mac when they wish to get work done that needs the use of more than one application.

Safety and the freedom to be productive do not always go together; there is a tradeoff to be made. The iPhone sits on one end of the spectrum and Windows/Mac on the other end.

Peter Mortensen
  • 877
  • 5
  • 10
Ian Ringrose
  • 641
  • 1
  • 4
  • 9
  • 2
    This hardly is the reason why this isn't available in traditional desktop OS's. If you for example take a look at Android such information is only exposed when an application is selected in accessibility as being allowed to explicitly access such information. – David Mulder Oct 12 '15 at 20:38
0

The X Window System already does this to some extent. It allows programs to grab the mouse and keyboard so that nothing else can intercept the inputs, not even the window manager. It is rather annoying for full screen games since there is no way to switch away from the game. Try opening xterm and holding down the ctrl key and right clicking (I think) and select grab keyboard. Now what you type into xterm cannot be seen by any other client connecting to the X server. Edit: it looks like X still has an XQueryKeymap function that can bypass this and be used by a keylogger.

Edit: Windows implements such a feature by making it so that pressing ctrl+alt+del will always bring up a valid Windows account management screen, as long as fast user switching is turned off. So fake password entry screens are not possible.

Alex Cannon
  • 402
  • 2
  • 7
  • 1
    This is the `XGrabKeyboard` function, I believe. Unfortunately it's not useful for security, as it can be rather trivially bypassed (which is silly because some programs attempt to use it for security, like xterm as you mentioned). – forest Feb 14 '18 at 02:58
  • Are you referring to the ctrl+alt+* key combination to release keyboard grabs? Modern distros have disabled that feature. It can only be used by the users of the X server anyway, not by other clients connecting to the X server. – Alex Cannon Feb 14 '18 at 03:48
  • 2
    No, I'm referring to an X11 protocol command. The `XQueryKeymap` can bypass keyboard grabbing on Xorg. – forest Feb 14 '18 at 03:51
  • 1
    Re. your edit, Linux and many Unices have something similar called SAK (Secure Attention Key). For Linux, when enabled, it's Ctrl+Alt+SysRq+K. It will kill all processes in the current TTY, causing agetty/logind to respawn and ensuring you are presented with a non-hijacked, non-forged login prompt. – forest Feb 14 '18 at 04:03