
We recently had issues with people messing around inside our system. To prevent code injection within my Python code, I implemented the following check:

#! /usr/bin/python3
#-*- coding:utf-8 -*-
def main():
    print("Your interactive Python shell!")
    text = input('>>> ')
    for keyword in ['eval', 'exec', 'import', 'open', 'os', 'read', 'system', 'write']:
        if keyword in text:
            print("You are not allowed to do this!")
            return
    else:
        exec(text)
        print('Executed your code!')
if __name__ == "__main__":
    main()

A few users can run this Python file with sudo rights on our Ubuntu system. I know this sounds like a security hole, but I don't see any way to escape from it. Is there any way to inject code despite this check? Do you have any tips to prevent code injection?

Captain Man
user252790
  • 6
    Related and likely a better approach than blacklisting substrings (which leads to overblocking): [How to sandbox students' Python 3 code submissions for automatic assignment evaluation on Windows 10?](https://security.stackexchange.com/questions/216372/how-to-sandbox-students-python-3-code-submissions-for-automatic-assignment-eval), [Security concerns with user Python scripts](https://security.stackexchange.com/questions/90659/security-concerns-with-user-python-scripts). – Steffen Ullrich Mar 15 '21 at 14:14
  • Thank you. But as far as I can tell, I am secure now with this code block. So this would be enough? I don't want to put in more effort if this is enough to stop code injections – user252790 Mar 15 '21 at 14:27
  • We have limited resources. If this is enough to stop code injections, I am fine. That's why I was asking for any example to break it – user252790 Mar 15 '21 at 14:29
  • 29
    Why would you need to let them run code as root if you don't trust them? Most code can be executed as normal user - which limits the security implications... – vidarlo Mar 15 '21 at 14:41
  • 36
    @user252790 No, just checking for known-vulnerable code patterns is NOT sufficient as a security mechanism. I could show an exploit for your mechanism, you could patch it, and we could repeat the cycle for infinity. Instead, use actual security barriers. Under Linux, the usual technique is to set up a pipe and fork, then drop as many privileges in the child process as possible (and/or use containers), and then execute the less-trusted code. The child process can communicate with the privileged root process via the pipe. – amon Mar 15 '21 at 14:44
  • 34
    “ I don't see any possibility to escape from this.” [Yet there are plenty of ways.](https://codegolf.stackexchange.com/questions/209734/tips-for-restricted-source-in-python) – Gilles 'SO- stop being evil' Mar 15 '21 at 19:27
  • 2
    @Gilles'SO-stopbeingevil' Wow, next time I need to bypass blacklists, I'll search on codegolf first :) – nobody Mar 15 '21 at 19:47
  • 6
    I think the blacklist approach is an excellent idea; it will make the students learn a lot while circumventing it. But more seriously, either use an existing sandbox solution or just give every person a normal user account. – eckes Mar 15 '21 at 23:38
  • 18
    If you ever find yourself saying "I can't think of a way to break X" and then assuming it's secure, then you're already hacked. eg. [Try it online!](https://tio.run/##VVAxbsMwDNz7CsaLbKBbtxb21r1rUBSCYNGwUFkSJDqOH5Eh6Zx/9Fn5gStZDdxy0BE68nikm6m35mlZJHYwCGXK6vkBYjivDJXF3o4eYoZetKQOCG9rA4Qetd4V1VpLeCSoY5kbqWRN0wDLRGc9fOI8WS8jC@8MD0KzR2B4xDahGpz1lDLr0KwY0utRyIRhDoRDyiavCNlHNpdCdX@lk4WN@78ACI9gLIHQ2k4ogSxIC9SrcN/gHh5p9OZl/UMdcJNMjss0ZWvIE9hrZEaKsnO6VWsl7uL60R7nRgzIOdQ1FJyn63JeZMl86mW5Xc@X2/XrlPHyXbJf324yKKM9Vv0A "Python 3 – Try It Online") – throx Mar 16 '21 at 00:20
  • 66
    Can you elaborate on "We recently had issues with people messing around inside our system."? There's [something odd about this question](https://ctftime.org/writeup/17085). – Owen Mar 16 '21 at 06:57
  • Can you instead whitelist the few things those few users are supposedly capable of doing? – Hobbamok Mar 16 '21 at 10:00
  • 2
    Why are you using exec, if you don't want code injections? Injecting code is pretty much what exec does. What are users supposed to do in there? – Kafein Mar 16 '21 at 16:20
  • 2
    This can be broken with this one-liner: `globals()['__builtins__'].__dict__['ex' + 'ec']('imp' + 'ort code; code.interact()')`. You'll end up with an interactive shell that looks just like the original one, but doesn't impose any limitations at all. – Paul Mar 16 '21 at 20:01
  • 2
    This question smells like a prime example of [the XY problem](https://xyproblem.info/). If you clearly described the issue you are trying to fix, someone would probably be able to offer a better solution than the one you have proposed here. – MJ713 Mar 16 '21 at 21:16
  • 20
    @MJ713 The story is made up. This user is just trying to solve an old CTF puzzle. Owen's comment linked to a write-up about it. – Fire Quacker Mar 16 '21 at 21:28
  • 8
    VTC because this question is asking for security review of existing code. Said code being from an old CTF challenge makes it even more appropriate for closing. – Voile Mar 17 '21 at 03:47

3 Answers


Simple blacklists are about the worst way to patch a vulnerability. They'll create a lot of false positives and any determined attacker will usually find a way around them.

Web Application Firewalls are a good example. The detection rules they employ are way more complicated than your simplistic blacklist, yet they don't catch everything, and every now and then, bypasses come up for things they are supposed to catch.

I don't see any possibility to escape from this.

Or so you think. You just haven't looked long enough.

vars(__builtins__)['ex'+'ec']("print('pwned')")

This sails right through your filter: none of the blacklisted substrings appear in it, yet it looks up the exec() function under a name assembled at runtime and goes on to print 'pwned'.

Now you can modify your blacklist to catch this as well, but someone else will come up with another way.


99% of the time I see someone using something like exec or eval, it's completely unnecessary. If you can avoid using exec, do yourself a favor and get rid of this vulnerability waiting to be exploited. If, however, you absolutely need to use exec, then do as @amon suggested in the comments: Create a new pipe, fork and drop as many privileges in the child as possible, and then execute the code. Use the pipe to communicate between the child and the parent processes. Or sandbox it as Steffen Ullrich suggested.
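
A rough, hypothetical sketch of that pipe/fork approach (not code from this thread, and not a complete sandbox; real setups add seccomp filters, namespaces/containers and resource limits, and the uid/gid 65534 is only an assumption for a typical nobody/nogroup account):

import os

untrusted_code = "print('hello from the child')"   # stands in for the user's input

read_end, write_end = os.pipe()
pid = os.fork()

if pid == 0:
    # Child: drop privileges before touching the untrusted code.
    # (Dropping uid/gid requires starting as root, which matches the sudo setup.)
    os.close(read_end)
    os.setgroups([])
    os.setgid(65534)
    os.setuid(65534)
    try:
        exec(untrusted_code, {}, {})
        report = b"child finished without error"
    except Exception as exc:
        report = repr(exc).encode()
    os.write(write_end, report)
    os._exit(0)
else:
    # Parent: keeps its privileges and only reads the child's report from the pipe.
    os.close(write_end)
    print(os.read(read_end, 4096).decode())
    os.waitpid(pid, 0)

The parent never runs the untrusted string itself; it only reads whatever the unprivileged child reports back over the pipe.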

nobody

The problem I see with your blacklist is that in Python, one could write Python code that generates additional Python code, and then run the additional code. So one could generate the word 'eval' from inside a program that itself doesn't contain the word. Part of the problem is the dynamism of Python itself, whereas a more strongly typed, static language might be safer.
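
For illustration, a hypothetical snippet in the same spirit as the bypass in the other answer: the source below never contains the word 'eval', yet it ends up calling that very function.

name = 'ev' + 'al'                  # the blacklisted word is assembled at runtime
func = vars(__builtins__)[name]     # look the built-in up by its generated name
print(func('1 + 1'))                # prints 2

(At a script's top level, __builtins__ is the builtins module, which is why vars() works here.)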

The computer-science-theory way of accepting only valid strings is to construct a grammar and then use a parser for that grammar.

https://en.wikipedia.org/wiki/Formal_grammar

You'd need to understand how grammar parsers work in order to construct one. However, in this case your grammar would effectively be a subset of Python's own grammar, so it would be a lot of work. If you had unlimited resources I'd go for it, but otherwise don't pick this solution.
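
That said, a much cheaper approximation of the same allow-list idea is to lean on Python's own ast module instead of writing a grammar yourself (a rough, hypothetical sketch; the permitted node set is an assumption and only covers simple arithmetic, and anything broader needs very careful review):

import ast

ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.UAdd, ast.USub,
)

def eval_arithmetic(text):
    tree = ast.parse(text, mode='eval')           # parse only; nothing is executed yet
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError("disallowed syntax: " + type(node).__name__)
    return eval(compile(tree, '<user input>', 'eval'), {"__builtins__": {}}, {})

print(eval_arithmetic("2 * (3 + 4)"))             # 14
# eval_arithmetic("__import__('os')")             # ValueError: disallowed syntax: Call

This inverts the logic of the blacklist: instead of enumerating everything dangerous, you enumerate the tiny subset you actually want to allow.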

Nick Bonilla
  • 6
    However, deciding whether a given Python fragment is “safe” is an undecidable problem (for any interesting definition of safety). The usual syntax-driven approaches could be used to allow a tiny safe subset of the language though. Due to these constraints, methods that audit security-sensitive *behaviour* at runtime are more popular, e.g. running a process with untrusted code under a restrictive seccomp profile. – amon Mar 16 '21 at 20:31
  • You could generate additional Python code (into say a string), but the typical way to execute generated code would be using `eval` or by writing a file and then executing it with OS commands. How would you execute the code without `eval` or any imports? – NotThatGuy Mar 17 '21 at 14:18
  • @NotThatGuy `vars(__builtins__)['ev'+'al']("print('pwned')")` (this is a modified version of https://security.stackexchange.com/a/246169/240562, which used 'exec' instead of 'eval') – Nick Bonilla Mar 17 '21 at 18:56

I wanted to suggest the approach of setting __builtins__ to {}, but even this is not safe:


def myfunc(a, b):
    return a + b

# yourstring holds the untrusted input that is about to be executed
global_namespace = {"__builtins__": {}, "myfunc": myfunc}
exec(yourstring, global_namespace, {})

This disables all built-in functions, allows only the explicitly passed globals (in this example myfunc), and supplies an empty locals dictionary.

You have to explicitly declare every built-in, global, or local function/object that you want to allow.
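
For illustration, a small hypothetical check of that behaviour (not part of the original example):

def myfunc(a, b):
    return a + b

ns = {"__builtins__": {}, "myfunc": myfunc}
exec("myfunc(2, 3)", ns, {})        # works: myfunc was passed in explicitly
try:
    exec("print('hi')", ns, {})     # fails: print is not among the (empty) builtins
except NameError as err:
    print("blocked:", err)          # the host code itself still has real builtins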

However

in the above example one could still escape through the function that was passed in: myfunc.__globals__["__builtins__"].print("I escaped") reaches back to the real built-ins of the module where myfunc was defined (thanks @user236968 for pointing this out)

So the only thing that seems to be safe is to disable all built-ins, globals and locals:

global_namespace = {"__builtins__": {}}
exec(yourstring, global_namespace, {})

But perhaps even there a loophole could be found.

As others pointed out, probably the only safe way is not to use exec and eval at all, but to implement a parser and then use eval only on what it accepts (your parser must be watertight), or to use your own parser and your own interpreter.
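
A minimal sketch of the "own parser and interpreter" route (hypothetical, and deliberately limited to numeric arithmetic): parse the input with Python's ast module and walk the tree yourself, so neither eval nor exec is ever called on user input.

import ast
import operator

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def interpret(node):
    if isinstance(node, ast.Expression):
        return interpret(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return OPS[type(node.op)](interpret(node.left), interpret(node.right))
    raise ValueError("unsupported expression")

print(interpret(ast.parse("2 + 3 * 4", mode="eval")))   # 14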

KlausF