It is well known that PyPI does not prevent the upload of malicious code.
Unfortunately, automated tools often cannot distinguish between features of a program and malicious code.
In the case of Linux distributions, there is at least the package maintainer who might look at the source code occasionally.
Basically, the security of software repositories like PyPI boils down to the idea that somebody will notice malicious code if enough people look at the source code. So, if I would like to be one of the people who occasionally look at the source code, what should I look out for?
Reading every line of code before installing a Python package is infeasible.
For a programmer (not a security researcher), what are easy checks or best practices to identify obviously malicious code fragments?
Some obvious things to do are:
- grep for `import` and see if any module imports something it should not. In particular look for `sys`, `os`, `http`, etc. These modules have many legit uses, but a lot of power to do unsafe things.
- grep for `eval` and the like.
- open a random file and see if it looks reasonable.
- Pay particular attention to `setup.py`
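To make the first two checks concrete, here is a rough sketch of what I have in mind, using the standard `ast` module instead of plain grep so that comments and strings do not produce false hits. The function name `scan_source` and both watch lists are my own choices for illustration; `subprocess`, `socket`, `exec`, `compile` and `__import__` are additions I assumed belong to "and the like".

```python
import ast

# Assumed watch lists for illustration only; extend them as needed.
SUSPICIOUS_MODULES = {"os", "sys", "http", "subprocess", "socket"}
SUSPICIOUS_CALLS = {"eval", "exec", "compile", "__import__"}


def scan_source(source, filename="<unknown>"):
    """Return sorted (line, description) pairs worth a closer manual look."""
    findings = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import os" or "import http.client"
            for alias in node.names:
                if alias.name.split(".")[0] in SUSPICIOUS_MODULES:
                    findings.append((node.lineno, f"import {alias.name}"))
        elif isinstance(node, ast.ImportFrom):
            # "from http import client"; relative imports (level > 0) are skipped
            root = (node.module or "").split(".")[0]
            if node.level == 0 and root in SUSPICIOUS_MODULES:
                findings.append((node.lineno, f"from {node.module} import ..."))
        elif isinstance(node, ast.Call):
            # Direct calls by name only, e.g. eval(...)
            func = node.func
            if isinstance(func, ast.Name) and func.id in SUSPICIOUS_CALLS:
                findings.append((node.lineno, f"call to {func.id}()"))
    return sorted(findings)


if __name__ == "__main__":
    sample = "import os\nfrom http import client\nx = eval('1+1')\nimport json\n"
    for lineno, what in scan_source(sample):
        print(f"line {lineno}: {what}")
```

This only catches the obvious cases. Anything that reaches `eval` indirectly (for example through `getattr` or a decoded string) slips through, so a hit means "read this part" and a clean result does not mean the file is safe.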
What is the quickest way to have the highest chance of detecting malicious code in Python scripts?