23

I'm interested in becoming an ethical hacker someday. I've been reading articles saying that Python is very popular for hacking because of its extensive set of modules (including networking modules).

Nowadays, lots of applications are web or mobile applications, and antivirus software does a great job of removing malware written in C. Because of that, I'm a bit confused. Is knowledge of the C language important for a career in ethical hacking?

Peter Mortensen
Cronos
  • 8
    Is some of the knowledge you'll gain when learning C required? Certainly - but other languages might do as well, C is just the most ubiquitous of them. Is an *in-depth* knowledge of C itself required? Definitely not. – Bergi Nov 29 '20 at 10:41
  • 3
    C is something of a lingua franca, as most languages provide a way to interact with a library written in C. – chepner Nov 29 '20 at 18:17
  • 1
    There are many good answers, mine isn't a complete answer just an observation. Questions like "is language X worth learning" are never really good questions. I know what I say will be contested but I always say you're not yet a programmer before your 12th programming language. Basically, if you have really mastered this craft, one more language is just nothing, a couple of hours of looking up and a few days at most to become reasonably proficient (and that doesn't mean you have to know everything by heart and doesn't pertain to the abomination named C++). – Gábor Nov 30 '20 at 19:20
  • 4
    @Gábor but the context here is not about becoming a programmer. – schroeder Dec 01 '20 at 08:40

6 Answers

69

Of course, you don't necessarily have to know C, or the given platform's Assembly (read: instruction set), but knowing them is a great help in figuring out many possible low-level vulnerabilities.

It is not the C language itself that matters, but rather the fact that in order to know C, one must first understand many fundamental computer principles, which is what allows you to then (ab)use them in any other language. You could learn about all of them in theory, but without ever practically experiencing them (which is what you achieve by programming in C), you may not be able to use them very efficiently or even realize where they're best applicable.

Similarly, you don't have to know the exact packet structure of networking protocols. However, if you do, you may suddenly be able to figure out ways to break something that would never occur to those who make (often incorrect) assumptions about how these protocols function based solely on their high-level experience.
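
As a rough illustration of what "knowing the packet structure" can look like in practice, here is a minimal sketch that decodes the fixed 20-byte part of a TCP header with Python's `struct` module. The function name `parse_tcp_header`, the dictionary keys, and the sample segment are assumptions made up for this example. It shows one detail that high-level code tends to hide: the header length is not constant, it is carried in the data-offset field.

```python
import struct

# Fixed 20-byte portion of a TCP header, in network (big-endian) byte order:
# source port, destination port, sequence number, acknowledgment number,
# data offset + flags, window, checksum, urgent pointer.
TCP_FIXED_HEADER = struct.Struct("!HHIIHHHH")

# Classic control bits in the low byte of the offset/flags field.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed TCP header fields (hypothetical helper for this sketch)."""
    if len(segment) < TCP_FIXED_HEADER.size:
        raise ValueError("segment shorter than the 20-byte fixed TCP header")
    (src, dst, seq, ack, offset_flags,
     window, checksum, urgent) = TCP_FIXED_HEADER.unpack_from(segment)
    # The top 4 bits give the header length in 32-bit words, so options
    # can push the payload beyond byte 20.
    header_len = (offset_flags >> 12) * 4
    flags = sorted(name for name, bit in FLAG_BITS.items() if offset_flags & bit)
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": flags,
        "payload": segment[header_len:],
    }


if __name__ == "__main__":
    # Hand-built SYN/ACK segment with one 4-byte option (MSS) and a payload.
    sample = TCP_FIXED_HEADER.pack(443, 51000, 1, 2, (6 << 12) | 0x12, 64240, 0, 0)
    sample += b"\x02\x04\x05\xb4" + b"payload"
    print(parse_tcp_header(sample))
```

Code that assumes the payload always starts at byte 20 would misread this sample, which is the kind of incorrect assumption the paragraph above refers to.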

natiiix
  • 46
    "in order to know C, one must first understand many fundamental computer principles". You deserve a thousand upvotes just for this :) – Margaret Bloom Nov 29 '20 at 18:47
  • 10
    @MargaretBloom Thank you, but I felt a bit emotional about this question because I see similar questions appear all over the place (forums, Q&A platforms, "tech" articles, student communication channels, etc.) and it's simply not true that the "ancient" languages are dead and useless nowadays. If for nothing else, they will always be some of the best ways to gain a deeper understanding of computers. This is precisely why virtually every interviewer asks about which datatype (list/map/set) is good for what and why. Because low-level knowledge matters when it comes to scalable performance! – natiiix Nov 30 '20 at 12:58
  • 1
    @natiiix About a week ago I was explaining pretty much this exact thought to a friend of mine, who complained about not learning a more "modern" language in the programming basics part of their requalification course (after a lecturer dismissed the language when teaching them about automated testing). – mishan Nov 30 '20 at 16:16
  • This makes it seem like C isn't just another language that compiles down to machine code. – Martijn Dec 01 '20 at 09:02
  • 1
    @Martijn it's the primary one most people will encounter which compiles to platform code rather than bytecode, *and* offers ready control over memory layout. C or its inheritor C++ account for the vast majority of operating system and native app binaries that you might want to exploit. – pjc50 Dec 01 '20 at 10:30
  • @pjc50 that's an entirely different (arguably even opposite) take than the answer has, which argues that it doesn't matter that the programs are written in C: "It's not the C language itself that matters, but rather the fact that in order to know C, one must first understand fundamental computer principles" C++ isn't even mentioned in the answer. – Martijn Dec 01 '20 at 10:51
  • @Martijn Yeah, because C++ isn't really relevant to what I talked about. C++ stdlib already comes bundled with all the stuff that you have to re-invent yourself if you want to write anything in C, hence you wouldn't learn a whole lot from it. It also has a far more high-level memory management, which would further cloud the learner's understanding of the underlying processes. This is why almost every serious IT/CS/SE/whatever university teaches some C basics as early as possible. It's uncomfortable but necessary. – natiiix Dec 01 '20 at 12:14
  • @Martijn The fact that OS and libraries are written in C doesn't matter too much because even programs in high-level languages suffer from the same exact problems. You just can't see it as directly as you would have in C, but that doesn't matter when you're attacking them from the outside anyway. The knowledge of C, and computers in general, is helpful even if you're "hacking"/reverse-engineering Java/C#/ECMAScript/Python code. A buffer overflow is a buffer overflow, a socket is still a socket (both in networking and -nix OS terminology), regardless of the language/framework/runtime. – natiiix Dec 01 '20 at 12:16
  • I strongly agree with @pjc50's analysis, and strongly disagree with yours. That they present their analysis as support for yours while you both say it's opposite to each other is very confusing. – Martijn Dec 01 '20 at 14:32
  • I think the three of us may be talking past each other somewhat; while buffer overflows are *possible* in other languages, they are usually a lot harder; C's decision to use zero-termination and not carry the length around with array types makes it uniquely easy to write code vulnerable to a stack-smashing attack; whereas in Java or Python or C# you can't overwrite the machine stack even if you want to. – pjc50 Dec 01 '20 at 14:41
  • @pjc50 I used a very recognizable example, but what I meant was far more generic. There are various points of view that give you different advantages of learning C, but the point is that it still very much remains relevant and that it is most certainly worth learning if you're interested in low-level computer security and exploitation. – natiiix Dec 01 '20 at 21:40
29

It depends what you want to do.

If you want to build tools that can be used to automate tasks that are often performed for ethical hacking (such as penetration testing, port scanning, SSL/TLS testing etc.), then Python can be used for this.
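
For a sense of how little code such a tool needs, here is a minimal sketch of a TCP connect check using only the standard `socket` module. The function name `is_port_open` and the default timeout are assumptions chosen for this example, and it is far simpler than a real scanner.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds.

    Hypothetical helper for this sketch: it performs a full TCP connect,
    which is the simplest (and noisiest) way to probe a port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise,
        # instead of raising for ordinary connection failures.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Probe a few ports on the local machine only.
    for port in (22, 80, 443):
        state = "open" if is_port_open("127.0.0.1", port) else "closed"
        print(f"127.0.0.1:{port} is {state}")
```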

If you want to analyze code to look for bugs in widely deployed packages such as the Linux kernel, OpenSSL, or Apache, a solid understanding of C would be helpful, because many of these packages are written in C.

mti2935
  • 2
    There's also lots of widely deployed packages not written in C that are interesting to analyze. I would reformulate that into "*if you are interested in systems programming…*". – Bergi Nov 29 '20 at 10:38
11

In ethical hacking (and hacking in general), the more you know about software and hardware, the better off you are. Keep in mind that there are many different solutions written in many different languages, running on many different kinds of hardware.

As most operating systems are written in C, it can definitely be advantageous to at least be able to read C code. Most OS modules are written in C and/or Assembly. By reading that code, you can gather valuable intel on bugs or exploits that may be present in the target OS's various modules.

Regardless of whether you hunt bugs or try to penetrate a system, at least some understanding of C can help you a lot.

In the same vein, knowing at least the more popular languages (Python, Java, C#) can be of immense help as well - lots of systems (including corporate solutions) are written in those languages.

Python does have its own advantages in the hacker toolbox - it gives you the ability to write exploits and programs rather quickly, and has a lot of libraries that can be used to roll your own EH/pentester toolset.

Tylon Foxx
  • And as both Java and C# are derived from C and have a syntax very similar to C, you can learn them much easier if you already know C. This is why C is still taught at many places as *the* introductory programming language. – vsz Nov 29 '20 at 15:46
  • 2
    @vsz The syntax (braces for blocks, keywords) is not particularly important and still different enough in detail. What's important is learning structured programming, and the C development experience is pretty bad for introducing that. – Bergi Nov 29 '20 at 16:38
2

Information security consists of many fields and career paths, so the answer depends on your goal. If you are interested in fields like reverse engineering, malware analysis, or software vulnerability analysis, knowing C is essential. In fields like penetration testing, vulnerability scanning, or network security, Python is a good choice. On the other hand, C is a foundational programming language, and knowing it will help you understand many details of software and systems. Python, in turn, is a handy, popular language that helps by providing useful security libraries.

2

Python can be the toolset.

It is a high-level language that you can use to write proofs of concept, analyze datasets, and so on. Writing the same things in a lower-level language makes them tedious, error-prone, and harder for others to understand.

C is the knowledge.

Only by being literate in C can you understand how data structures and algorithms are represented at the low level, where the interesting stuff happens. Things like buffer overruns, stack corruption, and the like cannot be understood (much less discovered and analyzed) without knowledge of C.
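
To give a rough idea of what such a low-level representation looks like, here is a small sketch that mimics a C struct from Python with `ctypes`. The `Record` layout and the `unchecked_copy` helper are made up for this illustration. It imitates what an unchecked `memcpy` into a fixed-size buffer does in C: bytes that do not fit in `name` land in the neighbouring field.

```python
import ctypes
import sys


class Record(ctypes.Structure):
    # Mirrors a C declaration such as:
    #   struct record { char name[8]; uint32_t is_admin; };
    _fields_ = [("name", ctypes.c_char * 8),
                ("is_admin", ctypes.c_uint32)]


def unchecked_copy(rec: Record, data: bytes) -> None:
    """Copy data to the start of rec without checking the size of rec.name.

    Hypothetical helper for this sketch. To keep the demo memory-safe it
    refuses to write past the end of the whole struct, but it does NOT
    stop at the 8-byte boundary of the name field.
    """
    if len(data) > ctypes.sizeof(rec):
        raise ValueError("data would write outside the struct")
    ctypes.memmove(ctypes.addressof(rec), data, len(data))


if __name__ == "__main__":
    rec = Record()
    print(rec.is_admin)  # 0: a fresh struct is zero-initialized
    # 8 bytes fill name; the next 4 bytes overwrite is_admin.
    unchecked_copy(rec, b"A" * 8 + (1).to_bytes(4, sys.byteorder))
    print(rec.name, rec.is_admin)  # is_admin is now 1
```

In a real C program, nothing would stop the copy at the end of the struct either, which is how return addresses and other stack data end up being overwritten.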

schroeder
fraxinus
  • 3
    C is just a high-level assembly language. The knowledge about computer architecture that you refer to is independent of the concrete language, and can be gained independently without studying C. – Bergi Nov 30 '20 at 16:23
2

There are pros and cons to all languages, but it sounds like C fits your goals.

An interesting definition of "hacker" is someone who sees and understands the reality behind the abstractions. A butter knife is a very functional screwdriver. A locked door is a mechanical device with numerous components; for example, it can be opened by taking the pins out of the hinges. Everything on a computer is just a series of numbers subject to precisely defined behavior. It doesn't matter how people assume or expect software to work; what matters is how it actually works. If you understand it, you can often do unexpected things.

Assembly language gives the clearest picture of, and the most direct control over, how a computer works, and at some point you might want to get a basic understanding of assembly just to understand computers better. However, assembly is generally far too low-level and too painful for programming anything significant. Seriously. With rare exceptions, I do not recommend assembly as an active, productive programming language, and definitely not as a first language.

The C language is widely used and productive, and aside from Assembly, it gives the closest view of, and the closest control over, how the computer actually works. C is (or can be) a very fast and powerful language, but that comes at a cost: C is less friendly and makes it easy to make mistakes. Python "protects" you from shooting yourself in the foot in various ways, whereas C just obeys any strange or dangerous code you write. Python has some really neat features, but it hides what's happening under the hood. That's great if you don't want to know.

Alsee