19

I am about to begin a 4-year information security degree at Purdue. The degree does not call for any programming courses, so the only time I will be able to take one is as an occasional elective, and most of my learning will be on my own. At the start of my senior year of high school I decided to switch completely to Linux. So far I have been learning some Linux and security stuff. However, I believe it will also be important for me to learn a few programming languages.

Basically I am planning on learning to program side-by-side with learning how to use Vim, so it will most likely be a slow process. In the end I think it will be worth it, though. As I said, I am going into security, so I will mostly be creating security-related applications, most of which will be networking related. I would also like to begin developing Android applications, but that will be later down the road.

With that said I have a few ideas. I was thinking of starting with JavaScript, because it is cross-platform, and I have seen it suggested before. I have also been hearing a lot about Ruby, or I could go the natural Linux route with C. What direction should I go?

Peter Mortensen
JD Davis

12 Answers

53

First and foremost: bash, along with the common command line utilities. Bash is the default user interface to the operating system, and a lot of programs on a Linux system will be wrapped in a shell script at some level. It can be quirky, has some idiosyncrasies, and often seems downright dumb, but it's something you will have to deal with, so get comfortable with it. The standard tools like grep, diff, head, tail, sort, uniq, and so on, will be very helpful not only with shell scripting, but with your productivity on the command-line.

Learn at least some C. It's what the lowest levels of the system are written in, and it will give you a better understanding of the system as a whole.

Pick any higher level language you like. Python, Ruby, Perl, Java, whatever - as long as you enjoy it. This is where you really start to learn how to "program", and from here on out it will be easier to pick up more languages, and keep learning.

JimB
  • 2
    I agree with all points listed here. – Joe Apr 25 '11 at 16:27
  • 6
    If you are going to learn bash then you should include learning about awk, sed, head, tail, cut, tr, diff, grep, find, ps, netstat, tcpdump, sort, uniq, etc., etc., etc. – jftuga Apr 25 '11 at 17:07
  • 2
    @jftuga Agreed. A huge percentage of my toolkit is the skills to do "data extraction and integration" - a fancy way of saying "manipulating formatted text files". Once you understand the pipeline, and get to know the common tools, you are unfettered from the workflow of any pre-rolled software. I'm also a huge fan of MS Powershell, which stands on the shoulders of Bash in many respects. – AndyN Apr 25 '11 at 17:28
  • @jftuga - noted. Though I think these are important, I don't think someone new to Linux needs to worry about learning the big guns like sed/awk in depth, but they should have enough understanding to be able to decipher existing code if they should encounter it. – JimB Apr 25 '11 at 17:53
  • 8
    Agree with all, but I'd recommend Python over the others as a high-level language specifically because it has become the most common higher-level language for server-management tools. You'll run into it far more often than Ruby or Java. Perl is also common because it's so old, so I'd pick that one next. – tylerl Apr 25 '11 at 23:38
  • I'd say bash, c, then perl. Then anything else. If nothing else, learn perl. It's very useful in the UNIX world. – LawrenceC Apr 26 '11 at 02:48
  • @tylerl - though python is one of my favorite languages, wanted to avoid presenting personal bias. I also wanted to emphasize "**pick up more languages, and keep learning**" - more than a single language. Learning multiple programming languages/styles/paradigms is very important to continued progress. Also FYI, at this point perl isn't much older than python ('87 vs '91). – JimB Apr 26 '11 at 13:49
10

You will find that everyone will pretty much just recommend their favorite language. My favorite language is Perl so that's what I recommend. :) However, beyond my personal preference, there are some great reasons why you should consider using perl:

First, perl is a wonderful general-purpose language. It's easy to get started in perl by writing simple scripts to do the same sorts of things you do in shell scripts, like manipulate files and make decisions based on user input. This provides a very easy and gradual introduction to the general ideas of programming. Perl has been around for quite a while so there are lots of resources (books and websites) for getting started.

Second, perl is an incredibly powerful and expressive language that supports all modern programming features. I've been using it for over fifteen years and I'm still learning new ways to do things more efficiently. For example, if you want to explore object oriented programming, perl has that (particularly through Moose).

Third, perl comes with the almost infinite power and flexibility of its official add-on repository, CPAN. For example, to follow the idea of writing security software, say you want to develop some sort of custom network security scanner. Instead of writing all that yourself, you could start by using Nmap::Scanner as a scanning engine, and then write your own tweaks and improvements on top of that.

Finally, if you want to explore web programming, Perl has that too. One popular modern way to write web software in Perl is Catalyst, which provides a modern MVC web framework for quickly developing any sort of web app.

Putting all that together, the advantage of perl is that it allows you to start small, writing little command line scripts and programs, and gradually grow into writing full-featured modern applications. Of course, the price of this flexibility is complexity. It's up to you to study and learn how to do things the right way; perl doesn't enforce good practices the way other languages do. I personally like this freedom, especially coupled with all the great resources out on the web for learning how to use perl.

Phil Hollenback
  • Perl is (primarily) a scripting language; and most of the OS is programmed in C. – Chris S Apr 25 '11 at 17:17
  • Sure, but I don't think that matters much in the context of this question. It doesn't sound like the OP really wants or needs to learn about OS-level programming. – Phil Hollenback Apr 25 '11 at 17:25
  • I'd suggest that Perl is a good way to write code for a security app that targets something written in another language (often C) that you already understand. Therefore, if this distinction makes sense, I'd classify Perl as a tool you should learn to use, rather than a language you should learn to (completely) understand. – BMDan Apr 25 '11 at 18:54
6

Three languages will hold you in good stead. In decreasing order of importance (i.e., the first is the most important):

  1. Pseudocode. Oftentimes, the implementation with which you're working will be an amalgam of a half-dozen languages and tools, only some of which you'll know directly. If you know what pattern is being implemented, however, you can figure out what input is being handed to a piece and what output it should give, test it in isolation, and figure out whether it's the piece that's breaking.
  2. C. For better or for worse, it's what runs the (Linux) universe, and it gives you a close-enough-to-the-ground view of things that you can understand what any of the higher-level languages are actually doing (e.g. PHP's pass-by-reference, or Java's thread model).
  3. Just About Anything Else except C++. A functional language like SQL or, better yet, Haskell, or something not quite as C-ey—if all else fails, Java can work for this, but it's still very procedural—get high marks here; the idea is to wrap your brain around something sufficiently completely different from what you've done before that you can't help but realize the limitations of what you've done before. In terms of sheer utility, bash probably also belongs in this list, but if you don't have a basis in something else, you'll just end up writing C with bash syntax instead of really exploring its power.
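
As a small illustration of the pseudocode-first habit in point 1, here is a sketch in Python where the plan is written as comments first and the code is filled in underneath each one. The function name `parse_settings` and the `key=value` input format are made up for the example; writing the comments first is what surfaced the edge cases (blank lines, malformed lines, values containing `=`).

```python
def parse_settings(text):
    """Parse 'key=value' lines into a dict (hypothetical format for illustration)."""
    settings = {}
    # for each line in the input:
    for line in text.splitlines():
        # strip whitespace; skip blank lines and comment lines
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # a line without '=' is malformed: skip it rather than crash
        if "=" not in line:
            continue
        # split on the first '=' only, so values may contain '='
        key, value = line.split("=", 1)
        # store the trimmed key and value
        settings[key.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    sample = "# demo\nport = 22\nopts=a=b\n\ngarbage line\n"
    print(parse_settings(sample))
```
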
BMDan
  • 1
    +1 for pseudocode. I like to pseudocode in comments and then code around the comments – sreimer Apr 25 '11 at 15:59
  • 2
    +1 for pseudocode and "Anything else except C++". In linux world, you should learn at least one `script language`, like `python` or `PHP`. – yegle Apr 25 '11 at 16:03
  • 2
    I don't know if pseudocode is a good recommendation to a beginner. Pseudocode will be understandable after learning most any c-style language, so should one really put effort into learning a pseudocode itself? Not to mention, there is no single pseudocode syntax, which is likely to confuse a beginner. – JimB Apr 25 '11 at 16:20
  • @JimB: Point is to learn to think logically, regardless of language. ERD, UML, and the like are the ivory tower tools for doing this, but you don't have to learn anything quite that formal. The idea is to learn to chart your thoughts and then compare that to the expected outcome. When you realize that, for example, the database has no (inherent) way of knowing what HTTP headers were sent as part of the request to the Web server, you can eliminate a variable from consideration. Similarly, when you write pseudocode for a function, you often spot edge and corner cases you wouldn't otherwise. – BMDan Apr 25 '11 at 17:19
  • I agree completely, I just think that pointing a beginner to "pseudocode" is going to cause confusion, since it's not a formal thing. Most people's pseudocode is going to be an amalgamation of the c-style languages they're most comfortable in, plus some ad-hoc syntax. – JimB Apr 25 '11 at 17:46
4

You should learn several languages. I would suggest starting with Python. It's widely recommended for being easy to learn and for being very useful, lots of excellent self-study materials are available for free, and I gather it's widely used by IT security professionals. Almost every time I see security pros post demonstration code, it's Python.

bgvaughan
4

I am about to begin a 4 year information security degree at Purdue. The degree does not call for any programming courses.

Am I the only one to think OMG!!!! at this point?

I also believe it will be important for me to also learn a few programming languages

I would say so. Although you're not really expecting to become a proficient programmer, you probably need the skills required to simulate / recreate attacks and understand how the programming process works. You will also need the skills to analyse data, and extract information from bulk sources (like logs). JimB has mentioned bash - and while you'll no doubt be using this - it only takes around a couple of hours to learn the essential bits. Actually the only place you're ever likely to see bash is on Linux systems - but the other shells are very similar.

I'd recommend learning awk and/or Perl for data crunching. Don't bother about any requirement to tick an object-oriented box - but I would recommend looking at non-procedural languages too.

Learning C will also expose you to a lot of information about how code turns into an executable program (compiling is just one step in a very complex process).

The obvious choice for someone interested in Android/mobile development would be Java - but Java tries very hard to insulate the developer from having to deal with the realities of the operating system and protocols - it's my experience that this is where you get security problems with Java apps. i.e. it might help you achieve your ultimate objective, and it will look nice on your CV but don't expect learning Java to supplement your security knowledge much.
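
To give a feel for the kind of data crunching meant here, this is a minimal sketch of an awk-style "count the values in one column" job. It is written in Python rather than awk or Perl purely for illustration; the function name `count_field` and the sample log lines are invented.

```python
from collections import Counter


def count_field(lines, index):
    """Count how often each value appears in a whitespace-separated column.

    Roughly what `awk '{print $N}' | sort | uniq -c` does, with a 0-based index.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) > index:  # ignore short or blank lines
            counts[fields[index]] += 1
    return counts


if __name__ == "__main__":
    # Invented sample data: "<source address> <result>"
    log = [
        "10.0.0.5 FAIL",
        "10.0.0.7 OK",
        "10.0.0.5 FAIL",
        "",
        "10.0.0.9 FAIL",
    ]
    for value, n in count_field(log, 0).most_common():
        print(n, value)
```

The same shape (split each line, pick a field, tally) covers a surprising share of log analysis, whichever language you end up doing it in.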

symcbean
2

My advice:

  • bash (and its ilk) are not general purpose programming languages. While it is possible to accomplish some sophisticated scripting in bash it's not the best way to learn programming in general. It is the most natural way to accomplish systems administration tasks which primarily revolve around executing other programs, handling their data files and directories and marshaling input and output into and from them. If bash is a hammer, reserve it for problems that really do look like nails. Learning to do anything non-trivial in bash will be considerably easier if you learn some very small subsets of sed and awk (since the string manipulation in bash is largely inspired by the syntax of similar operations in these "little" languages).
  • For general purpose programming under Linux you'll hear many impassioned arguments. The two best contenders are Perl and Python. These are both very high level, general purpose scripting languages which expose enough low level functionality to perform almost any operation that's accessible to any user space process on your system, and which have huge collections of pre-written modules and libraries available.

I do recommend that you read an introductory text on C and spend some time running the strace and ltrace commands on some simple utility commands like ls and mkdir and /bin/echo etc. (Actually these days I'd suggest ltrace -S in lieu of strace but forays into the output from both commands and into the ltrace output as augmented by the -S option will be extremely educational).

C is the primary programming language in which the Linux kernel and the GNU libc are written. (Small parts are in assembly). Almost all programs on a Linux (or other UNIX-like) system are linked against the C libraries (libc). The primary Perl and Python interpreters (and most other scripting languages) are also written in C. These programs (the kernel, the common system libraries and the various scripting language interpreters) are all written by C programmers, and their design and features are strongly influenced by their underlying implementations. Thus a deeper understanding of any of these eventually entails understanding C. You don't need to know anything about C++ nor Java to understand programming at this level. (Those may each be interesting and necessary in their own right depending on your career path, but the differences between C and C++ and Java are largely orthogonal to the use of C for most of your system's programs and utilities.)

So, if you agree with my premises so far, we've boiled it down to a choice between Perl and Python.

Here's where real flame wars begin.

My advice is to focus on Python (2.x) first. Python has a relatively simple and consistent syntax. You can learn the basics of Python syntax in a few hours and that's the vast majority of the syntax you'll ever encounter. There are only a few features (list comprehensions, generator expressions, decorators) which are wrinkles to the basic syntax. So most of your effort in learning Python will be devoted to learning the extensive standard libraries and trying to find the "best" way to use them (and figuring out which are the specific sets of exceptions that are worth handling to make your programs robust) and, most importantly, in learning the underlying concepts.

I think that Python's extensive libraries and relatively simple syntax do have two distinct disadvantages.

First, as you learn how to do things at a very high level in Python you might find the thought of having to work at a lower level to be tedious. Where I work Perl is the standard. I prototype my work in Python, where I know I can get it working far more quickly and reliably than in Perl; then I dread having to go through and port it to Perl for my colleagues. (I was reasonably good at Perl years before I ever used Python --- so it's not a matter of simple familiarity).

The other disadvantage is that it's sometimes difficult to find the highest level way to accomplish a given task in Python. For example to fetch a web page you might initially try doing it with low level sockets ... which will work. However, you'd be duplicating quite a bit of code that you can already find included in the urllib and/or urllib2 modules. The very fact that the standard libraries, as of 2.7.1, include both of these makes my point. Where possible the maintainers of Python have extended older modules and APIs transparently; however there are dozens of cases where Python retains two or three modules where transparent extension was not possible for some reason. (For another example you could look at the options for parsing command line options: argparse, optparse, and getopt. There's little harm in writing your programs using getopt (the oldest of these). For very simple utilities with few options and a rigid calling convention (used only by a small group of people, for example) there's nothing inherently wrong with walking over sys.argv yourself. However, it's usually worth reading the docs carefully and following the links at the bottoms of older or lower level modules which describe the newer or higher level features available.)
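
To make the command line example concrete, here is a minimal sketch of the argparse route, written for Python 3. The utility it describes, and the `host`, `--port` and `--verbose` names, are invented for illustration; only the argparse calls themselves are standard library.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical 'scan' utility (names are invented)."""
    parser = argparse.ArgumentParser(description="Toy example of argparse.")
    parser.add_argument("host", help="host name or address to check")
    parser.add_argument("-p", "--port", type=int, default=22,
                        help="TCP port (default: 22)")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print extra detail")
    return parser


if __name__ == "__main__":
    # Parse an explicit list so the example runs without real command-line input.
    args = build_parser().parse_args(["example.org", "-p", "443", "-v"])
    print(args.host, args.port, args.verbose)
```

Compared with walking over sys.argv by hand, this gets you type conversion, defaults and a generated `--help` message without writing any of it yourself.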

My advice is based on my opinion that you want to focus on deeper concepts and not have to spend much of your time and effort on syntactic and language specific issues. Understanding when to use a subprocess, versus a thread, or the multi-processing features that are included with Python has relatively little to do with the language and everything to do with programming proficiency regardless of language. (At the point where you can understand arguments about Twisted's event driven model by comparison to threading and multiprocessing then you'll probably have mastered Python and be ready to program in any language).
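
For a rough idea of what the subprocess-versus-thread distinction looks like in practice, here is a small Python 3 sketch. The helper names `run_child` and `run_in_threads` are invented; multiprocessing is left out to keep it short.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_child(code):
    """Run a separate Python interpreter as a subprocess and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


def run_in_threads(func, items, workers=4):
    """Call func on each item using a small pool of threads, keeping input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))


if __name__ == "__main__":
    # Subprocess: an isolated program with its own memory.
    print(run_child("print(6 * 7)"))
    # Threads: share memory with the caller; suited to waiting on I/O.
    print(run_in_threads(len, ["a", "bb", "ccc"]))
```

The choice between the two is about isolation and shared state, not about Python, which is the point being made above.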

The counter argument, for Perl, is simple and practical. There are quite a few more jobs out there which will call, specifically, for skills with Perl. Perl is a powerful language and has extremely extensive libraries. (The core of Perl that's distributed with most Linux systems covers a smaller range of functionality than the standard Python libraries; it's assumed that you'll install a significant number of additional packages from your distribution or through CPAN --- the Comprehensive Perl Archive Network). (By contrast there are fewer Python modules and packages that I have to fetch separately ... those are available from PyPI --- the Python Package Index).

So, if you learn Perl you'll have a leg up for finding jobs, particularly sysadmin jobs, in the short term. However, Perl's syntax is ... well ... in the words of some of its own enthusiasts ... "pathologically eclectic!" Perl can be extremely terse and its code is filled with quite a bit of punctuation. Those who love it will argue endlessly that it's "easy" and makes perfect sense --- and will have endless opportunities to do so in forums which are filled with confusion about exactly how a given snippet of code was interpreted. The syntax and the language used in the documentation and by those who support it in public forums are nuanced to the point where you can spend considerable effort to learn them.

Now, please realize that this preceding commentary is subjective and biased. It's possible that you'll try Perl and find its syntax to be intuitive and pleasant. If so, more power to you. However, I personally find that my grasp of Perl's idiosyncrasies decays very rapidly. The fundamentals I retain but I find it to be a struggle whenever I have to switch back to it for more than a few lines of code.

There are lots of other languages you could study: Java, Lisp and Scheme, TCL, Scala, and so on. However, I'd suggest starting with one that offers the best balance between utility and simplicity.

Jim Dennis
1

In the Linux world, you should know two basic things:

  1. Regular expressions. This is a must. RegEx is a universal "language"; once you know how to use regular expressions, your life will be much easier :-)
  2. "quick & dirty" is very common in Linux world. If you can have your job done, no matter how ugly your solution is, you have your job done.
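
To show what point 1 buys you, here is a short sketch that pulls IPv4-looking strings out of a log line. It is in Python only for illustration; the pattern itself would work nearly unchanged in grep -E, Perl or PHP. The pattern is deliberately simplified and the sample line is invented.

```python
import re

# Simplified pattern: four groups of 1-3 digits. It does not check that each
# group is <= 255, so "999.1.1.1" would also match.
IPV4_LIKE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_addresses(text):
    """Return every IPv4-looking string in text, in order of appearance."""
    return IPV4_LIKE.findall(text)


if __name__ == "__main__":
    line = "Failed login from 192.168.1.20 (via 10.0.0.1), user=root"
    print(find_addresses(line))
```
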

So, if you want to learn a language on Linux, you SHOULD choose a scripting language, like Python, PHP, or even bash scripting.

And my recommendation is PHP. It's simple and ugly. It has a detailed online manual. It has good RegEx support. That's all.

yegle
  • 2
    That's got to be the best description of PHP I've ever heard...When you take out the popularity there isn't a whole lot left behind it :P – Gordon Gustafson Apr 25 '11 at 23:11
  • A bit off topic, but I think the main reason for PHP's popularity is that many of the functions that would be scattered across many linker libraries/packages in other languages are standard in PHP (a bit like bundling most things you'll ever need in libc) – Phil Lello Apr 26 '11 at 02:22
1

You can do almost any task in almost any language, so the right choice is largely dictated by the problem you're solving.

It's definitely worth knowing languages from the following categories:

  • Compiled languages (like C/C++/Java). C/C++ is a good place to explore security issues with buffer overflows, stack corruption, etc.
  • Interpreted languages (like PHP). A good place to explore the problems with loosely-typed variables, and not detecting undefined functions until you call them
  • Scripting languages (sh/bash/csh/ksh). Really useful for gluing the many useful shell utilities (see /bin & /usr/bin) together

I'd make an effort to learn C/C++, as this allows stack corruption and direct memory access. This is important if you want to experiment with security issues. Many languages have C-like syntax (including JavaScript), so it's a good springboard.

If you're working in a shell much, which I guess you are as you're learning vim, you'll end up learning basic shell scripting as a side effect. UNIX Power Tools was a good book to learn more advanced stuff; I don't know if it's still published.

PHP can also be a good language to learn; the main advantage it offers new programmers is that a lot of functionality is built in, rather than in a library you'd need to link to (which isn't a complex task). Because of this, browsing the main docs will teach you about many things.

Phil Lello
1

First, some negative advice:

Basically what I am planning on doing is learning to program side-by-side with learning how to use Vim.

Do not do that. Find an editor that is really comfortable. Learning both the language and the editor is thrice as hard as learning them in order (obviously the editor should go first :-) )

As I said I am going into the security so I will be creating mostly security related apps. Most of which will be networking related.

Networking related security apps? C is a no brainer choice here. You will need to access the network at the system API level, which means C is the way to go. Of course this does not mean that everything needs to be in C - a C library + a $favorite-high-level-language wrapper might save you the C-related hassles in parts that do not need C.
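
To show what the high-level half of that split can look like, here is a rough Python sketch of about the simplest networking-related check there is: testing whether a TCP port accepts connections. The function name `port_is_open` is invented; under the hood the standard `socket` module is a thin wrapper over the same system socket API you would call from C.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Demonstrate against a listener we create ourselves on the loopback
    # interface, so nothing outside this machine is touched.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 = let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    print(port_is_open("127.0.0.1", port))
    listener.close()
    print(port_is_open("127.0.0.1", port))
```
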

With that said I have a few ideas. I was thinking of starting with JavaScript, because it is cross-platform, and I have seen it suggested before.

I bet it was suggested in a context different from yours. JavaScript has its strengths but it is not a good general purpose language, not yet. JS has no standard libraries comparable to those available to C, Python, Perl, Ruby, Java & company.

Speaking of $favorite-high-level-language - my advice is to go with Python. It interfaces nicely with C, it ships with a lot of useful libraries and has many more libraries available as add-ons.
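
As one small example of the C interfacing, this sketch calls the C library's `strlen()` directly from Python using the standard `ctypes` module. It assumes a typical Linux system where the C library can be located and loaded this way; the wrapper name `c_strlen` is invented.

```python
import ctypes
import ctypes.util

# Locate and load the C library. find_library("c") works on typical Linux
# systems; other platforms may need a different name (assumption).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data):
    """Return the length of a bytes object as measured by C's strlen()."""
    return libc.strlen(data)


if __name__ == "__main__":
    print(c_strlen(b"hello"))
```

Passing bytes with an embedded NUL, such as `b"ab\x00cd"`, is a quick way to see C's string convention from the Python side: `strlen()` stops at the first NUL byte while Python's `len()` does not.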

Rafał Dowgird
0

Nothing at all wrong with learning Javascript but it only runs under a browser, so your programming will be limited to web page related apps.

There are probably as many answers to 'what is a good programming language to start on' as there are languages. My tuppence worth is that you could do worse than starting with basic shell scripting, just seeing what you can do to automate tasks without using anything beyond the shell. Then extend that using Perl, or a similar language which grew out of the need to do tasks more complex than the shell is capable of. After that, and if you are really interested it will only take a few weeks, start using C or a derivative.

blankabout
  • 3
    Only on browsers? [Not Anymore](http://en.wikipedia.org/wiki/Nodejs). – EEAA Apr 25 '11 at 16:09
  • Thanks, I stand corrected, but I still would not recommend Javascript as a learning environment, for a beginner, getting support for non-browser versions would be a nightmare. – blankabout Apr 25 '11 at 17:24
  • 2
    I've always thought ECMAScript is the programming language and JavaScript sits on top of ECMAScript to provide useful functions for manipulating the DOM. JavaScript is for the browser. – Jonathan Mayhak Apr 25 '11 at 19:42
0

My recommendations? Hmmm. Well, you may have to decide as you go. For the wholesome, well-rounded range, you can go the usual CS degree route, maybe not in this order.

  1. C/C++ - You can get the Object Oriented stuff down, and at least you will have tried. C++ is the 'professional' standard.
  2. Assembly (just for a little bit - it'll help you understand the real workings of the processor, memory, etc. - you don't have to marry it.)
  3. Python/Perl/Bash - Get these scripting languages down, they'll be most useful for your level of Linux.
  4. PHP/Ruby, MySQL and HTML - Just get your Web programming on! You'll understand the whole server-client interaction process, another level of computing.

Need helpful concepts? AI, neural networks. These should round you out.

You can choose one of these to do or touch on all of them. My language? All of them, as needed, but I've been programming since 1984, have a CS degree, and have written games and all kinds of embedded systems apps. It's what I do. You need to find out who you are and what YOU do. Just make sure you're having fun.

Enjoy!

0

oops, misread info-security for info-systems... oh well most of this still applies

Java

  1. There must be an entry level programming course that's java-based that'll count toward your degree. Always nice to get credits for the stuff you're learning.
  2. It'll give you some career growth if you get sick of being strictly a systems-guy... or if the company you work for later decides that systems are like toasters and so are the people who run them.
  3. Object Oriented
  4. You said you wanted to do some Android development. That's going to mean Java.

Honestly, if you're working in Linux and take a beginning programming course in Java, and then maybe follow up with a couple more programming courses, the other tools like Bash, sed/awk, etc... should sort of fall into place. If you really get into systems, you might pick up some C later, but I wouldn't say it's in great demand and I wouldn't even say it's a requirement of being a systems guy -- unless you're really into internals.

YMMV

MLH