51

I am currently studying IT at college (UK college aka not University) and the coursework is boring me to death. I have been coding for quite a while now mainly in OO languages such as C# and Java but often get bored and give up quickly because the majority of it is boring UI stuff I hate doing, the projects I come up with rarely have much to do with code design and actually creating algorithms. I want to start writing my own algorithms of sorts and start moving away from the user friendliness side and start learning things that interest me, namely cryptography and compression. I want to write my own encryption algorithm, to encrypt the bytes of a file or string. I have a few questions:

  • Where would I start with this, What books/materials are recommended for starting with cryptography?
  • Do I need extensive cryptography knowledge to get started on a basic algorithm?
  • Will C# be OK for putting an encryption algorithm into practice?

Any help would be sincerely appreciated. I want to start writing code so when it comes to applying to uni, I have something to show for all of my bold claims on my application!

Confuto
  • 647
  • 1
  • 6
  • 6
  • 136
    "Do I need extensive cryptography knowledge to get started on a basic algorithm"? YES. Both designing and implementing cryptographic algorithms is extremely difficult, done only by professionals in teams. Play around all you want, learn, enjoy.... but don't even _think_ about using your own crypto algorithm for real-world encryption. – S.L. Barth Nov 22 '15 at 15:24
  • 7
    this question is way too broad to be answered. Designing an encryption-algorithm can be anything from stuff a primary-school pupil could do (caesar-chiffre for eg) to complex mathematical problems that drive grown mathematicans to despair. – Paul Nov 22 '15 at 15:25
  • @S.L.Barth Thanks for the comment. I see, maybe best if I find something else to work on then. I would not dare to use anything I design in the real world at all haha! – Confuto Nov 22 '15 at 15:26
  • 1
    `Will C# be OK for putting an encryption algorithm into practice?` SOme encryption algorithm should work independent of any language. If you use something which can't be done in eg. C, your algorithm can be thrown away. – deviantfan Nov 22 '15 at 15:40
  • 8
    `when it comes to applying to uni, I have something to show` Don't do this. There are (too) many ways to backfire with an encryption algorithm – deviantfan Nov 22 '15 at 15:43
  • 1
    C# will be fine as you're only doing this for practice anyway and performance is thus not a priority (again, you should never use homebrew encryption in production and the crap performance is actually a benefit as it'll be a deterrent from doing that). – André Borie Nov 22 '15 at 17:13
  • 13
    if you're looking for a set of crypto related programming challenges, I'd recommend lookig at http://cryptopals.com/ – Rory McCune Nov 22 '15 at 17:19
  • 34
    @deviantfan I don't see the issue with this - homebrew crypto is definitely a no-no in production, but showing it off as a hobby project should be fine and at the very least demonstrate programming knowledge even if the crypto itself is bad. – André Borie Nov 22 '15 at 17:25
  • Why do you want to write a new algorithm? Are the existing ones insufficient? What particular problem in existing algorithms do you expect to solve? Or do you just want to do it as a learning exercise? – Superbest Nov 24 '15 at 01:33
  • @RоryMcCune I cannot thank you enough for that reference. I'm a mathematician and most of my programming experience comes from solving puzzles. Those problem sets seem to be a fun way to explore the implementation side of cryptographic algorithms! – A.P. Nov 24 '15 at 11:09
  • @JamesRyan That is true, but I am teaching myself as a hobby and forcing myself to start a project can be hard at times, let alone doing something I hate. I am aware one day if I manage to make this a career I will have to do boring things, but not yet :D – Confuto Nov 24 '15 at 21:46
  • @Superbest Yes this is just a learning exercise, I don't expect to be creating anything to be used by others. I don't expect to solve any problems to be honest, just to gain some knowledge on how things work and give myself a challenge. – Confuto Nov 24 '15 at 21:54
  • 1
    @Confuto For learning exercises, why not try an online problem DB like ProjectEuler (not crypto specific) or the exercises at each chapter end of a popular cryptography textbook? – Superbest Nov 25 '15 at 01:05
  • 2
    @S.L.Barth Extensive knowledge is required to build a *strong* encryption algorithm, but not for a *basic* one, which is what the OP wants. I fear your comment has mislead him into giving up the endeavour based on his response to it! I think the OP can easily write a basic encryption algorithm and learn some useful concepts from the exercise. – Jon Bentley Nov 25 '15 at 19:21
  • ["I've built my own crypto"](https://twitter.com/old_sound/status/602996592531091456) – Dan Dascalescu Nov 25 '15 at 20:39
  • When I read the title, I couldn't help but uttered "uh-oh"... – Vladislavs Dovgalecs Nov 25 '15 at 22:08
  • Guys how is this well researched? If you google anything to do with cryptography it is either a paper published by god-like mathematicians or "DO NOT DO THIS YOURSELF" I am surprised Google doesn't just simply say "no" if you search for this. – Alec Teal Nov 26 '15 at 01:12
  • 4
    I just came here for the comments. – Curtis Mattoon Nov 26 '15 at 02:46
  • You can also have some fun if you can get a bunch of novice programmers together, and challenge them to make an encryption algorithm none of the others can break (which doesn't mean it's unbreakable, of course) and then to break each others' algorithms. – user253751 Nov 27 '15 at 04:36

9 Answers9

137

Of course you can start small and implement your own algorithms. But do not assume they provide any security beyond obfuscation.

The difficult thing when it comes to cryptography is finding reasons why something actually is secure. You won't be able to decide that within months and if you feel like you are at that point, you are most probably wrong.

It is much easier to find reasons why things are insecure than reasons why they are secure, so if you want to start somewhere, develop your own algorithms until you think they are secure and then try to find out why they are not and find ways to attack them.

Most mistakes are made when implementing algorithms. So if you want to get a well paid job you could learn how to implement that stuff correctly.

I would recommend starting to implement something like AES and than continue to different operation modes like CBC or CCM and find out why randomness is important. Continue with SHA-2 and HMAC and proceed to asymmetric cryptography. Always check what others did and why they did it and have a special look at side channel attacks and how they are performed. If you are at that point you will find your way to go on.

The reference to start with would be the "HAC", which is freely available online: http://cacr.uwaterloo.ca/hac/

[Edit] A suggestion from JRsz which shall not be buried in the comments. A good book for beginners: http://crypto-textbook.com/

fr00tyl00p
  • 2,329
  • 1
  • 15
  • 17
  • The HAC is a great reference text for those seriously implementing new crypto math, but is likely too dense for a beginner. – Adam Shostack Nov 22 '15 at 17:13
  • 78
    +1 for `Of course you can start small and implement your own algorithms. But do not assume they provide any security beyond obfuscation.` A lot of people keep saying, "don't do it!", but this provides a good opportunity to learn more. This little disclaimer is excellent. – Mark Buffalo Nov 22 '15 at 17:26
  • 10
    I am more than aware anything I make will not be usable in the real world, this is simply to give me something to learn and work towards. – Confuto Nov 22 '15 at 18:39
  • 1
    Not that I disagree with you overall message, but... *"The difficult thing when it comes to cryptography is finding reasons why something actually is secure. You won't be able to decide that within months and if you feel like you are at that point, you are most probably wrong."* So do people know *why* an algorithm like RSA is secure? As far as I'm aware, it's because no one has been able to find a way to break it. So why should a student believe his/her algorithm is any worse? – user541686 Nov 23 '15 at 05:49
  • 5
    @Mehrdad I think that it is simply because his/her algorithm has not been around for 35 years and still unbroken. – mbrt Nov 23 '15 at 08:01
  • @mbrt: By your logic it is impossible to invent a secure encryption algorithm. – user541686 Nov 23 '15 at 08:50
  • 6
    @Mehrdad The reason why RSA is secure is that the RSA problem is hard. You are right. It is hard because no one found way to break it. A student should believe their algo is weak because it is a defensive attitude. IMHO that is more reasonable than assuming their algorithm is secure unless someone breaks it, because those who try may be much less than those who tried to break RSA already. – fr00tyl00p Nov 23 '15 at 09:42
  • 2
    @fr00tyl00p: Again, I'm not disagreeing with your overall message or conclusion. What I'm disagreeing with is your premise that cryptographers "find reasons why [their algorithm] actually is secure". To the best of my understanding, they really don't do this: the reason comes after the algorithm is already invented, by the fact that it is not broken. Until then, all of it is based on the fact that they just have a gut feeling that their algorithms should be secure, and they don't have any evidence to the contrary. Your answer makes it seem like cryptographers prove uncrackability or something. – user541686 Nov 23 '15 at 10:13
  • @Confuto It might just as well be usable in real world in the following sense: no robot would be able to target you without supervision if you do a standard encryption + nonstandard encryption (i.e. your cypher will be broken only if there're real live people after you; this is not always true when using only library stuff that tends to get stale like DES). This might or might not be important. – Eugene Ryabtsev Nov 23 '15 at 10:14
  • 2
    @fr00tyl00p: To put it into perspective, I have one hell of hard time believing that ECC is actually secure. It doesn't look like it should be, and it looks like something I could come up with on a weekend since I don't need any proof of hardness or anything whatsoever, just a gut feeling. The one thing I would lack is a good reputation to make people take a similar algorithm seriously, but what else would I lack? I could just claim it's hard and wait for someone to prove me wrong. So why do people consider ECC any more secure than a random algorithm I might come up with? – user541686 Nov 23 '15 at 10:17
  • 1
    the hac is very complicated for beginners. I recomment this book for a good start: http://crypto-textbook.com/ – JRsz Nov 23 '15 at 16:37
  • I like that this encourages the user to try it with *managed expectations*. It would be a good (and I would argue necessary) exercise to have peers attempt to break the encryption. – corsiKa Nov 23 '15 at 18:51
  • Selected as answer, thanks. Overwhelmed with the awsome responses from here, thanks a lot to all! – Confuto Nov 23 '15 at 22:13
  • 3
    @Mehrdad -- cryptographers do have reason to believe that RSA is secure other than just "it is not broken yet". By studying the underlying mathematics they can prove that breaking RSA is very difficult. There are assumptions involved (such as that prime factorisation is slow), but those assumptions can be spelled out and quantified and are much stronger than just "no-one has found a hack yet". A good introductory mathematical crypto book should explain this in detail, or try searching for e.g. "why RSA is secure". – Rich Nov 24 '15 at 12:41
  • 3
    @Rich There are no mathematical nor computational proofs for RSA security. The assumption is that factorization takes exponential time. But this is only an assumption, not a proof. Again, this means in practice that RSA "is not broken yet" and nothing more. – mbrt Nov 24 '15 at 15:10
  • 4
    @mbrt: No, my point was that RSA's strength relies on factorization not being broken yet, which is a far stronger claim than "X not being broken yet" for any new cipher X, because factorization is much better understood than X. As I said in my comment, there are assumptions involved, but those assumptions can be spelled out and quantified and are much stronger than just "no-one has broken mbrt's cipher #7 yet". – Rich Nov 24 '15 at 15:18
  • @Rich: *"They can prove that breaking RSA is very difficult."* ...can you link me to said proof? All it relies on is factorization being difficult; there is no proof behind that. Not sure if you're following the news, but there was a very recent development on a quasi-polynomial-time algorithm for graph isomorphism, and after hearing about it some people are beginning to wonder whether factorization can be solved in quasi-polynomial-time too. Also, can you say the same for ECC? People think ECC is hard because no one has been able to break it either. Is there a proof? No. – user541686 Nov 24 '15 at 18:53
  • 3
    @Mehrdad: yes, wikipedia has the citations: "Miller has shown that – assuming the Extended Riemann Hypothesis – finding d from n and e is as hard as factoring" via https://en.wikipedia.org/wiki/RSA_(cryptosystem)#Integer_factorization_and_RSA_problem . As I stated before, the cryptographers have stated their assumptions: i.e. Extended Riemann Hypothesis and difficult prime factorisation, and then proven RSA secure under these assumptions. My point is that these assumptions are much better understood and more likely to be true than just any old "but X isn't broken yet" claim. – Rich Nov 24 '15 at 18:58
  • @Rich: And what about ECC? Also, if you state your assumptions that doesn't make your algorithm more secure. I can come up with an algorithm and state my assumptions too. Does that mean it's secure now? I feel like you're missing my point... – user541686 Nov 24 '15 at 19:01
  • @Mehrdad: I think you are misunderstanding me. I did not say that there is a proof that RSA is "secure". I said there is a proof that RSA is secure, given certain well understood assumptions. I am trying to answer your initial question "why should a student believe his/her algorithm is any worse [than RSA]" -- it is because the assumptions used in these proofs are very strong and much stronger than just a vague claim of "no-one has broken student X's crypto yet". – Rich Nov 24 '15 at 19:02
  • "if you state your assumptions that doesn't make your algorithm more secure" -- it does, if your assumptions are strong and you *prove* a link between your assumptions and your algo's security – Rich Nov 24 '15 at 19:02
  • 1
    @Rich: (1) The assumption that factoring is hard is only "strong" because no one has been able to break it. Why does that imply a new assumption that no one has worked on is weak? (2) What is the assumption that belies the strength of ECC? You keep avoiding answering this. – user541686 Nov 24 '15 at 19:08
  • Let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/32069/discussion-between-rich-and-mehrdad). – Rich Nov 24 '15 at 19:16
  • Building your own encryption algorithm without enough math/cs/crypto knowledge is not called *encryption* but *en-crap-tion*. Better than trying to learn by implementing an algorithm, you should learn more math (deep and hard math topics) and CS, and reimplement already-existent algorithms and study their weak points. – Luis Masuelli Nov 25 '15 at 15:47
37

Coursera

Here's my 2 cents:

Join the Coursera Cryptography online class:

The class takes six weeks. Each week there are several lecture videos, a graded quiz and an optional programming assignment. (And these assignments involve implementing crypto parts.)

At the end of the six weeks there is a test.

If you want to be challenged, then this is the right way to go. It is a lot of work. I suggest you plan 10+ hours each week. More if you want to do the programming assignment as well.

(Edit: Here's a table of contents for a previous run of this class.)

Clarification: The programming assignments are just there to generate a deeper understanding of the topic. They are explicitly NOT something that you are then meant to release into the wild.
On the contrary: The "Don't you ever implement this yourself!" message is repeated again and again. (And without giving too much away: Whenever "Just implement it yourself!" is on one of the multiple choice tests, then it's wrong.)

StackzOfZtuff
  • 17,783
  • 1
  • 50
  • 86
  • I forgot about Coursera, thanks! I am subscribed to Lynda.com but they d not have much on crypto. – Confuto Nov 22 '15 at 18:38
  • 6
    +1 on the class. It is quite challenging. Note that the programming assignments are in C. – Scott C Wilson Nov 22 '15 at 18:54
  • 5
    Actually any language will do. Only results are submitted. Not the implementation. On the forums there are implementations in different languages. Examples are in Python2 though. – StackzOfZtuff Nov 22 '15 at 19:08
21

Start by breaking, not building your own. There's a worrisomely large number of stackexchange posts by people who've written their own algorithms. Take a look around and figure out what's wrong with them. (Don't look at the posted answers.) [Good searches include "Is this secure" and "whats wrong with this algorithm".]

Only when you've found issues in other people's work should you move to trying to implement other people's algorithms. (@stackzofztuff's comment about Coursera is not bad--if I recall, Dan Boneh starts out that way, with more structure than poking here.)

Adam Shostack
  • 2,659
  • 1
  • 10
  • 12
  • *Take a look around* Can you please explain how to find these posts? Do we have to filter questions by tag, websites, etc.? – A.L Nov 26 '15 at 11:00
  • @A.L, I'm thinking searching by multiple tags would be most effective. Something like [cryptography+algorithm](http://security.stackexchange.com/questions/tagged/cryptography+algorithm) or [encryption+algorithm](http://security.stackexchange.com/questions/tagged/encryption+algorithm). You'll have to dig a bit, but at a glance, I do see a few questions. Although they're largely buried beneath smaller things like mixing algorithms or small obfuscations added to existing algorithm. Lots of stuff that doesn't improve security nor weaken it. I don't agree with Adam Shostack that this is effective. – Kat Nov 26 '15 at 17:33
  • I searched for "Is this secure" and think the first 5 questions are good places to start; I've edited my response to be more crisp. – Adam Shostack Nov 27 '15 at 17:44
13

Bruce Schneier's Applied Cryptography is a must read if you want to start studying this field. I am surprised that nobody suggested it before.

And yes, you need to know a lot about crypto even before trying to roll your own algorithms for fun. Don't even think of using them for real-world problems, though -- there's already a lot of bad crypto around.

Concerning programming, avoid proprietary languages like the plague. I'd suggest C, or even C++.

dr_
  • 5,060
  • 4
  • 19
  • 30
  • 3
    Why do you say *"Concerning programming, avoid proprietary languages like the plague"*? My experience is that the code can be converted to C/C++ pretty easily (often almost identical, line for line) if you write optimised code in C# (not that many people do, though…). Also, the C# compiler is now open source (released under Apache 2.0). – Toothbrush Nov 23 '15 at 12:55
  • 1
    C# has also always been publicly documented as a standard, through ECMA if I recall correctly (and that is the standard that Microsoft's compiler is expected to adhere to). – user Nov 23 '15 at 13:45
  • 2
    +1 for Bruce Schneier's book. – tcrosley Nov 24 '15 at 08:01
  • @Toothbrush Then why not using directly C which is open, mature, and widespread. – dr_ Nov 24 '15 at 08:22
  • 1
    @dr01 Sure if you're writing a serious encryption algorithm then you should definitely write it in C so it can be formally reviewed. However if you're just knocking up some code to understand what encryption does then why not use whatever language you're most comfortable with? If the only people that will see it are him and his cat, what difference does it make if it is written in AutoLisp! Maybe Applescript will make a come back and become the new Encryption standard... ;) – Michael B Nov 24 '15 at 11:37
  • @dr01 When learning something more general than a specific language, it is useful if language used does not give you loaded gun, cock it, and duct-tape it on your waist pointing at your foot. Language which has "undefined behaviour" as core concept is not suitable for learning about algorithms. While C has a lot of things going for it, and has its place, it requires specific expertise which is totally unrelated to learning algorithms. – hyde Nov 26 '15 at 17:53
  • And always remember Schneier's rule... Anybody can create a truly amazing encryption algorithm that they cannot break. ie... peer review is necessary to prove that it actually works. – Fiasco Labs Nov 28 '15 at 03:43
10

A good start would be to implement existing algorithms and learn how they work in depth. For example, the one-time pad algorithm is easy to learn and implement, and studying its strengths and weaknesses will get you started. It will also get you comfortable with the kind of bit-twiddling that's important in cryptography. Doing a search for "one-time pad" will get you started.

Scott C Wilson
  • 543
  • 3
  • 11
  • 2
    But note that ontimepad is a "odd" one (algorithm as well as possible usage), it won't help much in understanding AES etc., which is much more used in practice and has many concepts which an be found in other algos too (etc.etc.) – deviantfan Nov 22 '15 at 15:39
  • 2
    the point that would be relevant for OP about encryption-algorithms is usually not the logic that is implemented, but the math behind it. AES is quite a good example for this. In the code all one can see are some simple bit-shifts and low-level mathematical instructions. The algorithm itself is designed in a way to make the encrypted data look purely random, which requires quite some skills to achieve from the designer, to say the least. – Paul Nov 22 '15 at 15:56
  • One time pad is an interesting case though - it's quite a simple algorithm, and in optimal conditions - unbreakable. It's weaknesses stem from scalability and key exchange, which are non trivial for 'real world' applications. But I don't know it really helps understand the other crypto options, which are way more complicated as a result of needing to deal with trust, key exchange etc. – Sobrique Nov 24 '15 at 12:44
6

Just to pile on to the great answers that are here, with something of a different angle.

If you make the assumption that your v1 algorithm is going to be insecure and awful, and your v100 will be only very slightly better, but equally insecure. (as will your v1000)

With that assumption in mind, you can learn a lot by solving the problems that cryptography has needed to solve in order to become the science it is today. i.e. reinventing the wheel.

Personally I find reinventing the wheel an excellent way to learn a complex task. It gives you something to solve. If that is your thing there are fewer things that are more tricky to solve than cryptography.

That way, you can go to University and say, well I've been designing my own cryptography algorithm as a vehicle for learning how to solve difficult problems. At v50 the algorithm sucks, but these are the lessons I've learnt, the solutions I found, and this is how they're solved in the real world.

There's a big difference between saying I've written a 'good' cryptography protocol and saying I've written a protocol that is almost certainly very weak but the purpose wasn't creating a secure protocol. Personally I'd have a large amount of respect for someone who said the latter. (I'd likely nod to the former - and quickly show him the door)

Michael B
  • 436
  • 4
  • 13
  • This. OP is looking for an interesting challenge - a chance to learn, not to become the inventor of the Next Big Thing. – Floris Nov 24 '15 at 16:06
3

You can implement already existing encryption algorithms, but designing your own encryption algorithm is one of the most complex matters you could deal with. For a general introduction I highly recommend this channel: https://www.youtube.com/channel/UC1usFRN4LCMcfIV7UjHNuQg/videos or the book "Understanding Cryptography" by Christoph Paar and Jan Pelzl (http://crypto-textbook.com/). I assume you are aiming at symmetric algorithms and I would recomment you to start reading a lot of theory about them, what is safe, was is unsafe (historicle) and explicitly how have current state of the art algorithms made their way to what they are (how was AES developed, chosen, etc).

You will encounter many mathematicle problems when you deal with different attacker models, even in the symmetric part. The asymmetric cryptography is manly based on mathematicle problems and there are some very tricky attacks which are very powerful against asymmetric cryptography.

Bottom line is do not develop your own algorithms, unless you have many years of experience and are very familiar with the topic (all parts of it). Implementing a few of them is nevertheless a good idea, but if you are looking for some kind of project why dont you write a program which uses already existing algorithms and decrypts some data for you. You will enough issues to deal with that, because a secure algorithm is not a guarantee at all for secure encryption. Different operation modes will be important on this matter and some other issues as well.

JRsz
  • 131
  • 6
2

You could follow Scott Wilson's suggestion about the One-Time pad, but with real random data. You can e.g. consider the noise from the computer's webcam. Let the webcam take a few pictures of a static scene, convert the images to 32 bit floating point images, normalize the pictures to the same brightness, take the average and then subtract one of the pictures from the average. If you map negative pixels values to 0 and positive pixel values to 1, you almost have perfectly random bits, that are uncorrelated when the pixels are not too close. Applying von Neumann's algorithm to pairs of bits taken from distant pixels:

(0,1) ---> 0

(1,0)---> 1

(0,0) and (1,1) are discarded

will yield perfect random bits with 0 and 1 having exactly 50% probability.

Count Iblis
  • 228
  • 1
  • 5
2

Go ahead, write an algorithm but at the end give a task to one of your friends/fellows who regularly deal with cryptography; tell them to break your encryption if they can.

You will notice that they will be able to break it in a matter of minutes and you'll be left stunned thinking as to how many loopholes were there that gave the game away to people with extensive cryptography knowledge (to qoute your words)

This is exactly what happened to me when I was learning to program long ago and wanted to challenge some guys who were smart at such stuff and I failed miserably. So much so they could even decipher the message printed on a paper using just their mind and skills.

You certainly can write one and perfect it over the time but by no means it can be secure anytime soon against the people with that knowledge.

That will be a good starting point (or possibly even a stopping point) on your quest to write your own algorithm :)

Hanky Panky
  • 231
  • 1
  • 9