(LZMA is a compression algorithm, not cryptographic.)
For the purpose of implementing cryptographic algorithms, the generic method is getting the relevant descriptive standard, grabbing your keyboard, and trying. Most standards include "test vectors", i.e. sample values which let you know whether your implementation returns the correct answers. At that point, things differ, depending on what kind of algorithm you are considering.
Symmetric cryptography:
Symmetric algorithms cover symmetric encryption, hash functions, and message authentication codes (MAC). You do not need to know much mathematics to handle these; most of it is about additions of 32-bit and 64-bit integers (that's modular arithmetic, with 232 or 264 as modulus) and bitwise operations (XOR, AND...).
Such code is usually done in C. Good performance is achieved by having some notions on how the C compiler will understand and translate the code into instructions for the CPU; knowledge of assembly is not strictly mandatory, but quite useful. An important parameter is cache memory: loop unrolling is usually a good tool, but if you overdo it, performance drops sharply.
I suggest beginning by implementing the classical hash functions (the SHA family, described in FIPS 180-3) and trying to make them fast. As a comparison point, get OpenSSL and use the command-line tool openssl speed
to see what kind of performance can be obtained (this tool is already included in any decent Linux distribution, and it works on Windows and MacOS too). For instance, on my PC:
$ openssl speed sha256
Doing sha256 for 3s on 16 size blocks: 4842590 sha256's in 3.00s
Doing sha256 for 3s on 64 size blocks: 2820288 sha256's in 2.99s
Doing sha256 for 3s on 256 size blocks: 1262067 sha256's in 2.99s
Doing sha256 for 3s on 1024 size blocks: 395563 sha256's in 3.00s
Doing sha256 for 3s on 8192 size blocks: 53564 sha256's in 3.00s
OpenSSL 0.9.8o 01 Jun 2010
built on: Wed Feb 23 00:47:27 UTC 2011
options:bn(64,64) md2(int) rc4(ptr,char) des(idx,cisc,16,int) aes(partial) blowfish(ptr2)
compiler: cc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN
-DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -O3 -Wa,--noexecstack -g -Wall -DMD32_REG_T=int
-DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
sha256 25827.15k 60367.37k 108056.57k 135018.84k 146265.43k
which means that OpenSSL includes a SHA-256 implementation hand-optimized in assembly, which achieves 146 MB/s when processing 8 kB messages. On the same machine, a pure C implementation ought to get to at least 130 MB/s.
For an example of how hash functions are implemented in C and Java, and how hashing speed can be measured in a meaningful way, see sphlib.
Afterwards, you can try symmetric encryption, in particular the AES (FIPS 197). It helps a bit to know what a finite field of characteristic 2 is, but the standard is clear enough to guide you through a perfunctory implementation. Then, try to optimize things. OpenSSL can serve as a comparison point, and get inspiration from the AES implementations of Brian Gladman. As for security, there has been some concern about what key-dependent information could be leaked through use of look-up tables in the implementation (try to search for "AES cache timing attack"); trying to reproduce that kind of attack is a very good exercise (mind you, it is not easy, but if you succeed in demonstrating it in lab conditions then you will have learned a good deal on how cryptographic implementations work).
Asymmetric cryptography:
Asymmetric cryptography is about the algorithms which involve more than one party. This includes asymmetric encryption (RSA, ElGamal), key exchange (Diffie-Hellman) and digital signatures (RSA again, DSA...). The maths contents are much bigger there, and optimization is a much broader subject than for symmetric cryptography, because there are several ways to implement each algorithm, instead of a single "obvious" implementation path.
A good reference is the Guide to Elliptic Curve Cryptography. Although it is mainly about elliptic curves, it includes a general treatment of the implementation of operations in finite fields, and it so happens that this is the sample chapter which can be downloaded for free at the URL linked to above. So get it and read it now. Another indispensable reference is the Handbook of Applied Cryptography, which can be freely downloaded; chapter 14, in particular, is about efficient implementation.
RSA is simple enough, and is adequately described in PKCS#1. There are possible timing attacks on RSA, which are countered by masking (yes, this is a paper "written by a researcher", but in the subject of cryptography, researchers are the people who understand what is going on). If you get the hang of modular arithmetic, you can try to implement DSA (FIPS 186-3). Diffie-Hellman is mathematically simple (it needs nothing more than is needed to implement DSA) but its describing standard (ANSI X9.42) is not downloadable for free.
Elliptic curves are a popular future replacement for modular arithmetic; EC variants of DSA and Diffie-Hellman are faster and believed more secure with shorter public keys. But that's more mathematics. There again, the Guide to Elliptic Curve Cryptography is the must-have reference.
There are other kinds of asymmetric cryptography algorithms, e.g. the McEliece cryptosystem (asymmetric encryption; there is a variant for signatures described by Niederreiter) and algorithms based on lattice reduction. But they do not (yet) benefit from published standards which take care of the implementation details, and there are not so many existing implementations to compare with. You'd better begin with RSA and DSA.
Cryptanalysis:
Cryptanalysis uses a much higher dose of mathematics than implementation.
For symmetric cryptography, the two main tools are differential and linear cryptanalysis; see this tutorial.
My own path to cryptography began by implementing DES, and then implementing Matsui's linear cryptanalysis on a reduced version of DES (8 rounds instead of 16). DES is described in FIPS 46-3, which is officially withdrawn, but still available. From DES can be defined Triple-DES (three DES instances, with three distinct keys, the middle one being use in "decryption" direction) and there are published test vectors for Triple-DES (also known as "TDES", "3DES", or sometimes "DES", which is arguably confusing).
For asymmetric algorithms, cryptanalysis mostly involves working on the mathematical structure of the keys, e.g. by trying to factor big non-prime integers in order to break RSA variants. Mathematics here range from the non-trivial to the totally unimaginable, so this might be too steep a learning curve to begin cryptography by trying to break RSA...