I'm trying to find a way to crack XKCD-like passphrases (correcthorsebatterystaple) from word dictionaries. Basically concatenate X number of words from a dictionary file. Right now it honestly doesn't look like there is an easy way.
After the XKCD comic, plenty of tools like this were made, and I could always write my own and generate a dictionary file, but I feel like there should be a better way. My specific use case is John the Ripper on MD5 hashes.
It doesn't look like there is a way to concatenate entire words in JtR, only modify existing ones. Reference HERE at the bottom
A multi-word "passphrase" cracking mode, or an enhancement to the wordlist mode, might be added in a future version of JtR.
This was 2008, but I still don't see a reference to anything like this changing. I could use the technique described in that document, but it gets real ugly if you need to do more than 2 word phrases or if your dictionary is of any significant length.
My second idea was to use Crunch to pipe in the words, for example:

    crunch 0 0 -p word1 word2 | john --pipe mypasswd*
I could modify that a bit and use -q wordlist.txt and crunch would use the words from that text file. The problem here is that there is no way (that I have found) to limit the number of words used for a passphrase. For example, if your dictionary contained 1,000 words, each passphrase would be a concatenation of all 1,000 words. Again, it gets real ugly if your dictionary is of any significant length.
Edit: A note for people who suggest changing the crunch command above to specify a min and max length: that does not work with the -p or -q options, although the two numbers must still be supplied (hence the 0 placeholders). From the crunch reference, under the -p flag:
and it ignores min and max length however you must still specify two numbers.
That leaves my requirements at something that writes to stdout, not a file due to the size such a file would be, and allows you to specify the number of words to join (2 word phrases, 3 word, etc). Preferably such a tool would also allow you to specify separating characters (correct.horse.battery.staple correct|horse|battery|staple) in case other characters or even spaces are allowed.
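To put a rough number on "the size such a file would be", here is a small sketch that counts the candidates and estimates the bytes they would occupy. The 1,000-word dictionary matches the example above; the average word length of 6 and the function names are my own assumptions for illustration, and words are assumed not to repeat within a phrase.

```python
def count_ordered_phrases(n_words, phrase_len):
    """Number of ordered selections of phrase_len distinct words from n_words."""
    total = 1
    for k in range(phrase_len):
        total *= n_words - k
    return total


def estimate_bytes(n_words, phrase_len, avg_word_len, sep_len=0):
    """Rough size of the full candidate list, one candidate per line."""
    # Each line holds the words, the separators between them, and a newline.
    per_line = phrase_len * avg_word_len + (phrase_len - 1) * sep_len + 1
    return count_ordered_phrases(n_words, phrase_len) * per_line


if __name__ == "__main__":
    # Assumed figures: 1,000 words averaging 6 characters, no separator.
    for r in (2, 3, 4):
        print(r, count_ordered_phrases(1000, r), estimate_bytes(1000, r, avg_word_len=6))
```

Under those assumptions, 2-word phrases give 999,000 candidates, 3-word phrases about 997 million, and 4-word phrases about 994 billion, which is why streaming to stdout looks like the only sane option.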
I hope this is the correct Stack Exchange site; let me know if there is another I should try.
Edit
For anyone else out there looking for this same kind of thing, here's a Python code snippet that does more or less what I want.
    # Runs under Python 2 or 3.
    # words = ['test', 'correct', 'horse', 'battery', 'staple']
    ## Change the file specified here to your dictionary file.
    ## Only the first whitespace-separated token on each line is used;
    ## blank lines are skipped.
    with open("wordList.txt") as f:
        words = [line.split()[0] for line in f if line.strip()]

    def permutations(iterable, r=None, s=''):
        pool = tuple(iterable)
        n = len(pool)
        r = n if r is None else r
        if r > n:
            r = n  # Let's assume users want all full-length permutations
        indices = list(range(n))
        cycles = list(range(n, n - r, -1))
        # Join the current selection into one string and print it once
        print(s.join(pool[i] for i in indices[:r]))
        while n:
            for i in reversed(range(r)):
                cycles[i] -= 1
                if cycles[i] == 0:
                    indices[i:] = indices[i+1:] + indices[i:i+1]
                    cycles[i] = n - i
                else:
                    j = cycles[i]
                    indices[i], indices[-j] = indices[-j], indices[i]
                    print(s.join(pool[i] for i in indices[:r]))
                    break
            else:
                return

    # The first value is our word list (read from the dictionary file) at the top
    # The second value is the number of words we want to combine; a 2 would indicate 2
    ## word combinations, such as correct.horse, 3 would be correct.horse.staple
    # The third value is the separator, such as . in the example above; to create words
    ## that run together enter nothing, or ''
    permutations(words, 2, '.')
To use this with JtR you would use:

    python xkcd.py | john --pipe mypasswd*
The code was taken from Python's itertools, so it should return:
r-length tuples, all possible orderings, no repeated elements
I wanted all of this, plus it doesn't hold the generated phrases in memory (only the word list itself is loaded; storing every phrase would run out quickly with a long list) and doesn't write to disk (although you can redirect the output if you want).
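Since the logic comes from itertools anyway, the same behavior can probably be had by calling itertools.permutations directly, which is also lazy. This is only a sketch: the script name phrases.py, the function names, and the command-line argument order are my own choices, not anything JtR or Crunch expects.

```python
import itertools
import sys


def read_words(path):
    """Return the non-empty lines of path with surrounding whitespace removed."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def phrases(words, count, sep=""):
    """Lazily yield every ordered selection of `count` distinct words, joined by `sep`."""
    for combo in itertools.permutations(words, count):
        yield sep.join(combo)


def main(argv):
    # Hypothetical usage: python3 phrases.py wordList.txt 2 .
    if len(argv) < 2:
        sys.stderr.write("usage: phrases.py WORDLIST COUNT [SEPARATOR]\n")
        return
    sep = argv[2] if len(argv) > 2 else ""
    for phrase in phrases(read_words(argv[0]), int(argv[1]), sep):
        print(phrase)


if __name__ == "__main__":
    main(sys.argv[1:])
```

Note that permutations never reuses a word within a phrase, so something like horsehorse is not generated. If repeats should be allowed, itertools.product(words, repeat=count) would be the call to swap in.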
Now, I have run into errors (IOError: [Errno 32] Broken pipe) with long runs and JtR. The code is sloppy, etc. So no, this isn't a perfect real world solution. However, as has been pointed out, this may not be practical in the real world even without errors, due to the number of possibilities. Sometimes I just want to know if something is possible though!
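For the broken pipe error specifically, one thing that may help is catching it where the output is written, so the script exits quietly when the reading side (here, john) closes the pipe first. This sketch assumes Python 3, where the error surfaces as BrokenPipeError; emit and quiet_stdout are names I made up. It only deals with the traceback and does not explain why the pipe closed.

```python
import os
import sys


def emit(lines, out):
    """Write each line to `out`; return False if the reader closed the pipe."""
    try:
        for line in lines:
            out.write(line + "\n")
        out.flush()  # force any pending EPIPE to surface inside this try block
    except BrokenPipeError:
        return False
    return True


def quiet_stdout():
    """Point stdout at os.devnull so the exit-time flush cannot raise again."""
    devnull = os.open(os.devnull, os.O_WRONLY)
    os.dup2(devnull, sys.stdout.fileno())


if __name__ == "__main__":
    demo = ("candidate%d" % i for i in range(5))  # stand-in for the real generator
    if not emit(demo, sys.stdout):
        quiet_stdout()
        sys.exit(1)
```

In the real script, the generator of phrases would take the place of demo.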
If anyone watching this knows C and wants to add something like this directly to JtR or Crunch, that would be amazing. I have a feeling it would be faster and much more reliable if it were built into a program designed for this kind of task.