An Affinity for Ciphers

7

Description

You, an armchair cryptanalyst, have noticed that the first letters of each word in your neighbor's mail (yes, you read it -- how else would you know if he's plotting something?) look very suspicious put together. To make sure that his mail is safe to send to its destination, you need to first decode these messages. You could check one or two, but your neighbor has many correspondences, and so you determine the best solution is to write a program to do the decoding for you.

Because you're sure that this neighbor is up to no good, you make your program assuming that most of the encoded words are actual words in English. You know that the backups of your backups of your backups are taking up a lot of room on your single drive (paranoia doesn't necessitate intelligence), and so you want to write this program in as little space as possible to minimize the space it takes up when you back it up thrice (it is important, after all).


Description

You are to output what you think is the solution to a given Affine Cipher. In short, Affine ciphers are encoded by the following formula for each character in the plaintext: C ≡ aP + b (mod 26) where a and b are constants, C is the ciphertext letter, and P is the plaintext letter. In this case, the letter "a" has the value of 0, "b" 1, and so on and so forth.

Rules

  1. With reference to my previous equation, in the ciphers you will be decoding a will only be a number coprime with 26.

  2. For testing the validity of decoded words, you may use a dictionary of your choosing (note that "i" and "a" should be words listed; the validity of the other letters do not matter, but I recommend making only "i" and "a" valid).

  3. If you choose to use a dictionary, you must be able to read from a text file and use it as a dictionary (i.e. you shouldn't be restricted to one dictionary). You may accomplish this by taking input of a filename and reading that file or hard-coding in a filename. This is so that every program can use the same dictionary for the test cases. This dictionary will be a text file with a word on each newline, all in lowercase.

  4. In each message, if you are using a dictionary, you should account for three words not matching a word in English; any more and the decoding is wrong. If you are not using a dictionary, this number should be constant but may be different (so long as you obtain correct answers for the test cases). If there is no output with fewer than three wrong words, you are not to give an output.

  5. When you output your decoding, you should output the one you think is most correct (i.e. the one that has the most English words).

  6. If you do not use a dictionary, claim a -100 byte bonus on your submission.

  7. When testing words to see if they are valid, check only contiguous alphabetical characters (e.g. you'd test #affineciphers as one word but pa$$word as two: pa and word). Words will not necessarily be of one case.

  8. Your program must work for all of the test cases, but not be hardcoded for them. In theory, if I add another test case (one within reason), your program should be able to decode it.

  9. Though it would be nice if you did so, optimizing for speed is unnecessary and will net you no bonuses.

  10. Although I would like to test your program, it does not have to have a free or online interpreter.

  11. If you do not use a dictionary, you may not read external files (i.e. the logic used to determine whether a word is a word in the English language or not must be contained within the program).

Sample encoding

Here's a sample of encoding, done with simple numbers (a = 3, b = 2)

foobar

First, we express our input numerically

6, 15, 15, 2, 0, 18

Then we multiply by a

18, 45, 45, 12, 0, 54

Optionally reduce to least positive residue

18, 19, 19, 12, 0, 2

Add b

20, 21, 21, 18, 2, 4

Now we reduce to least positive residue (there is no reduction to do for this case, though) and convert to text

tyyrbd

Input

You should be able to handle any characters, but the only encoded characters will be alphabetical ones (both capital and lowercase).

Output

You should output what your program thinks to be the decoded cipher, with the case matching that of the input and any non-alphabetical character remaining (i.e. don't change the case and strip the input of non-alphabetical characters when you output)


Test Cases

Input followed by expected output. Here is the dictionary you are to use (if you are using one). Save it as dictionary.txt (or whatever you want, really), and input its name for testing. This dictionary should have almost all the words in the test cases, with some left out intentionally to test if your program will still output.

Test Case 1:

@nLWc@ wc! m neCN.

@tHIs@ is! a teST.

Test Case 2:

Yeq lkcv bdk xcgk ez bdk uexlv'm qplqgwskmb.

You lead the race of the world's unluckiest.

Test Case 3:

Ugifc ism ozsmvd kpync g fssw st ryceco sb estbcnny.

Maybe you should write a book on pieces of confetti.

Test Case 4:

Nh wrteq tfwrx, tgy t yezvxh gjnmgrxx stpgx / Nh xrgxr, tx qwzjdw zk wrnuzfb p wty yejgb, / Ze rnsqpry xznr yjuu zsptqr qz qwr yetpgx / Zgr npgjqr stxq, tgy Urqwr-vteyx wty xjgb.

My heart aches, and a drowsy numbness pains / My sense, as though of hemlock I had drunk, / Or emptied some dull opiate to the drains / One minute past, and Lethe-wards had sunk.

Notes

The -100 byte bonus is subject to increase (i.e. more bytes will be subtracted) if I deem it to be too low. If it's not low enough (i.e. it's too much of a bonus), I will not change it provided that there are answers that have claimed it already.

I will add an ungolfed reference solution (to show how I would attempt the problem) if this question doesn't get any answers soon or if there is popular demand (let me know if you would like a reference solution).

cole

Posted 2015-10-13T02:41:06.157

Reputation: 3 526

When you say dictionary, do you mean polygrams? – Addison Crump – 2015-10-13T02:48:34.007

You don't necessarily need a dictionary of words. Do you just mean that we cannot use a source that has to do with the English language that may provide help to finding an answer? – Addison Crump – 2015-10-13T02:53:49.633

You need to clarify what you mean by dictionary - do you mean like a Webster dictionary, or commonalities of the language as a whole? – Addison Crump – 2015-10-13T03:02:46.833

@VTCAKAVSMoACE I think I understand some more; I apologize that I lack further knowledge in this (I'm just learning about cryptanalysis). I will allow it and grant it the bonus if it's self-contained in the code. To obtain the bonus you must not read anything from external sources (that's it); if you choose not to get it you must read a text dictionary. Does that clarify things? – cole – 2015-10-13T03:07:51.647

Absolutely. You might want to say "external sources" instead of "dictionary". – Addison Crump – 2015-10-13T03:09:21.337

Answers

1

Java 8, 608

import java.nio.file.*;import java.util.*;import java.util.function.*;s->{try{List<String>d=Files.readAllLines(Paths.get("d"));int v=java.util.stream.IntStream.range(0,676).mapToObj(I->new int[]{I,(int)Arrays.stream(s.toLowerCase().split("[^a-z]+")).map(u->{char[]x=new char[u.length()];for(int i=x.length;i-->0;)x[i]=(char)(((u.charAt(i)-'a')*(I%26)+I/26)%26+'a');return new String(x);}).filter(d::contains).count()}).max((m,n)->m[1]-n[1]).get()[0],a=v%26,b=v/26;s.chars().forEach(h->System.out.print((char)(h>='a'&&h<='z'?((h-'a')*a+b)%26+'a':h>='A'&&h<='Z'?((h-'A')*a+b)%26+'A':h)));}catch(Exception e){}}

This is a bunch of imports, and then a lambda expression. Runnable, readable code follows:

import java.nio.file.*;
import java.util.*;
import java.util.function.*;

public class C {

    static final Consumer<String> f = s -> {
        try {
            List<String> d = Files.readAllLines(Paths.get("d"));
            int v = java.util.stream.IntStream.range(0, 676).mapToObj(I -> new int[]{I, (int) Arrays.stream(s.toLowerCase().split("[^a-z]+")).map(u -> {
                char[] x = new char[u.length()];
                for (int i = x.length; i-- > 0;) {
                    x[i] = (char) (((u.charAt(i) - 'a') * (I % 26) + I / 26) % 26 + 'a');
                }
                return new String(x);
            }).filter(d::contains).count()}).max((m, n) -> m[1] - n[1]).get()[0],
                a = v % 26,
                b = v / 26;
            s.chars().forEach(h -> System.out.print((char) (h >= 'a' && h <= 'z' ? ((h - 'a') * a + b) % 26 + 'a' : h >= 'A' && h <= 'Z' ? ((h - 'A') * a + b) % 26 + 'A' : h)));
        } catch (Exception e) {
        }
    };

    public static void main(String[] args) {
        f.accept("@nLWc@ wc! m neCN.");
    }
}

The dictionary should be stored in a file named "d", with no extension. One word per line.

Ypnypn

Posted 2015-10-13T02:41:06.157

Reputation: 10 485

Looks good; I realize now that asking for input of a filename is somewhat unnecessary, so I'll change the rule regarding that. – cole – 2015-10-13T15:57:09.363