42

I would like to stay out of the automatic filters put in place by security agencies and not be accidentally placed on a no-fly list or the like. Say I'm having a political debate with a friend about democracy and stuff, and terms like revolution, capitalism, freedom and such (oh, hello there NSA!) are thrown around a lot. Sending an encrypted email where normally I would send emails in plain text is a surefire way to trigger some of the filters, I assume.

Is there a way to encrypt emails so that the cypher text is hard for an algorithm to distinguish from regular (let's say spam, hard to get any more regular than spam) email?

For example:

Normal PGP cyphertext, easy to distinguish:

-----BEGIN PGP MESSAGE-----
Comment: GPGTools - http://gpgtools.org

hQIMA7t6lidYOUd0AQ//Z7y+/tvQQ0TRoOT0ydUwVjJZh5sLQOEVQNDHGEUjfvL9
7UJhtEaisVwlDsqTEqpa04FWzgehBBDnxgOUFcPB3xSGD9Bi61MItK6gm1phTnEn
hOezHmGqAyrCarofkYn5vpwPZtpSmRvpS9tykhRTKMlhsN5EOLvaDa8TsqMnqwGm
pPC8j219YG2U/OmRa96GTslMaDtIx6470Ea4fcJf2jdo3RlgLEc7BGQVcrOpHj/0
-----END PGP MESSAGE-----

Cyphertext, less easy to distinguish:

pen3s grow for cheap russion brides are looking for parntners in 
Detroid area visit our website now click to unsubscribe
Psychonaut
Rodin
  • 31
    This is called [steganography](http://en.wikipedia.org/wiki/Steganography). – CodesInChaos Nov 08 '13 at 13:56
  • 1
    ^ this. However, if you make your encrypted email look like spam, are you not going to lose it to filters for actual spam? One thing to bear in mind is that if you were to use some kind of steganography to hide email, the resultant message is likely to be much longer than the original PGP, perhaps suspiciously so. – Owen Nov 08 '13 at 16:00
  • 2
    PGP is so common that plain-text steganography is going to look more suspicious. You might be better off using image-based steganography. – Kevin Borders Nov 08 '13 at 18:49
  • 4
    And now everyone who posted to this thread IS on an NSA watch list..... oop, and now I am too. – squarecandy Nov 08 '13 at 22:25
  • 2
    PS - why do none of the answers here suggest **not using email** if you're concerned about security? – squarecandy Nov 08 '13 at 22:27
  • The question specifically asked about securing email, but I'm sure your point has been observed. – David Houde Nov 09 '13 at 02:46
  • Looks more like spamanography to me. – Michael Hampton Nov 11 '13 at 00:44
  • @Rodin: You know that as a result of using those forbidden words (rev*lution, etc.) in this posting, the NSA has already determined who you are and placed your name on the no-fly list :-). – Ralph Nov 13 '13 at 12:25
  • 1
    [Spamcryption](http://security.stackexchange.com/questions/74608/would-spam-mail-really-avoid-eavesdropping) –  Apr 29 '15 at 16:09

10 Answers

44

Say what you actually want to do is to make your encrypted email look like spam. OK, how to accomplish that?

One possible way would be to take the ciphertext and break it down into manageable chunks of, say, nine bits each. Using a set of dictionaries, these nine-bit quantities are mapped to one or more words in a target language (nine bits would require a dictionary of 512 words, which is feasible while at the same time providing variation). A Markov chain could be used to pick the next dictionary based on the word selected from the previous dictionary, which could likely be tuned so that the output resembles very poorly written text in the given language.

By tweaking the interaction between the two parts, the output of such a scheme could conceivably be anything from nonsense to semi-legible text (much like a lot of spam emails). And it'll be text, not binary data.
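To make the nine-bit idea concrete, here is a minimal sketch using a single dictionary and no Markov chain. The word list is a placeholder (a real one would hold 512 distinct, spam-like words), the function names are made up for illustration, and passing the original length to the decoder is my own assumption for dealing with the padding bits.

```python
def make_wordlist(size=512):
    # Placeholder vocabulary; a real list would contain 512 distinct words
    # from the target language.
    return [f"w{idx:03d}" for idx in range(size)]

def encode_nine_bit(data: bytes, words) -> str:
    # Turn the ciphertext into a bit string and pad it to a multiple of nine.
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % 9)
    # Each nine-bit chunk selects one of the 512 words.
    return " ".join(words[int(bits[i:i + 9], 2)] for i in range(0, len(bits), 9))

def decode_nine_bit(text: str, words, length: int) -> bytes:
    # Map every word back to its nine-bit index, then cut the bit string
    # into bytes, ignoring the padding beyond `length` bytes.
    index = {word: i for i, word in enumerate(words)}
    bits = "".join(f"{index[word]:09b}" for word in text.split())
    return bytes(int(bits[i:i + 8], 2) for i in range(0, length * 8, 8))

WORDS = make_wordlist()
```

Ten bytes of ciphertext come out as nine words, so the expansion depends entirely on how long the words in the dictionary are.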

An even simpler variant would be to simply encode the ciphertext using something like the PGP word list. The result of that will of course be complete and utter nonsense, but it'll probably pass the most simple statistical tests for a given target language.

Now that I've described these ideas, they are of course totally useless. You'll have to come up with something of your own. ;-)

Adi
user
  • 1
    You're correct. I was under the impression that the OP is looking to send _specific_ ciphertext for a _specific_ plaintext. – Adi Nov 08 '13 at 14:00
  • @Adnan Even that is easy, just use an OTP. Pick the key to give the ciphertext you prefer. Key distribution becomes a problem, however. ;) Btw, if you agree, I appreciate an upvote. :) – user Nov 08 '13 at 14:01
  • Then he will have the additional problem of his email being caught in some spam filter – SplashHit Nov 08 '13 at 16:06
  • http://www.spammimic.com/ has an implementation of the "encode as spam" idea (they also have one which encodes a message as whitespace) – OJW Nov 08 '13 at 18:13
  • @OJW [That has already been posted as an answer.](http://security.stackexchange.com/a/45168/2138) If you feel that is a better answer to this question, you should upvote that answer rather than commenting on an unrelated answer. It also appears from the answer that their service is considerably less sophisticated than what I had in mind when writing my answer. – user Nov 10 '13 at 22:11
10

You're asking for http://spammimic.com/ , which is a web site that does exactly that. They use a steganographic method for encoding bits using spam sentences.

The drawbacks of the spammimic implementation are severe, though. They're publicly known, so you can bet that anyone who might be interested in what flows through their site is already intercepting it. And their word and phrase list is static, so a single encoding is always recognizable as such. Next, their algorithm seems to encode 13-byte blocks into about 1 kB of spammish text, so it's highly inefficient. People don't read (or send) more than about a paragraph of spam, so a 10 kB block of spam text would be highly suspicious.

People trying to send more than a few words are disguising them in larger files, such as pictures or music files. Not that these can't also be spotted, but exchanging pictures with a friend is less suspicious than someone who exchanges large blocks of spam.

Finally, people are using "dead drops". Rather than email the secret plans in an image to their co-conspirator, they can post the image to any one of thousands of image hosting sites, or attached to a product review, or an eBay sale. That makes it a bit harder for someone to know which of the viewers of the page was the person who decoded it.

John Deters
  • 1
    A well-known service such as spammimic.com can always be improved very cheaply: take your text, compress it, scramble it, and use base64 if the service does not accept binary input. By then, almost any pattern tied to specific encoded words will have been destroyed. – Carlos Eugenio Thompson Pinzón Nov 08 '13 at 17:15
6

Say you substitute a set dictionary word for each character of ciphertext: you would then have a bunch of unrelated words in a very long message. While this wouldn't trip an algorithm looking for typical ciphertext, it probably would trip the algorithms that the extremely intelligent people employed by the security services have put in place to detect novel methods of encrypting communication, so you'd be far more likely to draw attention to yourself than if you just used something run-of-the-mill like GPG.

I think that your assumption that sending encrypted messages will automatically draw attention to yourself is probably wrong. Plenty of people send encrypted emails for perfectly normal reasons, and they don't suddenly get put on no-fly lists just because of that.

GdD
  • These filters have to deal with encrypted messages somehow. They cannot simply ignore all encrypted messages. They would employ some heuristics, like: "this person has never sent an encrypted message. Suddenly he does. Better bounce it upward," or something. Just as a mental exercise more than anything: how would one design one's email traffic so it generates the fewest hits from such algorithms? – Rodin Nov 08 '13 at 14:40
  • I'm sure they do, what of it though? I'm sure that happens a million times a day, it's doubtful it would make any difference to them at all. – GdD Nov 08 '13 at 14:42
  • If it happens a million times a day, probably a few times a day it is misclassified by the humans. It is problematic exactly because it happens so often... says my paranoid mind. – Rodin Nov 11 '13 at 09:10
6

Leo Marks devised a system for communicating in code that appeared to be innocuous looking plaintext. He describes it briefly in chapter 79 of Between Silk and Cyanide. Essentially, a large code book which maps characters to sentence fragments allows a short message to be encoded as a number of sentences. Given the pithy nature of the crap text that is thrown into SPAM, it would be relatively easy to generate a code book that would generate text that didn't raise eyebrows.

There are two drawbacks to this scheme:

  1. Size - a small amount of plaintext will generate a large volume of encoded text
  2. Code books -
    1. You need to generate a very large code book
    2. You need to share the code book with your friend without anyone else getting a copy
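A toy version of such a code book might look like the sketch below. The two-symbol book, the fragments and the function names are all invented for illustration, and the decoder assumes every fragment is unique and ends in exactly one full stop. Giving each symbol several interchangeable fragments is what keeps repeated symbols from producing repeated sentences.

```python
import random
import re

# Hypothetical toy code book: each symbol owns several interchangeable
# fragments. Every fragment is assumed to be unique and to end in exactly
# one full stop, which is what lets the decoder split the text again.
CODE_BOOK = {
    "0": ["Act now.", "Limited time offer."],
    "1": ["Click here to unsubscribe.", "No hidden fees."],
}

def encode_with_book(symbols, book, rng=random):
    # Pick one of the fragments for each symbol at random.
    return " ".join(rng.choice(book[s]) for s in symbols)

def decode_with_book(text, book):
    # Invert the book, split the text into sentences, look each one up.
    reverse = {frag: sym for sym, frags in book.items() for frag in frags}
    fragments = [m.strip() for m in re.findall(r"[^.]+\.", text)]
    return "".join(reverse[f] for f in fragments)
```

A real book would need an entry for every possible symbol of the input, which is where the size problem in drawback 2 comes from.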

Essentially, this is just another substitution cipher (strictly speaking a code, since it maps characters to sentence fragments rather than rearranging them), and as such it's vulnerable to various attacks. If you use it to encode ciphertext output by a modern algorithm, maybe the pair meets your requirements for both confidentiality and secrecy - although, as Adnan points out, protecting your traffic against a sufficiently skilled and capable opponent such as the NSA has all sorts of pitfalls.

gowenfawr
  • If the NSA is after you specifically, you're almost certainly screwed, I agree. But fooling a filter that monitors large swaths of email might be possible. Such a filter cannot apply sophisticated attacks against every message that passes through it, can it? – Rodin Nov 08 '13 at 14:42
4

There are two sides of this. One is the encryption, and the other is hiding the encryption in spam (steganography). Both are solved problems, so yes, this is doable today using off-the-shelf components.

Note that steganography is not encryption. If you want the security of both, you should do both.

First of all, the encryption. We'll assume you can do that already. The result is a string of bits, which you'll now bury in spam.

The next step will necessarily require a certain amount of secrecy. Since you're hiding the message "in plain sight", if your attacker knows your encoding mechanism, he may be able to detect encoded messages. But let's compose an algorithm right here:

Step 1: Acquire spammy content
You could generate this algorithmically using Markov chains, or you could capture inbound spam, or some other similar source.

Step 2: Subtly modify the content in a difficult-to-detect way
This will depend somewhat on your spammy source. But one technique might be word capitalization. Convert the inbound spam to lowercase, and then, moving through your ciphertext one bit at a time, capitalize the next word if the bit is a 1, or leave it lowercase if it's a zero.

The more space-efficient your steganographic algorithm, the easier it is to detect. Capitalizing the first letter of each word is less obvious than capitalizing each individual letter. Capitalizing a letter for each sentence is even less obvious. Adding or omitting punctuation might be another tactic.

If you and the receiving party both have a copy of the source spam, you could subtly modify the contents to indicate bit positions. For example, you could add otherwise unnecessary words, omit sentences, alter punctuation, or other similar techniques.

Step 3: Decode the message
This is as simple as reversing the technique applied in step 2. For example, test the capitalization of each word, and record a 1 or 0 based on the result.

Step 4: Decrypt the message
Now that you have your ciphertext bitstream back, decrypt it using traditional techniques.
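As a rough illustration of steps 2 and 3 with the word-capitalization technique, here is a minimal sketch. The function names are made up, bits are passed as a string of "0" and "1" characters, and I'm assuming an ASCII cover text where only words starting with a letter can carry a bit (something like "50%" is skipped). The receiver also has to know how many bits to read, which this sketch takes as a parameter.

```python
def is_carrier(word: str) -> bool:
    # Only words that start with an ASCII letter can be capitalized.
    return word[0].isascii() and word[0].isalpha()

def embed_bits(cover: str, bits: str) -> str:
    # Lowercase the cover text, then capitalize one carrier word per 1 bit.
    pending = iter(bits)
    out = []
    for word in cover.lower().split():
        if is_carrier(word) and next(pending, None) == "1":
            word = word[0].upper() + word[1:]
        out.append(word)
    if next(pending, None) is not None:
        raise ValueError("cover text too short for this many bits")
    return " ".join(out)

def extract_bits(stego: str, n_bits: int) -> str:
    # Read the capitalization of the first n_bits carrier words.
    carriers = [w for w in stego.split() if is_carrier(w)]
    return "".join("1" if w[0].isupper() else "0" for w in carriers[:n_bits])
```

For example, embedding `1011001` into `buy cheap watches now and save 50% on every order today` gives `Buy cheap Watches Now and save 50% On every order today`. At one bit per word this is very low capacity, which is the trade-off described above.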

tylerl
3

Probably the best way to do this is steganography, i.e. the process of hiding something within something else. This can be done by twiddling unimportant bits of files or even by hiding entire documents within images etc. There are loads of freeware steganography tools around which can give you the gist of it (particularly if you would like to build your own).

I would highly recommend looking into it, it's a hidden world out there ;)

user
daark
2

You can do this, but only if you change the method of encryption.

As @Adnan mentioned, it's practically impossible to find a key that spits out English ciphertext for a given plaintext.

However, you could create a method of encoding where you map the ciphertext to English words. Basically, find a way to associate numbers with words (for example, take a database of, say, 65536 words, and map each word to one of the 65536 possible pairs of bytes).

However, the recipients will have to know that you are using this obfuscation and will need to be provided with your database for decoding it.

Manishearth
1

Something that is both clever and easy is to send a true colour bitmap image where you have changed the last bit of each byte (apart from metadata) to hold a binary message.

I made something like this myself, for an AS Computer Science project. It's not very difficult, and it hides the data completely. Encoding and decoding is fast (I used a rather inefficient Python solution that is capable of writing thousands of characters to a single image in less than a second) and having images in an email won't trigger any algorithms.

Simply storing (encoded) ASCII characters allows large amounts of plain text to be hidden in a single image. The image is not visibly different (as true colour, by definition, has no visible change from changing the least significant bit), and so will appear completely innocent, especially if it's simply of you/your family, or maybe a cat doing something funny.

As long as you can approximate an even distribution of bits in the encoded text you want to send, it will appear to be completely innocuous. This is also why a personal image is good - it will (normally) not be found elsewhere on the Internet, and so cannot be compared to find the hidden ciphertext. To evenly distribute the bits, use a dictionary file.
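A minimal sketch of the least-significant-bit idea is shown below. It assumes the pixel data has already been pulled out of the image as raw bytes (for instance the pixel array of an uncompressed 24-bit bitmap); parsing the file format is left out, the function names are made up, and the receiver is assumed to know the message length. It does nothing about the statistical properties of the modified bits.

```python
def embed_lsb(pixels: bytes, message: bytes) -> bytes:
    # Flatten the message into bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("image too small for this message")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit of the colour byte, then set it to the message bit.
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)

def extract_lsb(pixels: bytes, n_bytes: int) -> bytes:
    # Collect the lowest bit of the first n_bytes * 8 colour bytes
    # and pack every eight of them back into one byte.
    bits = [p & 1 for p in pixels[:n_bytes * 8]]
    return bytes(
        sum(bit << (7 - k) for k, bit in enumerate(bits[i:i + 8]))
        for i in range(0, len(bits), 8)
    )
```

Each colour byte changes by at most one, which is why the result looks identical to the eye. Capacity is one message byte per eight colour bytes.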

Dakeyras
  • Easy to do, but also easy to detect. When it comes to steganography, the two tend to go together. – Gilles 'SO- stop being evil' Nov 08 '13 at 19:04
  • 1
    Impossible to detect by looking, but that's not the point of this method. It's not going to trigger any algorithms searching for ciphertext - THAT'S what's important in this case. – Dakeyras Nov 08 '13 at 19:27
  • But it's going to trigger algorithms searching for steganography. Why isn't that important? – Gilles 'SO- stop being evil' Nov 08 '13 at 19:29
  • @Gilles It's unlikely to do so if properly encoded - it will appear to be a normal image file with the end bit of each byte *appearing* random. This shouldn't raise any flags, especially with the likely volume being processed. – Dakeyras Nov 08 '13 at 19:44
  • “It will appear to be a normal image file with the end bit of each byte appearing random” — yes, exactly. That's as much of a red flag as a bunch of random bytes or a bunch of random letters. – Gilles 'SO- stop being evil' Nov 08 '13 at 19:49
  • @Gilles The random byte endings is exactly what a normal image file looks like. I've tried to 'read' a normal image file before, and it looked very much like the ciphertext example above. Therefore the image file won't stand out. – Dakeyras Nov 08 '13 at 22:01
  • 1
    @Dakeyras Except it doesn't. The least significant bit of most images doesn't follow a uniform distribution. It might be imperceptible to the human eye, but an algorithm checking the statistical properties of the LSB of each pixel will detect that bias instantly. This method is popular, and there **will** be such an algorithm somewhere along the way. – Thomas Nov 08 '13 at 23:18
  • @Thomas A good encoding choice for the text will average out the statistical properties, and then the issue is moot. – Dakeyras Nov 08 '13 at 23:21
  • 2
    @Dakeyras Have you tried it? Encoding data to approximate the statistical properties of a legitimate carrier is ultimately what steganography is all about, you make it sound like a solved problem, but in reality this is hard to achieve and you have to work hard to reach a point where only a determined analyst would not be fooled. – Thomas Nov 08 '13 at 23:29
  • 1
    @Dakeyras So you can create a cryptosystem that you can't break. [That's the easy part](https://www.schneier.com/crypto-gram-9810.html#cipherdesign) (known as [Schneier's law](https://www.schneier.com/blog/archives/2011/04/schneiers_law.html)). The difficult part is making something that others can't break. Just because *you* don't see the patterns doesn't mean that others are equally blind. – Gilles 'SO- stop being evil' Nov 09 '13 at 07:45
0

I think that your best bet would be not to store the ciphertext inside the email message, but to do one of the following:

  • Find a plausible explanation for sending a lot of images to each other, and use steganography to hide the ciphertext inside them (there are simple off-the-shelf solutions).
  • You can do the same if you send a lot of scanned text to each other (like scientific articles).
  • Find a plausible explanation for sending encrypted attachments --- in Poland, one would be that you are sending some kind of personal data to each other.
  • Store the ciphertext outside of the message, for example in the contents of a torrent file, or on some kind of free pasting service --- and send only a link to that resource. These links could also be encoded using steganography --- in this case you'd still win because you'd have to hide a much smaller amount of text inside the email (for example using the method @gowenfawr proposed).
jb.
0

As pointed out by other posts in this thread, there are ways of hiding your encrypted text. Unfortunately, as also mentioned in this thread, such techniques have considerable drawbacks. However, I think your initial premise is possibly overstated. While it might be true that agencies such as the NSA are collecting lots of data and even doing some very sophisticated sorting and classification of the collected data, this is not the same thing as using or applying the data to do things such as put people on no-fly lists etc. The real problem with the type of data collection which is occurring mainly comes into play when other events occur. For example, you do something else which brings you to the attention of the authorities, such as being arrested for some crime, running for political office, getting involved in political activism etc. When these types of events occur, agencies such as the NSA are likely to search through their data and build up a profile based on what they have collected. Prior to this, the data they have gathered is likely to just sit in their data repositories and never get looked at by a human. The problem with this is that it is very difficult to know which events or actions of yours are likely to raise their attention. It should be noted that when you do raise their attention, email is likely to be one of many data sources they use - they will likely also have loads of data on your financial transactions, have traced your movements via your credit card use, developed detailed profiles based on your purchases, movements, telephone conversations, contact lists etc. etc.

All the information which has come to light on what various government agencies around the world have been up to is very disturbing. However, a negative aspect of this knowledge has been to make people quite paranoid. Perhaps we need to be paranoid. However, far too many people are concerned about what people put in their email and who has access to it. The reality is, if you are concerned about people knowing what you put in your email messages, don't use email. In fact, if you are concerned that the NSA is likely to put you on a no-fly list because of what you have put in your email messages, you should be even more concerned over what they will do because you asked this question on this site. It is far more likely that if they are monitoring at the level you are concerned about, this thread and post will have done more 'damage' than anything you send to a friend in an email message.

Tim X
  • Although not an answer to my question, I enjoyed reading this. Your point is clear: when an intelligence agency focuses its attention on you, it will probably succeed in obtaining the data it wants. This was never questioned by me. Or anyone. However, prior to this, the data they have gathered does NOT just sit in their data repositories. It is analyzed by scripts that try to point out red flags. It is indeed never looked at by a human, so we're not trying to fool a human here. We're trying to fool the script into not sending out red flags. – Rodin Dec 06 '13 at 10:29