24

As a programming exercise I need to decrypt a message. The only clues I have are:

  • The encoded message contains only Base64 characters.
  • A sequence of n letters (n in 1, 4, 7, 10) returns an encrypted message ending in "==".
  • A sequence of n letters (n in 2, 5, 8, 11) returns an encrypted message ending in "=".
  • A sequence of n letters (n in 3, 6, 9, 12) returns an encrypted message with no "=" at the end.

I do not want a solution; I am just wondering whether this pattern of equals signs provides a clue or not.

bad_coder
  • 129
  • 4
czioutas
  • 429
  • 1
  • 3
  • 8
  • Try not just repeating characters - try a sequence like "a", "ab", "abc". You should see a pattern in the = appearing. – Matthew Nov 17 '15 at 09:26
  • True, any sequence of characters repeated n times gives the above results. I will update the question – czioutas Nov 17 '15 at 09:28
  • @Matthew but is that the clue then? – czioutas Nov 17 '15 at 09:30
  • OK, don't think I made the point quite as clear as it could have been! The = signs relate to the length of the string being encoded in base64. See https://en.wikipedia.org/wiki/Base64 – Matthew Nov 17 '15 at 09:30
  • @Matthew The '==' sequence indicates that the last group contained only one byte, and '=' indicates that it contained two bytes. I get it, but does this tell us anything else except that it's Base64? – czioutas Nov 17 '15 at 09:52
  • 1
    Not really. But you can decode base64 trivially, which should give you more information about any actual encryption (especially if it is character based, rather than byte based) – Matthew Nov 17 '15 at 09:56
  • I did decode it using an online decoder but the output is "gibberish" and not in formats like UTF-8 or ASCII. So I am getting confused. But you should post as an answer that the clue here is the base 64? – czioutas Nov 17 '15 at 10:00
  • This reminds me of a StackOverflow post where a guy was trying to get help decoding a Base64 string for a job interview question on a Greek car website. – JPhi1618 Nov 17 '15 at 15:47
  • lol, I am Greek, but I am doing this as practice. – czioutas Nov 17 '15 at 15:49

3 Answers

54

The = signs relate to the length of the string being encoded in Base64. Essentially, in probably the most common form of Base64, = is used as a padding character to ensure that the last block can be decoded properly.

Base64 is not encryption - there is no hiding going on in it - but is often used to allow for binary data to be sent in text only form. All the characters used in Base64 will paste correctly, and can be entered using a keyboard with no modifier keys beyond shift.
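The link between input length and padding can be checked directly. The following is a minimal sketch using Python's standard `base64` module; the helper name `padding_count` is made up for this illustration, and it assumes the common padded Base64 variant described above.

```python
import base64


def padding_count(data: bytes) -> int:
    """Return how many '=' characters end the standard Base64 encoding of data."""
    encoded = base64.b64encode(data).decode("ascii")
    return len(encoded) - len(encoded.rstrip("="))


# Print the encoding and padding count for inputs of 1 to 12 bytes.
for n in range(1, 13):
    sample = b"a" * n
    print(n, base64.b64encode(sample).decode("ascii"), padding_count(sample))
```

Lengths 1, 4, 7, 10 give two padding characters, lengths 2, 5, 8, 11 give one, and multiples of 3 give none, which matches the pattern in the question and depends only on the input length, not on its content.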

ThoriumBR
  • 50,648
  • 13
  • 127
  • 142
Matthew
  • 27,233
  • 7
  • 87
  • 101
  • 3
    Worth noting that in base64 encoding, the padding isn't actually necessary. As the asker noted, it's purely a product of the total length of the message. It provides a small amount of error detection, but if it's dropped it can always be added back just before decoding. – glibdud Nov 17 '15 at 13:28
  • 3
    @glibdud Whether it can be decoded without the padding depends on the implementation. – kasperd Nov 17 '15 at 15:28
  • 3
    @kasperd Whether it can be decoded with the padding depends on the implementation too. – Paul Nov 17 '15 at 19:11
  • It also makes it clear the message is a base64 message and not, for instance, an md5 hash or something. – corsiKa Nov 17 '15 at 19:29
  • 3
    @corsiKa It's fairly obvious anyway - base64 uses 64 distinct symbols. Most hashes return a hex string, with just 16 distinct symbols. Hashes also tend to be fixed length, while base 64 strings can be pretty much any length – Matthew Nov 17 '15 at 22:38
  • Not really, @corsiKa, because if input length in bytes is divisible by 3 there will be no padding. – transistor09 Nov 18 '15 at 16:52
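The point made in the comments, that dropped padding can be added back before decoding, can be sketched as follows. This is only an illustration: the helper names `restore_padding` and `decode_unpadded` are hypothetical, and it assumes the text is otherwise valid standard Base64 (a length that leaves a remainder of 1 when divided by 4 can never be valid and will still fail to decode).

```python
import base64


def restore_padding(text: str) -> str:
    """Append '=' until the length is a multiple of 4."""
    return text + "=" * (-len(text) % 4)


def decode_unpadded(text: str) -> bytes:
    """Decode Base64 text whose trailing '=' characters may have been stripped."""
    return base64.b64decode(restore_padding(text))


print(decode_unpadded("YWI"))   # padding restored to "YWI=" before decoding
print(decode_unpadded("YWJj"))  # already a multiple of 4, nothing added
```

Whether a given decoder needs this step depends on the implementation, as the comments note; Python's `base64.b64decode` rejects input with missing padding.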
3

As mentioned above, Base64 is not encryption but an encoding. As you can see in the RFC that specifies the standard, Base64 works the following way.

  • You have a stream of characters s, of length n.
  • You read 3 8-bit values from the stream (now you have a total of 24 bits = 3 bytes)
  • You break these 24 bits into 4 groups of 6 bits each
  • Using the table of the Base 64 alphabet, you encode each of the 6-bit groups to the Base64 equivalent

Now, there is a chance that you reach the end of the stream without a full 24-bit group (n mod 3 != 0). If this happens, you add zero bits to the end of your input until you have a whole number of 6-bit groups.

Given that your input stream is ASCII encoded, and so composed of 8-bit characters, there are only two cases in which you end up in the above scenario.

  1. You have 8 bits in the last group

  2. You have 16 bits in the last group

In the first case, 4 zero bits are added (giving you 12 bits) and the output is two characters (2 * 6 bits = 12 bits) encoded using the alphabet, followed by two "=" padding characters.

In the second case, 2 zero bits are added (giving you a total of 18 bits) and the output is three characters (3 * 6 bits = 18 bits) followed by one "=" padding character.

That's how you sometimes end up with one, two, or no "=" at the end of the encoded text. For more information you should really read the RFC that defines the standard and the Wikipedia entry related to it.
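The steps above can be sketched in a few lines of Python. This is an illustrative, unoptimised encoder written for this explanation, not the reference implementation from the RFC; the names `ALPHABET` and `encode_base64` are made up here, and the standard alphabet (`A-Z`, `a-z`, `0-9`, `+`, `/`) is assumed.

```python
import base64

# Standard Base64 alphabet: index 0-63 maps to one output character.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_base64(data: bytes) -> str:
    """Encode bytes by splitting each 3-byte group into 6-bit values."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]  # 1, 2, or 3 bytes
        bits = "".join(f"{byte:08b}" for byte in chunk)
        # Add zero bits until the length is a multiple of 6
        # (4 zeros for 8 bits, 2 zeros for 16 bits, none for 24 bits).
        bits += "0" * (-len(bits) % 6)
        for j in range(0, len(bits), 6):
            out.append(ALPHABET[int(bits[j:j + 6], 2)])
        # One '=' for every byte missing from the final group.
        out.append("=" * (3 - len(chunk)))
    return "".join(out)


for sample in (b"a", b"ab", b"abc"):
    # The second column is the standard library's result, for comparison.
    print(sample, encode_base64(sample), base64.b64encode(sample).decode("ascii"))
```

A one-byte final group produces two characters plus "==", and a two-byte final group produces three characters plus "=", as described in the two cases above.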

G. Kaklam.
  • 161
  • 2
  • Um, the arithmetic seems to be off here. The residue can be 0, 1, 2, or 3 groups of 6-bit fragments, corresponding to 0, 1, 2, or 3 bytes of padding to make the output an even multiple of 4 output bytes. – tripleee Oct 27 '16 at 12:35
1

There are several other encoding systems that use the = sign. Some include:

ESAB46 (BASE64 backwards)

ATOM128

MEGAN35

FERON74

As has already been stated, the = is filler that tells the decoder the length. This is exactly why cryptologists are hunting for a better way to do buffering because that = is a dead giveaway.

Chad Baxter
  • 632
  • 4
  • 8
  • Maybe it's one of those, because up until now I only get gibberish for the decrypted message. Anyway, nice answer – czioutas Nov 17 '15 at 12:52
  • 5
    Could just mean that it is binary data - for example, `X03MO1qnZdYdgyfeuILPmQ==` decodes as gibberish, but is actually the md5 hash of `password`, in binary representation, rather than the normal hex. – Matthew Nov 17 '15 at 13:48
  • 1
    None of the online md5 decrypters recognise the string as md5. – czioutas Nov 17 '15 at 14:42
  • That was more just an example that I could generate quickly - you can't decrypt MD5, so it would be a very unfair exercise. – Matthew Nov 17 '15 at 14:54
  • 2
    @drakoumelitos md5 is not an encryption method, so by definition it cannot be decrypted. It's a hash function, and it's meant to go only one way. – nanny Nov 17 '15 at 15:46
  • for example http://www.hashkiller.co.uk/md5-decrypter.aspx contains hashes of md5, therefore it can decrypt them. The example md5 you gave me was solved. @Matthew – czioutas Nov 17 '15 at 15:52
  • 2
    Technically, it's not decrypting. It is performing a lookup in a database table - if it knows the input, it will have a matching decrypted string. If it doesn't have the given hash in the database though, all it can do is try other inputs. An encryption algorithm would be reversible given a key. – Matthew Nov 17 '15 at 15:56
  • 17
    "cryptologists are hunting for a better way to do buffering because that = is a dead giveaway" Why on earth do cryptologists care about base64? The purpose of base64 isn't encryption at all... It's literally only data encoding. "Encoding" sounds like a fancy word, but has nothing to do with security. – Cruncher Nov 17 '15 at 16:26
  • 2
    @Matthew Also, what he provided is not the sole "decryption" of that md5. There's an infinite number of inputs that produce that hash. It's simply not a 1 to 1 function, whereas encryption is. – Cruncher Nov 17 '15 at 16:29
  • @Cruncher Yep - I hinted at that with "a decrypted string", but it could have been clearer. Technically, it should have been "an input string resulting in the provided hash" – Matthew Nov 17 '15 at 16:34
  • "looking for a better way" -- citation please? – Yakk Nov 18 '15 at 16:24
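The example from the comments, where a Base64 string decodes to "gibberish" because it holds raw binary data, can be reproduced with a short sketch. It uses only Python's standard `base64` and `hashlib` modules; the variable names are made up for this illustration.

```python
import base64
import hashlib

encoded = "X03MO1qnZdYdgyfeuILPmQ=="

# Decoding yields 16 raw bytes, most of which are not printable text.
raw = base64.b64decode(encoded)
print(len(raw), raw)

# Viewed as hex, the same bytes look like a familiar MD5 digest.
print(raw.hex())

# Hashing the word from the comment and encoding the binary digest
# gives back the original Base64 string.
digest = hashlib.md5(b"password").digest()
print(base64.b64encode(digest).decode("ascii") == encoded)
```

Unreadable output after Base64 decoding therefore does not by itself say which encoding or cipher was used; it only shows that the underlying data is not plain text.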