Aes256-cbc encoding/decrypting error

There are multiple ways to represent non-ASCII characters as bytes – e.g. the letter "š" becomes {c5 a1} when encoded using UTF-8 (which JavaScript code generally uses), but it could also be {f0} in ISO-8859-13 or Windows-1257, or {61 01} in UTF-16LE.
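
You can check these byte values yourself in Python 3. This is only an illustrative sketch: the variable names are mine, and the letter is written as a \u escape so that the result does not depend on how the file itself was saved.

```python
# The letter "š" (U+0161), written as an escape so the source stays pure ASCII.
letter = "\u0161"

# One character, four encodings, three different byte sequences.
encodings = ["utf-8", "iso-8859-13", "cp1257", "utf-16-le"]
encoded = {name: letter.encode(name) for name in encodings}

for name in encodings:
    # bytes.hex() prints the raw byte values, e.g. "c5a1" for UTF-8
    print(name, encoded[name].hex())
```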

So you need to make sure you're always using the same text encoding for passphrases (ideally UTF-8). How to do it depends on the programming language as well as the encryption library. Some APIs require the passphrase to be supplied as a byte-array for exactly this reason – to force the developer to select a specific encoding.
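
To see why this matters, here is a rough sketch of how the same passphrase text turns into two unrelated keys. It assumes a single-pass MD5 derivation in the style of what openssl enc -md md5 does; the function name, the all-zero salt and the sizes are placeholders of mine, not values from your file.

```python
import hashlib

def evp_bytes_to_key(passphrase, salt, key_size=32, iv_size=16):
    """Derive key and IV by chaining MD5 over (previous block + passphrase + salt)."""
    buf = b""
    block = b""
    while len(buf) < key_size + iv_size:
        block = hashlib.md5(block + passphrase + salt).digest()
        buf += block
    return buf[:key_size], buf[key_size:key_size + iv_size]

salt = bytes(8)            # made-up all-zero salt, for illustration only
text = "password\u00a3"    # "password£"

# Same text, two encodings: the KDF only ever sees the bytes.
key_utf8, _ = evp_bytes_to_key(text.encode("utf-8"), salt)
key_latin1, _ = evp_bytes_to_key(text.encode("iso-8859-1"), salt)

print(key_utf8.hex())
print(key_latin1.hex())
```

The two printed keys have nothing in common, so a file encrypted with one will only produce garbage (or a padding error) when decrypted with the other.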

When specifying passphrases with accented characters directly inside source code files (.py, etc.), they're encoded into bytes by your text editor, so make sure you know what encoding it uses, and try to use UTF-8 whenever possible. If that's not possible, write the accented characters using \x or \u escapes instead. For example (Py2/Py3):

passphrase = u"password£".encode("utf-8")

passphrase = u"password\u00A3".encode("utf-8")

passphrase = b"password\xC2\xA3"        # byte array – already encoded

In some languages, the compiler/interpreter will again decode the source file, so make sure it knows what encoding was used by your editor (e.g. in Python add a # encoding: utf-8 line at the top).

When working directly on the command line, the encoding from keypresses to bytes is done by your terminal app, so make sure it is in UTF-8 mode. The command-line shell (bash) should also have $LANG telling it to use UTF-8. (All programs running inside the terminal already receive a series of bytes; they have no control over the encoding that the terminal used.)
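
If you are unsure what your environment is set to, Python 3 can report which encodings it picked up from the terminal and the locale. This is a small diagnostic sketch; the helper name is mine, and the values it prints depend entirely on your system.

```python
import locale
import sys

def describe_text_encodings():
    """Collect the encodings Python will use for terminal and OS text."""
    return {
        # May be None if the stream was replaced or is not a text stream
        "stdin": getattr(sys.stdin, "encoding", None),
        "stdout": getattr(sys.stdout, "encoding", None),
        # Derived from the locale settings (LANG / LC_* on Unix-like systems)
        "preferred": locale.getpreferredencoding(False),
        "filesystem": sys.getfilesystemencoding(),
    }

for name, value in describe_text_encodings().items():
    print(name, value)
```

Anything other than UTF-8 here is a hint that passphrases typed into the terminal may reach your program as different bytes than you expect.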

If in doubt, try sending the passphrase to a "hexdump" tool like hexdump, hd or xxd:

Good (UTF-8):

$ echo -n password£á | hexdump -C
00000000  70 61 73 73 77 6f 72 64 c2 a3 c3 a1              |password....|

Bad (ISO-8859-1):

$ echo -n password£á | hexdump -C
00000000  70 61 73 73 77 6f 72 64 a3 e1                    |password..|
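
The same check can be done from inside Python 3, which is handy when the passphrase never passes through a shell. A minimal sketch (the helper name is mine), with the characters written as \u escapes so the source file's own encoding cannot interfere:

```python
def hex_bytes(data):
    """Format a bytes object as space-separated lowercase hex pairs."""
    return " ".join("{:02x}".format(b) for b in data)

text = "password\u00a3\u00e1"  # "password£á"

print(hex_bytes(text.encode("utf-8")))       # 70 61 73 73 77 6f 72 64 c2 a3 c3 a1
print(hex_bytes(text.encode("iso-8859-1")))  # 70 61 73 73 77 6f 72 64 a3 e1
```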

I tested your input using:

echo "U2FsdGVkX18EWZNx70TPi0dYuiQG+7Zpg5RiGa2/mQsWU4A6JhWMwt3+mP1y6+xIQYN45t65oB+VntZfEd6EArB0X4nPmCJ18jBfO57a1jE=" \
  | base64 -d \
  | openssl enc -aes-256-cbc -d -md md5 -k "password£"

As well as:

#!/usr/bin/env python3

from base64 import b64decode
from Crypto.Hash import MD5
from Crypto.Cipher import AES

def OpenSSL_parse_enc_header(data):
    if data[0:8] != b"Salted__":
        raise ValueError("missing OpenSSL header")
    salt = data[8:16]
    data = data[16:]
    return salt, data

def OpenSSL_EVP_BytesToKey(passphrase, salt, key_size, iv_size):
    buf = b""
    hash = b""
    while len(buf) < key_size + iv_size:
        hash = MD5.new(hash + passphrase + salt).digest()
        buf += hash
    key = buf[0:key_size]
    iv = buf[key_size:key_size+iv_size]
    return key, iv

def PKCS7_remove_padding(data, block_size):
    if len(data) % block_size != 0:
        raise ValueError("data is not padded")
    pad_len = data[-1]
    if pad_len < 1 or pad_len > block_size:
        raise ValueError("PKCS#7 padding incorrect")
    if data[-pad_len:] != bytes([pad_len] * pad_len):
        raise ValueError("PKCS#7 padding incorrect")
    return data[:-pad_len]

enc_data = b64decode("U2FsdGVkX18EWZNx70TPi0dYuiQG+7Zpg5RiGa2/mQsWU4"
                     "A6JhWMwt3+mP1y6+xIQYN45t65oB+VntZfEd6EArB0X4nP"
                     "mCJ18jBfO57a1jE=")
kdf_salt, enc_data = OpenSSL_parse_enc_header(enc_data)

passphrase = "password£".encode("utf-8")
key, iv = OpenSSL_EVP_BytesToKey(passphrase,
                                 kdf_salt,
                                 key_size=256//8,
                                 iv_size=AES.block_size)

plain_data = AES.new(key, AES.MODE_CBC, iv=iv).decrypt(enc_data)
plain_data = PKCS7_remove_padding(plain_data, AES.block_size)
print(plain_data)

In both cases it returns this text (with valid PKCS#7 padding, therefore successful decryption):

L3scoV8yhgA9tqbXBA2SXTczghGUSGTDsWkakCwgK6jk13TAUfXi

user1686

Posted 2019-09-03T09:46:29.113

Reputation: 283 655

I understand this, and thank you for answering. However, I have an encrypted file that I need to decrypt through brute force in Python, and I need to make sure it accounts for these kinds of special characters. The file itself was created with gibberish-aes, but both my current program and openssl fail to crack even the known passwords. I have tried both UTF-8 and ISO-8859-1 and it still fails. – jayboy – 2019-09-03T10:17:04.530

UTF-8 works perfectly fine here with the openssl command-line tool. What CLI commands are you using? What Python code are you using? – user1686 – 2019-09-03T10:19:59.310

To encrypt, it was gibberish-aes (JavaScript). I'm beginning to believe that it must use a different encoding than UTF-8 or ISO-8859-1, and will do further research for now. CLI: openssl enc -d -aes-256-cbc -p -md md5 -a -in encrypt.txt – jayboy – 2019-09-03T10:22:38.757

JavaScript code generally uses UTF-8, but it may sometimes use "double-encoded" UTF-8 by accident. I've posted an update showing that your example successfully decrypts into ASCII text when "password\xC2\xA3" is used as the passphrase. – user1686 – 2019-09-03T10:42:37.820

Failing for me with "Non-ASCII character '\xa3' in file decrypt.py on line 39, but no encoding declared; ", but I'm using windows, guess I will run it on a linux vm, can't thank you enough for your help! – jayboy – 2019-09-03T10:57:48.533

That's a Python runtime error, not a decryption error. Since the Python file contains non-ASCII text, you need to tell the runtime how to decode bytes into text, by adding an # encoding: ... header at the top. (In this case, specify what encoding your text editor used, and again try to use UTF-8 if possible. In my case, I used # encoding: utf-8.) – user1686 – 2019-09-03T10:59:45.333

Yea I have tried that before but same issue- ""password¬£".encode("utf-8") UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 8: ordinal not in range(128)", file is saved in utf-8 without sig. – jayboy – 2019-09-03T11:01:55.640

Are you writing in Python 3 or Python 2? In Py2, instead of "..." use u"...". Also, in both versions, instead of £ you can use \u00A3. You can also specify the passphrase as an already encoded byte array: b'password\xC2\xA3', skipping encode() entirely. – user1686 – 2019-09-03T11:07:06.977

YES.. at last :D that little u".." trick worked, I was using python 2.7, thank you grawity, you my friend are a legend. – jayboy – 2019-09-03T11:33:11.707
