Hex to base64 on *nix

1

Note: I have marked this question as a duplicate of another question. But I am keeping it nonetheless as it has an example and a clearly explained answer, so hopefully it should help others.

I need to convert a string of hex characters into base64 on *nix, as done by this online converter.

For "5C78336D77D8DF448007D277DAD5C569" (hex), I know the expected output is "XHgzbXfY30SAB9J32tXFaQ==" (base64).

But when I try converting it into binary and then base64 I get this:

[kent@server SrcFiles]$ echo "5C78336D77D8DF448007D277DAD5C569" | xxd -b
0000000: 00110101 01000011 00110111 00111000 00110011 00110011  5C7833
0000006: 00110110 01000100 00110111 00110111 01000100 00111000  6D77D8
000000c: 01000100 01000110 00110100 00110100 00111000 00110000  DF4480
0000012: 00110000 00110111 01000100 00110010 00110111 00110111  07D277
0000018: 01000100 01000001 01000100 00110101 01000011 00110101  DAD5C5
000001e: 00110110 00111001 00001010                             69.
[kent@server SrcFiles]$ echo "001101010100001100110111001110000011001100110011001101100100010000110111001101110100010000111000010001000100011000110100001101000011100000110000001100000011011101000100001100100011011100110111010001000100000101000100001101010100001100110101001101100011100100001010" | base64
MDAxMTAxMDEwMTAwMDAxMTAwMTEwMTExMDAxMTEwMDAwMDExMDAxMTAwMTEwMDExMDAxMTAxMTAw
MTAwMDEwMDAwMTEwMTExMDAxMTAxMTEwMTAwMDEwMDAwMTExMDAwMDEwMDAxMDAwMTAwMDExMDAw
MTEwMTAwMDAxMTAxMDAwMDExMTAwMDAwMTEwMDAwMDAxMTAwMDAwMDExMDExMTAxMDAwMTAwMDAx
MTAwMTAwMDExMDExMTAwMTEwMTExMDEwMDAxMDAwMTAwMDAwMTAxMDAwMTAwMDAxMTAxMDEwMTAw
MDAxMTAwMTEwMTAxMDAxMTAxMTAwMDExMTAwMTAwMDAxMDEwCg==

Can anyone point me in the right direction?

Kent Pawar

Posted 2013-07-25T15:53:41.713

Reputation: 562

Question was closed 2013-08-05T14:32:41.293

1

Converting a Base 16 value to binary will keep it Base 16. You need to convert the Base 16 to Base 64 then convert it to Binary. This has been asked and answered before http://superuser.com/questions/158142/how-can-i-convert-from-hex-to-base64?rq=1

– Ramhound – 2013-07-25T16:04:44.617

@Ramhound - Thanks. I tried echo "obase=10; ibase=16;cat in.dat" | bc | base64 > out.dat and echo "obase=64; ibase=16;cat in.dat" | bc. Could you give me some pointers where I am going wrong..? – Kent Pawar – 2013-07-25T16:14:17.980

Your syntax is wrong. Look up the correct syntax. – Ramhound – 2013-07-25T16:18:18.227

1

As a comment on the original question: echo "0011010..." | base64 does not send the bits 0011010... to base64; it sends a string of ASCII "0" and ASCII "1" characters. That is why your output is not what you expect. – Kent – 2013-07-26T07:01:24.857

Thanks @Kent ! okay. But then shouldn't this work too..? echo "obase=64; ibase=16; 5C78336D77D8DF448007D277DAD5C569" # Tell bc to accept the input ASCII string as a Hex representation and convert it into base64. – Kent Pawar – 2013-07-26T07:12:55.537

1

bc is not doing what you expect it to do either. As a numeric processor, its digits for the values 0, 1, 2 in base 64 are still 0, 1, 2. However, the base64 encoding maps 0 to A, 1 to B, 2 to C, etc. That mapping is arbitrary and, since bc is a calculator, it won't make that conversion for us. I have more to say on this, but it requires formatting and much more than 512 characters. I'll add it to the answer section below, even though it doesn't answer your original question. – Kent – 2013-07-31T00:44:45.350

Answers

6

If you want to use xxd to decode your hex string, you need to use xxd -r -p. And thus, you get:

echo "5c78336d77d8df448007d277dad5c569" | xxd -r -p | base64
XHgzbXfY30SAB9J32tXFaQ==

-r is for reverse, so that xxd decodes a hex dump back into bytes, and -p says that the input is a plain dump (i.e., an unadorned hex string) with no formatting such as line numbers.
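If xxd is not installed, the decoding step can be sketched in plain POSIX shell. This is only an illustration, not a replacement for xxd: hex_to_bytes is a hypothetical helper name, and it assumes an even-length string of hex digits with no spaces or line breaks.

```sh
# hex_to_bytes (hypothetical helper): write the raw bytes that a hex string encodes.
hex_to_bytes() {
    hex=$1
    # Two hex digits make one byte, so refuse odd-length input.
    if [ $(( ${#hex} % 2 )) -ne 0 ]; then
        echo "hex_to_bytes: odd number of hex digits" >&2
        return 1
    fi
    while [ -n "$hex" ]; do
        rest=${hex#??}           # everything after the first two characters
        pair=${hex%"$rest"}      # the first two hex digits
        # printf reads 0xNN as an integer; re-emit it as an octal escape for %b
        printf '%b' "\\0$(printf '%o' "0x$pair")"
        hex=$rest
    done
}

# Same pipeline as above, with hex_to_bytes standing in for xxd -r -p
hex_to_bytes 5C78336D77D8DF448007D277DAD5C569 | base64
```

The last line should print the same XHgzbXfY30SAB9J32tXFaQ== as the xxd pipeline. For anything large, xxd -r -p will be much faster than a shell loop.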

Levans

Posted 2013-07-25T15:53:41.713

Reputation: 2 010

You could get the same result by a brute-force approach: printf "\x5C\x78\x33\x6D\x77\xD8\xDF\x44\x80\x07\xD2\x77\xDA\xD5\xC5\x69" | base64. – Scott – 2018-05-18T22:24:19.067

3

This is a continuation of the comments on the original question. It ended up much longer than I anticipated, so I moved it to the answer section.

bc is a numeric processor capable of dealing with any arbitrary base; however, from the bc Command Manual:

For bases greater than 16, bc uses a multi-character digit method of printing the numbers where each higher base digit is printed as a base 10 number.  The multi-character digits are separated by spaces.

(A similar statement appears in the bc man page.)

What we call "base64" is a special assignment of characters to each of the 64 values. (See the Wikipedia article on Base64.)

The representation of "0" by bc would be "0"; whereas the actual base64 is "A".

The output of your suggested bc command is:

$ echo "obase=64; ibase=16; 5C78336D77D8DF448007D277DAD5C569" | bc 
 01 28 30 03 13 45 29 61 35 31 17 08 00 07 52 39 31 26 53 28 21 41

So, if bc outputs 01 28 30 03 . . ., why can't we just look up the values for 01, 28, 30, etc., in the table (yielding "B", "c", and "e"; which is different from the "XHg…" that you expect)?

First, let's simplify the problem.

If we feed a shorter string into bc, such as only the first 2.5 bytes, the output looks similar:

$ echo "obase=64; ibase=16; 5C783" | bc
 01 28 30 03

But, an even shorter string is completely different:

$ echo "obase=64; ibase=16; 5C78" | bc
 05 49 56

Why is that? Your original string was 32 hex characters, i.e. 128 bits (16^32 = 2^128 possible values). A base-64 digit holds 6 bits (64 = 2^6), so 128 bits need 22 digits (22*6 = 132 bits), with 4 bits left over. This remainder is important when looking at the output of bc but not really for anything else.

The 4-character input string has 2^16 values (16 bits), so it fits in three 6-bit digits (18 bits, with 2 bits left over); the 5-character input string has 2^20 values (20 bits) and requires four digits to display (24 bits, with 4 bits left over; the same remainder as in your original string).

An even shorter input value (5C) also has the same remainder (8 bits need two 6-bit digits, i.e. 12 bits, with 4 bits left over):

$ echo "obase=64; ibase=16; 5C" | bc
 01 28
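The bit counting for these inputs can be reproduced with shell arithmetic. A rough sketch: b64_digits is a hypothetical helper, and it assumes the leading base-64 digit comes out nonzero, as it does in the examples here (bc prints one digit fewer when that leading digit would be zero).

```sh
# b64_digits (hypothetical helper): given a count of hex characters, print the
# bit count, the number of 6-bit digits needed, and the bits left over.
b64_digits() {
    bits=$(( $1 * 4 ))                # each hex character carries 4 bits
    digits=$(( (bits + 5) / 6 ))      # ceiling of bits / 6
    pad=$(( digits * 6 - bits ))      # zero bits that fill up the leading digit
    echo "$1 hex chars: $bits bits, $digits digits, $pad bits left over"
}

for n in 2 4 5 32; do
    b64_digits "$n"
done
```

For 2, 5 and 32 characters the leftover is 4 bits; for 4 characters it is 2 bits, which is why the 5C78 output lines up differently from the others.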

So, using this "feature" of bc, we can use the first two characters to develop a simple description of what is actually going on.

5C in binary is 01011100. In the base64 world, we look at the first 6 bits (010111, or decimal 23) and see in the Wikipedia table that 23 is actually X. Great! It matches what you expect! We'd then continue on through the string, six bits at a time.

In bc, on the other hand, where does the 01 28 come from? Back to the binary 01011100. The base64 procedure starts at the beginning of the input, fills the last 6-bit chunk with zero bits, and appends "=" when the input length in bytes is not a multiple of 3. bc, by contrast, treats the input as a number, so any extra zero bits go at the front of the value. With the 4 leftover bits noted above, bc effectively works with 0000 01011100; in 6-bit chunks, this ends up as 000001 011100, or the decimal values 01 and 28.
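The two readings of the byte 5C can be compared with a few lines of shell arithmetic. This is only a sketch for a single byte; both function names are made up for illustration.

```sh
# Both helpers are hypothetical; each takes one byte written as two hex digits.

# base64 view: read six bits at a time starting from the front of the byte.
sextets_from_front() {
    v=$(( 0x$1 ))
    # top six bits, then the two remaining bits followed by four zero bits
    echo "$(( v >> 2 )) $(( (v & 3) << 4 ))"
}

# bc view: the byte is a number, rewritten in base 64 (zeros implied in front).
digits_as_number() {
    v=$(( 0x$1 ))
    echo "$(( v / 64 )) $(( v % 64 ))"
}

sextets_from_front 5C   # prints "23 0", i.e. X and A in the base64 table
digits_as_number 5C     # prints "1 28", which bc shows as 01 28
```

Same eight bits, but the chunk boundaries fall in different places, so the digits differ.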


By the way, if you pad the end of the input string to bc so that its length is a multiple of three, you get something which is similar to the desired output:

$ echo "obase=64; ibase=16; 5C78336D77D8DF448007D277DAD5C5690" | bc 
 23 07 32 51 27 23 31 24 55 52 18 00 01 61 09 55 54 45 23 05 26 16

You still need to look up 23=X, 07=H, 32=g, etc., in the table.
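That table lookup can itself be scripted. A minimal sketch in POSIX shell, assuming the standard base64 alphabet (A-Z, a-z, 0-9, +, /) and two-digit values as bc prints them; b64_char and b64_lookup are hypothetical helper names.

```sh
# b64_char (hypothetical helper): print the base64 character for one value 0..63.
b64_char() {
    alphabet=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
    i=${1#0}                         # drop bc's leading zero so 08 is not read as octal
    while [ "$i" -gt 0 ]; do
        alphabet=${alphabet#?}       # discard one character per step
        i=$(( i - 1 ))
    done
    rest=${alphabet#?}
    printf '%s' "${alphabet%"$rest"}"   # first remaining character
}

# b64_lookup (hypothetical helper): translate a list of bc digits into characters.
b64_lookup() {
    for d in "$@"; do
        b64_char "$d"
    done
    echo
}

# Digits copied from the bc output above; with bc installed one could instead run:
#   b64_lookup $(echo "obase=64; ibase=16; 5C78336D77D8DF448007D277DAD5C5690" | bc)
b64_lookup 23 07 32 51 27 23 31 24 55 52 18 00 01 61 09 55 54 45 23 05 26 16
```

This prints XHgzbXfY30SAB9J32tXFaQ, the expected string without its trailing "==". The "=" padding is not something bc knows about, so it would have to be added separately; in practice xxd -r -p | base64 remains the simpler route.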

Kent

Posted 2013-07-25T15:53:41.713

Reputation: 1 354

+1. Forgot to thank you for this. It is quite detailed and helpful. – Kent Pawar – 2013-12-04T06:54:08.223