Solve a Diagonal Burrows-Wheeler transform

11

1

Introduction

In this challenge you will be solving diagonal Burrows-Wheeler transforms. Here is a general overview of what a diagonal Burrows-Wheeler transform is. To encode a message, you first must guarantee that it is odd in length (i.e. 5, 7, 9, etc.). Then you make a grid, n by n, where n is the length of the message. The first row is the original message. Each row after that is the row above it, but shifted 1 character left with the first character moving to the back. For example:

Hello World
ello WorldH
llo WorldHe
lo WorldHel
o WorldHell
 WorldHello
WorldHello 
orldHello W
rldHello Wo
ldHello Wor
dHello Worl

Then you take each letter on the NW to SE diagonal and put that into a new string:

Hello World  H
ello WorldH  l
llo WorldHe  o
lo WorldHel  W
o WorldHell  r
 WorldHello  d
WorldHello   e
orldHello W  l
rldHello Wo  (space)
ldHello Wor  o
dHello Worl  l

Your encoded message is HloWrdel ol. To decode, first take the length of the encoded message, add 1, and divide by 2. Lets call this number x. Now that we know x, starting at the first letter, each letter is x after the last, looping around. For example:

H   l   o   W   r   d   e   l     o   l
1   

Then...

H   l   o   W   r   d   e   l     o   l
1                       2

And again...

H   l   o   W   r   d   e   l     o   l
1   3                   2

Until you get...

H   l   o   W   r   d   e   l       o   l
1   3   5   7   9  11   2   4   6   8  10

Now just rearrange the letters in the correct order to get Hello World!

Challenge

Your challenge is to write either two programs, functions, or one of each. However, both must use the same language. The first program will accept a string as input via STDIN, program arguments, or function parameters and encode it using this method. The second program will accept a string as input via STDIN, program arguments, or function parameters and decode it using this method.

Requirements

First Program/Function

  • A single string input using any method listed above.
  • Must encode the string using a diagonal Burrows-Wheeler transform style.

Second Program/Function

  • A single string input using any method listed above.
  • Must decode the string using a diagonal Burrows-Wheeler transform style.

Constraints

  • You cannot use any built-in or external functions that accomplish this task.
  • Standard loopholes are not allowed.
  • Both programs/functions must be in the same language.

Scoring

This is code golf, so shortest program in bytes wins.

If I need to add more information, leave a comment!

GamrCorps

Posted 2015-02-27T02:28:50.097

Reputation: 7 058

2Do we have to convert even length input string to odd length ? – Optimizer – 2015-02-27T05:23:59.660

5This is not a Burrows-Wheeler transoformation. – FUZxxl – 2015-02-27T13:05:45.040

3A Burrows-Wheeler transformation is different in that the array of all rotations is sorted lexicographically before you take the last items. – FUZxxl – 2015-02-27T13:06:48.250

@Optimizer it is not necessary. – GamrCorps – 2015-02-27T14:46:59.083

Answers

12

CJam, (4 + 8 = ) 12 bytes

Encoding program:

q2/z

Try it online here

Decoding program:

q_,2/)/z

Try it online here

How (or rather, why) they work:

The Diagonal Burrows-Wheeler transform is basically every other character of the string, with wrapping from the end. If we treat the String as a 2D matrix of 2 columns, it simply boils down to taking the transform of the matrix. Example:

Hello World

Is represented as 2D matrix as

He
ll
o 
Wo
rl
d

Now, simply reading it column wise, give:

HloWrdel ol

Which is the Burrows-Wheeler transform.

Decoding is simply reverse of the process, write the string as a 2 row 2D matrix and read column wise.

Code expansion:

Encoder:

q          "Read the input";
 2/        "divide it into sub arrays of 2 characters";
   z       "Take transform";

Decoder:

q_,        "Read the input, take copy and get length of copy";
   2/      "Divide the length by 2";
     )/    "Increment and split the input into two rows";
       z   "Take transform";

Optimizer

Posted 2015-02-27T02:28:50.097

Reputation: 25 836

7

Python 2, 61 bytes

E=lambda x:x[::2]+x[1::2]
D=lambda y:(-~len(y)/2*y)[::len(y)/2+1]

E encrypts and D decrypts. I'm not counting the E= and D= for the score.

The decryption takes every n'th character wrapping around, where n is half the string length round up. The reason this inverts is that 2 and n are inverses modulo the length of the string, so taking every nth character inverts taking every 2nd one.

If using a single function were allowed, I could do 44 bytes

def F(x,b):n=1+len(x)**b>>b;return(n*x)[::n]

The encrypts when b is False and decrypts when b is True. The expression 1+len(x)**b>>b is equal to [2,len(x)/2+1][b].

xnor

Posted 2015-02-27T02:28:50.097

Reputation: 115 687

4

J, 10 + 10 = 20

   ({~#|2*i.@#) 'Hello World'
HloWrdel ol

   (/:#|2*i.@#) 'HloWrdel ol'
Hello World

(Surrounding braces are not counted into the score as they are not part of the function definition.)

Thanks for FUZxxl for a 3-byte improvement.

Now it is nicely shown that the two functions are inverses as the first one takes characters from positions defined by the list #|2*i.@# and the second function arranges back the characters using the same list as ordering.

Try it online here.

randomra

Posted 2015-02-27T02:28:50.097

Reputation: 19 909

The first one can be done in 10 characters as well: {~#|2*i.@#. – FUZxxl – 2015-02-28T02:34:55.690

@FUZxxl Thanks, updated. Now the relation between the two functions is shown really nicely. – randomra – 2015-02-28T03:01:22.723

3

Pyth - 5 + 11 = 16 bytes

I noticed a pattern! ~Does happy dance~ The transform is just really just looping through the string picking every other elements. It only works on odd since otherwise it would never get half the elements. This is equivalent to rotating a 2 wide matrix.

Encoder:

%2*2z

Python's step slicing doesn't loop around so I repeated the string.

%2      Take every other elements
 *2z    Double input string

Decoder:

K/hlz2%K*Kz

Again no wrap-around for step-slicing.

K/hlz2       K=length of (input+1)/2
%K           Every kth element
 *Kz         From K*the input

Maltysen

Posted 2015-02-27T02:28:50.097

Reputation: 25 023

@FryAmTheEggman I'm pretty sure that it is only supposed to take odd length string. It was in the beginning of the description. – Maltysen – 2015-02-27T23:11:05.150

Oops, sorry. :S – FryAmTheEggman – 2015-02-27T23:12:28.163

2

JavaScript ES6, 41 + 49 = 90 bytes

Encoder

(t=>t.replace(/./g,(_,o)=>t[o*2%t.length]))('Hello World')

Decoder

(t=>t.replace(/./g,(_,o)=>t[-~(l=t.length)/2*o%l]))('HloWrdel ol')

These are anonymous functions, so I only counting the code inside the parentheses because that is the whole function definition. Try it out with the snippet below: (modified to use ES5)

E=function(t){
  return t.replace(/./g,function(_,o){
    return t[o*2%t.length]
  })
}
D=function(t){
  return t.replace(/./g,function(_,o){
    return t[-~(l=t.length)/2*o%l]
  })
}
i=prompt();
alert('Encoded: '+E(i)+'\nDecoded: '+D(i))

NinjaBearMonkey

Posted 2015-02-27T02:28:50.097

Reputation: 9 925

How about this: [t=>t.replace(/./g,(_,o)=>t[o*2%t.length]),t=>t.replace(/./g,(_,o)=>t[(1+(l=t.length))/2*o%l])]? You use it like [...][0]('encode string') and [...][1]('decode string'). There is nothing saying that this can't be done! And you save 1 byte. – Ismael Miguel – 2015-02-27T20:24:44.540

Thanks, bu it says to write 2 functions, and I don't think this would count. – NinjaBearMonkey – 2015-02-27T23:24:27.467

That's still 2 functions. The rules don't specify names or ways to access the functions. It only says that you must use 2 functions. – Ismael Miguel – 2015-02-28T00:47:04.363

1@IsmaelMiguel Now that I think about it, I think anonymous functions are allowed by themselves, so using that saves me even more bytes. – NinjaBearMonkey – 2015-02-28T01:43:05.027

I'm glad you cut down the byte count. – Ismael Miguel – 2015-02-28T02:05:55.157

2

GNU sed -r, (20 + 104 + 1) = 125

The extra +1 in the score is for the -r option to sed. Odd length input strings are assumed.

Encoder:

s/.*/&&/
s/(.)./\1/g
  • Double up the input string
  • Drop every odd (counting from 1) character

Decoder:

The decoder makes use of : as a temporary marker character, so if it appears in the input string, you'll get undefined behaviour. If the input string is restricted to the 95 ASCII characters, then these markers can be replaced with something outside the ASCII range (e.g. BEL 0x7) to fix this.

s/.*/:&:/
:l;s/:(.)(.+)(.):/\1:\2:\3/;tl
s/:(.*)/\1:/
:m;s/(.)(.*):(.?)(.*):(.*)/\2:\4:\5\1\3/;tm
s/://g
  • Put : markers at the start and end of the input string
  • Shuffle the first : forward and second : backward one character at a time until the : markers are either side of the middle character
  • Remove the first : and add another : to the end leaving "A:B:", where A is the string composed of odd characters from the plaintext input and B is the string composed of the even characters
  • Riffle the characters of A and B together after the last : to reassemble the plaintext input
  • Remove the remaining : markers

Digital Trauma

Posted 2015-02-27T02:28:50.097

Reputation: 64 644