Parse a Quaternion

27

3

If you don't know already, a quaternion is basically a 4-part number. For the purposes of this challenge, it has a real component and three imaginary components. The imaginary components are represented by the suffix i, j, k. For example, 1-2i+3j-4k is a quaternion with 1 being the real component and -2, 3, and -4 being the imaginary components.

In this challenge you have to parse the string form of a quaternion (ex. "1+2i-3j-4k") into a list/array of coefficients (ex. [1 2 -3 -4]). However, the quaternion string can be formatted in many different ways...

  • It may be normal: 1+2i-3j-4k
  • It may have missing terms: 1-3k, 2i-4k (If you have missing terms, output 0 for those terms)
  • It may have missing coefficients: i+j-k (In this case, this is equivalent to 1i+1j-1k. In other words, a i, j, or k without a number in front is assumed to have a 1 in front by default)
  • It may not be in the right order: 2i-1+3k-4j
  • The coefficients may be simply integers or decimals: 7-2.4i+3.75j-4.0k

There are some things to note while parsing:

  • There will always be a + or - between terms
  • You will always be passed valid input with at least 1 term, and without repeated letters (no j-js)
  • All numbers can be assumed to be valid
  • You can change numbers into another form after parsing if you want (ex. 3.0 => 3, 0.4 => .4, 7 => 7.0)

Parsing/quaternion builtins and standard loopholes are disallowed. This includes eval keywords and functions. Input will be a single string and output will be a list, an array, values separated by whitespace, etc.

As this is , shortest code in bytes wins.

Tons o' test cases

1+2i+3j+4k             => [1 2 3 4]
-1+3i-3j+7k            => [-1 3 -3 7]
-1-4i-9j-2k            => [-1 -4 -9 -2]
17-16i-15j-14k         => [17 -16 -15 -14]

7+2i                   => [7 2 0 0]
2i-6k                  => [0 2 0 -6]
1-5j+2k                => [1 0 -5 2]
3+4i-9k                => [3 4 0 -9]

42i+j-k                => [0 42 1 -1]
6-2i+j-3k              => [6 -2 1 -3]
1+i+j+k                => [1 1 1 1]
-1-i-j-k               => [-1 -1 -1 -1]

16k-20j+2i-7           => [-7 2 -20 16]
i+4k-3j+2              => [2 1 -3 4]
5k-2i+9+3j             => [9 -2 3 5]
5k-2j+3                => [3 0 -2 5]

1.75-1.75i-1.75j-1.75k => [1.75 -1.75 -1.75 -1.75]
2.0j-3k+0.47i-13       => [-13 0.47 2.0 -3] or [-13 .47 2 -3]
5.6-3i                 => [5.6 -3 0 0]
k-7.6i                 => [0 -7.6 0 1]

0                      => [0 0 0 0]
0j+0k                  => [0 0 0 0]
-0j                    => [0 0 0 0] or [0 0 -0 0]
1-0k                   => [1 0 0 0] or [1 0 0 -0]

GamrCorps

Posted 2016-03-30T03:08:44.750

Reputation: 7 058

Will there ever be unnecessary + signs in the input? Like: +1k? – FryAmTheEggman – 2016-03-30T03:53:16.910

@FryAmTheEggman No. inputs will never start with a +. – GamrCorps – 2016-03-30T03:54:13.677

1Is -0 a part of the legal output for the last two examples? – isaacg – 2016-03-30T06:47:48.103

What exactly constitutes a "parsing built-in"? – LLlAMnYP – 2016-03-30T12:20:53.423

1@isaacg yes that is fine – GamrCorps – 2016-03-30T13:46:26.647

@LLlAMnYP anything that trivializes this challenge, ie that can parse this quaternion and return the array of coefficients – GamrCorps – 2016-03-30T13:47:30.557

@GamrCorps that doesn't make things clearer for me. Sure ToQuaternion is obviously off limits (quaternion built-in), but what of my use of ToExpression? It interprets the string as actual input to the interpreter ("1+i-2j+k" -> 1 + i - 2 j + k). Does mathematica's capability of working with actual mathematical expressions trivialize the challenge? – LLlAMnYP – 2016-03-30T13:58:56.103

@LLlAMnYP in that case, I would take that command to be a form of eval which is prohibited. Sorry about the confusions – GamrCorps – 2016-03-30T14:03:26.247

@GamrCorps Interesting. That would imply, that the input should be dealt with only with string-manipulating functions, regex and the like. I'm then curious, the output array has to be a list of explicit numbers? Because any kind of conversion like "2.5" -> 2.5 is a (possibly restricted) form of eval. – LLlAMnYP – 2016-03-30T14:27:48.117

1@LLlAMnYP You bring up a good point. Lets define the eval restriction to be takes in a string, interprets as code and/or input. Any conversions do not count under this because you cant pass, for example, the string "test" to an integer conversion function to receive an integer, but test would be interpreted as code in a normal eval function. TLDR: eval: no, type conversions: yes. – GamrCorps – 2016-03-30T14:34:19.930

@GamrCorps I see. That makes Mathematica a bad choice for this then, because it isn't really a typeified language and most type conversions are done by the rather blanketing ToExpression. There is of course the (undocumented) Internal`StringToDouble, but that's hardly a good fit for golf :-) – LLlAMnYP – 2016-03-30T14:40:01.833

FromDigits Golookitup – CalculatorFeline – 2016-03-31T12:52:38.573

@CatsAreFluffy that's only for integers. But I fully agree, there's plenty of ways to do it, e.g. ImportString too. In this case it just feels like reinventing the wheel, instead of taking advantage of the strengths of the language. But who knows, maybe there's still a beautiful MMA solution out there. – LLlAMnYP – 2016-04-01T06:15:04.567

Answers

5

Pyth, 48 bytes

jm+Wg\-K--e|d0G\+K1+]-I#GJczfT.e*k<b\.zm/#dJ"ijk

Demonstration Test suite

The output format is newline separated. The test suite code uses space separation, for ease of reading, but is otherwise the same.

Outputs a -0 in the last 2 cases, which I hope is OK.

Explanation to follow.

isaacg

Posted 2016-03-30T03:08:44.750

Reputation: 39 268

9

Retina, 115

\b[ijk]
1$&
^(?!.*\d([+-]|$))
0+
^(?!.*i)
+0i+
^(?!.*j)
0j+
^(?!.*k)
0k+
O$`[+-]*[\d.]*(\w?)
$1
-
+-
^\+

S`[ijk+]+

Try it online!

1 byte saved thanks to @Chris Jester-Young.

A bug fixed and 6 bytes saved thanks to @Martin Büttner

Found a couple bugs involving some edge cases, bumped up byte count quite a bit.

Returns the numbers newline separated. Anyway, this has a mostly elegant solution that sort of gets ruined by edge cases but hey I got to use sort mode, that means I used the right tool for the job, right?

Explanation:

Stage by stage, as usual.

\b[ijk]
1$&

The only characters in the input that can create word boundaries are -+.. This means that if we find a boundary followed by a letter, we have an implicit 1 which we add in with the replacement. $& is a synonym for $0.

^(?!.*\d([+-]|$))
0+

Big thanks to Martin for this one, this one adds in the implicit 0 for the real part if it was missing in the input. We make sure that we can't find a number that is followed by a plus or minus sign, or the end of the string. All the complex numbers will have a letter after them.

^(?!.*i)
+0i+

The next 3 stages are all pretty much the same, barring which letter they impact. All of them look to see if we can't match the letter, and if we can't we add a 0 term for it. The only reason i has an extra + before it is to prevent the real value from being unreadable with the is coefficient, the other numbers are all separated by their complex variable.

O$`[+-]*[\d.]*(\w?)
$1

Ah, the fun part. This uses the newish sort stage, denoted by the O before the option separator backtick. The trick here is to grab the whole number followed optionally by a word character, which in this case will only ever match one of ijk. The other option used is $ which causes the value used to sort these matches to be the replacement. Here we just use the optional letter left over as our sort value. Since Retina sorts lexicographically by default, the values are sorted like they would be in a dictionary, meaning we get the matches in "", "i", "j", "k" order.

-
+-

This stage puts a + sign in front of all the minus signs, this is needed if we have a negative value for i in the split stage, later.

^\+

We remove the leading + to make sure we have no extra leading newline.

S`[ijk+]+

Split the remaining lines on runs of the complex variables or the plus sign. This nicely gives us one value per line.

FryAmTheEggman

Posted 2016-03-30T03:08:44.750

Reputation: 16 206

3

Lua, 185 187 195 183 166 bytes (try it online) [used regex]

Thanks to @Chris Jester-Young for the improved regex.

Thanks to @Katenkyo for bringing it down to 166 bytes.

Golfed:

r={0,0,0,0}for u in(...):gsub("([+-])(%a)","%11%2"):gmatch("-?[%d.]+%a?")do n,i=u:match("(.+)(%a)")r[i and(" ijk"):find(i)or 1]=(n or u)end print(table.concat(r," "))

Ungolfed:

n = "42i+j-k+0.7"

result = {0,0,0,0}

for unit in n:gsub("([+-])(%a)","%11%2"):gmatch("-?[%d.]+%a?") do
  num, index = unit:match("(.+)(%a)")
  if index == "i" then
    result[2] = num
  elseif index == "j" then
    result[3] = num
  elseif index == "k" then
    result[4] = num
  else
    result[1] = unit
  end
end

print(table.concat(result," "))

Leaky Nun

Posted 2016-03-30T03:08:44.750

Reputation: 45 011

2Hi Kenny, thanks for the solution. Usually we don't allow the input starting in a variable (like n in this case), so you should add the code to read the input. – isaacg – 2016-03-30T03:38:04.493

You should be able to save some byte by changing your input from STDIN to argument, instead of io.read(), use (...). It will point to the first command-line argument and will allow you to save 4 more bytes :) – Katenkyo – 2016-03-30T08:36:53.667

1Also, the asked output can be anything, as long as it can be interepreted by humans as a list, so you can remove the extra formatting. Including some more whitespaces you can shave off, your code can go down to 166 bytes -> r={0,0,0,0}for u in(...):gsub("([+-])(%a)","%11%2"):gmatch("-?[%d.]+%a?")do n,i=u:match("(.+)(%a)")r[i and(" ijk"):find(i)or 1]=(n or u)end print(table.concat(r," ")) – Katenkyo – 2016-03-30T08:42:18.380

3

Perl 5, 125 bytes

#!perl -p
%n=(h,0,i,0,j,0,k,0);$n{$4//h}=0+"$1@{[$3//$5//1]}"while/([+-]?)(([\d.]+)?([ijk])|([\d.]+))/g;s/.*/@n{qw(h i j k)}/

Chris Jester-Young

Posted 2016-03-30T03:08:44.750

Reputation: 4 464

1

@KennyLau Sadly, your proposed change does not do what you expect. I have tried that before I posted my answer. ;-)

– Chris Jester-Young – 2016-03-30T04:32:01.643

@KennyLau Regarding this proposed change, Perl's \a matches on "alarm", not alphabetic. There's \w for word-character (alphanumeric and underscore), but that won't work here; we need it not to match on a number.

– Chris Jester-Young – 2016-03-30T04:39:05.680

3

@KennyLau BTW, you have enough rep to talk in chat. Feel free to discuss ideas there, rather than continually getting your edit suggestions rejected. ;-)

– Chris Jester-Young – 2016-03-30T04:43:20.023

I also have enough rep to comment now. Does Perl not have a pattern for [a-z]? – Leaky Nun – 2016-03-30T09:50:03.427

You can count this as 116 (as per http://meta.codegolf.stackexchange.com/a/274/3808).

– Doorknob – 2016-03-30T11:15:47.880

Golfing your solution a bit more gives 75 strokes: perl -lpe 's%([+-]|)([\d.]+)?(\pL|$)?%${$3//h}=$1.($2//1)%eg;$_="@{[map$$_+0,h..k]}" – Ton Hospel – 2016-03-30T11:38:32.553

1@KennyLau Not to my knowledge. – Chris Jester-Young – 2016-03-30T16:35:51.497

@TonHospel That doesn't reset %$'s state between lines, which means some of the results are wrong. But I'll use some of your strategies! – Chris Jester-Young – 2016-03-30T16:36:42.540

The challenge only asks for parsing 1 quaternion, so leaving the state disturbed is fine. Input will be a single string ... – Ton Hospel – 2016-03-30T18:21:49.750

3

C, 236 bytes

char j,n[9][9],s[9],y[9],i=8,k,*p=n[8];main(c){for(**n=48;c=getchar(),c+1;)c-32&&(c<46&&(k&&(y[1]=i),k=0,s[--i]=c-43,p=n[i])||c>57&&(k||(*p=49),k=0,y[c-103]=i)||(*p++=c,k=1));for(k&&(y[1]=i);++j<5;)printf("%c%s ",s[y[j]]?45:0,n[y[j]]);}

(For values like -0 or -0.0, the minus sign is also printed in the output, but since the challenge states that "you can change numbers into another form after parsing if you want", and if -0 appears in the input, it follows that it's also acceptable in the output. @GamrCorps has now clarified that this is ok.)

mIllIbyte

Posted 2016-03-30T03:08:44.750

Reputation: 1 129

3

JavaScript (ES6), 103 100 bytes

f=s=>s.replace(/(?=.)(\+|-|)([\d.]*)(\w?)/g,(_,s,x,c)=>a[c.charCodeAt()&3]=+(s+(x||1)),a=[0,0,0,0])&&a

Edit: Saved 3 bytes by switching from parseInt to charCodeAt, which conveniently just needs &3 to get me the correct array index.

Neil

Posted 2016-03-30T03:08:44.750

Reputation: 95 035

Nice idea parseInt + mod. Thinking about base and prefix – edc65 – 2016-03-30T10:27:10.953

1

JavaScript (ES6) 106

s=>(s.replace(/([+-]?)([\d.]*)(\w?)/g,(a,b,c,d)=>a&&(s[d||9]=b+(c||1)),s={}),[...'9ijk'].map(i=>+s[i]||0))

Test

f=s=>(s.replace(/([+-]?)([\d.]*)(\w?)/g,(a,b,c,d)=>a&&(s[d||9]=b+(c||1)),s={}),[...'9ijk'].map(i=>+s[i]||0))

function Test()
{
  var t,k,r,ts=TS.value.split('\n')
  
  O.textContent=ts.map(x=>x.trim()&&(
    [t,k]=x.split('=>').map(x=>x.trim()),
    console.log(t,'*',k),
    k=k.match(/[\d+-.]+/g).map(x=>+x),
    r=f(t),
    t+' => '+r+(r+''==k+''?' OK':' KO (check: '+k+')')
  )).join('\n')
}    

Test()
#TS { width:90%; height:10em}
<pre id=O></pre>

Test data (modify if you like)<button onclick='Test()'>repeat test</button>
<textarea id=TS>
1+2i+3j+4k             => [1 2 3 4]
-1+3i-3j+7k            => [-1 3 -3 7]
-1-4i-9j-2k            => [-1 -4 -9 -2]
17-16i-15j-14k         => [17 -16 -15 -14]
  
7+2i                   => [7 2 0 0]
2i-6k                  => [0 2 0 -6]
1-5j+2k                => [1 0 -5 2]
3+4i-9k                => [3 4 0 -9]
  
42i+j-k                => [0 42 1 -1]
6-2i+j-3k              => [6 -2 1 -3]
1+i+j+k                => [1 1 1 1]
-1-i-j-k               => [-1 -1 -1 -1]
  
16k-20j+2i-7           => [-7 2 -20 16]
i+4k-3j+2              => [2 1 -3 4]
5k-2i+9+3j             => [9 -2 3 5]
5k-2j+3                => [3 0 -2 5]
  
1.75-1.75i-1.75j-1.75k => [1.75 -1.75 -1.75 -1.75]
2.0j-3k+0.47i-13       => [-13 0.47 2.0 -3]
5.6-3i                 => [5.6 -3 0 0]
k-7.6i                 => [0 -7.6 0 1]
  
0                      => [0 0 0 0]
0j+0k                  => [0 0 0 0]
-0j                    => [0 0 0 0]
1-0k                   => [1 0 0 0]
</textarea>

edc65

Posted 2016-03-30T03:08:44.750

Reputation: 31 086

0

PowerShell, 178 bytes

param($a);$p="(-?)([\d.]+)?";$g={param($s)if($a-match"$p$s"){if(($r=$matches)[2]){$r[1]+$r[2]}else{$r[1]+1}}else{0}};$a-match"$p(\+|-|$)">$null;+$matches[2];"i","j","k"|%{&$g $_}

Ungolfed with Explanation

# Get the whole string into a variable
param($a)
# Pattern shared getting both imaginary and real numbers. 
$p="(-?)([\d.]+)?"
# Anonymous function that will locate a imaginary number using a letter sent as a parameter. 
# If no value is assigned a signed 1 is returned. If no value is matched 0 is returned
$g={param($s)if($a-match"$p$s"){if(($r=$matches)[2]){$r[1]+$r[2]}else{$r[1]+1}}else{0}}
# Locate the real component if any. Null is converted to 0
$a-match"$p(\+|-|$)">$null;+$matches[2]
# Call the anonymous function using each of the imaginary suffixes.                                               
"i","j","k"|%{&$g $_}

Not super impressed but it is a result nonetheless.

Matt

Posted 2016-03-30T03:08:44.750

Reputation: 1 075

0

PHP, 179 bytes

$a=[''=>0,'i'=> 0,'j'=>0,'k'=>0];preg_match_all("/([-+]?)(\d*(\.\d+)?)([ijk]?)/",$argv[1],$m,2);foreach($m as$n)if($n[0])$a[$n[4]]=$n[1].($n[2]===''?1:$n[2]);echo implode(',',$a);

Try the test suite.

nickb

Posted 2016-03-30T03:08:44.750

Reputation: 351

0

Python 3.5 - 496 bytes [using Regular Expressions]:

from re import*
def wq(r):
 a=sub('[+](?![0-9])','+1',sub('[-](?![0-9])','-1',r));q=lambda x:(not x.isdigit(),''.join(filter(str.isalpha,x)))
 for z in findall('(?<![0-9])[a-z]',a):a=a.replace(z,('+1{}'.format(z)))
 if not str(sorted(((sub('[.]','',sub('[+-]',' ',a))).split(' ')),key=q)[0]).isdigit():a+='+0, '
 for i in list(set(findall('[a-z]',a))^{'i','j','k'}):a+='+0{}, '.format(i)
 print(findall('[-]?\d+(?:\.\d+)?',''.join(sorted(sub('(?<=[A-Za-z0-9])(?=[+-])',', ',a).split(' '),key=q))))

It may be long, but in my defense, it works perfectly in doing what the OP wants, since all the test cases given were successful using my code.

Ungolfed version with explanation included:

from re import*
def w(r):
    # Substitute all minus (-) and plus (+) signs NOT followed by a number  (if there are any) with a "-1"/"+1", respectively.
    a=sub('[+](?![0-9])','+1',sub('[-](?![0-9])','-1',r))
    # Lambda function created for later use to sort the Quaternion. This function, when given as a key to the "sorted" function, arranges the input Quaternion in the order where the whole number comes first, and then the rest are placed in order of increasing letter value (i,j,k in this case) 
    q=lambda x:(not x.isdigit(),''.join(filter(str.isalpha,x)))
    # The following "for" loop replaces the letters NOT preceded by a number with a one followed by that letter
    for z in findall('(?<![0-9])[a-z]',a):
        a=a.replace(z,('+1{}'.format(z)))
    # The following first substitutes all pluses and minuses (+ and -) with a space, and then that new string is split at those spaces, and returned as a list. After that, the list is sorted according the the "lambda" function shown above. Then, the first item in that list, which is supposed to be a lone number, is checked to make sure that it indeed is a lone number. If it isn't, then "+0, " is appended to the Quaternion. 
    if not str(sorted(((sub('[.]','',sub('[+-]',' ',a))).split(' ')),key=q)[0]).isdigit():
        a+='+0, '
    # The following "for" loop finds ALL the letters NOT in the list, by finding the symmetric difference between a set of all the letters found, and a set containing all the letters needed. For the letters not in the list, a '+0' is added the quaternion, followed by that letter, and then a comma and a space.
    for i in list(set(findall('[a-z]',a))^{'i','j','k'}):
        a+='+0{}, '.format(i)
    # Finally, in this last step, a ", " is added IN BETWEEN unicode characters and pluses/minuses (+/-). Then, it splits at those spaces, and the commas separate different parts of the Quaternion from each other (otherwise, you would get something like `12i+3j+4k` from `2i+3j+4k+1`) in a returned list. Then, that list is sorted according to the lambda expression "q" (above), and then, finally, the NUMBERS (of any type, courtesy to Regex) are extracted from that joined list, and printed out in the correct order.
    print(findall('[-]?\d+(?:\.\d+)?',''.join(sorted(sub('(?<=[A-Za-z0-9])(?=[+-])',', ',a).split(' '),key=q))))

If the above is a little too hard to read, basically what is happening is that:

  1. If there are any, all + or - signs NOT followed by a number are replaced with a "+1"/"-1", respectively.

  2. A lambda function is defined, which, when used in a sorted function as a key, sorts the list according to putting the whole number first, and then ordering the rest in increasing letter value ("i",then "j", then "k" in this instance).

  3. The Quaternion, now having all +/- signs replaced with a 1 if needed, is searched, using Regular Expressions, for ALL letters NOT preceded by at least one number, and those letters that match are replaced with a "+1" followed by that letter.

  4. The "if" statement then replaces ALL +/- signs with a space, and then the modified Quaternion is now "split" at those spaces, and returned in a list. Then, the list is sorted according to the lambda function explained earlier. Finally, the first item in that list is checked to make sure it's a number, since it's supposed to be, and if it's not, then a "+0, " is added to the Quaternion.

  5. The second "for" loop finds ALL the letters NOT in the Quaternion by finding a symmetric difference between a set of those letters found in the expression, and then a set including all letters required. If any are found, then a "+0" followed by the missing letter and a space is added to the Quaternion.

  6. Finally, in this last step, a ", " is added in between each character followed by a +/- symbol, and then the Quaternion is split at those spaces, then the list returned is sorted, for the last time, according to the lambda function defined as "q" earlier on. The commas in the expression separate each part of the quaternion (otherwise, you would be getting something like 14i+5j+6k from 4i+5j+6k+1). Lastly, that now sorted list is joined together into a string, and only the numbers of any type (courtesy of Regular Expressions) are extracted, and finally returned in a list, in the correct order, every single time.

R. Kap

Posted 2016-03-30T03:08:44.750

Reputation: 4 730