Implement an integer parser

4

1

Implement an integer parser in the shortest code possible, with the following restrictions:

  1. The function or program can only take one argument: a string (or equivalent). Assume it to be signed and always valid (don't worry about decimals, etc.). The input may optionally omit the sign, in which it is assumed to be positive.

  2. The function or program must output an integer (or equivalent).

  3. No pattern matching, regular expressions, etc. are allowed. This includes utilities such as tr for *nix.

  4. No string parsing utilities are permitted, even from the standard library. This includes, but is not limited to, parseInt() from JavaScript, int() in Python, and Int32.parse() from VBScript.

  5. The only permitted predefined string manipulation functions, methods, etc. fall into one of the following categories:

    • extract a substring (e.g. substr(), slice())
    • get code point from character, vice versa (e.g. toCodePoint(), fromCodePoint())
    • concatenate strings (e.g. append(), concat(), strcpy())
  6. No external libraries are permitted.

  7. No string parsing utilities are permitted, even from the standard library.

  8. Implicit conversion to string is disallowed (e.g. 1+'' in JavaScript)

Shortest code wins, so happy golfing!

EDIT: Sorry for the confusion, but implicit conversion is also disallowed.

Isiah Meadows

Posted 2014-05-30T06:45:47.873

Reputation: 1 546

What if the input string isn't an integer? what to output? – Adam Speight – 2014-05-30T06:50:05.490

Addressed. Question updated. – Isiah Meadows – 2014-05-30T06:52:12.317

Both positive and negative? – Adam Speight – 2014-05-30T06:53:20.793

Also addressed. Assume it's already valid and is signed. – Isiah Meadows – 2014-05-30T07:00:33.527

Is split an extract a substring method? – Peter Taylor – 2014-05-30T08:35:59.250

Okay, I'm going to use 0 + $string in PHP (implicit integer cast). Or is using language syntax for string parsing forbidden too? :-P – bwoebi – 2014-06-01T14:05:44.817

@PeterTaylor Yes. – Isiah Meadows – 2014-06-01T19:43:57.443

@bwoebi That is forbidden. I apologize for missing the implicit string conversion side of things. I'm a little embarassed for not even catching that, considering my primary language of choice (JavaScript) uses that on a regular basis. – Isiah Meadows – 2014-06-01T19:45:44.447

"The function or program must output an integer": how can you distinguish an integer output from a string one? – Blackhole – 2014-06-01T20:01:18.307

1@Blackhole This basically excludes languages that don't have an integer type. – Isiah Meadows – 2014-06-01T21:23:31.257

Here is a short PostScript function that will convert a string to an integer: exec. (It even allows you to express it symbolically as the result of a computation!) – AJMansfield – 2014-06-02T12:24:28.833

1To be pedantic, you haven't specified how (or if) the output integer should correspond to the input string, so a solution that always outputs 42 would seem to be technically valid, as 42 is definitely an integer. – Ilmari Karonen – 2014-06-02T19:11:51.290

1Also, you might want to include some examples of valid and invalid input strings. For example, I'd like to know which of the following strings are valid inputs: 0, 1, -1, +1, +-1, -0, +0, 0000, 0001, +, -, (empty string), - 1, 123 456, 9999999999, 2147483648, -2147483648. – Ilmari Karonen – 2014-06-02T19:17:18.153

@AJMansfield: If that's allowed, I can beat it in GolfScript: ~ – Ilmari Karonen – 2014-06-02T20:03:43.653

Do we have to treat both 123 and +123, or does one of them suffice? (if so, I presume the former suffices) – FireFly – 2014-06-02T21:44:49.240

@ajmansfield, ilmari Not allowed. – Isiah Meadows – 2014-06-03T22:28:20.567

@firefly Either one. – Isiah Meadows – 2014-06-03T22:28:40.710

The exec is effectively equivalent to a parseInt method. – Isiah Meadows – 2014-06-03T22:30:29.097

Answers

4

GolfScript, 19 / 24 / 28 chars

Assuming that positive numbers are always preceded by a + sign (19 chars; online demo):

(45-~0@{48-\10*+}/*

Assuming that positive numbers never have a + sign (24 chars; online demo):

"-"/)0\{48-\10*+}/-1@,?*

Assuming that positive numbers may or may not have a + sign (28 chars; online demo):

"+"-"-"/)0\{48-\10*+}/-1@,?*

(This is the same as the 24-char solution about, but with "+"- prepended to strip away any + signs.)

All of these are based on the following 13-character program, which converts an unsigned base-10 digit string into an integer:

0\{48-\10*+}/

It works by keeping a running total on the stack and iterating over (the ASCII codes of) the characters in the string. For each character, it subtracts 48 (the ASCII code of 0) from the ASCII code, multiplies the running total by 10, and adds the two together to give the new running total.

The rest of the code is for sign handling. The always-signed case is easiest, because we can just chop the first character off the string and test its ASCII code to see if it's + (43) or - (45). Conveniently, the ASCII codes of those two characters differ by two, so we can easily remap them to 1 and -1 via subtraction and negation. (The code happens to also give the right answer for an unsigned zero, but mostly by accident.)

For the only-negatives-signed case, we use a different strategy: we split the input string at - signs, take the last segment as the number, and count the number of earlier segments to determine whether to negate the number or not. The code for the maybe-signed case is exactly the same, but we first strip off any plus signs to reduce it to the previous case.


Ps. If we can cheat and use eval, the obvious GolfScript solution is just one character long:

~

It doesn't handle leading + signs, though, since GolfScript integer literals don't allow them.

Ilmari Karonen

Posted 2014-05-30T06:45:47.873

Reputation: 19 513

This (the 28-character version) is the winner. – Isiah Meadows – 2014-07-13T07:55:09.313

5

Python 3 (68)

s=input()
n=s<'0'
t=0
for c in s[n:]:t=t*10+ord(c)-48
print(t-t*2*n)

The Boolean n remembers whether the string started with a minus sign by using the lexicographic string ordering, in which '-' is less then the lowest digit '0'. The expression s[n:] casts n to a number (0 or 1), skipping the minus sign if it was there. The expression t-t*2*n is a shortening of [t,-t][n].

xnor

Posted 2014-05-30T06:45:47.873

Reputation: 115 687

3

Python, 164 125 121 characters

n=raw_input()
d=[i-48for i in map(ord,n[::-1])if i!=45]
x=0
for i in range(len(d)):x+=d[i]*10**i
if'-'==n[0]:x=-x
print x

Prints out the value with a .0 on the end but that should be all right for the purposes of the exercise, right? Either way, it's a complete script in and of itself. Idea's pretty simple: get ASCII values of the characters passed in, convert those to numbers, multiply each digit by a relevant power of 10 and sum the lot. Test the first character passed in to see if it's negative.

It can almost certainly be shortened but since I'm really new to this whole golfing thing (2nd submission) I reckon it's not bad.

Edit: further golfed by TheRare

Zaphod65

Posted 2014-05-30T06:45:47.873

Reputation: 31

i!=45 can written as i>45 since all digits have larger values.The last two lines can be shortened to print[x,-x]['-'==n[0]]. – xnor – 2014-06-02T01:46:03.630

@xnor Or even print[-x,x]['-'<n[0]] – seequ – 2014-06-02T09:38:05.957

3

GolfScript, 33 characters

I'm still pretty new to GolfScript, so I'm not sure how well this came out. I could be missing some better approaches/optimizations. If you see any, let me know so I can hopefully learn!

I believe this code works correctly for any signed integer of the form: [-+]?[0-9]+

Fully golfed:

(.45=!2*(:s;48-.0>*\{48-s*\10*+}/

Expanded source with C-ish pseudocode:

"-01234567890"  # char in[] = "-01234567890";

(               # char c0 = in[0];
.45=!2*(:s;     # int s = c0 == '-' ? -1 : 1;
48-.0>*\        # int x = c0 > '0' ? c0 - '0' : 0;
{               # for (char c : in[1-]) {
  48-s*\10*+    #   x = x * 10 + (c - '0') * s;
}/              # }
                # return x;

Runer112

Posted 2014-05-30T06:45:47.873

Reputation: 3 636

Mine's a little bit shorter, but works basically the same way, although I only multiply the result with the sign at the end. Still, have a +1. – Ilmari Karonen – 2014-06-02T20:08:26.970

3

C 75 76 80 chars

p(char*s){int c,r=0,m=*s-45;for(;c=*s++;)r=c<48?r:r*10+c-48;return m?r:-r;}

No checks for valid inputs, as by rule 1.

THX bebe for edit - Now still 1 less

edc65

Posted 2014-05-30T06:45:47.873

Reputation: 31 086

2

CJam - 14

q(',\-\'0f-Ab*

It assumes the input is signed, that is, it always starts with a "+" or "-".
Try it online at http://cjam.aditsu.net/

q reads the input
( takes out the first character
',\- subtracts it from the ',' character, resulting in 1 for '+' and -1 for '-'
\ swaps the sign with the rest of the number
'0f- subtracts the character '0' from each character, obtaining an integer array
Ab converts this array to a number in base 10 (A=10)
* multiplies the number with the sign

Thanks Runer112

aditsu quit because SE is EVIL

Posted 2014-05-30T06:45:47.873

Reputation: 22 326

The ASCII codes for + and - are 43 and 45; that's a difference of 2, just like between -1 and 1. You can use this fact to squeeze out two characters, by replacing '+=2*( with 44\-, or using the character literal, ',\-. As a side note, I should really start using a hyper-golfed language like CJam instead of GolfScript... it seems objectively better. No way I can beat your answer with GolfScript. – Runer112 – 2014-06-02T20:10:34.740

2

Haskell - 51 69 chars

p('-':n)= -p n
p n=foldl(\v->(10*v-48+).fromEnum)0n

aviraldg

Posted 2014-05-30T06:45:47.873

Reputation: 189

2

Python 3 - 92 84

I'm pretty happy with this.

x=lambda n:[1,-1][n<'0']*sum(10**i*(ord(c)-48)for i,c in enumerate(n[n<'0':][::-1]))

Alternatively, without a lambda: (92)

def x(n):j=n[0]=='-';return((j*2-1)*sum(10**i*(ord(c)-48)for i,c in enumerate(n[j:][::-1])))

Thanks to xnor for n<'0'.

Wobblebot

Posted 2014-05-30T06:45:47.873

Reputation: 29

You need to deal with negative numbers. – xnor – 2014-06-02T19:10:55.830

@xnor There, fixed. :) – Wobblebot – 2014-06-02T19:21:58.327

2

Mathematica - 33 or 37

Fold[10#+#2&,0,ToCharacterCode@#-48]&

or

FromDigits[ToCharacterCode@#-48]&

if FromDigits is permissible. FromDigits takes a list of integers and interprets it as the digits of a number (in arbitrary base).

In contrast, Fold can be considered a fundamental part of the language (even though what is fundamental and what isn't is a bit subjective in Mathematica).

Szabolcs

Posted 2014-05-30T06:45:47.873

Reputation: 341

1

AWK, 120 characters

{for(i=0;i<10;i++)o[sprintf("%c",i+48)]=i;for(i=l=length($0);i>0;i--)(c=substr($0,i,1))in o?r+=o[c]*10^(l-i):r=-r;$0=r}1

Usage:

$ echo -0123456789 | awk -f parse-int.awk
-123456789

AWK is a language where numbers are strings. (Perl, Tcl, and the Unix shell are also such languages.) My code is stupid because it pretends that numbers and strings are different.

AWK has no function to convert a string to a code point, so I do the inverse, using sprintf("%c",i+48) to convert a code point to a string. I then loop from 0 to 9, storing o["0"] = 0 through o["9"] = 9 in an associative array. Later, I use o[c] to get the number 9 from the string "9". This is stupid because 9 and "9" are already the same thing.

The usual ways to split a string into characters are with FS="" or split(string,array,""), but these use "" as an empty regular expression, and I can't use regular expressions. So, I use c=substr($0,i,1) to get each character c from the input line $0. I check if c is an index in array o. If yes, I add the product of o[c] and a power of 10 to the result. If no, I assume that c is a negative sign and negate the result.

AWK, 1 character, no parser!

1

Usage:

$ echo -0123456789 | awk 1
-01234567890

This 1-character answer implements all the requirements except for parsing the integer, just by copying input to output.

To print the parsed integer, I must format it as a string. The input is already a valid integer formatted as a string. So I optimize my program to skip the parse!

kernigh

Posted 2014-05-30T06:45:47.873

Reputation: 2 615

1

JavaScript (ECMAScript 6) - 83 80 71 67 Characters

Version 4:

Since the input is always signed (i.e. has a leading + or -) then:

f=x=>(t=0,[(k=48-c.charCodeAt())>0?s=4-k:t=10*t+k*s for(c of x)],t)

However, if it can be unsigned then (+5 characters):

f=x=>(s=-1,t=0,[(k=48-c.charCodeAt())>0?s=4-k:t=10*t+k*s for(c of x)],t)

Version 3:

Since the input is always signed (i.e. has a leading + or -) then:

f=x=>[...x].reduce((p,c)=>(k=48-c.charCodeAt())>0?(s=4-k,0):10*p+k*s,0)

Version 2:

f=x=>[...x].reduce((p,c)=>((k=c.charCodeAt()-48)<0?p:k+10*p),0)*(x[0]=='-'?-1:1)

Version 1:

f=x=>[].reduce.call(x,(p,c)=>((k=c.charCodeAt()-48)<0?p:k+10*p),0)*(x[0]=='-'?-1:1)

JavaScript (ECMAScript 6) - 8 Characters

JavaScript will implicitly try to convert any variable to a number if you perform an arithmetic operation on it as part of the language (i.e. without using any external libraries). So any of the solutions below will convert a string containing a valid number to a number.

f=x=>~~x
f=x=>x-0
f=x=>x*1
f=x=>x|0

JavaScript (ECMAScript 6) - 7 Characters

f=x=>+x

JavaScript (ECMAScript 6) - 5 Characters

Or as an anonymous function:

x=>+x

MT0

Posted 2014-05-30T06:45:47.873

Reputation: 3 373

+1 for applying an array function over a string. Learn something new every day – edc65 – 2014-06-02T16:11:26.653

Now with the sign handling our soluctions could collide. Beware, the sign is not mandatory: +nnn, -nnn, nnn are handled. – edc65 – 2014-06-02T21:04:14.987

The OP states Assume it to be signed - and then reconfirms this in the comment dated May 30 7:00. Based on this I would say that the sign is mandatory and various other answers have also made that deduction. – MT0 – 2014-06-02T21:29:39.873

1

Java 7 - 107

Boring and long (typical for Java), but the other Java answer is an embarrassment so I had to remedy the situation.

int p(String i){for(int x=1,l=i.length(),a=0;;a=a*10+i.charAt(x++)-48)if(x==l)return i.charAt(0)==45?-a:a;}

Ungolfed:

int parseInt(String input) {
  for(int index = 1,len = input.length(),accumulator=0,num;;
      num = input.charAt(index) - '0',
      accumulator = accumulator * 10 + num,index++)
    if (index == len) return input.charAt(0)=='-'?-accumulator:accumulator;
}

durron597

Posted 2014-05-30T06:45:47.873

Reputation: 4 692

1

C (82 chars)

I'm late to the party and I guess my answer is really close to edc's.

int i(char*s){int v=0,n=*s-'-';if(!n)s++;for(;*s;s++)v=v*10+*s-'0';return n?v:-v;}

Connor Hollis

Posted 2014-05-30T06:45:47.873

Reputation: 111

1

Fortran - 46 characters

function i(s);character(len=*)s;read(s,*)i;end

Call like:

i('100')

anonymouscowherd

Posted 2014-05-30T06:45:47.873

Reputation: 11

Hmm, this might be close to breaking #4. – Kyle Kanos – 2014-06-08T18:12:45.680

0

C# - 82 characters

int p(string t)
{
 var n=t(0)=='-';
 var i=n?1:0;
 var v=0;
 while(i<t.length)
 {
   v=(10*v)+(t(i)-'0');
   i++;
 }
 return n ? -v : v;
}

I think this would also parse a string t into an int.

 var i = t.skip(t(0)=='-'?1:0).Aggregate(0,(p,c)=>(10*p)+(c-'0'))*(t(0)=='-'?-1:1);

Adam Speight

Posted 2014-05-30T06:45:47.873

Reputation: 1 234

0

Javascript (E6) 80 92

Edit: apply 'reduce' directly, no need to 'split'. Thanks to MTO

Esoteric

Q=S=>(m=1,[].reduce.call(S,(p,x)=>(c=48-x.charCodeAt())>0?(m=c-4)&0:p*10-c*m,0))

Simple loop stay behind at 92

P=S=>{for(n=i=0,m=S[0]>'/'||(S[i++]>'+'?-1:1);d=S.charCodeAt(i++);n=n*10+m*(d-48));return n}

Anyway, assume valid input: first letter can be '-' or '+' or digit, then digits only

... or even

F=S=>1*S

No library function used

edc65

Posted 2014-05-30T06:45:47.873

Reputation: 31 086

0

Scala, 115 characters

def p(s:String)=((0,1)/:s.reverse){case(a,'+')=>a;case(a,'-')=>(-a._1,0);case((t,n),d)=>(t+(d.toInt-48)*n,n*10)}._1

Ungolfed:

def parseInt(input: String) = ((0, 1) /: input.reverse) {
  case (acc, '+') => acc
  case (acc, '-') => (-acc._1, 0) // place never used again
  case ((total, place), digit) =>
    (total + (digit.toInt - 48) * place, place * 10)
}._1

Dan Getz

Posted 2014-05-30T06:45:47.873

Reputation: 533

0

Python 2.7 - 102 bytes

def p(n):
 a,r,l=0,range(10),list(n);d=dict(zip(map(str,r),r))
 while l:a*=10;a+=d[l.pop(0)]
 return a

I believe this is valid, but I'm not 100% sure about rule 3.

Creates a dictionary where the keys are the strings '0', '1', '2'... '9' and the values are the int representing that key. Creates a list of the characters in the input, and assigns ato 0. While the list we made isn't empty, multiply a by 10, take off the first item of that list and use it as a key in the dictionary, adding the value to a. Then just return a.

undergroundmonorail

Posted 2014-05-30T06:45:47.873

Reputation: 5 897

0

Java - 274 Characters

static int parseInt(String arg)
{
    int length = arg.length();
    String reversed = new StringBuilder(arg).reverse().toString();
    int sum = 0;
    for(int i=0; i<length-1; i++)
        sum += (reversed.charAt(i)-48) * Math.pow(10, i);
    if(arg.charAt(0) == '-')
        sum = -1*sum;
    else
        sum += (reversed.charAt(length-1)-48) * Math.pow(10, length-1);
    return sum;
}

padawan

Posted 2014-05-30T06:45:47.873

Reputation: 294

Whoops, what a fail :) Sorry for that. – padawan – 2014-06-01T13:13:07.827

For code-golf, you need to supply the number of bytes/characters. No-one will count them for you, but someone will tell you if you are off by one :-) – Bill Woodger – 2014-06-01T13:15:00.907

By byte, you mean, copy+pase to a text file and write the length? – padawan – 2014-06-01T13:20:03.770

0

C 226

Parses positive and negative integers.

#include <stdio.h>
#define P(x) printf("%d",x)
char c;
int n,x,y;
int main()
{
while(((c=getchar())=='-') || ((c>='0')&&(c<='9')))
{
if(c=='-')y=1;
else{
n*=10;
x=c-'0';
n+=x;
}
}
if(y)n=-1*n;
P(n);
getchar();
return 0;
}

bacchusbeale

Posted 2014-05-30T06:45:47.873

Reputation: 1 235

Overkill? The number is guaranted valid and a full program is not requested, just a function. – edc65 – 2014-06-02T18:13:15.830

0

Go - 141 characters

func p(s string) (t int) {
    for i, x := range s {
        y := int(x) - 48
        for j := len(s) - 1; j > i;  j-- {
            y *= 10
        }
        t += y
    }
    return
}

int() here is a straight conversion from a rune (a type identical to int32) to a longer int - and it has to be in the code somewhere to get the function to return an int rather than a rune. The above code is standard go formatting, but technically some of the whitespace is optional.

(If you copy/paste the above, before you count the characters, put it in play.golang.org and hit format.)

voutasaurus

Posted 2014-05-30T06:45:47.873

Reputation: 121

0

in :r4 (http://code.google.com/p/reda4/)

::str>int | str -- int
    0 swap ( c@+ $2f >? )(
        $39 >? ( 2drop ; )
        $30 - rot 10 * + swap )
    2drop ;

is in standard library http://code.google.com/p/reda4/source/browse/trunk/r4/Lib/parse.txt line 92

Phreda

Posted 2014-05-30T06:45:47.873

Reputation: 11

0

C - 69 characters

c,r,m;p(char*s){for(m=2,r=0;c=15&*s++;r=c>9?m=r:r*10+c);return--m*r;}

bebe

Posted 2014-05-30T06:45:47.873

Reputation: 3 916

0

J - 19 characters

10#.'0123456789'&i.

No string manipulation at all. Just index each character into the vector of decimal digits, and take the base-10 representation of that array.

Dan Bron

Posted 2014-05-30T06:45:47.873

Reputation: 421

1Does that handle negative numbers too? It seems to me it doesn't, but I don't really know J. – Ilmari Karonen – 2014-06-02T20:06:13.553

0

J, 41 chars

Here's a J solution that should follow the spec... I'm not very fond of the sign handling; it's too verbose. I can't think of a more compact way, though.

(u:48+i.10)&((_1^'-'e.])*10 p.~e.~#i.)@|.

Sample use:

   atoi =: (u:48+i.10)&((_1^'-'e.])*10 p.~e.~#i.)@|.
   atoi '-1234'
_1234

FireFly

Posted 2014-05-30T06:45:47.873

Reputation: 7 107

0

Python 116 108 146

O=b=0
I=list(input())
if I[0]=='-':I.pop(0);b=1
a=range(len(I))
for c in a:I[c]=ord(I[c])-48
for i in a:O+=I[i]*10**(len(I)-i-1)
if b:O=-O
print O

Harry Beadle

Posted 2014-05-30T06:45:47.873

Reputation: 787

-1

Ruby, 43 characters

a=0;$*[0].each_char{|n|a=a*10+n.ord-48};p a

Alternate answer (also 43 characters)

p $*[0].chars.inject(0){|a,b|a*10+b.ord-48}

afuous

Posted 2014-05-30T06:45:47.873

Reputation: 429

-3

Python 3.3 - 6 characters

f=eval

Call like:

>>> f("-12")
-12

Cheating? Well, it certainly stretches the limit of the rules, but eval is not an integer parsing function, per se, so...

propeller

Posted 2014-05-30T06:45:47.873

Reputation: 101