Background

A variable declaration statement in C consists of three parts: the name of the variable, its base type, and the type modifier(s).

There are three kinds of type modifiers:

Pointer * (prefix)
Array [N] (postfix)
Function () (postfix)
- You can specify a list of function arguments inside the parens, but for the sake of this challenge, let's ignore it and just use () (which technically means "the function can take any kind of arguments").

And a way to read out the notations is as follows:

int i;             // i is an int
float *f;          // f is a pointer to a float
my_struct_t s[10]; // s is an array of 10 my_struct_t
int func();        // func is a function returning an int

The catch is that we can mix all of these to form a more complicated type, such as an array of arrays, an array of function pointers, or a pointer to an array of pointers:

int arr[3][4];
// arr is an array of 3 arrays of 4 ints

int (*fptrs[10])();
// fptrs is an array of 10 pointers to functions returning an int

float *(*p)[16];
// p is a pointer to an array of 16 pointers to float

How did I read these complicated statements?

1. Start from the variable name. (name) is ...
2. Select the modifier with the highest precedence.
3. Read it:
- * -> pointer to ...
- [N] -> array of N ...
- () -> function returning ...
4. Repeat 2 and 3 until the modifiers are exhausted.
5. Finally, read the base type. ... (base type).

In C, postfix operators take precedence over prefix operators, and type modifiers are no exception. Therefore, [] and () bind first, then *. Anything inside a pair of parens (...) (not to be confused with the function operator) binds before anything outside.

Illustrated example:

int (*fptrs[10])();
      fptrs           fptrs is ...
           [10]       array of 10 ... // [] takes precedence over *
    (*         )      pointer to ...
                ()    function returning ...
int                   int
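The reading method above can be sketched in Python as a scan that works outward from the variable name. This is only an illustrative, ungolfed sketch: the function name describe and the assumption that the input follows the simplified grammar of this challenge (single-word base type, decimal array sizes, empty function parens) are mine, not part of any posted answer.

```python
import re

def describe(decl: str) -> str:
    """Read a simplified C declaration, working outward from the name."""
    base, expr = decl.rstrip(";").split(" ", 1)
    expr = expr.replace(" ", "")
    name = re.search(r"[A-Za-z_]\w*", expr)      # first identifier is the variable name
    left, right = expr[:name.start()], expr[name.end():]
    words = [name.group(), "is"]
    while left or right:
        if right.startswith("()"):               # postfix modifiers bind first
            words.append("function returning")
            right = right[2:]
        elif right.startswith("["):
            size, right = right[1:].split("]", 1)
            words += ["array of", size]
        elif left.endswith("*"):                 # then the prefix modifier
            words.append("pointer to")
            left = left[:-1]
        else:                                    # grouping parens: drop the matching pair
            left, right = left[:-1], right[1:]
    words.append(base)
    return " ".join(words)

print(describe("int (*fptrs[10])();"))
```

For the illustrated example, this prints "fptrs is array of 10 pointer to function returning int". The sketch does no validation, so malformed declarations give unspecified results.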

Task

Given a single-line variable declaration statement written in C, output the English expression that describes it, using the method shown above.

Input

The input is a single C statement that includes a single base type, a single variable name, zero or more type modifiers and the ending semicolon. You have to implement all the syntax elements covered above, plus:

Both the base type and the variable name match the regular expression [A-Za-z_][A-Za-z0-9_]*.
Theoretically, your program should support an unlimited number of type modifiers.

You can simplify other C syntax elements in the following ways (full implementation is also welcome):

The base type is always a single word, e.g. int, float, uint32_t, myStruct. Something like unsigned long long won't be tested.
For the array notation [N], the number N will always be a single positive integer written in base 10. Things like int a[5+5], int a[SIZE] or int a[0x0f] won't be tested.
For the function notation (), no parameters will be specified at all, as pointed out above.
For whitespace, only the space character 0x20 will be used. You can restrict your program to a specific usage of whitespace, e.g.
- Use only one space after the base type
- Use a space everywhere between tokens
However, you cannot use two or more consecutive spaces to convey more information than being a token separator.

According to C syntax, the following three combinations are invalid, and thus won't be tested:

f()() Function returning function
f()[] Function returning array
a[]() Array of N functions

C developers use these equivalent forms instead (and all of these are covered in the test cases):

(*f())() Function returning pointer to function
*f() Function returning pointer to array's first element
(*a[])() Array of N pointers to function

Output

The output is a single English sentence. You don't need to (but you can if you wish) respect English grammar, e.g. the use of a, an, the, singular/plural forms, and the ending dot (period). Each word should be separated by one or more whitespace characters (space, tab, newline) so the result is human-readable.

Again, here is the conversion process:

1. Start from the variable name. (name) is ...
2. Select the modifier with the highest precedence.
3. Read it:
- * -> pointer to ...
- [N] -> array of N ...
- () -> function returning ...
4. Repeat 2 and 3 until the modifiers are exhausted.
5. Finally, read the base type. ... (base type).

Test cases

int i;              // i is int
float *f;           // f is pointer to float
my_struct_t s[10];  // s is array of 10 my_struct_t
int func();         // func is function returning int
int arr[3][4];      // arr is array of 3 array of 4 int
int (*fptrs[10])(); // fptrs is array of 10 pointer to function returning int
float *(*p)[16];    // p is pointer to array of 16 pointer to float

_RANdom_TYPE_123 (**(*_WTH_is_TH15)())[1234][567];
/* _WTH_is_TH15 is pointer to function returning pointer to pointer to array of
   1234 array of 567 _RANdom_TYPE_123 */

uint32_t **(*(**(*(***p)[2])())[123])[4][5];
/* p is pointer to pointer to pointer to array of 2 pointer to function returning
   pointer to pointer to array of 123 pointer to array of 4 array of 5 pointer to
   pointer to uint32_t */

uint32_t (**((*(**(((*(((**(*p)))[2]))())))[123])[4])[5]);
// Same as above, just more redundant parens

some_type (*(*(*(*(*curried_func())())())())())();
/* curried_func is function returning pointer to function returning pointer to
   function returning pointer to function returning pointer to
   function returning pointer to function returning some_type */

Scoring & Winning criterion

This is a code-golf challenge. The program with the smallest number of bytes wins.

Bubbler

Posted 2018-10-25T04:24:31.990

Reputation: 16 616

Related: https://cdecl.org

– user202729 – 2018-10-25T05:46:31.000

int arr[3][4]; is an array of 3 arrays of 4 ints (as you say), or an array of 4 arrays of 3 ints? – Charlie – 2018-10-25T07:24:02.530

@Charlie The former is correct. sizeof(arr[0]) == sizeof(int[4]), so an item of arr contains four ints.

– Bubbler – 2018-10-25T07:34:48.500

Does the input contain the ; at the end of the line? – Black Owl Kai – 2018-10-25T10:54:05.607

Which types do we need to support? – 12431234123412341234123 – 2018-10-25T10:58:17.113

@12431234123412341234123 Everything that matches the regex [A-Za-z_][A-Za-z0-9_]* – Black Owl Kai – 2018-10-25T11:01:46.977

You may want to specify that you have to return a sentence as displayed in your description: * to pointer to, [##] to array of ##, () to function returning (with optionally a/an/the to make the sentences more correct). Currently your "The output is a single English sentence." seems to allow any English sentence. – Kevin Cruijssen – 2018-10-25T12:28:31.060

The question states that the syntax []() doesn't need to be supported. However, one of the test cases is int (*fptrs[10])();. Is this test case made valid because there is a ) between the array and the function, or because the * causes it to become "array of pointer to function" instead of "array of function"? – Kamil Drakari – 2018-10-25T15:56:35.690

Ew. Nested arrays. – None – 2018-10-25T17:48:08.677

@KamilDrakari It's the latter. "array of pointer to function" is essentially "array of pointer", which is perfectly valid in C. – Bubbler – 2018-10-25T23:01:27.880

Just to clarify: will we ever encounter something of the form int x[]; with empty square brackets? – DLosc – 2018-10-27T01:21:24.647

@DLosc No, the square brackets will always contain a positive integer. – Bubbler – 2018-10-27T04:24:41.973

I remember the K&R book on C has an exercise on this topic – RosLuP – 2018-10-27T15:40:26.647

Answers

Python 3, 331 312 294 261 240 bytes

from re import*
class V(str):__pos__=lambda s:V(s+'pointer to ');__call__=lambda s:V(s+'function returning ');__getitem__=lambda s,i:V(s+'array of %i '%i)
t,e=input().split()
print(eval(sub('\*','+',sub('(\w+)',r'V("\1 is ")',e[:-1],1)))+t)

Try it online!

-19 bytes by switching to Python 2 and putting the class definition into an exec

-18 bytes by changing the regex from [a-zA-Z_][a-zA-Z0-9_]* to \\w+, thanks to Kevin Cruijssen

-33 bytes by working some class definition magic and utilising str, thanks to Lynn, changing back to Python 3

-21 bytes by merging together multiple regexes, thanks to infmagic2047

Requires that only one space is contained in the input (between the type and the expression).

I think this is a pretty unique approach to the problem. This mostly uses the fact that Python itself can evaluate strings like (**((*(**(((*(((**(*p)))[2]))())))[123])[4])[5]) and gets the correct sequence of function calls, array indexes and pointers - and that the user can overload these.
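For readers who want to see the trick without the golfing, here is an ungolfed sketch of the same idea. The class name Reading and the function name describe are hypothetical names chosen for illustration, and the sketch assumes exactly one space in the input, as the answer does. It relies on Python giving call and subscript a higher precedence than unary +, which mirrors C's rule that postfix modifiers bind before the prefix *.

```python
import re

class Reading(str):
    """A string that grows as Python evaluates the declarator as an expression."""
    def __pos__(self):                 # unary +, standing in for C's prefix *
        return Reading(self + "pointer to ")
    def __call__(self):                # () after the declarator
        return Reading(self + "function returning ")
    def __getitem__(self, n):          # [N] after the declarator
        return Reading(self + f"array of {n} ")

def describe(decl):
    base, expr = decl.rstrip(";").split(" ", 1)
    # Wrap the variable name (the first identifier) in a Reading object.
    expr = re.sub(r"[A-Za-z_]\w*",
                  lambda m: f'Reading("{m.group()} is ")', expr, count=1)
    expr = expr.replace("*", "+")      # * is not a unary operator in Python
    # eval is acceptable for a demonstration only; never use it on untrusted input.
    return eval(expr, {"Reading": Reading}) + base

print(describe("float *(*p)[16];"))
```

This prints "p is pointer to array of 16 pointer to float".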

Black Owl Kai

Posted 2018-10-25T04:24:31.990

Reputation: 980

Nice approach, +1 from me! You can golf [a-zA-Z_][A-Za-z0-9_]* to [a-zA-Z_]\\w* to save a few bytes. EDIT: Actually, I think you can just use \\w+ instead of [a-zA-Z_][A-Za-z0-9_]*. – Kevin Cruijssen – 2018-10-25T12:48:05.000

I like this approach :) here it is in 253 bytes

– Lynn – 2018-10-25T13:16:42.033

That's a good point. 261 it is then.

– Lynn – 2018-10-25T13:37:59.467

You can use [0] instead of .group() since Python 3.6. – infmagic2047 – 2018-10-26T06:52:01.867

And here is a 240 bytes version.

– infmagic2047 – 2018-10-26T12:55:00.993

Retina 0.8.2, 142 138 128 117 bytes

(\w+) (.+);
($2) $1
\(\)
 function returning
\[(\d+)?]
 array of$#1$* $1
+`\((\**)(.+)\)
$2$1
\*
 pointer to
1` 
 is

Try it online! Link includes test cases. Better grammar. Edit: Saved 10 21 bytes by porting @DLosc's Pip solution. Explanation:

(\w+) (.+);
($2) $1

Move the type to the end and wrap the rest of the declaration in ()s in case it contains an outer *.

\(\)
 function returning

Process any functions.

\[(\d+)?]
 array of$#1$* $1

Process any arrays.

+`\((\**)(.+)\)
$2$1

Move any pointers to the end of their brackets, and delete the brackets, repeatedly working from the outermost set of brackets inwards.

\*
 pointer to

Process any pointers.

1` 
 is

Insert the is.
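The same sequence of rewrites can be sketched with Python's re module, which may be easier to step through than Retina. This is an illustrative port under my own assumptions (the function name describe, a mandatory array size, and a single space between the type and the declarator), not the author's code.

```python
import re

def describe(decl):
    # Move the type to the end and wrap the rest in parentheses.
    s = re.sub(r"(\w+) (.+);", r"(\2) \1", decl)
    # Functions and arrays never get reordered, so rewrite them first.
    s = s.replace("()", " function returning")
    s = re.sub(r"\[(\d+)\]", r" array of \1", s)
    # Repeatedly strip the outermost parentheses, moving any leading
    # asterisks to the end of the group they belonged to.
    while True:
        t = re.sub(r"\((\**)(.+)\)", r"\2\1", s, count=1)
        if t == s:
            break
        s = t
    s = s.replace("*", " pointer to")
    # The first space is the one right after the variable name.
    return s.replace(" ", " is ", 1)

print(describe("int (*fptrs[10])();"))
```

For this input the intermediate strings are "((*fptrs array of 10) function returning) int", then "(*fptrs array of 10) function returning int", then "fptrs array of 10* function returning int", and the final result is "fptrs is array of 10 pointer to function returning int".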

Neil

Posted 2018-10-25T04:24:31.990

Reputation: 95 035

Java 11, 469 467 463 450 bytes

s->{String r="",t,S[];for(s=s.replace("()","~");s.contains("(");s=s.replace(t,"").replace("()",""),r+=t+";")t=s.replaceAll(".*(\\([^()]+\\)).*","$1");S=s.split(" ");t=S[0];r+=r.isEmpty()?S[1]:s;S=r.split(";");r=S[0].replaceAll(".*?(\\w+).*","$1 is ");for(var p:S)r+=p.replaceAll("[A-Za-z_]+\\d+|[^\\[\\d]","").replaceAll("\\[(\\d+)","array of $1 ")+(p.contains("~")?"function returning ":"")+"pointer to ".repeat(p.split("\\*").length-1);return r+t;}

Try it online.

Explanation:

s->{               // Method with String as both parameter and return-type
  String r="",     //  Result-String, starting empty
         t,        //  Temp-String, starting uninitialized
         S[];      //  Temp String-array, starting uninitialized
  for(s=s.replace("()","~");
                   //  Replace all "()" in the input `s` with "~"
      s.contains("(");
                   //  Loop as long as the input `s` still contains "("
      ;            //    After every iteration:
       s=s.replace(t,"")
                   //     Remove `t` from `s`
          .replace("()",""),
                   //     And also remove any redundant parenthesis groups
       r+=t+";")   //     Append `t` and a semi-colon to the result-String
    t=s.replaceAll(".*(\\([^()]+\\)).*","$1");
                   //   Set `t` to the inner-most group within parenthesis
  S=s.split(" ");  //  After the loop, split the remainder of `s` on the space
  t=S[0];          //  Set `t` to the first item (the type)
  r+=              //  Append the result-String with:
    r.isEmpty()?   //   If the result-String is empty
                   //   (so there were no parenthesis groups)
     S[1]          //    Set the result-String to the second item
    :              //   Else:
     s;            //    Simply append the remainder of `s`
  S=r.split(";");  //  Then split `r` on semi-colons
  r=S[0].replaceAll(".*?(\\w+).*",
                   //  Extract the variable name from the first item
     "$1 is ");    //  And set `r` to this name appended with " is "
  for(var p:S)     //  Loop over the parts split by semi-colons:
    r+=            //   Append the result-String with:
      p.replaceAll("[A-Za-z_]+\\d+
                   //    First remove the variable name (may contain digits)
         |[^\\[\\d]","")
                   //    And then keep only digits and "["
       .replaceAll("\\[(\\d+)",
                   //    Extract the number after "["
         "array of $1 ")
                   //    And append the result-String with "array of " and this nr
      +(p.contains("~")?
                   //    If the part contains "~"
         "function returning "
                   //     Append the result-String with "function returning "
       :           //    Else:
        "")        //     Leave the result-String the same
      +"pointer to ".repeat(
                   //    And append "pointer to " repeated
         p.split("\\*").length-1);
                   //    as many times as there are "*" in the part
  return r         //  Then return the result-String
          +t;}     //  appended with the temp-String (type)

Kevin Cruijssen

Posted 2018-10-25T04:24:31.990

Reputation: 67 575

Fails on the test case with redundant parentheses. – Bubbler – 2018-10-25T07:45:57.727

@Bubbler Ah, didn't notice that new test case. Luckily it's an easy fix. – Kevin Cruijssen – 2018-10-25T08:07:21.593

Bash + cdecl + GNU sed, 180

cdecl is a venerable Unix utility that does most of what is required here, but in order to match I/O requirements, some sed pre- and post-processing is required:

sed -r 's/^/explain struct /;s/struct (int|char|double|float|void) /\1 /;s/\bfunc/_func/g'|cdecl|sed -r 's/^declare //;s/as/is/;s/struct //g;s/([0-9]+) of/of \1/g;s/\b_func/func/g'

No attempts made to correct grammar.

sed Pre-processing:

s/^/explain struct / - Add "explain struct " to the start of every line
s/struct (int|char|double|float|void) /\1 / - Remove struct when dealing with C language types
s/\bfunc/_func/g - "func" is recognized as a keyword by cdecl - suppress this

sed Post-processing:

s/^declare // - remove "declare" at start of line
s/as/is/ - self-explanatory
s/struct //g - remove all "struct" keywords
s/([0-9]+) of/of \1/g - correct ordering of "of "
s/\b_func/func/g - revert any "_func" that was replaced in pre-processing

In action:

$ < cdecls.txt sed -r 's/^/explain struct /;s/struct (int|char|double|float|void) /\1 /;s/\bfunc/_func/g'|cdecl|sed -r 's/^declare //;s/as/is/;s/struct //g;s/([0-9]+) of/of \1/g;s/\b_func/func/g'
i is int
f is pointer to float
s is array of 10 my_struct_t
func is function returning int
arr is array of 3 array of 4 int
fptrs is array of 10 pointer to function returning int
p is pointer to array of 16 pointer to float
_WTH_is_TH15 is pointer to function returning pointer to pointer to array of 1234 array of 567 _RANdom_TYPE_123
p is pointer to pointer to pointer to array of 2 pointer to function returning pointer to pointer to array of 123 pointer to array of 4 array of 5 pointer to pointer to uint32_t
p is pointer to pointer to pointer to array of 2 pointer to function returning pointer to pointer to array of 123 pointer to array of 4 array of 5 pointer to pointer to uint32_t
curried_func is function returning pointer to function returning pointer to function returning pointer to function returning pointer to function returning pointer to function returning some_type
$

Digital Trauma

Posted 2018-10-25T04:24:31.990

Reputation: 64 644

Would it be sufficient to do s/\bfu/_fu/g and save the bytes of the full func replacement? – DLosc – 2018-10-26T06:05:40.270

wait it's a real utility? I've always thought that it's the name of the website – phuclv – 2018-10-27T04:20:54.093

@phuclv cdecl is a real utility, and really useful for checking C declarations. – Patricia Shanahan – 2018-10-27T07:55:38.540

@phuclv Cdecl was written before the ANSI C standard was completed...

– Digital Trauma – 2018-10-27T15:44:39.580

Fails for a variable named as (+4 bytes for spaces to fix). I don't have access to cdecl but I think you can save 64 bytes using sed -r 's/^(\w+)(\W+)/explain struct \1_\2_/'|cdecl|sed -r 's/^declare struct _|_$//;s/ as / is /;s/([0-9]+) of/of \1/g'. – Neil – 2018-10-28T10:06:24.117

Pip `-s`, 152 150 148 139 137 126 125 123 bytes

Third approach!

YaRs" ("R';')R`\[(\d+)]`` array of \1`R"()"" function returning"L#aYyR`\((\**)(.+)\)`{c." pointer to"X#b}{[b"is"g@>2a]}Vy^s

Takes the declaration as a command-line input. Try it online!

Explanation

The code is in three parts: initial setup and handling of functions and arrays; a loop that handles parentheses and pointers; and a final rearrangement.

Setup, functions & arrays

We want the whole declaration to be parenthesized (this helps with the loop later on), so we change type ...; into type (...). Then, observe that no reordering is done with the descriptions of functions and arrays, so we can perform all those replacements first without affecting the final output.

Y                         Yank into y variable...
 a                        The result of a (the cmdline arg)...
  R s                     Replace the space
   " ("                    with " ("
  R ';                    Replace the semicolon
   ')                      with a closing paren
  R `\[(\d+)]`            Replace digits in square brackets
   ` array of \1`          with " array of <digits>"
  R "()"                  Replace function parens
   " function returning"   with " function returning"

If our original input was float *((*p()))[16];, we now have float (*((*p function returning)) array of 16).

Parentheses and pointers

We run a loop replacing the outermost pair of parentheses and any asterisks that are immediately inside the opening paren.

L#a                   Loop len(a) times (enough to complete all replacements):
 Y                    Yank into y variable...
  y                   The result of y...
   R `\((\**)(.+)\)`  Replace open paren, 0 or more asterisks (group 1), 1 or more
                      characters (group 2), and close paren
    {                  with this callback function (b = group 1, c = group 2):
     c .               The stuff in the middle, concatenated to...
      " pointer to"    that string
       X #b            repeated len(asterisks) times
    }

Example steps:

float (*((*p function returning)) array of 16)
float ((*p function returning)) array of 16 pointer to
float (*p function returning) array of 16 pointer to
float p function returning pointer to array of 16 pointer to

Cleanup

The only thing remaining is to move the type to the end and add "is":

{[b"is"g@>2a]}Vy^s
               y^s  Split y on spaces
{            }V     Use the resulting list as arguments to this function:
 [          ]        Return a list of:
  b                   2nd argument (the variable name)
   "is"               That string
       g@>2           All arguments after the 2nd
           a          1st argument (the type)
                    The resulting list is printed, joining on spaces (-s flag)

For definitions like int x;, this approach will result in an extra space, which is permitted by the challenge.

DLosc

Posted 2018-10-25T04:24:31.990

Reputation: 21 213

JavaScript (ES6), 316 ... 268 253 bytes

s=>(g=s=>[/\d+(?=])/,/\*/,/!/,/.+ /,/\w+/].some((r,i)=>(S=s.replace(r,s=>(O=[O+`array of ${s} `,O+'pointer to ','function returning '+O,O+s,s+' is '+O][i],'')))!=s)?g(S):'',F=s=>(O='',S=s.replace(/\(([^()]*)\)/,g))!=s?O+F(S):g(s)+O)(s.split`()`.join`!`)

Try it online!

Commented

Helper function

g = s =>                             // s = expression to parse
  [                                  // look for the following patterns in s:
    /\d+(?=])/,                      //   array
    /\*/,                            //   pointer
    /!/,                             //   function
    /.+ /,                           //   type
    /\w+/                            //   variable name
  ].some((r, i) =>                   // for each pattern r at index i:
    ( S = s.replace(                 //   S = new string obtained by removing
      r,                             //       the pattern matching r from s
      s => (                         //     using the first match s and the index i,
        O = [                        //     update the output O:
          O + `array of ${s} `,      //       array
          O + 'pointer to ',         //       pointer
          'function returning ' + O, //       function
          O + s,                     //       type
          s + ' is ' + O             //       variable name
        ][i],                        //
        ''                           //     replace the match with an empty string
    )))                              //   end of replace()
    != s                             //   make some() succeed if S is not equal to s
  ) ?                                // end of some(); if truthy:
    g(S)                             //   do a recursive call with S
  :                                  // else:
    ''                               //   stop recursion and return an empty string

Main part

s => (                 // s = input
  g = …,               // define the helper function g (see above)
  F = s => (           // F = recursive function, taking a string s
    O = '',            //   O = iteration output, initialized to an empty string
    S = s.replace(     //   S = new string obtained by removing the next expression from s
      /\(([^()]*)\)/,  //     look for the deepest expression within parentheses
      g                //     and process it with the helper function g
    )                  //   end of replace()
  ) != s ?             // if S is not equal to s:
    O + F(S)           //   append O to the final output and do a recursive call with S
  :                    // else (we didn't find an expression within parentheses):
    g(s) + O           //   process the remaining expression with g and return O
)(s.split`()`.join`!`) // initial call to F with all strings '()' in s replaced with '!'

Arnauld

Posted 2018-10-25T04:24:31.990

Reputation: 111 334

I was wondering why you used [...s.split`()`.join`!`] instead of just [...s.replace('()','!')], but I realized it's the exact same byte-count.. :) – Kevin Cruijssen – 2018-10-25T09:50:41.693

@KevinCruijssen The primary reason is that s.replace('()','!') would only replace the first occurrence. – Arnauld – 2018-10-25T09:57:29.427

Ah, of course. Forgot JS replace isn't the same as Java's. In Java .replace replaces all occurrences, and .replaceAll replaces all occurrences with regex enabled. Always thought the naming was quite bad for these two methods in Java, as I would have called them .replaceAll and .regexReplaceAll or something along those lines, but I guess for codegolf it's shorter as .replace and .replaceAll. – Kevin Cruijssen – 2018-10-25T10:00:54.370

BTW, I noticed that you were using the same technique (with ~) just after posting the first version of my own answer. Great minds think alike, I suppose. :p – Arnauld – 2018-10-25T10:07:56.857

Perl 6, 209 190 171 162 153 bytes

{~({(.[1]Z'is'),.<e>.&?BLOCK,('array of'X .[2]),('function returning','pointer to'Zxx.[3,0])if $_}(m:g/(\*)*[(\w+)+|\(<e=~~>.][\[(\d+).]*(\(.)*/[1]),$0)}

Try it online!

Recursive regex approach. Produces some extra space characters which can be avoided at the cost of 3 bytes.

Explanation

{     # Anonymous block
 ~(   # Convert list to string
   {  # Block converting a regex match to a nested list
     (.[1]            # Array of 0 or 1 variable names
       Z'is'),        # zipped with string "is"
     .<e>.&?BLOCK,    # Recursive call to block with subexpression
     ('array of'      # String "array of"
       X .[2]),       # prepended to each array size
     ('function returning',  # Strings "function returning"
      'pointer to'           # and "pointer to"
      Zxx             # zipped repetition with
      .[3,0])         # number of function and pointer matches
     if $_            # Only if there's an argument
   }
   (             # Call block
     m:g/        # Input matched against regex
      (\*)*      # Sequence of asterisks, stored in [0]
      [          # Either
       (\w+)+    # the variable name, stored as 1-element array in [1]
       |         # or
       \(        # literal (
         <e=~~>  # the same regex matched recursively, stored in <e>
       .         # )
      ]
      [\[(\d+).]*  # Sequence of "[n]" with sizes stored in [2]
      (\(.)*       # Sequence of "()" stored in [3]
     /
     [1]  # Second match
   ),
   $0     # First match (base type)
 )
}

nwellnhof

Posted 2018-10-25T04:24:31.990

Reputation: 10 037

Clean, 415 bytes

import StdEnv,Text
$s#(b,[_:d])=span((<>)' ')(init s)
=join" "(?d++[""<+b])
?[]=[]
?['()':s]=["function returning": ?s]
?['*':s]= ?s++["pointer to"]
?['[':s]#(n,[_:t])=span((<>)']')s
=["array of "<+n: ?t]
?s=case@0s of(['(':h],t)= ?(init h)++ ?t;(h,t)|t>[]= ?h++ ?t=[h<+" is"]
~c=app2((++)[c],id)
@n[c:s]=case c of'('= ~c(@(n+1)s);')'|n>1= ~c(@(n-1)s)=([c],s);_|n>0= ~c(@n s)=span(\c=c<>'('&&c<>'[')[c:s]
@_ e=(e,e)

Try it online!

Οurous

Posted 2018-10-25T04:24:31.990

Reputation: 7 916

R, 225 218 bytes

g=gsub
"&"="@"=paste
"["=function(a,b)a&"array of"&b
"+"=function(a)a&"pointer to"
eval(parse(t=g('\\(\\)','@"function returning"',g('(\\w+) (.*?)([A-Za-z_]\\w*)(.*);','\\2"\\3 is"\\4&"\\1"',g('\\*','+',readline())))))

Try it online!

Full program, wrapped in a function on TIO for convenient testing of all test cases at once.

First, we use a regex to convert the input of the form type ...name...; to ..."name is"..."type". Function notation () is then converted to text with a high-precedence concatenation operator. Unfortunately, we also have to replace * with + as the former is not acceptable as a unary operator. The rest is done by R's eval with overloaded operators.

Kirill L.

Posted 2018-10-25T04:24:31.990

Reputation: 6 693

Clever solution! – J.Doe – 2018-10-28T15:26:09.943

JavaScript 250 Bytes [249?]

This uses 250 Bytes:

k=>(a=k.match(/\W|\w+/g),s=[v=j=r=""],f=y=>!j&!a[i+1]||(m=a[i],v?(r+=v=m=='['?`array of ${a[i+=3,i-2]} `:m<')'?(i+=2,"function returning "):s[j-1]=='*'?j--&&"pointer to ":""):m==')'?v=j--|i++:m<'+'?s[j++]=a[i++]:r+=a[v=i++]+" is ",f(),r+a[0]),f(i=2))

Explanation:

Basically, it's reading from a buffer a, which is the tokenized input. It continuously moves tokens from the buffer a to a stack s, until evaluation mode is triggered. Evaluation mode will consume postfix operations first (), [] from the buffer, and then it will consume the prefix operator * from the stack. Evaluation mode is triggered when the state is where a word would be (Either the typename is found and consumed, or an ending ) is found and removed). Evaluation mode is deactivated when no more prefix/postfix operators are found.

k=>( // k is input
    a=k.match(/\W|\w+/g), // split by symbol or word
    s=[v=j=r=""], // j=0, v=false, r="", s=[]
    // s is the stack, r is the return string,
    // v is true if we're in evaluation mode (Consume (), [], *)
    // v is false if we're waiting to see a ) or token, which triggers evaluation
    // j is the index of the top of the stack (Stack pointer)
    f=y=>!j&!a[i+1]||( // !j means stack is empty, !a[i+1] means we're at the ;
        m=a[i], // Save a[i] in a variable
        v // Are we evaluating?
        ?(
        r+=v=
            m=='[' // Array
            ?`array of ${a[i+=3,i-2]} ` // Skip three tokens: "[", "10", "]"
                                        // a[i-2] is the "10"
            :m<')' // m == '('
                ?(i+=2,"function returning ") // Skip two tokens: "(", ")"
                :s[j-1]=='*' // Stack has a pointer
                    ?j--&&"pointer to " // Pop the stack
                    :"" // Set v to be false, r+=""
        )
        :m==')'
            ?v=j--|i++ // Pop the '(', skip over the ')', v = Evaluation mode
            :m<'+' // m == '*' || m == '('
                ?s[j++]=a[i++] // push(s, pop(a))
                :r+=a[v=i++]+" is " // Otherwise we have the token
        , f(), r+a[0] // Recurse f(), and return r+a[0]. a[0] is the type.
    ),
    f(i=2) // Set i=2, and call f(), which returns the final value r + type
    // a = ["type", " ", ...], so i=2 give the first real token
    // This soln assumes there is only one space, which is an allowed assumption
)
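The buffer-and-stack scheme described above may be easier to follow in an ungolfed form. The following Python sketch is my own reading of the explanation, with hypothetical names (describe, evaluating, stack), and it assumes well-formed input that ends with a semicolon.

```python
import re

def describe(decl):
    # Tokenize into words/numbers and single punctuation characters.
    tokens = re.findall(r"\w+|[^\w\s]", decl)
    base, tokens = tokens[0], tokens[1:]
    stack, out, i = [], [], 0
    evaluating = False                 # True while consuming modifiers
    while tokens[i] != ";" or stack:
        tok = tokens[i]
        if evaluating:
            if tok == "[":             # postfix: skip "[", size, "]"
                out.append(f"array of {tokens[i + 1]}")
                i += 3
            elif tok == "(":           # postfix: skip "(", ")"
                out.append("function returning")
                i += 2
            elif stack and stack[-1] == "*":
                stack.pop()            # prefix: consume a pending "*"
                out.append("pointer to")
            else:
                evaluating = False     # nothing left to consume at this level
        elif tok == ")":
            stack.pop()                # discard the matching "("
            i += 1
            evaluating = True
        elif tok in ("*", "("):
            stack.append(tok)          # defer prefix operators and open parens
            i += 1
        else:
            out.append(f"{tok} is")    # the variable name
            i += 1
            evaluating = True
    return " ".join(out + [base])

print(describe("int (*fptrs[10])();"))
```

This prints "fptrs is array of 10 pointer to function returning int". Unlike the golfed version it uses a loop instead of recursion, which is a design choice for readability rather than a claim about the original.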

NOTE

If I understand "Use a space everywhere between tokens" correctly:

k=>(a=k.split(" "),s=[v=j=r=""],f=y=>!j&!a[i+1]||(v?(r+=v=a[i]=='['?`array of ${a[i+=3,i-2]} `:a[i]<')'?(i+=2,"function returning "):s[j-1]=='*'?j--&&"pointer to ":""):a[i]==')'?v=j--|i++:a[i]<'+'?s[j++]=a[i++]:r+=a[v=i++]+" is ",f(),r+a[0]),f(i=1))

is technically valid, and uses

249 Bytes

Assuming that there's a space between every token.

Nicholas Pipitone

Posted 2018-10-25T04:24:31.990

Reputation: 123

This took me many many hours, despite it looking straightforward. I probably knocked 5-10 bytes / hour, starting with 350 chars. I do indeed have no life. – Nicholas Pipitone – 2018-10-26T17:54:09.850

I was at about 325 when I thought "I hit optimality with my current algorithm - rip", but then for some reason I was still able to knock 5-10 / hour, despite each knock being followed by "Okay, this is definitely the optimal result". Hitting 250 was arbitrary since it was the first to beat the reigning 253, so even though I still say "Okay, this is definitely the optimal result", there might still be more to optimize. – Nicholas Pipitone – 2018-10-26T17:56:44.687

Red, 418 410 bytes

func[s][n: t:""a: charset[#"a"-#"z"#"A"-#"Z"#"0"-#"9""_"]parse s[remove[copy x thru" "(t: x)]to a
change[copy x[any a](n: x)]"#"]b: copy[]until[c: next find s"#"switch c/1[#"("[append
b"function returning"take/part c 2]#"["[parse c[remove[skip copy d to"]"(append b
reduce["array of"d])skip]]]#")"#";"[take c c: back back c while[#"*"= c/1][take c
c: back c append b"pointer to"]take c]]s =""]reduce[n"is"b t]]

Try it online!

Explanation:

f: func [ s ] [
    n: t: 0                                         ; n is the name, t is the type
    a: charset [ #"a"-#"z" #"A"-#"Z" #"0"-#"9" "_" ]; characters set for parsing 
    parse s[                                        ; parse the input with the following rules
        remove [ copy x thru " " ](t: x)            ; find the type, save it to t and remove it from the string
        to a                                        ; skip to the next alphanumerical symbol
        change [ copy x [ any a ] (n: x) ] "#"      ; save it to n and replace it with '#'
    ]
    b: copy [ ]                                     ; block for the modifiers 
    until [                                         ; repeat 
       c: next find s "#"                           ; find the place of the name   
       switch c/1 [                                 ; and check what is the next symbol
           #"(" [ append b "function returning"     ; if it's a '('- it's a function - add the modifier       
                  take/part c 2                     ; and drop the "()"
                ]
           #"[" [ parse c [                         ; '[' - an array
                     remove [ skip copy d to "]"    ; save the number
                             (append b reduce [     ; and add the modifier 
                                  "array of" d
                              ] )                   
                             skip ]                 ; and remove it from the string
                     ]
                ]
           #")"                                     ; a closing bracket 
           #";" [ take c                            ; or ';' - drop it
                    c: back back c                  ; go to the left 
                    while [ #"*" = c/1 ]            ; and while there are '*'
                    [
                        take c                      ; drop them
                        c: back c                   ; go to the left
                        append b "pointer to"       ; add the modifier
                    ]
                    take c                          ; drop '(' (or space)
                 ]
       ]
       s = ""                                       ; until the string is exhausted
    ]
    reduce [ n "is" b t ]                     ; display the result
]

Galen Ivanov

Posted 2018-10-25T04:24:31.990

Reputation: 13 815

APL(NARS), chars 625, bytes 1250

CH←⎕D,⎕A,⎕a,'_'⋄tkn←nm←∆←''⋄in←⍬⋄⍙←lmt←lin←0
eb←{∊(1(0 1 0)(0 1)(1 0))[⍺⍺¨⍵]}
tb←{x←({⍵='[':3⋄⍵=']':4⋄⍵∊CH,' ':1⋄2}eb⍵)\⍵⋄(x≠' ')⊂x}

gt
tkn←''⋄→0×⍳⍙>lin⋄tkn←∊⍙⊃in⋄⍙+←1⋄→0×⍳(⍙>lin)∨'('≠↑tkn⋄→0×⍳')'≠↑⍙⊃in⋄tkn←tkn,⍙⊃in⋄⍙+←1

r←dcl;n
   n←0
B: gt⋄→D×⍳'*'≠↑tkn⋄n+←1⋄→B×⍳tkn≢''
D: r←ddcl⋄∆←∆,∊n⍴⊂'pointer to '

r←ddcl;q
   r←¯1⋄→0×⍳0>lmt-←1
   →A×⍳∼'('=↑tkn⋄q←dcl⋄→F×⍳')'=↑tkn⋄→0
A: →B×⍳∼(↑tkn)∊CH⋄nm←tkn⋄→F
B: r←¯2⋄→0
F: gt⋄→G×⍳∼tkn≡'()'⋄∆←∆,'function that return '⋄→F
G: →Z×⍳∼'['=↑tkn⋄∆←∆,'array of ',{''≡p←(¯1↓1↓tkn):''⋄p,' '}⋄→F
Z: r←0

r←f w;q
   nm←∆←''⋄in←tb w⋄⍙←1⋄lin←↑⍴in⋄lmt←150⋄gt⋄→A×⍳∼0>q←dcl⋄r←⍕q⋄→0
A: r←nm,' is a ',∆,1⊃in

This is just a translation from C to APL of the code in the book "Linguaggio C" by Brian W. Kernighan and Dennis M. Ritchie, chapter 5.12. I don't know how to shorten it, because I did not understand that code 100% and because I do not know much about APL... The function to exercise is f. I think only 150 nested parentheses '(' ')' are allowed. On error it returns a string containing a negative value, otherwise the description string. It seems this is not better than the other version, even though it has fewer chars, because the other one detects errors better. Some tests:

  f 'int f()()'
f is a function that return function that return int
  f 'int a[]()'
a is a array of function that return int
  f 'int f()[]'
f is a function that return array of int
  f 'int i;'
i is a int
  f 'float *f;'
f is a pointer to float
  f 'my_struct_t s[10];'
s is a array of 10 my_struct_t
  f 'int func();'
func is a function that return int
  f 'int arr[3][4];'
arr is a array of 3 array of 4 int
  f 'int (*fptrs[10])();'
fptrs is a array of 10 pointer to function that return int
  f 'float *(*p)[16]; '
p is a pointer to array of 16 pointer to float
  f '_RANdom_TYPE_123 (**(*_WTH_is_TH15)())[1234][567];'
_WTH_is_TH15 is a pointer to function that return pointer to pointe
  r to array of 1234 array of 567 _RANdom_TYPE_123
  f 'uint32_t (**((*(**(((*(((**(*p)))[2]))())))[123])[4])[5]);'
p is a pointer to pointer to pointer to array of 2 pointer to funct
  ion that return pointer to pointer to array of 123 pointer to
   array of 4 array of 5 pointer to pointer to uint32_t
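The recursive-descent structure that this answer translates (a dcl routine that counts leading asterisks and a direct-declarator routine that handles names, parentheses, and postfix modifiers) can be sketched in Python as follows. This is a hedged illustration rather than a transcription of the book's C code or of the APL above: the names describe, dcl, and dirdcl are my choices, the output wording follows the challenge rather than this answer, and error handling is omitted.

```python
import re

def describe(decl):
    # "()" is kept as one token so function parens differ from grouping parens.
    tokens = re.findall(r"\w+|\(\)|[^\w\s]", decl)
    pos = 1                                   # tokens[0] is the base type
    out = []

    def dcl():
        """dcl: optional *s followed by a direct declarator."""
        nonlocal pos
        stars = 0
        while tokens[pos] == "*":
            stars += 1
            pos += 1
        dirdcl()
        out.extend(["pointer to"] * stars)    # pointers are read after the postfix part

    def dirdcl():
        """dirdcl: name or (dcl), followed by any number of () or [N]."""
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1
            dcl()
            pos += 1                          # skip the closing ")"
        else:
            out.append(tokens[pos] + " is")
            pos += 1
        while tokens[pos] in ("()", "["):
            if tokens[pos] == "()":
                out.append("function returning")
                pos += 1
            else:
                out.append("array of " + tokens[pos + 1])
                pos += 3                      # skip "[", size, "]"

    dcl()
    return " ".join(out + [tokens[0]])

print(describe("float *(*p)[16];"))
```

This prints "p is pointer to array of 16 pointer to float". The sketch expects the trailing semicolon, since it uses that token to stop the postfix loop.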

RosLuP

Posted 2018-10-25T04:24:31.990

Reputation: 3 036

Read out the C variable declaration
