Clearly parenthesize APL trains

19

4

In APL, you can write tacit functions, called trains. How they work is irrelevant for this challenge. Here are the different ways they can be grouped, using as the function:

⍴      -> ⍴
⍴⍴     -> ⍴⍴
⍴⍴⍴    -> ⍴⍴⍴
⍴⍴⍴⍴   -> ⍴(⍴⍴⍴)
⍴⍴⍴⍴⍴  -> ⍴⍴(⍴⍴⍴)
⍴⍴⍴⍴⍴⍴ -> ⍴(⍴⍴(⍴⍴⍴))
...

The order remains the same. The procedure is that as long as there are strictly more than 3 functions, the last 3 functions are grouped into one function. If we meet a nested train, we parenthesize that first, before continuing. Here is the procedure applied to ⍴⍴⍴⍴⍴⍴:

Step 0: ⍴⍴⍴⍴⍴⍴
There are strictly more than 3 functions, repeat.
Step 1: ⍴⍴⍴(⍴⍴⍴)
There are strictly more than 3 functions, repeat.
Step 2: ⍴(⍴⍴(⍴⍴⍴))
There are 3 or less functions, we're done.

Here is the same procedure applied to ⍴⍴⍴(⍴⍴)⍴(⍴⍴⍴⍴(⍴⍴⍴))⍴⍴:

Step 0: ⍴⍴⍴(⍴⍴)⍴(⍴⍴⍴⍴(⍴⍴⍴))⍴⍴
There are strictly more than 3 functions, repeat.
We have met a nested train, applying procedure to that first:
  Step 0: ⍴⍴⍴⍴(⍴⍴⍴)
  There are strictly more than 3 functions, repeat.
  We have met a nested train, applying procedure to that first:
    Step 0: ⍴⍴⍴
    There are 3 or less functions, we're done.
  Step 1: ⍴⍴(⍴⍴(⍴⍴⍴))
  There are 3 or less functions, we're done.
Step 1: ⍴⍴⍴(⍴⍴)⍴((⍴⍴(⍴⍴(⍴⍴⍴)))⍴⍴)
There are strictly more than 3 functions, repeat.
We have met a nested train, applying procedure to that first:
  Step 0: ⍴⍴
  There are 3 or less functions, we're done.
Step 2: ⍴⍴⍴((⍴⍴)⍴((⍴⍴(⍴⍴(⍴⍴⍴)))⍴⍴))
There are strictly more than 3 functions, repeat.
Step 3: ⍴(⍴⍴((⍴⍴)⍴((⍴⍴(⍴⍴(⍴⍴⍴)))⍴⍴)))
There are 3 functions or less, we're done.

Input

For this challenge, input will be simplified. This means that you can choose 2 different chars for opening and closing parentheses and 1 char for functions, different from the ones chosen for parentheses. The chars you choose must be consistent. The input will not be empty, and will not contain parentheses with no content (i.e. ()).

Output

Again, you may choose 3 different chars, 2 for parentheses and 1 for functions. Note that they need not be the same as the ones chosen for input, but they must be consistent.

Rules

  • If there are parentheses which only enclose one function within them in the input, you must remove them in the output. Your output may not contain unneeded parentheses (i.e. enclosing only one function, or enclosing the whole output).
  • You don't need to implement the algorithm used here, as long as your solution is valid for this challenge.
  • Input and output are strings in the format explained in the Input and Output sections. The input will have at least one character.
  • Using the standard loopholes is strictly forbidden.
  • This is , so shortest answer wins. However, there won't be an accepted answer, since this is a per-language competition, and to encourage answering in languages in which this task would result in longer code compared to code written in other languages.

Test cases

The chars used here are ()⍴, you should replace them with your chosen chars.

⍴                          -> ⍴
⍴                          -> ⍴
⍴⍴                         -> ⍴⍴
⍴⍴⍴                        -> ⍴⍴⍴
⍴⍴⍴⍴                       -> ⍴(⍴⍴⍴)
⍴⍴⍴⍴⍴⍴⍴⍴⍴⍴⍴⍴⍴⍴⍴            -> ⍴⍴(⍴⍴(⍴⍴(⍴⍴(⍴⍴(⍴⍴(⍴⍴⍴))))))
⍴⍴⍴⍴⍴(⍴⍴⍴)⍴⍴(⍴(⍴⍴⍴)⍴⍴⍴)⍴⍴⍴ -> ⍴(⍴⍴(⍴⍴((⍴⍴⍴)⍴(⍴(⍴(⍴⍴⍴)(⍴⍴⍴))(⍴⍴⍴)))))
(⍴⍴⍴)(⍴⍴⍴)(⍴⍴⍴)            -> (⍴⍴⍴)(⍴⍴⍴)(⍴⍴⍴)
(⍴⍴⍴)(⍴⍴⍴)⍴⍴⍴              -> (⍴⍴⍴)(⍴⍴⍴)(⍴⍴⍴)
⍴⍴(⍴)⍴⍴                    -> ⍴⍴(⍴⍴⍴)
((⍴⍴))                     -> ⍴⍴
⍴⍴((⍴⍴))⍴⍴                 -> ⍴⍴((⍴⍴)⍴⍴)

This challenge has been posted in the Sandbox. If you have the required privilege, you can view the sandbox post here.

Erik the Outgolfer

Posted 2017-12-11T17:48:05.803

Reputation: 38 134

2I think Fully is a better title than Clearly. – Adám – 2017-12-11T18:01:36.893

1Reference answer to any test case. – Adám – 2017-12-11T18:06:06.313

@Adám I did expect that reference answer, it won't get many upvotes though ;) – Erik the Outgolfer – 2017-12-11T18:31:12.923

@Adám The Clearly part in the title refers to the fact that you must remove unneeded parentheses. Fully is something you're supposed to do when you answer a challenge ;p – Erik the Outgolfer – 2017-12-11T18:32:34.293

Is it true that this function will always be idempotent? – Esolanging Fruit – 2017-12-12T02:29:24.987

Can we assume there will never be ()? – Esolanging Fruit – 2017-12-12T02:59:05.410

@EsolangingFruit Yes, you can assume there will be no (), however what do you mean by "idempotent"? – Erik the Outgolfer – 2017-12-12T12:26:44.953

@EriktheOutgolfer Idempotent means applying the function repeatedly won't change the result. I think the answer is "yes". – Martin Ender – 2017-12-12T12:49:00.313

@EsolangingFruit Well, with proper adjusting of the I/O chars, the function should be idempotent. – Erik the Outgolfer – 2017-12-12T12:51:19.103

I have to express my distaste for the additional requirement that degenerate inputs need to be trimmed. I don't think it really adds anything to the challenge. – Post Rock Garf Hunter – 2017-12-12T15:00:19.150

@WheatWizard Well, it was suggested to me in the sandbox. – Erik the Outgolfer – 2017-12-12T15:01:43.183

Do the three chosen characters have to be printable? – Οurous – 2017-12-18T01:27:20.557

@Οurous No, they don't. But you still need to actually output them. – Erik the Outgolfer – 2017-12-18T10:59:29.543

Answers

7

APL (Dyalog Classic), 71 68 65 63 bytes

0{⍵≡⍕⍵:⍵⋄⍬≡⍵:'⍬'⋄1=≢⍵:⍺∇⊃⍵⋄3≥≢⍵:⍺⌽')(',⍣⍺∊1∇¨⍵⋄⍺∇¯3(↓,∘⊂1∇↑)⍵}⍎

Try it online!

The characters I chose for I/O are '(', ')', and '⍬'.

This solution is itself an APL train.

parses the input as if it's a nested array - a tree with empty numeric vectors () as leaves.

The dfn (i.e. lambda - { }) traverses the tree recursively and converts it to a properly parenthesised string. The left argument controls whether parentheses should be added to the current level if necessary.

The dfn handles the following cases based on the right argument:

  • if it's already a string (⍵≡⍕⍵), return it

  • if it's , return the char '⍬'

  • if it's a singleton, just dig deeper ( is the symbol for a recursive call)

  • if its length is ≤3, recurse for each of the items and surround with () if necessary

  • otherwise, recurse for the 3-tail, prepend all but the 3-tail, and recurse again

ngn

Posted 2017-12-11T17:48:05.803

Reputation: 11 449

Looks like 63 characters, most of them Unicode. What character encoding produces 63 bytes for this? I make it 141 bytes in UTF8. – Corey – 2017-12-12T02:16:14.430

3

@Corey APL has its own 8-bit code page.

– zwol – 2017-12-12T02:31:00.320

@Corey Relevant meta post.

– Adám – 2017-12-21T17:23:23.027

@Adám Thanks for that. I looked but didn't know what to search for to get that answer up. – Corey – 2017-12-21T22:10:05.470

3

Python 2, 224 208 204 bytes

-16 bytes thanks to Mr. Xcoder -4 bytes thanks to ovs

r=str.replace
p='p'
def c(l):
 while len(l)>3:l=l[:-3]+(l[-3:],)
 return l and map(c,l)or l
print r(r(r(r(r(`c(eval(r(r(r(input(),'(p)',p),p,'[],'),')','),')))`,'[]',p),*'[('),*'])'),' ',''),',','')[1:-1]

Try it online! or Try all test cases

The code can be divided in 3 major steps:
Converting the input into a nested list and replacing (p)->p. The single function p will be replaced by an empty list.

eval(r(r(r(input(),'(p)',p),p,'[],'),')','),'))

A recursive function to apply the "3 or less" rule on the current list and call itself on all-sublists.

def c(l):
 while len(l)>3:l=l[:-3]+(l[-3:],)
 return l and map(c,l)or l

Lots of replaces to format to the desired output format

r(r(r(r(r(`c(...)`,'[]',p),*'[('),*'])'),' ',''),',','')[1:-1]

Rod

Posted 2017-12-11T17:48:05.803

Reputation: 17 588

204 bytes – ovs – 2017-12-11T21:11:55.963

1This doesn't simplify ((pp)) (or p((pp))p). – Martin Ender – 2017-12-12T12:40:55.723

2

CJam, 56 bytes

Beats APL!

lW%~]]]{{_,{K_,({~}|}&}%{_,3>}{3/(aa\+:~}w}:K~~`1>W<W%S/

Try it online!

This works (I think) and I have no idea why...

Input characters are ][T for ()⍴, and output characters are ][0 for ()⍴ (yes, this means they are reversed from what you would expect; for example, you might pass in TTT]TT[T]TTTT]TTT[[TT).

High-Level Overview

The program works with the input backwards, because it's more convenient. To parse the input, we leverage CJam's parser — reversing and executing the input provides the (backwards) parsed form of the input.

We then define a procedure K. K does most of the work for our submission, and it works as follows:

  • The input array will be a mix of zeroes and nonempty sub-arrays. Identify the sub-arrays and recursively apply K to them. The result should be another array, and if this array consists of just a single element, unpack it (this gets rid of redundant parentheses).
  • As long as the result has more than three elements, group its first three (not its last three; recall that the input is being processed backwards) into a single list.
  • Return the result.

By applying K to the input, we get the properly parenthesized form of the input (the only thing to note is that we actually wrap the input in a singleton list and unwrap it afterwards; the reason for this is that we want the snippet that unpacks singletons to apply to the top-level program, and not just its sub-arrays). Afterwards, we just apply some minimal formatting to get our result.

Some explanation for golfed bits

The golf I'm most proud of is using , to perform the check between integers and arrays.

  • If the top-of-stack is an integer n, , generates the range [0..n). Since the only integer we will encounter is 0, this always gives us the empty list [], which is falsey.
  • If the top-of-stack is an array, , takes its length. Since all arrays we encounter will be non-empty, this always gives us a positive integer, which is truthy.

Another golf that might be of interest is the method I use to group the first three elements of the array; it's somewhat similar to my "Turing complete language interpreter code golf" submission. CJam doesn't have a short way to break an array into two pieces (you can try to slice off the first part and then the other part while keeping the original array and index on the stack, but that doesn't work very well), so what I do instead is use 3/, which groups an array into blocks of 3. I can then pop off the first element (, wrap in array twice aa, and then append back on to the start of the list \+. The reason we wrap in array twice is that we have to take off a layer with :~, since we just grouped the rest of the array into sections as well.

Esolanging Fruit

Posted 2017-12-11T17:48:05.803

Reputation: 13 542

Nitpick: this beats APL without builtins. – Erik the Outgolfer – 2017-12-12T12:37:02.623

@EriktheOutgolfer Fair enough. – Esolanging Fruit – 2017-12-12T15:19:19.243

0

JavaScript (ES6), 149 146 bytes

i='a';f=s=>s==(s=s[R='replace'](/\((\w+)\)/,(q,t)=>(f[q=i+=0]=f(t),q)))&&s==(s=s[R](/(?!^)((a0+|p){3})$/,"($1)"))?s[R](/a0+/g,t=>`(${f[t]})`):f(s)
<textarea cols=80 id=I>ppp(pp)p(pppp(ppp))pp</textarea><br>
<button onclick=O.innerText=f(I.value)>Run</button><br>
<pre id=O></pre>

Uses ()p, although you to use a different letter you can just change the p toward the end.

ETHproductions

Posted 2017-12-11T17:48:05.803

Reputation: 47 880