Write a program which replaces with spaces the braces in cases where braces in places cause stasis

17

2

You are a project manager. One day, one of your programmers went insane (not your fault) and took all the expressions in the codebase and added random brackets to them, before quitting on the spot, ranting about your incompetence (also not your fault). This would be an easy fix, however, for some reason you aren't using revision control (totally not your fault). And for some reason, none of the other programmers want to go through every expression to fix up the mismatched brackets (by the way, that is not your fault). Programmers these days, you think to yourself. You will have to do it yourself. The horror! Such tasks were supposed to be beneath you...

The input will be a single line, which will contain a number of left brackets (( [ {) and right brackets () ] }). It also may, but not always, contain comments (/* */) and string literals (" " or ' ') and various numbers, letters, or symbols.

There will be at least one bracket (outside of a comment or string literal) that does not have a corrosponding opposite (outside of a comment or string literal). For example, an errant } without a { prior. Another example: a ( which does not have a ) afterwards. Your program will replace with a space the minimum number of brackets required to make the brackets match up.

Examples:

(4 + (2 + 3))] ==> (4 + (2 + 3)) (the square bracket at the end)
][][[]] ==> [][[]] (the square bracket at the start)
("Hel(o!")) ==> ("Hel(o!") (the parenthesis at the end)
( /* )]*/ ==> /* )]*/ (the parenthesis at the start)
{()] ==> () (the curly bracket and the square bracket)

  • Input can be taken from whichever way is most convenient (STDIN, command line argument, reading from a file, etc.)
  • If there is more than one way to resolve the mismatch with the same number of removals, either is acceptable.
  • There will be only mismatches in brackets. String literals and comments will always be correctly formed.
  • The title comes from this SO thread
  • There will never be any quotes in comments, quotes in quotes, comments in comments, or comments in quotes.

This is code golf, so minimum number of bytes wins. Ask questions in comments if the specification is not clear.

absinthe

Posted 2014-08-19T22:09:20.717

Reputation: 8 359

Whoops, our edits kind of collided there. :P Everything should be fixed now. – Doorknob – 2014-08-19T22:22:16.553

@Doorknob Thanks for that, by the way. Didn't know how to stop SE from wiping the spaces.​​​​​​​​​​​​​​​ – absinthe – 2014-08-19T22:22:47.130

Do we have to handle escaping stuff in string literals (e.g. ("foo (\") bar"))? – Doorknob – 2014-08-19T22:30:22.657

​​​​​​​​​​​​​​​@Doorknob To simplify things, I'm just going to say there won't be any double quotes ever inside a string literal in the input. – absinthe – 2014-08-19T22:34:10.207

Similarly, can there be quotes in a comment / a /* inside quotes / etc.? For example this is "a tricky /* input" /* it is "very tricky */ – Doorknob – 2014-08-19T22:38:27.187

​​​​​​​​​​​​​​​@Doorknob How about this. Quotes override comments. So in your example, everything between a and is is a quote, and the /* and " inside that just become part of the quote. This leaves the */ on the outside as a mismatched quote, which won't every happen according to the bullet point. – absinthe – 2014-08-19T22:41:52.347

Why do quotes override comments, but comments not override quotes? That seems somewhat inconsistent. It would make more sense if either a.) There will never be quotes in comments / comments in quotes, or b.) There can be both quotes in comments and comments in quotes. – Doorknob – 2014-08-19T22:43:32.843

​​​​​​​​​​​​​​​@Doorknob Yeah, I suppose you're right. We'll go with the former option then, so any input where there are quotes in comments or comments in quotes will never happen. I'll edit to reflect that. – absinthe – 2014-08-19T22:44:52.313

@Doorknob In most languages isn't the deal with strings comments just that what ever starts first continues until the ending delimiter is encountered? – Martin Ender – 2014-08-19T23:01:46.777

@MartinBüttner Yes, which would be option B in my comment. But option A is simpler for the challenge. – Doorknob – 2014-08-19T23:42:10.453

Is the correct output for {{(}) {(}) or { } ? – QuadmasterXLII – 2014-08-20T18:04:10.330

​​​​​​​​​​​​​​​@QuadmasterXLII The correct output would be {(}) (based on my definition above), but I'm inclined to let anything go for these edge cases, since they're so confusing for everybody involved :/ – absinthe – 2014-08-20T22:28:27.247

1I would argue that the correct output for {{(}) should be { } or equivalent, since the opening scenario implies that the code was working to begin with, and {(}) counts as mismatched brackets in every programming language I know (i.e. "causes stasis"??). But, then, I already wrote an answer, so I'm biased. – DLosc – 2014-08-21T22:48:17.233

@DLosc, that is indeed the syntactically correct answer. However, I wanted to highlight the incompetence of the project manager in the scenario. – absinthe – 2014-08-21T22:50:16.860

3I see. Guess I'm just not incompetent enough. ;) – DLosc – 2014-08-21T23:00:46.230

Answers

6

Ruby, 223 characters

This one turned out a bit long.

u,b,i=[],[[],[],[]],-1
s=gets.gsub(/(\/\*|"|').*?(\*\/|"|')|~/){|m|u+=[m];?~}
s.chars{|c|i+=1
(t='{[('.index(c))?b[t].push(i):((t='}])'.index(c))&&(b[t].pop||s[i]=' '))}
b.flatten.map{|l|s[l]=' '}
puts s.gsub(/~/){u.shift}

What it does is take out the strings and comments first, so they don't get counted (and puts them back in later).

Then, it goes through the string character by character. When it finds an opening brace, it stores its position. When it finds a closing brace, it pops from its respective open brace storage array.

If pop returns nil (i.e. there weren't enough opening braces), it removes the closing brace. After this entire thing is done, it removes the remaining extra opening braces (i.e. there weren't enough closing braces).

At the end of the program, it puts all the strings and comments back and outputs them.

Ungolfed:

in_str = gets

# grab strings and comments before doing the replacements
i, unparsed = 0, []
in_str.gsub!(/(\/\*|"|').*?(\*\/|"|')|\d/){|match| unparsed.push match; i += 1 }

# replaces with spaces the braces in cases where braces in places cause stasis
brace_locations = [[], [], []]
in_str.each_char.with_index do |chr, idx|
    if brace_type = '{[('.index(chr)
        brace_locations[brace_type].push idx
    elsif brace_type = '}])'.index(chr)
        if brace_locations[brace_type].length == 0
            in_str[idx] = ' '
        else
            brace_locations[brace_type].pop
        end
    end
end
brace_locations.flatten.each{|brace_location| in_str[brace_location] = ' ' }

# put the strings and comments back and print
in_str.gsub!(/\d+/){|num| unparsed[num.to_i - 1] }
puts in_str

Doorknob

Posted 2014-08-19T22:09:20.717

Reputation: 68 138

This is seriously impressive. One question, though: will it still work for an input like (("string"/*comment*/)"string"? If I'm reading (the ungolfed version) correctly, you replace strings and comments with their index in the unparsed array, which would lead to a substitution like ((12)3 and then looking for a nonexistent index 12 (or 11). I see the golfed version just uses shift, but might it not still have a similar problem? – DLosc – 2014-08-21T23:49:21.227

4

Python 3, 410 322 317

import re;a='([{';z=')]}';q=[re.findall('".*?"|/\*.*?\*/|.',input())]
while q:
 t=q.pop(0);s=[];i=0
 for x in t:
  if x in a:s+=[x]
  try:x in z and 1/(a[z.find(x)]==s.pop())
  except:s=0;break
 if[]==s:print(''.join(t));break
 while 1:
  try:
   while t[i]not in a+z:i+=1
  except:break
  u=t[:];u[i]=' ';q+=[u];i+=1

Tries every possible set of deletions, starting with smaller ones, until it finds one where the braces are balanced. (By which I mean fully correctly balanced: {{(}) produces ( ), not {(}).)

The first version used a recursive generator function, which was really cool but also really long. This version performs a simple breadth-first search using a queue. (Yes, it's a factorial time algorithm. What's the problem? :^D)

DLosc

Posted 2014-08-19T22:09:20.717

Reputation: 21 213

I like this one because it actually finds the least removal and produces correctly nested expressions, but the last comment by @vonilya suggests that correct nesting isn't important. However, it is really slow if lots of braces need to be removed. – rici – 2014-08-21T06:09:43.200

2

C - 406

An Attempt in C without using regular expressions.

#define A if((d==125||d==93||d==41)
char*s;t[256];f(i,m,n,p){while(s[i]!=0){int c=s[i],k=s[i+1],v=1,d;if((c==42&&k==47)||(c==m&&i>n))return i;if(!p||p==2){if((c==39||c==34)||(c==47&&k==42)){i=f(i,c*(c!=47),i,p+1);
c=c==47?42:c;}d=c+1+1*(c>50);A){v=f(i+1,d,i,2);if(!p&&v)t[d]++;if(p==2&&v)i=v;}}d=c;A&&!p){v=!!t[c];t[c]-=v;}if(p<2)putchar(c*!!v+32*!v);i++;}return 0;}main(int c,char*v[]){s=v[1];f(0,0,0,0);}

To compile and run (on a linux machine):
gcc -o brackets brackets.c
./brackets "[(])"

In undefined cases like [ ( ] ) it returns the last valid bracket pair ( )

Markuz

Posted 2014-08-19T22:09:20.717

Reputation: 1 824

2

Python 3, 320

import re
O=dict(zip('([{',')]}'))
def o(s,r,m=0,t=[]):m+=re.match(r'([^][)({}/"]|/(?!\*)|/\*.*?\*/|".*?")*',s[m:]).end();return r and o(s[:m]+' '+s[m+1:],r-1,m+1,t)or(o(s,r,m+1,t+[O[s[m]]])if s[m]in O else[s[m]]==t[-1:]and o(s,r,m+1,t[:-1]))if s[m:]else not t and s
s=input();i=0;a=0
while not a:a=o(s,i);i+=1
print(a)

Like DLosc's solution, this investigates every possible deletion, but it uses a recursive explore and fallback strategy which is a lot faster. I know that speed is not a criterion in code golf, and exhaustive search is in any case exponential, but this one can handle inputs like ({({({({({({({({(}}}}}}}} in a couple of seconds.

rici

Posted 2014-08-19T22:09:20.717

Reputation: 601

Well played, well played. I did manage to get down to 317, but I think you should be able to pass that pretty easily. (Meanwhile, my program is still churning away at your example input...) – DLosc – 2014-08-22T18:20:31.303

@DLosc: Don't hold your breath :). It took my machine 58 minutes to do the version of that pattern with 6 open parens. In order to solve the stasis before the universe reaches heat-death, you'll need to memoise the queue; otherwise, you end up with an O(n!!) solution, not O(n!). (My golf is O(n*2^n) instead of O(2^n), because o actually produces all the patterns with up to r removals, instead of exactly r removals. Easy to fix, but it would cost a few characters.) – rici – 2014-08-23T03:24:14.910