Write a code golfer

26

7

Your job, should you choose to not refuse it, is to write a program that will "golf" a piece of given code in your language of choice.

Basic functionality:

  • Remove extraneous whitespace
  • Remove comments

'Advanced' functionality:

  • Combine declarations (int x; int y; int z; to int x, y, z;)
  • Use shorter variations of basic control structures (while(1) is replaced with for(;;))
  • Move code around (i = 0; while(i < 10) { /* ... */ ++i; } to for(i = 0; i < 10; ++i) { /* ... */ })
  • Etc

Mateen Ulhaq

Posted 2011-09-05T19:19:53.493

Reputation: 1 889

Question was closed 2017-04-27T11:36:52.230

Change to popcon explained in discussion on meta

– Peter Taylor – 2014-04-16T14:51:14.787

I proposed a question which turned out to be a duplicate of this one. Hopefully the new activity on this question will attract some new answers, and more languages too.

– trichoplax – 2014-04-16T15:41:11.167

11I think you need some actual criteria for scoring submissions. This is way too vague as it is. – migimaru – 2011-09-05T19:35:32.800

1

I was already thinking about a Golfscript golfer in Golfscript for my blog, so I'll certainly think about this. But I agree that you don't have a criterion for selecting a winner. You might find it useful to read the discussion on an question idea migimaru proposed.

– Peter Taylor – 2011-09-05T21:10:12.280

I guess the largest % of characters in a 'normal' ungolfed simple "Hello World!" program reduced is a start. – Mateen Ulhaq – 2011-09-05T21:29:10.020

1@muntoo A typical "Hello world" program is usually already pretty condensed. – Casey Chu – 2011-09-05T22:44:52.280

@Casey Yeah, but I wanted something that's easily available for any language. I suppose FizzBuzz is a slightly better option. – Mateen Ulhaq – 2011-09-05T23:09:47.263

What would be more ideal would be to run an answer program against a secret collection of small programs, and score the answer based on the total of all the programs' new lengths. Problem is, you would need a set of programs for each language golfed by an answer, and a way to compensate for differences in language. – Joey Adams – 2011-09-06T01:11:56.410

@Joey You could request users to send example programs for each language to a mail adress. – FUZxxl – 2011-09-06T13:05:38.080

7If somebody else reduces your program with his program successfully, you're out. :) The winner is the one, who is the longest time on top. Submission time is somehow substracted or added - I have to think about it. – user unknown – 2011-09-06T13:56:12.820

Perhaps I may have to limit the input/output language to one only (not sure which... GolfScript?). Then I can do: "The program with the smallest ratio of new_code to old_code will win." – Mateen Ulhaq – 2011-09-06T18:37:10.280

6If I had such a piece of software, I'd rather use it to my own benefit than share it with rivals ;) – J B – 2011-09-07T12:15:38.793

I think a program should be rated by the % by which it has been reduced. And by the way, I write my programs with ridiculously long variables: m_pointerToAnUnsignedIntegerWhichMayHappenToBeYetAnotherPointer :D – Neil – 2011-09-13T14:15:54.350

1@Neil What does your for loop look like? for ( size_t index_variable_in_3rd_nested_for_loop = 0; index_variable_in_3rd_nested_for_loop < MAX_VALUE_OF_index_variable_in_3rd_nested_for_loop; ++index_variable_in_3rd_nested_for_loop ) the_sum_which_is_determined_in_3rd_nested_for_loop += the_array_of_numbers_which_total_the_sum_which_is_determined_in_3rd_nested_for_loop[ index_variable_in_3rd_nested_for_loop ]; – Mateen Ulhaq – 2011-09-14T00:48:40.000

@munto: Oh good god no. It's at least three times as bad. Why use size_t when you can typedef it to "AmbiguouslyLargeSizeTTypeSomewhereBetweenFourAndEightBytes"? ;) – Neil – 2011-09-15T11:18:09.230

@Neil You mean AmbiguouslyLargeSizeTTypeSomewhereBetweenAndIncludingFourAndEightBytes. – Mateen Ulhaq – 2011-09-15T23:14:21.403

1@muntoo: Yes, excellent point. Could be interpreted to mean that it excludes the four and eight bytes. – Neil – 2011-09-16T12:20:34.363

I've rejected a proposed edit which wanted to put shortening method names in the basic functionality. Since there's no space to give a detailed reason in the rejection form, this seems to be the best place. It seems to me that it's covered already by the "Etc" for advanced functionality; and in addition, this question doesn't seem to be going anyway so it's rather a trivial change to zombie it. – Peter Taylor – 2011-10-02T21:28:37.553

Answers

18

Python with Python

Does a bunch of stuff including renaming of variables, getting rid of unnecessary whitespace and comments, and putting as much as it can on one line. Doesn't always completely work with fancier python syntax and I'll will be continuing to update with any fixes.

Code:

import string
import keyword
import pkgutil

builtins = __builtins__.__dict__.keys()

vars = {}
#imported = builtins+string.__dict__.keys()+['append','extend','count','index','insert','pop','remove','reverse','sort']
multiline = ''
ml_last = ''
strings = []
defined = []
undefined = []

def get_name(name):
    if name.startswith('__'):
        vars[name] = name
        return name
    if name in vars:
        return vars[name]

    for c in string.letters+'_':
        if c not in vars.values():
            vars[name] = c
            return c

    for c0 in string.letters+'_':
        for c1 in string.letters+string.digits+'_':
            if c0+c1 not in vars.values():
                if c0+c1 in keyword.kwlist:
                    continue
                vars[name] = c0+c1
                return c0+c1

def replace_names(expr,defining=False,prefix = '',assign=True):
    if ';' in expr:
        ns = ''
        for e in expr.split(';'):
            ns += replace_names(e,assign=assign)+';'
        return ns[:-1]

    global multiline
    expr = expr.strip()
    if expr in ['']+keyword.kwlist:
        return expr
    if expr == '""':
        return '"'+strings.pop(0)+'"'
    if expr == "''":
        return "'"+strings.pop(0)+"'"

    if '=' in expr and assign:
        e = expr[:]
        vals = ['']
        while '=' in e:
            i = e.index('=')
            if e != '' and e[0] == '=':
                vals[-1] += '='
                e = e[1:]
                continue
            if e[i-1] not in '!<>*/+-%' and e[i+1] != '=' and (vals[-1]+e[:i]).count('(') == (vals[-1]+e[:i]).count(')'):
                vals[-1] += e[:i]
                e = e[i+1:]
                vals.append('')
            else:
                vals[-1] += e[:i+1]
                e = e[i+1:]

        if len(vals) > 1:
            vals[-1] += e
            ns = ''
            left,val = vals[:-1],vals[-1]
            for l in left:
                rs = replace_names(l,True,assign=assign)
                ns += rs+'='
            ns += replace_names(val,assign=assign)
            return ns

    if expr[0] in ['(','[','{']:
        try:
            delimit = expr[0]
            i = 0; level = 1
            while level > 0:
                i += 1
                char = expr[i]
                if char in '([{':
                    level += 1
                if char in ')]}':
                    level -= 1
            inner = expr[1:i]
            rest = expr[i+1:]
            return expr[0]+replace_names(inner,defining,assign=False)+expr[i]+replace_names(rest,defining,assign=assign)

        except IndexError:
            multiline = expr
            return ''

    if expr.startswith('for') and not expr.endswith('in'):
        varname = ''
        curword = ''
        for i,char in enumerate(expr):
            if char in string.letters+string.digits+'_':
                curword += char
            else:
                if curword == 'in':
                    break
                curword = ''
            varname += char
        rest = expr[i:]

        dpart = replace_names(varname[3:-2],True,assign=assign)
        rpart = replace_names(rest,assign=assign)
        return 'for' + ' '*(dpart[0] in string.letters+string.digits+'_') + dpart + ' '*(dpart[-1] in string.letters+string.digits+'_') + 'in' + ' '*(rpart[0] in string.letters+string.digits+'_') + rpart

    if expr.startswith('lambda'):
        args = expr.split('lambda',1)[1].split(':')[0]
        replace_names(args,True,assign=assign)

    poses = ['' if e == -1 else e for e in (expr.find(char) for char in ['(','[','{'])]
    pos = min(poses)
    if pos != '':
        delimit = '([{'[poses.index(pos)]
        first,rest = expr.split(delimit,1)
        return replace_names(first,defining,assign=assign)+replace_names(delimit+rest,defining,assign=assign)

    multiline = ''
    if ' ' in expr:
        ns = ''
        for sub in expr.split(' '):
            rs = replace_names(sub,defining,assign=assign)
            if rs == '':
                continue
            if ns != '' and (ns[-1] in string.letters+string.digits+'_' and rs[0] in string.letters+string.digits+'_'):
                ns += ' '+rs
            else:
                ns += rs
        return ns

    for cmp in ['**=','//=','==','!=','<>','<=','>=','+=','-=','*=','/=','%=','//','**','<<','>>','<','>','+','-','*','/','%','&','|','^','~',':',',','.']:
        if cmp in expr:
            ns = ''
            for sub in expr.split(cmp):
                rs = replace_names(sub,defining,prefix,assign=assign)+cmp
                ns += rs 
                if cmp == '.':
                    prefix += rs 
            return ns[:-len(cmp)]

    if expr[0] in string.digits:
        return expr
    if not defining and expr not in defined:
        #print '---',prefix+expr
        if prefix+expr not in undefined and (prefix == '' or prefix[0] != '.') :
            undefined.append(prefix+expr)
        return expr
    if defining:
        if prefix+expr in undefined:
            undefined.remove(prefix+expr)
        if expr not in defined:
            defined.append(expr)
    return get_name(expr)

def fix_names(line):
    for cmp in ['**=','//=','==','!=','<>','<=','>=','+=','-=','*=','/=','%=','//','**','<<','>>','<','>','+','-','*','/','%','&','|','^','~',':',',','.','=','(',')','[',']','{','}']:
        if cmp in line:
            ns = ''
            for sub in line.split(cmp):
                ns += fix_names(sub)+cmp
            return ns[:-len(cmp)]
    if line in defined and line not in vars.values():
        return vars[line]
    return line

def first_pass(file):
    lines_firstpass = []
    for line in file:
        if line.strip() == '':
            continue
        indent = 0
        for char in line:
            if not char in string.whitespace:
                break
            indent += 1

        if multiline != '':
            line_string = ml_last
        else:
            line_string = ''
            #line_string = '\t'*indent
        line = multiline + line.strip()

        newline = ''
        while line:
            char = line[0]
            if char in ['"',"'"]:
                limit=char; i=0
                inside = ''
                escape = False
                while True:
                    i+=1; char = line[i]
                    if escape:
                        escape = False
                        inside += char
                        continue
                    if char == '\\':
                        escape = True
                    elif char == limit:
                        break
                    inside += char
                strings.append(inside)
                newline += limit*2
                line = line[i+1:]
            else:
                if char == '#':
                    break
                newline += char
                line = line[1:]
        line = newline

        if line == '':
            continue 
        if ' ' not in line:
            first = ''
        else:
            first,line = line.split(' ',1)  

        if first in ['class','def']:
            name = line.split('(')[0].split(':')[0].strip()
            line_string += first+' '
            defined.append(name)
            line_string += get_name(name)
            if '(' in line:
                line_string += '('
                inner = line.split('(',1)[1]
                inner = ')'.join(inner.split(')')[:-1])
                part = ''
                for char in inner:
                    if char == ',' and part.count('(') == part.count(')'):
                        line_string += replace_names(part,True)+','
                        part = ''
                    else:
                        part += char
                line_string += replace_names(part,True)+')'
            line_string += ':'

        to_import = ''
        importing = []
        if first == 'from':
            module,rest = line.split('import')
            module = module.strip()
            #imported.append(module)
            first,line = 'import',rest
            to_import += 'from '+module+' '

        if first == 'import':
            to_import += 'import '
            for module in line.split(','):
                module = module.strip()
                #imported.append(module)
                to_import += module+','
            to_import = to_import[:-1]
            line_string += to_import

        if line_string.strip() == '':
            r = replace_names(first+' '+line)
            if multiline != '':
                ml_last = line_string + r
                continue
            line_string += r
            ml_last = ''
        lines_firstpass.append((indent,line_string))
        #print '\t'*indent+line_string

    for i,(indent,line) in enumerate(lines_firstpass):
        lines_firstpass[i] = (indent,fix_names(line))
    return lines_firstpass

def second_pass(firstpass):
    lines = []
    current_line = ''
    current_line_indent = 0
    last_indent = 0
    for i,(indent,line) in enumerate(firstpass):
        for kw in keyword.kwlist:
            if line[:len(kw)] == kw:
                first = kw
                line = line[len(kw):]
                break
        else:
            first = ''
        limit=';'
        for kw in ['import','global']:
            if first == kw and current_line.startswith(kw):
                first = ''
                line = line.strip()
                limit=','

        if first not in ['if','elif','else','while','for','def','class','try','except','finally'] and indent == last_indent:
            current_line += limit*(current_line != '') + first + line
        else:
            lines.append((current_line_indent,current_line))
            current_line = first + line
            current_line_indent = indent
        last_indent = indent
    lines.append((current_line_indent,current_line))

    new_lines = []
    i = 0
    while i < len(lines):
        indent,line = lines[i]
        if i != len(lines)-1 and lines[i+1][0] == indent + 1 and (i == len(lines)-2 or lines[i+2][0] <= indent):
            new_lines.append((indent,line+lines[i+1][1]))
            i += 1
        else:
            new_lines.append((indent,line))
        i += 1
    return new_lines

def third_pass(lines):
    new_definitions = ''
    for u in sorted(undefined,key=lambda s:-s.count('.')):
        #print u
        parts = u.split('.')
        if parts[0] in vars.values():
            continue
        c = 0
        for indent,line in lines:
            if line.startswith('import'):
                continue
            c += line.count(u)
        if c > 1:
            new_definitions += ';'*(new_definitions!='')+get_name(u)+'='+u
            for ind,(indent,line) in enumerate(lines):
                if line.startswith('import'):
                    continue
                nline = ''
                cur_word = ''
                i = 0
                while i < len(line):
                    char = line[i]
                    if char not in string.letters+string.digits+'_.':
                        if cur_word == u:
                            nline += get_name(u)
                        else:
                            nline += cur_word
                        cur_word = ''
                        nline += char
                        i += 1
                        continue
                    if char in '"\'':
                        nline += char
                        limit = char
                        escape = False
                        while True:
                            i += 1
                            char = line[i]
                            nline += char
                            if escape:
                                escape = False
                                continue
                            if char == '\\':
                                escape = True
                            if char == limit:
                                break
                        i += 1
                        continue
                    cur_word += char
                    i += 1
                lines[ind] = (indent,nline+cur_word)

    return [lines[0]]+[(0,new_definitions)]+lines[1:]

def golf(filename):
    file = open(filename)
    write_file = open('golfed.py','w')
    for indent,line in third_pass(second_pass(first_pass(file))):
        write_file.write('\t'*(indent/2)+' '*(indent%2)+line+'\n')
    file.close()
    write_file.close()

#print first_pass(["for u in sorted(undefined,key=lambda s:-s.count('.')):"])
golf('golfer.py')

Tested on an old fractal drawing program I had (4672 to 1889):

Original:

import pygame
import math
import os
import colorsys
from decimal import *

#two = Decimal(2)
#half = Decimal(0.5)

def fractal_check_point(function,x,y):
    #n = (Decimal(0),Decimal(0))
    n = (0,0)
    i = 0
    last_dist = 0
    while n[0]**2 + n[1]**2 <= 16 and i < max_iter:
        nr,ni = function(n)
        n = (nr+x,ni+y)
        i+=1
    if i == max_iter:
        return False

    #extra = math.log(math.log( (n[0]**two + n[1]**two)**half )/math.log(300),2)
    extra = math.log(math.log( (n[0]**2 + n[1]**2)**0.5 )/math.log(300),2)

    #prev = math.sqrt(last_dist)
    #final = math.sqrt(n.real**2+n.imag**2)

    return i - extra

def f((r,i)):
    return (r**2 - i**2, 2*r*i)

screen_size = (500,500)
try: screen = pygame.display.set_mode(screen_size)
except pygame.error:
    print 'Too large to draw to window...'
    screen = pygame.Surface(screen_size)

#pixels = pygame.PixelArray(screen)

#xmin = Decimal(- 2.2)
#xmax = Decimal(.8)
#
#ymin = Decimal(- 1.5)
#ymax = Decimal(1.5)

max_iter = 50

xmin = -2.2
xmax = 0.8
ymin = -1.5
ymax = 1.5


def draw_fractal():
    print repr(xmin),repr(xmax)
    print repr(ymin),repr(ymax)
    print

    xlist = []
    ylist = []

    for x in range(screen_size[0]):
        #xlist.append(Decimal(x)*(xmax-xmin)/Decimal(screen_size[0])+xmin)
        xlist.append(x*(xmax-xmin)/screen_size[0]+xmin)

    for y in range(screen_size[1]):
        #ylist.append(Decimal(y)*(ymax-ymin)/Decimal(screen_size[1])+ymin)
        ylist.append(y*(ymax-ymin)/screen_size[1]+ymin)

    xi = 0
    for x in xlist:
        yi = 0
        for y in ylist:
            val = fractal_check_point(f,x,y)
            if val == False:
                screen.set_at((xi,yi),(0,0,0))
                #pixels[xi][yi] = (0,0,0)
            else:
                r,g,b = colorsys.hsv_to_rgb(val/10.0 % 1, 1, 1)
                screen.set_at((xi,yi),(r*255,g*255,b*255))
                ##screen.set_at((xi,yi),(0,(val/300.0)**.25*255,(val/300.0)**.25*255))
                #pixels[xi][yi] = (0,(val/300.0)**.25*255,(val/300.0)**.25*255)

            yi += 1
        xi += 1
        pygame.event.get()
        pygame.display.update((xi-1,0,1,screen_size[1]))
    save_surface('F:\FractalZoom\\')

def save_surface(dirname):
    i = 0
    name = '%05d.bmp' % i
    while name in os.listdir(dirname):
        i += 1
        name = '%05d.bmp' % i
    pygame.image.save(screen,dirname+name)
    print 'saved'

x_min_step = 0
x_max_step = 0
y_min_step = 0
y_max_step = 0

savefail = 0

def zoom(xmin_target,xmax_target,ymin_target,ymax_target,steps):
    global xmin
    global xmax
    global ymin
    global ymax

    xc = (xmax_target + xmin_target)/2
    yc = (ymax_target + ymin_target)/2

    d_xmin = ((xc-xmin_target)/(xc-xmin))**(1.0/steps)
    d_xmax = ((xc-xmax_target)/(xc-xmax))**(1.0/steps)
    d_ymin = ((yc-ymin_target)/(yc-ymin))**(1.0/steps)
    d_ymax = ((yc-ymax_target)/(yc-ymax))**(1.0/steps)

    for s in range(steps):
        xmin = xc-(xc-xmin)*d_xmin
        xmax = xc-(xc-xmax)*d_xmax
        ymin = yc-(yc-ymin)*d_ymin
        ymax = yc-(yc-ymax)*d_ymax

        draw_fractal()
        save_dir = 'D:\FractalZoom\\'
        global savefail
        if not savefail:
            try:
                save_surface(save_dir)
            except:
                print 'Warning: Cannot save in given directory '+save_dir+', will not save images.'
                savefail = 1

#zoom(.5,.6,.5,.6,10)
#zoom(-1.07996839017,-1.07996839014,-0.27125861927,-0.27125861923,100)

#n = 1
#while 1:
#    pygame.display.update()
#    pygame.event.get()
#    
#    def f(x):
#        if x == 0:
#            return 0
#        else:
#            return x**n
#    draw_fractal()
#    n += .0001

draw_fractal()
zooming = 0
#firstx = Decimal(0)
#firsty = Decimal(0)
firstx = firsty = 0

clicking = 0
while 1:
    pygame.display.update()
    pygame.event.get()
    mx, my = pygame.mouse.get_pos()
    rx, ry = pygame.mouse.get_rel()

#   mx = Decimal(mx)
#   my = Decimal(my)
#   sx = Decimal(screen_size[0])
#   sy = Decimal(screen_size[1])
    sx = screen_size[0]
    sy = screen_size[1]

    if pygame.mouse.get_pressed()[0]:
        if clicking == 0:
            clicking = 1
        if zooming and clicking == 1:
            secondx = mx*(xmax-xmin)/sx+xmin
            secondy = my*(ymax-ymin)/sy+ymin

            firstx = firstx*(xmax-xmin)/sx+xmin
            firsty = firsty*(ymax-ymin)/sy+ymin

            if secondx < firstx:
                xmin = secondx
                xmax = firstx
            else:
                xmin = firstx
                xmax = secondx

            if secondy < firsty:
                ymin = secondy
                ymax = firsty

            else:
                ymin = firsty
                ymax = secondy

            screen.fill((0,0,0))
            screen.lock()
            draw_fractal()
            screen.unlock()
            zooming = 0 

        elif clicking == 1:
            firstx = mx
            firsty = my

            zooming = 1
            screen.set_at((firstx,firsty),(255,255,255))

        if clicking:
            clicking = 2

    else:
        clicking = 0

Golfed:

import pygame,math,os,colorsys;from decimal import *
ai=pygame.event.get;aj=pygame.display.update;ak=math.log;al=pygame.mouse;am=False;an=pygame;ao=repr;ap=range;aq=os
def a(b,c,d):
 e=(0,0);f=0;g=0
 while e[0]**2+e[1]**2<=16 and f<o:h,i=b(e);e=(h+c,i+d);f+=1
 if f==o:return False
 j=ak(ak((e[0]**2+e[1]**2)**0.5)/ak(300),2);return f-j
def k((l,f)):return(l**2-f**2,2*l*f)
m=(500,500)
try:n=pygame.display.set_mode(m)
except pygame.error:print'Too large to draw to window...';n=pygame.Surface(m)
o=50;p=-2.2;q=0.8;r=-1.5;s=1.5
def t():
 print ao(p),ao(q);print ao(r),ao(s);print;u=[];v=[]
 for c in ap(m[0]):u.append(c*(q-p)/m[0]+p)
 for d in ap(m[1]):v.append(d*(s-r)/m[1]+r)
 w=0
 for c in u:
    x=0
    for d in v:
     y=a(k,c,d)
     if y==am:n.set_at((w,x),(0,0,0))
     else:l,z,A=colorsys.hsv_to_rgb(y/10.0%1,1,1);n.set_at((w,x),(l*255,z*255,A*255))
     x+=1
    w+=1;ai();aj((w-1,0,1,m[1]))
 B('F:\FractalZoom\\')
def B(C):
 f=0;D='%05d.bmp'%f
 while D in os.listdir(C):f+=1;D='%05d.bmp'%f
 pygame.image.save(n,C+D);print'saved'
E=0;F=0;G=0;H=0;I=0
def J(K,L,M,N,O):
 global p,q,r,s;P=(L+K)/2;Q=(N+M)/2;R=((P-K)/(P-p))**(1.0/O);S=((P-L)/(P-q))**(1.0/O);T=((Q-M)/(Q-r))**(1.0/O);U=((Q-N)/(Q-s))**(1.0/O)
 for V in ap(O):
    p=P-(P-p)*R;q=P-(P-q)*S;r=Q-(Q-r)*T;s=Q-(Q-s)*U;t();W='D:\FractalZoom\\';global I
    if not I:
     try:B(W)
     except:print'Warning: Cannot save in given directory '+W+', will not save images.';I=1
t();X=0;Y=Z=0;_=0
while 1:
 aj();ai();aa,ab=pygame.mouse.get_pos();ac,ad=pygame.mouse.get_rel();ae=m[0];af=m[1]
 if pygame.mouse.get_pressed()[0]:
    if _==0:_=1
    if X and _==1:
     ag=aa*(q-p)/ae+p;ah=ab*(s-r)/af+r;Y=Y*(q-p)/ae+p;Z=Z*(s-r)/af+r
     if ag<Y:p=ag;q=Y
     else:p=Y;q=ag
     if ah<Z:r=ah;s=Z
     else:r=Z;s=ah
     n.fill((0,0,0));n.lock();t();n.unlock();X=0
    elif _==1:Y=aa;Z=ab;X=1;n.set_at((Y,Z),(255,255,255))
    if _:_=2
 else:_=0

Run on itself (creating a very long quine) (9951 to 5323):

import string,keyword,pkgutil;a=__builtins__.__dict__.keys();b={};c='';d='';e=[];f=[];g=[]
aw=string.letters;ax=string.digits;ay=keyword.kwlist;az=False;aA=True;aB=len;aC=enumerate;aD=open
def h(i):
 if i.startswith('__'):b[i]=i;return i
 if i in b:return b[i]
 for j in aw+'_':
    if j not in b.values():b[i]=j;return j
 for k in aw+'_':
    for l in aw+ax+'_':
     if k+l not in b.values():
        if k+l in ay:continue
        b[i]=k+l;return k+l
def m(n,o=az,p='',q=aA):
 if';'in n:
    r=''
    for s in n.split(';'):r+=m(s,q=q)+';'
    return r[:-1]
 global c;n=n.strip()
 if n in['']+ay:return n
 if n=='""':return'"'+e.pop(0)+'"'
 if n=="''":return"'"+e.pop(0)+"'"
 if'='in n and q:
    s=n[:];t=['']
    while'='in s:
     u=s.index('=')
     if s!=''and s[0]=='=':t[-1]+='=';s=s[1:];continue
     if s[u-1]not in'!<>*/+-%'and s[u+1]!='='and(t[-1]+s[:u]).count('(')==(t[-1]+s[:u]).count(')'):t[-1]+=s[:u];s=s[u+1:];t.append('')
     else:t[-1]+=s[:u+1];s=s[u+1:]
    if aB(t)>1:
     t[-1]+=s;r='';v,w=t[:-1],t[-1]
     for x in v:y=m(x,aA,q=q);r+=y+'='
     r+=m(w,q=q);return r
 if n[0]in['(','[','{']:
    try:
     z=n[0];u=0;A=1
     while A>0:
        u+=1;B=n[u]
        if B in'([{':A+=1
        if B in')]}':A-=1
     C=n[1:u];D=n[u+1:];return n[0]+m(C,o,q=az)+n[u]+m(D,o,q=q)
    except IndexError:c=n;return''
 if n.startswith('for')and not n.endswith('in'):
    E='';F=''
    for u,B in aC(n):
     if B in aw+ax+'_':F+=B
     else:
        if F=='in':break
        F=''
     E+=B
    D=n[u:];G=m(E[3:-2],aA,q=q);H=m(D,q=q);return'for'+' '*(G[0]in aw+ax+'_')+G+' '*(G[-1]in aw+ax+'_')+'in'+' '*(H[0]in aw+ax+'_')+H
 if n.startswith('lambda'):I=n.split('lambda',1)[1].split(':')[0];m(I,aA,q=q)
 J=[''if s==-1 else s for s in(n.find(B)for B in['(','[','{'])];K=min(J)
 if K!='':z='([{'[J.index(K)];L,D=n.split(z,1);return m(L,o,q=q)+m(z+D,o,q=q)
 c=''
 if' 'in n:
    r=''
    for M in n.split(' '):
     y=m(M,o,q=q)
     if y=='':continue
     if r!=''and(r[-1]in aw+ax+'_'and y[0]in aw+ax+'_'):r+=' '+y
     else:r+=y
    return r
 for N in['**=','//=','==','!=','<>','<=','>=','+=','-=','*=','/=','%=','//','**','<<','>>','<','>','+','-','*','/','%','&','|','^','~',':',',','.']:
    if N in n:
     r=''
     for M in n.split(N):
        y=m(M,o,p,q=q)+N;r+=y
        if N=='.':p+=y
     return r[:-aB(N)]
 if n[0]in ax:return n
 if not o and n not in f:
    if p+n not in g and(p==''or p[0]!='.'):g.append(p+n)
    return n
 if o:
    if p+n in g:g.remove(p+n)
    if n not in f:f.append(n)
 return h(n)
def O(P):
 for N in['**=','//=','==','!=','<>','<=','>=','+=','-=','*=','/=','%=','//','**','<<','>>','<','>','+','-','*','/','%','&','|','^','~',':',',','.','=','(',')','[',']','{','}']:
    if N in P:
     r=''
     for M in P.split(N):r+=O(M)+N
     return r[:-aB(N)]
 if P in f and P not in b.values():return b[P]
 return P
def Q(R):
 S=[]
 for P in R:
    if P.strip()=='':continue
    T=0
    for B in P:
     if not B in string.whitespace:break
     T+=1
    if c!='':U=d
    else:U=''
    P=c+P.strip();V=''
    while P:
     B=P[0]
     if B in['"',"'"]:
        W=B;u=0;X='';Y=False
        while aA:
         u+=1;B=P[u]
         if Y:Y=az;X+=B;continue
         if B=='\\':Y=True
         elif B==W:break
         X+=B
        e.append(X);V+=W*2;P=P[u+1:]
     else:
        if B=='#':break
        V+=B;P=P[1:]
    P=V
    if P=='':continue
    if' 'not in P:L=''
    else:L,P=P.split(' ',1)
    if L in['class','def']:
     i=P.split('(')[0].split(':')[0].strip();U+=L+' ';f.append(i);U+=h(i)
     if'('in P:
        U+='(';C=P.split('(',1)[1];C=')'.join(C.split(')')[:-1]);Z=''
        for B in C:
         if B==','and Z.count('(')==Z.count(')'):U+=m(Z,aA)+',';Z=''
         else:Z+=B
        U+=m(Z,aA)+')'
     U+=':'
    _='';aa=[]
    if L=='from':ab,D=P.split('import');ab=ab.strip();L,P='import',D;_+='from '+ab+' '
    if L=='import':
     _+='import '
     for ab in P.split(','):ab=ab.strip();_+=ab+','
     _=_[:-1];U+=_
    if U.strip()=='':
     ac=m(L+' '+P)
     if c!='':d=U+ac;continue
     U+=ac;d=''
    S.append((T,U))
 for u,(T,P)in aC(S):S[u]=(T,O(P))
 return S
def ad(ae):
 af=[];ag='';ah=0;ai=0
 for u,(T,P)in aC(ae):
    for aj in ay:
     if P[:aB(aj)]==aj:L=aj;P=P[aB(aj):];break
    else:L=''
    W=';'
    for aj in['import','global']:
     if L==aj and ag.startswith(aj):L='';P=P.strip();W=','
    if L not in['if','elif','else','while','for','def','class','try','except','finally']and T==ai:ag+=W*(ag!='')+L+P
    else:af.append((ah,ag));ag=L+P;ah=T
    ai=T
 af.append((ah,ag));ak=[];u=0
 while u<aB(af):
    T,P=af[u]
    if u!=aB(af)-1 and af[u+1][0]==T+1 and(u==aB(af)-2 or af[u+2][0]<=T):ak.append((T,P+af[u+1][1]));u+=1
    else:ak.append((T,P))
    u+=1
 return ak
def al(af):
 am=''
 for an in sorted(g,key=lambda s:-s.count('.')):
    ao=an.split('.')
    if ao[0]in b.values():continue
    j=0
    for T,P in af:
     if P.startswith('import'):continue
     j+=P.count(an)
    if j>1:
     am+=';'*(am!='')+h(an)+'='+an
     for ap,(T,P)in aC(af):
        if P.startswith('import'):continue
        aq='';ar='';u=0
        while u<aB(P):
         B=P[u]
         if B not in aw+ax+'_.':
            if ar==an:aq+=h(an)
            else:aq+=ar
            ar='';aq+=B;u+=1;continue
         if B in'"\'':
            aq+=B;W=B;Y=False
            while aA:
             u+=1;B=P[u];aq+=B
             if Y:Y=az;continue
             if B=='\\':Y=True
             if B==W:break
            u+=1;continue
         ar+=B;u+=1
        af[ap]=(T,aq+ar)
 return[af[0]]+[(0,am)]+af[1:]
def at(au):
 R=aD(au);av=aD('golfed.py','w')
 for T,P in al(ad(Q(R))):av.write('\t'*(T/2)+' '*(T%2)+P+'\n')
 R.close();av.close()
at('golfer.py')

KSab

Posted 2011-09-05T19:19:53.493

Reputation: 5 984

Impressive. This seems a substantial reduction. Is it alternating between spaces and tabs for the indentation to give a double indent with only one character...? I like it. – trichoplax – 2014-04-20T20:51:42.327

Only a trivial improvement, but (at least in Python 3) you can use import* instead of import *. I'm guessing that will work in Python 2 also? – trichoplax – 2014-04-20T20:53:34.683

1

@githubphagocyte Yeah, although I didn't think of it myself, I first saw it here: http://codegolf.stackexchange.com/questions/54/tips-for-golfing-in-python

– KSab – 2014-04-20T21:27:17.813

That's a interesting question you linked to. Maybe some of the other answers there could apply here too... Several of them look like they'd take some careful thought to apply safely though! – trichoplax – 2014-04-20T22:22:20.287

13

BrainFuck - 489 Characters

Removes all non executable characters. Respects comments from # to end of line.

,[>--[<++>+++++++]<+[>+>+<<-]>[<+>-]+>[<->[-]]<[[,----------]]<>>--[>+
<++++++]<<--------[>+>+<<-]>[<+>-]+>[<->[-]]<[->>.<<]>>+<<<-[>+>+<<-]>
[<+>-]+>[<->[-]]<[->>.<<]>>+<<<-[>+>+<<-]>[<+>-]+>[<->[-]]<[->>.<<]>>+
<<<-[>+>+<<-]>[<+>-]+>[<->[-]]<[->>.<<]>>++++++++++++++<<<------------
--[>+>+<<-]>[<+>-]+>[<->[-]]<[->>.<<]>>++<<<--[>+>+<<-]>[<+>-]+>[<->[-
]]<[->>.<<]>++++[>+++++++<-]>+<<++++[<------->-]<-[>+>+<<-]>[<+>-]+>[<
->[-]]<[->>.<<]>>++<<<--[>+>+<<-]>[<+>-]+>[<->[-]]<[->>.<<]>>[-]<<<,]

Naturally run through itself from this source:

,[

#subtract #
>--[<++>+++++++]<+

#strip comments
[>+>+<<-]>[<+>-]+>[<->[-]]<[[,----------]]<

#put '+' in 4th cell
>>--[>+<++++++]<<
#+
--------
[>+>+<<-]>[<+>-]+>[<->[-]]<[->>.<<]>>+<<<
#,
-
[>+>+<<-]>[<+>-]+>[<->[-]]<[->>.<<]>>+<<<
#-
-
[>+>+<<-]>[<+>-]+>[<->[-]]<[->>.<<]>>+<<<
#.
-
[>+>+<<-]>[<+>-]+>[<->[-]]<[->>.<<]>>++++++++++++++<<<
#<
--------------
[>+>+<<-]>[<+>-]+>[<->[-]]<[->>.<<]>>++<<<
#>
--
[>+>+<<-]>[<+>-]+>[<->[-]]<[->>.<<]>++++[>+++++++<-]>+<<
#[
++++[<------->-]<-
[>+>+<<-]>[<+>-]+>[<->[-]]<[->>.<<]>>++<<<
#]
--
[>+>+<<-]>[<+>-]+>[<->[-]]<[->>.<<]>>[-]<<<
,]

captncraig

Posted 2011-09-05T19:19:53.493

Reputation: 4 373

first line of the golfed version, about 6 chars from the end, there's a <> which doesn't do anything – None – 2014-04-18T12:48:47.410

10I should obviously write a golfer to eliminate such wasteful behavior. – captncraig – 2014-04-18T13:21:01.233

4

Brainfuck golfer in Bash (v3)

This is a work in progress, I will keep updating it if I can.

Reads from a file (the filename should be the first command-line argument).

For now all it does is

  • Remove any characters that are not <>+-.,[]
  • Remove two-character strings that do nothing useful, e.g. <>, ><, +-, -+
  • When it is done, it repeats the entire procedure, so >>>><<<<< gets reduced to <

Code

#!/bin/bash

#if the file exists, take input from it
if [ -f $1 ]; then
input=`cat $1`
else
#complain to STDERR and exit
echo "File not found. Exiting with status 1.">&2
exit 1
fi

original=$input #save the original input
code=$input #save original input to another variable which we will be golfing
code=`echo $code|grep -o "[][<>.,+-]"|tr -d " \n"` #remove non-executable chars
output=$code; #this will be output
hits=-1; #Count the number of golfing operations carried out every time the code loops. When this is 0, the code will be completely golfed.

until [ $hits = 0 ]; do
hits=0
#we will be processing the optimised version from last time
code=$output
output=""

#Keep taking characters off from $code until it is empty
until [ m$code = m ]; do
#examine the first two chars
c1=${code:0:1}
code=`echo $code|cut -c2-`
c2=${code:0:1}
#if this is 1, the two characters now being read will be removed
ignore=0

if [ $c1$c2 = "<>" ] ; then
#set the second character to be taken off as well and not saved to output
ignore=1
elif [ $c1$c2 = "><" ] ; then
ignore=1
elif [ $c1$c2 = "+-" ] ; then
ignore=1
elif [ $c1$c2 = "-+" ] ; then
ignore=1
else
#save the char we took off to output, we aren't removing it
output=$output$c1;
fi

if [ $ignore = 1 ]; then
#ignore the second character and save no chars to output
code=`echo $code|cut -c2-`
#another hit
hits=`expr $hits + 1`
fi

#end inner until loop
done
#end main loop
done

#done, print output
echo $output;
exit 0;

How it works

After removing all non-executable characters, it does the following. The hits counter is set to -1 at the start - it counts how many golfing operations were carried out each time the outer loop runs.

  1. If the code being golfed is empty, go to step 5.
  2. Read the first two chars from the code, and remove the first char.
  3. If they are <>, ><, +- or -+, add 1 to the hits counter and go back to step 1.
  4. If not, save the first character to output and go to step 1.
  5. If the hits counter is 0, print output and exit.
  6. If not, reset the hits counter to 0, set the code being golfed to be the output variable, reset output to the empty string, and go to step 1.

user16402

Posted 2011-09-05T19:19:53.493

Reputation:

4

HQ9+ golfer in Bash (v3)

I know HQ9+ is useless, but I might as well submit a five-liner for it. It reads from standard input a file. The path to the file should be the first command-line argument.

Features

  • Removes all comment characters (anything except HhQq9+)
  • Removes + (it increments a number but there is no way to print that number)
  • Converts hq to uppercase (not golfing)

Code

#!/bin/bash
if [ -f $1 ]; then
input=`cat $1`
else exit 1; fi
echo $input|tr "[[:lower:]]" "[[:upper:]]"|grep -o "[HQ9]"|tr -d ' \n'
exit 0

user16402

Posted 2011-09-05T19:19:53.493

Reputation:

3It still reads from standard input! Just use /dev/stdin as the first argument :) – undergroundmonorail – 2014-04-19T05:57:19.877

3

Java with Java

Takes the file name as a command line argument, and edits the file in place.

  • Removes comments
  • Shortens identifiers (including classes, methods, variables, method parameters, and lambda expression parameters)
  • Trims spaces
  • Shortens imports
  • Removes unnecessary braces around one-line statements
  • Converts while(true) to for(;;)
  • Removes unneeded modifiers, such as private and final

When the program is run on itself, its size is reduced from 7792 to 4366.

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class Golfer {

    public static void main(String[] args) throws IOException {
        Path path = new File(args[0]).toPath();
        String program = Files.readAllLines(path).stream().collect(Collectors.joining("\n"));
        Golfer golfer = new Golfer(program);
        System.out.println(golfer.toString().length() + " characters");
        System.out.println(golfer.toString().getBytes().length + " bytes");
        golfer.golf();
        String str = golfer.toString();
        Files.write(path, Arrays.asList(str));
        System.out.println(golfer);
        System.out.println(golfer.toString().length() + " characters");
        System.out.println(golfer.toString().getBytes().length + " bytes");
    }

    private String program;

    public Golfer(String program) {
        this.program = program;
    }

    public void golf() {
        doUnicodeSubstitutions();
        protectStrings();
        removeComments();
        removeDuplicateSpaces();
        removeExcessSpaces();
        removeUnnecessaryModifiers();
        simplifyImports();
        shortenIdentifiers();
        improveControlStructures();
        unprotectStrings();
    }

    void removeDuplicateSpaces() {
        program = program.replaceAll("\\s+", " ");
    }

    void removeExcessSpaces() {
        program = program.trim();
        program = program.replaceAll("([.,;:?!@%^&*()\\[\\]{}=|/<>~]) ", "$1");
        program = program.replaceAll(" ([.,;:?!@%^&*()\\[\\]{}=|/<>~])", "$1");
        program = program.replaceAll("([^+]) \\+", "$1+").replaceAll("\\+\\+ \\+", "+++").replaceAll("\\+ ([^+])", "+$1");
        program = program.replaceAll("([^-]) -", "$1-").replaceAll("-- -", "---").replaceAll("- ([^-])", "-$1");
    }

    void removeUnnecessaryModifiers() {
        program = program.replaceAll("private |final |@Override ", "");
    }

    void simplifyImports() {
        int startImports = program.indexOf("import ");
        List<String> imports = new ArrayList();
        Matcher importMatcher = Pattern.compile("import [A-Za-z0-9$_.*]*;").matcher(program);
        while (importMatcher.find()) {
            imports.add(importMatcher.group());
        }
        for (int i = 0; i < imports.size(); i++) {
            if (!imports.get(i).endsWith("*;")) {
                imports.set(i, imports.get(i).replaceFirst("\\.[A-Za-z0-9$_]*;", "\\.*;"));
            }
            imports = imports.stream().distinct().collect(Collectors.toList());
            program = program.replaceAll("import [A-Za-z0-9$_.*]*;", "");
            program = program.substring(0, startImports) + String.join("", imports) + program.substring(startImports);
        }
    }

    private List<Character> unusedCharacters;

    void shortenIdentifiers() {
        unusedCharacters = IntStream.concat(IntStream.rangeClosed('a', 'z'), IntStream.rangeClosed('A', 'Z'))
                .mapToObj(i -> (char) i)
                .filter(c -> !Pattern.compile("[^A-Za-z0-9$_']" + c + "[^A-Za-z0-9$_]").matcher(program).find())
                .collect(Collectors.toList());
        shortenIdentifiers("(class|interface|enum) ([A-Za-z0-9$_]{2,})( extends [A-Za-z0-9$_]+)?( implements [A-Za-z0-9$_,]+)?\\{",
                           2, "[^A-Za-z0-9$_]", "[^A-Za-z0-9$_]");
        shortenIdentifiers("([A-Za-z0-9$_]+(\\[+\\]+| )|<[A-Za-z0-9$_,]+>+\\[+\\]+)(?<!implements |else |package |import |return )([A-Za-z0-9$_]{2,})(?<!null)[=,;)]",
                           3, "[^A-Za-z0-9$_]", "[^A-Za-z0-9$_(]");
        shortenIdentifiers("([A-Za-z0-9$_]+ |<[A-Za-z0-9$_,]+>+)(?<!new |else )([A-Za-z0-9$_]{2,})(?<!main|toString|compareTo|equals|hashCode|paint|repaint)\\(",
                           2, "[^A-Za-z0-9$_]", "\\(", "::", "[^A-Za-z0-9$_]");
        shortenIdentifiers("[^A-Za-z0-9$_]([A-Za-z0-9$_]{2,})(,[A-Za-z0-9$_]+)*\\)?->",
                           1, "[^A-Za-z0-9$_]", "[^A-Za-z0-9$_(]");
    }

    void shortenIdentifiers(String pattern, int groupNumber, String... afficesForReplacement) {
        Pattern compiledPattern = Pattern.compile(pattern);
        Matcher matcher;
        while ((matcher = compiledPattern.matcher(program)).find()) {
            String identifer = matcher.group(groupNumber);
            char newIdentifier;
            if (unusedCharacters.remove((Character) identifer.charAt(0))) {
                newIdentifier = identifer.charAt(0);
            } else if (Character.isUpperCase(identifer.charAt(0))
                       && unusedCharacters.remove((Character) Character.toLowerCase(identifer.charAt(0)))) {
                newIdentifier = Character.toLowerCase(identifer.charAt(0));
            } else if (Character.isLowerCase(identifer.charAt(0))
                       && unusedCharacters.remove((Character) Character.toUpperCase(identifer.charAt(0)))) {
                newIdentifier = Character.toUpperCase(identifer.charAt(0));
            } else if (unusedCharacters.size() > 0) {
                newIdentifier = unusedCharacters.remove(0);
            } else {
                System.err.println("out of identifiers");
                break;
            }
            for (int i = 0; i < afficesForReplacement.length; i += 2) {
                program = program.replaceAll('(' + afficesForReplacement[i] + ')' + identifer + "(?=" + afficesForReplacement[i + 1] + ')', "$1" + newIdentifier);
            }
        }
    }

    void improveControlStructures() {
        program = program.replaceAll("while\\(([^()]+)\\)\\{", "for(;$1;){");
        program = program.replaceAll("for\\(([^()]*);true;([^()]*)\\)", "for($1;;$2)");
        while (!program.equals(removeBrackets(program))) {
            program = removeBrackets(program);
        }
    }

    String removeBrackets(String string) {
        return string.replaceAll("((if|while|do)\\([^{;]*\\))\\{([^;}]*;)\\}", "$1$3")
                .replaceAll("else\\{([^;}]*;)\\}", "else $1")
                .replaceAll("(for\\([^{;]*;[^{;]*;[^{;]*\\))\\{([^;}]*;)\\}", "$1$2");
    }

    void protectStrings() {
        adjustStrings(1000);
    }

    void unprotectStrings() {
        adjustStrings(-1000);
    }

    void adjustStrings(int n) {
        char[] chars = new char[program.length()];
        for (int i = 0; i < program.length(); i++) {
            chars[i] = program.charAt(i);
            if (chars[i] == '"' && chars[i - 1] != '\'') {
                for (i++; chars.length > i + 1 && program.charAt(i) != '"'; i++) {
                    chars[i] = (char) (program.charAt(i) + n);
                }
                chars[i] = program.charAt(i);
            }
        }
        program = new String(chars);
    }

    void removeComments() {
        program = program.replaceAll("(?s)/\\*.*\\*/", "");
        program = program.replaceAll("//.*\n", "");
    }

    void doUnicodeSubstitutions() {
        List<Character> chars = new ArrayList();
        for (int i = 0; i < program.length(); i++) {
            if (program.charAt(i) != '\\' || program.charAt(i + 1) != 'u') {
                chars.add(program.charAt(i));
            } else {
                chars.add((char) Integer.parseInt(program.substring(i + 2, i + 6), 16));
                i += 5;
            }
        }
        char[] charArray = new char[chars.size()];
        for (int i = 0; i < charArray.length; i++) {
            charArray[i] = chars.get(i);
        }
        program = new String(charArray);
    }

    @Override
    public String toString() {
        return program;
    }
}

Ypnypn

Posted 2011-09-05T19:19:53.493

Reputation: 10 485

2

Perl, Parts 1 - 2

(removes comments and ignores # characters inside double quotes)

(removes all whitespace after brackets and = signs)

I did not try to golf this code. Maybe, when it is done, it could golf itself.

$/="\n\n";
chomp($prog = <>);
for(1..length($prog)){
 $rev.=chop $prog;
}
#print$rev;
for(1..length($rev)){
 $temp = chop$rev;
 if ($temp eq '"'){
  if($quote == 0){$quote = 1}
  else{$quote = 0}
 }
 if($temp eq '#' && $quote == 0){$comment = 1}
 if($temp eq "\n"){
  $comment = 0;
 }
 if($comment != 1){
  $prog.=$temp;
 }
}
for(1..length($prog)){
 $rev.=chop $prog;
}
for(1..length($rev)){
 $temp = chop$rev;
 if ($temp eq ";" || $temp eq "}" || $temp eq ")" || $temp eq "=" || $temp eq "{"){
  $ws = 2;
 }
 if(($temp eq "\n" || $temp eq " ") && $ws != 0){$ws = 1}elsif($ws==1){$ws=0}
 if($ws != 1){
  $prog.=$temp;
 }
}
print$prog;

Example input

for(1..10) {
 print "Hello, World!";#prints hello world
 print ($n = <>); #prints "#"
}

Output

for(1..10){print "Hello, World!";print ($n =<>);}

Next, it will eliminate spaces between symbols and alphanumeric characters.

PhiNotPi

Posted 2011-09-05T19:19:53.493

Reputation: 26 739

1

Java golfer in Perl

WIP at the moment, although it gets pretty nice code right now.

Features:

  • Removes all comments
  • Removes unnecessary whitespace
  • Removes package declarations (it's single-file anyway!)

Code

#!/usr/bin/env perl
use strict;
use warnings;

my $string_re = qr/"(?:\\[btnfr"'\\]|\\(?:[0-3][0-7][0-7]|[0-7]{1,2})|\\u+[0-9a-fA-F]{4}|[^\r\n\\"])*"/;
my @lines = <>;
my @strings = ();

# First, replace strings with placeholders. Strings suck REALLY hard when replacing.
map {
  while (m/$string_re/) {
    s//"\032".@strings."\032"/e;
    push @strings, $&;
  }
} @lines;

# Remove comments
my $in_comment = 0;
my $partial_line;

sub remove_comments {
  my $line = $_[0];
  start:
  if ($in_comment) {
    if ((my $mulc_end_index = index $line, '*/') != -1) {
      $line = $partial_line . substr $line, $mulc_end_index + 2;
      $in_comment = 0;
      goto no_comment;
    }
    return 0;
  }

  no_comment:
  my $eolc_idx = index $line, '//';
  my $mulc_idx = index $line, '/*';

  if ($mulc_idx == -1 || ($eolc_idx != -1 && $eolc_idx < $mulc_idx)) {
    $line =~ s;//.*$;;;
  } elsif ($mulc_idx != -1) {
    $in_comment = 1;
    $partial_line = substr $line, 0, $mulc_idx;
    goto start;
  }
  $_ = $line;
  return 1
}

@lines = grep {&remove_comments($_)} @lines;

# Remove empty lines, remove line ends
@lines = grep(!/^\s*$/, @lines);
map {chomp;s/^\s*//;s/\s*$//} @lines;

# Remove unnecessary whitespace
map {s/\s*([][(){}><|&~,;=!+-])\s*/$1/g} @lines;

# Remove unnecessary package declaration
$lines[0] =~ s/^package [^;]+;//;

# Finally, put strings back.
map {s/\032(\d+?)\032/$strings[$1]/g} @lines;

print @lines;

14mRh4X0r

Posted 2011-09-05T19:19:53.493

Reputation: 348