Dead code elimination

20

Dead code sits there doing nothing, staring at us knowing it will never be executed... but today we can take revenge.

Specification

The input will be a multiline string.

Each line may either be an assignment or an expression .

Assignment

An assignment is of the form <name> = number where name is a sequence of letters, underscores and numbers, but not starting with a number.

Variables may be assigned any number of times.

Expression

An expression is of the form <var_name OR number> <operation> <var_name OR number> ...

An expression may be any combination of:

  • Variables already defined
  • Basic arithmetic operators +-*/
  • Numbers (integers)

Expected output

You should output the string with redundant assignments, assignments that are never used by any of the expressions following it, removed. Please note that assignments can also be made redundant if an additional assignment to the same variable is performed before an expression using the variable is executed.

Test cases

in

a = 10
a * 3

out

a = 10
a * 3

in

foo = 8
2 - 1
a = 18

out

2 - 1

in

a = 10
a = 8
b = 4
ab = 72  
b / 6
b + 1

out

b = 4
b / 6
b + 1

in

a = 1
a = 2
a + 1

out

a = 2
a + 1

in

FooBar1 = 0
Fuz__ = 8
Fuz__ / 1

out

Fuz__ = 8
Fuz__ / 1

in

a = 1
a + 1
a = 2
a + 1

out

a = 1
a + 1
a = 2
a + 1

in

a = 1
1 / 5 * 8 + 4

out

1 / 5 * 8 + 4

in

a = 1
a + 1
a = 1
a + 1

out

a = 1
a + 1
a = 1
a + 1

in

a = 7
5 / a

out

a = 7
5 / a

Caridorc

Posted 2015-08-29T19:00:09.747

Reputation: 2 254

1Should you add this particularly difficult case:a = 1; a + 1; a = 1; a + 1;? Where the second a = 1 can be discarded only because a was previously set to the same value (1). – flodel – 2015-08-29T22:01:32.250

3@flodel No, no need to look at values – Caridorc – 2015-08-29T22:02:32.350

@flodel testcase incorporated – Caridorc – 2015-08-29T22:06:12.153

You should add a test case where a variable is used in an expression, but not as the first element of the expression. Especially important is as the last member of the expression. – isaacg – 2015-08-30T01:55:11.003

@isaacg no dead code, the var may be anywhere, testcase added – Caridorc – 2015-08-30T08:06:49.700

Answers

9

PHP - 197 bytes

The function works by analysing each line, in the reverse order and one after the other, and maintaining an array of the used variables.

  • If there is an equal character = in the line, it's an assignment.
    • It the variable is used, the assignment is useful, and the line is printed, but the variable isn't considered used any more.
    • Otherwise, do nothing.
  • Otherwise, the line is an expression. We split the line after each space, and add each symbol to the list of used variables. Numbers (1, 2, …) and operators (+, -, …) will be added too, but since they aren't valid variables names, it's not a problem. The line is then of course printed.
function($c){$u=[];foreach(array_reverse(split('
',$c))as$l){if($p=strpos($l,'=')){if(!isset($u[$x=substr($l,0,$p-1)]))continue;
unset($u[$x]);}else$u+=array_flip(split(' ',$l));$f="
$l$f";}echo$f;}

Here is the ungolfed version:

function removeDeadCode($code)
{
    $usedVariables = [];
    $finalCode = '';

    foreach (array_reverse(explode("\n", $code)) as $line)
    {
        if ($equalPosition = strpos($line, '='))
        {
            $variable = substr($line, 0, $equalPosition - 1);
            if (isset($usedVariables[$variable]))
            {
                $finalCode = "\n" . $line . $finalCode;
                unset($usedVariables[$variable]);
            }
        }
        else
        {
            $usedVariables += array_flip(explode(' ', $line));
            $finalCode = "\n" . $line . $finalCode;
        }
    }

    echo $finalCode;
}

Blackhole

Posted 2015-08-29T19:00:09.747

Reputation: 2 362

7

Retina, 45 bytes

m`^(\w+) =.*\n(?=((?!\b\1\b)[^!])*(^\1 =|\Z))
<empty>

For counting purposes, each line goes in a separate file (where <empty> is an empty file) and \n should be replaced with an actual line feed (0x0A).

This assumes that the string will always end with a line feed.

As this regex doesn't use any .NET-specific features, you can test it on regex101.

The idea is fairly simple: remove all assignments from which we can find (searching forward) another assignment to the same variable or the end of the string without passing another use of the variable.

Martin Ender

Posted 2015-08-29T19:00:09.747

Reputation: 184 808

6

Pyth, 40 bytes

eMf|!}K\=eT&Jf}+dhceTK++dYdPT!}KeJ_.__.z

Seems kinda long. Maybe I can save one or two bytes tomorrow.

Try it online: Demonstration or Test Suite

Explanation:

_.__.z gives all postfixes of the input lines in reversed order. E.g. the input FooBar1 = 0; Fuz__ = 8; Fuz__ / 1 gives the list:

[['Fuz__ / 1', 'Fuz__ = 8', 'FooBar1 = 0'], 
 ['Fuz__ / 1', 'Fuz__ = 8']
 ['Fuz__ / 1']]

Then I filter for list elements T, in which = doesn't is in the last element of T (expression) or (assignment) the last element in T, which contains the variable name, is an expression. Afterwards print the last element of each of the remaining elements on a separate line.

eMf|!}K\=eT&Jf}+dhceTK++dYdPT!}KeJ_.__.z
                                      .z  all input lines
                                     _    reverse order
                                   ._     all prefixes
                                  _       reverse order
  f                                       filter for elements T, which satisfy:
      K\=                                   K = '='
    !}K  eT                                 '=' is not in T[-1]
   |                                        or
             f             PT                 filter for elements Y in T[:-1],
                                              which satisfy:
                 hceTK                          extract variable name of T[-1]
                                                with an additional space at the end
               +d                               and at the beginning
              }       ++dYd                     ^ in space+Y+space
            J                                 assign these list to J
           &                                  J not empty and
                             !KeJ             '=' is not in J[-1]
eM                                        take the last element of each and print

Jakube

Posted 2015-08-29T19:00:09.747

Reputation: 21 462

8Aww it's so cute... .__. – orlp – 2015-08-30T00:00:22.093

@isaacg Fixed . – Jakube – 2015-08-30T06:50:44.680

4

Python 2, 270 267 bytes

import sys,re
d={}
s=list(enumerate(sys.stdin))
for n,l in s:
 try:v,_=l.split('=');v=v.strip();d[v]=d.get(v,[])+[[0,n]]
 except:
  for v in re.findall('[a-zA-Z_]\w*',l):d[v][-1][0]+=1
print''.join(l for n,l in s if n not in[n for x in d.values()for c,n in x if c==0])

Indentation is: 1. Space 2. Tab

Saved 3 bytes thanks to @Kamehameha!

kirbyfan64sos

Posted 2015-08-29T19:00:09.747

Reputation: 8 730

The space after print in print ''.join and in in in [n can be removed. – Kamehameha – 2015-08-31T08:45:07.913

Also, you can use this trick by using a tab instead of the double space after except line and saving one byte.

– Kamehameha – 2015-08-31T08:52:36.297

4

CJam, 49 bytes

LqN/W%{_'=#_){(<:V1$\e=_{\Va-\}&}{;S/+1}?},\;W%N*

Try it online

The approach here is that a list of unassigned variables is maintained while processing the input lines back to front:

  • If the line is an expression, all variables in the expression are added to the list. Actually, in the implementation, all tokens are added to the list, since it saves code and having numbers and operators in the list does not do any harm.

  • If the line is an assignment, it tests if the assigned variable name is in the list. If it is, the assignment is accepted, and the variable name removed from the list. Otherwise, the assignment is skipped.

Explanation :

L     Start with empty list.
qN/   Get input and split at newlines.
W%    Reverse to process lines back to front.
{     Start of filter block.
  _     Copy line.
  '=#   Find equal sign.
  _     Copy position of equal sign, will use original later to extract
        variable name from assignment.
  )     Increment to produce truthy/falsy value (-1 from find means not found).
  {     Start if-block that processes assignments.
    (<    Slice off everything but variable name.
    :V    Save in variable V for later re-use.
    1$\   Place copy of unassigned variable list and new variable name at
          top of stack.
    e=    Count occurrences. This tells us if variable name was in list.
    _     Copy the condition value because it will also be used as the
          overall filter result.
    {     Start of block that removes variable name from list.
      \V    Bring list to top, and push variable name.
      a-    Remove the variable name from list.
      \     Swap to get variable list to bottom of stack for next iteration,
            and filter result to top.
    }&    End of conditional block to remove variable name.
  }     End of if-block for assignment.
  {     Start of else-block for expression.
    ;     Pop result of find operation.
    S/    Split expression at spaces.
    +     Concatenate the new variables with the existing ones in the list.
    1     Filter result, expressions are always accepted.
  }?    End of if for assignment vs. expression.
},    End of filter.
\;    Get rid of variable list at bottom of stack.
W%    Reverse order of filtered result, since we worked back to front.
N*    Join with newlines.

Reto Koradi

Posted 2015-08-29T19:00:09.747

Reputation: 4 870

2

JavaScript (ES6) 164 177

Using template strings, all newlines are signifcant and counted.

Test running the snippet in FireFox (required for ES6 compatibility including arrow functions)

f=s=>(k=>{s=s.split`
`,s.map((t,n)=>(r=t.match(/\w+/g)).map(v=>k[v]=f,~t.search`=`?k[s[k[v=r[0]]]=r[0]=0,v]=n:0))
for(v in k)s[k[v]]=0})([])||s.filter(r=>r).join`
`

ungolfed=s=>
{
  k={} // list line defining variables, by name, until used
  s=s.split`\n`
  s.forEach((t,n)=>
  {
    r=t.match(/\w+/g); // list variables in the line, operators are lost
    if (t.search`=` >= 0) // if it's an assignment
    {
      v=r[0] // new variable
      s[k[v]]=r[0]=0 // kill previous definition if not used
      k[v]=n
    }
    r.forEach(v=>k[v]='') // for each used variable, forget its definition line
  })
  for(v in k)s[k[v]]=0; // kill all remaining unused definitions
  return s.filter(r=>r).join`\n`
}

// TEST
out=x=>O.innerHTML+=x+'\n';


;[['a = 10\na * 3', 'a = 10\na * 3']
 ,['foo = 8\n2 - 1\na = 18','2 - 1'] 
 ,['a = 10\na = 8\nb = 4\nab = 72\nb / 6\nb + 1','b = 4\nb / 6\nb + 1'] 
 ,['a = 1\na = 2\na + 1','a = 2\na + 1'] 
 ,['FooBar1 = 0\nFuz__ = 8\nFuz__ / 1','Fuz__ = 8\nFuz__ / 1'] 
 ,['a = 1\na + 1\na = 2\na + 1','a = 1\na + 1\na = 2\na + 1']
 ,['a = 1\na + a\na = 2', 'a = 1\na + a']
 ,['a = 1\n1 / 5 * 8 + 4', '1 / 5 * 8 + 4']
 ,['a = 1\na + a\na = 2', 'a = 1\na + a']
 ,['a = 1\na + 1\na = 1\na + 1', 'a = 1\na + 1\na = 1\na + 1']
 ,['a = 7\n5 / a', 'a = 7\n5 / a']
]
.forEach(([i,k])=>(r=f(i),
  out('Test '+(r==k?'OK':'Fail')+'\nInput:  '+i.replace(/\n/g,',')
      +'\nResult: '+r.replace(/\n/g,',')
      +'\nCheck:  '+k.replace(/\n/g,',')+'\n')
));
Note: newlines trasformed to commas to save space in output
<pre id=O></pre>

edc65

Posted 2015-08-29T19:00:09.747

Reputation: 31 086

Hey, that's not 164 bytes! – Cyphase – 2015-08-31T04:17:30.783

@Cyphase line 1:20 + 1 newline, line 2;92 + 1 newline, line 3:48 + 1 newline, line 4:1. 21+93+49+1 => 164. The ungolfed part is for explanation only. The TEST part is ... uhm just guess ... – edc65 – 2015-08-31T05:29:49.750

I know. I was just kidding. Sorry :). – Cyphase – 2015-08-31T06:12:19.830

2

R 144

Q=R=c()
for(L in rev(scan(,"",,,"\n"))){W=strsplit(L," ")[[1]]
if("="%in%W)if(W[1]%in%R)R=R[R!=W[1]]else next else R=c(R,W)
Q=c(L,Q)}write(Q,"")

where

  • L is a line from the input (starting from the last one)
  • W are the symbols (variables, operators, numbers) in a line
  • R is a vector of symbols that will be printed. It includes variables whose assignment is needed.
  • Q is the vector of lines in the output

flodel

Posted 2015-08-29T19:00:09.747

Reputation: 2 345

You can replace scan(what="",sep="\n") with scan(,"",sep="\n"). You may also be able to replace the named sep argument with its positional equivalent but I can't recall where the commas would go for that. – Alex A. – 2015-08-30T01:56:26.600

... saving 6. Very nice. Thank you Alex! – flodel – 2015-08-30T02:58:02.363

1

JavaScript ES6, 79 75 118 bytes

s=>s.split`
`.filter((l,i,a)=>(q=l.split`=`)[1]?!~(a.slice(i+1)+0).search(q[0]+'=')&&~s.search(q[0]+'[^=]'):1).join`
`

Tell me if this doesn't work for a case. Any ideas for golfing are welcome.


Explanation

s=>          // Function with argument "s"
  s.split`   // Splits each line
  `
  .filter(   // Filters through each line,
    (item,index,array)=>
      (q=l.split`=`)[1]? // If there is something after the equal sign
        !~ // XOR (~) will  essentially make -1, 0. NOT (!) will make 0, 1, vice-versa
         (a.slice(i+1)+0) // Gets following lines
         .search`^${z=q[0]}=` // Checks if following lines have the same variable name and then =
        && // AND...
         ~s.search(z+'[^=]') // Check if there is an expression with the variable
        :1) // If there is no equal sign, return 1 (true)
  .join` // Join everything back, seperated by newlines
  `

Tested on Safari Nightly. Firefox friendly version:

s=>s.split`
`.filter((l,i,a)=>(q=l.split`=`)[1]?!~(a.slice(i+1)+0).search(`^${z=q[0]}=`)&&~s.search(z+'[^=]'):1).join`
`

You can drop in in babeljs to get an ES5 version.

Downgoat

Posted 2015-08-29T19:00:09.747

Reputation: 27 116

@Blackhole I've got that fixed. – Downgoat – 2015-08-29T20:43:47.403

@edc65 according to examples the separator is a newline though. Input is also in a strict format with spaces, etc. – Downgoat – 2015-08-29T21:10:16.397

@edc65 Are you sure? Try wrapping the function in parenthesis and run it like that. It works for me (Safari Nightly). – Downgoat – 2015-08-29T23:43:34.400

Maybe I'm too obstinate but I still feel it's too simple to work good in every case. I made it run without errors in Firefox adding brackets in the search call (still way shorter than mine). And tried "a = 1\na + a\na = 2". And it fails... – edc65 – 2015-08-30T14:05:16.713

Thx for adding my suggestion to your answer. -1 because it's still bugged – edc65 – 2015-08-30T20:18:19.840

@edc65 I think I posted the wrong version. I've updated it but I'll rewrite it and iron out the bugs – Downgoat – 2015-08-30T20:24:26.930

1

Haskell, 187 bytes

Use d.

import Data.List
f=filter
(!)=map
b((v,e,w):t)=e||or((\(_,p,_)->p)!take 1(f(\(x,_,y)->v==x||v`elem`y)t))
d=unlines.(\l->snd!f fst(zip(b!tails(((\(v:o:w)->(v,o/="=",w)).words)!l))l)).lines

Leif Willerts

Posted 2015-08-29T19:00:09.747

Reputation: 1 060