Resolve SAS macro variables


The SAS programming language is a clunky, archaic language dating back to 1966 that's still in use today. The original compiler was written in PL/I, and indeed much of the syntax derives from PL/I. SAS also has a preprocessor macro language which derives from that of PL/I as well. In this challenge, you'll be interpreting some simple elements of the SAS macro language.

In the SAS macro language, macro variables are defined using the %let keyword and printing to the log is done with %put. Statements end with semicolons. Here are some examples:

%let x = 5;
%let cool_beans =Cool beans;
%let what123=46.lel"{)-++;

Macro variable names are case insensitive and always match the regular expression /[a-z_][a-z0-9_]*/i. For the purposes of this challenge, we'll say the following:

  • Macro variables can only hold values consisting entirely of printable ASCII characters except ;, &, and %
  • There will be no leading or trailing spaces in the values
  • The values will never be more than 255 characters long
  • Values may be empty
  • Brackets and quotes in the values may be unmatched
  • There can be any amount of space before and after the = in the %let statement and this space should be ignored
  • There can be any amount of space before the terminal ; in the %let statement and this space should similarly be ignored
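The rules above could be captured with a single regular expression. The following is only a sketch: `LET` and `parse_let` are hypothetical names that do not appear in the challenge, and lowercasing the name is just one way to honour case insensitivity.

```python
import re

# Hypothetical helper (not part of the challenge): parse one %let statement.
# Whitespace around = and before the terminal ; is skipped; since values
# cannot contain ;, the lazy group stops at the first semicolon.
LET = re.compile(r"\s*%let\s+([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*?)\s*;")

def parse_let(line):
    """Return (name, value) for a %let statement, with the name lowercased."""
    m = LET.match(line)
    if m is None:
        raise ValueError("not a %let statement: " + repr(line))
    # Names are case insensitive, so normalise them; values are kept verbatim.
    return m.group(1).lower(), m.group(2)

print(parse_let("%let x = 5;"))
print(parse_let("%let cool_beans =Cool beans;"))
print(parse_let('%let what123=46.lel"{)-++;'))
print(parse_let("%let UNUSED = ;"))
```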

When a macro variable is called, we say it "resolves" to its value. Macro variables are resolved by prepending &. There is an optional trailing . that denotes the end of the identifier. For example,

%put The value of x is &X..;

writes The value of x is 5. to the log. Note that two periods are required because a single period will be consumed by &X. and resolve to 5. Also note that even though we defined x in lowercase, &X is the same as &x because macro variable names are case insensitive.
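As a rough illustration of single-level resolution, the sketch below replaces each &name (with its optional trailing period) by the stored value. `REF`, `resolve_once` and the `table` dictionary are hypothetical names, and leaving unknown references untouched is an assumption, since the challenge rules that case out.

```python
import re

# One & followed by a name and an optional terminating period.
REF = re.compile(r"&([A-Za-z_][A-Za-z0-9_]*)\.?")

def resolve_once(text, table):
    """Replace each &name or &name. with its value; table keys are lowercase."""
    def substitute(m):
        name = m.group(1).lower()
        # Assumption: unknown references are left as they are.
        return table.get(name, m.group(0))
    return REF.sub(substitute, text)

print(resolve_once("The value of x is &X..", {"x": "5"}))  # The value of x is 5.
```

The first period is swallowed together with &X, so only the second one reaches the output.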

Here's where it gets tricky. Multiple &s can be strung together to resolve variables, and &s at the same level of nesting resolve at the same time. For example,

%let i = 1;
%let coolbeans1 = broseph;
%let broseph = 5;

%put &&coolbeans&i;  /* Prints broseph */
%put &&&coolbeans&i; /* Prints 5 */

The innermost &s resolve first, and resolution continues outward. Variable name matching is done greedily. In the second %put statement, the processor takes the following steps:

  1. &i resolves to 1, and the innermost leading & is consumed, giving us &&coolbeans1
  2. &coolbeans1 resolves to broseph, giving us &broseph
  3. &broseph resolves to 5.

If there are trailing .s, only a single . is consumed in resolution, even if there are multiple &s.
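One way to model these passes is sketched below. It is an interpretation that agrees with the examples in this post, not a description of how SAS itself works: in every pass each run of &s loses one &, the name after it is replaced if it is defined, and the trailing period is consumed only when the last & goes. `REF`, `resolve` and `table` are hypothetical names.

```python
import re

# A run of &s, a greedily matched name, and an optional terminating period.
REF = re.compile(r"(&+)([A-Za-z_][A-Za-z0-9_]*)(\.?)")

def resolve(text, table):
    """Resolve macro references in passes; table maps lowercase names to values."""
    def one_pass(m):
        amps, name, dot = m.groups()
        value = table.get(name.lower())
        if len(amps) == 1:
            # Last &: substitute and consume the period (unknown names stay put).
            return m.group(0) if value is None else value
        # Several &s: drop one, substitute if the name exists, keep the period.
        return amps[1:] + (name if value is None else value) + dot

    while True:
        resolved = REF.sub(one_pass, text)
        if resolved == text:  # nothing left to resolve
            return resolved
        text = resolved

table = {"i": "1", "coolbeans1": "broseph", "broseph": "5"}
print(resolve("&&coolbeans&i", table))   # broseph
print(resolve("&&&coolbeans&i", table))  # 5
```

Because values cannot contain &, every pass either removes at least one & or changes nothing, so the loop terminates.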

Task

Given between 1 and 10 %let statements separated by newlines and a single %put statement, print or return the result of the %put statement. Input can be accepted in any standard way.

You can assume that the input will always be valid and that the %let statements will precede the %put statement. Variables that are defined will not be redefined in later %let statements.

If the input were actually run in SAS, no variable would resolve to one that doesn't exist, and everything would be syntactically correct as described above.

Examples

  1. Input:

    %let dude=stuff;
    %let stuff=bEaNs;
    %put &&dude..;
    

    Output:

    bEaNs.
    
  2. Input:

    %let __6 = 6__;
    %put __6&__6;
    

    Output:

    __66__
    
  3. Input:

    %let i=1;
    %let hOt1Dog = BUNS;
    %put &&HoT&i.Dog are FUNS&i!");
    

    Output:

    BUNS are FUNS1!")
    
  4. Input:

    %let x = {*':TT7d;
    %put SAS is weird.;
    

    Output:

    SAS is weird.
    
  5. Input:

    %let var1   =  Hm?;
    %let var11 = var1;
    %let UNUSED = ;
    %put &&var11.....;
    

    Output:

    Hm?....
    

    Note that &&var11 matches var11 since name matching is greedy. If there had been a ., i.e. &&var1.1, then var1 would be matched and the extra 1 wouldn't be part of any name.
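The difference between the two spellings can be checked with a small sketch. `NAME` and `split_reference` are hypothetical helpers; the pattern simply follows the name regular expression given earlier, and Python's `+` and `*` quantifiers are greedy by default.

```python
import re

# Leading &s, a greedily matched name, then an optional terminating period.
NAME = re.compile(r"&+([A-Za-z_][A-Za-z0-9_]*)(\.?)")

def split_reference(ref):
    """Return (matched name, text left over after the optional period)."""
    m = NAME.match(ref)
    return m.group(1), ref[m.end():]

print(split_reference("&&var11"))   # ('var11', '')
print(split_reference("&&var1.1"))  # ('var1', '1')
```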

This is code golf, so the shortest solution in bytes wins!

Alex A.

Posted 2016-02-23T21:06:57.647

Reputation: 23 761

How does the output from test case 1 have a period? Shouldn't &stuff. remove the period? – GamrCorps – 2016-02-23T23:18:30.170

@GamrCorps I should specify: Only a single trailing period is consumed in resolution. – Alex A. – 2016-02-23T23:21:38.810

@GamrCorps Edited to specify and added it as a test case. – Alex A. – 2016-02-23T23:24:38.530

so &&&&&&&&&a...................... would still only remove one period? – GamrCorps – 2016-02-23T23:24:59.113

@GamrCorps Yes. – Alex A. – 2016-02-23T23:25:13.093

Answers


Python 3, 354 341 336 bytes

import re
S=re.sub
def f(x):
	r=x.splitlines();C=r[-1].strip('%put ');D=0
	while D!=C:
		D=C
		for a in sorted([l.strip('%let ').replace(" ","").split(';')[0].split('=')for l in r[:-1]],key=lambda y:-len(y[0])):
			s=1
			while s:C,s=re.subn('&'+a[0]+'(\.?)',a[1]+'\\1',S('+\.([^\.])','\\1',C),0,re.I)
	return S('+\.?','',C)


edit: some easy shortening

edit: reverse sort by -len(...) instead of [::-1] (5 bytes), thanks to Jonathan Frech!

Ungolfed

import re
S=re.sub # new name for the function re.sub()
def f(x):
    r=x.splitlines() # input string to list of rows
    C=r[-1].strip('%put ') # get the string to put (from the last row)
    D=0
    while(D!=C): # iterate until the result does not change
        D=C
        for a in                                                                                                                    : # iterate over the list of variables
                 sorted(                                                                          ,key=lambda y:len(y[0]),reverse=1) # sort list for greediness by decreasing var.name lengths
                        [l.strip('%let ') # cut the 'let' keyword
                                         .replace(" ","") # erase spaces
                                                         .split(';')[0] # cut parts after ';'
                                                                       .split('=') # create [variable_name,value] list
                                                                                  for l in r[:-1]] # for each row but last
            s=1
            while(s): # iterate until the result does not change
                C,s=re.subn( # substitute
                            '&'+a[0]+'(\.?)', # &varname. or &varname
                                                 a[1]+'\\1', # to value. or value
                                                              S('+\.([^\.])','\\1',C), # in the string we can get from C erasing ('s)(.) sequences if the next char is not .
                                                                                        0,re.I) # substituting is case insensitive
    return S('+\.?','',C) # erase smileys and one .

mmuntag

Posted 2016-02-23T21:06:57.647

Reputation: 76

I would suggest taking a look at the Python tips page. Trivial optimizations such as non-compound statement concatenation (;), parentheses reduction (if(...) -> if ...) and list operations (,reverse=1 -> [::-1]) can easily save some bytes. – Jonathan Frech – 2018-11-19T22:25:29.210

Thanks! I have read it before, but it was a long time ago, and I forgot some tricks. – mmuntag – 2018-11-20T09:00:09.427

You are welcome. len(y[0]))[::-1] can be -len(y[0])). – Jonathan Frech – 2018-11-20T12:47:14.170