Am I code-golfing properly?

12

I am curious if I am Code Golfing properly. I set the challenge for myself to make a small hashing program into a single statement in Python. I first started off with:

from itertools import permutations
from string import ascii_lowercase
from random import sample

def test():
    chars = sample(ascii_lowercase, 9)
    sums = list(map(h, permutations(chars)))
    if len(set(sums)) == len(sums):
       print("unique results for permutations of given string")
    else:
       print("duplicate entries present in test results")

def h(s):
    r = 0
    for i in range(len(s)):
        r += ord(s[i]) << (i * len(s))
    return r

test()

I then made the function recursive:

def h(s, i=0):
    if i < len(s) - 1: return h(s, i+1) + ord(s[i]) << (i * len(s))
    else: return ord(s[i]) << (i * len(s))

I tried shortening it with a lambda for repeating code (it didn't work):

def h(s, i=0, f=lambda s,i: ord(s[i]) << (i * len(s))):
    if i < len(s) - 1: return h(s, i+1) + f(s,i)
    else: return f(s,i)

Finally I ended up with a lambda:

h=lambda s,i=0:h(s,i+1)+ord(s[i])<<(i*len(s))if i<len(s)-1 else ord(s[i])<<(i*len(s))

I wanted the program to be one statement, so first I came up with:

def test():
    chars = sample(ascii_lowercase, 9)
    sums = list(map((lambda s,i=0,f=lambda s,i,f:f(s,i+1,f)+ord(s[i])<<(i*len(s))if i<len(s)-1 else ord(s[i])<<(i*len(s)):f(s,i,f)), permutations(chars)))
    if len(set(sums)) == len(sums):
       print("unique results for permutations of given string")
    else:
       print("duplicate entries present in test results")

And lastly I ended up with:

print((lambda x=list(map((lambda s,i=0,f=lambda s,i,f:f(s,i+1,f)+ord(s[i])<<(i*len(s))if i<len(s)-1 else ord(s[i])<<(i*len(s)):f(s,i,f)), permutations(sample(ascii_lowercase, 9)))): "unique results for permutations of given string" if len(set(x)) == len(x) else "duplicate entries present in test results")())

Is this how codegolf problems are worked out? I've never really done this sort of thing, so right now I just want to know if I am doing it right.

Amendment: This program does all the work for you; so I will here refer to the function: As input, the program takes in all the permutations of a given string; here the string is nine characters randomly picked from ascii_lowercase. The output is a human-readable string defining whether the result of each permutation of the given string is a duplicate of another result for a different string. If there are no duplicates for all permutations, the program indicates success. Nine characters was picked as being the largest length of characters readily computed repeatedly on my box.

Amendment II As pointed out by a studious reader, the intended purpose described is not obtained through the accompanying code. The test case is obviously inadequate.

motoku

Posted 2015-02-26T03:06:59.483

Reputation: 221

3This looks like a good tips question, and I'm pleased to see you've shown your golfing process in detail. But I don't know what you mean by "hashing" program. You should post a spec that explains how input is to be taken, how the output is to be given, and what relation the output must have to the input. – xnor – 2015-02-26T03:26:31.517

@xnor does that suffice? – motoku – 2015-02-26T03:32:53.870

For code golf, you should definitely remove some more of those optional spaces. Also, use Python 2 instead, since printing is shorter. print"x" instead of print("x") – mbomb007 – 2015-02-26T03:38:12.300

And use a list comprehension instead of list()? – mbomb007 – 2015-02-26T03:41:49.987

Removable space examples: the ones after a comma, before if, after a colon, or before or after else are all removable. If spaces don't count towards the length, then it's optional to remove them, of course. – mbomb007 – 2015-02-26T03:44:10.053

3

Your process seems fine. Start with a program, shorten by trial/error. Get more experience and browse the python tips and you'll be doing great in no time.

– Geobits – 2015-02-26T03:47:28.733

Is it just me, or does your original function and your "made the function recursive" function give different outputs? – Sp3000 – 2015-02-26T04:07:16.403

@Sp3000 oh, wow I wish I had noticed that – motoku – 2015-02-26T04:13:56.407

Also, I'm a bit confused about what the function is supposed to do - is output supposed to be different for different outputs? For the original function, h("!!") = h("% ") = 165. – Sp3000 – 2015-02-26T04:17:06.733

@Sp3000 I'm getting the idea that my test case is inadequate. – motoku – 2015-02-26T04:19:38.200

Answers

11

There is no 'right' way to golf. You've done well, and the process you've used is fairly standard. Making the program into one statement isn't usually a requirement though.

If it helps, here's how I would approach golfing down your program...

In the hashing function, the for statement can be replaced with a sum:

def h(s):
    r = 0
    r = sum(ord(s[i]) << (i * len(s)) for i in range(len(s)))
    return r

This can then be defined as a lambda function:

h = lambda s: sum(ord(s[i]) << (i * len(s)) for i in range(len(s)))

And now we remove unnecessary spaces and brackets:

h=lambda s:sum(ord(s[i])<<i*len(s)for i in range(len(s)))

As Sp3000 pointed out, this can be shortened further with enumerate:

h=lambda s:sum(ord(x)<<i*len(s)for i,x in enumerate(s))

Moving on to the test function, we merge its first two lines:

def test():
    sums = list(map(h, permutations(sample(ascii_lowercase, 9))))
    ...

Since both functions are only used once, we can move everything inline:

sums = list(map(lambda s:sum(ord(x)<<i*len(s)for i,x in enumerate(s)), permutations(sample(ascii_lowercase, 9))))
...

This is shorter as a list comprehension:

sums = [sum(ord(x)<<i*len(s)for i,x in enumerate(s)) for s in permutations(sample(ascii_lowercase, 9))]

Next, we give it a shorter name and remove unnecessary spaces again:

x=[sum(ord(x)<<i*len(s)for i,x in enumerate(s))for s in permutations(sample(ascii_lowercase,9))]

The if statement can be moved inside the print function:

print('unique...' if len(set(x)) == len(x) else 'duplicate...')

However, it's usually shorter to use and/or:

print(len(set(x)) == len(x) and 'unique...' or 'duplicate...')

Since len(x) doesn't change, we can calculate and hardcode its value:

print(len(set(x)) == 362880 and 'unique...' or 'duplicate...')

After removing unnecessary spaces and switching around the comparison, we get:

print(len(set(x))<362880and'duplicate...'or'unique...')

This lets us move everything into one statement:

print(len(set([sum(ord(x)<<i*len(s)for i,x in enumerate(s))for s in permutations(sample(ascii_lowercase,9))]))<362880and'duplicate...'or'unique...')

And now we can use a set comprehension instead:

print(len({sum(ord(x)<<i*len(s)for i,x in enumerate(s))for s in permutations(sample(ascii_lowercase,9))})<362880and'duplicate...'or'unique...')

The result is 210 bytes, excluding imports. The next step would probably be to golf down the imports or the long strings.

grc

Posted 2015-02-26T03:06:59.483

Reputation: 18 565

7Funnily enough, I think enumerate is shorter: h=lambda s:sum(ord(x)<<i*len(s)for i,x in enumerate(s)) – Sp3000 – 2015-02-26T04:18:25.563

@Sp3000 oh nice! Every builtin has its day :D – grc – 2015-02-26T04:20:48.220