Tips for golfing in Clean

17

3

What general tips do you have for golfing in Clean? Please post only ideas that can be applied to code golf problems in general, and are at least somewhat specific to Clean.

If you've never heard of Clean, you can find out more here.
Or, you can join the chat room.

Οurous

Posted 2018-01-24T22:17:35.277

Reputation: 7 916

Answers

10

Avoid import StdEnv when possible

To access built-in functions, even seemingly basic ones like (==) or map, an import statement is needed, usually import StdEnv because it imports the most common modules like StdInt, StdBool and so on (see here for more info on StdEnv).

However, for some challenges it is possible to avoid this import and use only core language features such as list comprehensions and pattern matching.

For example, instead of

import StdEnv 
map f list

one can write

[f x\\x<-list]

List of alternatives:

Below are some functions or function invocations that need import StdEnv, each with an alternative that does not need the import and a rough estimate of the bytes saved.

  • hd -> (\[h:_]=h), ~6 bytes
  • tl -> (\[_:t]=t), ~6 bytes
  • map f list -> [f x\\x<-list], ~10 bytes
  • filter p list -> [x\\x<-list|p x], ~11 bytes
  • (&&) -> %a b|a=b=a;%, ~6 bytes
  • (||) -> %a b|a=a=b;%, ~6 bytes
  • not -> %a|a=False=True;%, ~1 byte
  • and -> %[a:r]|a= %r=a;%_=True, ~0 bytes
  • or -> %[a:r]|a=a= %r;%_=False, ~0 bytes

The last few are unlikely to actually save bytes, because a direct replacement yields more bytes than the import, but it might be possible in cases where the recursion over the list is needed anyway.
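As a rough sketch of how the import-free style can look in a complete program, a predicate defined by pattern matching can stand in for (==), and a list comprehension can stand in for filter. The names isA and keepA are made up for this illustration, the code is left ungolfed for readability, and it assumes the file is saved as main.icl:

```clean
module main

// No import StdEnv: only pattern matching and a list comprehension are used.

// Hypothetical predicate, defined by matching on a literal instead of using (==)
isA :: Char -> Bool
isA 'a' = True
isA _   = False

// Equivalent of "filter isA s", written as a comprehension with a guard
keepA :: [Char] -> [Char]
keepA s = [c \\ c <- s | isA c]

Start = keepA ['banana']
```

Whether this pays off in a real answer depends on how many StdEnv functions the rest of the program would otherwise need.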

This tip has successfully been used here.

Laikoni

Posted 2018-01-24T22:17:35.277

Reputation: 23 676

Isn't import StdEnv + a and b (21 bytes) smaller than %[a:r]|a= %r=a;%_=True (22 bytes), though? Or would it be import StdEnv + a=True and b=True (31 bytes), in which case it's indeed definitely shorter? (I've never programmed in Clean, btw.) – Kevin Cruijssen – 2018-01-25T09:13:59.190

@KevinCruijssen We were just discussing that in chat. It's true that those are unlikely to save bytes, except maybe when the program needs to recurse over a list anyway. – Laikoni – 2018-01-25T09:20:04.063

Ah ok. Maybe it might also be useful to state how many bytes are saved with the alternative (i.e. map f list -> [f x\\x<-list] (11 bytes saved) (or something similar)). – Kevin Cruijssen – 2018-01-25T09:36:33.640

@KevinCruijssen Done. – Laikoni – 2018-01-25T21:29:11.807

5

Know how to learn the language

After all, how can anyone golf in a language they can't use!

Online

Clean isn't a well-known or well-documented language, and the name certainly doesn't make it easy to find much-needed resources to remedy these issues... or does it?

Clean was originally called Concurrent Clean, which is still used in the preface of almost every document related to Clean - so if you're looking for Clean, look for Concurrent Clean instead.

One of Clean's more remarkable similarities to Haskell (of which there are many) is the existence of Cloogle, a function search engine in the spirit of Haskell's Hoogle, which covers the libraries that Clean ships with.

Locally

The libraries that Clean ships with are in the form of decently-commented, somewhat self-documenting Clean source files, which can be browsed using the IDE.
(It also comes with full example programs, under $INSTALL/Examples.)

Speaking of which, the Windows version of Clean comes with an IDE - while it is fairly limited by modern standards, it's worlds better than using a text editor and the command-line.
The two most useful features (in the context of learning) are:

  • You can double-click an error to see which line it's on
  • You can highlight a module name and press [Ctrl]+[D] to open the definition file (or use [Ctrl]+[I] for the implementation file), and toggle between the definition and implementation file with [Ctrl]+[/]

Οurous

Posted 2018-01-24T22:17:35.277

Reputation: 7 916

4

Forget about character encoding

Clean's compiler doesn't care about what encoding you think you've saved the source file as, just about the byte values in the file. This has some neat consequences.

In the body of the source code, only bytes with code-points corresponding to the printable ASCII characters are allowed, in addition to those for \t\r\n.

Literals:

In String and [Char] literals ("stuff" and ['stuff'] respectively), any bytes except 0 are allowed, with the caveat that " and ' must be escaped (for String and [Char] respectively), and that newlines and carriage returns must be replaced with \n and \r (also respectively).

In Char literals, any byte except 0 is permitted, meaning that:

'\n'

'
'

Are the same, but the second is one byte shorter.

Escaping:

Other than the standard letter escapes \t\r\n (etc.), all non-numeric escape sequences in Clean are either for the backslash, or for the quote used to delimit the literal the escape is inside.

For numeric escape sequences, the number is treated as an octal value terminated after three digits. This means that if you want a null followed by the character 1 in a String, you need to use "\0001" (or "\0\61") and not "\01". However, if the escape is followed by anything other than a digit, you can omit the leading zeroes.

Consequences:

This quirk with how Clean handles its source files allows String and [Char] to effectively become sequences of base-256 single-digit numbers - which has a multitude of uses for code-golf, such as storing indexes (up to 255, of course).
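A minimal sketch of the lookup-table idea, assuming the octal escapes behave as described above: the days in each month are stored as one byte each and read back with toInt. The names t and days are hypothetical, and in an actual golfed answer the raw bytes would be written directly instead of the escapes shown here for readability.

```clean
module main
import StdEnv

// Hypothetical table: days per month, one byte per entry,
// written with octal escapes (\37 = 31, \34 = 28, \36 = 30)
t :: String
t = "\37\34\37\36\37\36\37\37\36\37\36\37"

// 0-based month index -> number of days
days :: Int -> Int
days m = toInt (t.[m])

// Looks up the second entry (February)
Start = days 1
```

Each entry costs at most one byte of source when written raw, which is usually far shorter than a list of Int literals.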

Οurous

Posted 2018-01-24T22:17:35.277

Reputation: 7 916

3

Name functions with symbols

When defining a function, it is often shorter to use some combination of !@$%^&*~-+=<:|>.?/\ than to use alphanumeric characters, because it allows you to omit white-space between identifiers.

For example: ?a=a^2 is shorter than f a=a^2, and invoking it is shorter as well.

However:

If the function identifier is used adjacent to other symbols, which can combine to form a different, but valid identifier, they will all be parsed as one identifier and you'll see an error.

For example: ?a+?b parses as ? a +? b

Additionally:

It is possible to overwrite imported identifiers in Clean, and so the only single-character symbol identifiers that aren't already used in StdEnv are @$?. Overwriting ^-+ (etc.) can be useful if you need more symbolic identifiers, but be wary that you don't overwrite one you're using.
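A small sketch of both the saving and the pitfall, using $ as the function name (an arbitrary choice among the free symbols):

```clean
module main
import StdEnv

// $ is a symbolic name, so no space is needed between it and its argument
$a=a^2

// The spaces before each $ are required here: without them,
// =$ and +$ would each be read as a single identifier
Start= $3+ $4
```

The spaces that remain are the cost of placing a symbolic name directly after another symbol, so it is worth checking which characters will surround each call before choosing between a symbol and a letter.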

Οurous

Posted 2018-01-24T22:17:35.277

Reputation: 7 916

3

Know your Knodes

One of the strongest constructs (for golfing) in functional languages is let ... in ....
Clean, of course, has this, and something better - the #.

What is a node?

Clean's #, and the ubiquitous | (pattern guard) are both known as 'node expressions'.
Notably, they allow you to program imperatively-ish in Clean (which is really good here!).

The # (let-before):

These both compute the value of an integer given as a string, multiplied by the sum of its character codes:

f s=let i=toInt s;n=sum[toInt c\\c<-:s]in n*i

f s#i=toInt s
#s=sum[toInt c\\c<-:s]
=s*i

Note how the version with # is shorter, and how we can redefine s. This is useful if we don't need the value that a variable has when we receive it, so we can just re-use the name. (let can run into issues when you do that)

But using let is easier when you need something like flip f = let g x y = f y x in g

The | (pattern guard):

Clean's pattern guard can be used like those in many other functional languages. However, it can also be used like an imperative if ... else ..., and as a shorter version of the ternary expression.

For example, these all return the sign of an integer:

s n|n<>0|n>0=1= -1
=0

s n=if(n<>0)if(n>0)1(-1)0

s n|n>0=1|n<0= -1=0

Of course, the last one, which uses the guard more traditionally, is the shortest. The first one shows that guards can be nested (though only two unconditional return clauses can appear on the same line under the layout rule), and the second shows what the first one does logically.

A note:

You can use these expressions basically anywhere. In lambdas, case ... of, let ... in, etc.

Οurous

Posted 2018-01-24T22:17:35.277

Reputation: 7 916

1

Use shorter lambdas

Sometimes you find yourself using a lambda expression (to pass to map, or sortBy, etc.). When writing lambdas, there are several ways you can do it.

The right way:

This is sortBy, with an idiomatic lambda sorting lists from longest to shortest

sortBy (\a b = length a > length b)

The other right way:

If you're using Data.Func, you can also do

sortBy (on (>) length)

The short way:

This is the same thing, but with a golfier syntax

sortBy(\a b=length a>length b)

The other way:

Using composition isn't shorter this time, but it can be shorter sometimes

sortBy(\a=(>)(length a)o length)

The other other way:

While it's a bit contrived here, you can use guards in lambdas

sortBy(\a b|length a>length b=True=False)

And also let-before node expressions

sortBy(\a b#l=length
=l a>l b)

A note:

There are two more forms of lambda, (\a b . ...) and (\a b -> ...). The latter is identical to the = variant. The former exists for some reason and often looks like you're trying to access a property of something rather than define a lambda, so don't use it.
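For reference, here is a minimal complete program around the short form, assuming it is saved as main.icl. The input list is made up for illustration.

```clean
module main
import StdEnv

// sortBy takes an "a comes before b" predicate, so comparing lengths
// with > orders the lists from longest to shortest
Start = sortBy (\a b=length a>length b) [[1],[1,2,3],[1,2]]
```

Swapping in any of the other lambda forms from this answer should leave the behaviour unchanged, which makes this a convenient harness for comparing byte counts.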

Οurous

Posted 2018-01-24T22:17:35.277

Reputation: 7 916

After seeing some of your golfed programs, I had got the impression \a=... was Clean's usual lambda syntax :P – Ørjan Johansen – 2018-01-25T09:37:20.647

You could also add the guards in lambda, as used here. This is undocumented (it even contradicts the language report), but works. Also, -> and = for lambdas are identical as far as the compiler is concerned (-> is old syntax). Only . is different (but I don't know exactly how). – None – 2018-05-18T19:56:21.033

And in this particular example you could consider using on(<)length, although the Data.Func import will break you up unless you need it already. – None – 2018-05-18T20:00:31.430

@Keelan Cool. I'll update this later today. I think you can also use let-before (#) in lambdas. – Οurous – 2018-05-18T20:26:01.250

Yes, you can :-) – None – 2018-05-18T20:26:40.887

1

If you're using a String you should be using Text

Conversion to strings, and manipulation of strings (the {#Char} / String kind, not the [Char] kind) is quite lengthy and bad for golfing. The Text module remedies this.

Conversion:

Text defines the operator <+ for any two types which have toString defined.
This operator, used as a<+b is the same as toString a+++toString b - saving at least 19 bytes. Even if you include the extra import, ,Text, and use it only once, it still saves 14 bytes!

Manipulation:

Text defines a few string-manipulation staples that are missing from StdEnv:

  • The operator + for strings, which is much shorter than +++ (from StdEnv)
  • indexOf, with the C-like behaviour of returning -1 instead of Nothing on failure
  • concat, which concatenates a list of strings
  • join, which joins a list of strings using a separator string
  • split, which splits a string into a list of strings on a substring
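A short sketch combining a few of these, assuming the Clean Platform libraries are on the search path and that split and join both take the separator as their first argument (worth checking against Text.dcl before relying on it):

```clean
module main
import StdEnv, Text

// split breaks the string on spaces, join glues the pieces back with commas,
// and <+ appends the Int after converting both sides with toString
Start = join "," (split " " "a b c") <+ 42
```

Since function application binds tighter than any operator, the join call is evaluated as a whole before <+ is applied, so no extra parentheses are needed around it.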

Οurous

Posted 2018-01-24T22:17:35.277

Reputation: 7 916

0

Use Character List Literals

A character list literal is a shorthand way of writing something like ['h','e','l','l','o'] as ['hello'].

This isn't the limit of the notation, for example:

  • repeat'c' becomes ['c','c'..] becomes ['cc'..]
  • ['z','y'..'a'] becomes ['zy'..'a']
  • ['beginning']++[a,b,c]++['end'] becomes ['beginning',a,b,c,'end']
  • ['prefix']++suffix becomes ['prefix':suffix]

These work in matching too:

  • ['I don't care about the next character',_,'but I do care about these ones!']
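A minimal sketch of the matching form and the cons form together. The function name m and the sample inputs are hypothetical.

```clean
module main
import StdEnv

// Hypothetical matcher: the characters a and b, then any single character, then d
m :: [Char] -> Bool
m ['ab',_,'d'] = True
m _            = False

// ['ab':rest] prepends the characters a and b to an existing list
Start = (m ['abcd'], m ['abd'], ['ab':['cd']])
```

The pattern only matches lists of exactly four characters, so the second call falls through to the catch-all alternative.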

Οurous

Posted 2018-01-24T22:17:35.277

Reputation: 7 916

0

Sometimes code is shorter

Clean has a bunch of really useful functions in the standard libraries, some of which are incredibly verbose to use without access to *World, and using *World in code-golf is generally a bad idea anyway.

To get around this problem, there are often ccalls you can use inside code blocks instead.

Some examples:

System Time

import System.Time,System._Unsafe
t=toInt(accUnsafe(time))

The above is 58 bytes, but you can save 17 bytes (down to 40+1) with:

t::!Int->Int
t _=code{ccall time "I:I"
}

Random Numbers

This one doesn't save bytes on its own, but avoids having to pass around a list from genRandInt:

s::!Int->Int
s _=code{ccall time "I:I"ccall srand "I:I"
}
r::!Int->Int
r _=code{ccall rand "I:I"
}

Other Uses

In addition to these two, which are probably the main uses for this in code-golf, you can call any named function (including but not limited to every syscall), embed arbitrary assembly with instruction <byte>, and embed code for the ABC machine.

Οurous

Posted 2018-01-24T22:17:35.277

Reputation: 7 916