How's your string theory?

3

3

Back to the basics! You're in you comp sci class again, except wiser and golfier. You have the tools, you have the power… but do you have the skill? Probably.

Crash Course for the non-Computer Scientist (skip if you don't need it)

A string is symbolically represented by an arbitrary letter (usually, for our purposes, will be etc. Of course, we know a string as something like "Hello, World!" or 'Hello, World!', or, if you're the snarky Python user, """Hello, World!""".

An alphabet can be thought of as a set (i.e. no repeating characters) of symbols, of which strings can be made. For example, ; this is the binary alphabet, of which contains strings like and . These strings can said to be strings over , or, alternatively, .

The length of any string is denoted . For example, .

There is the empty or null string , which is equivalent to "". It is a valid string in over any alphabet. In other words, is the unique string satisfying the property .

The set is the set of all strings over for which .

The set is called the Kleene closure of , and is the set of all strings over . This can be considered the language of .

The concatenation of two strings and is denoted , and is the string containing the characters of followed by the characters of .

is said to be a prefix of if there is some string that satisfies the equality . (If , then is said to be a proper prefix of .)

The alphabet of a string is denoted , and is the minimal alphabet so that is a string over

The projection of a string relative to an alphabet is defined as

and is essentially the removal of all characters in not in .


TL;DR

Objective Write a program, series of functions, series of programs etc. that can calculate as many of the following tasks as possible:

  • (1) Calculate, given a string s, its length.
  • (2) Arrive at, given a string s and an alphabet Σ, whether or not (output a truthy/falsy value) that string is over said alphabet.
  • (1) Given an alphabet, arrive at whether or not (output a truthy/falsy value) the alphabet is valid (a valid alphabet can have no repeating characters, but is otherwise valid).
  • (4) Given an alphabet Σ and a positive integer n, display .
  • (6) Given an alphabet Σ and an integer n, display the first n entries in (Remember that ε ∈ Σ!)
  • (1) Given a string s and a string t, output st.
  • (3) Given a string s, output Al(s).
  • (5) Given a string s and an alphabet Σ, output πΣ(s).

Input is given by a string or your language's closest equivalent for a string input, an array, tuple, or n-ordinate of characters for an alphabet (your language's closest equivalent), and a number for a number (or your language's closest equivalent, heaven forbid your language does not support numbers!).

Each task has a score (the number in parenthesis). Your submission's score is calculated as follows:

        sum points scored
score = —————————————————
         length of bytes

Highest score wins.

Bonuses

  • ×4 final score iff you output (with the correct symbols!) the input query. E.g., outputting \pi_{\{0,1\}}(012345678910111213)=01101111 for an input of Σ = [0,1]; s = 012345678910111213. (You may choose to do this in either HTML or (La)TeX. Either counts.)
  • (I'm open to suggestions.)

Conor O'Brien

Posted 2015-10-25T18:33:50.973

Reputation: 36 228

Question was closed 2019-10-20T12:55:19.540

Also, for the bonus, how would you notate the third or sixth challenges? – LegionMammal978 – 2015-10-25T19:11:51.270

10@CᴏɴᴏʀO'Bʀɪᴇɴ There is currently no incentive to solve more than 1 challenge, because any challenge with a worse ratio will give a lower score. The scoring system needs to be changed to make this challenge work. I would recommend making it points/length for each problem, and then sum the scores for each problem completed. Most points wins. – isaacg – 2015-10-25T19:17:12.510

2@CᴏɴᴏʀO'Bʀɪᴇɴ That's almost there, but not quite. The point is that answering additional challenges should always improve your score. So you should sum the scores of each challenge, not compute an overall points number. – isaacg – 2015-10-25T20:17:31.893

4This is a situation in which the sandbox would have been handy to sort out issues prior to posting. – Alex A. – 2015-10-25T20:29:00.557

@isaacg actually, you can share code between the challenges. – Paŭlo Ebermann – 2015-10-25T20:41:01.867

For (6) is there a specific order required? I.e. what are the first elements of an infinite set? – Paŭlo Ebermann – 2015-10-25T20:42:44.030

For the bonus, is using unicode characters directly (and not HTML entities) also okay? – Paŭlo Ebermann – 2015-10-25T20:57:30.940

@PaŭloEbermann Yes. – Conor O'Brien – 2015-10-25T20:58:51.347

1The scoring has not really changed with the latest edits. It's now the inverse of what it was originally, and the highest wins. This changes the scores, but not the order of them. If somebody had score s1 before, which was lower than score s2, the new scores are now 1 / s1 and 1 / s2, where the first score is higher. – Reto Koradi – 2015-10-25T21:14:26.030

@RetoKoradi What do you suggest? – Conor O'Brien – 2015-10-25T21:21:44.993

1Task 6, the first n seems meaningless for a set. Do you mean any n? – edc65 – 2015-10-25T22:16:40.570

I assume the order is meant to be lexicographical @edc65 – Lynn – 2015-10-25T22:26:57.893

@Mauris A lexicographical order requires an order on the alphabet. And we don’t know the alphabet. – Édouard – 2015-10-25T23:17:22.813

How do we parse "complete anything for the challenge?" Is JavaScript's built-in string type unsuitable for [1] because it has a .length property and therefore completes something for the challenge? Or is it merely forbidden to use that property? If our programming language contains lazy generators can we require that the $\Sigma$ "input" be the generator for $\Sigma^*$ ? – CR Drost – 2015-10-25T23:47:12.447

@CRDrost For the former, it is acceptable. For the latter, I said that Input is given by the most convenient method that does not complete anything for the challenge. E.g., you can accept a string for the first challenge, but not the length of the string. Specifically, the use of an input that effectively solves the problem is forbidden. – Conor O'Brien – 2015-10-26T01:28:41.557

@CᴏɴᴏʀO'Bʀɪᴇɴ I mean, I view "effectively solves the problem" as vague in these scenarios where interconversion is so trivial, which is why I'm asking you to clarify. – CR Drost – 2015-10-26T14:33:16.593

@CRDrost Thanks for clarifying your problem. It is a tad vague. I will disambiguate – Conor O'Brien – 2015-10-26T15:14:47.563

@CᴏɴᴏʀO'Bʀɪᴇɴ OK. I posted a new entry while you were revising that but it was invalidated by your rules change, so I have put its heading in strikethrough font; I think it might be worth keeping it for historical reasons. – CR Drost – 2015-10-26T15:28:07.107

0 counts as falsy in all languages, right? – user8397947 – 2016-06-07T01:41:25.417

@dorukayhan It depends on the language. Typically, if code inside an if statement runs with a value as a condition, the value is considered truthy – Conor O'Brien – 2016-06-07T02:03:48.970

4

I'm voting to close this question as off-topic because it is a multi-part challenge with insufficient interaction between the parts

– pppery – 2019-10-19T18:57:24.540

@pppery Thanks, but you’re retroactively applying a standard that wasn’t in place at the time the question was asked. Not really applicable – Conor O'Brien – 2019-10-19T19:21:29.517

Rules about question on-topicness apply to questions that predate them. Othertwise, First code golf decathlon would never have been closed and locked. All that has happened is that this clearly off-topic question has slipped through the cracks for three years -- well, the time has come for that to be remedied.

– pppery – 2019-10-19T19:24:36.343

@pppery Alright. I don't really care, not terribly invested in this site. Seems stupid and pedantic, but do whatever. Good luck getting this closed. – Conor O'Brien – 2019-10-19T19:59:38.003

Answers

6

APL, score 3

Task 7: 3 / 1

This is a built-in that – according to this discussion – can be considered an unnamed function.

Try it online on TryAPL.

Dennis

Posted 2015-10-25T18:33:50.973

Reputation: 196 637

I don't know APL, so I'm just trying to understand: The meta discussion says that your code must define a function that was not defined before. Isn't this just using a built-in function? Does it actually define a new function? – Reto Koradi – 2015-10-26T16:20:06.267

3The accepted answer says that sum would be a valid Python answer (if it did what asked for), and the comments clarify that J/APL built-ins are valid for the same reason. is like an anonymous function, which you can name via f←∪. – Dennis – 2015-10-26T16:29:52.797

3

I think, under the current scoring system, my best score is:

Pyth, 5/3

@wz

Task 8


Pyth, 295/42 ~ 7.024

Task 1: 1/2

lz

Task 2: 2/4

!-wz

Task 3: 1/3

.{z

Task 4: 4/3

^zQ

Task 5: 6/7

<s^LzQQ

Warning: Extremely slow.

Task 6: 1/3

+zw

Task 7: 3/2

{z

Task 8: 5/3

@wz

Demonstration

Exactly two of these are not Pyth builtins, namely tasks 2 and 5.

isaacg

Posted 2015-10-25T18:33:50.973

Reputation: 39 268

Please revise to fit scoring criteria. – Conor O'Brien – 2015-10-25T20:12:06.600

2

Mathematica, score 64033/52668 = 1.215785676

StringLength
StringMatchQ[#,#2..]&
#==DeleteDuplicates@#&
""<>#&/@Tuples@##&
Table[""<>#&/@#~Tuples~i,{i,#2}]&
#<>#2&
Union@Characters@#&
""<>StringCases@##&

LegionMammal978

Posted 2015-10-25T18:33:50.973

Reputation: 15 731

Of course. There's always a Mathematica answer for stuff like this! – kirbyfan64sos – 2015-10-25T18:58:28.847

Please revise to fit scoring criteria. – Conor O'Brien – 2015-10-25T20:12:38.523

Neat, I didn't know about \[Function], but I think here you can save two bytes with Table[""<>#&/@#~Tuples~i,{i,#2}] (same number of characters, but no Unicode). – Martin Ender – 2015-10-26T09:08:28.887

@MartinBüttner Yeah, using the Unicode can be a good golfing strategy at times. Too bad that most of them are in the Private Use Area and can't be displayed normally... – LegionMammal978 – 2015-10-26T10:45:19.647

2

Pyth, score 1/2 = 0.5

Challenge 1:

lz

Try it out

Adnan

Posted 2015-10-25T18:33:50.973

Reputation: 41 965

Please revise to fit scoring criteria. – Conor O'Brien – 2015-10-25T20:12:12.020

2

CJam, score 3/3 = 1.00

Best score for the current definition of the scoring function counts only task 7:

r_&

This is a complete program, and scores 3 points with 3 bytes of code.

As bonus material, here are my solutions for all the tasks but task 5:

Task 1, score 1/2:

r,

Task 2, score 2/4:

rr-!

Task 3, score 1/5:

r__&=

Task 4, score 4/7:

rrim*S*

Task 6, score 1/2:

rr

Task 7, score 3/3:

r_&

Task 8: score 5/6:

rr_@--

Reto Koradi

Posted 2015-10-25T18:33:50.973

Reputation: 4 870

2

Ruby, score = 17 / 153 = 0.111

I've just started with the tasks, I'll try to solve all of them.

Task 1 (13 bytes):

a=->s{s.size}

Task 2 (33 bytes):

b=->s,a{s.chars.all?{|c|a[c]!=p}}

Task 3 (28 bytes):

c=->a{a.chars==a.chars.uniq}

Task 4 (88 bytes, awfully long...):

d=->a,n{a.chars.map{|e|e*=n}.join.chars.permutation.map(&:join).map{|e|e=e[0...n]}.uniq}

Tash 6 (12 bytes):

f=->s,t{s+t}

Task 7 (19 bytes):

g=->s{s.chars.uniq}

Task 8 (48 bytes):

h=->s,a{g='';s.chars.map{|c|a[c]==p ? 0:g+=c};g}

Peter Lenkefi

Posted 2015-10-25T18:33:50.973

Reputation: 1 577

Please revise to fit scoring criteria. – Conor O'Brien – 2015-10-25T20:11:58.750

@CᴏɴᴏʀO'Bʀɪᴇɴ Done, thanks! – Peter Lenkefi – 2015-10-25T20:18:50.353

1

Ceylon score 16/167 = 1/11 = 0.090909...

This code of length 176 defines for projection, size and the iterators, giving 5 + 1 + 4 + 6 = 16 points:

import ceylon.language{S=String,B=Boolean}import ceylon.collection{H=HashSet}S al(S s)=>S(H{*s});B v(S s)=>s==al(s);S p(S a,S s)=>S(s.filter(a.contains));B i(S s,S a)=>s==p(a,s);S c(S s,S t)=>s+t;Integer(S)l=S.size;

The alphabet is also provided in form of a string (a character iterable {Character*} would work just the same, but take more code space).

I did actually implement all of the functions, but the other ones took more length per point.

All of them formatted and commented:

import ceylon.language {
    S=String,
    I=Integer,
    B=Boolean
}

// minimal alphabet of a string → 3 points
// taken from http://codegolf.stackexchange.com/a/59789/2338
S b(S s) => S { for (x->c in s.indexed) if (!c in s[0:x]) c };

// is the alphabet valid? → 1 point
// we simply check if it is its own minimal alphabet.
B v(S a) => a == b(a);

// project string to alphabet → 5 points
S p(S a, S s) => S(s.filter(a.contains));
//  S p(S a, S s) => S(s.filter((c)=>c in a));

// is string s over alphabet a? → 2 points
// we check if the string is its own projection.
B i(S a, S s) => s == p(a, s);

// concatenate → 1 point
S c(S s, S t) => s + t;

// length of a string → 1 point
// uses a different syntax than the usual I l(S s) => s.size, because it is slightly shorter. 
I(S) l = S.size;

// strings of length n over alphabet a → 4 points
{S*} x(S a)(I n) =>
// defined recursively.
        n == 0
        then { "" }
        else { for (s in x(a)(n - 1))
        for (c in a)
            S { c } + s };

// first n strings → 6 points
{S*} y(S a, I n) => expand((0..n).map(x(a))).take(n);

(I guess I could replace the .. by : in the last line, but then the score wouldn't be so nice anymore.)

Paŭlo Ebermann

Posted 2015-10-25T18:33:50.973

Reputation: 1 010

1

Haskell, score=1/3

If function is not allowed, this submission is invalid.

The best score is task 8 with score (5/16).

Task 1(1/6):

length

I somehow misread Prelude.

Task 2(2/13):

all.flip elem

First argument is alphabet, the second is string

Task 3(1/32):

c[]=2>1;c(x:xs)=all(/=x)xs&&c xs

Sigh, is is overly long.

Task 4(1/16):

_#0=[[]];x#y=concatMap(flip map$x#(y-1))$map(:)x

Task 5(1/12):

_#0=[[]];x#y=concatMap(flip map$x#(y-1))$map(:)x;n&k=take k.map(n#)[1..]

Task 6(1/4 or 1/2):

(++)

I don't know if the paratheses is allowed to be dropped in this case

Task 7(3/32 or 3/22 or 3/20):

It is same as delete duplicates

a[]=[];a(x:y)=x:(a$filter(\=x)y)

If using library is not cheating

import Data.List
a=nub

Or even drop the a=

Task 8(5/16):

For Haskell it is ridiculously easy.

filter.flip elem

Akangka

Posted 2015-10-25T18:33:50.973

Reputation: 1 859

Functions are allowed, you can drop the a= in task 7, dropping the parenteses should be fine, and in task 3 change xs to y or some other single letter. – CalculatorFeline – 2016-06-07T00:23:31.493

1

Haskell, 5, possible cheating

(This was invalidated by a rules update.)

This strongly depends on how you interpret the minutiae of the grading rules.

$

I claim that this solves task 8 (5 points) in 1 byte for a reasonable interpretation of the inputs. I have asked for clarification and not really received much, so I am posting this solution in order to induce more discussion.

The scope of the liberties I am taking

The freedom afforded to us is not "You have to use the most obvious thing that someone would use if they talked about a "string" in your language" but rather,

Input is given by the most convenient method that does not complete anything for the challenge. E.g., you can accept a string for the first challenge, but not the length of the string.

I am choosing some methods as "convenient" which are convenient in two niche areas of programming. They do not "complete anything for the challenge" in the sense that neither one completes the challenge by itself, nor does the challenge have any well-defined subtasks (and if these challenges did have subtasks then builtins would likely be disallowed!).

Stealing a lambda-calculus-style list

In lambda calculus the usual way that you encode a list is the Church encoding with direct reference to its foldr; a variant encoding will run the function on the empty list [], it is constrained only in its output type and not in its function. In other words, list_from_fn f = f (:); fn_from_list ls f = foldr f [] ls. I think that this is relatively noncontroversial because this is in fact a usual way to think of a list in functional programming circles.

Stealing a Clojure-transducer-style alphabet.

The specification of the alphabet is more contorted and may fall foul of the rather-vague rule. For my alphabet I find it quite convenient to use a filtering arrow, more precisely the sort of filtering arrow you'd see in the Kleisli representation of the List monad. So that requires maybe a bit of explaining: there are these functions from a -> [b] which are composed, in the Kleisli category, with the function (.).concatMap. A filtering arrow is the specific a -> [a] which has the form

filterArrow predicate a = if predicate a then [a] else []

So far, so good: an alphabet is equivalent to its filter which is equivalent to its filter arrow, which makes perfect sense in a certain class of applications. However, these applications are used in practice: the programming language Clojure a few years ago started a big hacker-media frenzy when they introduced this Kleisli category a -> [b] into mainline Clojure under the name "Transducers," with the only difference being that they were transformed with a bit of continuation-passing so that the order of the composition was reversed.

So both of those seem to be relatively straightforward and real programmers really use both.

Where cheating may be occurring

However where I might be cheating is that I actually am expecting as "convenient" for the list function to be the concatenating foldr applied to the empty string, (a -> [b]) -> [b]. This is still 100% equivalent to a list, as evidenced by the isomorphism

list_from_fn f = f (:[]) 
fn_from_list ls f = foldr ((++).f) [] ls

however I do not have any prior art of its widespread usage to represent lists. It's quite possible that someone has done it before, but I can't prove it.

CR Drost

Posted 2015-10-25T18:33:50.973

Reputation: 949

1

dc: 1k 1 1+2 3+/p: 0.4 points

Strings s and t, as applicable, must be present on the top of the stack at execution time, with s topmost.

Calculate length of string s: 1pt, 2B

Zp

Concatenate s and t: 1pt, 3B

rnn

Joe

Posted 2015-10-25T18:33:50.973

Reputation: 895

1

Java, score: 8 / 491 467 465 462 ~= 0.0173

I just wanted to see how many tasks I can figure out.

import static java.lang.System.*;interface a{static void main(String[]A){char[]Z=A[1].toCharArray();switch(A[0]){case"a":out.print(A[1].length());break;case"b":int b=1;for(char B:Z)b=A[2].contains(new String(c))?b:0;out.print(b>0);break;case"c":String B="";int c=1;for(char C:Z){c=B.contains(new String(C))?0:c;B+=C;}out.print(c>0);break;case"d":out.print(A[1]+A[2]);break;case"e":String C="";for(char d:Z)C+=C.contains(new String(d))?"":d;out.print(C);break;}}}

The program (ab)uses command line arguments to get the tasks done. Here's how it works (note that every argument should be enclosed in double quotes like "arg1" "arg2" "arg3"):

  • If the first argument is a, the second argument's length is printed (calculates the length of a given string). Syntax: a <string>
  • If the first argument is b, true is printed in case of the second argument being over the third argument, which is taken as an alphabet (given a string and an alphabet, find out if the string is over the alphabet). Syntax: b <string> <alphabet>
  • If the first argument is c, true is printed if the second argument is a valid alphabet where every character occurs only once (given an alphabet, find out if it's valid). Syntax: c <alphabet>
  • If the first argument is d, the concatenation of the second and third arguments is printed - the second argument comes first (given two strings, concatenate them). Syntax: d <first string> <second string>
  • If the first argument is e, the smallest alphabet which the second argument is over is printed (given a string, find out its alphabet). Syntax: e <string>

Oh wait, let me ungolf the program for you:

import static java.lang.System.*;
interface a {
    static void main(String[] A) {
        char[] Z = A[1].toCharArray();
        switch (A[0]) {
        case "a":
            out.print(A[1].length());
            break;
        case "b":
            int b = 1;
            for (char B : Z)
                b = A[2].contains(new String(c)) ? b : 0;
            out.print(b > 0);
            break;
        case "c":
            String B = ""; int c = 1;
            for (char C : Z){
                c = B.contains(new String(C)) ? 0 : c;
                B += C;
            }
            out.print(c > 0);
            break;
        case "d":
            out.print(A[1] + A[2]);
            break;
        case "e":
            String C = "";
            for (char d : Z)
                C += C.contains(new String(d)) ? "" : d;
            out.print(C);
            break;
        }
    }
}

user8397947

Posted 2015-10-25T18:33:50.973

Reputation: 1 242

0

Python 3, 27 / 158 ≈ 0.1709

I tried for as many tasks as possible instead of the highest score possible.

from itertools import*
L=len
lambda s,a:all(c in a for c in s)
lambda a:L(set(a))==L(a)
permutations
str.__add__
set
lambda s,a:filter(a.__contains__,s)

Task 1:

L=len

Pretty self-explanatory. I assign it because it's used later, and naming it L saved 4 bytes.

Task 2:

lambda s,a:all(c in a for c in s)

This maps s to a list of boolean values telling if the corresponding character is in the string. If they are all True, then s is over a.

Task 3:

lambda a:L(set(a))==L(a)

Here's where I used that len rename. In Python, the set class is a container of unique objects, and if the constructor is given a container with duplicates, it removes them.

Task 4:

permutations

The Python library function itertools.permutations does exactly what is needed.

Task 5:

nope

Task 6:

str.__add__

Shamelessly co-opting class methods. Not much else to say.

Task 7:

set

Once again, the beautiful uniquefying set comes in handy. Here, it automatically finds the minimal alphabet.

Task 8:

lambda s,a:filter(a.__contains__,s)

Filters out the characters in s s which are not also in a. Alternative with the same length: [c for c in s if c in a]

jqblz

Posted 2015-10-25T18:33:50.973

Reputation: 2 062