Your challenge is to write N snippets of code such that, when you concatenate the first K ≥ 1 together, they produce the number K. The higher N, the better. Here's the catch: you may not use any character more than once across your snippets.

Rules

You may use the same character multiple times in one (and only one) snippet.
These snippets must be concatenated in the order they are presented, without skipping any.
You must write at least two snippets.
All snippets must be in the same language.
Remember: snippets do not have to be full programs or functions, nor do they have to function on their own. -1 is a valid snippet in Java, e.g.
All resulting concatenations must output the respective K value.
The winner is the person with the highest N value. Tie-breaker is shortest overall program length in bytes.

Example

Suppose your snippets were AD, xc, 123, and ;l. Then:

AD should produce 1
ADxc should produce 2
ADxc123 should produce 3
and ADxc123;l should produce 4.

This program would have a score of 4.
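As a rough aid for checking an entry against the uniqueness rule, here is a minimal Python sketch. The helper names `shared_characters` and `prefixes` are made up for illustration, and actually running each prefix in the target language is left out.

```python
from collections import Counter


def shared_characters(snippets):
    """Return the characters that occur in more than one snippet.

    Repeats inside a single snippet are allowed by the rules, so each
    snippet contributes every distinct character only once.
    """
    owners = Counter()
    for snippet in snippets:
        for ch in set(snippet):
            owners[ch] += 1
    return {ch for ch, count in owners.items() if count > 1}


def prefixes(snippets):
    """Return the K concatenations that must output 1, 2, ..., N."""
    return ["".join(snippets[:k]) for k in range(1, len(snippets) + 1)]


example = ["AD", "xc", "123", ";l"]
print(shared_characters(example))  # empty set means the entry is valid
print(prefixes(example))
```

An entry is acceptable under the character rule when `shared_characters` returns an empty set.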

Conor O'Brien

Posted 2017-10-31T18:20:26.190

Reputation: 36 228

Since they have to be snippets, in a stack-based language, the numbers can just be pushed on the stack, right? – totallyhuman – 2017-10-31T18:58:24.633

To add to totallyhuman's question, in a stack-based language is top-of-stack the only value that matters? That is, could the first two snippets in dc be 1 and 2? – brhfl – 2017-10-31T19:26:49.867

@totallyhuman I would say no--in a stack based environment, if you had multiple values on the stack, you "produced" more than one value, instead of the requested one integer. – Conor O'Brien – 2017-10-31T22:48:18.727

@brhfl See above. – Conor O'Brien – 2017-10-31T22:48:26.467

@ConorO'Brien Could just the top of the stack be considered output? 'Cause otherwise, it's practically impossible in a stack-based language with no implicit IO... – totallyhuman – 2017-11-01T02:18:10.540

@totallyhuman ... You kidding me? I suggest you rethink your approach, I have my own version in a stack based language without implicit output. – Conor O'Brien – 2017-11-01T02:19:33.827

@totallyhuman maybe a snippet should pull the old value off and push another value on the stack... – Heimdall – 2017-11-01T05:03:48.507

Answers

Python 3, 1 112 056 snippets, 4 383 854 bytes

This is very similar to @WheatWizard's Python 2 answer. I started working on this shortly before it was posted, but sorting out Python's quirks regarding non-ASCII characters and long lines took some time. I discovered that Python reads lines 8191 bytes at a time, and when those 8191 bytes contain only a part of a multi-byte character, Python throws a SyntaxError.
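To make the boundary problem concrete, here is a small Python sketch that lists the characters whose UTF-8 bytes would cross a multiple of 8191. It assumes the chunking starts at byte 0 of the text passed in (the answer describes the reads as per line), and `straddling_characters` is a hypothetical helper, not part of the answer.

```python
CHUNK = 8191  # chunk size reported in the answer; taken on trust here


def straddling_characters(text, chunk=CHUNK):
    """Return the characters whose UTF-8 bytes cross a chunk boundary."""
    offset = 0
    found = []
    for ch in text:
        size = len(ch.encode("utf-8"))
        # A character straddles when its first and last byte
        # fall into different chunks.
        if offset // chunk != (offset + size - 1) // chunk:
            found.append(ch)
        offset += size
    return found


print(straddling_characters("a" * 8190 + "é"))  # the 2-byte é crosses byte 8191
print(straddling_characters("a" * 8189 + "é"))  # fits entirely in the first chunk
```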

The first snippet uses an encoding from Fewest (distinct) characters for Turing Completeness.

exec('%c'%(111+1)+'%c'%(111+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+11)+'%c'%(111+1+1+1+1+1)+'%c'%(11+11+11+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+1+1+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+11)+'%c'%(11+11+11+1+1+1+1+1+1+1)+'%c'%(111)+'%c'%(111+1)+'%c'%(11+11+11+11+11+11+11+11+11+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+11)+'%c'%(11+11+11+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+1+1+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+1+1)+'%c'%(11+11+11+11+11+11+11+11+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+1+1+1+1+1+1+1)+'%c'%(11+11+11+1+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+1+1)+'%c'%(111+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+1+1)+'%c'%(11+11+11+11+11+11+11+11+1+1+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+1)+'%c'%(11+11+11+1+1+1+1+1+1+1)+'%c'%(11+11+11+1+1+1+1+1+1+1+1)+'%c'%(11+11+11+1+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+1)+'%c'%(11+11+11+11+1+1+1+1+1)+'%c'%(11+11+11+11+1+1+1+1+1+1)+'%c'%(11+11+11+11+11)+'%c'%(11+11+11+11+1+1+1+1)+'%c'%(11+11+11+1+1+1+1+1+1+1++++++++++1))

This monstrosity simply builds the following string and executes it.

print(len(open(__file__).read())-1260)
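The encoding can be reproduced with a short Python sketch. It writes each character code as a sum of 111s, 11s and 1s, so the result uses only the characters `exc%'()+1`. The function names are illustrative, and the sketch does not reproduce the run of unary pluses at the very end of the original snippet.

```python
def encode_char(ch):
    """Encode one character as '%c'%(sum of 111s, 11s and 1s)."""
    n = ord(ch)
    terms = []
    for unit in (111, 11, 1):
        count, n = divmod(n, unit)
        terms += [str(unit)] * count
    return "'%c'%(" + "+".join(terms) + ")"


def encode_program(source):
    """Wrap the encoded characters in exec(...), as the first snippet does."""
    return "exec(" + "+".join(encode_char(ch) for ch in source) + ")"


encoded = encode_program("print(1)")
print(encoded)
print(sorted(set(encoded)))
```

Evaluating the expression inside `exec(...)` gives back the original source string, which is what the first snippet then executes.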

The following snippets are all exactly one character long. The next three characters are \n, \r, and #. All remaining Unicode characters (except surrogates) follow in a specific order, so they align with the 8191-byte boundary.
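The snippet count in the title follows from simple arithmetic, sketched here in Python. The constant names are illustrative; the nine-character alphabet is read off the first snippet.

```python
# Distinct characters used by the first snippet.
FIRST_SNIPPET_ALPHABET = set("exc%'()+1")

# Surrogate code points cannot appear in a well-formed source file.
SURROGATES = range(0xD800, 0xE000)


def max_snippets():
    """One long first snippet plus one snippet per remaining code point."""
    scalar_values = 0x110000 - len(SURROGATES)
    return 1 + scalar_values - len(FIRST_SNIPPET_ALPHABET)


print(max_snippets())
```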

The following script generates the appropriate programs for input k between 1 and 1112056.

j = 4
s = "exec('%c'%(111+1)+'%c'%(111+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+11)+'%c'%(111+1+1+1+1+1)+'%c'%(11+11+11+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+1+1+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+11)+'%c'%(11+11+11+1+1+1+1+1+1+1)+'%c'%(111)+'%c'%(111+1)+'%c'%(11+11+11+11+11+11+11+11+11+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+11)+'%c'%(11+11+11+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+1+1+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+1+1)+'%c'%(11+11+11+11+11+11+11+11+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+1+1+1+1+1+1+1)+'%c'%(11+11+11+1+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+1+1)+'%c'%(111+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+1+1)+'%c'%(11+11+11+11+11+11+11+11+1+1+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+11+11+11+11+11+1)+'%c'%(11+11+11+1+1+1+1+1+1+1)+'%c'%(11+11+11+1+1+1+1+1+1+1+1)+'%c'%(11+11+11+1+1+1+1+1+1+1+1)+'%c'%(11+11+11+11+1)+'%c'%(11+11+11+11+1+1+1+1+1)+'%c'%(11+11+11+11+1+1+1+1+1+1)+'%c'%(11+11+11+11+11)+'%c'%(11+11+11+11+1+1+1+1)+'%c'%(11+11+11+1+1+1+1+1+1+1++++++++++1))"
l = 1
# c[n] lists the unused code points whose UTF-8 encoding is n bytes long
c = [
    None,
    [n for n in range(0x80) if chr(n) not in "\n\r#%'()+1cex"],
    [*range(0x80, 0x800)],
    [*range(0x800, 0xd800), *range(0xe000, 0x10000)],
    [*range(0x10000, 0x110000)],
]

k = int(input())
assert k in range(1, 1112057)
s += '\n\r#'[:k - 1]
k -= 4

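while j:
    # append j-byte characters while the byte count l stays below 8191
    while k > 0 and c[j] and l + j < 8191:
        s += chr(c[j].pop())
        l += j
        k -= 1
    if k < 1:
        print(end = s)
        break
    elif c[j] == []:
        # no j-byte characters left, move on to shorter ones
        j -= 1
    else:
        # fill the remaining 8191 - l bytes with one character of that length,
        # then flush the output and reset the byte count
        s += chr(c[8191 - l].pop())
        print(end = s)
        k -= 1
        s = ''
        l = 0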

Dennis

Posted 2017-10-31T18:20:26.190

Reputation: 196 637

Do you ever lose? – Patrick Roberts – 2017-11-01T18:26:33.533

I'm confused as to how you have scored more than 256. Are different unicode characters different characters? If so why not use combining diacritics to obtain an infinite score? – Post Rock Garf Hunter – 2017-11-02T04:56:25.137

@WheatWizard What is a character? – Dennis – 2017-11-02T15:18:13.780

It seems that by that definition you can use combining diacritics to get a higher score. – Post Rock Garf Hunter – 2017-11-02T16:42:39.010

@WheatWizard No, a letter plus a combining diacritic is two Unicode characters. – Dennis – 2017-11-02T17:58:09.113

Why though? Are emoji modifiers their own characters? – Post Rock Garf Hunter – 2017-11-02T18:11:07.843

@WheatWizard I have no idea what those are, but if they have their own code points and aren't surrogates, they are characters. Whenever in doubt, just run wc -m. – Dennis – 2017-11-02T19:33:40.540

Perl 5, ~~50,091~~ 151 snippets

First snippet:

use utf8; print length A

~~2nd through 26th snippets: B through Z~~

27th through 46th snippets: a through z, excluding the characters in "length"

47th through 56th snippets: 0 through 9

~~57th snippet: _~~

~~The remaining snippets are the 50,105 individual Unicode characters which Perl regards as "word" characters, excluding the 14 distinct word characters in the initial snippet, in any order.~~

Well, it was a nice thought, but it turns out that after a certain length Perl gives you an "identifier too long" error. This is the longest combined program I was able to get Perl to digest:

use utf8; print length A012345679BCDEFGHIJKLMNOPQRSTUVWXYZ_abcdjkmoqsvwxyzĀāĂăĄąĆćĈĉĊċČčĎďĐđĒēĔĕĖėĘęĚěĜĝĞğĠġĢģĤĥĦħĨĩĪīĬĭĮįİıĲĳĴĵĶķĸĹĺĻļĽľĿŀŁłŃńŅņŇňŉŊŋŌōŎŏŐőŒœŔŕŖŗŘřŚśŜŝŞşŠšŢţ

The perldiag manual page says "Future versions of Perl are likely to eliminate these arbitrary limitations" but my Perl 5.18 has not done so.

Explanation:

In non-strict mode, Perl 5 interprets unquoted strings of word characters as "barewords," essentially quoting them for you automatically. They're usually best avoided, but they sure help here!

Sean

Posted 2017-10-31T18:20:26.190

Reputation: 4 136

Your a-z snippets will most likely use characters from your first snippet. – Jonathan Frech – 2017-10-31T18:44:26.833

Yes indeed, thanks. Fixed. – Sean – 2017-10-31T18:46:44.490

I suggest that you make a "showcase"-like answer, because almost all (exo)langs - jelly, pyth, etc - have this behavior – Rod – 2017-10-31T18:49:40.440

I don't know what "showcase-like answer" means. – Sean – 2017-10-31T18:52:04.127

What @Rod wants you to do is to incorporate similar approaches from other languages in your answer. There's really no strong reason to do this, however, as we tend to prefer a single submission per answer – Conor O'Brien – 2017-10-31T18:56:30.610

The strong reason is that if he doesn't do this, someone else will, and since he came up with this idea first, it seems fair for it to be his answer. – Rod – 2017-10-31T18:57:54.107

Unfortunately I don't really know any esolangs. – Sean – 2017-10-31T19:15:05.123

@Sean Plenty can be found on esolangs.org, and because this approach does not require a thorough understanding to work, you can learn what you need from the site. In addition, many non-esolangs exhibit this behavior; for example, TI-BASIC's first snippet would be length("length(. – Khuldraeseth na'Barya – 2017-10-31T23:18:05.687

Your trick with Julia allows for 126427 snippets - do you want to take my answer? – mschauer – 2017-11-01T14:26:32.813

JavaScript (ES6, V8 6.x), ~~52~~ ~~50298~~ ~~119526~~ ~~119638~~ ~~119683~~ 128781 snippets, ~~88~~ ~~149147~~ ~~575179~~ ~~575631~~ ~~576121~~ 612789 bytes

Further below is a Stack Snippet that generates the full program, evaluates it, and creates a download link for the file. That snippet will continue to generate better answers as newer versions of JavaScript support later versions of Unicode, which add new valid identifier characters to the language.

Using ASCII only

console.log(new Proxy({},{get:(n,{length:e})=>e>>(e/e)}).nn$$00112233445566778899AABBCCDDEEFFGGHHIIJJKKLLMMNNOOQQRRSSTTUUVVWWXXYYZZ__aabbccddffiijjkkmmppqqssuuvvzz)

Explanation

This uses the metaprogramming technique of Proxy to enable a get handler trap on the object and access the property name as a string, returning the identifier's length / 2 as its value.

With the first snippet starting as new Proxy({},{get:(n,{length:e})=>e>>(e/e)}).nn, each additional snippet increases the identifier's string length by 2: the generator calls .repeat() so that a code point occupying one UTF-16 code unit (2 bytes) appears twice, and a code point occupying two code units (4 bytes) appears once.
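The same idea can be imitated in Python as a loose analogue, using `__getattr__` in place of a Proxy get trap. This is only a sketch: `NameCounter` and `utf16_units` are invented names, and the UTF-16 unit count stands in for JavaScript's string `length`.

```python
def utf16_units(text):
    """Length of text as JavaScript would report it (UTF-16 code units)."""
    return len(text.encode("utf-16-le")) // 2


class NameCounter:
    """Analogue of the Proxy: any missing attribute yields half its name length."""

    def __getattr__(self, name):
        units = utf16_units(name)
        # units // units is 1, mirroring e>>(e/e), which halves the length
        # without spending a digit character in the first snippet.
        return units >> (units // units)


counter = NameCounter()
print(getattr(counter, "nn"))        # first snippet
print(getattr(counter, "nn$$"))      # first two snippets
print(getattr(counter, "nn$$00"))    # first three snippets
```

`getattr` is used because names such as `nn$$` are valid JavaScript identifiers but not valid Python ones.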

Identifiers in JavaScript

In the ECMAScript Specification, an IdentifierName is defined with the following grammar:

IdentifierName::
  IdentifierStart
  IdentifierName IdentifierPart

IdentifierStart::
  UnicodeIDStart
  $
  _
  \UnicodeEscapeSequence

IdentifierPart::
  UnicodeIDContinue
  $
  _
  \UnicodeEscapeSequence
  <ZWNJ>
  <ZWJ>

UnicodeIDStart::
  any Unicode code point with the Unicode property “ID_Start”

UnicodeIDContinue::
  any Unicode code point with the Unicode property “ID_Continue”

Generating the answer

Initially using the "ID_Continue" Unicode property, I wrote a Node.js script that generates the full answer. Now it's just a client-side script that uses a naive eval() to test for valid characters, iterating through all the unicode code points instead:

// first snippet
let answer = 'new Proxy({},{get:(n,{length:e})=>e>>(e/e)}).nn'

const used = Array.from(
  answer,
  c => c.codePointAt(0)
).sort(
  (a, b) => a - b
)

// create an O(1) lookup table for the characters used in the first snippet
const usedSet = Array.from(
  { length: Math.max(...used) + 1 }
)

for (const codePoint of used) {
  usedSet[codePoint] = true
}

// equal to 1 for first snippet
let snippets = eval(answer)
let identifier = ''

for (let codePoint = 0, length = 0x110000; codePoint < length; codePoint++) {
  const character = String.fromCodePoint(codePoint)

  // if unused
  if (usedSet[codePoint] === undefined) {
    // if valid `IdentifierPart`
    try {
      eval(`{let _${character}$}`)
    } catch (error) {
      // console.log(character)
      continue
    }

    // repeat so that `snippet.length === 2`
    identifier += character.repeat(2 / character.length)
    snippets++
  }
}

// number of snippets generated
console.log(`snippets: ${snippets}`)

const program = `console.log(${answer + identifier})`

// output of program to validate with
eval(program)

// download link to check number of bytes used
dl.href = URL.createObjectURL(new Blob([program], { type: 'text/javascript' }))

<a id=dl download=answer.js>Click to Download</a>

Running stat -f%z answer.js yields a byte count of 612802, but we subtract 13 bytes for the console.log( and ) wrapping the actual submission.

Encoding

The source is stored as utf-8, which is reflected in the enormous byte count of the answer. This is done because Node.js can only run source files encoded in utf-8.

JavaScript internally stores strings with utf-16 encoding, so the string "character length" returned in JavaScript is actually just half the number of bytes of the string encoded in utf-16.

Patrick Roberts

Posted 2017-10-31T18:20:26.190

Reputation: 2 475

Why not use, say, x instead of $, freeing it up as an extra identifier character? – Neil – 2017-11-01T10:36:18.813

@Neil I noticed that a little while ago. I'm currently working on an answer that should be a score of ~119519. Right now I've got it down to just a matter of traversing the encoding properly. – Patrick Roberts – 2017-11-01T10:38:47.000

I tried a copy of Spidermonkey JS shell that I happened to have lying around. It only supported 50466 different identifier characters. (Since you use 12 in your initial snippet, that scores you 50455.) – Neil – 2017-11-01T10:48:11.330

Well, without doing a major overhaul, it looks like the score will have to be 50297. Writing the answer now. To be clear, there are in fact 128,096 supported identifiers in ES6+ using Unicode 10.0.0 specification, but of those, only the number you mentioned have a string length of 1. Otherwise it's a lot more difficult to get a string character count, and that's what I was hung up on. – Patrick Roberts – 2017-11-01T10:48:25.710

(Stupid site using keyup handlers to perform actions.) I then tried Node JS 8 and again I was only able to get 50466 different identifier characters. – Neil – 2017-11-01T10:49:03.390

Huh, I wonder what the other 156 characters are... – Neil – 2017-11-01T11:46:15.300

If I include those with a string length of 2 I get 119694 different identifier-safe characters? – Neil – 2017-11-01T14:04:09.650

@Neil would you mind sharing the script or snippet you used to produce that value? My suspicion is that your script might erroneously be including ; and whitespace characters, now that I think about it. – Patrick Roberts – 2017-11-01T14:36:12.717

@Neil I updated my script to not just check ID_Continue, and now I'm getting 119638 characters that work in Node v8.2.1. I'll update my answer, but I think I need to use another service besides gist. – Patrick Roberts – 2017-11-01T14:57:13.093

for(j=i=0;i<17<<16;i++)try{j+=20%eval("new Proxy({},{get:(n,e)=>e.length}).$$$"+String.fromCodePoint(i))===0}catch(e){};j (evaluated in Firefox's console) – Neil – 2017-11-01T16:05:30.467

@Neil I get 119683 snippets on Chrome, but 119638 on Node v8.2.1. The implementation was the discrepancy, not my algorithm. – Patrick Roberts – 2017-11-01T17:05:34.667

Why not use if (!answer.includes(character))? Or if you think that's too slow, make usedSet = new Set(answer); (new Set tries to iterate its argument, so if you pass a string, it adds all distinct characters individually). – Neil – 2017-11-01T18:02:04.900

@Neil I recently answered a code review question where I discovered that using a sparse array as a set is ~10-20x faster than using Set() for non-negative integers. – Patrick Roberts – 2017-11-01T18:11:55.697

So, this has 576121 snippets? – Conor O'Brien – 2017-12-03T17:41:32.200

@ConorO'Brien, no, clearly in the title it says "119683 snippets". – Patrick Roberts – 2017-12-03T17:44:34.603

@PatrickRoberts Sorry, my mistake, while reading this I assumed all the text until the ending one was just crossed out text. My eyes must have skipped over that part. – Conor O'Brien – 2017-12-03T17:46:35.073

Python 2, score 32

for r in range(32):locals()[''.join(map(chr,range(65,66+r)[:26]+range(117,92+r)))]=r+1
print A

With subsequent snippets B, C, D, … Y, Z, u, v, w, x, y, z.
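To see which variables the loop defines, the name construction can be replayed in Python 3. This is a sketch that only rebuilds the names; `variable_names` is an illustrative helper, and `list(...)` is needed because Python 3 ranges cannot be concatenated with `+`.

```python
def variable_names(count=32):
    """Rebuild the names the Python 2 loop stores in locals()."""
    names = []
    for r in range(count):
        upper = list(range(65, 66 + r))[:26]   # 'A'.. up to 'Z'
        lower = list(range(117, 92 + r))       # 'u'.. once r exceeds 25
        names.append("".join(map(chr, upper + lower)))
    return names


names = variable_names()
for value, name in enumerate(names, start=1):
    print(value, name)
```

Each name is one character longer than the previous one, so the identifier formed by the first K snippets is bound to K.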

In a twist of dramatic irony, Python 3 supports Unicode identifiers, which would let us get very silly with this trick — but it can’t print without parentheses. I could cram digits into the identifier, too, but I don’t think this approach is very fun to squeeze more out of.
