Big big numbers

25

1

Whilst trying to golf several of my answers, I've needed to write large integers in as few characters as possible.

Now I know the best way to do that: I'll be getting you to be writing this program.

The challenge

  • Write a program that when given a positive integer, outputs a program that prints it to stdout or equivalent.
  • The output programs don't have to be in the same language as the creator.
  • The output must be at most 128 bytes.
  • You may accept input from stdin or equivalent (not function input)
  • You may output the resulting program to stdout or equivalent.
  • The number output must be in decimal (base 10)

Scoring

Your score is equal to the smallest positive integer your program can not encode.

The entry with the largest score wins.

Blue

Posted 2015-09-14T09:00:48.390

Reputation: 26 661

I added the tag metagolf, since we're golfing the output program. – orlp – 2015-09-14T09:13:19.083

1@orlp I actually omitted it on purpose, because [tag:metagolf] is a scoring criterion tag which says "the score is the length of your output". I'm considering adding a meta post about that though to allow sort of inverse scoring as well (which is the case for [tag:fastest-code] for instance). – Martin Ender – 2015-09-14T09:14:15.293

1@MartinBüttner I guess we need [tag:meta-restricted-source] :) – orlp – 2015-09-14T09:16:44.793

2How is the challenge different from "which language has the biggest integer range"? – nwp – 2015-09-14T14:29:19.453

5@nwp I think you misunderstand the question. The question is about compression. It would be useful, but not necessary to use a language with a big integer range. – potato – 2015-09-14T20:18:48.453

Answers

2

Python 3 → CJam, (163122 − 1)·255/162 + 1 ≈ 1.213·10270

import sys
n = int(input())
for b in range(163, 1, -1):
    s = []
    m = n
    while m:
        m, r = divmod(m - 93, b)
        if m < 0:
            break
        s.append(r + 93)
    else:
        sys.stdout.buffer.write(b'"%s"%db' % (bytes(s[::-1]), b))
        break
else:
    sys.stdout.buffer.write(b'%d' % n)

It turns out that every integer from 1023 through (163122 − 1)·255/162 can be represented in at least one way by a base b ≤ 163 conversion from a string of at most 122 characters with codes 93 through b + 92, rather than the usual 0 through b − 1. This avoids the troublesome characters 34 (double quote) and 92 (backslash) without any extra output code.

Anders Kaseorg

Posted 2015-09-14T09:00:48.390

Reputation: 29 242

12

Pyth, 252111 ≈ 3.593 × 10266

Js[
"ixL-rC1`H``N"
N
s@L-rC1`H``NjQ252
N
"252")$import sys$$sys.stdout.buffer.write(J.encode('iso-8859-1'))$

Had to use a little bit of Python syntax, because Pyth's print can't print in iso-8859-1.

The number gets encoded in base 252 and represent each digit in that base as an iso-8859-1 char. The chars \ and " would need escaping, and therefore are not used. The char ` isn't used because golfing... And additionally the null-byte is also not used, the Pyth compiler forbids it.

The output is a program with an overhead of 17 bytes:

ixL-rC1`H``N""252

Here's an example usage with the biggest number possible:

Usage

Explanation

of the output program.

ixL-rC1`H``N""252
    rC1`H          create the range of chars: ['\x01', '\x02', ..., '{}']
         ``N       creates a string containing the 3 chars " ' \
   -               remove strings which consists of these 3 chars
 xL         ""     determine the index of each char in "" (encoded number)
i             252  convert from base 253 to base 10

Jakube

Posted 2015-09-14T09:00:48.390

Reputation: 21 462

1

This program fails to encode 12, because Pyth unfortunately reads CR as LF.

– Anders Kaseorg – 2016-07-13T22:29:14.577

10

CJam, 254109 ≈ 1.34 x 10262

q~254b{_33>+_91>+c}%`"{_'[>-_'!>-}%254b"

I'm encoding the number in base 254 and represent each digit in that base as an ISO 8859-1 character, skipping " and \. The output has an overhead of 19 bytes, ""{_'[>-_'!>-}%254b, so I can represent everything less than 254128-19, or explicitly

13392914970384089616967895168962602841770234460440231501234736723328784159136966979592516521814270581662903357791625539571324435618053333498444654631269141250284088221909534717492397543057152353603090337012149759082408143603558512232742912453092885969482645766144

As an example, 6153501 would be encoded as

"abc"{_'[>-_'!>-}%254b

Here is a test program which prints the encoded integer, and then prints its length, and then executes it right away to show its validity (this avoids the trouble of having to copy the unprintable characters into a new program, which doesn't always work with the online interpreter).

Martin Ender

Posted 2015-09-14T09:00:48.390

Reputation: 184 808

8

Perl, 10216

print"print unpack'h*',q{",(pack'h*',<>),"}"

Also base 100 encoding, slightly more elegant. Output for 12345678 would be:

print unpack'h*',q{!Ce‡}

The delimeters { and } correspond to hex values b7 and d7 respectively, which cannot appear in the input, and therefore do not need to be escaped.

There are 20 bytes of overhead, leaving 108 for encoding, reaching a maximum value of 10216-1.


Perl, 10206

print"ord=~print\$' for'",(map chr"1$_",<>=~/.{1,2}/g),"'=~/.|/g"

Simple base 100 encoding. Output for 12345678 would look like this:

ord=~print$' for'p†œ²'=~/.|/g

There are 25 bytes of overhead, leaving 103 for encoding, reaching a maximum value of 10206-1.

primo

Posted 2015-09-14T09:00:48.390

Reputation: 30 891

6

Common Lisp, 36114 - 1 ~ 2.62 × 10117

(lambda(x)(format t"(lambda()#36r~36r)"x))

The largest number is:

2621109035105672045109358354048170185329363187071886946329003212335230440027818091139599929524823562064749950789402494298276879873503833622348138409040138018400021944463278473215

Simply use base 36. For the largest input, the 128-bytes long output is:

(lambda()#36rzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz)

coredump

Posted 2015-09-14T09:00:48.390

Reputation: 6 292

1

CJam, 233114 ≈ 7.561⋅10269

ri233b{Kms/m]_34=+c}%s`"{iKms*}%233b"

The output program "…"{iKms*}%233b decodes the 8-bit characters of a string to base 233 digits with n ↦ ⌊n ⋅ sin 20⌋ = ⌊n ⋅ 0.913⌋. This transformation happens to be surjective without requiring the critical codepoints 34 (double quote) and 92 (backslash) as input.

Anders Kaseorg

Posted 2015-09-14T09:00:48.390

Reputation: 29 242