Generate ladder of integers using the least number of unique characters (in C++)

13

I am new to the sport of code golf. I am trying to generate a ladder of integers using the least number of unique characters in C++.

Let's say we are given an integer 4.

We will generate the following ladder:

1
1 2
1 2 3
1 2 3 4

In short, my program will read a positive integer from stdin and print this ladder to the output. I am trying to do so with the least number of unique characters possible.

My program is as follows:

#include<iostream>

int i;
int ii;
int iii;
int iiii;

main() {
    std::cin >> i;
    for(ii++; ii <= i; ii++) {
        int iii = iiii;
        for(iii++; iii <= ii; iii++) {
            std::cout << iii << " ";
        }
        std::cout << std::endl;
    }
}

Here's the checker that I used to check the number of unique characters in my program:

#include <cstdio>
#include <cstring>
using namespace std;
int check[300],diffcnt=0,cnt=0,t;
char c;
double score;
int main(){

    memset(check,0,sizeof(check));
    FILE *in=fopen("ans.cpp","r");
    while(fscanf(in,"%c",&c)!=EOF){
        cnt++;
        if(!check[c]){
            check[c]=1;
            if(c=='\r'||c=='\n') continue;
            diffcnt++;
        }
    }
    if(diffcnt<25) printf("100\n");
    else if(diffcnt<30){
        printf("%.3lf\n",20.0*100.0/cnt+20.0*(29-diffcnt));
    }
    else{
        score=20.0;
        for(int x=95;x<cnt;x++) score*=0.9;
        printf("%.3lf\n",score);
    }
    printf("Unique Characters: %d\n", diffcnt);
    printf("Total Characters: %d\n", cnt);
    return 0;
}

Preferably I wish to use less than 25 unique characters to complete this program (excluding newline characters but including whitespace). Currently, my program uses 27. I am not sure how to optimize it further.

Could someone please advise me on how to optimize it further (in terms of the number of unique characters used)? Please note that only C++ can be used.

LanceHAOH

Posted 2019-05-13T14:26:53.383

Reputation: 349

5It is certainly novel to ask for [tag:tips] regarding any other scoring criteria than [tag:code-golf], but afaict, it is on-topic, since the [tag:tips] pages says to make it a better answer to a programming challenge that is on-topic. – Adám – 2019-05-13T15:23:19.413

This would have a wider interest if extended to any language – Luis Mendo – 2019-05-13T16:47:04.423

8@LuisMendo I don't really think that is true in this case, since many languages completely trivialise this scoring scheme. If this user wants help learning to "unique golf" it only really makes sense in a subset of languages, so I think this is much better as a tip than as a generic challenge. That said the base problem could probably be a challenge if someone wants to post it. – FryAmTheEggman – 2019-05-13T16:52:56.313

So how many characters would this be in Unary? Yeah, that'd be super-exploitable. – Darrel Hoffman – 2019-05-13T19:45:50.203

3I think you can use digraphs <% and %> instead of curly braces, and I think I missed some. – my pronoun is monicareinstate – 2019-05-14T00:09:11.153

2I definitely missed some. # is %:, so you can get rid of three characters and introduce one ( { => <%, } => %>, # => %:) and get to 25. If you combine this with the answer below, I think you can get 24. – my pronoun is monicareinstate – 2019-05-14T00:36:30.587

@someone +1 Oh my god. Where did you learn these tricks? :) – LanceHAOH – 2019-05-14T00:51:31.300

2@LanceHAOH Trigraphs are extremely common in [underhanded] questions, and digraphs show up as well when reading about trigraphs. – my pronoun is monicareinstate – 2019-05-14T01:20:45.653

@someone: fortunately trigraphs have been removed as of ISO C++17. But only trigraphs, not digraphs or operator keywords like not instead of !. So if you're unique-golfing in the most recent C++ version, you get to ignore trigraphs, but not digraphs. :/ Current g++ already ignores trigraphs by default even without -std=gnu++17; you have to use -trigraphs to enable them, or -std=c++11 or something for strict conformance to a standard that does include them. But yes, gcc -std=c++17 does do %> as }

– Peter Cordes – 2019-05-14T06:55:19.673

Answers

12

I believe I managed to remove the = character from your code, although it now is significantly slower

#include<iostream>

int i;
int ii;
int iii;
int iiii;

int main() {
    std::cin >> i;
    i++;
    for(ii++; ii < i;) {
    for(;iii>iiii;iii++);
    for(;iii<iiii;iii++);
    ii++;
        for(iii++; iii < ii; iii++) {
            std::cout << iii << " ";
        }
        std::cout << std::endl;
    }
}

Its not pretty, but by abusing integer overflow we can get back to 0 without using =

Also we had to change the guards a little. Unfortunately because of the include I couldn't get rid of all new line characters (although its close) so that may be the next avenue for investigation.

Edit: Run out of time for now, but if you include and use strstream and various other libraries, I think you may be able to remove the " character too, again using integers to arrive at the correct character for space and passing it into the strstream

Expired Data

Posted 2019-05-13T14:26:53.383

Reputation: 3 129

2You could #include<std> and eliminate all the :s. Not great coding practice, but that's beside the point. – Darrel Hoffman – 2019-05-13T19:43:30.627

3@DarrelHoffman I can't get that to work, don't you have to do using namespace std; which would use an additional p for the : so a net 0 – Expired Data – 2019-05-13T21:06:39.303

Hmm. Maybe, my C++ is a tad rusty. Also that adds a g, so net loss I guess. If this were code gold, we could reduce the byte count by renaming ii, iii, and iiii to other single letter names (pick any other already-used letters), but that's not what this challenge is about, so I guess not. I'm wondering if there'd be any gains to using getc and putc instead of cin/cout, would have to try it. – Darrel Hoffman – 2019-05-13T22:46:30.773

1My bad. I just read through the checker again. Seems that newline character is ignored. So there is actually no need to bother about removing newlines. But combined with your strategy and the solution by @someone in the comments, I managed to get it to 24 characters. I made the program even faster by using short instead of int. So I got an additional 'h' character. But this let me use the char datatype without additional cost. So I got rid of the " character as well by using character code. – LanceHAOH – 2019-05-14T01:58:34.963

@LanceHAOH: note that signed integer overflow is undefined behaviour in C++, for all signed types including signed char. If you compile with optimization enabled, this code might break with modern compilers, unless you use gcc -fwrapv to make signed overflow well-defined as 2's complement wrap-around. clang supports -fwrapv, too. (unsigned integer types including unsigned char have well-defined behaviour (wrap-around) in ISO C++). It depends on the ABI whether char is signed char or unsigned char, so char can be ok.

– Peter Cordes – 2019-05-14T06:13:36.817

Also, fun fact: atomic<int> is defined as 2's complement with wrap-around. (But of course is very slow vs. a non-shared int, like maybe 20 cycles per atomic increment on modern x86 with no other threads touching it, vs. 1 cycle for an int that optimizes into a register. Or for int with -fwrapv a compiler might see what a loop is doing and just zero it without looping.) The optional type int32_t is defined to be 2's complement if it exists at all, but isn't wrap-around safe. It's usually just a typedef for int on implementations that have it. – Peter Cordes – 2019-05-14T06:19:40.083

Anyway, int and char are both fine for golfing with optimization disabled, but just in case you or anyone else are curious about C++ language rules outside of any particular implementation, this is worth pointing out. – Peter Cordes – 2019-05-14T06:21:25.517

@LanceHAOH and @Expired: If you assume ASCII/UTF-8, space is a char with value 32. If you can create that in a char c variable somehow, you can cout << stuff << c. If you can create a 0, you could just fully unroll 32 c++; statements to avoid using an = or any digits. Or maybe char cc(c+c); to double a value into a new variable (left shift by adding to itself), using C++ parenthesis initializer syntax. So after 2x c++ or something, create new var names to get 4, 8, 16, 32 = ' ' – Peter Cordes – 2019-05-14T06:25:16.313

@ExpiredData: turned some of my comments into an answer, 23 bytes with digraphs. – Peter Cordes – 2019-05-14T08:58:54.403

10

I finally got 24 unique characters by combining the answers of @ExpiredData and @someone. Also, using the short data type instead of int helped to speed up my program because it takes a shorter time to overflow a short data type.

My code is as follows.

%:include<iostream>

short i;
short ii;
short iii;
short iiii;
char iiiii;

main() <%
    std::cin >> i;
    iiiii++;iiiii++;iiiii++;iiiii++;iiiii++;iiiii++;iiiii++;iiiii++;
    iiiii++;iiiii++;iiiii++;iiiii++;iiiii++;iiiii++;iiiii++;iiiii++;
    iiiii++;iiiii++;iiiii++;iiiii++;iiiii++;iiiii++;iiiii++;iiiii++;
    iiiii++;iiiii++;iiiii++;iiiii++;iiiii++;iiiii++;iiiii++;iiiii++;
    i++;
    for(ii++; ii < i; ii++) <%
        for(;iii;iii++);
        for(iii++; iii < ii; iii++)
            std::cout << iii << iiiii;
        std::cout << iii << std::endl;
    %>
%>

LanceHAOH

Posted 2019-05-13T14:26:53.383

Reputation: 349

@KevinCruijssen he uses it in char iiiii;, the last of the variable initializations. – Rɪᴋᴇʀ – 2019-05-16T00:37:27.407

1@KevinCruijssen That is true. But that allows me to remove the " character because I can use character code to represent the space character. So the net difference in unique characters used = 0. – LanceHAOH – 2019-05-16T01:57:45.100

9

23 unique characters using Digraphs. (25 without). No UB.

Use C++11 braced initializer syntax to list-initialize an integer to zero with int var{}; avoiding = and 0. (Or in your case, avoiding global iiii). This gives you a source of zeros other than global variables (which are statically initialized to zero, unlike locals).

Current compilers accept this syntax by default, without having to enable any special options.

(The integer wraparound trick is fun, and ok for golfing with optimization disabled, but signed overflow is undefined behaviour in ISO C++. Enabling optimization will turn those wraparound loops into infinite loops, unless you compile with gcc/clang -fwrapv to give signed integer overflow well-defined behaviour: 2's complement wraparound.

Fun fact: ISO C++ std::atomic<int> has well-defined 2's complement wrap-around! int32_t is required to be 2's complement if defined at all, but the overflow behaviour is undefined so it can still be a typedef for int or long on any machine where one of those types is 32 bits, no padding, and 2's complement.)


Not useful for this specific case:

You can also initialize a new variable as a copy of an existing one, with either braces or (with a non-empty initializer), parens for direct initialization.
int a(b) or int a{b} are equivalent to int a = b;

But int b(); declares a function instead of a variable initialized to zero.

Also, you can get a zero with int() or char(), i.e. zero-initialization of an anonymous object.


We can replace your <= compares with < compares by a simple logic transformation: do the loop-counter increment right after the compare, instead of at the bottom of the loop. IMO this is simpler than the alternatives people have proposed, like using ++ in the first part of a for() to make a 0 into a 1.

    // comments aren't intended as part of the final golfed version
    int n;
    std::cin >> n;      // end condition

    for(int r{}; r < n;) {      // r = rows from 0 .. n-1
        ++r;
        for(int i{}; i < r;) {
            ++i;
            std::cout << i << ' ';
        }
        std::cout << std::endl;
    }

We could golf that down to for(int r{}; r++ < n;) but IMO that's less easy for humans to read. We're not optimizing for total byte count.


If we were already using h, we could save the ' or " for a space.

Assuming an ASCII or UTF-8 environment, space is a char with value 32. We can create that in a variable easily enough, then cout << c;

    char c{};
    c++; c++;            // c=2
    char cc(c+c+c+c);    // cc=8
    char s(cc+cc+cc+cc); // s=32 = ' ' = space in ASCII/UTF-8

And other values can obviously be created from a sequence of ++ and doubling, based on the bits of their binary representation. Effectively shifting a 0 (nothing) or 1 (++) into the LSB before doubling into a new variable.


This version uses h instead of ' or ".

It's much faster than either of the existing versions (not relying on a long loop), and is free of Undefined Behaviour. It compiles with no warnings with g++ -O3 -Wall -Wextra -Wpedantic and with clang++. -std=c++11 is optional. It is legal and portable ISO C++11 :)

It also doesn't rely on global variables. And I made it more human-readable with variable names that have a meaning.

Unique-byte count: 25, excluding the comments which I stripped with g++ -E. And excluding space and newline like your counter. I used sed 's/\(.\)/\1\n/g' ladder-nocomments.cpp | sort | uniq -ic from this askubuntu to count occurrences of each character, and piped that into wc to count how many unique characters I had.

#include<iostream>

int main() {
    char c{};
    c++; c++;            // c=2
    char cc(c+c+c+c);    // cc=8
    char s(cc+cc+cc+cc); // s=32 = ' ' = space in ASCII/UTF-8

    int n;
    std::cin >> n;      // end condition

    for(int r{}; r < n;) {      // r = rows counting from 0
        ++r;
        for(int i{}; i < r;) {
            ++i;
            std::cout << i << s;
        }
        std::cout << std::endl;
    }
}

The only 2 f characters are from for. We could use while loops instead if we had a use for w.

We could possibly rewrite the loops into an assembly-language style of i < r || goto some_label; to write a conditional jump at the bottom of the loop, or whatever. (But using or instead of ||). Nope, that doesn't work. goto is a statement like if and can't be a sub-component of an expression like it can in Perl. Otherwise we could have used it to remove the ( and ) characters.

We could trade f for g with if(stuff) goto label; instead of for, and both loops always run at least 1 iteration so we'd only need one loop-branch at the bottom, like a normal asm do{}while loop structure. Assuming the user inputs an integer > 0...


Digraphs and Trigraphs

Fortunately, trigraphs have been removed as of ISO C++17 so we don't have to use ??> instead of } if we're unique-golfing for the most recent C++ revision.

But only trigraphs specifically: ISO C++17 still has digraphs like :> for ] and %> for }. So at the cost of using %, we can avoid both { and }, and use %: for # for a net saving of 2 fewer unique characters.

And C++ has operator keywords like not for the ! operator, or bitor for the | operator. With xor_eq for ^=, you could zero a variable with i xor_eq i, but it has multiple characters you weren't using.

Current g++ already ignores trigraphs by default even without -std=gnu++17; you have to use -trigraphs to enable them, or -std=c++11 or something for strict conformance to an ISO standard that does include them.

23 unique bytes:

%:include<iostream>

int main() <%
    int n;
    std::cin >> n;

    for(int r<% %>; r < n;) <%
        ++r;
        for(int i<%%>; i < r;) <%
            ++i;
            std::cout << i << ' ';
        %>
        std::cout << std::endl;
    %>
%>

Try it online!

The final version uses a ' single-quote instead of h or " for the space separator. I didn't want to digraph the char c{} stuff so I deleted it. Printing a char is more efficient than printing a string, so I used that.

Histogram:

$ sed 's/\(.\)/\1\n/g' ladder-nocomments.cpp | sort | uniq -ic  | tee /dev/tty | wc -l
     15         // newline
     95         // space
     11 %
      2 '
      3 (
      3 )
      4 +
      9 :
     10 ;
     14 <
      8 >
      2 a
      4 c
      6 d
      3 e
      2 f
     12 i
      2 l
      2 m
     11 n
      5 o
      7 r
      5 s
     11 t
      3 u
25   // total lines, including space and newline

The space separator (still unsolved)

In a now-deleted answer, Johan Du Toit proposed using an alternate separator, specifically std::ends. That's a NUL character, char(0), and prints as zero-width on most terminals. So the output would look like 1234, not 1 2 3 4. Or worse, separated by garbage on anything that didn't silently collapse '\0'.

If you can use an arbitrary separator, when the digit 0 is easy to create with cout << some_zeroed_var. But nobody wants 10203040, that's even worse than no separator.

I was trying to think of a way to create a std::string holding a " " without using char or a string literal. Maybe appending something to it? Maybe with a digraph for [] to set the first byte to a value of 32, after creating one with length 1 via one of the constructors?

Johan also suggested the std::ios fill() member function which returns the current fill character. The default for a stream is set by std::basic_ios::init(), and is ' '.

std::cout << i << std::cout.fill(); replaces << ' '; but uses . instead of '.

With -, we can take a pointer to cout and use ->fill() to call the member function:
std::cout << (bitand std::cout)->fill(). Or not, we weren't using b either so we might as well have used & instead of its lexical equivalent, bitand.

Calling a member function without . or ->

Put it inside a class, and define operator char() { fill(); }

// not digraphed
struct ss : std::ostream {  // default = private inheritance
//      ss() { init(); }  // ostream's constructor calls this for us
        operator char() { return fill(); }
}

Then ss s{} before the loop, and std::cout << i << s; inside the loop. Great, it compiles and works properly, but we had to use p and h for operator char(), for a net loss of 1. At least we avoided b to make member functions public by using struct instead of class. (And we could override the inheritance with protected in case that ever helps).

Peter Cordes

Posted 2019-05-13T14:26:53.383

Reputation: 2 810

@JohanduToit: good idea with cout.fill() from std::ios, but we weren't previously using . Maybe we can call it somehow by taking a pointer and using ->fill() to a member function? Does anything return a pointer to cout or any other stream?

– Peter Cordes – 2019-05-14T09:48:36.990

Oops, << (bitand std::cout)->fill() compiles, but uses -. (Despite the token name, bitand is just a lexical equivalent to &, not specifically the bitwise-and operator. It also works as the address-of operator.) Hmm, is there some template or lambda stuff that can get a pointer to a member function that we can () without using . or ->? – Peter Cordes – 2019-05-14T09:54:24.420

1The only other thing that I found is that std::ios::left is defined as 32, in gcc, but I could not really figure out a way to take advantage of that. I think I’m going to let this one go and get some actual work done :-) – Johan du Toit – 2019-05-14T11:03:18.390

@JohanduToit: Creating an int 32 isn't a problem, my answer already shows how to do that with ++ starting from an int c{}; zero. But yeah, I'm not going down the rabbit hole of looking into lambdas, templates, or std::function. Or the std::string idea. But we're not using g to we can't actually declare a std::string without losing; my idea for using goto instead of for didn't pan out. decltype(something) could give us a char type, but costs us a y. – Peter Cordes – 2019-05-14T11:11:38.520

1You can use auto instead of char for the opeator:struct ss : std::ostream { operator auto () { return fill(); } }; but it doesn't help much. – Johan du Toit – 2019-05-14T12:37:12.633

@JohanduToit: oh crazy, I didn't know you could do operator auto. That brings is back to break even, swapping ' for p. I keep wondering about CPP macros, but the predefined ones are all upper-case and I assume this problem is case-sensitive. – Peter Cordes – 2019-05-14T12:52:32.917

Use #import instead of #include with the overloaded operator... – Johan du Toit – 2019-05-15T03:29:16.570

Ignore that, late night brain fart... – Johan du Toit – 2019-05-15T06:30:28.480

@JohanduToit: We could avoid l if we construct a char with value 10 instead of using std::endl for newline. (We know how to do that using h for char). So the GCC extension import has possibilities... – Peter Cordes – 2019-05-15T06:32:48.513

Cool - I have decided to undelete my post. Please feel free to edit it if you want to. – Johan du Toit – 2019-05-15T06:38:16.863

How to not do it... – Johan du Toit – 2019-05-15T07:56:16.627

@rv7: Uh, thank you? I wrote it all the text from scratch, summarizing my own thoughts. I don't have the patience to organize my thoughts into an actual book, though :/ The 23-unique-byte code is there in the last section, with a TIO link. Maybe you missed it. – Peter Cordes – 2019-05-17T17:38:39.483

7

C++ (gcc) x86_64 Linux only, 9295 8900 8712 6812 5590 bytes, 18 unique characters

int m[]={111111111111111+1111111111111+1111111111111+1111111111111+1111111111111+1111111111111+1111111111111+111111111+111111111+1111111+111111+11111+11111+11+11+11+11+11+1+1+1,11111111111+11111111111+11111111111+1111111111+111111111+111111111+111111111+111111+1111+1111+111+111+111+111+11+11+11+11+11+11+11+11+11+1+1+1,111111111111111+111111111111111+111111111111+111111111111+1111111111+1111111+1111111+11111+11111+11111+1111+111+111+11+11+11+11+11+11+11+11+11+1+1+1+1+1+1+1+1+1+1,111111111111111+111111111111111+1111111111111+1111111111111+11111111111+111111111+111111111+11111111+11111111+11111111+11111111+1111111+1111111+1111111+11111+1111+111+111+11+1+1+1,1111111111111+1111111111111+11111111111+11111111111+1111111111+1111111111+1111111111+111111+11111+11111+11111+11111+1111+1111+1111+1111+111+111+111+11+11+11+11+11+11,11111111111111+1111111111111+11111111111+11111111111+11111111111+1111111111+111111111+11111111+11111111+11111111+11111111+1111+1111+1111+1111+1111+1111+1111+1111+1111+111+1+1+1+1,111111111111111+1111111111111+1111111111111+1111111111111+1111111111111+11111111111+11111111111+1111111+11111+11111+1111+1111+11+11+11+11+11+11+11+1+1+1+1,111111111111+11111111111+1111111111+1111111111+1111111111+1111111111+1111111111+1111111111+11111111+11111+11111+11111+11111+11111+11111+1+1,111111111111111+11111111111111+11111111111+11111111111+1111111111+1111111+1111111+11111+111+111+111+111+111+111+111+111+11+11+1+1+1+1+1+1,11111111111+1111111111+111111111+11111111+11111111+1111111+1111111+1111111+1111111+1111111+1111111+1111111+111111+11+1+1+1+1+1+1+1,111111111111+11111111111+11111111111+11111111+1111111+1111111+111111+111111+111111+111111+111111+111111+111111+111111+111111+11111+11111+111+111+111+111+111+111+111+1+1+1+1+1+1+1,11==1,1111111111+11111111+11111111+11111111+1111111+1111111+1111111+1111111+1111111+1111+1111+1111+1111+1111+1111+1111+1111+1111+111+111+111+11+11+11+1+1+1,1111111111111+111111111111+11111111111+1111111111+111111111+111111111+11111111+111111+111111+111111+11111+1111+111+111+1+1,111111111111+111111111111+11111111111+11111111111+11111111111+11111111111+111111111+111111111+11111111+111111+1111+1111+111+111+111,111111111111+11111111111+1111111111+1111111111+111111111+1111111+111+111+1+1+1+1,111111111111111+11111111111111+1111111111111+1111111111111+111111111111+1111111111+1111111111+1111111111+1111111+111111+111111+111111+11111+11111+11111+1111+1111+111+11+11+1+1+1+1,111111111111111+1111111111111+1111111111111+11111111111+1111111111+11111111+11111111+1111+1111+1111+111+111+111+111+11+11,111111111+111111111+11111111+11111111+11111111+1111111+1111111+111111+11111+1111+1111+1111+1111+111+111+11+11+11+11+11+1+1+1+1+1+1+1+1,11111111111111+111111111111+111111111111+11111111111+111111111+111111+111111+111111+1111+1111+1111+1+1+1+1+1+1+1+1,11111111111+11111111111+11111111111+11111111111+1111111111+1111111111+11111111+1111111+1111111+1111111+1111111+111111+11111+11+11+11+1+1+1+1+1+1+1+1,111111111111111+111111111111111+111111111111+1111111111+1111111111+11111111+11111111+1111111+1111111+111111+111111+11111+11111+111+11+11+1+1+1+1+1+1+1+1+1+1,11111111111111+11111111111111+111111111111+11111111111+11111111111+1111111+1111111+1111111+1111111+1111111+1111111+11+11+11+11+11+11+11+11+1,11111111111111+11111111111111+11111111111+1111111111+11111111+1111111+1111111+1111111+1111111+1111111+1111111+11111+11111+1111+1111+1111+111+111+111+111+111+111+11,111111111111111+1111111111111+111111111111+111111111111+111111111111+11111111111+1111111111+1111111111+111111111+111111+111111+111111+111111+1111+11+1+1,111111111111111+11111111111111+111111111111+111111111111+1111111111+1111111111+111111111+11111111+1111+1111+1111+111+111+111+111+111+11+11+11+11+11+11+11+11+1+1+1+1,11111111111111+11111111111111+11111111111111+11111111111+11111111111+1111111111+11111111+1111111+11111+11111+11111+1111+111+111+111+11+11+11+11+1+1+1+1+1+1,111111111111111+11111111111111+1111111111+111111111+111111111+111111111+11111111+1111111+111111+11111+1111+1111+1111+111+111+111+111+111+111+11+11+11+11+11+11+11+11+11+1+1+1+1,111111111111111+1111111111111+1111111111111+1111111111111+1111111111+111111111+111111111+111111111+11111111+1111111+11111+1111+1111+1111+111+111+111+11,1111111111111+1111111111+11111111+11111111+11111111+11111+1111+111+111+11+11+11+11+11+11+11+11+11+1+1+1+1+1+1+1+1+1+1,11111111111111+1111111111+1111111111+111111111+11111111+1111111+1111111+1111111+111111+11111+11111+11111+11111+11111+1111+1111+1111+111+111+11+11+11+11+11+11+11+1+1+1+1+1+1+1,11111111111111+1111111111+1111111+1111111+1111111+1111111+1111111+1111111+1111111+111111+111111+11111+1111+1111+111+111+111+111+111+111+1+1+1+1+1+1,111111111111111+1111111111111+111111111+111111111+111111111+111111111+11111111+11111111+11111111+11111111+1111111+111111+11111+11111+11111+1111+111+111+111+11+11+11+11+11,1111111111+111111111+1111111+1111111+111111+111111+11111+11111+11111+11111+11111+11111+1111+1111+1111+11+11+11+11+11+11+11+11+11+1+1+1,111111111111111+111111111111+111111111111+111111111111+11111111111+1111111111+1111111111+1111111111+11111111+11111+1111+1111+111+111+111+111+111+111+111+111+1,1111111111+111111111+111111111+11111111+1111111+1111111+1111111+111111+11111+11111+11111+11111+11111+111+111+111+11+11+11+1,11111111111111+11111111111111+1111111111+1111111111+1111111111+1111111111+11111111+11111111+11111111+11111111+1111111+1111111+111+111+111+111+11+11+11+11+11+11+11+1+1,111111111111+11111111111+1111111111+111111111+111111111+111111+111111+111111+111111+11111+11111+11+11+11+11+11+1,111111111+11111+11111+111+11+1+1+1+1+1+1+1+1+1};main(){((int(*)())m)();}

Try it online!

This is based on ideas from this PPCG answer. A machine language program is expressed as an array of 32 bit ints, each of which is represented as a sum of 1+11+111.... It turns out that it may be more efficient to encode x as y such that y%(1<<32)==x. The encoded machine language program is the following

0x0000000000000000:  55                         push    rbp
0x0000000000000001:  31 ED                      xor     ebp, ebp
0x0000000000000003:  53                         push    rbx
0x0000000000000004:  48 83 EC 18                sub     rsp, 0x18
0x0000000000000008:  48 8D 74 24 0C             lea     rsi, [rsp + 0xc]
0x000000000000000d:  31 C0                      xor     eax, eax
0x000000000000000f:  31 FF                      xor     edi, edi
0x0000000000000011:  6A 01                      push    1
0x0000000000000013:  5A                         pop     rdx
0x0000000000000014:  0F 05                      syscall 
0x0000000000000016:  89 C3                      mov     ebx, eax
0x0000000000000018:  85 C0                      test    eax, eax
0x000000000000001a:  74 0C                      je      0x28
0x000000000000001c:  6B ED 0A                   imul    ebp, ebp, 0xa
0x000000000000001f:  03 6C 24 0C                add     ebp, dword ptr [rsp + 0xc]
0x0000000000000023:  83 ED 30                   sub     ebp, 0x30
0x0000000000000026:  EB E0                      jmp     8
0x0000000000000028:  C7 44 24 0C 00 00 00 00    mov     dword ptr [rsp + 0xc], 0
0x0000000000000030:  FF C3                      inc     ebx
0x0000000000000032:  8B 44 24 0C                mov     eax, dword ptr [rsp + 0xc]
0x0000000000000036:  8D 78 01                   lea     edi, [rax + 1]
0x0000000000000039:  89 7C 24 0C                mov     dword ptr [rsp + 0xc], edi
0x000000000000003d:  E8 27 00 00 00             call    0x69
0x0000000000000042:  6A 20                      push    0x20
0x0000000000000044:  48 89 E6                   mov     rsi, rsp
0x0000000000000047:  52                         push    rdx
0x0000000000000048:  58                         pop     rax
0x0000000000000049:  50                         push    rax
0x000000000000004a:  5F                         pop     rdi
0x000000000000004b:  0F 05                      syscall 
0x000000000000004d:  5E                         pop     rsi
0x000000000000004e:  39 5C 24 0C                cmp     dword ptr [rsp + 0xc], ebx
0x0000000000000052:  7C DE                      jl      0x32
0x0000000000000054:  6A 0A                      push    0xa
0x0000000000000056:  48 89 E6                   mov     rsi, rsp
0x0000000000000059:  52                         push    rdx
0x000000000000005a:  58                         pop     rax
0x000000000000005b:  0F 05                      syscall 
0x000000000000005d:  5E                         pop     rsi
0x000000000000005e:  39 DD                      cmp     ebp, ebx
0x0000000000000060:  7F C6                      jg      0x28
0x0000000000000062:  48 83 C4 18                add     rsp, 0x18
0x0000000000000066:  5B                         pop     rbx
0x0000000000000067:  5D                         pop     rbp
0x0000000000000068:  C3                         ret     
0x0000000000000069:  85 FF                      test    edi, edi
0x000000000000006b:  74 2C                      je      0x99
0x000000000000006d:  89 F8                      mov     eax, edi
0x000000000000006f:  6A 0A                      push    0xa
0x0000000000000071:  59                         pop     rcx
0x0000000000000072:  48 83 EC 18                sub     rsp, 0x18
0x0000000000000076:  99                         cdq     
0x0000000000000077:  F7 F9                      idiv    ecx
0x0000000000000079:  89 C7                      mov     edi, eax
0x000000000000007b:  8D 42 30                   lea     eax, [rdx + 0x30]
0x000000000000007e:  89 44 24 0C                mov     dword ptr [rsp + 0xc], eax
0x0000000000000082:  E8 E2 FF FF FF             call    0x69
0x0000000000000087:  48 8D 74 24 0C             lea     rsi, [rsp + 0xc]
0x000000000000008c:  6A 01                      push    1
0x000000000000008e:  58                         pop     rax
0x000000000000008f:  50                         push    rax
0x0000000000000090:  5F                         pop     rdi
0x0000000000000091:  50                         push    rax
0x0000000000000092:  5A                         pop     rdx
0x0000000000000093:  0F 05                      syscall 
0x0000000000000095:  48 83 C4 18                add     rsp, 0x18
0x0000000000000099:  C3                         ret

...which is based on the following C code.

void print(int x){
  if( x ) {
    int y=x%10+'0';
    print(x/10);
    write(1,&y,1);
  }
}
void f() {
  int i=0,j=0,k;
  for( ;read(0,&k,1);i=i*10+k-'0' );
  do {
    for( j++,k=0; print( ++k ), write(1," ",1), k<j; );
    write(1,"\n",1);
  } while(j<i );
}

Edit: Now accepts input from stdin instead of argv[1]. Thanks to @ASCII-only and @PeterCordes for their suggestions!

Edit4: Slightly Significantly improved encoding.

ceilingcat

Posted 2019-05-13T14:26:53.383

Reputation: 5 503

-w flag pls :P (also you can rename ii to a) – ASCII-only – 2019-05-15T01:37:42.980

You need gcc -zexecstack for this, right? Because int m[] is not const. (And recent toolchains put .rodata in a non-executable page anyway so even const int m[] doesn't work on e.g. my Arch Linux system with gcc 8.2.1 20181127 and ld (GNU Binutils) 2.31.1.) Anyway, you forgot to mention that in your answer, but it's in your TIO link. – Peter Cordes – 2019-05-15T06:44:03.757

BTW, the OP's unique-count scoring algorithm doesn't count space and newline, so you don't have to make the whole thing terrible to read, just the array :P – Peter Cordes – 2019-05-15T06:45:27.847

You can save machine-code bytes by copying the 1 with push %rax / pop %rdi instead of another push-immediate. Or more simply, for values that aren't 64-bit, i.e. non-pointers, 2-byte mov %eax, %edi. Also, Linux syscall doesn't destroy its input registers, only rax with the return value and RCX+R11 with saved RIP and RFLAGS as part of how the syscall instruction works. So you can leave rdi and rdx set to 1 across calls, and use different regs. Also, RBX is call-preserved, so it's not really save to clobber main's RBX. It happens to work because CRT start code doesn't care. – Peter Cordes – 2019-05-15T06:56:38.257

6

21 unique characters + 1 unremovable newline

%:include<iostream>
int(n)(int(i))<%
    if(--i)if(n(i))<%%>
    if(i)if(std::cout<<i<<std::addressof(std::cout)->fill())<%%>
%>
int(l)(int(i))<%
    if(n(-(--i)))<%%>
%>
int(m)(int(i))<%
    if(--i)if(m(i))<%%>
    if(i)if(l(-i))<%%>
    if(i)if(std::cout<<std::endl)<%%>
%>
int(c)(int(i))<%
    if(m(-(--i)))<%%>
%>
int(main)(int(i))<%
    if(std::cin>>i)<%%>
    if(c(-i))<%%>
%>

Whitespaces aren't required except for the first newline. Compiled in g++ 7.3.0.

Used characters: %:include<ostram>()f-.

Improvements to other answers:

  1. Removed semicolons by changing for loops to if and recursion.
  2. Got the space character by std::addressof(std::cout)->fill(), aka std::cout.fill().

jimmy23013

Posted 2019-05-13T14:26:53.383

Reputation: 34 042

std::addressof,nice! – Johan du Toit – 2019-05-16T18:02:31.660

2

21 20 unique characters excluding whitespaces

All whitespaces could be changed into newlines.

%:include<iostream>
%:include<list>
int n;
const int co<%%>;
const int ci<%not co%>;
const int cmu<%-ci-ci-ci-ci%>;
const char ctd<%-cmu-cmu-cmu-cmu-cmu-cmu-cmu-cmu%>;
const int cia<%-ctd-ctd-ctd-ctd-ctd-cmu%>;
const int ciu<%cia- -ci- -ci%>;

struct<%struct<%struct<%struct<%struct<%struct<%struct<%
int c<:ctd-ci:>;
%>d<:ctd:>;int c<:ctd-ci:>;%>d<:ctd:>;int c<:ctd-ci:>;
%>d<:ctd:>;int c<:ctd-ci:>;%>d<:ctd:>;int c<:ctd-ci:>;
%>d<:ctd:>;int c<:ctd-ci:>;%>d<:-cmu:>;int c<:-ci-cmu:>;
%>e<:co:><:ctd:><:ctd:><:ctd:><:ctd:><:ctd:><:ctd:>;

int i<:co:>;
auto ia<%e%>;
auto iu<%e%>;
int l<%std::cin>>n and co%>;

struct s<%
    int c<%std::cout<<i<:ciu:>- --i<:cia:><<ctd and n%>;
%>;
struct o<%
    int c<%--ia and n%>;
%>;
struct t<%
    std::list<s>c<%- --l%>;
    std::list<o>r<%-l%>;
    int m<%std::cout<<std::endl and n%>;
%>;
std::list<t>a<%n%>;
int main;

Exits with segfault. Used characters: %:include<ostram>;-h.

It works in this specific compiler version on a 64 bit Linux:

g++-5 (Ubuntu 5.5.0-12ubuntu1) 5.5.0 20171010

With the parameter:

-std=c++17

Even then, I'm not sure it would always work. It may also depends on a lot of other things. cia and ciu are the memory offsets divided by 4 between ia iu and i. (int is 32 bit in this version.) You may have to change the numbers to match the actual offset. The addresses would be much more predictable if they are all contained in a struct. Unfortunately non-static auto isn't allowed in a struct.

e is a 0-elements array of an element type with a size of (232-1) ×232 bytes. If the corresponding pointer type of e is decremented, the higher half of the pointer would be decremented by (232-1), which is equivalent to incrementing by one. This could reset the decremented counter without using the equality sign.

A more reasonable version that should work more reliably, but uses one more character =:

%:include<iostream>
%:include<list>
int n;
int ci<%not n%>;
int cmu<%-ci-ci-ci-ci%>;
char ctd<%-cmu-cmu-cmu-cmu-cmu-cmu-cmu-cmu%>;
int i;
int l<%std::cin>>n and n-n%>;

struct s<%
    int c<%std::cout<<- --i<<ctd and n%>;
%>;
struct t<%
    std::list<s>c<%- --l%>;
    int r<%i=n-n%>;
    int m<%std::cout<<std::endl and n%>;
%>;
std::list<t>a<%n%>;
int main;

Even this doesn't work in the latest version of g++ because it doesn't seem to allow defining main in an arbitrary type anymore.

These two programs don't use parentheses. But then semicolons don't seem to be avoidable.

jimmy23013

Posted 2019-05-13T14:26:53.383

Reputation: 34 042

1

22 Unique characters excluding whitespaces. Separates the numbers with a NUL character which displays correctly on Windows.

%:include<iostream>
int main(int n)<%
    std::cin>>n;
    for(int r<%%>;r++<n;)<%
        for(int i<%%>;i<r;)
            std::cout<<++i<<std::ends;
        std::cout<<std::endl;
    %>
%>

Try it online

Histogram:

[%] 0x25 = 9
[:] 0x3A = 11
[)] 0x29 = 3
[i] 0x69 = 11
[n] 0x6E = 12
[c] 0x63 = 4
[l] 0x6C = 2
[u] 0x75 = 3
[d] 0x64 = 8
[e] 0x65 = 4
[<] 0x3C = 13
[o] 0x6F = 5
[s] 0x73 = 7
[t] 0x74 = 12
[r] 0x72 = 6
[a] 0x61 = 2
[m] 0x6D = 2
[>] 0x3E = 7
[(] 0x28 = 3
[;] 0x3B = 7
[f] 0x66 = 2
[+] 0x2B = 4
Unique Characters: 22
Total Characters: 189

Johan du Toit

Posted 2019-05-13T14:26:53.383

Reputation: 1 524

std::ends is a NUL character (char(0)), not a space (char(32) in ASCII / UTF-8). https://en.cppreference.com/w/cpp/io/manip/ends. I tried it on my Linux desktop just to make sure, and the output looks like 1234, not 1 2 3 4. It looks the same way on your TIO output!

– Peter Cordes – 2019-05-14T07:24:04.793

@PeterCordes, the OP does not specify how the numbers should be separated ;-) – Johan du Toit – 2019-05-14T07:50:38.823

Do you really think they would have wasted a character on " for " " if they could have used iiii to separate with '0' for 10203040? I guess you can make a case that there is a separator still in the binary output of the program, but pointing out this change and describing it in English is important for your answer, because this is not a drop-in replacement! I'd be happy to remove my downvote if you expand your answer to explain and justify that. – Peter Cordes – 2019-05-14T07:54:33.600

1@PeterCordes, Point taken. – Johan du Toit – 2019-05-14T08:44:22.697