Number of distinct non-empty subsequences of binary expansion

19

1

A subsequence is any sequence that you can get from another by deleting any number of characters. The distinct non-empty subsequences of 100 are 0, 1, 00, 10, 100. The distinct non-empty subsequences of 1010 are 0, 1, 00, 01, 10, 11, 010, 100, 101, 110, 1010.

Write a program or function that given a positive integer n returns the number of distinct non-empty subsequences of the binary expansion of n.

Example: 4 is 100 in binary, and we saw above that it has five distinct non-empty subsequences, so f(4) = 5. Starting from n = 1, the sequence begins:

1, 3, 2, 5, 6, 5, 3, 7, 10, 11, 9, 8, 9, 7, 4, 9, 14, 17, 15, 16, 19, 17, 12
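
For checking small cases, the definition can be turned directly into a brute-force counter. This is only a reference sketch (the function name is made up for illustration); it enumerates every choice of positions, so it is exponential in the number of bits and does not meet the time limit of the challenge.

```python
from itertools import combinations


def count_subsequences_brute_force(n: int) -> int:
    """Count distinct non-empty subsequences of bin(n) by enumerating them all."""
    bits = bin(n)[2:]
    seen = set()
    for length in range(1, len(bits) + 1):
        # Every increasing tuple of positions picks out one subsequence.
        for positions in combinations(range(len(bits)), length):
            seen.add("".join(bits[i] for i in positions))
    return len(seen)


# The first terms listed in the challenge, starting from n = 1.
listed = [1, 3, 2, 5, 6, 5, 3, 7, 10, 11, 9, 8,
          9, 7, 4, 9, 14, 17, 15, 16, 19, 17, 12]
assert [count_subsequences_brute_force(n) for n in range(1, 24)] == listed
```

A counter like this is useful as an oracle when testing a faster solution on small inputs.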

However, your program must work for any n < 2^50 in under a second on any modern machine. Some large examples:

f(1099511627775) = 40
f(1099511627776) = 81
f(911188917558917) = 728765543
f(109260951837875) = 447464738
f(43765644099) = 5941674

orlp

Posted 2017-10-15T14:09:36.603

Reputation: 37 067

4

I disagree with the time restriction. – ATaco – 2017-10-16T01:24:58.190

1

This sounded really familiar, especially after looking at the plot. Turns out I looked into a very closely related sequence earlier this year, but I counted the number of distinct binary numbers, not binary strings, you get when taking subsequences (so I discounted leading zeros). I had even sandboxed it, but due to the equivalence in the Math.SE post, it would have been a dupe of some Stern-Brocot challenge. The plot of your sequence is a bit nicer (i.e. more chaotic) though. :) – Martin Ender – 2017-10-16T09:36:30.897

5

@ATaco The time restriction has a good reason. There is an efficient algorithm, and it is interesting yet well golfable. If I don't have a time restriction I feel that nearly every answer would simply brute force all possible subsequences, which very, very quickly wouldn't work anymore. In a sense they're non-answers. – orlp – 2017-10-16T11:23:34.800

Answers

10

Python 3, 83 bytes (previously 95 bytes)

[-12 bytes thanks to Mr.XCoder :)]

def f(x):
 v=[2,1];c=1
 for i in bin(x)[3:]:k=int(i);c+=v[k];v[1-k]+=v[k]
 return c

Try it online!

A note on the algorithm: for each possible next bit, it tracks the increment, that is, how many new unique subsequences appending that bit would contribute. The increment for the first bit is always 1. The algorithm then runs over the remaining bits s(t) and adds the increment v[s(t)] to the running count. At each step, the increment for the complement of s(t), v[1 - s(t)], is updated to v[1] + v[0]. The final number is the sum of all increments.

It should run in O(log2(n)) steps, one per bit, where n is the input number.
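
The same steps can be written out without golfing. This is a sketch of the method as described above, with descriptive names chosen here for illustration (`increment` plays the role of `v`, `total` the role of `c`).

```python
def count_by_increments(n: int) -> int:
    """Sum, bit by bit, the number of new distinct subsequences each bit adds."""
    bits = [int(ch) for ch in bin(n)[2:]]
    # increment[b] = number of new subsequences that appending bit b would add.
    # After the leading 1: appending 0 adds "0" and "10"; appending 1 adds "11".
    increment = [2, 1]
    total = 1  # the leading 1 contributes the single subsequence "1"
    for bit in bits[1:]:
        total += increment[bit]
        # Appending `bit` leaves its own increment unchanged, but every
        # subsequence just created can later be extended by the other bit.
        increment[1 - bit] += increment[bit]
    return total
```

For example, `count_by_increments(4)` walks over the bits `0, 0` after the leading `1`, adding 2 each time to reach 5.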

NofP

Posted 2017-10-15T14:09:36.603

Reputation: 754

1

83 bytes or 83 bytes – Mr. Xcoder – 2017-10-15T20:29:28.590

8

JavaScript (ES6), 51 bytes (previously 53 bytes)

f=(n,r=~(a=[]))=>n<1?~r:f(n/2,r*2-~~a[n&=1],a[n]=r)

Test cases

f=(n,r=~(a=[]))=>n<1?~r:f(n/2,r*2-~~a[n&=1],a[n]=r)

console.log(f(1099511627775))   // 40
console.log(f(1099511627776))   // 81
console.log(f(911188917558917)) // 728765543
console.log(f(109260951837875)) // 447464738
console.log(f(43765644099))     // 5941674

Formatted and commented

f = (                      // f is a recursive function taking:
  n,                       //   n = integer
  r = ~(                   //   r = last result, initially set to -1
    a = []                 //   and using a[] = last results for 0 and 1,
  )                        //   implicitly initialized to [0, 0]
) =>                       //
  n < 1 ?                  // if n is less than 1:
    ~r                     //   we're done: return -(r + 1)
  :                        // else:
    f(                     //   do a recursive call with:
      n / 2,               //     n / 2
      r * 2 - ~~a[n &= 1], //     updated result = r * 2 - last result for this binary digit
      a[n] = r             //     update last result for this binary digit
    )                      //   end of recursive call
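
For readers who prefer positive numbers, the recurrence in the comments can be restated in Python. This is a sketch, not a transcription: it keeps the count (including the empty subsequence) as a positive `total` where the JavaScript keeps its negation in `r`, and the names are chosen here for illustration. Like the original, it reads the bits from least significant to most significant, which counts the subsequences of the reversed string; reversal does not change the count.

```python
def count_by_doubling(n: int) -> int:
    """Each new bit doubles the count, minus the count seen before its last occurrence."""
    total = 1        # distinct subsequences so far, including the empty one
    last = [0, 0]    # value of `total` just before the previous 0 / previous 1
    while n:
        bit = n & 1
        # The right-hand side is evaluated first, so last[bit] receives the old total.
        total, last[bit] = 2 * total - last[bit], total
        n >>= 1
    return total - 1  # drop the empty subsequence
```

The subtraction removes exactly the subsequences that were already produced when the same digit was appended the previous time.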

Non-recursive version, 63 bytes

Saved 3 bytes thanks to @ThePirateBay

s=>[...s.toString(2)].map(l=c=>l[p=r,r=r*2-~~l[c],c]=p,r=1)|r-1

Test cases

let f =

s=>[...s.toString(2)].map(l=c=>l[p=r,r=r*2-~~l[c],c]=p,r=1)|r-1

console.log(f(1099511627775))   // 40
console.log(f(1099511627776))   // 81
console.log(f(911188917558917)) // 728765543
console.log(f(109260951837875)) // 447464738
console.log(f(43765644099))     // 5941674

Arnauld

Posted 2017-10-15T14:09:36.603

Reputation: 111 334

I think you can save 3 bytes by assigning the inner function (the first argument of map) to the flag variable l instead of an empty array. – None – 2017-10-15T16:48:01.587

@ThePirateBay Nice one. Thanks! – Arnauld – 2017-10-15T16:51:13.380

7

Python 2, 56 bytes

f=lambda x,a=1,b=1:x and f(x/2,a+~x%2*b,x%2*a+b)or a+b-2

Try it online!

Taking the method from NofP.

59 bytes iteratively:

x=input()
v=[1,1]
while x:v[x%2]=sum(v);x/=2
print sum(v)-2

Try it online!

xnor

Posted 2017-10-15T14:09:36.603

Reputation: 115 687

nice algorithm. would you mind adding just a short explanation about why it works? – Jonah – 2017-10-16T03:30:13.050

6

Jelly, 10 bytes

B3;BSṛ¦/’S

This uses @xnor's improvement on @NofP's algorithm.

Try it online!

Background

Let (a_1, ..., a_n) be a finite binary sequence. For each non-negative integer k ≤ n, define o_k as the number of unique subsequences of (a_1, ..., a_k) that are either empty or end in 1, z_k as the number of unique subsequences that are either empty or end in 0.

Clearly, o_0 = z_0 = 1, as the only subsequence of the empty sequence is the empty sequence.

For each index k, the total number of subsequences of (a_1, ..., a_k) is o_k + z_k - 1 (subtracting 1 accounts for the fact that both o_k and z_k count the empty sequence). The total number of non-empty subsequences is therefore o_k + z_k - 2. The challenge asks to compute o_n + z_n - 2.

Whenever k > 0, we can compute o_k and z_k recursively. There are two cases:

  • a_k = 1

    z_k = z_{k-1}, since (a_1, ..., a_{k-1}) and (a_1, ..., a_{k-1}, 1) have the same subsequences that end in 0.

    For each of the o_k - 1 non-empty subsequences of (a_1, ..., a_k) that end in 1, we can remove the trailing 1 to obtain one of the o_{k-1} + z_{k-1} - 1 subsequences of (a_1, ..., a_{k-1}). Conversely, appending a 1 to each of the latter o_{k-1} + z_{k-1} - 1 sequences results in one of the o_k - 1 former sequences. Thus, o_k - 1 = o_{k-1} + z_{k-1} - 1 and o_k = o_{k-1} + z_{k-1}.

  • a_k = 0

    Similarly to the previous case, we obtain the recursive formulae o_k = o_{k-1} and z_k = z_{k-1} + o_{k-1}.
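
The two cases translate almost line for line into code. The following Python sketch keeps the two counts defined above (subsequences ending in 1 and ending in 0, each also counting the empty one) in variables named here for illustration.

```python
def count_by_last_digit(n: int) -> int:
    """Track subsequences that are empty-or-end-in-1 and empty-or-end-in-0."""
    ends_in_one = 1   # starting value: only the empty subsequence
    ends_in_zero = 1  # starting value: only the empty subsequence
    for ch in bin(n)[2:]:
        if ch == "1":
            # Any earlier subsequence (or the empty one) may gain a trailing 1.
            ends_in_one = ends_in_one + ends_in_zero
        else:
            # Symmetric case for a trailing 0.
            ends_in_zero = ends_in_zero + ends_in_one
    # Both counts include the empty subsequence, so remove it twice.
    return ends_in_one + ends_in_zero - 2
```

For the input 4 (binary 100) the pair (ends_in_one, ends_in_zero) goes (1, 1), (2, 1), (2, 3), (2, 5), giving 2 + 5 - 2 = 5.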

How it works

B3;BSṛ¦/’S  Main link. Argument: n (positive integer)

B           Binary; convert n to base 2.
 3;         Prepend a 3.
   B        Binary; convert all integers in the resulting array to base 2, mapping
            0 to [0], 1 to [1], and the prepended 3 to [1, 1].
       /    Reduce the resulting array by the quicklink to the left, which will be 
            called with left argument [x, y] (integer pair) and right argument [j] 
            (either [0] or [1]).
      ¦     Sparse application.
    S           Compute the sum (x + y) and...
     ṛ          for each index in the right argument (i.e., for j)...
            replace the element of [x, y] at that index with (x + y).
       ’    Decrement both integers in the resulting pair.
        S   Take the sum.

Dennis

Posted 2017-10-15T14:09:36.603

Reputation: 196 637

hey dennis, would you mind adding a short explanation about why the algorithm works? – Jonah – 2017-10-16T13:50:02.200

I've added an explanation. – Dennis – 2017-10-16T17:05:58.443

4

05AB1E, 12 bytes

0¸sbvDO>yǝ}O

Try it online!

Explanation: As pointed out by the other answers, the number of subsequences for a binary string a..y0 that end in a 1 is the same as the number for the binary string a..y, while the number that end in a 0 is the total number of subsequences for the binary string a..y (which each gain a 0 suffix) plus one for 0 itself. Unlike the other answers I don't include the empty subsequence as this saves a byte constructing the initial state.

0¸s             Push [0] under the input
   b            Convert the input to binary
    v     }     Loop over the digits
     D          Duplicate the array
      O         Take the sum
       >        Increment
        yǝ      Replace the index corresponding to the binary digit
           O    Take the sum of the final array
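
The variant that leaves out the empty subsequence can be sketched in Python as follows. One assumption to note: the Python list starts with two zero slots, whereas the 05AB1E program starts from [0]; the names are chosen here for illustration.

```python
def count_without_empty(n: int) -> int:
    """Count non-empty subsequences directly, split by their final digit."""
    ending_in = [0, 0]  # ending_in[d] = distinct non-empty subsequences ending in d
    for ch in bin(n)[2:]:
        digit = int(ch)
        # Every existing subsequence gains `digit` as a suffix, plus the
        # one-character subsequence consisting of `digit` alone.
        ending_in[digit] = sum(ending_in) + 1
    return sum(ending_in)
```

Because the empty subsequence is never counted, no final correction is needed.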

Neil

Posted 2017-10-15T14:09:36.603

Reputation: 95 035

1

Java 8, 97 bytes

n->f(n,1,1)
long f(long n,long a,long b){return n>0?f(n/2,a+Math.floorMod(~n,2)*b,n%2*a+b):a+b-2;}

Port of @xnor's Python 2 answer, which in turn is an improvement of @NofP's Python 3 answer.

Try it here.


Maybe it's a good thing the -tag was present, because I initially had the following to bruteforce all subsequences:

import java.util.*;n->p(n.toString(n,2)).size()-1;Set p(String s){Set r=new HashSet();r.add("");if(s.isEmpty())return r;Set q=p(s.substring(1));r.addAll(q);for(Object o:q)r.add(""+s.charAt(0)+o);return r;}

Try it here.

Which also worked, but took way too long for the last three test cases. Not to mention it's way longer (204 bytes, previously 208 bytes).

Kevin Cruijssen

Posted 2017-10-15T14:09:36.603

Reputation: 67 575

1

6502 machine code (C64), 321 bytes

00 C0 20 FD AE A2 00 9D 4F C1 E8 20 73 00 90 F7 9D 4F C1 A0 FF C8 B9 4F C1 D0
FA A2 15 CA 88 30 0A B9 4F C1 29 0F 9D 4F C1 10 F2 A9 00 9D 4F C1 CA 10 F8 A9
00 A0 07 99 64 C1 88 10 FA A0 40 A2 6C 18 BD E4 C0 90 02 09 10 4A 9D E4 C0 E8
10 F2 A2 07 7E 64 C1 CA 10 FA 88 F0 13 A2 13 BD 50 C1 C9 08 30 05 E9 03 9D 50
C1 CA 10 F1 30 D1 A2 0F A9 00 9D 3F C1 CA D0 FA A9 01 8D 3F C1 8D 47 C1 A2 08
CA BD 64 C1 F0 FA A0 09 1E 64 C1 88 90 FA B0 0A CA 30 28 A0 08 1E 64 C1 90 04
A9 47 B0 02 A9 4F 8D AF C0 86 FE A2 F8 18 BD 47 C0 7D 4F C0 9D 47 C0 E8 D0 F4
A6 FE 88 D0 DC F0 D5 A2 F8 BD 47 C0 7D 4F C0 9D 6C C0 E8 D0 F4 AD 64 C1 E9 01
8D 64 C1 A2 F9 BD 6C C0 E9 00 9D 6C C0 E8 D0 F5 A0 15 A9 00 99 4E C1 88 D0 FA
A0 40 A2 13 BD 50 C1 C9 05 30 05 69 02 9D 50 C1 CA 10 F1 0E 64 C1 A2 F9 3E 6C
C0 E8 D0 FA A2 13 BD 50 C1 2A C9 10 29 0F 9D 50 C1 CA 10 F2 88 D0 D1 E0 14 F0
06 E8 BD 4F C1 F0 F6 09 30 99 4F C1 C8 E8 E0 15 F0 05 BD 4F C1 90 F0 A9 00 99
4F C1 A9 4F A0 C1 4C 1E AB

Online demo

Online demo with error checking (346 bytes)

Usage: sys49152,[n], e.g. sys49152,911188917558917.

The time restriction and the test cases require solutions to calculate with 64-bit numbers, so it's time to prove the C64 qualifies as a "modern machine" ;)

Of course, this needs quite a bit of code, because the OS doesn't provide anything for integers wider than 16 bits. The lame part here: it's yet another (slightly modified) implementation of NofP's algorithm, or rather xnor's improved variant. Thanks for the idea ;)


Explanation

Here's a commented disassembly listing of the relevant part doing the algorithm:

.C:c06c  A2 0F       LDX #$0F           ; 15 bytes to clear
.C:c06e  A9 00       LDA #$00
.C:c070   .clearloop:
.C:c070  9D 3F C1    STA .num_a,X
.C:c073  CA          DEX
.C:c074  D0 FA       BNE .clearloop
.C:c076  A9 01       LDA #$01           ; initialize num_a and num_b
.C:c078  8D 3F C1    STA .num_a         ; to 1
.C:c07b  8D 47 C1    STA .num_b
.C:c07e  A2 08       LDX #$08           ; 8 bytes of input to check,
.C:c080   .findmsb:                     ; start at most significant
.C:c080  CA          DEX
.C:c081  BD 64 C1    LDA .nc_num,X
.C:c084  F0 FA       BEQ .findmsb       ; repeat until non-0 byte found
.C:c086  A0 09       LDY #$09           ; 8 bits to check (+1 for pre dec)
.C:c088   .findbit:
.C:c088  1E 64 C1    ASL .nc_num,X      ; shift left, highest bit to carry
.C:c08b  88          DEY
.C:c08c  90 FA       BCC .findbit       ; bit was zero -> repeat
.C:c08e  B0 0A       BCS .loopentry     ; jump into calculation loop
.C:c090   .mainloop:
.C:c090  CA          DEX                ; next byte
.C:c091  30 28       BMI .done          ; index -1? -> done calculating
.C:c093  A0 08       LDY #$08           ; 8 bits to check
.C:c095   .bitloop:
.C:c095  1E 64 C1    ASL .nc_num,X      ; shift left, highest bit to carry
.C:c098  90 04       BCC .tgt_b         ; if 0, store addition result in num_b
.C:c09a   .loopentry:
.C:c09a  A9 47       LDA #$47
.C:c09c  B0 02       BCS .tgt_a         ; ... else store in num_a ...
.C:c09e   .tgt_b:
.C:c09e  A9 4F       LDA #$4F
.C:c0a0   .tgt_a:
.C:c0a0  8D AF C0    STA $C0AF          ; ... using self-modification.
.C:c0a3  86 FE       STX $FE            ; save byte index
.C:c0a5  A2 F8       LDX #$F8           ; index for adding
.C:c0a7  18          CLC
.C:c0a8   .addloop:
.C:c0a8  BD 47 C0    LDA $C047,X        ; load byte from num_a
.C:c0ab  7D 4F C0    ADC $C04F,X        ; add byte from num_b
.C:c0ae  9D 47 C0    STA $C047,X        ; store to num_a or num_b
.C:c0b1  E8          INX                ; next index
.C:c0b2  D0 F4       BNE .addloop       ; done if index overflown
.C:c0b4  A6 FE       LDX $FE            ; restore byte index
.C:c0b6  88          DEY                ; decrement bit index
.C:c0b7  D0 DC       BNE .bitloop       ; bits left in current byte -> repeat
.C:c0b9  F0 D5       BEQ .mainloop      ; else repeat main loop
.C:c0bb   .done:
.C:c0bb  A2 F8       LDX #$F8           ; index for adding
.C:c0bd   .addloop2:
.C:c0bd  BD 47 C0    LDA $C047,X        ; load byte from num_a
.C:c0c0  7D 4F C0    ADC $C04F,X        ; add byte from num_b
.C:c0c3  9D 6C C0    STA $C06C,X        ; store to nc_num (result)
.C:c0c6  E8          INX                ; next index
.C:c0c7  D0 F4       BNE .addloop2      ; done if index overflown
.C:c0c9  AD 64 C1    LDA .nc_num        ; load least significant result byte
.C:c0cc  E9 01       SBC #$01           ; subtract 2 (1 + negated carry)
.C:c0ce  8D 64 C1    STA .nc_num        ; store least significant result byte
.C:c0d1  A2 F9       LDX #$F9           ; index for subtract
.C:c0d3   .subloop:
.C:c0d3  BD 6C C0    LDA $C06C,X        ; subtract 0 from all other bytes
.C:c0d6  E9 00       SBC #$00           ; for handling carry if necessary
.C:c0d8  9D 6C C0    STA $C06C,X
.C:c0db  E8          INX
.C:c0dc  D0 F5       BNE .subloop       

The rest is input/output and conversion between strings and 64-bit unsigned integers (little-endian) using a double-dabble algorithm. In case you're interested, here's the whole assembly source for the version with error checking; the "golfed" version is in the branch "golf".
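
For readers unfamiliar with double dabble, here is a minimal Python sketch of the binary-to-decimal direction. It is not a transcription of the 6502 routine; the function name, the one-digit-per-list-entry layout, and the default of 20 digits (enough for any 64-bit value) are assumptions made for illustration.

```python
def double_dabble(value: int, bits: int = 64, num_digits: int = 20) -> str:
    """Convert an unsigned integer to decimal using only shifts and small adds."""
    digits = [0] * num_digits  # one BCD digit per entry, most significant first
    for shift in range(bits - 1, -1, -1):
        # "Dabble": add 3 to every digit >= 5 so the doubling below carries
        # into the next decimal digit instead of overflowing past 9.
        digits = [d + 3 if d >= 5 else d for d in digits]
        # "Double": shift the whole BCD register left by one bit, feeding in
        # the next input bit at the least significant end.
        carry = (value >> shift) & 1
        for i in range(num_digits - 1, -1, -1):
            shifted = (digits[i] << 1) | carry
            carry = shifted >> 4
            digits[i] = shifted & 0xF
    return "".join(str(d) for d in digits).lstrip("0") or "0"
```

The appeal on an 8-bit CPU is that the loop needs only compares, small additions, and rotates through carry, with no multiplication or division.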

Felix Palmen

Posted 2017-10-15T14:09:36.603

Reputation: 3 866