Help, I'm trapped in a Sierpinski triangle!

44

5

Drawing the Sierpinski triangle has been done to death. There's other interesting things we can do with it though. If we squint hard enough at the triangle, we can view upside-down triangles as nodes of a fractal graph. Let's find our way around that graph!

First, let's assign a number to each node. The largest upside-down triangle will be node zero, and then we just go down layer by layer (breadth-first), assigning consecutive numbers in the order top-left-right:

enter image description here
Click for larger version where the small numbers are a bit less blurry.

(Of course, this pattern continues ad infinitum inside the blue triangles.) Another way to define the numbering is that the centre node has index 0, and the children of node i (adjacent triangles of the next-smaller scale) have indices 3i+1, 3i+2, and 3i+3.

How do we move around this graph? There are up to six natural steps one can take from any given triangle:

  • One can always move through the midpoint of one of the edges to one of the three children of the current node. We'll designate these moves as N, SW and SE. E.g. if we're currently on node 2, these would lead to nodes 7, 8, 9, respectively. Other moves through the edges (to indirect descendants) are disallowed.
  • One can also move through one of the three corners, provided it doesn't touch the edge of the triangle, to either the direct parent or one of two indirect ancestors. We'll designate these moves as S, NE and NW. E.g. if we're currently on node 31, S would lead to 10, NE would be invalid and NW would lead to 0.

The Challenge

Given two non-negative integers x and y, find the shortest path from x to y, using only the six moves described above. If there are several shortest paths, output any one of them.

Note that your code should work for more than just the 5 levels depicted in the above diagram. You may assume that x, y < 1743392200. This ensures that they fit inside a 32-bit signed integer. Note that this corresponds to 20 levels of the tree.

Your code must process any valid input in less than 5 seconds. While this rules out a brute force breadth-first search, it should be a fairly loose constraint — my reference implementation handles arbitrary input for depth 1000 in half a second (that's ~480-digit numbers for the nodes).

You may write a program or function, taking input via STDIN (or closest alternative), command-line argument or function argument and outputting the result via STDOUT (or closest alternative), function return value or function (out) parameter.

The output should be a flat, unambiguous list of the strings N, S, NE, NW, SE, SW, using any reasonable separator (spaces, linefeeds, commas, ","...).

Standard rules apply.

Test Cases

The first few test cases can be worked out by hand using the diagram above. The others ensure that answers are sufficiently efficient. For those, there may be other solutions of the same length that are not listed.

0 40                    => N N N N
66 67                   => S SW N N N
30 2                    => NW NW -or- NE SW
93 2                    => NE SW
120 61                  => NW NW NW NW N SE SW N
1493682877 0            => S S NW NW
0 368460408             => SW SW N N SW SW SE SW SW N SE N N SW SW N SE SE
1371432130 1242824      => NW NW NE NW N SE SW SW SW SE SE SW N N N N SW
520174 1675046339       => NE NW NE NE SE SE SW SW N SE N SW N SW SE N N N N SE SE SW SW
312602548 940907702     => NE NW S SW N N SW SE SE SE SW SE N N SW SE SE SE SW
1238153746 1371016873   => NE NE NE SE N N SW N N SW N SE SE SW N SW N N SE N SE N
547211529 1386725128    => S S S NE NW N N SE N SW N SE SW SE SW N SE SE N SE SW SW N
1162261466 1743392199   => NE NE NE NE NE NE NE NE NE NE NE NE NE NE NE NE NE NE NE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE

Martin Ender

Posted 2015-12-17T14:07:45.617

Reputation: 184 808

Answers

5

Ruby, 195 194 190 184 bytes

Original: With apologies to Ell, as this is essentially a port of their answer and with many thanks to Doorknob for their help in debugging this answer. There's probably another algorithm for this problem - something to do with *f[x[0,**however far x matches with y**],y] - but I will save that for another time.

a=->n{n<1?[]:a[~-n/3]+[-n%3]}
f=->x,y{j=x.size
(Array===x)?x==y[0,j]?y[j..-1].map{|m|%w[SE SW N][m]}:x.uniq.map{|m|[%w[NW NE S][m],*f[x[0,x.rindex(m)],y]]}.min_by(&:size):f[a[x],a[y]]}

EDIT: The greedy algorithm doesn't work for h[299792458, 1000000]. I have returned the byte count to 195, while I repair my algorithm once more. Fixed it only for the byte count to rise to 203. Sigh

Under construction: This program uses a greedy algorithm to find common ancestors x[0,j]==y[0,j] (note: there may be several common ancestors). The algorithm is based very loosely on Ell's recursive ancestor search. The first half of the resulting instructions are how to get to this common ancestor, and the second half is getting to y based on y[j..-1].

Note: a[n] here returns a base-3 bijective numeration using the digits 2,1,0 instead of 1,2,3.

As an example, let's run through f[59,17] or f[[2,0,2,1],[2,1,1]]. Here, j == 1. To get to x[0,j], we go 0 or NW. Then, to get to y, go [1,1] or SW SW

a=->n{n<1?[]:a[~-n/3]+[-n%3]}
h=->m,n{x=a[m];y=a[n];c=[];j=x.size
(j=x.uniq.map{|m|k=x.rindex(m);x[0..k]==y[0..k]?j:k}.min
c<<%w[NW NE S][x[j]];x=x[0,j])until x==y[0,j]
c+y[j..-1].map{|m|%w[SE SW N][m]}}

Sherlock9

Posted 2015-12-17T14:07:45.617

Reputation: 11 664

45

Python 2, 208 205 200 bytes

A=lambda n:n and A(~-n/3)+[-n%3]or[]
f=lambda x,y:f(A(x),A(y))if x<[]else["SSNEW"[m::3]for m in
y[len(x):]]if x==y[:len(x)]else min([["NNSWE"[m::3]]+f(x[:~x[::-1].index(m)],y)for
m in set(x)],key=len)

A function, f, taking a pair of node numbers, and returning the shortest path as a list of strings.

Explanation

We start by employing a different addressing scheme for the triangles; the address of each triangle is a string, defined as follows:

  • The address of the central triangle is the empty string.

  • The addresses of the north, south-west, and south-east children of each triangle are formed appending 0, 1, and 2, respectively, to the address of the triangle.

Essentially, the address of each triangle encodes the (shortest) path from the central triangle to it. The first thing our program does is translating the input triangle numbers to the corresponding addresses.

Figure 1

Click the image for a larger version.

The possible moves at each triangle are easily determined from the address:

  • To move to the north, south-west, and south-east children, we simply append 0, 1, and 2, respectively, to the address.

  • To move to the south, north-east, and north-west ancestors, we find the last (rightmost) occurrence of 0, 1, and 2, respectively, and trim the address to the left of it. If there is no 0, 1, or 2 in the address, then the corresponding ancestor doesn't exist. For example, to move to the north-west ancestor of 112 (i.e., its parent), we find the last occurrence of 2 in 112, which is the last character, and trim the address to the left of it, giving us 11; to move to the north-east ancestor, we find the last occurrence of 1 in 112, which is the second character, and trim the address to the left of it, giving us 1; however, 112 has no south ancestor, since there is no 0 in its address.

Note a few things about a pair of addresses, x and y:

  • If x is an initial substring of y, then y is a descendant of x, and therefore the shortest path from x to y simply follows the corresponding child of each triangle between x and y; in other words, we can replace each 0, 1, and 2 in y[len(x):] with N, SW, and SE, respectively.

  • Otherwise, let i be the index of the first mismatch between x and y. There is no path from x to y that doesn't pass through x[:i] (which is the same as y[:i]), i.e., the first common ancestor of x and y. Hence, any path from x to y must arrive at x[:i], or one of its ancestors, let's call this triangle z, and then continue to y. To arrive from x to z, we follow the ancestors as described above. The shortest path from z to y is given by the previous bullet point.

If x is an initial substring of y, then the shortest path from x to y is easily given by the first bullet point above. Otherwise, we let j be the smallest of the indices of the last occurrences of 0, 1, and 2 in x. If j is greater than, or equal to, the index of the first mismatch between x and y, i, we simply add the corresponding move (S, NE, or NW, respectively) to the path, trim x to the left of j, and continue. Things get trickier if j is less than i, since then we might get to y fastest by ascending to the common ancestor x[:j] directly and descending all the way to y, or we might be able to get to a different common ancestor of x and y that's closer to y by ascending to a different ancestor of x to the right of i, and get from there to y faster. For example, to get from 1222 to 1, the shortest path is to first ascend to the central triangle (whose address is the empty string), and then descend to 1, i.e., the first move takes us to the left of the point of mismatch. however, to get from 1222 to 12, the shortest path is to ascend to 122, and then to 12, i.e., the first move keeps us to the right of the point of mismatch.

So, how do we find the shortest path? The "official" program uses a brute-force-ish approach, trying all possible moves to any of the ancestors whenever x is not an initial substring of y. It's not as bad as it sounds! It solves all the test cases, combined, within a second or two.

But then, again, we can do much better: If there is more than one directly reachable ancestor to the left of the point of mismatch, we only need to test the rightmost one, and if there is more than one directly reachable ancestor to the right of the point of mismatch, we only need to test the leftmost one. This yields a linear time algorithm, w.r.t. the length of x (i.e., the depth of the source triangle, or a time proportional to the logarithm of the source triangle number), which zooms through even much larger test cases. The following program implements this algorithm, at least in essence—due to golfing, its complexity is, in fact, worse than linear, but it's still very fast.

Python 2, 271 266 261 bytes

def f(x,y):
 exec"g=f;f=[]\nwhile y:f=[-y%3]+f;y=~-y/3\ny=x;"*2;G=["SSNEW"[n::3]for
n in g];P=G+f;p=[];s=0
 while f[s:]:
    i=len(f)+~max(map(f[::-1].index,f[s:]));m=["NNSWE"[f[i]::3]]
    if f[:i]==g[:i]:P=min(p+m+G[i:],P,key=len);s=i+1
    else:p+=m;f=f[:i]
 return P

Note that, unlike the shorter version, this version is written specifically not to use recursion in the conversion of the input values to their corresponding addresses, so that it can handle very large values without overflowing the stack.

Results

The following snippet can be used to run the tests, for either version, and generate the results:

def test(x, y, length):
    path = f(x, y)
    print "%10d %10d  =>  %2d: %s" % (x, y, len(path), " ".join(path))
    assert len(path) == length

#         x           y        Length
test(          0,          40,    4   )
test(         66,          67,    5   )
test(         30,           2,    2   )
test(         93,           2,    2   )
test(        120,          61,    8   )
test( 1493682877,           0,    4   )
test(          0,   368460408,   18   )
test( 1371432130,     1242824,   17   )
test(     520174,  1675046339,   23   )
test(  312602548,   940907702,   19   )
test( 1238153746,  1371016873,   22   )
test(  547211529,  1386725128,   23   )
test( 1162261466,  1743392199,   38   )

Golfed Version

         0         40  =>   4: N N N N
        66         67  =>   5: S SW N N N
        30          2  =>   2: NE SW
        93          2  =>   2: NE SW
       120         61  =>   8: NW NW NW NW N SE SW N
1493682877          0  =>   4: S S NW NW
         0  368460408  =>  18: SW SW N N SW SW SE SW SW N SE N N SW SW N SE SE
1371432130    1242824  =>  17: NW NW NE NW N SE SW SW SW SE SE SW N N N N SW
    520174 1675046339  =>  23: NE NE NE NE SE SE SW SW N SE N SW N SW SE N N N N SE SE SW SW
 312602548  940907702  =>  19: NE NW S SW N N SW SE SE SE SW SE N N SW SE SE SE SW
1238153746 1371016873  =>  22: NE NE NE SE N N SW N N SW N SE SE SW N SW N N SE N SE N
 547211529 1386725128  =>  23: S S S S NW N N SE N SW N SE SW SE SW N SE SE N SE SW SW N
1162261466 1743392199  =>  38: NE NE NE NE NE NE NE NE NE NE NE NE NE NE NE NE NE NE NE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE

Efficient Version

         0         40  =>   4: N N N N
        66         67  =>   5: S SW N N N
        30          2  =>   2: NW NW
        93          2  =>   2: NE SW
       120         61  =>   8: NW NW NW NW N SE SW N
1493682877          0  =>   4: NE S NW NW
         0  368460408  =>  18: SW SW N N SW SW SE SW SW N SE N N SW SW N SE SE
1371432130    1242824  =>  17: NW NW NE NW N SE SW SW SW SE SE SW N N N N SW
    520174 1675046339  =>  23: NE NW NE NE SE SE SW SW N SE N SW N SW SE N N N N SE SE SW SW
 312602548  940907702  =>  19: NE NW S SW N N SW SE SE SE SW SE N N SW SE SE SE SW
1238153746 1371016873  =>  22: NE NE NE SE N N SW N N SW N SE SE SW N SW N N SE N SE N
 547211529 1386725128  =>  23: S S S S NW N N SE N SW N SE SW SE SW N SE SE N SE SW SW N
1162261466 1743392199  =>  38: NE NE NE NE NE NE NE NE NE NE NE NE NE NE NE NE NE NE NE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE SE

Ell

Posted 2015-12-17T14:07:45.617

Reputation: 7 317

6Damn that was fast. I can't tell you how happy it makes me every time I get you to answer one of my challenges. :) – Martin Ender – 2015-12-17T15:14:07.963

2@MartinBüttner Thanks, that's a huge compliment! FWIW, I hugely enjoy solving your challenges. I might, or might not, have started working on this one while it was still in the sandbox... :) – Ell – 2015-12-17T15:18:54.213

2The addressing scheme is brilliant. This is awesome. – BrainSteel – 2015-12-17T15:20:59.490

Excellent answer. But how did you type all this up in less than an hour? ;-) – AdmBorkBork – 2015-12-17T15:26:19.400

1@BrainSteel the first thing that occured to me was to try that addressing scheme, but to see the whole thing conceptualized, implemented and written up in under an hour is awesome. +1 – Level River St – 2015-12-17T17:25:40.040

I believe this may be slightly off, as the largest center triangle does not have a value of 0, nor match the case in the OP where i is the value in the center node, and going counter-clockwise, the values are 3i+1, 3i+2, and 3i+3 – None – 2015-12-17T21:06:11.577

1@Zymus I'm not sure I follow, but if you're referring to the picture, it's not supposed to match the OP—this is a different addressing scheme, as described in the post., – Ell – 2015-12-17T21:28:18.180

@Ell, got it, I misunderstood. – None – 2015-12-17T22:22:04.180

1

The addressing system described here is a base-3 bijective numeration (using digits 0-2 rather than the more conventional 1-3).

– Blckknght – 2015-12-18T02:28:40.210

3

APL (Dyalog Unicode), 144 132 129 118 133 132 130 124 117 bytesSBCS

Thank you very much to Ven and ngn for their help in golfing this in The APL Orchard, a great place to learn the APL language. ⎕IO←0. Golfing suggestions welcome.

Edit: -12 bytes thanks to Ven and ngn by changing how n is defined and switching from 1-indexing to 0-indexing. -3 due to fixing a bug where not everything was switched to 0-indexing. -11 bytes due to changing how P and Q are defined. +15 bytes due to fixing a problem where my algorithm was incorrect with many thanks to ngn for help with figuring the s[⊃⍋|M-s] section. -2 bytes from rearranging the method of finding the backtracking path and +1 byte to bug fixing. -2 bytes thanks to Adám from rearranging the definition of I. -6 bytes thanks to ngn from rearranging the definition of 'S' 'NE' 'NW' 'N' 'SW' 'SE' and from rearranging how t is defined (it is no longer a separate variable). -7 bytes thanks to ngn from golfing how s is defined.

{M←⊃⍸≠⌿↑1+P Q←⍵{(⍵/3)⊤⍺-+/3*⍳⍵}¨⌊3⍟1+⍵×2⋄(∪¨↓6 2⍴'SSNENWNNSWSE')[P[I],3+Q↓⍨⊃⌽I←⍬{M≥≢⍵:⍺⋄(⍺∘,∇↑∘⍵)s[⊃⍋|M-s←⌽⊢.⊢⌸⍵]}P]}

Try it online!

An explanation of the bug in the algorithm

The basic problem is that I went thought the shortest path went directly through the common ancestor, and could not, in fact, go through an ancestor of the common ancestor. This is incorrect as the following examples will demonstrate.

From 66 to 5

66 → 0 2 2 2 → 0 2 2 2
5  → 0 1     → 0 1
      ↑ common ancestor

The two ancestors of 0 2 2 2 are:
0 2 2
(empty)

(empty) has the shorter path back to 0 1 as it only needs two forward moves,
while 0 2 2 requires two more backtracks and one more forward move.

From 299792458 to 45687

299792458 → 0 2 1 1 0 1 1 2 1 1 1 2 1 0 2 2 2 0
45687     → 0 2 1 1 0 1 1 1 2 2
                         ↑ common ancestor

The three ancestors of 299792458 are:
0 2 1 1 0 1 1 2 1 1 1 2 1 0 2 2 2
0 2 1 1 0 1 1 2 1 1 1 2            ← choose this one
0 2 1 1 0 1 1 2 1 1 1 2 1 0 2 2

And the three ancestors of 0 2 1 1 0 1 1 2 1 1 1 2 are:
0 2 1 1
0 2 1 1 0 1 1 2 1 1
0 2 1 1 0 1 1 2 1 1 1

0 2 1 1 0 1 1 1 2 2    ← 45687 for reference
             ↑ common ancestor

While it seems like `0 2 1 1` is the shorter path,
it actually results in a path that is 8 steps long
(2 backtracks, 6 forward steps to 45687).

Meanwhile, `0 2 1 1 0 1 1 2 1 1` is at an equal distance
to the common ancestor and has the following ancestors:
0 2 1 1
0 2 1 1 0 1 1 2 1
0 2 1 1 0 1 1

0 2 1 1 0 1 1 1 2 2    ← 45687 for reference
             ↑ common ancestor

Clearly, this is the superior path, as with three backtracks, we have reached
the point of the common ancestor. With 3 backtracks and 3 forward moves,
we have a path that is 6 steps long.

Explanation of the code

                       ⍝ ⍵ should be an array of 2 integers, x y
SierpinskiPath←{⎕IO←0  ⍝ 0-indexing
        ⍝ P Q←{...}¨⍵  ⍝ First, the bijective base-3 numeration of x and y
    P Q←{
        n←⌊3⍟1+⍵×2  ⍝ The number of digits in the numeration
        z←+/3*⍳⍵    ⍝ The number of numerations with ≤ n digits
        (n/3)⊤⍵-z   ⍝ And a simple decode ⊤ (base conversion) of ⍵-z
    }¨⍵             ⍝ gets us our numerations, our paths

    A←↑1+P Q      ⍝ We turn (1+P Q) into an 2-by-longest-path-length array 
                  ⍝ ↑ pads with 0s and our alphabet also uses 0s
                  ⍝ So we add 1 first to avoid erroneous common ancestor matches
    Common←⊃⍸≠⌿A  ⍝ We find the length of the common ancestor, Common

        ⍝ I←⍬{...}P  ⍝ Now we get the shortest backtracking path from P
    I←⍬{
        Common=≢⍵:⍺       ⍝ If P is shorter than Common, return our backtrack path
        s←⌽⊢.⊢⌸⍵          ⍝ Get the indices of the most recent N SW SE
        t←s[⊃⍋|Common-s]  ⍝ and keep the index that is closest to Common
                          ⍝ and favoring the ancestors to the right of
                          ⍝ Common in a tiebreaker (which is why we reverse ⊢.⊢⌸⍵)
        (⍺,t)∇t↑⍵         ⍝ Then repeat this with each new index to backtrack
    }P                    ⍝ and the path left to backtrack through

    Backtrack←P[I]   ⍝ We get our backtrack directions
    Forward←(⊃⌽I)↓Q  ⍝ and our forward moves to Q
                     ⍝ starting from the appropriate ancestor
    (∪¨↓6 2⍴'SSNENWNNSWSE')[Backtrack,Forward]    ⍝ 'S' 'NE' 'NW' 'N' 'SW' 'SE'
}                    ⍝ and return those directions

Alternative solution using Dyalog Extended and dfns

If we use ⎕CY 'dfns''s adic function, it implements our bijective base-n numeration (which was the inspiration for the version I use) for far fewer bytes. Switching to Dyalog Extended also saves quite a number of bytes and so here we are. Many thanks to Adám for his help in golfing this. Golfing suggestions welcome!

Edit: -8 bytes due to changing how P and Q are defined. -14 bytes due to switching to Dyalog Extended. -2 due to using a full program to remove the dfn brackets {}. +17 bytes due to fixing a problem where my algorithm was incorrect with many thanks to ngn for help with figuring the s[⊃⍋|M-s] section. +1 byte to bug fixing. -2 bytes thanks to Adám from rearranging the definition of I and -1 byte from remembering to put my golfs in both solutions. -3 bytes thanks to ngn by rearranging the generation of the cardinal directions, +1 byte from correcting a buggy golf, and -3 bytes thanks to ngn by rearranging how t is defined (it is no longer a separate variable). -7 bytes thanks to ngn by rearranging how s is defined.

APL (Dyalog Extended), 123 115 101 99 116 117 114 109 102 bytes

M←⊃⍸≠⌿↑1+P Q←(⍳3)∘⌂adic¨⎕⋄(∪¨↓6 2⍴'SSNENWNNSWSE')[P[I],3+Q↓⍨⊃⌽I←⍬{M≥≢⍵:⍺⋄(⍺∘,∇↑∘⍵){⍵[⊃⍋|M-⍵]}⌽⊢.⊢⌸⍵}P]

Try it online!

Sherlock9

Posted 2015-12-17T14:07:45.617

Reputation: 11 664

For 66 and 1, this doesn't find the shortest way via 0. – Christian Sievers – 2019-10-17T12:37:47.940

@ChristianSievers You're absolutely right and I'm not sure how to fix this yet. Thanks for letting me know. – Sherlock9 – 2019-10-17T14:17:19.093