Background

Perfect shuffle algorithms like Fisher-Yates shuffle don't produce great results when it comes to music playlist shuffling, because it often produces clusters of songs from the same album. In an attempt to solve this problem, Spotify introduced an interesting shuffle algorithm in 2014. At the end of the article, they claimed (emphasis mine):

All in all the algorithm is very simple and it can be implemented in just a couple of lines. It’s also very fast and produces decent results.

Let's see if this is indeed the case.

Task

Implement the modified Spotify shuffle. The algorithm described in the article has some gaps in its specification, so I present a refined one here.

The algorithm

Let's assume we have a list of items grouped into categories, e.g.:

[['A', 'AA', 'AAA'], ['B'], ['C', 'CC'], ['D'], ['E', 'EE', 'EEE', 'EEEE']]

Shuffle items within each category.
- You may use any shuffle algorithm that can produce every possible permutation with nonzero probability.

[['AAA', 'A', 'AA'], ['B'], ['C', 'CC'], ['D'], ['EEE', 'EEEE', 'EE', 'E']]

Assign each item a "positional value". Items in one category should be uniformly spaced, but with some randomness. To achieve this, do the following operations on each category having n items:
1. Initialize the positional value vector v of length n with the values v[k] = k/n for 0 <= k < n (i.e. 0-indexed), so that the items have default spacing of 1/n.
2. Generate an initial random offset io within the range of 0 <= io <= 1/n, and add it to every v[k].
3. Generate n individual random offsets o[k] within the range of -1/10n <= o[k] <= 1/10n, and apply v[k] += o[k] for each k. So the positional value of k-th item (0-indexed) within an n-item category will be v[k] = k/n + io + o[k].
  - The random offsets io and o[k] should ideally be picked from a uniform random variable, but can be approximated by picking from a discrete distribution with at least 5 distinct equally-spaced outcomes, including both lower and upper bounds. (e.g. you can choose to randomly pick io from [0, 1/4n, 2/4n, 3/4n, 1/n].)
  - Don't do extra processing even if v[k] < 0 or v[k] > 1.

[['AAA' -> 0.1, 'A' -> 0.41, 'AA' -> 0.79],
 ['B' -> 0.2],
 ['C' -> 0.49, 'CC' -> 1.01],
 ['D' -> 0.03],
 ['EEE' -> 0.12, 'EEEE' -> 0.37, 'EE' -> 0.6, 'E' -> 0.88]]

Sort all items by the positional values.

['D', 'AAA', 'EEE', 'B', 'EEEE', 'A', 'C', 'EE', 'AA', 'E', 'CC']

The shuffled result roughly looks like this (from the article):

Here is Python-like pseudocode of the above algorithm:

x = nested array of items
uniform(a,b) = uniformly generate a random value between a and b
items = []
v = []
for i in range(len(x)):  # for each category
    shuffle(x[i])  # shuffle items within category in place
    items += x[i]
    n = len(x[i])  # size of the current category
    io = uniform(0, 1/n)  # initial offset
    o = [uniform(-0.1/n, 0.1/n) for k in range(n)]  # individual offsets
    v += [k/n + io + o[k] for k in range(n)]  # resulting positional values
sort items by v
print items

Input and output

The input is a list of groups of items (i.e. a nested list) as shown in the example above. An item is a string made of only uppercase letters (or only lowercase if you want). You can assume that all items are distinct, within and across categories.

The output is a shuffled list including all items in the input, which is produced by the algorithm described above.

The randomness requirements are covered in the algorithm description.

Scoring and winning criterion

Standard code-golf rules apply. Shortest code in bytes wins.

Bubbler

Posted 2020-01-16T23:52:10.757

Reputation: 16 616

"You may use any shuffle algorithm that can produce every possible permutation with nonzero probability." - does this make the assumption that you have a true randomness source? Because it's important to note that the number of possible permutations is limited by the period of the random number generator in use. – Beefster – 2020-01-17T23:13:32.057

That minimum bound of 5 was so convenient for my answer - it lets me generate a random number from 0 to 4, halve it, decrement it, and divide by 10, to give me a random number from -0.1 to 0.1. Thanks! – Neil – 2020-01-18T01:10:27.543

@Beefster I'm aware of it, so it's OK to assume the underlying source of randomness for your language is perfect (i.e. true random). – Bubbler – 2020-01-18T08:18:33.583

May we assume no duplicates in the groups? – Jitse – 2020-01-20T13:16:58.627

@Jitse You can assume that all items are distinct, so yes. – Bubbler – 2020-01-20T22:49:40.220

Answers

JavaScript (Node.js), 125 124 bytes

Saved 1 byte thanks to @KevinCruijssen

a=>a.flatMap(a=>a.sort(_=>R()-.5).map(v=>[(o++-.1+R()/5)/a.length,v],o=R()),R=Math.random).sort(([a],[b])=>a-b).map(a=>a[1])

Try it online!

Commented

a =>                          // a[] = input
  a.flatMap(a =>              // for each category a[] in a[]:
    a.sort(_ => R() - .5)     //   shuffle a[]
    .map(v =>                 //   for each value v in a[]:
      [                       //     build a pair [position, value]:
        ( o++                 //       increment o
          - .1                //       subtract 1/10
          + R() / 5           //       add 2/10 * random value in [0,1)
        ) / a.length,         //       divide the result by the length of a[]
        v                     //       use v as the 2nd element
      ],                      //     end of pair
      o = R()                 //     start with o in [0,1)
    ),                        //   end of map()
    R = Math.random           //   R = alias for Math.random
  )                           // end of flatMap()
  .sort(([a], [b]) => a - b)  // sort by first element, in ascending order
  .map(a => a[1])             // keep only the 2nd element

Arnauld

Posted 2020-01-16T23:52:10.757

Reputation: 111 334

I honestly had no idea they added flat() and flatMap(), I suppose I'm way behind the curve now :P – Mama Fun Roll – 2020-01-20T08:50:44.543

Python 3, 244 218 200 bytes

from random import*
l,r,u=len,range,uniform
def f(x):
 I,v=[],[]
 for i in r(l(x)):shuffle(x[i]);I+=x[i];n=l(x[i]);t=0.1/n;v+=[k/n+u(0,1/n)+u(-t,t)for k in r(n)]
 return[s[1]for s in sorted(zip(v,I))]

Try it online!

Literally just a golfed version of the algorithm in the challenge. I mean, someone had to do it!

Thanks to ValueInk for golfing suggestions!

Lyxal

Posted 2020-01-16T23:52:10.757

Reputation: 5 253

Methinks [s[1]for s in sorted(zip(v,I))] is better because you don't need your key parameter (Python will sort by comparing the first elements, then second elements, etc.). Also, have you considered the fact that your offset initialization uses the same kind of loop as the value generation? That means that you can remove the offset assignment code and put the relevant randomization straight in the value array part. – Value Ink – 2020-01-17T21:46:38.330

Hmmm not quite. I was actually thinking k/n+u(0,1/n)+u(-t,t)for k in r(n). No indices, no arrays, only a quick grabbing of a random number. – Value Ink – 2020-01-17T23:20:50.500

Jelly, 28 bytes

‘9xX€’÷8Ḣ__.÷5ƊƊ_@Ẋ÷
ẈÇ€FỤịẎ

Try it online!

A pair of links that is called monadically with input and output as described in the question. Jelly doesn’t generate random uniform numbers so a random rational number between 0 and 1 in increments of 0.125 is used instead.

Incidentally, per the question’s quote, I’ve implemented it in a couple of lines.

Explanation

Helper link, generates scores for each collection

Argument is the length of the collection

‘                    | Increment by 1
 9x                  | 9 repeated that many times
   X€                | Generate a random number from 1..9 for each
     ’               | Decrement by 1
      ÷8             | Divide by 8
               Ɗ     | Following as a monad:
        Ḣ            | - Head
         _     Ɗ     | - Subtract following as a monad:
          _.         |   - (Remainder of list) subtract 0.5
            ÷5       |   - Divide by 5
                _@Ẋ  | Subtract this from shuffled list from 1..original argument (subtracted to get zero-based number)
                   ÷ | Divide by original argument

Main link

Ẉ       | Lengths of collections
 Ç€     | Call helper link for each
   F    | Flatten
    Ụ   | Sort indices by values
     ịẎ | Index into original argument joined into one list of strings

Nick Kennedy

Posted 2020-01-16T23:52:10.757

Reputation: 11 829

1It's official... Spotify uses Jelly! :P – Lyxal – 2020-01-17T21:58:37.267

J, ⁵² ⁴² 32 bytes

/:&;(%~_.1+?~+?@0+5%~?@#&0)@#&.>

Try it online!

Jonah

Posted 2020-01-16T23:52:10.757

Reputation: 8 729

Red, 180 bytes

func[b][random/seed now extract next sort/skip collect[forall
b[i: random 1.0 /(d: length? t: random b/1)repeat n
d[keep n - 1.0 / d + i +(0.1 / d - random 0.2 / d)keep t/:n]]]2 2]

Try it online!

Just a straightforward implementation of the algorithm

Galen Ivanov

Posted 2020-01-16T23:52:10.757

Reputation: 13 815

05AB1E, 37 bytes

ε.rDg©т*ÝΩVεN®/тD(ŸΩY‚₄/O+‚]˜2ôΣθ}€¨

Port of @Arnauld's JavaScript answer, so make sure to upvote him as well!

Try it online (feel free to remove the last 2 or 3 bytes (}€¨) to see the resulting \$v[k]\$ values it sorts on).

Explanation:

ε             # Map each group of the (implicit) 2D integer-list to:
 .r           #  Randomly shuffle the group
   Dg         #  Duplicate, and pop and push the length: n
     U        #  Pop and store this length in variable `X`
   ₄Ý         #  Push a list in the range [0,1000]
     ₄/       #  Divide each value by 1000 to make the range [0,1] in increments of 0.001
       ©      #  Store this list in variable `®` (without popping)
        Ω     #  Pop and push a random value from this list
         V    #  Pop and store that random value in variable `Y`
    ε         #  Map over each value in this shuffled group:
     Y        #   Push the random value from variable `Y`
      N+      #   Add the 0-based loop index to it
     Tz-      #   Subtract 1/10 from it
     ®        #   Push the [0,1] list from variable `®` again
      Ω       #   Pop and push a random value r from this list
       5/     #   Divide it by 5
         +    #   Add it to the earlier Y-0.1
          X/  #   And divide it all by the length of the group from variable `X`
     ‚        #   And pair this v[k] with the value name
]             # After both maps:
 ˜            # Flatten the entire list of list of value,key pairs
  2ô          # And then split it into parts of size 2
              # (so we now have a flattened value,key paired list)
    Σ }       # Sort these key-value pairs by:
     θ        #  The last item (so by the key v[k])
       €¨     # And then remove the last item (so the key v[k]) from each pair
              # (after which the resulting list of (wrapped) values is output implicitly)

Kevin Cruijssen

Posted 2020-01-16T23:52:10.757

Reputation: 67 575

This doesn't look correct. If I got this right, your io is uniform(0, 0.1 * n) when it should be uniform(0, 1 / n), and your o[k] is uniform(-0.1, 0.1) when it should be uniform(-0.1 / n, 0.1 / n). – Grimmy – 2020-01-17T16:12:30.323

@Grimmy Ported Arnauld's JS answer as fix (+1 byte). I have the feeling this can be golfed substantially, though.. >.> – Kevin Cruijssen – 2020-01-17T19:25:06.410

R, 104 101 bytes

function(x,`!`=unlist,`+`=runif)(!x)[order(!lapply(lengths(x),function(l)(sample(l)-+1--+l/5-.1)/l))]

Try it online!

A function taking a list of character vectors and returning a character vector.

Thanks to @RobinRyder for saving 3 bytes!

Nick Kennedy

Posted 2020-01-16T23:52:10.757

Reputation: 11 829

100 bytes by rearranging and redefining runif. – Robin Ryder – 2020-01-20T08:42:32.467

@RobinRyder thanks. Unfortunately doesn’t work because of the very low operator precedence for ?. I’ve managed to shave 2 bytes off though with inspiration from your answer. – Nick Kennedy – 2020-01-20T08:57:47.413

Right, sorry. I think this works for 101 bytes: unary + has high precedence, and you can use -- when you need actual addition.

– Robin Ryder – 2020-01-20T11:47:46.127

Ruby, 93 bytes

Another straightforward implementation. -9 bytes from borrowing ideas from Arnauld's value generation code, which I saw right before I posted my answer.

->a{v=a.flat_map{|e|i=rand-1;e.shuffle.map{|s|[(rand/5-0.1+i+=1)/e.size,s]}}.sort.map &:last}

Try it online!

Value Ink

Posted 2020-01-16T23:52:10.757

Reputation: 10 608

Charcoal, 60 bytes

Ｆθ«≔⟦⟧ηＷ⁻ιη⊞η‽κ≔∕‽⁵¦⁴ιＦＥη⟦∕⁺λ⁺ι∕⊖⊘‽⁵χＬηκ⟧⊞υκ»Ｗ⌊υ«≔Φυ¬⁼ικυ⟦⊟ι

Try it online! Link is to verbose version of code. Uses 5 distinct equally-spaced random numbers each time as allowed by the question. Explanation:

Ｆθ«

Loop over each category.

≔⟦⟧η

Start with no shuffled items.

Ｗ⁻ιη

Repeat until the set difference between the items and shuffled items is empty.

⊞η‽κ

Randomly pick one element of the difference to append to the shuffled items.

≔∕‽⁵¦⁴ι

Pick a random number between 0 and 1 which represents io*n.

∕⁺λ⁺ι∕⊖⊘‽⁵χＬη

For each item, add its index within its category v*n to io*n and a randomly generated offset o*n between -0.1 and 0.1, then divide the sum by n to get the final positional value.

ＦＥη⟦...κ⟧⊞υκ»

Save each random value and item in a separate list.

Ｗ⌊υ«

Repeat while the (minimum of) the list of random values and items is not empty.

≔Φυ¬⁼ικυ

Filter out the minimum value from the list.

⟦⊟ι

Output the item with that value.

Neil

Posted 2020-01-16T23:52:10.757

Reputation: 95 035

Python 3, 147 bytes

lambda a:[*zip(*sorted(((j+l.index(k)+random()/5-.1)/len(l),k)for j,l in((random(),sample(m,len(m)))for m in a)for k in l))][1]
from random import*

Try it online!

Assumes no duplicates

Jitse

Posted 2020-01-16T23:52:10.757

Reputation: 3 566

Spotify Shuffle (music playlist shuffle algorithm)

Background

Task

The algorithm

Input and output

Scoring and winning criterion

Answers

JavaScript (Node.js), 125 124 bytes

Commented

Python 3, 244 218 200 bytes

Jelly, 28 bytes

Explanation

Helper link, generates scores for each collection

Main link

J, 52 42 32 bytes

Red, 180 bytes

05AB1E, 37 bytes

R, 104 101 bytes

Ruby, 93 bytes

Charcoal, 60 bytes

Python 3, 147 bytes

J, ⁵² ⁴² 32 bytes