Where'd that germ go?

21

Introduction

You are a biologist studying the movement patterns of bacteria. Your research team has a bunch of them in a petri dish, and you are recording their activity. Unfortunately, you are seriously underfunded, and can't afford a video camera, so you just take a picture of the dish at regular intervals. Your task is to make a program that traces the movements of the germs from these pictures.

Input

Your inputs are two 2D arrays of characters in any reasonable format, representing consecutive pictures of the petri dish. In both arrays, the character . represents empty space, and O represents a germ (you can choose any two distinct characters if you want). Also, the "after" array is obtained from the "before" array by moving some germs one step in one of the four cardinal directions; in particular, the arrays have the same shape. The germs move simultaneously, so one of them may move to a space that already contained another germ, if it moves out of the way. It is guaranteed that the borders of the "before" array contain only empty spaces, and that there is at least one germ. Thus, the following is a valid pair of inputs:

Before  After
......  ......
.O..O.  ....O.
.OO.O.  .OO.O.
......  ..O...

Output

Your output is a single 2D array of characters in the same format as the inputs. It is obtained from the "before" array by replacing those germs that have moved with one of >^<v, depending on the direction of movement (you can also use any 4 distinct characters here). There may be several possible outputs, but you shall give only one of them. In the above example, one possible correct output is

......
.v..O.
.>v.O.
......

Unnecessary movement is allowed in the output and germs can swap places, so the following is also valid:

......
.v..v.
.>v.^.
......

Rules and scoring

You can write a full program or a function. The lowest byte count wins, and standard loopholes are disallowed.

I'm interested in relatively efficient algorithms, but I don't want to ban brute forcing entirely. For this reason, there's a bonus of -75% for solving the last test case within 10 minutes on a modern CPU (I'm unable to test most solutions, so I'll just trust you here). Disclaimer: I know that a fast algorithm exists (search for "disjoint paths problem"), but I haven't implemented it myself.

Additional test cases

Before
......
.O..O.
..OO..
......
After
......
..O...
...OO.
..O...
Possible output
......
.>..v.
..vO..
......

Before
.......
.OOOOO.
.O..OO.
.OO..O.
.OOOOO.
.......
After
.......
..OOOOO
.O...O.
.O...O.
.OOOOOO
....O..
Possible output
.......
.>>>>>.
.O..>v.
.Ov..v.
.O>>v>.
.......

Before
..........
.OOO..OOO.
.OOOOOOOO.
.OOO..OOO.
..........
After
..O.......
.OOO..O.O.
..OOOOOOOO
.O.O..OOO.
.......O..
Possible output
..........
.>^O..O>v.
.^O>>>vO>.
.O>^..>vO.
..........

Before
............
.OO..OOOOOO.
.OO......OO.
...OOOOOO...
.O.OOOOOO.O.
...OOOOOO...
.OOOOOOOOOO.
............
After
..........O.
.OO..OOOOO..
.O...O...O..
.O.OOOOOOO..
.O.OOOOOO..O
...OO..OO...
....OOOOOOOO
.OOO........
Possible output
............
.OO..v<<<<^.
.v<......^<.
...OOO>>>...
.O.OOO^OO.>.
...OOv^OO...
.vvvO>>>>>>.
............

Before
................
.OOOOOO.OOOOOOO.
..OO..OOOOOOOOO.
.OOO..OOOO..OOO.
..OOOOOOOO..OOO.
.OOOOOOOOOOOOOO.
................
After
................
..OOOOO.OOOOOOOO
..OO..OOOOOOOOO.
..OO..OOOO..OOOO
..OOOOOOOO..OOO.
..OOOOOOOOOOOOOO
................
Possible output
................
.>>>>>v.>>>>>>>.
..OO..>>^>>>>>v.
.>>v..OOO^..OO>.
..O>>>>>>^..OOO.
.>>>>>>>>>>>>>>.
................

Before
..............................
.OOO.O.O.....O.....O.O.O..O...
..OOO.O...O..OO..O..O.O.......
.....O......O..O.....O....O...
.O.OOOOO......O...O..O....O...
.OO..O..OO.O..OO..O..O....O...
..O.O.O......OO.OO..O..OO.....
..O....O..O.OO...OOO.OOO...O..
.....O..OO......O..O...OO.OO..
........O..O........OO.O.O....
..O.....OO.....OO.OO.......O..
.O.....O.O..OO.OO....O......O.
..O..OOOO..O....OO..........O.
.O..O...O.O....O..O....O...OO.
....O...OO..O.......O.O..OO...
........O.O....O.O....O.......
.OO.......O.OO..O.......O..O..
....O....O.O.O...OOO..O.O.OO..
.OO..OO...O.O.O.O.O...OO...O..
..............................
After
..............................
.OOOOO.......OO.....O..O......
...OO..O...O...O....OO....O...
....O.O......O..OO...OO...O...
.OO.OOOO......OO..O..O........
O.O.OO..O..O..O..OO...O...OO..
.OO.....O....OO.O..O.OO.O.....
......O.....O.....OOO.OO...O..
....O..OOOO..O..O..O.O.O.OO...
..O......O.O........O...O.O...
.O.....OOO.....OO.OO...O...O..
.......OOO..O.O.O...........O.
.O...O.....O...OOOO..O.O....O.
.O..O.O..O.....O......O....OO.
....O..O..O.O......O.....O....
........OOO....O......O..O....
.OO......O..OO..OOO.....O..O..
..O.O....OO..O...OO...O...OO..
.O..OO....O..O...O.O.O.OO.....
..............O............O..
Possible output
..............................
.OOO.O.v.....>.....>.v.O..v...
..>>^.v...>..^>..v..O.v.......
.....<......>..>.....O....O...
.O.<O><O......O...O..O....v...
.<O..O..v<.O..O^..O..>....>...
..<.^.v......OO.O^..>..<O.....
..^....v..v.Ov...>>^.<OO...O..
.....<..OO......O..O...Ov.v<..
........>..O........O^.v.^....
..^.....Ov.....OO.OO.......O..
.^.....^.^..O>.vO....v......O.
..<..Ov^^..O....><..........O.
.O..O...>.v....O..^....^...OO.
....O...<v..O.......<.^..v<...
........O.O....O.v....O.......
.OO.......<.Ov..O.......O..O..
....O....O.<.^...O^v..O.v.OO..
.O^..<<...O.>.v.>.^...<O...v..
..............................

Zgarb

Posted 2015-10-28T14:49:27.327

Reputation: 39 083

Just to be sure, germs can only move by one or zero cells, right? – Domino – 2015-10-28T16:14:57.570

@JacqueGoupil Yes, that's correct. Each of >^<v corresponds to a movement of exactly one step in the respective direction. – Zgarb – 2015-10-28T16:20:08.873

I didn't try solving it yet, but here's a tool to build more test cases :) https://jsfiddle.net/xd2xns64/embedded/result/

– Domino – 2015-10-28T17:43:25.830

Oh, careful, there's a chance the script will loop forever if it tries to move all cells against an edge but then the edge cells have nowhere to go. – Domino – 2015-10-28T19:05:15.123

Answers

3

Octave, 494 496 bytes - 372 byte bonus = 124 bytes

function o=G(b,a)
y='.O<^v>';s=(b>46)+0;t=a>46;v=t;f=s;t(:,2:end,2)=t(:,1:end-1);t(2:end,:,3)=t(1:end-1,:,1);t(1:end-1,:,4)=t(2:end,:,1);t(:,1:end-1,5)=t(:,2:end,1);t=reshape(t,[],5);m=size(s,1);p=[0 -m -1 1 m];
function z(n)
f(n+p(s(n)))--;q=find(t(n,:));w=n+p(q);d=min(f(w));q=q(f(w)==d);j=randi(numel(q));s(n)=q(j);f(n+p(q(j)))++;end
for g=find(s)' z(g);end
while any((f~=v)(:)) L=find(s);k=zeros(size(s));for h=L' k(h)=f(h+p(s(h)));end;c=find(k>1);g=c(randi(numel(c)));z(g);end
o = y(s+1);end

There's still a lot of golfing to be done on this answer, but I wanted to get the ungolfed explanation in.

I saw this as a Constraint Satisfaction Problem, so I went with my favorite local search heuristic, Min-conflicts. The idea is, given a starting placement with each germ in a reachable destination, select a random germ that occupies the same destination cell as one or more other germs and move it to a valid cell that has a minimum of other germs already there. Repeat as necessary until the placement matches the goal.

Interestingly, this algorithm is not guaranteed to terminate (if the goal is unreachable it will continue indefinitely, for instance) but if it does terminate it is guaranteed to produce a valid solution.

Here's the code:

function output = germs(before, after)

%before = ['......';'.O..O.';'.OO.O.';'......'];
%after = ['......';'....O.';'.OO.O.';'..O...'];

symbs = '.O<^v>';
start = (before > 46) + 0;                   %should be called current_board
target = after > 46;                         %destinations on current cell == O
goal = target;
conflicts = start;
target(:, 2:end,2) = target(:, 1:end-1);     %destinations on cell to left
target(2:end, :,3) = target(1:end-1, :,1);   %destinations on cell above
target(1:end-1, :,4) = target(2:end, :,1);   %destinations on cell below
target(:, 1:end-1,5) = target(:, 2:end,1);   %destinations on cell to right
target=reshape(target,[],5);
m = size(start,1);                           %number of rows = offset to previous/next column
offsets = [0 -m -1 1 m];                     %offsets of neighbors from current index


function moveGerm(n)
   conflicts(n+offsets(start(n)))--;         %take germ off board
   move = find(target(n, :));                %get valid moves for this germ
   neighbors = n + offsets(move);            %valid neighbors = current position + offsets
   minVal = min(conflicts(neighbors));       %minimum number of conflicts for valid moves
   move = move(conflicts(neighbors)==minVal);
   mi = randi(numel(move));                  %choose a random move with minimum conflicts
   start(n) = move(mi);                      %add move type to board
   conflicts(n + offsets(move(mi)))++;       %add a conflict on the cell we move to
end

% Generate an initial placement
for g = find(start)'
   moveGerm(g);                              %make sure all germs are moved to valid cells
end

% Repeat until board matches goal
while any((conflicts ~= goal)(:))
   germList = find(start);                   %list of all our germs
   cost = zeros(size(start));                %calculate conflicts for each germ
   for h = germList'
      cost(h) = conflicts(h + offsets(start(h)));
   end
   conflicted = find(cost > 1);              %find those germs that occupy the same cell as another
   g = conflicted(randi(numel(conflicted))); %choose a random germ to move
   moveGerm(g);
end

output = symbs(start+1);                     %use moves as indices into symbol array for output

end

Output for the last test case:

>> gtest
ans =

..............................
.OO>.O.v.....>.....>.v.O..v...
..>^O.v...>..^>..v..O.v.......
.....v......>..>.....O....O...
.O.<^<OO......>...O..O....v...
.<O..O..v<.O..^<..O..>....>...
..<.^.v......OO.O^..<..<O.....
..^....v..v.Ov...>>>.^OO...O..
.....<..OO......O..O...Ov.<<..
........>..O........O^.v.>....
..^.....OO.....OO.OO.......O..
.^.....^.O..O>.vO....v......O.
..<..Ov^^..O....OO..........O.
.O..O...>.v....O..^....^...OO.
....O...<v..O.......<.^..v<...
........O.O....O.v....O.......
.OO.......<.OO..O.......O..O..
....O....O.<.O...O^<..O.v.OO..
.O^..<<...O.>.v.>.>...<O...v..
..............................

Elapsed time is 0.681691 seconds.

The average elapsed time was less than 9 seconds 1 second* on a 5 year old Core i5, qualifying for the bonus.

I'm trying to get this working on ideone, but I'm having what I believe to be scoping issues with the way it handles nested functions. (Here's the non-working ideone link for reference: http://ideone.com/mQSwgZ)
The code on ideone is now working. I had to force all of the variables to global, which was unnecessary running it locally.

*I had a note in my ungolfed version that one of the steps was inefficient, so I tried it to see if I could speed up execution and for 2 added bytes the execution time is now down to under a second. Code and sample output has been updated and the input on ideone has been changed to the last test case.

beaker

Posted 2015-10-28T14:49:27.327

Reputation: 2 349

3

Python, 1171 bytes - 878.25 byte bonus = 292.75 bytes

from itertools import *;from random import *;R=range;L=len;O=choice;G='O'
def A(a,b):a.append(b)
def D(y,z):
 a=[];b=[];c=[]
 for i in R(L(y)):
  A(c,[])
  for j in R(L(y[0])):
   k=[(i,j),y[i][j]==G,z[i][j]==G,[],0];A(c[i],k)
   for l,m in [(0,1),(1,0)]:
    try:
     n=c[i-l][j-m]
     if k[2]&n[1]:A(n[3],k)
     if k[1]&n[2]:A(k[3],n)
    except:pass
   if k[1]&~k[2]:A(a,k)
   elif k[2]&~k[1]:A(b,k)
 d={}
 for i in a:
  j=[[i]]
  while j:
   k=j.pop();l=[e[0] for e in k]
   while True:
    m=k[-1];n=[o for o in m[3] if o[0] not in l]
    if not n:
     if m in b:A(d.setdefault(i[0],[]),k)
     break
    for o in n[1:]:p=k[:];A(p,o);A(j,p)
    A(k,n[0]);A(l,n[0][0])
 e={}
 for i in a:e[i[0]]=O(d[i[0]])
 def E():return sum(any(k in j for k in i) for i,j in combinations(e.values(),2))
 f=E()
 for i in count():
  t=3**-i/L(a);j=O(a);k=e[j[0]];e[j[0]]=O(d[j[0]]);l=E()
  if not l:break
  else:
   if l>f and random()>t:e[j[0]]=k
   else:f=l
 for i in e.values():
  for j in R(L(i)-1):i[j][4]=i[j+1]
 for i in c:
  for j in R(L(i)):
   k=i[j]
   if 1&~k[1]:i[j]='.'
   elif not k[4]:i[j]=G
   else:l,m=k[0];n,o=k[4][0];i[j]='v>^<'[abs((l-n+1)+2*(m-o))]
 return c

Ideone link: http://ideone.com/0Ylmwq

Takes anywhere from 1 - 8 seconds on the last test case on average, qualifying for the bonus.

This is my first code-golf submission, so it's probably not the best-golfed program out there. Nevertheless, it was an interesting challenge and I quite enjoyed it. @Beaker deserves a mention for reminding me that heuristic-based searches are a thing. Before he posted his solution and inspired me to redo mine, my brute force search was waaaay too long to qualify for the bonus on the last test case (it was on the order of 69! iterations, which is a 99-digit number...).

I didn't want to straight-up copy Beaker's solution, so I decided to use simulated annealing for my search heuristic. It seems slower than min-conflict for this problem (likely because it's an optimization algorithm rather than a constraint-satisfaction one), but it's still well within the 10-minute mark. It also had the benefit of being fairly small, code-wise. I spent a lot more bytes on transforming the problem than I did on finding a solution to it.

Explanation

My solution is probably fairly inefficient byte-wise, but I had trouble conceptualizing how to solve the problem as-is and so I ended up having to transform it into a different problem that was easier for me to understand. I realized that there are four possiblities for each cell on the grid:

  • It had no germ before or after, which means we can ignore it
  • It had a germ before but not after, which means we have to find a move for it.
  • It had no germ before but one after, which also means we have to find a move for it.
  • It had a germ before and after, which means we might have to find a move for it, but then again maybe not.

After decomposing the data into those classes, I was able to further transform the problem. It was immediately obvious to me that I had to find a way to supply a germ from the "before but not after" set to a cell in the "after but not before" set. Further, germs can only move one space, so the only way for them to affect further-away cells is by "pushing" an unbroken path of germs into that cell. That meant the problem became finding X vertex-disjoint paths on a graph, where each cell with a germ was a vertex in said graph, and the edges represented adjacent cells.

I solved that problem that by first building the graph explained above. I then enumerated every possible path from each cell in Before and each cell in After, then randomly assigned each cell in Before one of its possible paths. Finally, I used simulated annealing to semi-randomly mutate the potential solution until I eventually find a solution that has no conflicts for any of the paths.

Annotated version

from itertools import *;from random import *;

# redefine some built-in functions to be shorter
R=range;L=len;O=choice;G='O'
def A(a,b):a.append(b)

# The function itself.  Input is in the form of two 2d arrays of characters, one each for before and after.
def D(y,z):
 # Declare the Before-but-not-after set, the After-but-not-before set, and a temp cell array
 # (the cells are temporarily stored in a 2d array because I need to be able to locate neighbors)
 a=[];b=[];c=[]

 # Build the graph
 for i in R(L(y)):
  # Append a row to the 2d temp array
  A(c,[])

  for j in R(L(y[0])):
   # Define the interesting information about the cell, then add it to the temp array
   # The cell looks like this: [position, does it have a germ before?, does it have a germ after?, list of neighbors with germs, final move]
   k=[(i,j),y[i][j]==G,z[i][j]==G,[],0];A(c[i],k)
   for l,m in [(0,1),(1,0)]:
    # Fill up the neighbors by checking the above and left cell, then mutually assigning edges
    try:
     n=c[i-l][j-m]
     if k[2]&n[1]:A(n[3],k)
     if k[1]&n[2]:A(k[3],n)
    except:pass

   # Decide if it belongs in the Before or After set
   if k[1]&~k[2]:A(a,k)
   elif k[2]&~k[1]:A(b,k)

 # For each cell in the before set, define ALL possible paths from it (this is a big number of paths if the grid is dense with germs)
 # This uses a bastard form of depth-first search where different paths can cross each other, but no path will cross itself
 d={}
 for i in a:
  j=[[i]]  # Define the initial stack of incomplete paths as the starting node.
  while j:
   # While the stack is not empty, pop an incomplete path of the stack and finish it
   k=j.pop();l=[e[0] for e in k]
   while True:
    # Set the list of next possible moves to the neighbors of the current cell,
    # ignoring any that are already in the current path.
    m=k[-1];n=[o for o in m[3] if o[0] not in l]

    # If there are no more moves, save the path if it ends in an After cell and break the loop
    if not n:
     if m in b:A(d.setdefault(i[0],[]),k)
     break

    # Otherwise, set the next move in this path to be the first move,
    # then split off new paths and add them to the stack for every other move
    for o in n[1:]:p=k[:];A(p,o);A(j,p)
    A(k,n[0]);A(l,n[0][0])

 # Perform simulated annealing to calculate the solution
 e={}
 for i in a:e[i[0]]=O(d[i[0]])  # Randomly assign paths for the first potential solution

 # Define a function for calculating the number of conflicts between all paths, then do the initial calculation for the initial potential solution
 def E():return sum(any(k in j for k in i) for i,j in combinations(e.values(),2))
 f=E()

 # Do the annealing
 for i in count():
  # The "temperature" for simulated annealing is calculated as 3^-i/len(Before set).
  # 3 was chosen as an integer approximation of e, and the function e^(-i/len) itself was chosen because
  # it exponentially decays, and does so slower for larger problem sets
  t=3**-i/L(a)

  j=O(a)              # Pick a random Before cell to change
  k=e[j[0]]           # Save it's current path
  e[j[0]]=O(d[j[0]])  # Replace the current path with a new one, randomly chosen
  l=E()               # Recalculate the number of conflicts

  if not l:break  # If there are no conflicts, we have a valid solution and can terminate
  else:           # Otherwise check the temperature to see if we keep the new move
   if l>f and random()>t:e[j[0]]=k  # Always keep the move if it's better, and undo it with probability 1 - T if it's worse
   else:f=l                         # If we don't undo, remember the new conflict count

 # Set each of the cells' final moves based on the paths
 for i in e.values():
  for j in R(L(i)-1):i[j][4]=i[j+1]

 # Build the output in the form of a 2d array of characters
 # Reuse the temp 2d array from step since its the right size
 for i in c:
  for j in R(L(i)):
   k=i[j]
   # Cells that are empty in the before array are always empty in the output
   if 1&~k[1]:i[j]='.'
   # Cells that aren't empty and don't have a move are always germs in the output
   elif not k[4]:i[j]=G
   # Otherwise draw the move
   else:l,m=k[0];n,o=k[4][0];i[j]='v>^<'[abs((l-n+1)+2*(m-o))]
 return c

user46863

Posted 2015-10-28T14:49:27.327

Reputation: