The next revolution in typing on laptops was released on the first of April, 2014 by SwiftKey. However, I want to be the first person to write a swiping nano clone, but, as I can't find a good swipe-text to real-text library, and I can't wait for them, I'm asking here.

Task

Write a program that takes in swipe-text and outputs the real-text equivalent. Example:

Input: hgrerhjklo
Output: hello

When the user does:

enter image description here

Other examples:

Input: wertyuioiuytrtjklkjhgfd
Output: world

Input: poiuytrtyuioiugrewsasdfgbhnmkijnbg
Output: programming

Input: poiuygfdzxcvhjklkjhgres
Output: puzzles

Input: cvhjioiugfde
Output: code

Input: ghiolkjhgf
Output: golf

Rules

The program will take one swiped 'word' in on stdin or argv
The first and last letter of the swiped input will equate to the first and last letter of the real word
You can assume the user will make reasonably straight lines, but you can use the sample data to verify this (I made the sample data, and I will make the final test data)
For ambiguous input, you can make select either output, but I will try to eradicate all ambiguousness from the test data
This word will be in this word list (but swiped). The word list will be in the current directory, and can be read (newline separated, will be named wordlist, no extension).
The swipe will only contain lowercase alphabetic characters
The swipe may contain duplicated characters, if the user pauses on a key
The program must output on stdout (case does not matter)
The program must return 0 as the return code
You must provide the run command, compile command (if needed), name and which input path to use
Standard loopholes apply (they might not help, though)
No non-builtin libraries allowed
Deterministic, non golfed / obfuscated solutions preferred
No file writing, networking, etc.
Your code must run in one second or less (your code is run once per word)
The scoring runs are run on a Intel i7 Haswell processor, with 4 virtual code (2 real ones), so you can use threads if you have to
Maximum code length of 5000 bytes
The language you use must have a free (non trial) version available for Linux (Arch Linux, if that matters)

Winning Criterion

The winner is the most accurate solution (scored by the control program, using the provided test list)
Popularity is the tie breaker
The scoring table will be updated every few days
Time-outs and crashes count as fails
This challenge will run for two weeks or more, depending on popularity
The final scoring will use a different, randomly selected list of words (same length, from same word list)

Other

You can use the control program to test your program
If you are impatient, and want your program updated / added quickly, start an issue or pull request at https://github.com/matsjoyce/codegolf-swipe-type/blob/master
Entries are maintained at https://github.com/matsjoyce/codegolf-swipe-type/blob/master/entries
Logs of each program run are maintained at https://github.com/matsjoyce/codegolf-swipe-type/blob/master/logs
Main log at https://github.com/matsjoyce/codegolf-swipe-type/blob/master/log.log
The position of each key will be provided as a csv file in the current directory called keypos.csv, with the x and y values given as relative to Q (see https://github.com/matsjoyce/codegolf-swipe-type/blob/master/keypos.csv)
Each key is 1.5 x 1.5 cm (same unit as in keypos.csv)

Current Score Boards

testlist (logs):

Three Pass Optimizer:Errors: 0/250       Fails: 7/250        Passes: 243/250     Timeouts: 0/250     
Corner Sim:         Errors: 0/250       Fails: 9/250        Passes: 241/250     Timeouts: 0/250     
Discrete Fréchet Distance:Errors: 0/250       Fails: 17/250       Passes: 233/250     Timeouts: 0/250     
Turnaround:         Errors: 0/250       Fails: 18/250       Passes: 232/250     Timeouts: 0/250     
Direction Checker:  Errors: 0/250       Fails: 19/250       Passes: 231/250     Timeouts: 0/250     
Regex Solver:       Errors: 0/250       Fails: 63/250       Passes: 187/250     Timeouts: 0/250

testlist2 (logs):

Corner Sim:         Errors: 0/250       Fails: 10/250       Passes: 240/250     Timeouts: 0/250     
Three Pass Optimizer:Errors: 2/250       Fails: 14/250       Passes: 234/250     Timeouts: 0/250     
Turnaround:         Errors: 0/250       Fails: 16/250       Passes: 234/250     Timeouts: 0/250     
Direction Checker:  Errors: 0/250       Fails: 17/250       Passes: 233/250     Timeouts: 0/250     
Discrete Fréchet Distance:Errors: 0/250       Fails: 18/250       Passes: 232/250     Timeouts: 0/250     
Regex Solver:       Errors: 0/250       Fails: 67/250       Passes: 183/250     Timeouts: 0/250

Final Run

testlist (logs):

Corner Sim:         Errors: 0/250       Fails: 14/250       Passes: 236/250     Timeouts: 0/250     
Three Pass Optimizer:Errors: 0/250       Fails: 18/250       Passes: 232/250     Timeouts: 0/250     
Direction Checker:  Errors: 0/250       Fails: 20/250       Passes: 230/250     Timeouts: 0/250     
Turnaround:         Errors: 0/250       Fails: 23/250       Passes: 227/250     Timeouts: 0/250     
Discrete Fréchet Distance:Errors: 0/250       Fails: 30/250       Passes: 220/250     Timeouts: 0/250     
Regex Solver:       Errors: 0/250       Fails: 55/250       Passes: 195/250     Timeouts: 0/250

Well done to everybody and hgfdsasdertyuiopoiuy swertyuiopoijnhg!

matsjoyce

Posted 2014-10-12T18:29:22.617

Reputation: 1 319

What is "A Solution"? Where is its code? – Doorknob – 2014-10-12T18:58:21.520

Let us continue this discussion in chat.

– Optimizer – 2014-10-12T20:13:11.610

8Somewhat related. – wchargin – 2014-10-12T21:06:17.597

@Optimizer Not sure about the other cases, but "poiuytresaseresasdfghuioiugfdxcguiugcxsasdfghjklkuy" contains every letter of "paradoxically", in order, except for the l, which is not doubled. – es1024 – 2014-10-13T01:51:06.953

Sorry, my bad. I missed the first "d" there. – Optimizer – 2014-10-13T09:13:53.360

I am pretty sure that the Error and Timeout section will always remain 0. You can remove that. Also, feel free to add mine too ... – Optimizer – 2014-10-13T15:35:09.913

If I'm not mistaken there are two spelling mistakes in the test cases ('cvbjkloiuhgfcvhjkjhgfds', expecting 'cloaks' and 'ijnjuyfdsasrtre', expecting 'inundate') and at least a dozen genuine ambiguities (like brats vs. breasts or aped vs. adopted). You could of cause say that any realistic test scenario should include such cases. ;) – Emil – 2014-10-13T20:56:24.980

I'm getting a syntax error when trying the control program. It is pointing to def print(.... I think it is this problem: http://stackoverflow.com/questions/937491/invalid-syntax-when-using-print but I can't fix it because I don't know python.

– Mhmd – 2014-10-14T19:08:07.920

@Mhmd Make sure you are using python 3. It will not work with python 2. – matsjoyce – 2014-10-14T19:20:02.083

@Optimizer The Error category hasn't remained zero... :) – matsjoyce – 2014-10-17T15:32:01.803

That's a bug. I will fix it. – Optimizer – 2014-10-17T15:38:20.007

@Optimizer I'll run the final round as soon as you've fixed your bug. The test set is ready. – matsjoyce – 2014-10-25T17:16:33.873

Give me a day :) – Optimizer – 2014-10-25T17:21:11.760

@Optimizer Shall I go ahead with the final scoring then? – matsjoyce – 2014-10-31T13:47:00.500

Sorry, I wasn't able to fix the errors in time. Please proceed as it will not be fair to others – Optimizer – 2014-10-31T13:48:09.627

You are not going to choose your own submission as answer, right ? (Dunno if its possible) – Optimizer – 2014-11-03T16:39:48.670

1@Optimiser Well, I thought your submission would beat it, but it was just below (a little tweaking would have changed that, I'm sure). It seems I can accept it, so... should I (I don't appear to get rep from accepting it)? I'd like to accept someone else's, but that's not following the rules (unless you've got a good idea). – matsjoyce – 2014-11-03T17:38:14.707

Answers

JavaScript, ES6, Three Pass Optimizer, 112 187 235 240 241 243 and 231 234 passes

A three pass filter which first figures out critical keys in the keystroke sequence and then passes the sequence through the three filters:

A loosely formed RegEx out of the critical keys and helping keys. This gives correct result for majority of the keys (around 150)
A strict RegEx made out of just critical keys. This gives correct result for an extra 85 sequences
A third filter to figure out ambiguity amongst close answers. This works for 40% ambiguous cases.

Code

keyboard = {
  x: {},
  y: ['  q      w      e      r      t      y      u      i      o      p',
      '    a      s      d      f      g      h      j      k      l',
      '        z      x      c      v      b      n      m'],
};
for (var i in keyboard.y)
  for (var y of keyboard.y[i])
    keyboard.x[y] = +i*7;
p = C => (x=keyboard.x[C],{x, y: keyboard.y[x/7].indexOf(C)})
angle = (L, C, R) => (
  p0 = p(L), p1 = p(C), p2 = p(R),
  a = Math.pow(p1.x-p0.x,2) + Math.pow(p1.y-p0.y,2),
  b = Math.pow(p1.x-p2.x,2) + Math.pow(p1.y-p2.y,2),
  c = Math.pow(p2.x-p0.x,2) + Math.pow(p2.y-p0.y,2),
  Math.acos((a+b-c) / Math.sqrt(4*a*b))/Math.PI*180
)
corner = (L, C, R, N, W) => {
  if (skip) {
    skip = false;
    return [];
  } 
  ngl = angle(L, C, R);
  if (ngl < 80) return [C + "{1,3}"]
  if (ngl < 115 && p(L).x != p(R).x && p(L).x != p(C) && p(R).x != p(C).x && Math.abs(p(L).y - p(R).y) < 5) return [C + "{0,3}"]
  if (ngl < 138) {
    if (N && Math.abs(ngl - angle(C, R, N)) < 6) {
      skip = true;
      return [L + "{0,3}", "([" + C + "]{0,3}|[" + R + "]{0,3})?", N + "{0,3}"]
    }
    return [C + "{0,3}"]
  }
  return ["([" + L + "]{0,3}|[" + C + "]{0,3}|[" + R + "]{0,3})?"]
}
f = S => {
  for (W = [S[0] + "{1,2}"],i = 1; i < S.length - 1; i++)
    W.push(...corner(S[i - 1], S[i], S[i + 1], S[i + 2], W))
  return [
    new RegExp("^" + W.join("") + S[S.length - 1] + "{1,3}$"),
    new RegExp("^" + W.filter(C=>!~C.indexOf("[")).join("") + S[S.length - 1] + "{1,3}$")
  ]
}
thirdPass = (F, C) => {
  if (!F[0]) return null
  F = F.filter((s,i)=>!F[i - 1] || F[i - 1] != s)
  FF = F.map(T=>[...T].filter((c,i)=>!T[i - 1] || T[i - 1] != c).join(""))
  if (FF.length == 1) return F[0];
  if (FF.length < 6 && FF[0][2] && FF[1][2] && FF[0][0] == FF[1][0] && FF[0][1] == FF[1][1])
    if (Math.abs(F[0].length - F[1].length) < 1)
      for (i=0;i<Math.min(F[0].length, FF[1].length);i++) {
        if (C.indexOf(FF[0][i]) < C.indexOf(FF[1][i])) return F[0]
        else if (C.indexOf(FF[0][i]) > C.indexOf(FF[1][i])) return F[1]
      }
  return F[0]
}
var skip = false;
SwiftKey = C => (
  C = [...C].filter((c,i)=>!C[i - 1] || C[i - 1] != c).join(""),
  skip = false, matched = [], secondPass = [], L = C.length, reg = f(C),
  words.forEach(W=>W.match(reg[0])&&matched.push(W)),
  words.forEach(W=>W.match(reg[1])&&secondPass.push(W)),
  matched = matched.sort((a,b)=>Math.abs(L-a.length)>Math.abs(L-b.length)),
  secondPass = secondPass.sort((a,b)=>Math.abs(L-a.length)>Math.abs(L-b.length)),
  first = matched[0], second = secondPass[0], third = thirdPass(secondPass.length? secondPass: matched, C),
  second && second.length >= first.length - 1? first != third ? third: second: third.length >= first.length ? third: first
)

// For use by js shell of latest firefox
print(SwiftKey(readline()));

The code assumes a variable called words is present which is an array of all the words from this page

See the code in action here

See the test cases in action here

Both the above link work only on a latest Firefox (33 and above) (due to ES6).

Optimizer

Posted 2014-10-12T18:29:22.617

Reputation: 25 836

Yup! I'm shelling out with shells. You can also use the proper keypos.csv file now. The IO functions avalible are listed at https://developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/Introduction_to_the_JavaScript_shell

– matsjoyce – 2014-10-14T19:24:26.883

It's fine, but the swipes are made with my keyboard angles, so it's your choice (doesn't seem to have affected your score, though!) – matsjoyce – 2014-10-14T19:29:23.253

Sure done. https://github.com/matsjoyce/codegolf-swipe-type/blob/master/logs/Three%20Pass%20Optimizer.log

– matsjoyce – 2014-10-14T19:40:32.230

240 passes is outstanding! I would have thought that ambiguities would prevent such good results. I'll be curious how this will perform on the final test set. – Emil – 2014-10-15T09:40:51.243

@Emil - Yeah, I am waiting to see that too. – Optimizer – 2014-10-15T09:44:27.693

@Emil I'll probably run the final set in ten days or so (unless the question becomes wildly popular, and I'll leave it longer). In the meantime, would making a second test set be useful? – matsjoyce – 2014-10-16T17:48:00.567

@matsjoyce yeah. – Optimizer – 2014-10-16T17:53:05.123

@Emil Done. Interesting how solid camelNeck's solution is (identical score) (actually most of the solutions got the same score, so my test-case validation method must work) – matsjoyce – 2014-10-16T20:48:39.667

This just means that I've got more data to improve my algorithm on an overall set. – Optimizer – 2014-10-16T21:04:56.597

Ruby, Regex Solver - 30 140 176 180 182 187 and 179 183 passes

I'll figure out the score later. Here is very naive solution that doesn't take the keyboard layout into account:

words = File.readlines('wordlist').map(&:chomp)

swipe = ARGV.shift
puts words.select {|word| word[0] == swipe[0] &&
                          word[-1] == swipe[-1]}
          .select {|word|
              chars = [word[0]]
              (1..word.size-1).each {|i| chars << word[i] if word[i] != word[i-1]}
              swipe[Regexp.new('^'+chars.join('.*')+'$')]
          }.sort_by {|word| word.size}[-1]

It takes input from ARGV and prints the result. I'm just filtering the word list by first and last letter, and them I'm trying all of the remaining words against the input (eliminating duplicate letters and using a regex like ^g.*u.*e.*s$ for "guess") and return the first of those if there are multiple solutions.

Run it like

ruby regex-solver.rb cvhjioiugfde

Anyone else, feel free to reuse this step as a first filter - I believe it won't throw out any correct words, so this preliminary check can greatly reduce the search space for better algorithms.

Edit: Following the OPs suggestion, I'm now selecting the longest of the candidates, which seems to be a decent heuristic.

Also thanks to es1024 for reminding me about duplicate letters.

Martin Ender

Posted 2014-10-12T18:29:22.617

Reputation: 184 808

Done. Your log's at https://github.com/matsjoyce/codegolf-swipe-type/blob/master/logs/Regex%20Solver.log I think the problem is that it selects from the possible solutions randomly, which could be improved by selecting the longest, or something else.

– matsjoyce – 2014-10-12T19:31:38.830

I think this might throw out all correct words with two identical letters next to each other, such as paradoxically, as the l would only appear once in the input, rather than twice as demanded by the regex. – es1024 – 2014-10-13T01:54:57.450

@es1024, ah thanks, when I first proposed this algorithm back in the sandbox, I was actually aware of that, but forgot about it yesterday. Will fix later. – Martin Ender – 2014-10-13T09:18:16.233

C++, Discrete Fréchet Distance - 201 220 222 232 and 232 passes

To me, the problem very much called for the Fréchet Distance which unfortunately is very hard to compute.

Just for fun, I've tried to approach the problem by implementing a discrete approximation described by Thomas Eiter and Heikki Mannila in Computing Discrete Fréchet Distance (1994).

At first, I'm using the same approach as the others for filtering all words in the list which are subsequences of the input (also accounting for multiple sequential occurances of the same character). Then, I'm filling the polygon curve from letter to letter of every word with intermediate points and match it against the input curve. Finally, I divide every distance by the length of the word and take the minimum score.

So far, the method doesn't work as well as I had hoped (It recognizes the code example as "chide"), but this could just be the result of bugs I haven't found yet. Else, another idea would be to use other variations of the fréchet distance ("average" instead of "maximum dog leash length").

Edit: Now, I'm using an approximation to the "average dog leash length". This means that I'm finding an ordered mapping between both paths which minimizes the sum of all distances and later divide it by the number of distances.

If the same character appears twice or more often in the dictionary word, I only put one node in the path.

#include<iostream>
#include<fstream>
#include<vector>
#include<map>
#include<algorithm>
#include<utility>
#include<cmath>

using namespace std;

const double RESOLUTION = 3.2;

double dist(const pair<double, double>& a, const pair<double, double>& b) {
    return sqrt((a.first - b.first) * (a.first - b.first) + (a.second - b.second) * (a.second - b.second));
}

double helper(const vector<pair<double, double> >& a,
        const vector<pair<double, double> >& b,
        vector<vector<double> >& dp,
        int i,
        int j) {
    if (dp[i][j] > -1)
        return dp[i][j];
    else if (i == 0 && j == 0)
        dp[i][j] = dist(a[0], b[0]);
    else if (i > 0 && j == 0)
        dp[i][j] = helper(a, b, dp, i - 1, 0) +
                   dist(a[i], b[0]);
    else if (i == 0 && j > 0)
        dp[i][j] = helper(a, b, dp, 0, j - 1) +
                   dist(a[0], b[j]);
    else if (i > 0 && j > 0)
        dp[i][j] = min(min(helper(a, b, dp, i - 1, j),
                           helper(a, b, dp, i - 1, j - 1)),
                       helper(a, b, dp, i, j - 1)) +
                   dist(a[i], b[j]);
    return dp[i][j];
}

double discretefrechet(const vector<pair<double, double> >& a, const vector<pair<double, double> >& b) {
    vector<vector<double> > dp = vector<vector<double> >(a.size(), vector<double>(b.size(), -1.));
    return helper(a, b, dp, a.size() - 1, b.size() - 1);
}

bool issubsequence(string& a, string& b) {
    // Accounts for repetitions of the same character: hallo subsequence of halo
    int i = 0, j = 0;
    while (i != a.size() && j != b.size()) {
        while (a[i] == b[j])
            ++i;
        ++j;
    }
    return (i == a.size());
}

int main() {
    string swipedword;
    cin >> swipedword;

    ifstream fin;
    fin.open("wordlist");
    map<string, double> candidatedistance = map<string, double>();
    string readword;
    while (fin >> readword) {
        if (issubsequence(readword, swipedword) && readword[0] == swipedword[0] && readword[readword.size() - 1] == swipedword[swipedword.size() - 1]) {
            candidatedistance[readword] = -1.;
        }
    }
    fin.close();

    ifstream fin2;
    fin2.open("keypos.csv");
    map<char, pair<double, double> > keypositions = map<char, pair<double, double> >();
    char rc, comma;
    double rx, ry;
    while (fin2 >> rc >> comma >> rx >> comma >> ry) {
        keypositions[rc] = pair<double, double>(rx, ry);
    }
    fin2.close();

    vector<pair<double, double> > swipedpath = vector<pair<double, double> >();
    for (int i = 0; i != swipedword.size(); ++i) {
        swipedpath.push_back(keypositions[swipedword[i]]);
    }

    for (map<string, double>::iterator thispair = candidatedistance.begin(); thispair != candidatedistance.end(); ++thispair) {
        string thisword = thispair->first;
        vector<pair<double, double> > thispath = vector<pair<double, double> >();
        for (int i = 0; i != thisword.size() - 1; ++i) {
            pair<double, double> linestart = keypositions[thisword[i]];
            pair<double, double> lineend = keypositions[thisword[i + 1]];
            double linelength = dist(linestart, lineend);
            if (thispath.empty() || linestart.first != thispath[thispath.size() - 1].first || linestart.second != thispath[thispath.size() - 1].second)
                thispath.push_back(linestart);
            int segmentnumber = linelength / RESOLUTION;
            for (int j = 1; j < segmentnumber; ++j) {
                thispath.push_back(pair<double, double>(linestart.first + (lineend.first - linestart.first) * ((double)j) / ((double)segmentnumber),
                                    linestart.second + (lineend.second - lineend.second) * ((double)j) / ((double)segmentnumber)));
            }
        }
        pair<double, double> lastpoint = keypositions[thisword[thisword.size() - 1]];
        if (thispath.empty() || lastpoint.first != thispath[thispath.size() - 1].first || lastpoint.second != thispath[thispath.size() - 1].second)
        thispath.push_back(lastpoint);

        thispair->second = discretefrechet(thispath, swipedpath) / (double)(thispath.size());
    }

    double bestscore = 1e9;
    string bestword = "";
    for (map<string, double>::iterator thispair = candidatedistance.begin(); thispair != candidatedistance.end(); ++thispair) {
        double score = thispair->second;
        if (score < bestscore) {
            bestscore = score;
            bestword = thispair->first;
        }
        // cout << thispair->first << ": " << score << endl;
    }
    cout << bestword << endl;

    return 0;
}

I've compiled the code with clang, but g++ ./thiscode.cpp -o ./thiscode should also work fine.

camelNeck

Posted 2014-10-12T18:29:22.617

Reputation: 71

Yes! Somebody has finally added a solution with a big fat algorithm name! Your log is at https://github.com/matsjoyce/codegolf-swipe-type/blob/master/logs/Discrete%20Fr%C3%A9chet%20Distance.log

– matsjoyce – 2014-10-13T17:49:26.900

1Let's rather call it a simple dynamic programming algorithm for a big fat computer science problem. – camelNeck – 2014-10-13T17:57:01.000

For some reason, this seems to fail for the input of programmijng -- it gives me pang. – Riking – 2014-10-15T23:41:41.597

That's strange. I'm getting programming like it should. Could you please uncomment the line // cout << thispair->first << ": " << score << endl; and paste the resulting scores? – camelNeck – 2014-10-16T10:52:00.177

PHP, Direction Checker, 203 213 216 229 231 and 229 233 passes

This begins with a simple pass through the dictionary to obtain a list of words whose letters are all present in the input (what Martin did here), and also only checking for l.* instead of l.*l.* when l is duplicated.

The longest word here is then saved in a variable.

Another check is then done, based on the keys at the points where the swipe changes direction (+/0 to - or -/0 to +, in either x or y). The list of possible words is checked against this list of characters, and words that do not match are eliminated. This approach relies on sharp turns in swiping to be accurate.

The longest remaining word is outputted, or if no words are left, the longest word from earlier is outputted instead.

Edit: Added all characters in a horizontal line leading up to a change in direction as possible values to check for.

PHP 5.4 is required (or replace all [..] with array(..)).

<?php
function get_dir($a, $b){
    $c = [0, 0];
    if($a[0] - $b[0] < -1.4) $c[0] = 1;
    else if($a[0] - $b[0] > 1.4) $c[0] = -1;
    if($a[1] < $b[1]) $c[1] = 1;
    else if($a[1] > $b[1]) $c[1] = -1;
    return $c;
}
function load_dict(){
    return explode(PHP_EOL, file_get_contents('wordlist'));
}

$coord = [];
$f = fopen('keypos.csv', 'r');
while(fscanf($f, "%c, %f, %f", $c, $x, $y)){
    $coord[$c] = [$x, $y];  
}
fclose($f);

$dict = load_dict();
$in = $argv[1];
$possib = [];

foreach($dict as $c){
    if($c[0] == $in[0]){
        $q = strlen($c);
        $r = strlen($in);
        $last = '';
        $fail = false;
        $i = $j = 0;
        for($i = 0; $i < $q; ++$i){
            if($last == $c[$i]) continue;
            if($j >= $r){
                $fail = true;
                break;
            }
            while($c[$i] != $in[$j++])
                if($j >= $r){
                    $fail = true; 
                    break;
                }
            if($fail) break;
            $last = $c[$i];
        }
        if(!$fail) 
            $possib[] = $c;
    }
}

$longest = '';
foreach($possib as $p){
    if(strlen($p) > strlen($longest))
        $longest = $p;
}

$dir = [[0, 0]];
$cdir = [0, 0];
$check = '/^' . $in[0] . '.*?';
$olst = []; $p = 1;

$len = strlen($in);
for($i = 1; $i < $len; ++$i){
    $dir[$i] = get_dir($coord[$in[$i - 1]], $coord[$in[$i]]);
    $olddir = $cdir;
    $newdir = $dir[$i];
    $xc = $olddir[0] && $newdir[0] && $newdir[0] != $olddir[0];
    $yc = $olddir[1] && $newdir[1] && $newdir[1] != $olddir[1];
    if($xc || $yc){ // separate dir from current dir
        if($yc && !$xc && $dir[$i - 1][1] == 0){
            $tmp = '';
            for($j = $i - 1; $j >= 0 && $dir[$j][1] == 0; --$j){
                $tmp = '(' . $in[$j-1] . '.*?)?' . $tmp;
            }
            $olst[] = range($p, $p + (($i - 1) - $j));
            $p += ($i - 1) - $j + 1;
            $check .= $tmp . '(' . $in[$i - 1] . '.*?)?';
        }else{
            $check .= $in[$i - 1] . '.*?';
        }
        $cdir = $dir[$i];
    }else{
        if(!$cdir[0]) $cdir[0] = $dir[$i][0]; 
        if(!$cdir[1]) $cdir[1] = $dir[$i][1]; 
    }
}
$check .= substr($in, -1) . '$/';
$olstc = count($olst);
$res = [];
foreach($possib as $p){
    if(preg_match($check, $p, $match)){
        if($olstc){
            $chk = array_fill(0, $olstc, 0);
            for($i = 0; $i < $olstc; ++$i){
                foreach($olst[$i] as $j){
                    if(isset($match[$j]) && $match[$j]){
                        ++$chk[$i];
                    }
                }
                // give extra weight to the last element of a horizontal run
                if(@$match[$olst[$i][count($olst[$i])-1]])
                    $chk[$i] += 0.5;
            }
            if(!in_array(0, $chk)){
                $res[$p] = array_sum($chk);
            }
        }else{
            $res[$p] = 1;
        }
    }
}
$possib = array_keys($res, @max($res));
$newlong = '';
foreach($possib as $p){
    if(strlen($p) > strlen($newlong))
        $newlong = $p;
}
if(strlen($newlong) == 0) echo $longest;
else echo $newlong;

es1024

Posted 2014-10-12T18:29:22.617

Reputation: 8 953

@matsjoyce Missed that bullet point while reading through the question. Code now uses the positions from keypos.csv – es1024 – 2014-10-13T08:10:49.027

@es1024 While not part of the 250 test cases, the word list does contain 238 words with three consecutive letters (so far, all I've checked are words ending in sss) - I don't think your duplicate elimination would catch those. – Martin Ender – 2014-10-13T09:54:14.243

If you need it, your logs at https://github.com/matsjoyce/codegolf-swipe-type/blob/master/logs/Direction%20Checker.log

– matsjoyce – 2014-10-13T16:38:33.950

Python 2, Turnarounds, 224 226 230 232 and 230 234 passes

This is quite a straight forward approach:

Find the points where the flow of letters changes direction (plus start and end).
Output the longest word that includes all these turning points.

import sys, csv, re

wordlistfile = open('wordlist', 'r')
wordlist = wordlistfile.read().split('\n')

layout = 'qwertyuiop asdfghjkl  zxcvbnm'
keypos = dict((l, (2*(i%11)+i/11, i/11)) for i,l in enumerate(layout))

#read input from command line argument
input = sys.argv[1]

#remove repeated letters
input = ''.join(x for i,x in enumerate(input) if i==0 or x!=input[i-1])

#find positions of letters on keyboard
xpos = map(lambda l: keypos[l][0], input)
ypos = map(lambda l: keypos[l][1], input)

#check where the direction changes (neglect slight changes in x, e.g. 'edx')
xpivot = [(t-p)*(n-t)<-1.1 for p,t,n in zip(xpos[:-2], xpos[1:-1], xpos[2:])]
ypivot = [(t-p)*(n-t)<0 for p,t,n in zip(ypos[:-2], ypos[1:-1], ypos[2:])]
pivot = [xp or yp for xp,yp in zip(xpivot, ypivot)]

#the first and last letters are always pivots
pivot = [True] + pivot + [True]

#build regex
regex = ''.join(x + ('+' if p else '*') for x,p in zip(input, pivot))
regexobj = re.compile(regex + '$')

#find all words that match the regex and output the longest one
words = [w for w in wordlist if regexobj.match(w)]
output = max(words, key=len) if words else input
print output

Run as

python turnarounds.py tyuytrghn

The solution is not very robust. A single wrong keystroke would make it fail. However, since the set of test cases has ~~very little~~ no misspellings it performs quite well, only getting confused in some cases where it picks too long words.

Edit: I changed it a bit to not use the provided key position file but a simplified layout. This makes it easier for my algorithm because in the file the keys are not evenly spaced. I can achieve even slightly better results by including some ad-hoc tiebreakers for ambiguous cases, but that would be over-optimizing for this particular test set. Some fails remain that are not actually ambiguous but that I don't catch with my simple algorithm because it only considers direction changes of more than 90 degrees.

Emil

Posted 2014-10-12T18:29:22.617

Reputation: 1 438

Well done! Current leader! If you want to see the log, goto https://github.com/matsjoyce/codegolf-swipe-type/blob/master/logs/Turnaround.log

– matsjoyce – 2014-10-13T20:22:07.197

@matsjoyce: I've added a comment to the question pointing out the two spelling mistakes I think I've found. :) – Emil – 2014-10-13T20:59:57.830

Yes, thanks, I'm just giving it another check. I'm not quite sure what to do with ambiguous cases, though. – matsjoyce – 2014-10-13T21:11:07.643

@matsjoyce: Some ambiguities could be resolved by choosing a another of the possible paths across the keyboard. E.g. brats could be 'bgrdsasdrtrds'which cannot be confused with breasts. However, I also wouldn't mind if you left it as it is. – Emil – 2014-10-13T21:35:29.780

True, that would work. I'm just worried that if this set is made too 'optimal', and the final scoring set is not put up to much scrutiny, some solutions may not perform as well – matsjoyce – 2014-10-14T06:41:13.230

Python 3, Corner Simulator, 241 and 240 passes

Algorithm:

Deduplicate (remove consecutive runs of the same character) the input and all the words in the word list (using a dictionary to get the original words back)
Remove all words that do not start and end with the start and end of the swipe (first pass)
Make a regex out of all the corners greater than 80 degrees, then remove all words which do not match this (second pass)
Regex each word (like Regex Solver) against the swipe, then split the swipe into a series of theoretically straight lines, and check if they are straight and could have been produced by a finger moving along that line (significant_letter function) (third pass)
Sort the words by closeness to the straight lines
Then use length as a tie breaker (longer is better)
Then use Levenshtein distance as the final tie breaker
Output word!

Code:

# Corner Sim

from math import atan, degrees, pi, factorial, cos, radians
import csv
import re
import sys

keys = {}
key_size = 1.5
for line in open("keypos.csv"):
    k, x, y = map(str.strip, line.split(","))
    keys[k] = float(x), float(y)


def deduplicate(s):
    return s[0] + "".join(s[i + 1] for i in range(len(s) - 1) if s[i + 1] != s[i])


def angle(coord1, coord2):
    x1, y1, x2, y2 = coord1 + coord2
    dx, dy = x2 - x1, y1 - y2
    t = abs(atan(dx/dy)) if dy else pi / 2
    if dx >= 0 and dy >= 0: a = t
    elif dx >= 0 and dy < 0: a = pi - t
    elif dx < 0 and dy >= 0: a = 2 * pi - t
    else: a = t + pi
    return degrees(a)


def significant_letter(swipe):
    if len(swipe) <= 2: return 0
    x1, y1, x2, y2 = keys[swipe[0]] + keys[swipe[-1]]
    m = 0 if x2 == x1 else (y2 - y1) / (x2 - x1)
    y = lambda x: m * (x - x1) + y1
    def sim_fun(x2, y2):
        try: return (x2 / m + y2 - y1 + x1 * m) / (m + 1 / m)
        except ZeroDivisionError: return x2
    dists = []
    for i, key in enumerate(swipe[1:-1]):
        sx, bx = min(x1, x2), max(x1, x2)
        pos_x = max(min(sim_fun(*keys[key]), bx), sx)
        sy, by = min(y1, y2), max(y1, y2)
        pos_y = max(min(y(pos_x), by), sy)
        keyx, keyy = keys[key]
        dist = ((keyx - pos_x) ** 2 + (keyy - pos_y) ** 2) ** 0.5
        a = angle((keyx, keyy), (pos_x, pos_y))
        if 90 <= a <= 180: a = 180 - a
        elif 180 <= a <= 270: a = a - 180
        elif 270 <= a <= 360: a = 360 - a
        if a > 45: a = 90 - a
        h = key_size / 2 / cos(radians(a))
        dists.append((max(dist - h, 0), i + 1, key))
    return sorted(dists, reverse=True)[0][0]


def closeness2(s, t):
    # https://en.wikipedia.org/wiki/Levenshtein_distance
    if s == t: return 0
    if not len(s): return len(t)
    if not len(t): return len(s)
    v0 = list(range(len(t) + 1))
    v1 = list(range(len(t) + 1))
    for i in range(len(s)):
        v1[0] = i + 1
        for j in range(len(t)):
            cost = 0 if s[i] == t[j] else 1
            v1[j + 1] = min(v1[j] + 1, v0[j + 1] + 1, v0[j] + cost)
        for j in range(len(v0)):
            v0[j] = v1[j]
    return v1[len(t)] / len(t)


def possible(swipe, w, s=False):
    m = re.match("^" + "(.*)".join("({})".format(i) for i in w) + "$", swipe)
    if not m or s:
        return bool(m)
    return closeness1(swipe, w) < 0.8


def closeness1(swipe, w):
    m = re.match("^" + "(.*)".join("({})".format(i) for i in w) + "$", swipe)
    unsigpatches = []
    groups = m.groups()
    for i in range(1, len(groups), 2):
        unsigpatches.append(groups[i - 1] + groups[i] + groups[i + 1])
    if debug: print(unsigpatches)
    sig = max(map(significant_letter, unsigpatches))
    if debug: print(sig)
    return sig


def find_closest(swipes):
    level1, level2, level3, level4 = swipes
    if debug: print("Loading words...")
    words = {deduplicate(i.lower()): i.lower() for i in open("wordlist").read().split("\n") if i[0] == level1[0] and i[-1] == level1[-1]}
    if debug: print("Done first filter (start and end):", words)
    r = re.compile("^" + ".*".join(level4) + "$")
    pos_words2 = list(filter(lambda x: bool(r.match(x)), words))
    if debug: print("Done second filter (sharpest points):", pos_words2)
    pos_words = pos_words2 or words
    pos_words = list(filter(lambda x: possible(level1, x), pos_words)) or list(filter(lambda x: possible(level1, x, s=True), pos_words))
    if debug: print("Done third filter (word regex):", pos_words)
    sorted_pos_words = sorted((closeness1(level1, x), -len(x), closeness2(level1, x), x)
                              for x in pos_words)
    if debug: print("Closeness matching (to", level2 + "):", sorted_pos_words)
    return words[sorted_pos_words[0][3]]


def get_corners(swipe):
    corners = [[swipe[0]] for i in range(4)]
    last_dir = last_char = None
    for i in range(len(swipe) - 1):
        dir = angle(keys[swipe[i]], keys[swipe[i + 1]])
        if last_dir is not None:
            d = abs(last_dir - dir)
            a_diff = min(360 - d, d)
            corners[0].append(last_char)
            if debug: print(swipe[i], dir, last_dir, a_diff, end=" ")
            if a_diff >= 55:
                if debug: print("C", end=" ")
                corners[1].append(last_char)
            if a_diff >= 65:
                if debug: print("M", end=" ")
                corners[2].append(last_char)
            if a_diff >= 80:
                if debug: print("S", end=" ")
                corners[3].append(last_char)
            if debug: print()
        last_dir, last_char = dir, swipe[i + 1]
    tostr = lambda x: deduplicate("".join(x + [swipe[-1]]).lower())
    return list(map(tostr, corners))


if __name__ == "__main__":
    debug = "-d" in sys.argv
    if debug: sys.argv.remove("-d")
    swipe = deduplicate(sys.argv[1] if len(sys.argv) > 1 else input()).lower()
    corners = get_corners(swipe)
    if debug: print(corners)
    print(find_closest(corners))

Run with python3 entries/corner_sim.py

matsjoyce

Posted 2014-10-12T18:29:22.617

Reputation: 1 319

This was a valid answer. No need to make mine the answer. – Optimizer – 2014-11-04T19:59:28.077

@Optimizer Well, the meta discussion seems to favour accepting your answer, so its fine by me.

– matsjoyce – 2014-11-04T20:02:51.877

After reading that meta discussion only, I was okay with your decision, but this is also good (better) :) – Optimizer – 2014-11-04T20:07:25.777

Swipe Type Converter

Task

Rules

Winning Criterion

Other

Current Score Boards

Final Run

Answers

JavaScript, ES6, Three Pass Optimizer, 112 187 235 240 241 243 and 231 234 passes

Ruby, Regex Solver - 30 140 176 180 182 187 and 179 183 passes

C++, Discrete Fréchet Distance - 201 220 222 232 and 232 passes

PHP, Direction Checker, 203 213 216 229 231 and 229 233 passes

Python 2, Turnarounds, 224 226 230 232 and 230 234 passes

Python 3, Corner Simulator, 241 and 240 passes