Chess Analysis with Limited Information

19

2

In this challenge, you are given a limited amount of information about a particular game of chess, and you need to predict who won the game.

You are given two sets of data:

  1. Piece counts (What pieces are still alive)
  2. Board colors (The color of the piece on each square, if any)

More importantly, you don't know which piece stands on which square; you only know which color occupies each square. You need to determine who you think will win.

Games are selected from all events listed on PGNMentor from 2010 to now. I've selected 10% of all of the board positions from each game that ends in a win or a loss. Board positions will always be at least 30 moves into the game. The test cases can be found here. (White wins are listed first, followed by black wins)

Input

The piece count will be a string consisting of a character for each piece: king, queen, rook, knight, bishop, or pawn. Lowercase means black, uppercase is white. The board will be a string of 64 characters (8 rows by 8 columns). B represents a black piece, W represents a white piece, and . represents an empty spot. Sample:

W..WB......W.BB....W..B..W.WWBBB..W...B....W..BBWWW...BB.W....B.,BBKPPPPPPPQRRbbkpppppppqrr

would represent the following board

...B.BB.
.BBBBBBB
.B.B....
B..W....
WWWW.W..
....W.W.
...W..WW
W.....W.

and where both colors have 2 Bishops, 1 King, 7 Pawns, 1 Queen, 2 Rooks
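The input format above can be read with a few lines of Python. This is only a minimal sketch: the function name `parse_case` is made up for illustration, and it assumes one test case per line in exactly the format shown in the sample.

```python
from collections import Counter

def parse_case(line):
    """Split one test-case line into board rows and per-piece counts."""
    board_str, pieces = line.strip().split(",")
    # Eight rows of eight characters each: 'W', 'B' or '.'
    rows = [board_str[i:i + 8] for i in range(0, len(board_str), 8)]
    # Uppercase letters are white pieces, lowercase letters are black pieces.
    counts = Counter(pieces)
    return rows, counts

sample = ("W..WB......W.BB....W..B..W.WWBBB..W...B....W..BBWWW...BB.W....B.,"
          "BBKPPPPPPPQRRbbkpppppppqrr")
rows, counts = parse_case(sample)
print(counts["P"], counts["p"], counts["N"])
```

A `Counter` returns 0 for letters that do not occur, so a missing piece type (knights in the sample) needs no special handling.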

Output

You need to return a floating-point number between 0 and 1 (inclusive) to determine how likely it is that white wins. Sample:

0.3     (30% chance that white wins)

More details:

  • Each test case is worth 1 point. Your score will be 1 - (1-Output)^2 if white wins, or 1 - (Output)^2 if black wins.
  • Your final score will be the sum across all test cases.
  • If I feel that submissions are hardcoding the input, I reserve the right to change the test cases. (If I do change them, they will have the SHA-256 hash 893be4425529f40bb9a0a7632f7a268a087ea00b0eb68293d6c599c6c671cdee)
  • Your program must run test cases independently. No saving information from one test case to the next.
  • If you are using machine learning, I highly recommend training on the first 80% of the data and testing on the remaining 20% (or whatever percentages you prefer). Several positions from the same game appear in the data, but positions from the same game are listed together sequentially.
  • UPDATE: I've added over a million test cases for testing and learning purposes. They are split into black and white parts because of github repo size limits.
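The scoring rule from the first bullet can be written down directly. The sketch below is one possible reading of it; the names `case_score` and `total_score` are illustrative and not part of the challenge.

```python
def case_score(output, white_won):
    """Score one prediction under the rule stated in the challenge."""
    if not 0.0 <= output <= 1.0:
        raise ValueError("output must lie in [0, 1]")
    # White win: 1 - (1 - output)^2; black win: 1 - output^2.
    error = (1.0 - output) if white_won else output
    return 1.0 - error ** 2

def total_score(outputs, white_wins):
    """Sum the per-case scores over all test cases."""
    return sum(case_score(o, w) for o, w in zip(outputs, white_wins))

print(case_score(0.5, True))
```

Under this rule, an undecided output of 0.5 earns 0.75 points whichever side wins, while a confident wrong answer earns 0.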

Good luck and have fun!

Nathan Merrill

Posted 2017-03-21T15:37:38.423

Reputation: 13 591

This conversation has been moved to chat. – Dennis – 2017-03-26T16:55:38.460

Do the new test cases contain the old ones, or are the two sets disjoint? – Fatalize – 2017-03-29T06:38:02.997

I have no idea. I got them from different sites, so it is possible they both included the same set of games. – Nathan Merrill – 2017-03-29T13:17:29.197

Answers

8

GNU sed + bc, 5074.5 points, 75 % (previously 4336 points, 64 %)

Update: the OP gave a new way to calculate the score of the prediction for an individual test case. Using Wolfram Alpha, I plotted both sets of formulas to see the differences.

The current scoring gives a strong incentive to output actual probabilities rather than just the extremes 0 and 1, for which the new formulas give the same maximum score as before. This is why the unchanged algorithm below now has a better prediction rate, in fact a great rate given its simplicity.

However, there's also a drawback associated with the new formulas, as explained in 'Edit 1'.


This is a simple estimation based only on the material advantage or disadvantage, ignoring the actual placement of the pieces. I was curious how this would perform. I use sed, rather than a language that could do this in one line, because it is my favorite esoteric language.

/:/d                                             # delete the two headers
s:.*,::                                          # delete board positions
s:$:;Q9,R5,B3,N3,P1,K0,q-9,r-5,b-3,n-3,p-1,k-0:  # add relative piece value table
:r                                               # begin replacement loop
s:([a-zA-Z])((.*)\1([^,]+)):\4+\2:               # table lookup: letter-value repl.
tr                                               # repeat till last piece
s:;.*::                                          # delete value table
s:.*:echo '&0'|bc:e                              # get material difference: bc call
/^0$/c0.5                                        # print potential draw score
/-/c0                                            # print potential black win score
c1                                               # print potential white win score

Standard piece values used:

  • 9 - Queen
  • 5 - Rook
  • 3 - Knight
  • 3 - Bishop
  • 1 - Pawn
  • 0 - King

I calculate the material for both sides and subtract black's material from that of white. The output for each test case is based on that difference as follows:

  • if difference > 0, then output = 1 (potential white win)
  • if difference = 0, then output = 0.5 (potential draw)
  • if difference < 0, then output = 0 (potential black win)

The 0.5 case is my only fractional output, which is the reason for the improvement explained above.

The prediction rate for this method was 64 %. Now it is 75 % with the new formulas.
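For readers who don't use sed, the same material-difference rule can be sketched in Python. This is not the submitted program, only an illustration of the rule described above; the names `PIECE_VALUES` and `material_estimate` are made up here.

```python
# Standard piece values listed in the answer; the king counts as 0.
PIECE_VALUES = {"Q": 9, "R": 5, "B": 3, "N": 3, "P": 1, "K": 0}

def material_estimate(line, values=PIECE_VALUES):
    """Return 1.0, 0.5 or 0.0 from the material difference (white minus black)."""
    pieces = line.strip().split(",")[1]
    diff = 0
    for piece in pieces:
        value = values[piece.upper()]
        # Uppercase letters are white (added), lowercase are black (subtracted).
        diff += value if piece.isupper() else -value
    if diff > 0:
        return 1.0
    if diff < 0:
        return 0.0
    return 0.5
```

The board part of the line is ignored on purpose, just as the sed script deletes it. Passing a different `values` table is an easy way to try other piece weights.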

I initially expected it to be higher, say 70 %, but as a chess player myself I can understand the result, since I've lost so many games when I was +1 / +2, and won that many when I was down in material. It's all about the actual position. (Well, now I got my wish!)

Edit 1: the drawback

The trivial solution is to output 0.5 for each test case, because this way you score half a point regardless of who won. For our test cases, this meant a total score of 3392.5 points (50 %).

But with the new formulas, 0.5 (which is an output you'd give if you're undecided who wins) is converted to 0.75 points. Remember that the maximum score you can receive for a test case is 1, for 100% confidence in the winner. As such, the new total score for a constant 0.5 output is 5088.75 points, or 75 %! In my opinion, the incentive is too strong for this case.

That score is better, though marginally, than that of my material-advantage algorithm. The reason is that the algorithm outputs 1 or 0 (assumed wins or losses, which get no incentive) more often (3831 times) than it outputs 0.5 (assumed draws, which get the incentive; 2954 times). The method is simple, and as such it doesn't have a high percentage of correct answers. The boost that the new formula gives to a constant 0.5 manages to reach that percentage artificially.
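The 75 % figure for a constant 0.5 can be checked from the scoring rule itself. As a sketch, assume white wins a given test case with probability p and the program outputs x (both symbols are introduced here only for illustration). The expected score for that case is then:

```latex
\begin{aligned}
E(x) &= p\,\bigl[1-(1-x)^2\bigr] + (1-p)\,\bigl[1-x^2\bigr] \\
\frac{dE}{dx} &= 2p(1-x) - 2(1-p)x = 2(p - x) \\
E(p) &= 1 - p(1-p), \qquad E(0.5) = 0.75 \ \text{for every } p
\end{aligned}
```

So the expected score is maximized by outputting the true probability x = p, and a constant 0.5 earns exactly 0.75 per case, which over 6785 test cases gives the 5088.75 points quoted above.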

Edit 2:

It is a known fact, mentioned in chess books, that it is usually better to have a bishop pair than a knight pair. This is especially true in the middle to end stage of the game, where the test cases are, since it's more likely to have an open position where a bishop's range is increased.

Therefore I did a second test, this time changing the bishop's value from 3 to 3.5. The knight's value remained 3. This is a personal preference, so I didn't make it my default submission. The total score in this case was 4411 points (65 %), an increase of only 1 percentage point.

With the new formulas, the total score is 4835 points (71 %), so the weighted bishop now underperforms. The effect is explained by the fact that the weighted method outputs assumed wins or losses even more often (5089 times) than assumed draws (1696 times).

seshoumara

Posted 2017-03-21T15:37:38.423

Reputation: 2 878

+1 for providing a reasonable baseline solution. I was also wondering how well this would perform. – Martin Ender – 2017-03-22T09:47:57.963

@MartinEnder Thank you. My idea of increasing the bishop's value, mentioned last time, produced only a 1% success rate increase (see Update 2). I think the standard values did include that effect after all. – seshoumara – 2017-03-22T11:20:29.000

Hey, as per xnor's comment, would you mind if I change the scoring to be the squared absolute difference? – Nathan Merrill – 2017-03-22T11:50:18.057

Awesome. Also, thanks for answering! I always worry that my tougher questions won't ever get an answer. – Nathan Merrill – 2017-03-22T11:56:07.620

@NathanMerrill I updated my answer to use the new scoring as asked. Sorry for the long analysis, but I was indeed really curious. – seshoumara – 2017-03-23T20:23:53.597

This answer interestingly gives the optimal score if p (the probability that white wins) is assumed to be 0.5 (equal chance of winning) regardless of the pieces and boards. – justhalf – 2017-03-24T06:39:45.250

Hey, I've added a ton more test cases if you are interested. – Nathan Merrill – 2017-03-28T15:08:40.633

8

Java 8 + Weka, 6413 points, 94.5%

This answer uses a machine learning approach. You need to retrieve the Weka library, notably weka.jar and PackageManager.jar.

Here, I use a multilayer perceptron as classifier; you can replace mlp with any Classifier class of Weka to compare results.

I have not tinkered much with the parameters of the MLP, and simply eyeballed them (one hidden layer of 50 neurons, 100 epochs, 0.2 learning rate, 0.1 momentum).

I threshold the output value of the MLP, so the output really is either 1 or 0 as defined in the challenge. That way the number of correctly classified instances as printed by Weka is directly our score.

Feature vector construction

I turn each instance from a string to a vector of 76 elements, where:

  • The first 64 elements represent the cells of the board, in the same order as in the string, where 1 is a white piece, -1 is a black piece and 0 is an empty cell.
  • The last 12 elements represent each type of piece (6 per player); the value of those elements is the number of pieces of that type on the board (0 being "no piece of that type"). One could apply normalization to refit those values between -1 and 1 but this is probably not very helpful here.
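The 76-element vector described above can be sketched in a few lines of Python. This is an illustration rather than the Java implementation further down; the names `PIECE_ORDER` and `to_features` are made up, and the piece order is chosen to match the order used in the Java code (black pieces first, then white).

```python
# Order of the 12 count features, matching the Java code below.
PIECE_ORDER = "kqrnbpKQRNBP"

def to_features(line):
    """Turn one 'board,pieces' line into a list of 64 + 12 numbers."""
    board, pieces = line.strip().split(",")
    # 1 for a white piece, -1 for a black piece, 0 for an empty cell.
    cells = [1 if c == "W" else -1 if c == "B" else 0 for c in board]
    # Number of pieces of each type still on the board.
    counts = [pieces.count(p) for p in PIECE_ORDER]
    return cells + counts
```

For a well-formed line with 64 board characters the result always has 76 entries, whatever pieces remain.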

Number of training instances

If I use all test cases given to train my classifier, I have managed to get 6694 (i.e. 98.6588%) correctly classified instances. This is obviously not surprising because testing a model on the same data you used to train it is way too easy (because in that case it is actually good that the model overfits).

Using a random subset of 80% of the instances as training data, we obtain the 6413 (i.e. 94.5173%) correctly classified instances figure reported in the header (of course since the subset is random you might get slightly different results). I am confident that the model would work decently well on new data, because testing on the remaining 20% of the instances (that weren't used for training) gives 77.0818% correct classification, which shows that the model generalizes decently well (assuming the instances we are given here are representative of the new test cases we would be given).

Using half the instances for training, and the other half for testing, we get 86.7502% on both training and testing data, and 74.4988% on only the test data.
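The challenge notes that positions from the same game are listed next to each other, so a random split can put positions from one game on both sides of the split. A sequential split per outcome is one way to reduce that overlap. The sketch below is not what this answer does; the function names are hypothetical, and it assumes the white-win and black-win cases have already been read into two separate lists in file order.

```python
def sequential_split(cases, train_fraction=0.8):
    """Split without shuffling, so neighbouring positions stay together."""
    cut = int(len(cases) * train_fraction)
    return cases[:cut], cases[cut:]

def split_by_outcome(white_cases, black_cases, train_fraction=0.8):
    """Split each outcome group separately and label it (1 = white wins)."""
    w_train, w_test = sequential_split(white_cases, train_fraction)
    b_train, b_test = sequential_split(black_cases, train_fraction)
    train = [(c, 1) for c in w_train] + [(c, 0) for c in b_train]
    test = [(c, 1) for c in w_test] + [(c, 0) for c in b_test]
    return train, test
```

Splitting each group on its own matters because the file lists all white wins before all black wins; a single cut through the whole file would leave only black wins in the test part. At most one game per group can still straddle the cut.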

Implementation

As I've said, this code requires weka.jar and PackageManager.jar from Weka.

One can control the percentage of data used in the training set with TRAIN_PERCENTAGE.

The parameters of the MLP can be changed just below TRAIN_PERCENTAGE. One can try other classifiers of Weka (e.g. SMO for SVMs) by simply replacing mlp with another classifier.

This program prints two sets of results: the first is computed on the entire set (including the data used for training), which is the score as defined in this challenge, and the second only on the data that was not used for training.

One inputs the data by passing the path of the file containing it as an argument to the program.

import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.functions.MultilayerPerceptron;
import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instance;
import weka.core.Instances;

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;

public class Test {

    public static void main(String[] arg) {

        final double TRAIN_PERCENTAGE = 0.5;

        final String HIDDEN_LAYERS = "50";
        final int NB_EPOCHS = 100;
        final double LEARNING_RATE = 0.2;
        final double MOMENTUM = 0.1;

        Instances instances = parseInstances(arg[0]);
        instances.randomize(new java.util.Random(0));
        Instances trainingSet = new Instances(instances, 0, (int) Math.floor(instances.size() * TRAIN_PERCENTAGE));
        Instances testingSet = new Instances(instances, (int) Math.ceil(instances.size() * TRAIN_PERCENTAGE), (instances.size() - (int) Math.ceil(instances.size() * TRAIN_PERCENTAGE)));

        Classifier mlp = new MultilayerPerceptron();
        ((MultilayerPerceptron) mlp).setHiddenLayers(HIDDEN_LAYERS);
        ((MultilayerPerceptron) mlp).setTrainingTime(NB_EPOCHS);
        ((MultilayerPerceptron) mlp).setLearningRate(LEARNING_RATE);
        ((MultilayerPerceptron) mlp).setMomentum(MOMENTUM);


        try {
            // Training phase
            mlp.buildClassifier(trainingSet);
            // Test phase
            System.out.println("### CHALLENGE SCORE ###");
            Evaluation test = new Evaluation(trainingSet);
            test.evaluateModel(mlp, instances);
            System.out.println(test.toSummaryString());
            System.out.println();
            System.out.println("### TEST SET SCORE ###");
            Evaluation test2 = new Evaluation(trainingSet);
            test2.evaluateModel(mlp, testingSet);
            System.out.println(test2.toSummaryString());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static Instances parseInstances(String filePath) {
        ArrayList<Attribute> attrs = new ArrayList<>(); // Instances constructor only accepts ArrayList
        for(int i = 0 ; i < 76 ; i++) {
            attrs.add(new Attribute("a" + String.valueOf(i)));
        }
        attrs.add(new Attribute("winner", new ArrayList<String>(){{this.add("white");this.add("black");}}));
        Instances instances = new Instances("Rel", attrs, 10);
        instances.setClassIndex(76);

        try {
            BufferedReader r = new BufferedReader(new FileReader(filePath));
            String line;
            String winner = "white";
            while((line = r.readLine()) != null) {
                if(line.equals("White:")) {
                    winner = "white";
                } else if(line.equals("Black:")) {
                    winner = "black";
                } else {
                    Instance instance = new DenseInstance(77);
                    instance.setValue(attrs.get(76), winner);
                    String[] values = line.split(",");
                    for(int i = 0 ; i < values[0].length() ; i++) {
                        if(values[0].charAt(i) == 'B') {
                            instance.setValue(attrs.get(i), -1);
                        } else if(values[0].charAt(i) == 'W') {
                            instance.setValue(attrs.get(i), 1);
                        } else {
                            instance.setValue(attrs.get(i), 0);
                        }
                    }
                    // Ugly as hell
                    instance.setValue(attrs.get(64), values[1].length() - values[1].replace("k", "").length());
                    instance.setValue(attrs.get(65), values[1].length() - values[1].replace("q", "").length());
                    instance.setValue(attrs.get(66), values[1].length() - values[1].replace("r", "").length());
                    instance.setValue(attrs.get(67), values[1].length() - values[1].replace("n", "").length());
                    instance.setValue(attrs.get(68), values[1].length() - values[1].replace("b", "").length());
                    instance.setValue(attrs.get(69), values[1].length() - values[1].replace("p", "").length());
                    instance.setValue(attrs.get(70), values[1].length() - values[1].replace("K", "").length());
                    instance.setValue(attrs.get(71), values[1].length() - values[1].replace("Q", "").length());
                    instance.setValue(attrs.get(72), values[1].length() - values[1].replace("R", "").length());
                    instance.setValue(attrs.get(73), values[1].length() - values[1].replace("N", "").length());
                    instance.setValue(attrs.get(74), values[1].length() - values[1].replace("B", "").length());
                    instance.setValue(attrs.get(75), values[1].length() - values[1].replace("P", "").length());

                    instances.add(instance);
                }
            }
        } catch (Exception e) { // who cares
            e.printStackTrace();
        }
        return instances;
    }
}

Fatalize

Posted 2017-03-21T15:37:38.423

Reputation: 32 976

How do you encode the input? – Nathan Merrill – 2017-03-22T12:59:06.170

@NathanMerrill I'm not sure I understand your question – Fatalize – 2017-03-22T12:59:31.357

How are you passing the test case as input to the neural network? Are you just passing in the raw string? – Nathan Merrill – 2017-03-22T13:03:36.863

@NathanMerrill Edited with a paragraph on the feature vector construction. – Fatalize – 2017-03-22T13:08:24.523

How does weka know you are trying to predict the winner? – user1502040 – 2017-03-23T19:25:23.013

The 94.5% is the result on the training data? I don't think that number is meaningful as neural network also works as some kind of hard-coding mechanism if the parameter space is large enough. Note that if the probability that white wins is close to 0.5 (which is quite true if we are not given any information), the optimal expected score should be only slightly larger than 75%, which your result on test data seems to match quite well, but not the result on training data. The result by @user1502040 seems more reliable (on validation set) and realistic. – justhalf – 2017-03-24T06:45:26.220

@justhalf This is exactly why (1) I did not use 100% of the set to train (2) have commented on OP's question to tell them that they should use a secret set of data to test all answers. I don't know how you can pull that 75% percent figure out of thin air. user1502040 uses 90% of all data for training instead of 80% as I do. – Fatalize – 2017-03-24T07:20:15.897

Sorry I didn't reference my claim earlier. The 75% figure is based on my comment in the question, responding to xnor call for confirmation on the optimal score. An optimal system that outputs the true probability will get a score of 75% if p=0.5 as can be seen in the plot I gave in the comment. If p!=0.5 (which is likely here) then the optimal score will be higher than 75%. And yes, the comparison is not fair with user1502040 since more data is used there. – justhalf – 2017-03-24T07:58:47.907

Hey, I've added a ton more test cases if you are interested. – Nathan Merrill – 2017-03-28T15:09:00.610

4

Python 3 - 84.6%, 5275 points on a validation set

If we cheat and use all the data, we can achieve an accuracy of 99.3%, and a score of 6408

Just a simple large MLP with dropout using Keras

import collections
import numpy as np
import random

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers.noise import GaussianDropout
from keras.optimizers import Adam

np.random.seed(0)
random.seed(0)

def load_data():
    with open('test_cases.txt', 'r') as f:
        for line in f:
            yield line.split(',')

def parse_data(rows):
    black_pieces = "kpbnrq"  # king, pawn, bishop, knight, rook, queen
    white_pieces = black_pieces.upper()
    for i, row in enumerate(rows):
        if len(row) >= 2:
            board = row[0]
            board = np.array([1 if c == 'W' else -1 if c == 'B' else 0 for c in board], dtype=np.float32)
            pieces = row[1]
            counts = collections.Counter(pieces)
            white_counts = np.array([counts[c] for c in white_pieces], dtype=np.float32)
            black_counts = np.array([counts[c] for c in black_pieces], dtype=np.float32)
            yield (outcome, white_counts, black_counts, board)
        else:
            if 'White' in row[0]:
                outcome = 1
            else:
                outcome = 0

data = list(parse_data(load_data()))
random.shuffle(data)
data = list(zip(*data))
y = np.array(data[0])
x = list(zip(*data[1:]))
x = np.array([np.concatenate(xi) for xi in x])

i = len(y) // 10

x_test, x_train = x[:i], x[i:]
y_test, y_train = y[:i], y[i:]

model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(76,)))
model.add(GaussianDropout(0.5))
model.add(Dense(512, activation='relu'))
model.add(GaussianDropout(0.5))
model.add(Dense(512, activation='relu'))
model.add(GaussianDropout(0.5))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='mean_squared_error', optimizer=Adam())

use_all_data = False

x_valid, y_valid = x_test, y_test

if use_all_data:
    x_train, y_train = x_test, y_test = x, y
    validation_data=None
else:
    validation_data=(x_test, y_test)

batch_size = 128

history = model.fit(x_train, y_train, batch_size=batch_size, epochs=50, verbose=1, validation_data=validation_data)

y_pred = model.predict_on_batch(x_test).flatten()
y_class = np.round(y_pred)
print("accuracy: ", np.sum(y_class == y_test) / len(y_test))

score = np.sum((y_pred - (1 - y_test)) ** 2) * (len(y) / len(y_test))
print("score: ", score)

user1502040

Posted 2017-03-21T15:37:38.423

Reputation: 2 196

How much data do you use for training to get the 84.6% figure? – Fatalize – 2017-03-22T13:44:47.007

I used a 90-10 split as shown in the code – user1502040 – 2017-03-22T14:07:22.133

Hey, I've added a ton more test cases if you are interested. – Nathan Merrill – 2017-03-28T15:08:55.143

2

Python 3 - 94.3% accuracy, 6447 points on a validation set of 20% of the data

Uses 3 neural networks, a nearest neighbours regressor, a random forest, and gradient boosting. These predictions are combined with a random forest which also has access to the data.

import collections
import numpy as np
import numpy.ma as ma
import random

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, BatchNormalization, Activation, Conv2D, Flatten
from keras.layers.noise import GaussianDropout
from keras.callbacks import EarlyStopping
from keras.optimizers import Adam
from sklearn.neighbors import KNeighborsRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.ensemble import GradientBoostingRegressor
import tensorflow

tensorflow.set_random_seed(1)
np.random.seed(1)
random.seed(1)

def load_data():
    with open('test_cases.txt', 'r') as f:
        for line in f:
            yield line.split(',')

def parse_data(rows):
    black_pieces = "kqrnbp"
    white_pieces = black_pieces.upper()
    for i, row in enumerate(rows):
        if len(row) >= 2:
            board = row[0]
            board = np.array([1 if c == 'W' else -1 if c == 'B' else 0 for c in board], dtype=np.float32)
            pieces = row[1]
            counts = collections.Counter(pieces)
            white_counts = np.array([counts[c] for c in white_pieces], dtype=np.float32)
            black_counts = np.array([counts[c] for c in black_pieces], dtype=np.float32)
            yield (outcome, white_counts, black_counts, board)
        else:
            if 'White' in row[0]:
                outcome = 1
            else:
                outcome = 0

data = list(parse_data(load_data()))
random.shuffle(data)
data = list(zip(*data))
y = np.array(data[0])
x = list(zip(*data[1:]))
conv_x = []
for white_counts, black_counts, board in x:
    board = board.reshape((1, 8, 8))
    white_board = board > 0
    black_board = board < 0
    counts = [white_counts, black_counts]
    for i, c in enumerate(counts):
        n = c.shape[0]
        counts[i] = np.tile(c, 64).reshape(n, 8, 8)
    features = np.concatenate([white_board, black_board] + counts, axis=0)
    conv_x.append(features)
conv_x = np.array(conv_x)
x = np.array([np.concatenate(xi) for xi in x])
s = x.std(axis=0)
u = x.mean(axis=0)
nz = s != 0
x = x[:,nz]
u = u[nz]
s = s[nz]
x = (x - u) / s

i = 2 * len(y) // 10

x_test, x_train = x[:i], x[i:]
conv_x_test, conv_x_train = conv_x[:i], conv_x[i:]
y_test, y_train = y[:i], y[i:]

model = Sequential()

def conv(n, w=3, shape=None):
    if shape is None:
        model.add(Conv2D(n, w, padding="same"))
    else:
        model.add(Conv2D(n, w, padding="same", input_shape=shape))
    model.add(BatchNormalization())
    model.add(Activation('relu'))

conv(128, shape=conv_x[0].shape) 
conv(128)
conv(128)
conv(128)
conv(128)
conv(128)
conv(128)
conv(128)
conv(128)
conv(128)
conv(2, w=1)
model.add(Flatten())
model.add(GaussianDropout(0.5))
model.add(Dense(256))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(GaussianDropout(0.5))
model.add(Dense(1))
model.add(BatchNormalization())
model.add(Activation('sigmoid'))

model.compile(loss='mse', optimizer=Adam())

model5 = model

model = Sequential()
model.add(Dense(50, input_shape=(x.shape[1],)))
model.add(Activation('sigmoid'))
model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss='mse', optimizer=Adam())

model0 = model

model = Sequential()
model.add(Dense(1024, input_shape=(x.shape[1],)))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(GaussianDropout(0.5))
model.add(Dense(1024))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(GaussianDropout(0.5))
model.add(Dense(1024))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(GaussianDropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss='mse', optimizer=Adam())

model4 = model

use_all_data = False

x_valid, y_valid = x_test, y_test

if use_all_data:
    x_train, y_train = x_test, y_test = x, y
    validation_data=None
else:
    validation_data=(x_test, y_test)

def subsample(x, y, p=0.9, keep_rest=False):
    m = np.random.binomial(1, p, size=len(y)).astype(bool)
    r = (x[m,:], y[m])
    if not keep_rest:
        return r
    m = ~m
    return r + (x[m,:], y[m])

epochs=100

x0, y0, x_valid, y_valid = subsample(conv_x_train, y_train, keep_rest=True)
model5.fit(x0, y0, epochs=epochs, verbose=1, validation_data=(x_valid, y_valid), callbacks=[EarlyStopping(patience=1)])

x0, y0, x_valid, y_valid = subsample(x_train, y_train, keep_rest=True)
model0.fit(x0, y0, epochs=epochs, verbose=1, validation_data=(x_valid, y_valid), callbacks=[EarlyStopping(patience=1)])

x0, y0, x_valid, y_valid = subsample(x_train, y_train, keep_rest=True)
model4.fit(x0, y0, epochs=epochs, verbose=1, validation_data=(x_valid, y_valid), callbacks=[EarlyStopping(patience=1)])

model1 = RandomForestRegressor(n_estimators=400, n_jobs=-1, verbose=1)
model1.fit(*subsample(x_train, y_train))

model2 = GradientBoostingRegressor(learning_rate=0.2, n_estimators=5000, verbose=1)
model2.fit(*subsample(x_train, y_train))

model3 = KNeighborsRegressor(n_neighbors=2, weights='distance', p=1)
model3.fit(*subsample(x_train, y_train))

models = (model0, model1, model2, model3, model4, model5)

model_names = [
    "shallow neural net",
    "random forest",
    "gradient boosting",
    "k-nearest neighbors",
    "deep neural net",
    "conv-net",
    "ensemble"
]

def combine(predictions):
    clip = lambda x: np.clip(x, 0, 1)
    return clip(np.array([y.flatten() for y in predictions]).T)

def augment(x, conv_x):
    p = combine([m.predict(x) for m in models[:-1]] + [models[-1].predict(conv_x)])
    return np.concatenate((x, p), axis=1)

model = RandomForestRegressor(n_estimators=200, n_jobs=-1, verbose=1)
model.fit(augment(x_train, conv_x_train), y_train)

def accuracy(prediction):
    class_prediction = np.where(prediction > 0.5, 1, 0)
    return np.sum(class_prediction == y_test) / len(y_test)

predictions = [m.predict(x_test).flatten() for m in models[:-1]] + [models[-1].predict(conv_x_test).flatten()]+ [model.predict(augment(x_test, conv_x_test))]

for s, p in zip(model_names, predictions):
    print(s + " accuracy: ", accuracy(p))

def evaluate(prediction):
    return np.sum(1 - (prediction - y_test) ** 2) * (len(y) / len(y_test))

for s, p in zip(model_names, predictions):
    print(s + " score: ", evaluate(p))

user1502040

Posted 2017-03-21T15:37:38.423

Reputation: 2 196

Hey, I've added a ton more test cases if you are interested. – Nathan Merrill – 2017-03-28T15:08:28.740

Woah you went out on this. – Robert Fraser – 2017-08-20T16:10:48.100

Note the java answer here that "beats" yours appears to report % on the entire data set and only gets 77% on the data it didn't train with. – Robert Fraser – 2017-08-20T16:26:03.773

0

Python 3 - 4353.25/6785 points - 64%

I worked on this mostly yesterday. It's my first golfing post, and I've only been using Python for a week or so, so forgive me if not everything is optimized.

def GetWhiteWinPercent(a):
    finalWhiteWinPercent=0
    i=a.index(',')

    #position
    board=a[:i]
    blackBoardScore=0
    whiteBoardScore=0
    for r in range(i):
        if board[r] == 'B': blackBoardScore += abs(7 - (r % 8))
        if board[r] == 'W': whiteBoardScore += r % 8
    if   whiteBoardScore > blackBoardScore: finalWhiteWinPercent += .5
    elif whiteBoardScore < blackBoardScore: finalWhiteWinPercent += .0
    else: finalWhiteWinPercent+=.25

    #pieces
    pieces=a[i:]
    s = {'q':-9,'r':-5,'n':-3,'b':-3,'p':-1,'Q':9,'R':5,'N':3,'B':3,'P':1}
    pieceScore = sum([s.get(z) for z in pieces if s.get(z) != None])
    if   pieceScore < 0: finalWhiteWinPercent += 0
    elif pieceScore > 0: finalWhiteWinPercent += .5
    else: finalWhiteWinPercent += .25

    return finalWhiteWinPercent

I ended up along the same path as seshoumara's answer to begin with. But the large number of test cases that had even counts of pieces left me dissatisfied.

So I googled traits that dictate who is winning in chess (I don't play the game myself) and noticed board position, specifically center control, is big. That's where this bit comes in.

for r in range(i):
    if board[r] == 'B': blackBoardScore += abs(7 - (r % 8))
    if board[r] == 'W': whiteBoardScore += r % 8
if   whiteBoardScore > blackBoardScore: finalWhiteWinPercent += .5
elif whiteBoardScore < blackBoardScore: finalWhiteWinPercent += .0
else: finalWhiteWinPercent+=.25

Both of those halves combined are used to find the score (0.0, 0.25, 0.50, 0.75, 1.0)

Interestingly, this extra board-position term doesn't seem to improve the chance of guessing the winner at all.

If you drop the test cases into some files, here's the testing.

whiteWins=0
blackWins=0
totalWins=0
for line in open('testcases2.txt','r'):
    totalWins += 1
    blackWins += 1 - GetWhiteWinPercent(line)
for line in open('testcases.txt','r'):
    totalWins += 1
    whiteWins += GetWhiteWinPercent(line)

print(str(whiteWins+blackWins) +'/'+str(totalWins))

I know this isn't a golf challenge, but any tips or advice in that regard is appreciated!

Datastream

Posted 2017-03-21T15:37:38.423

Reputation: 41

My answer? You mean seshoumara's answer? Also, you don't need to golf this (unless you want to). This isn't a code-golf challenge. – Nathan Merrill – 2017-03-22T12:41:23.660

You can save many bytes by only using one-character variable names. (Although it doesn't really matter because this isn't code-golf) – HyperNeutrino – 2017-03-22T12:42:54.997

Woops! Editing that now. At work, this is what I get for skimming! – Datastream – 2017-03-22T12:43:12.830

Please don't golf it. It's better to keep the code readable when it's not code-golf. – mbomb007 – 2017-03-22T14:36:37.847

Control of the middle of the board does not mean occupying the middle of the board, but attacking the middle of the board. If you wanted to add some complexity around that, it may improve your score. – Not that Charles – 2017-03-22T16:21:55.470

Out of interest I tried assigning weights to each square to indicate how valuable it was for white or black, i.e. how well occupation of the square correlated with victory. For instance, a black piece on f1 often means black wins. To predict a board I just find the piece with the most valuable prediction, and use that as the result. It's only 58.7% accurate though. – Neil – 2017-03-22T21:23:39.170