KOTH - Loaded RPS

12

1

Contest permanently opened - Updated August 10th 2017

Even though on June 5th 2017 I declared a winner (who will be kept as the best answer) I'll be rnning new bots and updating the results.

June 5th Results

Congratulations user1502040

Since there are no ties, I only show the % of matches won.

Statistician2 - 95.7%
Fitter - 89.1%
Nash - 83.9%
Weigher - 79.9%
ExpectedBayes - 76.4%
AntiRepeater - 72.1%
Yggdrasil - 65.0%
AntiGreedy - 64.1%
Reactor - 59.9%
NotHungry - 57.3%
NashBot - 55.1%
Blodsocer - 48.6%
BestOfBothWorlds - 48.4%
GoodWinning - 43.9%
Rockstar - 40.5%
ArtsyChild - 40.4%
Assassin - 38.1%
WeightedRandom - 37.7%
Ensemble - 37.4%
UseOpponents - 36.4%
GreedyPsychologist - 36.3%
TheMessenger - 33.9%
Copycat - 31.4%
Greedy - 28.3%
SomewhatHungry - 27.6%
AntiAntiGreedy - 21.0%
Cycler - 20.3%
Swap - 19.8%
RandomBot - 16.2%

I created a Google Sheet with the grid of results of each pairing: https://docs.google.com/spreadsheets/d/1KrMvcvWMkK-h1Ee50w0gWLh_L6rCFOgLhTN_QlEXHyk/edit?usp=sharing


Thanks to the Petri Dilemma I found myself able to handle this King of the Hill.

The game

The game is a simple "Rock-Paper-Scissors" with a twist: Points gained with each victory increase during the match (your R, P or S get loaded).

  • Paper wins Rock
  • Scissors wins Paper
  • Rock wins Scissors

The winner gets as many points as his load on his play.

The loser increases by 1 the load on his play.

In the case of a tie, each player increases the load on his play by 0.5.

After 100 plays, the one with more points is the winner.

e.g.: P1 has loads [10,11,12] (Rock, Paper, Scissors) and P2 [7,8,9]. P1 plays R, P2 plays P. P2 wins and gets 8 points. P1 loads become [11,11,12], P2 loads stay the same.

Challenge specifications

Your program must be written in Python (sorry, I don't know how to handle it otherwise). You are to create a function that take each of these variables as an argument on each execution:

my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history

points - Current points (yours and your opp)

loaded- Array with loads (in order RPS) (yours and your opp)

history- String with all plays, last character is the last play (yours and your opp)

You must return "R", "P" or "S". If you would return something different, it would be an automatic lose of the match.

Rules

You cannot change built-in functions.

Testing

I'll keep a Git updated with the code and all the bots compiting: https://github.com/Masclins/LoadedRPS

Judging

The winner will be decided by selecting the person with the most win matches after 1000 full round-robin. Ties will be broken by matches tied. 1000 matches are being played rather than one because I expect a lot of randomness, and that way the randomness would be less relevant.

You can submit up to 5 bots.

The contest ends on JulyJune 4th (that will be the last day I'll accept any answer), and on JulyJune 5th I'll post the final stadings (might try to post an advancemnt before).


Since this is my first KOTH, I'm 100% opened to changing anything for improvement, such as the number of matches played against each bot.

Edited to 1000 matches, since I see there really is quite randomness involved.

Masclins

Posted 2017-05-24T08:00:52.610

Reputation: 914

with some randomised bots, you actually want to make multiple games of multiple rounds – Destructible Lemon – 2017-05-24T08:31:07.070

@DestructibleLemon I thought about making each bot play three times against each other bot rather than once. Seeing you think similarly, I'll do so. – Masclins – 2017-05-24T08:35:54.010

three might not be enough, for some pairings – Destructible Lemon – 2017-05-24T08:36:51.793

1(really you need a fair large number, since some probabilites do really extend over multiple matches. see my bot, where it could get trounced, but likely wouldn't with a fair amount of matches) – Destructible Lemon – 2017-05-24T08:39:57.963

1I'm glad my question helped you be able to run this, @AlbertMasclans! – Gryphon – 2017-05-24T12:44:01.507

It would be nice to allow any language that can take input from command-line args and can output to STDOUT. I am intrested in a node.js submission. – programmer5000 – 2017-05-24T12:48:16.550

@programmer5000 it would be indeed, but I don't know how to do so. – Masclins – 2017-05-24T13:17:16.387

@AlbertMasclans is it possible to have python execute shell code and get the printed result? – programmer5000 – 2017-05-24T13:18:46.243

@programmer5000 yes. Using from subprocess import call, the function call does what you want. – Masclins – 2017-05-24T14:16:27.300

What's runcode? – CalculatorFeline – 2017-05-24T18:43:03.893

2@AlbertMasclans Can you post the full testscript (including runcode and bots)? – CalculatorFeline – 2017-05-24T18:53:31.763

Does "full round-robin" mean that each bot plays with each other bot and all results are summed? – Display Name – 2017-05-24T20:01:56.943

It would be more interesting if bots could maintain state between rounds – user1502040 – 2017-05-24T20:02:36.560

Which Python version will be used for testing? – Display Name – 2017-05-24T20:11:45.447

@CalculatorFeline Full code posted – Masclins – 2017-05-25T06:58:30.943

@SargeBorsch It means exactly what you said. I'm using Python 3. – Masclins – 2017-05-25T06:58:50.907

@AlbertMasclans code can be simplified A Lot, don't forget that python has first class functions. it will also be less error prone. – Display Name – 2017-05-25T07:00:58.407

does current testing code make bots also play with themselves? it looks like it does, and if a bot makes lots of ties when playing with itself, it reduces its chances to be first in leaderboard – Display Name – 2017-05-25T07:03:14.160

@SargeBorsch to be honest, the main reason for me to handle this KOTH is learn Python (which I just began doing), so any advice would be welcomed. And no, it doesn't make them play themselves – Masclins – 2017-05-25T07:04:06.420

oops, i looked in wrong place. yes, looks correct – Display Name – 2017-05-25T07:05:22.460

i think a time limit should be chosen otherwise someone will post a solution which practically disrupts the whole shootout – Display Name – 2017-05-25T07:15:12.013

Can you make a Github repo with all of the bots? – clismique – 2017-05-25T07:17:09.457

@Qwerp-Derp done (I think) – Masclins – 2017-05-25T07:32:27.817

How long is the script expected to take? My PC is no slouch but it's been 2 hours and the terminal still says 0%. – Neil – 2017-06-04T23:18:47.547

in my PC it took less than an hour. Still, though, in no time the first matches should be solved, since they are for quick bots. – Masclins – 2017-06-05T09:17:10.913

You said July, but you meant June, didn't you? Seems I got lucky that my bot even got accepted. Then again, I didn't realise that someone had already submitted a better Nash bot otherwise I wouldn't have bothered. And I didn't even beat NotHungry which is was my original aim... – Neil – 2017-06-09T09:49:47.817

@Neil Ooops! Yes I ment June. My deepest apologies. – Masclins – 2017-06-09T11:03:04.103

Answers

8

Statistician (no longer playing)

import random
import collections

R, P, S = moves = range(3)
move_idx = {"R": R, "P": P, "S": S}
name = "RPS"
beat = (P, S, R)
beaten = (S, R, P)

def react(_0, _1, _2, _3, _4, opp_history):
    if not opp_history:
        return random.randrange(0, 3)
    return beat[opp_history[-1]]

def anti_react(_0, _1, _2, _3, _4, opp_history):
    if not opp_history:
        return random.randrange(0, 3)
    return beaten[opp_history[-1]]

def random_max(scores):
    scores = [s + random.normalvariate(0, 1) for s in scores]
    return scores.index(max(scores))

def greedy_margin(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
    scores = [my_loaded[move] - opp_loaded[beat[move]] for move in moves]
    return random_max(scores)

def anti_greedy(my_points, opp_pints, my_loaded, opp_loaded, my_history, opp_history):
    scores = [-my_loaded[move] for move in moves]
    return random_max(scores)

def recent_stats(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
    opp_history = opp_history[-10:-1]
    counts = collections.Counter(opp_history)
    scores = [(counts[beaten[move]] + 1) * my_loaded[move] - 
              (counts[beat[move]] + 1) * opp_loaded[move] for move in moves]
    return random_max(scores)

def statistician(_0, _1, _2, _3, my_history, opp_history):
    m1 = []
    o1 = []
    my_loaded = [0] * 3
    opp_loaded = [0] * 3
    my_points = 0
    opp_points = 0
    strategies = [react, anti_react, greedy_margin, anti_greedy, recent_stats]
    strategy_scores = [0 for _ in strategies]
    for i, (mx, ox) in enumerate(zip(my_history, opp_history)):
        mx = move_idx[mx]
        ox = move_idx[ox]
        for j, strategy in enumerate(strategies):
            strategy_scores[j] *= 0.98
            move = strategy(my_points, opp_points, my_loaded, opp_loaded, m1, o1)
            if move == beat[ox]:
                strategy_scores[j] += my_loaded[move]
            elif move == beaten[ox]:
                strategy_scores[j] -= opp_loaded[ox]
        m1.append(mx)
        o1.append(ox)
        if mx == beat[ox]:
            opp_loaded[ox] += 1
            my_points += my_loaded[mx]
        elif mx == beaten[ox]:
            my_loaded[mx] += 1
            opp_points += opp_loaded[ox]
        else:
            my_loaded[mx] += 0.5
            opp_loaded[ox] += 0.5
    strategy = strategies[random_max(strategy_scores)]
    return name[strategy(my_points, opp_points, my_loaded, opp_loaded, m1, o1)]

Switches between a few simple strategies based on expected past performance

Statistician 2

import random
import collections
import numpy as np

R, P, S = moves = range(3)
move_idx = {"R": R, "P": P, "S": S}
names = "RPS"
beat = (P, S, R)
beaten = (S, R, P)

def react(my_loaded, opp_loaded, my_history, opp_history):
    if not opp_history:
        return random.randrange(0, 3)
    counts = [0, 0, 0]
    counts[beat[opp_history[-1]]] += 1
    return counts

def random_max(scores):
    scores = [s + random.normalvariate(0, 1) for s in scores]
    return scores.index(max(scores))

def argmax(scores):
    m = max(scores)
    return [s == m for s in scores]

def greedy_margin(my_loaded, opp_loaded, my_history, opp_history):
    scores = [my_loaded[move] - opp_loaded[beat[move]] for move in moves]
    return argmax(scores)

recent_counts = None

def best_move(counts, my_loaded, opp_loaded):
    scores = [(counts[beaten[move]] + 0.5) * my_loaded[move] - 
              (counts[beat[move]] + 0.5) * opp_loaded[move] for move in moves]
    return argmax(scores)

def recent_stats(my_loaded, opp_loaded, my_history, opp_history):
    if len(opp_history) >= 10:
        recent_counts[opp_history[-10]] -= 1
    recent_counts[opp_history[-1]] += 1
    return best_move(recent_counts, my_loaded, opp_loaded)

order2_counts = None

def order2(my_loaded, opp_loaded, my_history, opp_history):
    if len(my_history) >= 2:
        base0 = 9 * my_history[-2] + 3 * opp_history[-2]
        order2_counts[base0 + opp_history[-1]] += 1
    base1 = 9 * my_history[-1] + 3 * opp_history[-1]
    counts = [order2_counts[base1 + move] for move in moves]
    return best_move(counts, my_loaded, opp_loaded)

def nash(my_loaded, opp_loaded, my_history, opp_history):
    third = 1.0 / 3
    p = np.full(3, third)
    q = np.full(3, third)
    u = np.array(my_loaded)
    v = np.array(opp_loaded)
    m0 = np.zeros(3)
    m1 = np.zeros(3)
    lr = 0.2
    for _ in range(10):
        de0 = u * np.roll(q, 1) - np.roll(v * q, 2)
        de1 = v * np.roll(p, 1) - np.roll(u * p, 2)
        m0 = 0.9 * m0 + 0.1 * de0
        m1 = 0.9 * m1 + 0.1 * de1
        p += lr * m0
        q += lr * m1
        p[p < 0] = 0
        q[q < 0] = 0
        tp, tq = np.sum(p), np.sum(q)
        if tp == 0 or tq == 0:
            return np.full(3, third)
        p /= tp
        q /= tq
        lr *= 0.9
    return p

strategies = [react, greedy_margin, recent_stats, order2, nash]

predictions = strategy_scores = mh = oh = None

def statistician2func(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
    global strategy_scores, history, recent_counts, mh, oh, predictions, order2_counts
    if not opp_history:
        strategy_scores = [0 for _ in strategies]
        recent_counts = collections.Counter()
        order2_counts = collections.Counter()
        mh, oh = [], []
        predictions = None
        return random.choice(names)
    my_move = move_idx[my_history[-1]]
    opp_move = move_idx[opp_history[-1]]
    if predictions is not None:
        for j, p in enumerate(predictions):
            good = beat[opp_move]
            bad = beaten[opp_move]
            strategy_scores[j] += (my_loaded[good] * p[good] - opp_loaded[opp_move] * p[bad]) / sum(p)
    mh.append(my_move)
    oh.append(opp_move)
    predictions = [strategy(my_loaded, opp_loaded, mh, oh) for strategy in strategies]
    strategy = random_max(strategy_scores)
    p = predictions[strategy]
    r = random.random()
    for i, pi in enumerate(p):
        r -= pi
        if r <= 0:
            break
    return names[i]

Nash

import numpy as np
import random

def nashfunc(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
    third = 1.0 / 3
    p = np.full(3, third)
    q = np.full(3, third)
    u = np.array(my_loaded)
    v = np.array(opp_loaded)
    m0 = np.zeros(3)
    m1 = np.zeros(3)
    lr = 0.2
    for _ in range(10):
        de0 = u * np.roll(q, 1) - np.roll(v * q, 2)
        de1 = v * np.roll(p, 1) - np.roll(u * p, 2)
        m0 = 0.9 * m0 + 0.1 * de0
        m1 = 0.9 * m1 + 0.1 * de1
        p += lr * m0
        q += lr * m1
        p[p < 0] = 0
        q[q < 0] = 0
        tp, tq = np.sum(p), np.sum(q)
        if tp == 0 or tq == 0:
            return random.choice("RPS")
        p /= tp
        q /= tq
        lr *= 0.9
    r = random.random()
    for i, pi in enumerate(p):
        r -= pi
        if r <= 0:
            break
    return "RPS"[i]

Computes an approximate Nash equilibrium by gradient descent.

user1502040

Posted 2017-05-24T08:00:52.610

Reputation: 2 196

1I really like this approach, and can understand why you'd want to be able to keep the state between rounds. Even though I see it a huge problem to change it given the number of submissions. I'll take that into account for further challenges (which I expect to do when this finishes). – Masclins – 2017-05-25T08:51:29.740

5

Weigher

I lost track of reasoning while experimenting with the code, but the basic idea is to estimate opponent's move probability by last 3 moves using some weights and multiply them by another weight which depends on loads. I thought that I can somehow use my_loaded too, but I couldn't decide how, so left it out.

def weigher(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
    idx = {"R": 0, "P": 1, "S": 2}
    sc = [0, 0, 0]
    for i, m in enumerate(reversed(opp_history[-3:])):
        sc[idx[m]] += (1 / (1 + i))

    for i in range(3):
        sc[i] *= (opp_loaded[i] ** 2)

    return "PSR"[sc.index(max(sc))]

Satan

Probably will be disqualified, because it's kind of cheating and it makes some assumptions about the testing function (it has to have opponent's function in a variable on its stack frame), but it doesn't technically break any current rules — it doesn't redefine or rewrite anything. It simply uses black magic to execute the opponent function to see what turn did/will they do. It cannot deal with randomness, but deterministic bots have no chance to defeat Satan.

def satan(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
    import inspect, types
    f = inspect.currentframe()
    s = f.f_code.co_name
    try:
        for v in f.f_back.f_locals.values():
            if isinstance(v, types.FunctionType) and v.__name__ != s:
                try:
                    return "PSR"[{"R": 0, "P": 1, "S": 2}[
                        v(opp_points, my_points, opp_loaded, my_loaded, opp_history, my_history)]]
                except:
                    continue
    finally:
        del f

Display Name

Posted 2017-05-24T08:00:52.610

Reputation: 654

Without any doubt the best one in terms of Simplicity-Results – Masclins – 2017-05-25T12:38:36.273

By the way, to use my_loaded you could add a weight that values the move that would lose against your last move(s). That is like assuming your opponent will do something similar to what you did, and therefore punishing him for assuming you will keep playing the same. Something like: for i, m in enumerate(reversed(my_history[-3:])): sc[(idx[m]+1)%3] += (K / (1 + i)) – Masclins – 2017-05-25T14:01:01.043

@AlbertMasclans added another solution – Display Name – 2017-05-25T19:40:30.017

1I really like the Satan one. But as you said, I believe it shouldn't qualify: Even if it doesn't break any explicit rule, it's clearly against the spirit of the game. Still, congratulations on your idea! – Masclins – 2017-05-25T23:14:16.330

4

Fitter

This Bot improves Pattern and fuses it with Economist (Pattern and Economist will no longer participate)

The improvement of Pattern is that the Bot now looks for two two kinds of patterns: Opponent reacting to his last play and opponent reacting to my last play. Then evaluates both predictions to use the one that fits the best.

From that pattern the Bot has now the probability for R, P and S. Taking that into account and the expected value of each play (as Economist did), the Bot plays the one that gives the most value.

import random
import numpy as np
def fitterfunc(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
        t = len(opp_history)
        RPS = ["R","P","S"]
        if t <= 2:
                return RPS[t]
        elif t == 3:
                return random.choice(RPS)

        def n(c): return RPS.index(c)

        total_me = np.zeros(shape=(3,3))
        total_opp= np.zeros(shape=(3,3))
        p_me = np.array([[1/3]*3]*3)
        p_opp = np.array([[1/3]*3]*3)

        for i in range(1, t):
                total_me[n(my_history[i-1]), n(opp_history[i])] += 1
                total_opp[n(opp_history[i-1]), n(opp_history[i])] += 1
        for i in range(3):
                if np.sum(total_me[i,:]) != 0:
                        p_me[i,:] = total_me[i,:] / np.sum(total_me[i,:])
                if np.sum(total_opp[i,:]) != 0:
                        p_opp[i,:] = total_opp[i,:] / np.sum(total_opp[i,:])

        error_me = 0
        error_opp = 0

        for i in range(1, t):
                diff = 1 - p_me[n(my_history[i-1]), n(opp_history[i])]
                error_me += diff * diff
                diff = 1 - p_opp[n(opp_history[i-1]), n(opp_history[i])]
                error_opp += diff * diff

        if error_me < error_opp:
                p = p_me[n(my_history[-1]),:]
        else:
                p = p_opp[n(opp_history[-1]),:]


# From here, right now I weight values, though not 100% is the best idea, I leave the alternative in case I'd feel like changing it
        value = [(p[2]*my_loaded[0] - p[1]*opp_loaded[1], "R"), (p[0]*my_loaded[1] - p[2]*opp_loaded[2], "P"), (p[1]*my_loaded[2] - p[0]*opp_loaded[0], "S")]
        value.sort()

        if value[-1][0] > value[-2][0]:
                return value[-1][1]
        elif value[-1][0] > value[-3][0]:
                return random.choice([value[-1][1], value[-2][1]])
        else:
                return random.choice(RPS)

#       idx = p.tolist().index(max(p))
#       return ["P", "S", "R"][idx]

Here are the two old codes

Pattern (no longer playing)

The Pattern tries to find patterns on his opponent. It looks what the opponent had played after the last play he did (giving more weight to the latter plays). Through that, it guesses what the opponent will play, and plays the countermatch to that.

import random
import numpy as np
def patternfunc(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
        if len(opp_history) == 0:
                return random.choice(["R","P","S"])
        elif len(opp_history) == 1:
                if opp_history == "R":
                        return "P"
                elif opp_history == "P":
                        return "S"
                elif opp_history == "S":
                        return "R"

        p = np.array([1/3]*3)
        c = opp_history[-1]
        for i in range(1, len(opp_history)):
                c0 = opp_history[i-1]
                c1 = opp_history[i]
                if c0 == c:
                        p *= .9
                        if c1 == "R":
                                p[0] += .1
                        elif c1 == "P":
                                p[1] += .1
                        elif c1 == "S":
                                p[2] += .1

        idx = p.tolist().index(max(p))
        return ["P", "S", "R"][idx]

Economist (no longer playing)

The Economist does the following: Guesses the probability of each play by the opponent by watching what he had played the last 9 turns. From that, computes the expected benefit of each play and goes with the one that has the best expected value.

import random
def economistfunc(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
        if len(opp_history) == 0:
                return random.choice(["R","P","S"])
        if len(opp_history) > 9:
                opp_history = opp_history[-10:-1]
        p = [opp_history.count("R"), opp_history.count("P"), opp_history.count("S")]

        value = [(p[2]*my_loaded[0] - p[1]*opp_loaded[1], "R"), (p[0]*my_loaded[1] - p[2]*opp_loaded[2], "P"), (p[1]*my_loaded[2] - p[0]*opp_loaded[0], "S")]
        value.sort()

        if value[-1][0] > value[-2][0]:
                return value[-1][1]
        elif value[-1][0] > value[-3][0]:
                return random.choice([value[-1][1], value[-2][1]])
        else:
                return random.choice(["R","P","S"])

Masclins

Posted 2017-05-24T08:00:52.610

Reputation: 914

4

Anti-Repeater

from random import choice
def Antirepeaterfunc(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
    s = opp_history.count("S")
    r = opp_history.count("R")
    p = opp_history.count("P")

    if s>p and s>r:
        return "R"
    elif p>s and p>r:
        return "S"
    else:
        return "P"

Picks paper on the first turn, after which it returns whatever beats what the opponent has done the most, picking paper in case of a tie.

Copycat

import random
def copycatfunc(I,dont,care,about,these,enmoves):
    if not enmoves:
        return random.choice(["R","P","S"])
    else:
        return enmoves[len(enmoves)-1]

Simply copies the opponents last move.

Anti-Anti-Greedy

from random import choice
def antiantigreedy(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
    if opp_loaded[0] > opp_loaded[1] and opp_loaded[0] > opp_loaded[2]:
        return "S"
    if opp_loaded[1] > opp_loaded[0] and opp_loaded[1] > opp_loaded[2]:
        return "R"
    if opp_loaded[2] > opp_loaded[0] and opp_loaded[2] > opp_loaded[1]:
        return "P"
    else:
        return choice(["R","P","S"])

Picks whatever loses to the opponent's most heavily weighted choice.

Somewhat Hungry

from random import choice
def somewhathungryfunc(blah, blah2, load, blah3, blah4, blah5):
    if load[0] > load[1] and load[0] < load[2] or load[0] < load[1] and load[0] > load[2]:
        return "R"
    if load[1] > load[0] and load[1] < load[2] or load[1] < load[0] and load[1] > load[2]:
        return "P"
    if load[2] > load[1] and load[2] < load[0] or load[2] < load[1] and load[2] > load[0]:
        return "S"
    else:
        return choice(["R","P","S"])

Gryphon

Posted 2017-05-24T08:00:52.610

Reputation: 6 697

4

Yggdrasil

This is named "Yggdrasil" because it looks ahead in the game tree. This bot does not perform any prediction of the opponent, it simply attempts to maintain a statistical advantage if it is given one (by balancing current and future profits). It calculates an approximately ideal mixed strategy, and returns a move selected randomly with those weights. If this bot were perfect (which it's not, because the state valuation function is pretty bad and it doesn't look very far ahead), then it would be impossible to beat this bot more than 50% of the time. I don't know how well this bot will do in practice.

def yggdrasil(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
    cache = {}
    def get(turn, ml, ol):
        key = str(turn) + str(ml) + str(ol)
        if not key in cache:
            cache[key] = State(turn, ml, ol)
        return cache[key]

    def wrand(opts):
        total = sum(abs(w) for c,w in opts.items())
        while True:
            r = random.uniform(0, total)
            for c, w in opts.items():
                r -= abs(w)
                if r < 0:
                    return c
            print("error",total,r)

    class State():
        turn = 0
        ml = [1,1,1]
        ol = [1,1,1]
        val = 0
        strat = [1/3, 1/3, 1/3]
        depth = -1
        R = 0
        P = 1
        S = 2
        eps = 0.0001
        maxturn = 1000

        def __init__(self, turn, ml, ol):
            self.turn = turn
            self.ml = ml
            self.ol = ol
        def calcval(self, depth):
            if depth <= self.depth:
                return self.val
            if turn >= 1000:
                return 0
            a = 0
            b = -self.ol[P]
            c = self.ml[R]
            d = self.ml[P]
            e = 0
            f = -self.ol[S]
            g = -self.ol[R]
            h = self.ml[S]
            i = 0
            if depth > 0:
                a += get(self.turn+1,[self.ml[R]+1,self.ml[P],self.ml[S]],[self.ol[R]+1,self.ol[P],self.ol[S]]).calcval(depth-1)
                b += get(self.turn+1,[self.ml[R]+2,self.ml[P],self.ml[S]],[self.ol[R],self.ol[P],self.ol[S]]).calcval(depth-1)
                c += get(self.turn+1,[self.ml[R],self.ml[P],self.ml[S]],[self.ol[R],self.ol[P],self.ol[S]+2]).calcval(depth-1)
                d += get(self.turn+1,[self.ml[R],self.ml[P],self.ml[S]],[self.ol[R]+2,self.ol[P],self.ol[S]]).calcval(depth-1)
                e += get(self.turn+1,[self.ml[R],self.ml[P]+1,self.ml[S]],[self.ol[R],self.ol[P]+1,self.ol[S]]).calcval(depth-1)
                f += get(self.turn+1,[self.ml[R],self.ml[P]+2,self.ml[S]],[self.ol[R],self.ol[P],self.ol[S]]).calcval(depth-1)
                g += get(self.turn+1,[self.ml[R],self.ml[P],self.ml[S]+2],[self.ol[R],self.ol[P],self.ol[S]]).calcval(depth-1)
                h += get(self.turn+1,[self.ml[R],self.ml[P],self.ml[S]],[self.ol[R],self.ol[P]+2,self.ol[S]]).calcval(depth-1)
                i += get(self.turn+1,[self.ml[R],self.ml[P],self.ml[S]+1],[self.ol[R],self.ol[P],self.ol[S]+1]).calcval(depth-1)
            self.val = -9223372036854775808
            for pr in range(0,7):
                for pp in range(0,7-pr):
                    ps = 6-pr-pp
                    thisval = min([pr*a+pp*d+ps*g,pr*b+pp*e+ps*h,pr*c+pp*f+ps*i])
                    if thisval > self.val:
                        self.strat = [pr,pp,ps]
                        self.val = thisval
            self.val /= 6


            if depth == 0:
                self.val *= min(self.val, self.maxturn - self.turn)
            return self.val

    turn = len(my_history)
    teststate = get(turn, [x * 2 for x in my_loaded], [x * 2 for x in opp_loaded])
    teststate.calcval(1)
    return wrand({"R":teststate.strat[R],"P":teststate.strat[P],"S":teststate.strat[S]})

PhiNotPi

Posted 2017-05-24T08:00:52.610

Reputation: 26 739

please remove comments which don't make code more understandable – Display Name – 2017-05-24T18:41:51.737

@SargeBorsch done – PhiNotPi – 2017-05-24T18:45:29.610

1@PhiNotPi I'm aware that I posted no time limitation, but Yggdrasil is taking more than a minute against each opponent. Would it be possible to optimize it a little bit? – Masclins – 2017-05-25T06:55:26.520

yeah it's unbearably slow – Display Name – 2017-05-25T07:11:55.043

@AlbertMasclans by minute per opponent do you mean 1 minute total for all games against an opponent? Also I can try to speed it up but I don't really know how to do it, it only looks 1 move ahead as is. – PhiNotPi – 2017-05-25T10:50:14.083

@PhiNotPi yours is not the slowest, though. For future partial rankings I'll try to post also the elapsed mean time of each bot so you see how long it took. – Masclins – 2017-05-25T10:54:16.263

@AlbertMasclans I believe I have sped it up a little, although its performance might decrease. – PhiNotPi – 2017-05-25T21:45:03.497

I just timed this thing at well under 1 millsecond per move. That's pretty fast if you ask me. – PhiNotPi – 2017-05-26T00:29:37.390

3

The messenger

def themessengerfunc(I, do, not, need, these, arguments):return "P"

Rockstar

def rockstarfunc(I, do, not, need, these, arguments):return "R"

Assassin

def assassinfunc(I, do, not, need, these, arguments):return "S"

Explanation

Now, you may think that these bots are entirely stupid.

not entirely true, these are actually based on idea, of amassing a huge bonus, and the enemy making a misstep and getting walloped with it.

now, these bots play very similarly to greedy, however, they are simpler, and do not randomly pick until they get a load on one weapon, they stick with their weapon of choice.

Another thing to note: these will each beat greedy around half the time, drawing a third of the time, and losing one sixth of the time. when they do win, they will tend to win by a lot. why is this?

Greedy, until he loses a round, will randomly pick a weapon. this means that when he does not win a round, he will pick a weapon randomly again, which could be a winning one again. if greedy draws or loses, he sticks with that weapon. if greedy wins at least one round, then picks the same weapon as the bot, greedy wins. if greedy picks the losing weapon at some point, our bot wins, because the load on our weapon would have been higher than the score greedy has.

Assuming greedy doesn't always just pick the winning weapon through grand chance, this will mean that the chances are:

1/3 : { 1/2 win (1/6 total). 1/2 lose (1/6 total). }

1/3 draw

1/3 win

so: 1/3 chance to draw, 1/6 chance of a loss, 1/2 chance to win.

this probably shows that you need to do multiple games of multiple rounds

these are mainly to get the challenge rolling

Destructible Lemon

Posted 2017-05-24T08:00:52.610

Reputation: 5 908

3

Reactor

Makes the play that would have won the previous round.

import random
def reactfunc(I, dont, need, all, these, opp_history):
    if not opp_history:
        return random.choice(["R","P","S"])
    else:
        prev=opp_history[len(opp_history)-1]
        if prev == "R":
            return "P"
        if prev == "P":
            return "S"
        else:
            return "R"

KSmarts

Posted 2017-05-24T08:00:52.610

Reputation: 1 830

1You can replace opp_history[len(opp_history)-1] with opp_history[-1]. – CalculatorFeline – 2017-05-24T18:50:13.567

3

Artsy Child

This bot acts like a child playing arts and crafts, will start with paper and use either paper or scissors randomly, but will not use scissors after rock or scissors because she needs to use the scissors on paper. Will throw a rock back at anyone who throws a rock at her.

import random
def artsychildfunc(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
    if len(opp_history) == 0:
            return "P"
    elif opp_history[-1] == "R":
            return "R"
    elif my_history[-1] != "P":
            return "P"
    else:
            return random.choice(["P", "S"])

TitusLucretius

Posted 2017-05-24T08:00:52.610

Reputation: 121

2

Here the three Bots I have build for testing:


RandomBot

import random
def randombotfunc(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
        return random.choice(["R","P","S"])

Greedy

Simply chooses his most loaded option.

import random
def greedyfunc(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
        if my_loaded[0] > my_loaded[1]:
                if my_loaded[0] > my_loaded[2]:
                        return "R"
                elif my_loaded[0] < my_loaded[2]:
                        return "S"
                else:
                        return random.choice(["R","S"])
        elif my_loaded[0] < my_loaded[1]:
                if my_loaded[1] > my_loaded[2]:
                        return "P"
                elif my_loaded[1] < my_loaded[2]:
                        return "S"
                else:
                        return random.choice(["P","S"])
        else:
                if my_loaded[0] > my_loaded[2]:
                        return random.choice(["R","P"])
                elif my_loaded[0] < my_loaded[2]:
                        return "S"
                else:
                        return random.choice(["R","P","S"])

Antigreedy

Assumes opponent will play greedy and plays the winning alternative.

import random
def antigreedyfunc(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
        if opp_loaded[0] > opp_loaded[1]:
                if opp_loaded[0] > opp_loaded[2]:
                        return "P"
                elif opp_loaded[0] < opp_loaded[2]:
                        return "R"
                else:
                        return "R"
        elif opp_loaded[0] < opp_loaded[1]:
                if opp_loaded[1] > opp_loaded[2]:
                        return "S"
                elif opp_loaded[1] < opp_loaded[2]:
                        return "R"
                else:
                        return "S"
        else:
                if opp_loaded[0] > opp_loaded[2]:
                        return "P"
                elif opp_loaded[0] < opp_loaded[2]:
                        return "R"
                else:
                        return random.choice(["R","P","S"])

Masclins

Posted 2017-05-24T08:00:52.610

Reputation: 914

1

Not Hungry

def nothungryfunc(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
    if my_loaded[0] < my_loaded[1]:
            if my_loaded[0] < my_loaded[2]:
                    return "R"
            elif my_loaded[0] > my_loaded[2]:
                    return "S"
            else:
                    return random.choice(["R","S"])
    elif my_loaded[0] > my_loaded[1]:
            if my_loaded[1] < my_loaded[2]:
                    return "P"
            elif my_loaded[1] > my_loaded[2]:
                    return "S"
            else:
                    return random.choice(["P","S"])
    else:
            if my_loaded[0] < my_loaded[2]:
                    return random.choice(["R","P"])
            elif my_loaded[0] > my_loaded[2]:
                    return "S"
            else:
                    return random.choice(["R","P","S"])

This is literally the inverse of Greedy, it chooses the lowest points option available.

Teal pelican

Posted 2017-05-24T08:00:52.610

Reputation: 1 338

1

Use Opponent's Favorite

from collections import Counter
import random
def useopponents(hi, my, name, is, stephen, opp_history):
  if opp_history:
    data = Counter(opp_history)
    return data.most_common(1)[0][0]
  else:
    return random.choice(["R","P","S"])

For the first turn, chooses a random item. For every other turn, uses the opponent's most common choice. If there is a tie, it defaults to the earliest most common choice.

//I stole code from here


Winning is Good

import random
def goodwinning(no, yes, maybe, so, my_history, opp_history):
  if opp_history:
    me = my_history[len(my_history)-1]
    you = opp_history[len(opp_history)-1]
    if you == me:
      return goodwinning(no, yes, maybe, so, my_history[:-1], opp_history[:-1])
    else:
      if me == "R":
        if you == "P":
          return "P"
        else:
          return "R"
      elif me == "P":
        if you == "S":
          return "S"
        else:
          return "R"
      else:
        if you == "R":
          return "R"
        else:
          return "P"
  else:
    return random.choice(["R","P","S"])

Returns the choice of the winner of the previous round. If the previous round was a tie, recursively checks the round before that. If it was only ties, or it is the first round, returns a random choice.

Stephen

Posted 2017-05-24T08:00:52.610

Reputation: 12 293

1

Best of Both Worlds

This bot basically combines Anti-Greedy and Greedy (hence the name).

def bobwfunc(a, b, my_loaded, opp_loaded, c, d):
    opp_max = max(opp_loaded)
    opp_play = "PSR"[opp_loaded.index(opp_max)]

    my_max = max(my_loaded)
    my_play = "RPS"[my_loaded.index(my_max)]

    if opp_play == my_play:
        return opp_play
    else:
        return my_play if opp_max < my_max else opp_play

clismique

Posted 2017-05-24T08:00:52.610

Reputation: 6 600

This is the Antigreedy, already posted as an example. – Masclins – 2017-05-25T07:23:50.790

@AlbertMasclans Changed it to another bot. – clismique – 2017-05-25T07:43:36.650

find is for strings. my_loaded and opp_loaded are both lists. index should be good for what you want. – Masclins – 2017-05-25T07:48:04.497

@AlbertMasclans Whoops, fixed now. Thanks for the catch! I hope this isn't another dup... I don't want to delete this post again. – clismique – 2017-05-25T07:49:20.997

This is ok, thanks for playing – Masclins – 2017-05-25T07:54:47.393

1

NashBot

import random
def nashbotfunc(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
    r = opp_loaded[0] * opp_loaded[2]
    p = opp_loaded[0] * opp_loaded[1]
    s = opp_loaded[1] * opp_loaded[2]
    q = random.uniform(0, r + p + s) - r
    return "R" if q < 0 else "P" if q < p else "S"

Randomly chooses between the three options in such a way that the opponent statistically has no preference between moves with regards to how much it scores; in other words, both Greedy and Not Hungry should have the same average expected score against it.

Neil

Posted 2017-05-24T08:00:52.610

Reputation: 95 035

1

Expectedbayes

Edit: Updated Ranking

This is the new top ranking after inclusion of Expectedbayes:

  • statistician2func 91.89%
  • fitterfunc 85.65%
  • nashfunc 80.40%
  • weigherfunc 76.39%
  • expectedbayesfunc 73.33%
  • antirepeaterfunc 68.52%
  • ...

Explanations

(NB: post 05/06/2017 submission)

This bot tries to maximize the expected value of its next move by:

  • Computing the probability for each of the opponent's next possible move
  • Using that figure and the loads to compute the expected value for each of R,P and S
  • Selecting the move that has the highest expected value
  • Randomly selecting a value if predicting failed

Probabilities are updated every ten moves. The number of past moves used to compute the probabilities has been set to 10 for each bot (so 20 features overall). This is probably overfitting the data, but I did not try to check further.

It relies on the scikit library to compute the opponent's move probabilities (i am saying it in case I misread the rules and it was in fact not allowed).

It easily wins against bots that always make the same choice. Surprisingly, it is quite effective against the random bot with a 93% win rate (I believe this is due to the fact that it limits the number of points his opponent can get while maximizing its own number of possible points for each round).

I made a quick try with 100 turn and only a limited number of bots, and this is what I got from result_standing:

  • randombotfunc,35
  • nashbotfunc,333
  • greedyfunc,172
  • antigreedyfunc,491
  • themessengerfunc,298
  • rockstarfunc,200
  • statistician2func,748
  • fitterfunc,656
  • expectedbayesfunc,601

Which is not that bad!

from sklearn.naive_bayes import MultinomialNB
import random

#Number of past moves used to compute the probability of next move
#I did not really try to make such thing as a cross-validation, so this number is purely random
n_data = 10

#Some useful data structures
choices = ['R','P','S']
choices_dic = {'R':0,'P':1,'S':2}
point_dic = {(0,0):0,(1,1):0,(2,2):0, #Same choices
             (0,1):-1,(0,2):1, #me = rock
             (1,0):1,(1,2):-1, #me = paper
             (2,0):-1,(2,1):1} #me = scissor

def compute_points(my_choice,opp_choice,my_load,opp_load):
    """
    Compute points
    @param my_choice My move as an integer
    @param opp_choice Opponent choice as an integer
    @param my_load my_load array
    @param opp_load opp_load array
    @return A signed integer (+ = points earned, - = points losed)
    """
    points = point_dic[(my_choice,opp_choice)] #Get -1, 0 or 1
    if points > 0:
        return points*my_load[my_choice] 
    else:
        return points*opp_load[opp_choice]

#This use to be a decision tree, before I changed it to something else. Nevertheless, I kept the name
class Decision_tree:
    def __init__(self):
        self.dataX = []
        self.dataY = []
        self.clf = MultinomialNB()

    def decide(self,my_load,opp_load,my_history,opp_history):
        """
        Returns the decision as an integer

        Done through a try (if a prediction could be made) except (if not possible)
        """
        try:
            #Let's try to predict the next move
            my_h = list(map(lambda x: choices_dic[x],my_history[-n_data:-1]))
            opp_h = list(map(lambda x: choices_dic[x],opp_history[-n_data:-1]))
            pred = self.clf.predict_proba([my_h+opp_h])
            #We create a points array where keys are the available choices
            pts = []
            for i in range(3):
                #We compute the expected gain/loss for each choice
                tmp = 0
                for j in range(3):
                    tmp += compute_points(i,j,my_load,opp_load)*pred[0][j]
                pts.append(tmp)
            return pts.index(max(pts)) #We return key for the highest expected value
        except:
            return random.choice(range(3))

    def append_data(self,my_history,opp_history):
        if my_history == "":
            self.clf = MultinomialNB()
        elif len(my_history) < n_data:
            pass
        else:
            my_h = list(map(lambda x: choices_dic[x],my_history[-n_data:-1]))
            opp_h = list(map(lambda x: choices_dic[x],opp_history[-n_data:-1]))
            self.dataX = self.dataX + [my_h+opp_h]
            self.dataY = self.dataY + [choices_dic[opp_history[-1:]]]

            if len(self.dataX) >= 10:
                self.clf.partial_fit(self.dataX,self.dataY,classes=[0,1,2])

                self.dataX = []
                self.dataY = []


#Once again, this is not actually a decision tree
dt = Decision_tree()

#There we go:
def expectedbayesfunc(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
    dt.append_data(my_history,opp_history)
    choice = choices[dt.decide(my_loaded,opp_loaded,my_history,opp_history)]
    return choice

lesibius

Posted 2017-05-24T08:00:52.610

Reputation: 11

Welcome to PPCG, and nice first post! – Zacharý – 2017-08-07T23:36:47.247

Thanks a lot! I wanted to participate in PPCG for a long time. Now it's fixed! – lesibius – 2017-08-08T16:06:30.667

0

Cycler

def cycler(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
    return "RPS"[len(myhistory)%3]

0

CalculatorFeline

Posted 2017-05-24T08:00:52.610

Reputation: 2 608

0

b l o d s o c e r

s o c e r y

I gave it a fix, so it should probably work now I hope

I messed something up again so I deleted and undeleted. I am making a lot of mess ups.

def blodsocerfunc(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
    import random
    # tuned up an ready to go hopeful
    # s o c e r y
    if len(my_history) > 40 and len(set(opp_history[-30:])) == 1:
        if opp_history[-1] == "S":
            return "R"
        elif opp_history[-1] == "R":
            return "P"
        else:
            return "S"
        # against confused bots that only do one thing most of the time.
    elif len(my_history)>30 and min(opp_history.count(i) for i in "RPS")/max(opp_history.count(i) for i in "RPS") >0.8:
        return "RPS"[my_loaded.index(max(my_loaded))] # This is so if the other bot is acting errratic
                                                      # the max bonus is used for advantage
    elif len(my_history) < 10:
        if len(my_history) > 2 and all(i == "S" for i in opp_history[1:]):
            if len(my_history) > 5: return "S"
            return "P"
        return "S" # Be careful, because scissors are SHARP
    elif len(set(opp_history[1:10])) == 1 and len(my_history) < 20:
        if opp_history[1] == "S":
            return "R"
        elif opp_history[1] == "R":
            return "R"
        else:
            return "P"
    elif len(opp_history) -  max(opp_history.count(i) for i in "RPS") < 4 and len(my_history) < 30:
        if opp_history.count("R") > max(opp_history.count(i) for i in "PS"):
            return "P"
        if opp_history.count("P") > max(opp_history.count(i) for i in "RS"):
            return "S"
        if opp_history.count("S") > max(opp_history.count(i) for i in "RP"):
            return "R"
    elif len(my_history) < 15:
        if max(opp_loaded)<max(my_loaded):
            return "RPS"[len(my_history)%3]
        else:
            return "RPS"[(my_loaded.index(max(my_loaded))+len(my_history)%2)%3]
    elif len(my_history) == 15:
        if max(opp_loaded)<max(my_loaded):
            return "RPS"[(len(my_history)+1)%3]
        else:
            return "RPS"[(my_loaded.index(max(my_loaded))+ (len(my_history)%2)^1)%3]
    else:
        if max(opp_loaded)<max(my_loaded):
            return random.choice("RPS")
        else:
            return "RPS"[(my_loaded.index(max(my_loaded))+ (random.randint(0,1)))%3]

Destructible Lemon

Posted 2017-05-24T08:00:52.610

Reputation: 5 908

1if opp_history[1] == "S": return "R" elif opp_history[1] == "R": return "R" else: return "P" what sort of socery is this? – Robert Fraser – 2017-05-25T06:27:53.103

@DestructibleLemon This divides by 0: elif min(opp_history.count(i) for i in "RPS")/max(opp_history.count(i) for i in "RPS") >0.8 and len(my_history)>30: – Masclins – 2017-05-25T06:35:26.020

@AlbertMasclans I fixed that. – Destructible Lemon – 2017-05-25T10:05:25.760

@RobertFraser what exactly is outstanding about that code snippet? – Destructible Lemon – 2017-05-25T10:06:08.230

@DestructibleLemon I'm not entirely sure what you wanted to do here: "RPS"[my_loaded.index(max(my_loaded))+len(my_history)%2] but it looks out of range (and so will the further lines). – Masclins – 2017-05-25T10:10:56.463

whoops. I should fix this. will delete and undelete – Destructible Lemon – 2017-05-25T10:45:15.983

@RobertFraser *r i s k m a n g m e n t s o c e r y* – Destructible Lemon – 2017-05-26T00:05:29.033

0

Ensemble

from random import *
def f(I):
    if I==0:return "R"
    if I==1:return "P"
    return "S"
def b(I):
    if I=="R":return 0
    if I=="P":return 1
    return 2
def Ensemble(mp,op,ml,ol,mh,oh):
    A=[0]*3
    B=[0]*3
    if(len(oh)):
        k=b(oh[-1])
        A[k-2]+=0.84
        A[k]+=0.29
        for x in range(len(oh)):
            g=b(oh[x])
            B[g-2]+=0.82
            B[g]+=0.22
        s=sum(B)
        for x in range(len(B)):
            A[x]+=(B[x]*1.04/s)
        r=max(A)
    else:
        r=randint(0,3)
    return f(r)

Several competing algorithms vote on the best solution.

Swap

from random import *
def f(I):
    if I==0:return "R"
    if I==1:return "P"
    return "S"
def b(I):
    if I=="R":return 0
    if I=="P":return 1
    return 2
def Swap(mp,op,ml,ol,mh,oh):
    A=[0]*3
    B=[0]*3
    if(len(mh)):
        r=(b(mh[-1])+randint(1,2))%3
    else:
        r=randint(0,3)
    return f(r)

Does a random move, but without repeating the last move it did.

Magenta

Posted 2017-05-24T08:00:52.610

Reputation: 1 322

0

Weighted Random

Like RandomBot, but it picks only 2 to throw each time it is invoked. Will sometimes beat Rockstar or Assassin, but will pump up the scores of the other one (e.g., if it beats Rockstar, it gives Assassin a point boost).

import random

selection_set = ["R", "P", "S"]
selection_set.pop(random.randint(0,2))
def weightedrandombotfunc(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
    return random.choice(selection_set)

Andrew U Baker

Posted 2017-05-24T08:00:52.610

Reputation: 41

0

Greedy psychologist

Named that because it defaults to greedy, but if it can't decide, it counters whatever the opponent would do if they used the greedy strategy. If it still can't decide, it goes randomly.

from random import choice

def greedypsychologistfunc(my_points, opp_points, my_loaded, opp_loaded, my_history, opp_history):
    greedy = get_my_move(my_loaded)
    combined = list(set(greedy) & set(get_opp_counter(opp_loaded)))

    if len(combined) == 0:
        return choice(greedy)
    return choice(combined)

def get_indexes(lst, value):
    return [i for i,x in enumerate(lst) if x == value]

def get_my_move(my_loaded):
    return ["RPS"[i] for i in get_indexes(my_loaded, max(my_loaded))]

def get_opp_counter(opp_loaded):
    return ["PSR"[i] for i in get_indexes(opp_loaded, max(opp_loaded))]

Solomon Ucko

Posted 2017-05-24T08:00:52.610

Reputation: 439