The `History` class
===================

.. currentmodule:: trueskillthroughtime

We use the `History` class to compute the learning curves and predictions of a sequence of events.

.. autoclass:: History
    :members: learning_curves, convergence, log_evidence

Let us return to the example seen on the first page of this manual.
We define the composition of each game using the names of the agents (i.e. their identifiers).
In the following example, all agents (:code:`"a", "b", "c"`) win one game and lose the other. 
The results are implicitly defined by the order in which the teams are listed within each game: the teams appearing first in the list defeat those appearing later.
By initializing :code:`gamma = 0.0` we specify that skills do not change over time.

.. code-block::

    >>> c1 = [["a"],["b"]]
    >>> c2 = [["b"],["c"]]
    >>> c3 = [["c"],["a"]]
    >>> composition = [c1, c2, c3]
    >>> h = ttt.History(composition, gamma=0.0)
    >>> h
    History(Events=3, Batches=3, Agents=3)

After initialization, the :code:`History` class immediately instantiates a new player for each name and activates the computation of the TrueSkill estimates (not yet TrueSkill Through Time).

Learning curves
----------------

To access estimates we can call the method :code:`learning_curves()`, which returns a dictionary indexed by the names of the agents.

.. code-block::

    >>> h.learning_curves()["a"]
    [(1, N(mu=3.339, sigma=4.985)), (3, N(mu=-2.688, sigma=3.779))]
    >>> h.learning_curves()["b"]
    [(1, N(mu=-3.339, sigma=4.985)), (2, N(mu=0.059, sigma=4.218))]


Individual learning curves are lists of tuples: each tuple has the time of the estimate as the first component and the estimate itself as the second one.
Although in this example no player is stronger than the others, the TrueSkill estimates present strong variations between players.
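
Because each learning curve is a plain list of :code:`(time, estimate)` tuples, it can be split into parallel sequences for plotting or further analysis.
The following is a minimal sketch that uses a hypothetical stand-in :code:`Estimate` tuple, with fields :code:`mu` and :code:`sigma`, in place of the library's own estimate objects; the numbers are copied from the output above.

.. code-block:: python

    from collections import namedtuple

    # Hypothetical stand-in for the estimate objects (assumption: only the
    # attributes mu and sigma are needed here).
    Estimate = namedtuple("Estimate", ["mu", "sigma"])

    # Learning curve of agent "a", copied from the output shown above.
    curve_a = [(1, Estimate(3.339, 4.985)), (3, Estimate(-2.688, 3.779))]

    # Split the list of (time, estimate) tuples into parallel lists.
    times = [t for t, _ in curve_a]
    means = [est.mu for _, est in curve_a]
    sigmas = [est.sigma for _, est in curve_a]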

Convergence
------------

TrueSkill Through Time overcomes TrueSkill's inability to obtain correct estimates by allowing the information to propagate throughout the system.
To compute the TrueSkill Through Time estimates, we call the method :code:`convergence()` of the :code:`History` class.

.. code-block::

    >>> h.convergence()
    >>> h.learning_curves()["a"]
    [(1, N(mu=0.000, sigma=2.395)), (3, N(mu=-0.000, sigma=2.395))]
    >>> h.learning_curves()["b"]
    [(1, N(mu=-0.000, sigma=2.395)), (2, N(mu=-0.000, sigma=2.395))]

TrueSkill Through Time not only returns correct estimates (the same for all players) but also estimates with less uncertainty.

Model evidence
--------------

We would like to have a procedure to decide whether TrueSkill Through Time is better than other models and to choose the optimal values of the parameters :math:`\sigma` and :math:`\gamma`.
In the same way that we use probability theory to evaluate the hypotheses of a model given the data, we can also evaluate different models given the data.

:math:`P(\text{Model}|\text{Data}) \propto P(\text{Data}|\text{Model})P(\text{Model})`

where :math:`P(\text{Model})` is the prior of the models, which we define, and :math:`P(\text{Data}|\text{Model})` is the prediction made by the model.
In the special case where we have no prior preference over any model, we need only compare the predictions made by the models.

:math:`P(\text{Model}|\text{Data}) \propto P(\text{Data}|\text{Model})`

In other words, we prefer the model with the best prediction.

:math:`P(\text{Data}|\text{Model}) = P(d_1|\text{M})P(d_2|d_1,\text{M}) \dots P(d_n|d_{n-1}, \dots, d_1, \text{M})`

where :math:`\text{M}` abbreviates the model and :math:`d_i` denotes the individual data points of the data set.
The logarithm of this measure is returned by the :code:`log_evidence()` method.
Let us develop a complex synthetic example in which this measure is useful for choosing the optimal dynamic uncertainty.
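
Before moving on, the factorization above can be illustrated with a minimal sketch.
It assumes that a model has already assigned the hypothetical probabilities below to each observed outcome, given all earlier outcomes; the values are invented for illustration and do not come from any real model.
The evidence is their product, so the log evidence is the sum of their logarithms, and the geometric mean summarizes the typical prediction per data point.

.. code-block:: python

    import math

    # Hypothetical sequential predictions P(d_i | d_{i-1}, ..., d_1, M);
    # illustrative values, not produced by any real model.
    predictions = [0.5, 0.6, 0.4, 0.7]

    # Chain rule: the evidence is the product of the sequential predictions.
    evidence = math.prod(predictions)

    # Working in log space avoids underflow when there are many data points.
    log_evidence = sum(math.log(p) for p in predictions)

    # Geometric mean: the typical per-event prediction.
    geometric_mean = math.exp(log_evidence / len(predictions))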

Optimizing the dynamic factor
-----------------------------

We now analyze a scenario in which a new player joins a large community of already known players.
In this example, we focus on the estimation of an evolving skill.
For this purpose, we establish the skill of the target player to change over time following a logistic function.
The community is generated by ensuring that each opponent has a skill similar to that of the target player throughout their evolution. 
In the following code, we generate the target player's learning curve and 1000 random opponents. 

.. code-block::

    import math
    from numpy.random import normal, seed
    seed(99)  # fix the random seed so the example is reproducible
    N = 1000
    def skill(experience, middle, maximum, slope):
        return maximum/(1+math.exp(slope*(-experience+middle)))

    target = [skill(i, 500, 2, 0.0075) for i in range(N)]
    opponents = normal(target,scale=0.5)
    
    
The list :code:`target` holds the target player's skill at each moment: the values start close to zero and grow smoothly toward the maximum of two.
The array :code:`opponents` holds the randomly generated opponents' skills, drawn from Gaussian distributions centered on the target's skill at each moment with a standard deviation of 0.5.
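
As a quick sanity check on the shape of the curve, the sketch below re-evaluates the same logistic function at the start, midpoint, and end of the experience range.
It only restates the :code:`skill` function defined above and needs nothing beyond the standard library.

.. code-block:: python

    import math

    def skill(experience, middle, maximum, slope):
        # Logistic curve: half of `maximum` at `middle`, saturating at `maximum`.
        return maximum / (1 + math.exp(slope * (-experience + middle)))

    start = skill(0, 500, 2, 0.0075)       # close to zero
    midpoint = skill(500, 500, 2, 0.0075)  # half of the maximum
    end = skill(999, 500, 2, 0.0075)       # close to, but below, the maximum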


.. code-block::

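    # Every game pits the target player "a" against a new opponent.
    composition = [[["a"], [str(i)]] for i in range(N)]
    # Sample each performance around the player's skill; the higher one wins.
    results = [[1,0] if normal(target[i]) > normal(opponents[i]) else [0,1] for i in range(N)]
    times = list(range(N))
    # The opponents' skills are known, so their priors get low uncertainty.
    priors = dict([(str(i), ttt.Player(ttt.Gaussian(opponents[i], 0.2))) for i in range(N)])

    h = ttt.History(composition, results, times, priors, gamma=0.015)
    h.convergence()
    # Posterior means along the target player's learning curve.
    mu = [tp[1].mu for tp in h.learning_curves()["a"]]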

In this code, we define four variables to instantiate the class `History` to compute the target's learning curve.
The variable :code:`composition` contains 1000 games between the target player and different opponents.
The list :code:`results` is generated randomly by sampling the agents' performance following Gaussian distributions centered on their skills. The winner is the player with the highest performance.
The variable :code:`times` is a list of integer values ranging from 0 to 999 representing the time batch in which each game is located: the class :code:`History` uses the temporal distance between events to determine the amount of dynamic uncertainty (:math:`\gamma^2`) to be added between games.
The variable :code:`priors` is a dictionary used to customize player attributes: we assign low uncertainty to the opponents' priors as we know their skills beforehand.
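
To make the role of the temporal distance concrete, the sketch below assumes that the variance :math:`\gamma^2` is added once per elapsed time step, so that an estimate with standard deviation :math:`\sigma` widens to :math:`\sqrt{\sigma^2 + \Delta t\,\gamma^2}` after :math:`\Delta t` steps.
The helper :code:`widen` is hypothetical and is not part of the library's API.

.. code-block:: python

    import math

    def widen(sigma, gamma, elapsed):
        # Assumption: variances add, one gamma**2 per elapsed time step.
        return math.sqrt(sigma**2 + elapsed * gamma**2)

    sigma_same_time = widen(2.0, 0.015, 0)     # no elapsed time, unchanged
    sigma_next_step = widen(2.0, 0.015, 1)     # slightly wider after one step
    sigma_much_later = widen(2.0, 0.015, 100)  # wider still after 100 steps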

The class :code:`History` receives these four arguments and initializes the target player using the default values and a dynamic uncertainty :code:`gamma=0.015`.
Using the method :code:`convergence()`, we obtain the TrueSkill Through Time estimates and the target's learning curve.
The following figure shows the evolution of the true (solid line) and estimated (dotted line) target player's learning curves.

.. image::  ../_static/logistic0.png

The estimated learning curves remain close to the actual skill during the whole evolution.

.. code-block::

    le = h.log_evidence()

The geometric mean of the evidence is

.. code-block::

    >>> math.exp(le/h.size)
    0.51802292530

To optimize, repeat this procedure with different values of :code:`gamma` and keep the one that maximizes the :code:`log_evidence` (or, equivalently, the geometric mean).
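
A minimal sketch of such a search is shown below.
The function :code:`best_gamma` and the candidate grid are hypothetical; :code:`log_evidence_for` stands for any callable that rebuilds the :code:`History` with a given :code:`gamma`, calls :code:`convergence()`, and returns :code:`log_evidence()`.
To keep the example self-contained, a toy concave function takes its place here, so the numbers carry no empirical meaning.

.. code-block:: python

    def best_gamma(candidates, log_evidence_for):
        # Evaluate every candidate and keep the one with the highest log evidence.
        scores = {gamma: log_evidence_for(gamma) for gamma in candidates}
        return max(scores, key=scores.get), scores

    def toy_log_evidence(gamma):
        # Toy stand-in (assumption): a concave curve peaking at gamma = 0.015.
        # In practice this would build ttt.History(..., gamma=gamma), run
        # convergence(), and return log_evidence().
        return -650.0 - 1e5 * (gamma - 0.015) ** 2

    candidates = [0.005, 0.010, 0.015, 0.020, 0.025]
    gamma_star, scores = best_gamma(candidates, toy_log_evidence)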