King Gonzalo and the Round Ball
If you break the all-time record for most goals scored in a Serie A season, you make history.
If you do it with this goal, you are legend.
Fantasy Scout players:
Player that can still be picked (from Wikipedia):
I don’t know when (or even whether!) I’ll be able to write the next post in the “Anatomy of a good pick” series, so it may be the right time to take stock.
What have I done so far?
What questions do I want to answer next? (and how?)
“FantasySCOTUS is the leading Supreme Court Fantasy League.”
(I hoped it was about the Subtle Doctor…)
As a break between trees and forests, I ran a totally silly experiment: let’s ask scikit-learn to build a tree that tells my 76 players apart from Daniele’s 65 players.
Here is the tree (max depth 3, min samples per leaf 5):
In short, I pick players who are valued more and are younger… i.e. who are better! (so why do I perform worse than Daniele?!)
Also (I knew this!) I pick more Dutch players, and more defensive players.
Jokes aside, this tree made me doubt my belief that Daniele uses some kind of formula to choose the players to pick…
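For the curious, here is a minimal sketch of how a "whose pick is this?" tree could be set up in scikit-learn with the settings mentioned above (max depth 3, min samples per leaf 5). The data is synthetic and only mimics the 76 vs 65 split; the two feature names and their distributions are assumptions made for illustration, not the real dataset.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)

# Synthetic stand-in for the real picks: 76 "mine" and 65 "Daniele's".
# Ages and values (in millions) are invented for illustration only.
n_mine, n_daniele = 76, 65
age = np.concatenate([
    rng.integers(17, 24, n_mine),
    rng.integers(18, 28, n_daniele),
])
value_m = np.concatenate([
    rng.uniform(0, 15, n_mine),
    rng.uniform(0, 10, n_daniele),
])
X = np.column_stack([age, value_m])
y = np.array([0] * n_mine + [1] * n_daniele)  # 0 = me, 1 = Daniele

# Same constraints as in the post: at most 3 levels, at least 5 players per leaf.
clf = DecisionTreeClassifier(max_depth=3, min_samples_leaf=5, random_state=0)
clf.fit(X, y)

# Plain-text rendering of the learned splits.
print(export_text(clf, feature_names=["age", "value_m"]))
```

With real data, the first splits show which features separate the two scouts best, which is how "valued more and younger" would surface.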
(see the previous posts in this series)
As planned, I asked scikit-learn (AKA sklearn) to add a fifth level. Here is the resulting tree:
As expected, there aren’t any great insights. Let me highlight some curious things, though (references to the numbers I used to annotate the tree):
It’s official: four levels are enough.
Minimum 30 samples per leaf (previously it was 5); no depth limits.
Not so different from tree #4. It’s just that nations can’t be used anymore (too few players per nation), so they are replaced by age.
The best chance to get a good pick is: value >5M, age 22 or 23 (younger players sometimes lose their path to success?).
Enough with trees, time to move to forests.
(I guess the only way to follow it is to be familiar with the previous posts in this series.)
New tree, with the following changes to the features:
The tree is… the same as in the previous post. So changing the type of the “Position” feature had no effect.
Changed max depth to 4.
The top three levels are the same as in the previous tree, so we’ll focus on the fourth one. But before doing that, I have to tell you how position is coded. I used the same values Daniele uses for the prediction formula:
So let’s read the fourth level, starting from the right (i.e., generally speaking, going from better to worse):
One little thing that I like is that all features are used: Age, Value, Nation, Caps, Position.
Future work on decision trees: I don’t expect a fifth level to add anything meaningful, but why shouldn’t I try anyway? Also, what happens if we set a higher value (e.g. 30) for the minimum number of samples in leaf nodes?
Fantasy Scout players:
Player that can still be picked (from Wikipedia):
I did what I had announced in my previous post. More precisely, for the time being I built (well, Sklearn built…) the decision tree.
The only difference between what I actually did and what I had posted is that in the end I only used the one-hot encoded data, because I realized that the decision tree was doing things like “if Position < 1”.
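To make the point concrete, here is a small sketch of one-hot encoding with pandas. The column names and the position labels (GK, DF, MF, FW) are assumptions for illustration; the real dataset may use different ones. With a single numeric "Position" column, a tree can only split on thresholds such as "Position < 1", which implies an ordering between roles. With one 0/1 column per role, every split is simply "is this role or not".

```python
import pandas as pd

# Hypothetical mini-dataset; labels and column names are illustrative only.
players = pd.DataFrame({
    "Age": [19, 22, 24, 20],
    "Position": ["GK", "DF", "MF", "FW"],
})

# Replace the categorical column with one 0/1 indicator column per role.
encoded = pd.get_dummies(players, columns=["Position"], dtype=int)
print(encoded)
```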
Here is the decision tree (full resolution):
Even before explaining what the text and the colours mean, it’s clear that the tree is way too detailed: this is overfitting, for sure. So I told Sklearn to limit the tree to 3 levels. Then, since the resulting tree had a leaf with just one player, I told Sklearn that I didn’t want any leaf with fewer than 5 players. Here is the result (full resolution):
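As a side note, the two restrictions map onto the `max_depth` and `min_samples_leaf` parameters of scikit-learn's `DecisionTreeClassifier`. The sketch below compares an unrestricted tree with a restricted one on synthetic, deliberately noisy data; the data and the use of a classifier (rather than a regressor) are assumptions, since only the settings themselves come from the post.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)

# Synthetic data: 4 random features, label depends on the first one plus noise.
X = rng.normal(size=(300, 4))
y = (X[:, 0] + rng.normal(scale=1.0, size=300) > 0).astype(int)

# No limits: the tree keeps splitting until every leaf is pure.
full = DecisionTreeClassifier(random_state=0).fit(X, y)

# Limits from the post: at most 3 levels, at least 5 samples per leaf.
small = DecisionTreeClassifier(
    max_depth=3, min_samples_leaf=5, random_state=0
).fit(X, y)

for name, tree in [("unrestricted", full), ("restricted", small)]:
    print(name, "depth:", tree.get_depth(),
          "leaves:", tree.get_n_leaves(),
          "train accuracy:", round(tree.score(X, y), 3))
```

A perfect training score from a tree with many leaves is the usual symptom of memorising noise; the restricted tree gives up some training accuracy in exchange for splits that are easier to read.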
OK, now I can explain the text and the colours (the meaning is quite intuitive, but writing it down doesn’t hurt).
Text in each node, first line to last:
Colour is (if I’m not wrong) a function of two values:
So, what does this tree tell you?
I checked the transfer value in Euro (according to Transfermarkt) of every player the day he was picked.
Average value: 3,623,311.
Top 10 highest values:
Player | Scout | Picked | Value |
---|---|---|---|
Callejon | Daniele | 2014-11-10 | 25,000,000 |
Diego Costa | Michael H. | 2013-09-26 | 25,000,000 |
Gago | Andrea V. | 2007-01-15 | 20,000,000 |
Arteta | Pietro | 2010-05-20 | 18,000,000 |
Amauri | Daniele | 2008-05-29 | 16,000,000 |
Leno | Andrea V. | 2015-04-05 | 16,000,000 |
Frey | Mattia | 2009-10-13 | 15,000,000 |
Garcia | Abubakr | 2014-09-04 | 15,000,000 |
Roberto Firmino | Andrea V. | 2014-03-27 | 15,000,000 |
Wendell | Andrea V. | 2015-11-04 | 15,000,000 |
Considering I also have players #11, #12 and #13… I think I have found out why I suck.
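A ranking like this is easy to produce with pandas. The sketch below uses only the ten rows of the table above, so the mean it computes is the mean of the top 10, not the overall average; the DataFrame layout and variable names are assumptions.

```python
import pandas as pd

# The ten rows from the table above (values in euros).
top10 = pd.DataFrame(
    [
        ("Callejon", "Daniele", "2014-11-10", 25_000_000),
        ("Diego Costa", "Michael H.", "2013-09-26", 25_000_000),
        ("Gago", "Andrea V.", "2007-01-15", 20_000_000),
        ("Arteta", "Pietro", "2010-05-20", 18_000_000),
        ("Amauri", "Daniele", "2008-05-29", 16_000_000),
        ("Leno", "Andrea V.", "2015-04-05", 16_000_000),
        ("Frey", "Mattia", "2009-10-13", 15_000_000),
        ("Garcia", "Abubakr", "2014-09-04", 15_000_000),
        ("Roberto Firmino", "Andrea V.", "2014-03-27", 15_000_000),
        ("Wendell", "Andrea V.", "2015-11-04", 15_000_000),
    ],
    columns=["Player", "Scout", "Picked", "Value"],
)
top10["Picked"] = pd.to_datetime(top10["Picked"])

# Three most valuable picks (ties keep their original order).
highest = top10.nlargest(3, "Value")
print(highest[["Player", "Scout", "Value"]])

# How many of the top-10 picks belong to each scout.
per_scout = top10["Scout"].value_counts()
print(per_scout)

# Mean value of these ten picks only.
print(top10["Value"].mean())
```

On the full dataset, `nlargest(10, "Value")` would give the table itself, and players with no value would show up as missing entries that `mean()` skips by default.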
Players who had no value when picked: Alena, Allione, Ariaudo, Banega, Bartley, Batalla, Bell S., Beretta, Boga, Bolatti, Caldirola, Carrera, Cerri, Ciano, Ciro, Comi, Criscito, Crisetig, Danilinho, Danilo Barbosa, Danti, Destro, Driussi, Dumitru, Edouard, Fossati, Gabbiadini, Gallinetta, Geferson, Gnabry, Gomez, Guglielmi, Hewson, Immobile, Jean, Kakuta, Lamela, Laribi, Leandro Joaquim Ribeiro, Leandro Lima, Lucas Piazon, Macheda, Mammana, Mannini, Marrone, Mastour, Maurides, Mayoral, Miquel, Morosini, Muniesa, Murphy, Nahuel, Neymar, Ozyakup, Pereira G., Pezzella, Rashford, Robinson, Rodrigo Dourado, Rodriguez Jese, Rodriguez L., Salvio, Sinclair J., Sneijder, Sterling, Suso, Vadala, Valdivia, Wilson, Yildirim.
Top 10 scouts by number of non-valued players picked:
Scout | Total |
---|---|
Giovanni B. | 11 |
Jesus | 8 |
Abubakr | 5 |
Saintjust | 5 |
Alberto | 4 |
Nigel | 4 |
Daniele | 3 |
Mattia | 3 |
Pietro | 3 |
William | 3 |