Team
Keilin Bickar
Ayush K Singh
Monisha Singh
Populous Player Rankings
Machine Learning Course Project
CS 6140 Spring ‘16
Instructor
Lu Wang
Problem Description
Populous: The Beginning is a strategy and
god-style video game, where teams of
players create settlements and battle to
destroy the opposition.
People play 1 vs 1 and 2 vs 2 games via a
matchmaking service.
Fair games are fun games so accurate
rankings are needed to make fair matches.
Problem Description - Goal
● Create league table and rankings based on game results
● Predict the results of future games based on rankings
● Improve accuracy of game predictions
Problem Description - Inputs
Data from “Populous: Reincarnated” game database
- games.csv
- One row per game played
- General information concerning game such as map and number of players
- game_details.csv
- One row per player per game played (two to four per game)
- Stats for how well a player did such as enemy buildings destroyed and fights lost
- users.csv
- One row per user, not much besides name
- game_pops.csv
- Populations of each player taken every 15 minutes of each game
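The sample rows in this deck suggest the exports are semicolon-delimited with every field wrapped in double quotes. A minimal sketch of loading such a file with Python's standard `csv` module follows; the helper name `read_rows` and the inline sample text are illustrative assumptions, not part of the project code.

```python
import csv
import io

# One header and one data row copied from the games.csv sample in the deck.
# The real file is assumed to follow the same layout.
GAMES_CSV = '''"Id";"pack_id";"map_id";"time";"players";"rated"
"107299";"1";"0";"1330734968";"2";"11"
'''


def read_rows(text):
    """Parse semicolon-delimited, double-quoted CSV text into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";", quotechar='"')
    return list(reader)


games = read_rows(GAMES_CSV)
print(games[0]["Id"], games[0]["players"])
```

All values come back as strings, so numeric columns such as `players` need an explicit `int(...)` conversion before use.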
Sample Input Data
Games.csv
"Id";"pack_id";"map_id";"time";"players";"rated"
"107299";"1";"0";"1330734968";"2";"11"
Game_details.csv
"game_id";"user_id";"player";"ping";"tribe";"status";"length";"start_allies";"allies";"fights_won";"fights_lost";"followers_killed";"buildings_destroyed";"shamans_killed";"followers_lost";"buildings_lost";"shaman_deaths";"pop0";"pop1";"pop2";"pop3"
"107299";"38680";"0";"443";"0";"A";"1338";"1";"1";"25";"25";"46";"3";"7";"80";"1";"5";"80";"30";"0";"0"
Users.csv
"User_id";"user_regdate";"username"
"38680";"1327504496";"ADVENTURER_XD"
Game_pops.csv:
"Game_id";"user_id";"sec";"pop0";"pop1";"pop2";"pop3"
"220327";"48618";"485";"23";"32";"16";"21"
Problem Description - Outputs
● Ranking of players ordered by skill level
● Point value assigned to each player to
be used for predictions
● Ranking system that can iteratively rank
new inputs
Rank Name Points
# 1 Alice 64
# 2 Bob 53
# 3 Charlie 29
# 4 Dan 15
Related Work
Traditional Rating System
● Players on winning team gain 1 point
● Ratings of the losing team remain unaffected
● This method is currently being used in the game of Populous.
● Only requires winners to report games
Simple Ranking
● Players on winning team gain 1 point
● Players on losing team lose 1 point
● Intuitive system, widely used
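The two point schemes above can be sketched in a few lines. The function names and the use of a `defaultdict` keyed by player name are assumptions made for illustration only.

```python
from collections import defaultdict


def traditional_update(points, winners, losers):
    """Traditional system: each winner gains 1 point, losers are unchanged."""
    for player in winners:
        points[player] += 1


def simple_update(points, winners, losers):
    """Simple ranking: each winner gains 1 point, each loser loses 1 point."""
    for player in winners:
        points[player] += 1
    for player in losers:
        points[player] -= 1


traditional = defaultdict(int)
simple = defaultdict(int)
traditional_update(traditional, ["Alice", "Bob"], ["Charlie", "Dan"])
simple_update(simple, ["Alice", "Bob"], ["Charlie", "Dan"])
```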
Related Work
Elo Rating System
● Rating process takes into account prior ratings of players
● Subtracts X points from loser and gives X points to winner
● Very widely used for 1 vs 1 matchups such as Chess
● Update calculations are very fast
Glicko2 Rating System
● Rating process takes into account prior ratings of players and their experience
● Stores 𝜇 (rating), 𝜎 (volatility), and 𝜙 (rating deviation) for each player
● More recently created and used in 1 vs 1 matchups
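As a rough illustration of the Elo bullets above, the sketch below uses the conventional logistic expected-score formula with a scale of 400 and K = 32. Both constants are common defaults and are assumptions here; the deck does not state the values the team used.

```python
def elo_expected(rating_a, rating_b):
    """Expected score of player A against player B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(winner_rating, loser_rating, k=32.0):
    """Move the same number of points from the loser to the winner.

    The transfer is larger when the result was unexpected, i.e. when the
    winner's expected score was low.
    """
    delta = k * (1.0 - elo_expected(winner_rating, loser_rating))
    return winner_rating + delta, loser_rating - delta


new_winner, new_loser = elo_update(1500.0, 1500.0)
```

With equal starting ratings the expected score is 0.5, so half of K changes hands. The update is a handful of arithmetic operations, which is why it is so fast.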
Related Work
TrueSkill Rating System:
● Rating process uses Bayesian inference to compare the two teams' skill distributions
● Every time a player plays a game, the system accordingly changes the perceived skill of the player
and acquires more confidence about this perception
● Stores 𝜇 (rating) and 𝜎 (uncertainty, the standard deviation of the skill estimate) for each player
● The extent of actual updates depends on how "surprising" the outcome is to the system
● Designed to support teams of variable size along with ties and grows polynomially
● Assumes team performance is the sum of performance of the players
● Developed by Microsoft Research and used in Xbox matchmaking system
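The assumption that team performance is the sum of player performances gives a simple closed form for the chance that one team outperforms another when each skill is modelled as a Gaussian. The sketch below shows only that calculation. It is not the TrueSkill update itself, which runs approximate Bayesian inference and is not reproduced here. The function name is hypothetical, and `BETA` uses TrueSkill's usual default of 25/6, which the deck does not confirm.

```python
import math

# Per-player performance noise. 25/6 is the usual TrueSkill default (assumed here).
BETA = 25.0 / 6.0


def win_probability(team_a, team_b, beta=BETA):
    """Probability that team A outperforms team B.

    Each team is a list of (mu, sigma) pairs. Team performance is modelled as
    the sum of independent Gaussian player performances, so the difference of
    team performances is Gaussian as well.
    """
    delta_mu = sum(mu for mu, _ in team_a) - sum(mu for mu, _ in team_b)
    players = team_a + team_b
    variance = sum(sigma ** 2 for _, sigma in players) + len(players) * beta ** 2
    # Standard normal CDF evaluated at delta_mu / sqrt(variance).
    return 0.5 * (1.0 + math.erf(delta_mu / math.sqrt(2.0 * variance)))


even = win_probability([(25.0, 8.0)], [(25.0, 8.0)])
favoured = win_probability([(30.0, 2.0), (28.0, 2.0)], [(25.0, 2.0), (24.0, 2.0)])
```

Larger uncertainty on any player pulls the probability toward 0.5, which matches the idea that updates depend on how surprising a result is.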
Our Work
Address shortcomings and find optimal model
● TrueSkill suffers from “rich get richer” problem with unbalanced teams (auto rating boost)
● TrueSkill handles ties in a naive fashion ignoring the complexity of the system
● We experimented with different models like
○ numerous values of K for Elo
○ Same and Separate Feature Factors in Weighted Elo
○ Swapped constant used to standardize the logistic function in Glicko
○ Selected TrueSkill and further experimented with Weighted TrueSkill
● Find the best model, use it to rate players and predict result of future games
Methodology - Preprocessing
● Data contained around 300,000 games
● Removed games with irregularities (e.g. players crashing)
● Removed games with incomplete data (e.g. 4-player games with data from only
3 players)
● Some games had spectators but were still valid; these were converted, e.g.
from a 4-player game with 2 spectators to a 2-player 1v1 game
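One way to detect games with incomplete data is to compare the number of per-player rows against the player count recorded for the game. The sketch below assumes that the `players` column in games.csv holds the expected number of players and that game_details.csv has exactly one row per player. Both are readings of the input description rather than confirmed facts, and the function name is hypothetical.

```python
from collections import Counter


def complete_game_ids(games, details):
    """Return ids of games whose detail-row count matches the declared player count.

    games:   list of dicts with "Id" and "players" keys (as in games.csv)
    details: list of dicts with a "game_id" key (as in game_details.csv)
    """
    rows_per_game = Counter(row["game_id"] for row in details)
    return {
        game["Id"]
        for game in games
        if rows_per_game[game["Id"]] == int(game["players"])
    }


games = [{"Id": "1", "players": "2"}, {"Id": "2", "players": "4"}]
details = [
    {"game_id": "1"}, {"game_id": "1"},                    # complete 1v1
    {"game_id": "2"}, {"game_id": "2"}, {"game_id": "2"},  # 4-player game, 3 rows
]
kept = complete_game_ids(games, details)
```

Spectator handling would need an additional rule, since a valid game with spectators has fewer active players than its recorded player count.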
Methodology - Preprocessing
● Mixture of 1v1, 2v2, 1v3, etc - stripped everything but 1v1 and 2v2
○ Unbalanced games hard to rate in complex ranking systems
○ Skills for 3 vs 1 game don’t translate to 1 vs 1 or 2 vs 2 games
● 136k Remaining games:
○ 50k - 1 vs 1 games
○ 86k - 2 vs 2 games
● 3 datasets stored separately for faster loading to run experiments
○ Disk IO the main contributor to load times so smaller sets were better
● Preprocessing increased prediction accuracy from 69% to 76%
Experiments: Datasets
● Ranking System
○ Traditional Ranking
○ Simple Ranking
○ Elo rating - Modified to support 2v2
○ Glicko2 Rating - Modified to support 2v2
○ TrueSkill Rating
● Features
○ Feature Selection based on Info Gain, Gain Ratio, Correlation Feature Selection
○ Feature Weights based on Perceptron Learning algorithm, SMO, Multilayer
perceptron with Backpropagation, and Logistic Regression
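For readers unfamiliar with information gain, the sketch below computes it for a discrete feature against a win/loss label. It illustrates the criterion only. The deck does not say which tool or discretization the team used, and the toy data is invented.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Invented toy data: 1 = win, 0 = loss.
labels = [1, 1, 0, 0]
perfect = information_gain(["hi", "hi", "lo", "lo"], labels)
useless = information_gain(["a", "b", "a", "b"], labels)
```

Gain ratio divides this quantity by the entropy of the feature's own value distribution, which penalizes features with many distinct values.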
Experiments: Evaluation metrics
Ranking
● Traditional, Simple, and Elo systems use native Points
● Glicko and TrueSkill use:
○ Points = 𝜇(rating) - 3𝜎(uncertainty)
● Players sorted by Points highest to lowest
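The Points formula for Glicko and TrueSkill is a conservative estimate: a player with a high rating but high uncertainty ranks below a slightly lower-rated player whose rating is well established. A minimal sketch, with hypothetical function names and invented player values:

```python
def conservative_points(mu, sigma):
    """Points = mu - 3 * sigma, as used for Glicko and TrueSkill rankings."""
    return mu - 3.0 * sigma


def league_table(players):
    """Build a table sorted by Points, highest first.

    players: dict mapping name -> (mu, sigma)
    returns: list of (rank, name, points) tuples
    """
    scored = sorted(
        ((conservative_points(mu, sigma), name) for name, (mu, sigma) in players.items()),
        reverse=True,
    )
    return [(rank, name, points) for rank, (points, name) in enumerate(scored, start=1)]


# Invented values: Bob has the higher mu but is much less certain.
table = league_table({"Alice": (30.0, 2.0), "Bob": (32.0, 4.0)})
```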
Experiments: Evaluation metrics
Future Game Predictions
● Winning team predicted by selecting team with higher sum of Points
● Accuracy of Predictions
○ Accuracy = Correct Predictions / Total number of instances
○ Iterative calculations so all training data is used as test data
○ Order matters so cross-validation cannot be used
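The evaluation described above amounts to a predict-then-update loop: each game is predicted from the ratings as they stood before it, and only then used to update them. The sketch below assumes games are supplied in chronological order as (winners, losers) pairs and counts a tie in team Points as a wrong prediction. Both choices are assumptions, as are the function names.

```python
from collections import defaultdict


def simple_update(points, winners, losers):
    """Example rating rule: winners gain 1 point, losers lose 1 point."""
    for player in winners:
        points[player] += 1
    for player in losers:
        points[player] -= 1


def sequential_accuracy(games, update, initial=0.0):
    """Fraction of games whose winner had the higher summed Points beforehand."""
    points = defaultdict(lambda: initial)
    correct = 0
    for winners, losers in games:
        # Predict using only information available before this game.
        if sum(points[p] for p in winners) > sum(points[p] for p in losers):
            correct += 1
        # Only now reveal the result to the rating system.
        update(points, winners, losers)
    return correct / len(games) if games else 0.0


history = [(("a",), ("b",)), (("a",), ("b",)), (("b",), ("a",))]
accuracy = sequential_accuracy(history, simple_update)
```

Shuffling the games, as cross-validation would, changes the ratings each prediction sees, which is why order-preserving evaluation is used instead.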
Experiments: Baselines
Prediction Accuracies
League        Full (136,144)    1 vs 1 (50,134)   2 vs 2 (86,010)
Traditional   0.678164296627    0.683827342722    0.66064411115
Simple        0.64075537666     0.651813140783    0.643448436228
Elo           0.739577212363    0.718574221087    0.737774677363
Glicko        0.72718592079     0.714664698608    0.714463434484
TrueSkill     0.756955870255    0.742968843499    0.758051389373
Experiment - Weighted Elo
● Elo is close in score to TrueSkill, but runs much faster
● Uses “K” value to decide how many points to move between teams
● K was weighted based on features in game details
● Features and factors selected one at a time by increasing/decreasing factor
until accuracy reached maximum
● Weighting was capped to prevent small/large values from exploding ranks
Experiment - Weighted Elo
● Tested raw feature vs. ratio of winning team/losing team
○ Ratio was better
● Tested inverting ratio for winning/losing and losing/winning
○ Mixed results
● Tested adding factors to K vs. multiplying K by factors
○ Multiplying worked better
● Tested assigning different weight for winners and losers
○ Improved accuracy!
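The deck does not give the exact weighting formula, so the following is only one plausible shape for it: K is multiplied by a factor derived from the winner/loser ratio of a game statistic, and the multiplier is clamped so extreme ratios cannot blow up the ratings. The function name, the `+ 1.0` smoothing, the linear form of the multiplier, and the cap values are all assumptions.

```python
def weighted_k(base_k, winner_stat, loser_stat, factor, cap_low=0.5, cap_high=2.0):
    """Scale K by a capped multiplier based on a feature ratio.

    winner_stat / loser_stat: the same game statistic (e.g. shamans_killed)
    for the winning and losing side. factor controls how strongly the ratio
    influences K and would be tuned per feature.
    """
    # Smoothing avoids division by zero when the loser's statistic is 0 (assumption).
    ratio = (winner_stat + 1.0) / (loser_stat + 1.0)
    multiplier = 1.0 + factor * (ratio - 1.0)
    # Cap the multiplier so very small or very large ratios cannot explode ranks.
    multiplier = max(cap_low, min(cap_high, multiplier))
    return base_k * multiplier


balanced = weighted_k(32.0, 7, 7, factor=0.5)
dominant = weighted_k(32.0, 99, 0, factor=1.0)
narrow = weighted_k(32.0, 0, 99, factor=1.0)
```

Using separate factors for the winner's gain and the loser's loss, which the slides report as an improvement, would mean calling this twice with different `factor` values.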
Experiment - Weighted Elo
Notable improvements in score
Best feature was “shamans_killed” of winning team
League         Full (136,144)    1 vs 1 (50,134)   2 vs 2 (86,010)
Baseline Elo   0.739577212363    0.718574221087    0.737774677363
Weighted Elo   0.750146903279    0.725136633821    0.749098941983
Experiment - Weighted TrueSkill
● TrueSkill starts out more accurate than Elo
● Has a built-in weight for an update, ranging from 0.0 to 1.0
● Using same feature/factor as Elo resulted in negligible improvements
● Running the same process to find new feature/factors also resulted in
negligible improvements
Experiment - Weighted TrueSkill
Tested skewing the results to give weight to the player doing the most work in
2 vs 2 games. Accuracy of top four features after weighting:
Feature               Score
Unweighted            0.758051389373
followers_killed      0.758783862342
fights_won            0.758714103011
shamans_killed        0.758109522149
buildings_destroyed   0.757923497268
Results overall were positive, but small
Experiment - Value of Games
Each game has a hidden feature that is hard to calculate: its value, i.e. how
helpful the game is for ranking players.
● Tested 1 vs 1 games using Elo (for speed)
● Removed one game from test, compared accuracy to baseline
● Resulting change very small, but enough to see positive/negative
● Values normalized and stored as boolean
● Can run algorithms to classify games based on value
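A leave-one-game-out procedure like the one described can be sketched as follows. The code reruns a basic Elo evaluation with one game skipped and labels the game as valuable when accuracy drops without it. The labelling rule, the Elo constants (1500 start, K = 32, scale 400), and the function names are assumptions, since the deck gives only the outline.

```python
from collections import defaultdict


def elo_accuracy(games, k=32.0, skip=None):
    """Sequential Elo prediction accuracy over 1 vs 1 games.

    games: chronological list of (winner, loser) pairs.
    skip:  index of a game to leave out entirely (not predicted, not used
           to update ratings).
    """
    ratings = defaultdict(lambda: 1500.0)
    correct = total = 0
    for index, (winner, loser) in enumerate(games):
        if index == skip:
            continue
        total += 1
        if ratings[winner] > ratings[loser]:
            correct += 1
        expected = 1.0 / (1.0 + 10 ** ((ratings[loser] - ratings[winner]) / 400.0))
        delta = k * (1.0 - expected)
        ratings[winner] += delta
        ratings[loser] -= delta
    return correct / total if total else 0.0


def game_values(games):
    """True for each game whose removal lowers accuracy below the baseline."""
    baseline = elo_accuracy(games)
    return [elo_accuracy(games, skip=i) < baseline for i in range(len(games))]


values = game_values([("a", "b"), ("a", "b"), ("a", "b")])
```

This reruns the whole history once per game, so the cost grows quadratically with the number of games, which fits the slide's note that Elo was chosen for speed.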
Experiment - Value of Games
BinarySMO
Machine linear: showing attribute weights, not support vectors.
-0.0003 * (normalized) map
+ -0.0076 * (normalized) length
+ 0.0025 * (normalized) fights_won
+ 0.0149 * (normalized) fights_lost
+ 0.0037 * (normalized) followers_killed
+ 0.0018 * (normalized) buildings_destroyed
+ 0.0001 * (normalized) shamans_killed
+ 0.0015 * (normalized) followers_lost
+ -0.0022 * (normalized) buildings_lost
+ 0.005 * (normalized) shaman_deaths
+ -0.0002 * (normalized) fights_won
+ -0.0004 * (normalized) fights_lost
+ -0.0001 * (normalized) followers_killed
+ 0.0001 * (normalized) buildings_destroyed
+ -0.0003 * (normalized) shamans_killed
+ 0.0001 * (normalized) followers_lost
+ -0.0008 * (normalized) buildings_lost
+ 0.0013 * (normalized) shaman_deaths
+ 0.0004 * (normalized) fights_won
+ -0.0001 * (normalized) fights_lost
+ 0.0009 * (normalized) followers_killed
+ 0.0003 * (normalized) buildings_destroyed
+ 0.0004 * (normalized) shamans_killed
+ 0 * (normalized) followers_lost
+ -0.0004 * (normalized) buildings_lost
+ 0.0003 * (normalized) shaman_deaths
- 1.0003
However, everything classified as bad:
=== Confusion Matrix ===
a b <-- classified as
16405 0 | a = True
13669 0 | b = False
Some insight into the most influential factors: “fights_lost” has the largest weight
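The confusion matrix shows every instance assigned to class a, so the classifier's accuracy is simply the share of the majority class. The short calculation below makes that explicit; the helper name is an illustrative assumption.

```python
def accuracy_from_confusion(matrix):
    """Accuracy = sum of the diagonal / sum of all cells."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Rows are actual classes (a = True, b = False), columns are predicted classes.
confusion = [[16405, 0], [13669, 0]]
accuracy = accuracy_from_confusion(confusion)
print(round(accuracy, 4))
```

An accuracy of roughly 0.545 here means the model learned nothing beyond always predicting the more common label, which is consistent with the very small attribute weights.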
Experiment - Value of Games
Tested weighting 1 vs. 1 games in TrueSkill using values found:
Shows some positive results
Tested weighting all games in TrueSkill using values found:
Results against full dataset slightly negative
1v1 Base                                0.742968843499
Weighting on fights_lost                0.743287988192
All Base                                0.756955870255
Weighting on fights_lost                0.756698789517
Weighting on fights_lost only for 1v1   0.756882418616
Results
● Experimenting with different parameters did not result in large quantitative
gains in accuracy
● Overall we were able to predict outcomes about 8 percentage points better than the traditional system (0.757 vs. 0.678 on the full dataset)
● Found some features to be more important in the gameplay than others
● Model takes into account all priors, so a player’s first game is also part of the
rating
Thank You!
