#whoami 
• Christophe Willemsen 
• Bruges (Belgium) 
• Created Graphgen (graphgen.neoxygen.io) 
• @ Graph Aware
#how-it-started
#github-data-archive 
• GitHub events
• Archive available at http://www.githubarchive.org/
• One JSON events file per hour
• Approx. 10k events per hour
• Caution: the file as a whole is not valid JSON, but each row on its own is valid JSON
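Because each row is valid JSON but the file as a whole is not, the file has to be read row by row. A minimal Python sketch of that idea follows; the two sample rows and their field names are hypothetical stand-ins, not the real archive payload.

```python
import json
from typing import Iterable, Iterator


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one event dict per non-empty row of an hourly archive file."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank rows instead of failing on them
        yield json.loads(line)


# Two hypothetical rows standing in for an hourly file.
sample = '{"type": "ForkEvent", "actor": "alice"}\n{"type": "PushEvent", "actor": "bob"}\n'

events = list(iter_events(sample.splitlines()))
print([event["type"] for event in events])
```

Calling `json.loads` on the whole sample string at once would raise a `JSONDecodeError`, which is the pitfall the slide warns about.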
#event-types 
• CommitCommentEvent 
• CreateEvent 
• DeleteEvent 
• DeploymentEvent 
• DeploymentStatusEvent 
• DownloadEvent 
• FollowEvent 
• ForkEvent 
• ForkApplyEvent 
• GistEvent 
• GollumEvent 
• IssueCommentEvent 
• IssuesEvent 
• MemberEvent 
• PageBuildEvent 
• PublicEvent 
• PullRequestEvent 
• PullRequestReviewCommentEvent 
• PushEvent 
• ReleaseEvent 
• StatusEvent 
• TeamAddEvent 
• WatchEvent
#gh4j 
GitHub events importer for Neo4j
Parses each file, builds a customized Cypher statement for each event, and loads it into Neo4j
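The parse, build, load pipeline could be sketched roughly as below. This is not gh4j's actual code: the event field names (`type`, `actor`, `time`), the `BUILDERS` table and the `run` callback are all hypothetical. The sketch also passes values as query parameters (the `$name` syntax of current Neo4j versions) instead of inlining literals as the slides do.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Statement = Tuple[str, dict]  # (Cypher text, parameters)


def build_fork_event(event: dict) -> Statement:
    # Hypothetical field names; the real archive payload is richer.
    query = (
        "MERGE (u:User {name: $actor}) "
        "CREATE (ev:ForkEvent {time: $time}) "
        "MERGE (u)-[:DO]->(ev)"
    )
    return query, {"actor": event["actor"], "time": event["time"]}


# One builder per supported event type; unsupported types are skipped.
BUILDERS: Dict[str, Callable[[dict], Statement]] = {"ForkEvent": build_fork_event}


def build_statement(event: dict) -> Optional[Statement]:
    builder = BUILDERS.get(event.get("type"))
    return builder(event) if builder else None


def import_events(events: Iterable[dict], run: Callable[[str, dict], None]) -> int:
    """Build a statement per event and hand it to `run`; return how many were sent."""
    loaded = 0
    for event in events:
        statement = build_statement(event)
        if statement is None:
            continue
        run(*statement)
        loaded += 1
    return loaded


sent = []  # stands in for a database session
count = import_events(
    [
        {"type": "ForkEvent", "actor": "rudymalhi", "time": 1401606379},
        {"type": "WatchEvent", "actor": "someone", "time": 1401606380},
    ],
    run=lambda query, params: sent.append((query, params)),
)
print(count, sent[0][1])
```

Keeping `run` as a plain callback leaves the choice of Neo4j client open and makes the builders easy to test without a database.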
#PullRequestEvent 
Payload and information from the past
• You get information about the PR
• You can also derive information about the repo, e.g. who owns it
• The branch involved
• Depending on the PR action (open/close/merge), you can determine, for a close/merge, who first opened the PR and which fork it comes from
MERGE (u:User {name:'pixelfreak2005'}) 
CREATE (ev:PullRequestEvent {time:toInt(1401606356) }) 
MERGE (u)-[:DO]->(ev) 
MERGE (pr:PullRequest {html_url:'https://github.com/pixelfreak2005/liqiud_android_packages_apps_Settings/pull/2'}) 
SET pr += { id:toInt(16573622), number:toInt(2), state:'open'} 
MERGE (ev)-[:PR_OPEN]->(pr) 
MERGE (ow:User {name:'pixelfreak2005'}) 
MERGE (or:Repository {id:toInt(20338536), name:'liqiud_android_packages_apps_Settings'}) 
MERGE (or)-[:OWNED_BY]->(ow) 
MERGE (pr)-[:PR_ON_REPO]->(or)
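The statement above uses `PR_OPEN` for an opened PR, and the queries later in the deck rely on `PR_MERGE`. One way the importer might pick the relationship type from the PR action is sketched below; the action strings and the `PR_CLOSE` name are assumptions, not taken from the slides.

```python
# Hypothetical mapping from PR action to relationship type.
# PR_OPEN and PR_MERGE appear in the slides; PR_CLOSE is assumed.
PR_RELATIONSHIPS = {
    "open": "PR_OPEN",
    "close": "PR_CLOSE",
    "merge": "PR_MERGE",
}


def pr_event_clause(action: str) -> str:
    """Return the Cypher clause linking the event node to the PR node."""
    try:
        rel = PR_RELATIONSHIPS[action]
    except KeyError:
        raise ValueError(f"unsupported pull request action: {action!r}") from None
    return f"MERGE (ev)-[:{rel}]->(pr)"


print(pr_event_clause("open"))
```

Relationship types cannot be passed as Cypher parameters, so the type has to be placed in the query text; looking it up in a fixed table keeps arbitrary payload strings out of the statement.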
#ForkEvent 
MERGE (u:User {name:'rudymalhi'}) 
CREATE (ev:ForkEvent {time:toInt(1401606379) })
MERGE (u)-[:DO]->(ev)
CREATE (fork:Fork:Repository {name:'Full-Stack-JS-Nodember'}) 
MERGE (ev)-[:FORK]->(fork)-[:OWNED_BY]->(u) 
MERGE (bro:User {name:'mgenev'}) 
MERGE (br:Repository {id:toInt(15503488), name:'Full-Stack-JS-Nodember'})-[:OWNED_BY]->(bro) 
MERGE (fork)-[:FORK_OF]->(br)
#IssueCommentEvent 
You can check whether the issue is related to a PR and build the complete PR schema
MERGE (u:User {name:'johanneswilm'}) 
CREATE (ev:IssueCommentEvent {time:toInt(1401606384) }) 
MERGE (u)-[:DO]->(ev) 
MERGE (comment:IssueComment {id:toInt(44769338)}) 
MERGE (ev)-[:ISSUE_COMMENT]->(comment) 
MERGE (issue:Issue {id:toInt(34722578)}) 
MERGE (repo:Repository {id:toInt(14487686)}) 
MERGE (comment)-[:COMMENT_ON]->(issue)-[:ISSUE_ON]->(repo) 
SET repo.name = 'diffDOM' 
MERGE (owner:User {name:'fiduswriter'}) 
MERGE (comment)-[:COMMENT_ON]->(issue)-[:ISSUE_ON]->(repo)-[:OWNED_BY]->(owner)
Let’s have some fun and try some queries 
demo
Who generated the most events?
MATCH (u:User)-[r:DO]->() 
RETURN u.name, count(r) as events 
ORDER BY events DESC 
LIMIT 1
Which repo has been touched the most?
MATCH (repo:Repository)<-[r]-()
RETURN repo.name, count(r) as touches
ORDER BY touches DESC
LIMIT 1
Which repo has been forked the most?
MATCH (repo:Repository)<-[:FORK_OF]-(fork:Fork)<-[:FORK]-(event:ForkEvent)
RETURN repo.name, count(event) as forks 
ORDER BY forks DESC 
LIMIT 1
Which repo has the most merged PRs?
MATCH (repo:Repository)<-[:PR_ON_REPO]-(pr:PullRequest)<-[merge:PR_MERGE]-()
RETURN repo.name, count(merge) as merges 
ORDER BY merges DESC 
LIMIT 1
How many forks result in an open PR?
MATCH p=(u:User)-[:DO]->(fe:ForkEvent)-[:FORK]->(fork:Fork)
-[:FORK_OF]->(repo:Repository)<-[:PR_ON_REPO]-(pr:PullRequest)
-[:PR_OPEN]-(pre:PullRequestEvent)<-[:DO]-(u2:User)<-[:OWNED_BY]-(f2:Fork)
<-[:BRANCH_OF]-(br:Branch)<-[:FROM_BRANCH]-(pr2:PullRequest)
WHERE u = u2 AND fork = f2 AND pr = pr2 
RETURN count(p)
Average number of comments on a PR before it is merged?
MATCH p=(ice:IssueCommentEvent)-[:ISSUE_COMMENT]->(comment:IssueComment) 
-[:COMMENT_ON]->(issue:Issue)-[:BOUND_TO_PR]->(pr:PullRequest) 
<-[:PR_MERGE]-(pre:PullRequestEvent) 
WHERE ice.time <= pre.time 
WITH pr, count(comment) as comments 
RETURN avg(comments)
Top contributor?
Which user has the most merged PRs on repositories they do not own?
MATCH (u:User)-[r:DO]->(fe:PullRequestEvent)-[:PR_OPEN]->(pr:PullRequest {state:'merged'}) 
-[:PR_ON_REPO]-(repo:Repository)-[:OWNED_BY]->(u2:User) 
WHERE NOT u = u2 
RETURN u.name, count(r) as prs 
ORDER BY prs DESC 
LIMIT 1
Relate users who have merged PRs on the same repositories; this could serve as a follow-recommendation engine!
MATCH p=(u:User)-[:DO]-(e:PullRequestEvent)-->(pr:PullRequest {state:'merged'})
-[:PR_ON_REPO]->(r:Repository)<-[:PR_ON_REPO]-(pr2:PullRequest {state:'merged'})
--(e2:PullRequestEvent)<-[:DO]-(u2:User)
WHERE NOT u = u2 
WITH nodes(p) as coll 
WITH head(coll) as st, last(coll) as end 
MERGE (st)-[r:HAVE_WORKED_ON_SAME_REPO]-(end) 
ON MATCH SET r.w = (r.w) + 1 
ON CREATE SET r.w = 1
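The weighting idea behind `HAVE_WORKED_ON_SAME_REPO` can also be illustrated outside the database. The Python sketch below is a simplified in-memory analogue, assuming a hypothetical list of `(user, repo)` pairs with one entry per merged PR. It counts each pair of PRs once, so the absolute weights may differ from what the Cypher statement stores, since a path there can be matched from either end.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, Tuple


def co_contributor_weights(merged_prs: Iterable[Tuple[str, str]]) -> Counter:
    """Count, per pair of distinct users, the pairs of merged PRs on a shared repo."""
    users_by_repo = {}
    for user, repo in merged_prs:
        users_by_repo.setdefault(repo, []).append(user)

    weights = Counter()
    for users in users_by_repo.values():
        for a, b in combinations(users, 2):
            if a != b:  # mirrors WHERE NOT u = u2
                weights[frozenset((a, b))] += 1
    return weights


# Hypothetical sample data: one (user, repo) entry per merged PR.
prs = [("alice", "r1"), ("bob", "r1"), ("alice", "r1"), ("carol", "r2"), ("bob", "r2")]
weights = co_contributor_weights(prs)
print(weights[frozenset(("alice", "bob"))], weights[frozenset(("bob", "carol"))])
```

A higher weight suggests two users contribute to the same repositories more often, which is the signal a follow recommendation would rank on.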
QUESTIONS ?
• More queries in the gist file: https://gist.github.com/ikwattro/071d36f135131e8e4442
• Not valid with the GitHub live API (different payload)
• Zipped DB file: http://bit.ly/1BaMCy9
THANK YOU 
@ikwattro
Average time between a repo being forked and that fork resulting in an opened PR?
MATCH p=(u:User)-[:DO]->(fe:ForkEvent)-[:FORK]->(fork:Fork)
-[:FORK_OF]->(repo:Repository)<-[:PR_ON_REPO]-(pr:PullRequest)
-[:PR_OPEN]-(pre:PullRequestEvent)<-[:DO]-(u2:User)<-[:OWNED_BY]-(f2:Fork)
<-[:BRANCH_OF]-(br:Branch)<-[:FROM_BRANCH]-(pr2:PullRequest)
WHERE u = u2 AND fork = f2 AND pr = pr2 
RETURN count(p), avg(pre.time - fe.time) as offsetTime

Graphing GitHub data to analyze open source collaboration

  • 1.
  • 2. #whoami • Christophe Willemsen • Bruges (Belgium) • Created Graphgen (graphgen.neoxygen.io) ! ! ! ! • @ Graph Aware
  • 4. #github-data-archive • Github Events • archive available @ http://www.githubarchive.org/ • events json files per hour • approx. 10k events per hour • ! the file in itself is not valid json, all file rows are valid json
  • 5. #event-types • CommitCommentEvent • CreateEvent • DeleteEvent • DeploymentEvent • DeploymentStatusEvent • DownloadEvent • FollowEvent • ForkEvent • ForkApplyEvent • GistEvent • GollumEvent • IssueCommentEvent • IssuesEvent • MemberEvent • PageBuildEvent • PublicEvent • PullRequestEvent • PullRequestReviewCommentEvent • PushEvent • ReleaseEvent • StatusEvent • TeamAddEvent • WatchEvent
  • 6. #gh4j
    GitHub Events importer for Neo4j
    Parse the file + build a customized Cypher statement for each event + load into Neo4j
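The "build a statement per event" step can be pictured as a dispatch table from event type to a builder function. This is a hedged Python sketch, not gh4j's implementation: the builder names, the flat event dict (`type`, `actor`, `time`, `repo_name`) and the table itself are assumptions made for illustration. The Cypher text mirrors the first half of the ForkEvent statement on slide 9, but passes values as `$` parameters (the syntax of current Neo4j versions) instead of inlining literals as the slides do.

```python
def build_fork_event(event):
    """Return (cypher, params) for a fork event, using parameters instead of literals."""
    cypher = (
        "MERGE (u:User {name: $actor}) "
        "CREATE (ev:ForkEvent {time: $time}) "
        "MERGE (u)-[:DO]->(ev) "
        "CREATE (fork:Fork:Repository {name: $repo_name}) "
        "MERGE (ev)-[:FORK]->(fork)-[:OWNED_BY]->(u)"
    )
    params = {
        "actor": event["actor"],
        "time": int(event["time"]),
        "repo_name": event["repo_name"],
    }
    return cypher, params


# Hypothetical dispatch table: one builder per supported event type.
BUILDERS = {"ForkEvent": build_fork_event}


def build_statement(event):
    """Pick the builder for the event type; return None for unsupported types."""
    builder = BUILDERS.get(event["type"])
    if builder is None:
        return None
    return builder(event)


cypher, params = build_statement(
    {
        "type": "ForkEvent",
        "actor": "rudymalhi",
        "time": 1401606379,
        "repo_name": "Full-Stack-JS-Nodember",
    }
)
```

Keeping the statement text constant and moving values into parameters avoids quoting problems with user-supplied names and lets the database reuse the query plan.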
  • 7. #PullRequestEvent
    Payload and information from the past
    • You get information about the PR
    • You can also build information about the repo, e.g. who owns it
    • On which branch
    • Depending on the PR action (open/close/merge), you can determine for a close/merge who first opened the PR and which fork it comes from
  • 8. MERGE (u:User {name:'pixelfreak2005'})
    CREATE (ev:PullRequestEvent {time:toInt(1401606356) })
    MERGE (u)-[:DO]->(ev)
    MERGE (pr:PullRequest {html_url:'https://github.com/pixelfreak2005/liqiud_android_packages_apps_Settings/pull/2'})
    SET pr += { id:toInt(16573622), number:toInt(2), state:'open'}
    MERGE (ev)-[:PR_OPEN]->(pr)
    MERGE (ow:User {name:'pixelfreak2005'})
    MERGE (or:Repository {id:toInt(20338536), name:'liqiud_android_packages_apps_Settings'})
    MERGE (or)-[:OWNED_BY]->(ow)
    MERGE (pr)-[:PR_ON_REPO]->(or)
  • 9. #ForkEvent
    MERGE (u:User {name:'rudymalhi'})
    CREATE (ev:ForkEvent {time:toInt(1401606379) })
    MERGE (u)-[:DO]->(ev)
    CREATE (fork:Fork:Repository {name:'Full-Stack-JS-Nodember'})
    MERGE (ev)-[:FORK]->(fork)-[:OWNED_BY]->(u)
    MERGE (bro:User {name:'mgenev'})
    MERGE (br:Repository {id:toInt(15503488), name:'Full-Stack-JS-Nodember'})-[:OWNED_BY]->(bro)
    MERGE (fork)-[:FORK_OF]->(br)
  • 10. #IssueCommentEvent
    You can check whether the issue is related to a PR and build the complete PR schema
    MERGE (u:User {name:'johanneswilm'})
    CREATE (ev:IssueCommentEvent {time:toInt(1401606384) })
    MERGE (u)-[:DO]->(ev)
    MERGE (comment:IssueComment {id:toInt(44769338)})
    MERGE (ev)-[:ISSUE_COMMENT]->(comment)
    MERGE (issue:Issue {id:toInt(34722578)})
    MERGE (repo:Repository {id:toInt(14487686)})
    MERGE (comment)-[:COMMENT_ON]->(issue)-[:ISSUE_ON]->(repo)
    SET repo.name = 'diffDOM'
    MERGE (owner:User {name:'fiduswriter'})
    MERGE (comment)-[:COMMENT_ON]->(issue)-[:ISSUE_ON]->(repo)-[:OWNED_BY]->(owner)
  • 11. Let’s have some fun and try some queries! (demo)
  • 12. Who generated the most events?
    MATCH (u:User)-[r:DO]->()
    RETURN u.name, count(r) as events
    ORDER BY events DESC
    LIMIT 1
  • 13. Which repo has been touched the most?
    MATCH (repo:Repository)<-[r]-()
    RETURN repo.name, count(r) as touchs
    ORDER BY touchs DESC
    LIMIT 1
  • 14. Which repo has been forked the most?
    MATCH (repo:Repository)<-[:FORK_OF]-(fork:Fork)<-[:FORK]-(event:ForkEvent)
    RETURN repo.name, count(event) as forks
    ORDER BY forks DESC
    LIMIT 1
  • 15. Which repo has the most merged PRs?
    MATCH (repo:Repository)<-[:PR_ON_REPO]-(pr:PullRequest)<-[merge:PR_MERGE]-()
    RETURN repo.name, count(merge) as merges
    ORDER BY merges DESC
    LIMIT 1
  • 16. How many forks result in an open PR?
    MATCH p=(u:User)-[:DO]->(fe:ForkEvent)-[:FORK]->(fork:Fork)
    -[:FORK_OF]->(repo:Repository)<-[:PR_ON_REPO]-(pr:PullRequest)
    -[:PR_OPEN]-(pre:PullRequestEvent)<-[:DO]-(u2:User)<-[:OWNED_BY]-
    (f2:Fork)<-[:BRANCH_OF]-(br:Branch)<-[:FROM_BRANCH]-(pr2:PullRequest)
    WHERE u = u2 AND fork = f2 AND pr = pr2
    RETURN count(p)
  • 18. Average number of comments on a PR before the PR is merged?
    MATCH p=(ice:IssueCommentEvent)-[:ISSUE_COMMENT]->(comment:IssueComment)
    -[:COMMENT_ON]->(issue:Issue)-[:BOUND_TO_PR]->(pr:PullRequest)
    <-[:PR_MERGE]-(pre:PullRequestEvent)
    WHERE ice.time <= pre.time
    WITH pr, count(comment) as comments
    RETURN avg(comments)
  • 19. Top contributor? Which user has the most merged PRs on repositories they do not own?
    MATCH (u:User)-[r:DO]->(fe:PullRequestEvent)-[:PR_OPEN]->(pr:PullRequest {state:'merged'})
    -[:PR_ON_REPO]-(repo:Repository)-[:OWNED_BY]->(u2:User)
    WHERE NOT u = u2
    RETURN u.name, count(r) as prs
    ORDER BY prs DESC
    LIMIT 1
  • 20. Relate users who have merged PRs on the same repositories; this could serve as a follow-recommendation engine!
    MATCH p=(u:User)-[:DO]-(e:PullRequestEvent)-->(pr:PullRequest {state:'merged'})
    -[:PR_ON_REPO]->(r:Repository)<-[:PR_ON_REPO]-(pr2:PullRequest {state:'merged'})
    --(e2:PullRequestEvent)<-[:DO]-(u2:User)
    WHERE NOT u = u2
    WITH nodes(p) as coll
    WITH head(coll) as st, last(coll) as end
    MERGE (st)-[r:HAVE_WORKED_ON_SAME_REPO]-(end)
    ON MATCH SET r.w = (r.w) + 1
    ON CREATE SET r.w = 1
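The weighting idea behind `HAVE_WORKED_ON_SAME_REPO` (create the link with weight 1, then add 1 each time the same pair is seen again) can be illustrated outside the database. The Python sketch below is a simplified in-memory analogue under stated assumptions: the `(user, repository)` records are hypothetical, and each pair of merged PRs is counted once, so the numbers are not meant to reproduce the exact weights the Cypher query would write.

```python
from collections import Counter
from itertools import combinations

# Hypothetical (user, repository) records, one per merged pull request.
merged_prs = [
    ("alice", "repo-a"),
    ("bob", "repo-a"),
    ("carol", "repo-a"),
    ("alice", "repo-b"),
    ("bob", "repo-b"),
]


def co_contribution_weights(records):
    """Count, per unordered user pair, how many merged-PR pairs share a repository."""
    users_by_repo = {}
    for user, repo in records:
        users_by_repo.setdefault(repo, []).append(user)

    weights = Counter()
    for users in users_by_repo.values():
        for first, second in combinations(users, 2):
            if first != second:  # same role as WHERE NOT u = u2
                weights[frozenset((first, second))] += 1
    return weights


weights = co_contribution_weights(merged_prs)
```

A higher weight means two users were merged into the same repositories more often, which is the signal a follow recommendation would rank on.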
  • 22.
    • More queries in the gist file: https://gist.github.com/ikwattro/071d36f135131e8e4442
    • Not valid with the GitHub live API (different payload)
    • Zipped db file: http://bit.ly/1BaMCy9
  • 24. Average time between a repo being forked and that fork resulting in an opened PR?
    MATCH p=(u:User)-[:DO]->(fe:ForkEvent)-[:FORK]->(fork:Fork)-[:FORK_OF]
    ->(repo:Repository)<-[:PR_ON_REPO]-(pr:PullRequest)-[:PR_OPEN]-
    (pre:PullRequestEvent)
    <-[:DO]-(u2:User)<-[:OWNED_BY]-(f2:Fork)<-[:BRANCH_OF]-(br:Branch)<-
    [:FROM_BRANCH]-(pr2:PullRequest)
    WHERE u = u2 AND fork = f2 AND pr = pr2
    RETURN count(p), avg(pre.time - fe.time) as offsetTime
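The `offsetTime` value is an average of differences between two event timestamps, which the import statements store as Unix seconds. As a small worked example of that arithmetic, here is a Python sketch; the timestamp pairs are invented for illustration and are not taken from the dataset.

```python
from datetime import timedelta
from statistics import mean

# Hypothetical (fork_time, pr_open_time) pairs in Unix seconds,
# playing the role of fe.time and pre.time for each matched path.
pairs = [
    (1401606379, 1401609979),
    (1401606400, 1401613600),
]

# Same computation as avg(pre.time - fe.time) in the query.
offsets = [pr_open_time - fork_time for fork_time, pr_open_time in pairs]
offset_time = mean(offsets)

# Seconds are hard to read; a timedelta gives hours:minutes:seconds.
readable = str(timedelta(seconds=offset_time))
```

With these sample pairs the offsets are 3600 and 7200 seconds, so the average is 5400 seconds, i.e. one and a half hours.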