ICWSM’11 Tutorial
Exploratory Network Analysis with Gephi




             Instructors: Sébastien Heymann, Julian Bilcke
                  seb@gephi.org, julian.bilcke@gephi.org

                     July 17, 2011 | 1 PM - 4 PM
Exploratory Network Analysis with Gephi


This tutorial is an introduction to Gephi, an open source software for network
visualization and manipulation.

Gephi aims to cover the complete chain from data import to aesthetic
refinement and interaction.

Users interact with the visualization and manipulate structures, shapes
and colors to reveal hidden properties.

The goal is to help data analysts form hypotheses and intuitively discover
patterns or errors in large data collections.




At the end, participants will walk away with the practical knowledge
enabling them to use Gephi for their own projects.
Exploratory Network Analysis with Gephi


It starts with a brief introduction on the network exploration process and
a hands-on demonstration of the essential functionalities of Gephi.

Participants are guided step by step through the complete chain of
representation, manipulation, layout, analysis and aesthetic refinement.
Next, teams work on real datasets.

They finally present their preliminary results. The tutorial concludes with
a general question and answer session.




Requirements


Bring your own laptop with Java and Gephi installed.
Gephi should be updated (menu Help > Check for Updates).

Bring a mouse with a wheel.

Bring a dataset of your own if you want, and verify beforehand that it loads in Gephi.[1]




[1] http://gephi.org/users/supported-graph-formats/
Workshop Schedule - Part I


Exploratory Network Analysis

• Exploratory Data Analysis
• Exploratory Network Analysis
• Looking for Orderness in Data
• Examples
• Guideline

Introduction to Gephi

• Approach and Community
• Networked Data
• Quick Start Demo

        * 30 min break *
Workshop Schedule - Part II


Hands-On!

• Team Work on a Dataset
• Presentation of Preliminary Results

Q&A
Exploratory Data Analysis




   Confirmatory -> results
   Exploratory  -> intuition
   Serendipity  -> surprise

 “The greatest value of a picture is when it forces us
     to notice what we never expected to see”

 Started with John Tukey (1962)
Exploratory Data Analysis




                                 Non-linear processing chain of Ben Fry
                            in Computational Information Design (2004)
Dummy Example


Observation:
visually salient peaks at specific file sizes

External knowledge:
these sizes correspond to films

New hypothesis on data:
films are highly exchanged, so the study might dig in this direction

Figure: P2P file size distribution (Latapy et al., 2008)
Exploratory Network Analysis




 1   See the network
     1st graph viz tool: Pajek (1996)
     Vladimir Batagelj, Andrej Mrvar

 2   Interact in real time
     Gephi prototype (2008)
     group, filter, compute metrics...

 3   Build a visual language
     size by rank, color by partition,
     label, curved edges, thickness...
Looking for a “Simple Small Truth”?




Drew Conway, What Data Visualization Should Do:
1. Make complex things simple
2. Extract small information from large data
3. Present truth, do not deceive
http://www.dataists.com/2010/10/what-data-visualization-should-do-simple-small-truth/
Looking for Orderness in Data


Vary three cursors simultaneously to extract meaningful patterns:

MICRO level <-> MACRO level      at different levels
1 dimension <-> N dimensions     on multiple dimensions
T+0 <-> T+N                      over time
“Zoom” cursor on Quantitative Data

MICRO level   MACRO level

                            Global
                            - connectivity
                            - density
                            - centralization

                            Local
                            - communities
                            - bridges between communities
                            - local centers vs periphery

                            Individual
                            - centrality
                            - distances
                            - neighborhood
                            - location
                            - local authority vs hub
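
The global and individual measures listed above can be illustrated with a minimal sketch. It assumes a hypothetical undirected toy network (two triangles joined by one bridge edge) and uses only the Python standard library; the function names are illustrative and this is not how Gephi computes its statistics.

```python
from collections import deque

# Hypothetical undirected toy network: two triangles joined by one bridge edge.
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),   # first group
         ("d", "e"), ("e", "f"), ("d", "f"),   # second group
         ("c", "d")]                           # bridge between the groups


def adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def density(adj):
    """Global level: share of possible undirected edges that exist."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0


def degree_centrality(adj):
    """Individual level: degree divided by the maximum possible degree."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def distances_from(adj, source):
    """Individual level: hop distances from one node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


ADJ = adjacency(EDGES)
```

In this toy network the two bridge endpoints, c and d, have the highest degree centrality, which is the kind of "bridge between communities" signal the local level looks for.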
“Crossing” cursor on Qualitative Data

1 dimension           N dimensions


Social
- who with whom
- communities
- brokerage
- influence and power
- homophily

Semantic
- topics
- thematic clusters

Geographic
- spatial phenomena
“Timeline” cursor on Temporal Data

T+0                        T+N




Evolution of social ties

Evolution of communities

Evolution of topics
Mapping an Innovation Center
Collaborations on projects at Images et Réseaux



                                     Themes and content




                                     Actors




                                     Territory


                     Franck Ghitalla & Ecole de Design de Nantes
Mapping Scientific Cooperations
Network Map: a Series of Choices

• corpus
• data
• graphical operations
• algorithms
• thresholds
• communication goals
Guideline

# nodes          Guideline

1 - 100          lists, with edges as a bonus; focus on qualitative data

100 - 1,000      How do attributes explain the structure?
                 • easy to read, “obvious” patterns
                 • focus on entities (in context)
                 • metrics are tools to describe the graph (centrality, bridging...)
                 • links help to build and interpret categories of entities
                 challenge: mix attribute crossing and connectivity

1,000 - 50,000   How does the structure explain attributes?
                 • hard to read, problem of “hidden signals”:
                   track patterns with various layouts and filtering
                 • focus on structures
                 • metrics are tools to build the graph (cosine similarity...)
                 • categories help to understand the structure
                 challenge: pattern recognition

> 50,000         requires high computational power
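
The guideline mentions metrics used to build the graph, with cosine similarity as the example. A minimal sketch of that idea follows, assuming hypothetical term-count profiles and an arbitrary threshold of 0.5; both are illustrative choices, not values from the tutorial.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a zero vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


def similarity_edges(profiles, threshold):
    """Link every pair of entities whose similarity reaches the threshold."""
    edges = []
    for (a, vec_a), (b, vec_b) in combinations(sorted(profiles.items()), 2):
        weight = cosine_similarity(vec_a, vec_b)
        if weight >= threshold:
            edges.append((a, b, round(weight, 3)))
    return edges


# Hypothetical term-count profiles for three documents.
PROFILES = {
    "doc1": [3, 0, 1],
    "doc2": [2, 0, 1],
    "doc3": [0, 4, 0],
}
EDGES = similarity_edges(PROFILES, threshold=0.5)
```

The threshold is one of the choices that shapes the resulting map: lowering it adds weaker links and can hide the structure, raising it can disconnect the graph.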
Gephi now!
Gephi in a Nutshell


                « Like Photoshop™ for graphs. »

   Helps data analysts reveal patterns and trends,
    highlight outliers and tell stories with their data.


• Network visualization platform
• Open source, supported by a community
• Built for performance and usability
• Extensible by plug-ins
• Windows, Mac OS X, Linux
Gephi Community




                  Nonprofit organization




  Communities     Contributors
                  Mathieu Bastian, Mathieu Jacomy,
                  Eduardo Ramos Ibañez, Sébastien
                  Heymann, Guillaume Ceccarelli,
                  André Panisson, Antonio Patriarca,
                  Cezary Bartosiak, Martin Škurla,
                  Patrick McSweeney, Yi Du, Hélder
                  Suzuki, Daniel Bernardes, Ernesto
                  Aneiro, Keheliya Gallaba, Luiz
                  Ribeiro, Urban Škudnik, Vojtech
                  Bardiovsky, Yudi Xue
Community Mission


         Provide “sustainable” software

         Maintain the technical ecosystem

            Build a business ecosystem

  Face cutting-edge technological challenges with
                a long-term vision

      Distribute the software as open source
Community Values


  Open innovation: ideas and features come from
             the entire community.

      Decisions are taken with transparency.

   We consider this technology a public good,
         and will keep it open source.
Diversity of Usages

business              leisure :-)




communication         academic      art
Diversity of Network Encoding


Textual

V = { a, b, c, d, e }
E = { (a,b), (a,d), (b,c), (e,a), (c,e) }

Tabular

           a   b   c   d   e
       a   -   1   -   1   -
       b   -   -   1   -   -
       c   -   -   -   -   1
       d   -   -   -   -   -
       e   1   -   -   -   -

XML

<graph>
   <nodes>
      <node id="a" />
      <node id="b" />
      <node id="c" />
      <node id="d" />
      <node id="e" />
   </nodes>
   <edges>
      <edge source="a" target="b" />
      <edge source="a" target="d" />
      <edge source="b" target="c" />
      <edge source="e" target="a" />
      <edge source="c" target="e" />
   </edges>
</graph>

Graphical

and many others...
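
The encodings on this slide all describe the same five-node directed graph, which a short sketch can make concrete. The code below converts the textual edge set into the tabular (adjacency matrix) and XML forms using the Python standard library. The XML mirrors the simplified markup of the slide and is not a complete GEXF or GraphML file; the function names are illustrative.

```python
import xml.etree.ElementTree as ET

# The graph from the slide: V = {a..e}, E = directed (source, target) pairs.
NODES = ["a", "b", "c", "d", "e"]
EDGES = [("a", "b"), ("a", "d"), ("b", "c"), ("e", "a"), ("c", "e")]


def to_matrix(nodes, edges):
    """Tabular encoding: matrix[i][j] is 1 when an edge goes from node i to node j."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
    return matrix


def to_xml(nodes, edges):
    """XML encoding with the same element names as the slide."""
    graph = ET.Element("graph")
    nodes_el = ET.SubElement(graph, "nodes")
    for node in nodes:
        ET.SubElement(nodes_el, "node", id=node)
    edges_el = ET.SubElement(graph, "edges")
    for source, target in edges:
        ET.SubElement(edges_el, "edge", source=source, target=target)
    return ET.tostring(graph, encoding="unicode")


MATRIX = to_matrix(NODES, EDGES)
XML_TEXT = to_xml(NODES, EDGES)
```

Row a of the matrix has a 1 in columns b and d, matching the table on the slide, and the XML string contains one edge element per pair in E.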
Software I/O




Input
• databases: MySQL, PostgreSQL, SQL Server
• Neo4j
• user input
• file: CSV, Pajek NET, Guess GDF, GEXF, GraphML, Graphviz DOT, UCInet DL, Netdraw VNA, Tulip TLP, Excel Spreadsheet
• graph streaming

Output
• file: CSV, Pajek NET, Guess GDF, GEXF, GraphML, Excel Spreadsheet, SVG, PDF, PNG
Choosing a File Format

Table of features supported by Gephi, one row per file format.

Columns (features): Edge List/Matrix Structure, XML Structure, Edge Weight, Attributes, Visualization Attributes, Attribute Default Values, Hierarchical Graphs, Dynamics

Rows (formats): CSV, DL Ucinet, DOT Graphviz, GDF, GEXF, GML, GraphML, NET Pajek, TLP Tulip, VNA Netdraw, Spreadsheet*

* spreadsheets can be loaded in the Data Laboratory
Do you need...


Many features
          GEXF
          Spreadsheet
          GraphML
          Guess GDF
          GML
          UCINet DL
          Netdraw VNA
          Graphviz DOT
          Pajek NET
          CSV
          Tulip TLP
Few features

File Type (legend): XML, Tabular, Text
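
For the tabular end of this spectrum, a minimal sketch of writing an edge table as CSV is shown below. It assumes that the spreadsheet importer recognizes `Source`, `Target` and `Weight` column headers; check the supported-formats page for the exact conventions of your Gephi version. The function name and the sample edges are illustrative.

```python
import csv
import io


def edges_to_csv(edges):
    """Serialize (source, target, weight) triples as a CSV edge table."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header names are an assumption about what the importer expects.
    writer.writerow(["Source", "Target", "Weight"])
    for source, target, weight in edges:
        writer.writerow([source, target, weight])
    return buffer.getvalue()


# Hypothetical weighted edges.
EDGE_TABLE = edges_to_csv([("a", "b", 1), ("a", "d", 2), ("e", "a", 1)])
```

A plain table like this carries structure and edge weights only; richer features such as visualization attributes or dynamics call for one of the formats higher in the list.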
Using Gephi




               DEMO
Team work




 1   Create a team of 2-3 people.


 2   Choose a dataset.


 3   Explore it for one hour.


 4   Two teams present their preliminary findings.
Dataset #1: GitHub Software Repository




 “GitHub is an application used by nearly a million people to store
 over two million code repositories, making GitHub the largest code
                         host in the world.”

Started in 2008, it provides the features of an online social network
and a software repository to lower the barriers to collaboration and
make it easier to contribute code.

                                                 https://github.com
Dataset #1: GitHub Software Repository


Data extracted by Franck Cuny* at Linkfluence SAS

1st release in March 2010 -> this poster
2nd release in June 2011 -> your data

_____________Network of user profiles__________

Nodes: people with at least one repository who
are followed by at least two other people
Edges: A follows B

_____________Network of repositories__________

Nodes: repositories
Edges: A shares a developer with B

        Very few research publications on this OSN!

                                                      * franck.cuny@linkfluence.net
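
The repository network links two repositories when they share a developer. A minimal sketch of that projection follows, assuming a hypothetical mapping from developers to repositories. Counting the number of shared developers as an edge weight is an added assumption; the slide only states that an edge exists.

```python
from itertools import combinations

# Hypothetical contributions: developer -> repositories they work on.
CONTRIBUTIONS = {
    "alice": ["repo1", "repo2"],
    "bob": ["repo2", "repo3"],
    "carol": ["repo1", "repo2"],
}


def repository_network(contributions):
    """Return {(repo_a, repo_b): number of developers shared by both}."""
    shared = {}
    for repos in contributions.values():
        # Every pair of repositories touched by one developer gets linked.
        for a, b in combinations(sorted(set(repos)), 2):
            shared[(a, b)] = shared.get((a, b), 0) + 1
    return shared


REPO_EDGES = repository_network(CONTRIBUTIONS)
```

Because a developer with k repositories creates k(k-1)/2 edges, very prolific developers can dominate this kind of network, which is worth keeping in mind when filtering.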
Dataset #1: GitHub Software Repository


Data extracted by a crawl using the GitHub API
Seed: 10 well-known contributors in the Perl community

Networks by country: Japan, France, United States
Networks by language: Perl, PHP, Python, Ruby

Node attributes:
• user country
• number of followers
• main programming language

Edges:
• directed
• weight = number of projects A has forked from B
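
The edge definition above (directed, weighted by the number of projects A has forked from B) can be sketched as a simple aggregation. The fork events and user names below are hypothetical, and the in-strength helper is an illustrative addition rather than part of the dataset description.

```python
from collections import Counter

# Hypothetical fork events: (forking user A, original owner B).
FORKS = [("ann", "bob"), ("ann", "bob"), ("cid", "bob"), ("bob", "ann")]


def weighted_edges(forks):
    """Directed edge A -> B weighted by how many projects A forked from B."""
    return Counter(forks)


def in_strength(weights):
    """Sum of incoming edge weights per user (how often they were forked)."""
    total = Counter()
    for (_, target), weight in weights.items():
        total[target] += weight
    return total


WEIGHTS = weighted_edges(FORKS)
STRENGTH = in_strength(WEIGHTS)
```

Here the edge ann -> bob has weight 2, and bob has an in-strength of 3 because three forks point at his projects.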
Dataset #1: GitHub Software Repository




         Your mission (should you decide to accept it):
      find research hypotheses based on your exploration

  Example question: are the Perl communities based on geography?
Dataset #2: The Irish Blogosphere


“Identifying Representative Textual Sources in Blog Networks”. K. Wade, D.
Greene, C. Lee, D. Archambault, P. Cunningham (2011) http://mlg.ucd.ie/blogs



_______________Blogroll Network______________

Nodes: blogs with more than two blogroll links
Edges: blogroll link (in-link)

_______________Post-link Network_____________

Nodes: blogs with more than two blogroll links
Edges: hyperlink inside post from a blog to another
(post-link)
Dataset #2: The Irish Blogosphere


Data extracted by a crawl at distance 2 from the seed for the in-links
and Google Blog Search for the post-links.
Seed: 21 popular blogs, winners of the “2010 Irish Blog Awards”

Node attributes:
• post count = total number of posts by blog
• category = from the Irish blog index at www.irishblogdirectory.com,
  where available
• infomap_comm = community to which a node belongs (Infomap algorithm)
• gce_comms = overlapping communities (GCE algorithm)
• moses_comms = overlapping communities (MOSES algorithm)

Edges:
• directed
• weight = number of hyperlinks in the Post-link network
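
A crawl at distance 2 from a seed can be sketched as a breadth-first search that stops expanding once the distance limit is reached. The link structure below is hypothetical, and the sketch follows links stored as a forward mapping, so the direction of traversal is an assumption rather than a description of the original crawler.

```python
from collections import deque

# Hypothetical link structure: blog -> blogs it is connected to.
LINKS = {
    "seed": ["b1", "b2"],
    "b1": ["b3"],
    "b2": ["b3", "b4"],
    "b3": ["b5"],
    "b4": [],
    "b5": ["b6"],
}


def crawl(links, seeds, max_distance=2):
    """Breadth-first crawl returning {blog: distance from the nearest seed}."""
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        blog = queue.popleft()
        if distance[blog] == max_distance:
            continue  # do not expand beyond the distance limit
        for target in links.get(blog, []):
            if target not in distance:
                distance[target] = distance[blog] + 1
                queue.append(target)
    return distance


REACHED = crawl(LINKS, ["seed"])
```

Blogs b5 and b6 are never reached because they lie more than two steps from the seed, which shows how the distance limit bounds the sample.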
Dataset #2: The Irish Blogosphere




                       Your mission:
       explore and try to confirm the official results
Hands-On!


Start:

• Load a graph
• Apply a layout
• Color the nodes by a qualitative variable in the Partition panel
• Size the nodes by a quantitative variable in the Ranking panel
• Start to explore: compute metrics, filter the network

End:

• Export maps to PDF in the Preview tab
• Save
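
The two visual mappings in the start list, color by a qualitative variable and size by a quantitative variable, can be sketched outside Gephi to show the idea. The size range, palette and sample attributes below are hypothetical, and this is an illustration of the principle, not Gephi's own implementation.

```python
def rank_sizes(values, min_size=10.0, max_size=50.0):
    """Linearly map each node's quantitative value onto a size range."""
    low, high = min(values.values()), max(values.values())
    span = high - low
    if span == 0:
        return {node: min_size for node in values}
    return {node: min_size + (value - low) / span * (max_size - min_size)
            for node, value in values.items()}


def partition_colors(categories, palette):
    """Give every distinct qualitative value its own color from the palette."""
    distinct = sorted(set(categories.values()))
    color_of = {cat: palette[i % len(palette)] for i, cat in enumerate(distinct)}
    return {node: color_of[cat] for node, cat in categories.items()}


# Hypothetical node attributes.
FOLLOWERS = {"ann": 5, "bob": 25, "cid": 45}
LANGUAGE = {"ann": "Perl", "bob": "Ruby", "cid": "Perl"}

SIZES = rank_sizes(FOLLOWERS)
COLORS = partition_colors(LANGUAGE, ["#1f77b4", "#ff7f0e"])
```

With these inputs the node with the most followers gets the largest size, and the two Perl users share one color, so the map shows both variables at once.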
Presentations




  GitHub Repository   Irish Blogosphere
Gephi Documentation


Web Site:       http://gephi.org

Support:        http://forum.gephi.org
Wiki:           http://wiki.gephi.org
Source code:    https://launchpad.net/gephi


Online Tutorials
http://gephi.org/users/quick-start/
http://gephi.org/users/tutorial-visualization/
http://gephi.org/users/tutorial-layouts/
http://wiki.gephi.org/index.php/Import_CSV_Data
http://wiki.gephi.org/index.php/Import_Dynamic_Data


Tutorial in Spanish
https://code.google.com/p/camon/wiki/Taller_Gephi


Supported Graph Formats
http://gephi.org/users/supported-graph-formats/
Thank You!




             Caspar David Friedrich -
             Wanderer Above the Sea of Fog
Credits


[slide 11] images from Drew Conway
http://www.dataists.com/2010/10/what-data-visualization-should-do-simple-small-truth/

[slide 22 top left] Benoît Vidal at MFG Labs
[slide 22 bottom center] Franck Ghitalla at UTC
[slide 22 right] Studies in MA Digital Fashion at LCF by Peter Jeun Ho Tsang
http://jeunhotsang.com/blog/2010/12/07/prototype/

[slide 27] sketches from Ben Fry, Computational Information Design



           Special Thanks to Franck Ghitalla and Mathieu Jacomy
                         for their insightful discussions.

More Related Content

What's hot

Community Detection in Social Networks: A Brief Overview
Community Detection in Social Networks: A Brief OverviewCommunity Detection in Social Networks: A Brief Overview
Community Detection in Social Networks: A Brief OverviewSatyaki Sikdar
 
Social Network Analysis Workshop
Social Network Analysis WorkshopSocial Network Analysis Workshop
Social Network Analysis WorkshopData Works MD
 
Community Detection in Social Media
Community Detection in Social MediaCommunity Detection in Social Media
Community Detection in Social MediaSymeon Papadopoulos
 
Social network analysis
Social network analysisSocial network analysis
Social network analysisCaleb Jones
 
Social Media Mining - Chapter 10 (Behavior Analytics)
Social Media Mining - Chapter 10 (Behavior Analytics)Social Media Mining - Chapter 10 (Behavior Analytics)
Social Media Mining - Chapter 10 (Behavior Analytics)SocialMediaMining
 
Group and Community Detection in Social Networks
Group and Community Detection in Social NetworksGroup and Community Detection in Social Networks
Group and Community Detection in Social NetworksKent State University
 
Cascading behavior in the networks
Cascading behavior in the networksCascading behavior in the networks
Cascading behavior in the networksVani Kandhasamy
 
Network visualization: Fine-tuning layout techniques for different types of n...
Network visualization: Fine-tuning layout techniques for different types of n...Network visualization: Fine-tuning layout techniques for different types of n...
Network visualization: Fine-tuning layout techniques for different types of n...Nees Jan van Eck
 
Social Recommender Systems
Social Recommender SystemsSocial Recommender Systems
Social Recommender Systemsguest77b0cd12
 
Questions of journalism and mass communication tribhuvan university nepal
Questions of journalism and mass communication tribhuvan university nepalQuestions of journalism and mass communication tribhuvan university nepal
Questions of journalism and mass communication tribhuvan university nepalRabi Raj Baral
 
Social Network Analysis: Applications & Challenges
Social Network Analysis: Applications & ChallengesSocial Network Analysis: Applications & Challenges
Social Network Analysis: Applications & ChallengesIIIT Hyderabad
 
Link prediction with the linkpred tool
Link prediction with the linkpred toolLink prediction with the linkpred tool
Link prediction with the linkpred toolRaf Guns
 
Community detection algorithms
Community detection algorithmsCommunity detection algorithms
Community detection algorithmsAlireza Andalib
 
Social media mining PPT
Social media mining PPTSocial media mining PPT
Social media mining PPTChhavi Mathur
 
Nonnegative Matrix Factorization
Nonnegative Matrix FactorizationNonnegative Matrix Factorization
Nonnegative Matrix FactorizationTatsuya Yokota
 
Introduction to Social Network Analysis
Introduction to Social Network AnalysisIntroduction to Social Network Analysis
Introduction to Social Network AnalysisPatti Anklam
 
Community detection in graphs
Community detection in graphsCommunity detection in graphs
Community detection in graphsNicola Barbieri
 

What's hot (20)

Community Detection in Social Networks: A Brief Overview
Community Detection in Social Networks: A Brief OverviewCommunity Detection in Social Networks: A Brief Overview
Community Detection in Social Networks: A Brief Overview
 
Social Network Analysis Workshop
Social Network Analysis WorkshopSocial Network Analysis Workshop
Social Network Analysis Workshop
 
Community Detection in Social Media
Community Detection in Social MediaCommunity Detection in Social Media
Community Detection in Social Media
 
Social network analysis
Social network analysisSocial network analysis
Social network analysis
 
Social Media Mining - Chapter 10 (Behavior Analytics)
Social Media Mining - Chapter 10 (Behavior Analytics)Social Media Mining - Chapter 10 (Behavior Analytics)
Social Media Mining - Chapter 10 (Behavior Analytics)
 
Group and Community Detection in Social Networks
Group and Community Detection in Social NetworksGroup and Community Detection in Social Networks
Group and Community Detection in Social Networks
 
Cascading behavior in the networks
Cascading behavior in the networksCascading behavior in the networks
Cascading behavior in the networks
 
Network visualization: Fine-tuning layout techniques for different types of n...
Network visualization: Fine-tuning layout techniques for different types of n...Network visualization: Fine-tuning layout techniques for different types of n...
Network visualization: Fine-tuning layout techniques for different types of n...
 
Social Recommender Systems
Social Recommender SystemsSocial Recommender Systems
Social Recommender Systems
 
Questions of journalism and mass communication tribhuvan university nepal
Questions of journalism and mass communication tribhuvan university nepalQuestions of journalism and mass communication tribhuvan university nepal
Questions of journalism and mass communication tribhuvan university nepal
 
Social Network Analysis: Applications & Challenges
Social Network Analysis: Applications & ChallengesSocial Network Analysis: Applications & Challenges
Social Network Analysis: Applications & Challenges
 
Basics Gephi Tutorial
Basics Gephi TutorialBasics Gephi Tutorial
Basics Gephi Tutorial
 
Link prediction with the linkpred tool
Link prediction with the linkpred toolLink prediction with the linkpred tool
Link prediction with the linkpred tool
 
Ppt
PptPpt
Ppt
 
Introduction to Social Network Analysis
Introduction to Social Network AnalysisIntroduction to Social Network Analysis
Introduction to Social Network Analysis
 
Community detection algorithms
Community detection algorithmsCommunity detection algorithms
Community detection algorithms
 
Social media mining PPT
Social media mining PPTSocial media mining PPT
Social media mining PPT
 
Nonnegative Matrix Factorization
Nonnegative Matrix FactorizationNonnegative Matrix Factorization
Nonnegative Matrix Factorization
 
Introduction to Social Network Analysis
Introduction to Social Network AnalysisIntroduction to Social Network Analysis
Introduction to Social Network Analysis
 
Community detection in graphs
Community detection in graphsCommunity detection in graphs
Community detection in graphs
 

Similar to SP1: Exploratory Network Analysis with Gephi

Gephi icwsm-tutorial
Gephi icwsm-tutorialGephi icwsm-tutorial
Gephi icwsm-tutorialcsedays
 
Trends in Human-Computer Interaction in Information Seeking
Trends in Human-Computer Interaction in Information SeekingTrends in Human-Computer Interaction in Information Seeking
Trends in Human-Computer Interaction in Information SeekingRich Miller
 
Graph visualization options and latest developments
Graph visualization options and latest developmentsGraph visualization options and latest developments
Graph visualization options and latest developmentsLinkurious
 
20111103 con tech2011-marc smith
20111103 con tech2011-marc smith20111103 con tech2011-marc smith
20111103 con tech2011-marc smithMarc Smith
 
LSS'11: Charting Collections Of Connections In Social Media
LSS'11: Charting Collections Of Connections In Social MediaLSS'11: Charting Collections Of Connections In Social Media
LSS'11: Charting Collections Of Connections In Social MediaLocal Social Summit
 
Ml pluss ejan2013
Ml pluss ejan2013Ml pluss ejan2013
Ml pluss ejan2013CS, NcState
 
Gephi short introduction
Gephi short introductionGephi short introduction
Gephi short introductionSébastien
 
Network of Excellence in Internet Science (Multidisciplinarity and its Implic...
Network of Excellence in Internet Science (Multidisciplinarity and its Implic...Network of Excellence in Internet Science (Multidisciplinarity and its Implic...
Network of Excellence in Internet Science (Multidisciplinarity and its Implic...i_scienceEU
 
Introduction to the FP7 CODE project @ BDBC
Introduction to the FP7 CODE project @ BDBCIntroduction to the FP7 CODE project @ BDBC
Introduction to the FP7 CODE project @ BDBCFlorian Stegmaier
 
Mining Social Graph Data
Mining Social Graph DataMining Social Graph Data
Mining Social Graph DataDrew Conway
 
Relationships Matter: Using Connected Data for Better Machine Learning
Relationships Matter: Using Connected Data for Better Machine LearningRelationships Matter: Using Connected Data for Better Machine Learning
Relationships Matter: Using Connected Data for Better Machine LearningNeo4j
 
Geographic Information Systems and Social Learning in Participatory Spatial P...
Geographic Information Systems and Social Learning in Participatory Spatial P...Geographic Information Systems and Social Learning in Participatory Spatial P...
Geographic Information Systems and Social Learning in Participatory Spatial P...Robert Goodspeed
 
Visualization for Software Analytics
Visualization for Software AnalyticsVisualization for Software Analytics
Visualization for Software AnalyticsMargaret-Anne Storey
 
20120301 strata-marc smith-mapping social media networks with no coding using...
20120301 strata-marc smith-mapping social media networks with no coding using...20120301 strata-marc smith-mapping social media networks with no coding using...
20120301 strata-marc smith-mapping social media networks with no coding using...Marc Smith
 
Social Event Detection using Multimodal Clustering and Integrating Supervisor...
Social Event Detection using Multimodal Clustering and Integrating Supervisor...Social Event Detection using Multimodal Clustering and Integrating Supervisor...
Social Event Detection using Multimodal Clustering and Integrating Supervisor...Symeon Papadopoulos
 
Social network analysis for modeling & tuning social media website
Social network analysis for modeling & tuning social media websiteSocial network analysis for modeling & tuning social media website
Social network analysis for modeling & tuning social media websiteEdward B. Rockower
 


SP1: Exploratory Network Analysis with Gephi

  • 1. ICWSM’11 Tutorial Exploratory Network Analysis with: Instructors: Sébastien Heymann, Julian Bilcke seb@gephi.org, julian.bilcke@gephi.org July 17, 2011 | 1 PM - 4 PM
  • 2. Exploratory Network Analysis with Gephi This tutorial is an introduction to Gephi, the open source graph network visualization and manipulation software. Gephi aims to fulfill the complete chain from data importing to aesthetics refinements and interaction. Users interact with the visualization and manipulate structures, shapes and colors to reveal hidden properties. The goal is to help data analysts make hypotheses and intuitively discover patterns or errors in large data collections. At the end, the participants will walk away with the practical knowledge enabling them to use Gephi for their own projects.
  • 3. Exploratory Network Analysis with Gephi It starts with a brief introduction on the network exploration process and a hands-on demonstration of the essential functionalities of Gephi. Participants are guided step by step through the complete chain of representation, manipulation, layout, analysis and aesthetics refinements. Next, teams work on real datasets. They finally present their preliminary results. The tutorial concludes with a general question and answer session.
  • 4. Requirements Bring your own laptop with Java and Gephi installed. Gephi should be updated (menu Help > Check for Updates). Bring a mouse with a wheel. Bring a dataset of your own if you want, and verify that it loads well in Gephi.[1] [1] http://gephi.org/users/supported-graph-formats/
  • 5. Workshop Schedule - Part I Exploratory Network Analysis • Exploratory Data Analysis • Exploratory Network Analysis • Looking for Orderness in Data • Examples • Guideline Introduction to Gephi • Approach and Community • Networked Data • Quick Start Demo * 30 min break *
  • 6. Workshop Schedule - Part II Hands-On! • Team Work on a Dataset • Presentation of Preliminary Results Q&A
  • 7. Exploratory Data Analysis Started with: results (confirmatory), intuition (exploratory), surprise (serendipity). “The greatest value of a picture is when it forces us to notice what we never expected to see” John Tukey (1962)
  • 8. Exploratory Data Analysis Non-linear processing chain of Ben Fry in Computational Information Design (2004)
  • 9. Dummy Example Observation: visual saliences on specific file sizes. External knowledge: these sizes correspond to films. New hypothesis on data: films are highly exchanged, so the study might dig in this direction. P2P file size distribution (Latapy et al., 2008)
  • 10. Exploratory Network Analysis 1. See the network. 2. Interact in real time: group, filter, compute metrics... 3. Build a visual language: size by rank, color by partition, label, curved edges, thickness... 1st graph viz tool: Pajek (1996), Vladimir Batagelj, Andrej Mrvar. Gephi prototype (2008).
  • 11. Looking for a “Simple Small Truth”? Drew Conway, What Data Visualization Should Do: 1. Make complex things simple 2. Extract small information from large data 3. Present truth, do not deceive http://www.dataists.com/2010/10/what-data-visualization-should-do-simple-small-truth/
  • 12. Looking for Orderness in Data Vary 3 cursors simultaneously to extract meaningful patterns: at different levels (MICRO level to MACRO level), on multiple dimensions (1 dimension to N dimensions), at time scale (T+0 to T+N).
  • 13. “Zoom” cursor on Quantitative Data (MICRO level to MACRO level) Global: connectivity, density, centralization. Local: communities, bridges between communities, local centers vs periphery. Individual: centrality, distances, neighborhood, location, local authority vs hub.
  • 14. “Crossing” cursor on Qualitative Data (1 dimension to N dimensions) Social: who with whom, communities, brokerage, influence and power, homophily. Semantic: topics, thematic clusters. Geographic: spatial phenomena.
  • 15. “Timeline” cursor on Temporal Data (T+0 to T+N) Evolution of social ties. Evolution of communities. Evolution of topics.
  • 16. Mapping an Innovation Center Collaborations on projects at Images et Réseaux Themes and content Actors Territory Franck Ghitalla & Ecole de Design de Nantes
  • 18. Network Map: a Series of Choices corpus data graphical operations algorithms communication thresholds goals
  • 19. Guideline (by number of nodes) 1 - 100: lists + edges in bonus, focus on qualitative data. 100 - 1,000: How do attributes explain the structure? • easy to read, “obvious” patterns • focus on entities (in context) • metrics are tools to describe the graph (centrality, bridging...) • links help to build and interpret categories of entities. Challenge: mix attribute crossing and connectivity. 1,000 - 50,000: How does the structure explain attributes? • hard to read, problem of “hidden signals”: track patterns with various layouts and filtering • focus on structures • metrics are tools to build the graph (cosine similarity...) • categories help to understand the structure. Challenge: pattern recognition. > 50,000: require high computational power.
  • 21. Gephi in a Nutshell « Like Photoshop™ for graphs. » Helps data analysts to reveal patterns and trends, highlight outliers and tell stories with their data. • Network visualization platform • Open source, supported by a community • Built for performance and usability • Extensible by plug-ins • Windows, MacOS X, Linux
  • 22. Gephi Community Nonprofit organization Communities Contributors Mathieu Bastian, Mathieu Jacomy, Eduardo Ramos Ibañez, Sébastien Heymann, Guillaume Ceccarelli, André Panisson, Antonio Patriarca, Cezary Bartosiak, Martin Škurla, Patrick McSweeney, Yi Du, Hélder Suzuki, Daniel Bernardes, Ernesto Aneiro, Keheliya Gallaba, Luiz Ribeiro, Urban Škudnik, Vojtech Bardiovsky, Yudi Xue
  • 23. Community Mission Provide a “sustainable” software Maintain the technical ecosystem Build a business ecosystem Face cutting-edge technological challenges with a long-term vision Distribute the software in Open Source
  • 24. Community Values Open innovation: ideas and features come from the entire community. Decisions are taken with transparency. We consider this technology as a public good, and will keep it in open source.
  • 25. Diversity of Usages business leisure :-) communication academic art
  • 26. Diversity of Network Encoding Textual: V = { a, b, c, d, e }, E = { (a,b), (a,d), (b,c), (e,a), (c,e) }. XML: <graph> <nodes> <node id="a" /> <node id="b" /> <node id="c" /> <node id="d" /> <node id="e" /> </nodes> <edges> <edge source="a" target="b" /> <edge source="a" target="d" /> <edge source="b" target="c" /> <edge source="e" target="a" /> <edge source="c" target="e" /> </edges> </graph>. Tabular (rows = source, columns = target; columns a b c d e): a: - 1 - 1 -; b: - - 1 - -; c: - - - - 1; d: - - - - -; e: 1 - - - -. Graphical. And many others...
  • 27. Software I/O Input: databases (MySQL, PostgreSQL, SQL Server), user input, Neo4j, file (CSV, Pajek NET, Guess GDF, GEXF, GraphML, Graphviz DOT, UCInet DL, Netdraw VNA, Tulip TLP, Excel Spreadsheet), graph streaming. Output: file (CSV, Pajek NET, Guess GDF, GEXF, GraphML, Excel Spreadsheet, SVG, PDF, PNG).
  • 28. Choosing a File Format [Table of features supported by Gephi. Columns: Edge List/Matrix Structure, XML, Edge Weight, Attributes, Visualization Attributes, Attribute Default Value, Hierarchy Graphs, Dynamics. Rows: CSV, DL Ucinet, DOT Graphviz, GDF, GEXF, GML, GraphML, NET Pajek, TLP Tulip, VNA Netdraw, Spreadsheet*.] * spreadsheets can be loaded in the Data Laboratory
  • 29. Do you need... [Chart: file formats placed between “Many features” and “Few features”, grouped by file type (Text, XML, Tabular). Formats shown: GEXF, Spreadsheet, GraphML, Guess GDF, GML, UCINet DL, Netdraw VNA, Graphviz DOT, Pajek NET, CSV, Tulip TLP.]
  • 30. Using Gephi DEMO
  • 31. Team work 1 Create a team of 2-3 people. 2 Choose a dataset. 3 Explore it for 1 hour. 4 Two teams present their preliminary findings.
  • 32. Dataset #1: GitHub Software Repository “GitHub is an application used by nearly a million people to store over two million code repositories, making GitHub the largest code host in the world.” Started in 2008, it provides the features of an online social network and a software repository to lower the barriers of collaboration and make the code easier to contribute. https://github.com
  • 33. Dataset #1: GitHub Software Repository Data extracted by Franck Cuny* at Linkfluence SAS. 1st release in March 2010 -> this poster. 2nd release in June 2011 -> your data. Network of user profiles: Nodes: people with at least one repository who are followed by at least two other people. Edges: A follows B. Network of repositories: Nodes: repositories. Edges: A shares a developer with B. Very few research publications on this OSN! * franck.cuny@linkfluence.net
  • 34. Dataset #1: GitHub Software Repository Data extracted by a crawl using the GitHub API Seed: 10 well-known contributors in the Perl community Networks by country: Japan, France, United States Networks by language: Perl, PHP, Python, Ruby Node attributes: • user country • number of followers • main programming language Edges: • directed • weight = number of projects A has forked from B
  • 35. Dataset #1: GitHub Software Repository Your mission (should you decide to accept it): find research hypotheses based on your exploration Example question: are the Perl communities based on geography?
  • 36. Dataset #2: The Irish Blogosphere “Identifying Representative Textual Sources in Blog Networks”. K. Wade, D. Greene, C. Lee, D. Archambault, P. Cunningham (2011) http://mlg.ucd.ie/blogs Blogroll Network: Nodes: blogs with more than two blogroll links. Edges: blogroll link (in-link). Post-link Network: Nodes: blogs with more than two blogroll links. Edges: hyperlink inside post from a blog to another (post-link).
  • 37. Dataset #2: The Irish Blogosphere Data extracted by a crawl at distance 2 from the seed for the in-links and Google Blog Search for the post-links. Seed: 21 popular blogs, winners of the “2010 Irish Blog Awards” Node attributes: • post count = total number of posts by blog • category = from the Irish blog index at www.irishblogdirectory.com, where available • infomap_comm = community to which a node belongs (infomap algo) • gce_comms = overlapping communities (GCE algo) • moses_comms = overlapping communities (MOSES algo) Edges: • directed • weight = number of hyperlinks in the Post-link network
  • 38. Dataset #2: The Irish Blogosphere Your mission: explore and try to confirm the official results
  • 39. Hands-On! Start: • Load a graph • Apply a layout • Color the nodes by a qualitative variable in Partition Panel • Size the nodes by a quantitative variable in Ranking Panel • Start to explore...compute metrics, filter the network End: • Export maps to PDF in Preview Tab • Save
  • 40. Presentations GitHub Repository Irish Blogosphere
  • 41. Gephi Documentation Web Site: http://gephi.org Support: http://forum.gephi.org Wiki: http://wiki.gephi.org Source code: https://launchpad.net/gephi Online Tutorials http://gephi.org/users/quick-start/ http://gephi.org/users/tutorial-visualization/ http://gephi.org/users/tutorial-layouts/ http://wiki.gephi.org/index.php/Import_CSV_Data http://wiki.gephi.org/index.php/Import_Dynamic_Data Tutorial in Spanish https://code.google.com/p/camon/wiki/Taller_Gephi Supported Graph Formats http://gephi.org/users/supported-graph-formats/
  • 42. Thank You! Caspar David Friedrich - Wanderer Above the Sea of Fog
  • 43. Credits [slide 11] images from Drew Conway http://www.dataists.com/2010/10/what-data-visualization-should-do-simple-small-truth/ [slide 22 top left] Benoît Vidal at MFG Labs [slide 22 bottom center] Franck Ghitalla at UTC [slide 22 right] Studies in MA Digital Fashion at LCF by Peter Jeun Ho Tsang http://jeunhotsang.com/blog/2010/12/07/prototype/ [slide 27] sketches from Ben Fry, Computational Information Design Special Thanks to Franck Ghitalla and Mathieu Jacomy for their insightful discussions.
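
The encodings on the “Diversity of Network Encoding” slide and two of the metrics named on the “Zoom” cursor slide (density, degree) can be sketched in plain Python. This is a minimal illustration using only the standard library, not Gephi's own code: the function names are hypothetical, the XML mirrors the simplified markup on the slide rather than a complete GEXF file, and the density formula assumes a directed graph without self-loops.

```python
from xml.etree import ElementTree as ET

# Example graph from the "Diversity of Network Encoding" slide (textual encoding).
NODES = ["a", "b", "c", "d", "e"]
EDGES = [("a", "b"), ("a", "d"), ("b", "c"), ("e", "a"), ("c", "e")]


def to_adjacency_matrix(nodes, edges):
    """Tabular encoding: matrix[i][j] is 1 when an edge goes from nodes[i] to nodes[j]."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
    return matrix


def to_xml(nodes, edges):
    """XML encoding with the simplified element names used on the slide."""
    graph = ET.Element("graph")
    nodes_el = ET.SubElement(graph, "nodes")
    for node in nodes:
        ET.SubElement(nodes_el, "node", id=node)
    edges_el = ET.SubElement(graph, "edges")
    for source, target in edges:
        ET.SubElement(edges_el, "edge", source=source, target=target)
    return ET.tostring(graph, encoding="unicode")


def density(nodes, edges):
    """Global metric: share of possible directed edges (no self-loops) that exist."""
    n = len(nodes)
    return len(edges) / (n * (n - 1)) if n > 1 else 0.0


def degrees(nodes, edges):
    """Individual metric: (in-degree, out-degree) for every node."""
    counts = {node: [0, 0] for node in nodes}
    for source, target in edges:
        counts[source][1] += 1  # outgoing edge of the source
        counts[target][0] += 1  # incoming edge of the target
    return {node: tuple(pair) for node, pair in counts.items()}


if __name__ == "__main__":
    # Print the matrix in the same "-" / "1" style as the slide.
    for node, row in zip(NODES, to_adjacency_matrix(NODES, EDGES)):
        print(node, " ".join("1" if cell else "-" for cell in row))
    print(to_xml(NODES, EDGES))
    print("density:", density(NODES, EDGES))
    print("degrees:", degrees(NODES, EDGES))
```

Keeping the edge list as the single source of truth and deriving the matrix and XML from it reflects the point of the slide: the same network can be written in several interchangeable encodings.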