The Future of Social Networks on the Internet: The Need for Semantics

From Cloud, 6 months ago

Semantic Technologies Conference 2008 / San Jose, USA / 19th May 2008

Slideshow Transcript

  1. Slide 1: The Future of Social Networks on the Internet: The Need for Semantics. John G. Breslin, Stefan Decker, Uldis Bojars {firstname.lastname@deri.org}. Semantic Technologies Conference / San Jose, USA / 19th May 2008. Copyright 2008 Digital Enterprise Research Institute (www.deri.org). All rights reserved.
  2. Slide 2: URL for the presentation View the slides at Slideshare: http://url.ie/e46 2
  3. Slide 3: Where in the world are we? 3
  4. Slide 4: Our mission and vision • DERI Galway’s mission is “to exploit semantics for: – People – Organisations – Systems • to collaborate and interoperate on a global scale” • DERI Galway’s vision is “to be recognised as being among the leading international web science research institutes interlinking technologies, information and people to advance business and benefit society” 4
  5. Slide 5: Some statistics • Founded June 2003 with 1 fulltime member (green field) • Status as of May 2008: – About 130 members (from 27 nations) and growing • Total research grants: – About €23M so far, 17 national and 16 EU projects • Research publications > 370 – Leading in International and European Semantic Web Conferences – Participates in 12 standardisation groups • Example technologies: – Semantic Digital Libraries – Semantic Desktop (in KDE4) – Semantic Web Search Engine 5
  6. Slide 6: Core industrial partners 6
  7. Slide 7: On the shoulders of giants… • Memex (Vannevar Bush): A memex is “a device in which an individual stores all his books, records, and communications.” • Augmenting Human Intellect (Doug Engelbart): “By ‘augmenting human intellect’ we mean increasing the capability of a man to approach a complex problem situation, to gain comprehension to suit his particular needs, and to derive solutions to problems.” • WWW (Tim Berners-Lee): “There was a second part of the dream […] we could then use computers to help us analyse it, make sense of what we’re doing, where we individually fit in, and how we can better work together.”
  8. Slide 8: It wasn’t the right time then… Where are we now? 8
  9. Slide 9: Now, we are making progress… 9
  10. Slide 10: A network of knowledge… • Interconnected • Universal • All encompassing • Enable global and local collaboration • The right information for the right people at the right time 10
  11. Slide 11: Getting to work in the DERI house 11
  12. Slide 12: What we’re going to talk about today… 1. Collaborating via the Social Web 2. Social networking services (SNSs) so far 3. Issues with social networking services 4. Leveraging semantics on the Social Web: • FOAF and SIOC • Producers • Collectors • Consumers 5. Leveraging semantics in Enterprise 2.0 SNSs
  13. Slide 13: Social media sites are like data silos 13 * Source: Pidgin Technologies, www.pidgintech.com
  14. Slide 14: Many isolated communities of users and their data 14 * Source: Pidgin Technologies, www.pidgintech.com
  15. Slide 15: Need ways to connect these islands 15 * Source: Pidgin Technologies, www.pidgintech.com
  16. Slide 16: Allowing users to easily move from one to another 16 * Source: Pidgin Technologies, www.pidgintech.com
  17. Slide 17: Enabling users to easily bring their data with them 17 * Source: Pidgin Technologies, www.pidgintech.com
  18. Slide 18: 1. Collaborating via the Social Web
  19. Slide 19: A move from the Web to a “social web” • The New Yorker, 1993: “On the Internet, nobody knows you’re a dog.” • The New Yorker, 2005: “I had my own blog for a while, but I decided to go back to just pointless, incessant barking.”
  20. Slide 20: What is social media? • http://en.wikipedia.org/wiki/Social_media – “Social media uses the ‘wisdom of crowds’ to connect information in a collaborative manner.” – “Social media can take many different forms, including message boards, weblogs, wikis, podcasts, pictures and video.” • Popular examples of social media sites: – Wikipedia, MySpace / Facebook, Twitter, YouTube, SecondLife, Upcoming, Digg / Reddit / StumbleUpon, Flickr / Zooomr, del.icio.us, World of Warcraft, Amazon • Related terms: – Web 2.0, Social Web, social software, social networks, social news, social bookmarking, user-generated content 20
  21. Slide 21: What is Web 2.0? • http://en.wikipedia.org/wiki/Web_2.0 – “Web 2.0 refers to a perceived second generation of web-based communities and hosted services - such as social-networking sites, wikis and folksonomies - which aim to facilitate collaboration and sharing between users.” • The term Web 2.0 was made popular by Tim O’Reilly: – http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html
  22. Slide 22: Features / principles of Web 2.0 (O’Reilly) 1. The Web as platform 2. Harnessing collective intelligence 3. Data is the next “Intel Inside” 4. End of the software release cycle 5. Lightweight programming models 6. Software above the level of a single device 7. Rich user experiences + The long tail 22
  23. Slide 23: Web 2.0 and social media in simple terms 1. Users 2. Content 3. Tags 4. Comments – Users post content – Users share content – Users annotate content with tags – Users browse content via tags – Users discuss content via comments – Users connect via posted content – Users connect directly to users 23
  24. Slide 24: Content can be… • Books Amazon • Discussion postings Blogs • Bookmarks del.icio.us • Photos Flickr • Music Last.fm • Movies Netflix • Events Upcoming.org • Places Dopplr • Products Microsoft Aura • Articles Wikipedia 24
  25. Slide 25: Blogging: a phenomenon for a new generation? • Cincinnati Enquirer, October 2004 25
  26. Slide 26: Overview of blogs • Weblog, web log or simply a blog is a web journal • “A web application which contains periodic time-stamped posts on a common (usually open-access) webpage” • Individual diaries -> arms of political campaigns, media programs and corporations (e.g. the Google Blog) • Citizen journalism… • Posts are often shown in reverse chronological order • Comments can be made by the public on some blogs • Latest headlines, with hyperlinks and summaries, are syndicated using RSS or Atom formats (e.g. for reading favourite blogs with a feed reader) 26
  27. Slide 27: The state of the blogosphere from Technorati • 70 million blogs • The blogosphere is doubling in size every 320 days (slowing down a little) • 120,000 new blogs are created each day (i.e. 1.4 new blogs every second) • 1.5 million blog posts are made in a day (i.e. 17 posts per second) • Around 5-10% of new blogs are spam blogs or “splogs” • 35% of blog posts use tags 27
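The per-second figures on slide 27 follow from dividing the daily counts by the 86,400 seconds in a day. A quick check in Python is sketched below; the function and variable names are illustrative and do not come from the slides.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_second(per_day: float) -> float:
    """Convert a daily count into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


# Daily counts quoted on the slide (attributed there to Technorati).
new_blogs_per_second = per_second(120_000)
posts_per_second = per_second(1_500_000)

print(f"{new_blogs_per_second:.1f} new blogs per second")
print(f"{posts_per_second:.0f} posts per second")
```

Rounded to the precision used on the slide, both rates agree with the quoted values.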
  28. Slide 28: Definition of wikis • A wiki is a type of website that allows users to easily add and edit content and is especially suited for collaborative writing • The name is based on the Hawaiian term wiki-wiki, meaning “quick”, “fast”, or “to hasten” • It amounts to a group of web pages that allows users to quickly add content and also allows others to edit the content: – It relies on cooperation, checks and balances of its members, and a belief in sharing of ideas
  29. Slide 29: Some uses of wikis • Wikis are being used for: – online encyclopaedias – free dictionaries – book repositories – software development – project proposals – writing research papers – event organisation
  30. Slide 30: The Wikipedia: from Irish to Esperanto 30
  31. Slide 31: Flickr, share your photos 31
  32. Slide 32: SlideShare for presentations 32
  33. Slide 33: The social bookmarking service del.icio.us 33
  34. Slide 34: All Consuming, what have you read today? 34
  35. Slide 35: LibraryThing, find out who else reads like you 35
  36. Slide 36: CiteULike, get publication references from peers 36
  37. Slide 37: Upcoming event listings and meetups 37
  38. Slide 38: Dopplr for managing travel, tracking friends abroad 38
  39. Slide 39: TouristR for travel destination stories and info 39
  40. Slide 40: You can even share your favourite walks… 40
  41. Slide 41: …and find others with like musical interests 41
  42. Slide 42: 2. Social networking services (SNSs) so far
  43. Slide 43: We all live in a social network… • …of friends, family, workmates, fellow students, acquaintances, etc. 43
  44. Slide 44: Everyone’s connected… • Friend of a friend, or “dúirt bean liom go ndúirt bean léi” (Irish: “a woman told me that a woman told her”) • Theory that anybody is connected to everybody else (on average) by no more than six degrees of separation
  45. Slide 45: Milgram’s six degrees of separation theory • Sociologist Stanley Milgram (1933-1984) conducted this experiment: – Random people from Nebraska were to send a letter (via intermediaries) to a stock broker in Boston – Could only send to someone with whom they were on a first-name basis • Among the letters that found the target, the average number of links was six
  46. Slide 46: And now a major motion picture, kind of… • Six Degrees of Separation (1993) – “I read somewhere that everybody on this planet is separated by only six other people. Six degrees of separation between us and everyone else on this planet. The President of the United States, a gondolier in Venice, just fill in the names... It’s not just big names – it’s anyone. A native in a rain forest, a Tierra del Fuegan, an Eskimo. I am bound – you are bound – to everyone on this planet by a trail of six people.” – Play from 1990 by John Guare
  47. Slide 47: The Erdős number • Paul Erdős (1913-1996) • Number of links required to connect scholars to Erdős via co-authorship of papers • Erdős wrote 1500+ papers with 507 co-authors • Jerry Grossman’s site allows mathematicians to compute their Erdős numbers: – http://www.oakland.edu/enp/ • Connecting path lengths, among mathematicians only: – The average is 4.65 – The maximum is 13
  48. Slide 48: Trying to make friends [Diagram: acquaintance network with the labels Latvia, Uldis, Valdis, DERI, Met, John, Marc, Clare, Bros, John C, Andrew and Dublin] Marc and I already had friends in common! I later found out my cousin Ailish also knows Andrew. The “small world” phenomenon…
  49. Slide 49: “It’s a small world after all!”, by Kentaro Toyama [Diagram: acquaintance graph whose nodes include Kentaro, Bash, Ranjeet, Sharad, Prof. McDermott, Anandan, Prof. Sastry, Ravi, Ravi’s Father, Karishma, Prof. Prahalad, Pres. Kalam, Prof. Jhunjhunwala, PM Manmohan Singh, Dr. Isher Judge Ahluwalia, Amitabh Bachchan, Nandana Sen, Dr. Montek Singh Ahluwalia and Prof. Amartya Sen, among others] * Source: http://research.microsoft.com/toyama/talks/
  50. Slide 50: The Kevin Bacon game • Invented by three Albright College students in 1994: – Craig Fass, Brian Turtle, Mike Ginelly • Goal is to connect any actor to Kevin Bacon, by linking actors who have acted in the same movie • The “Oracle of Bacon” website uses IMDB to find the shortest link between any two actors: – http://oracleofbacon.org/ [Image: boxed version of the game]
  51. Slide 51: The Kevin Bacon game (2) • Total number of actors in database (as of 15th October): – 893283 • Average path length to Kevin: – 2.957 • Actor closest to “center”: – Rod Steiger (2.68) • Rank of Kevin, in terms of closeness to center: – 1049th • Most actors are within three links of each other! 51
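Finding the shortest link between two actors, as in the Kevin Bacon game, is a shortest-path search over a graph in which two actors are connected if they appeared in the same movie. The following is a minimal sketch using breadth-first search. The movie data is invented placeholder data, the function names are hypothetical, and nothing here reflects how the Oracle of Bacon site is actually implemented.

```python
from collections import deque

# Hypothetical toy data: movie title -> cast. Not real filmographies.
MOVIES = {
    "Movie A": ["Actor 1", "Actor 2"],
    "Movie B": ["Actor 2", "Actor 3"],
    "Movie C": ["Actor 3", "Actor 4"],
}


def build_graph(movies):
    """Connect every pair of actors who share at least one movie."""
    graph = {}
    for cast in movies.values():
        for actor in cast:
            graph.setdefault(actor, set()).update(a for a in cast if a != actor)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest chain of actors, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in sorted(graph[path[-1]]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(MOVIES)
path = shortest_path(graph, "Actor 1", "Actor 4")
links = len(path) - 1  # number of co-appearance links in the chain
print(path, links)
```

Breadth-first search visits actors in order of distance from the start, so the first chain that reaches the goal is a shortest one. Averaging `links` over all actors reachable from one person gives the kind of "average path length" quoted on the slide.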
  52. Slide 52: What are social networking services (SNSs)? • From the beginning, the Internet was a medium for connecting not only machines but people • Idea behind SNSs is to make the aforementioned real-world relationships explicitly defined online • 2002: – Friendster • 2003: – MySpace, LinkedIn, hi5 • 2004: – orkut, Facebook • 2005: – Bebo
  53. Slide 53: The popularity of SNSs • The 10 most popular domains ~= 40% of all page views on the Web (Compete, November 2006) – Nearly half of those views were from the social networking services MySpace and Facebook – wow! – And that’s just in the top 10… • Alexa rankings: – #5: MySpace – #6: Facebook – #8: hi5 – #10: orkut – #18: Friendster – #119: Bebo – #212: LinkedIn
  54. Slide 54: SNSs attracting lots of monetary / media attention • Friendster – $13M VC • Tribe – $6.3M VC • LinkedIn – $4.7M VC • Bebo – $15M VC, sold to AOL for $850M • MySpace – Sold for $580M • Friends Reunited – Sold for £120M • Facebook – $1B Y! offer, 1.6% sold to MS for $250M 54
  55. Slide 55: Motivation for social network services • Allows a user to create and maintain an online network of close friends or business associates for social and professional reasons: – Friendships and relationships – Offline meetings – Curiosity about others – Business opportunities – Job hunting … – For social good: • Kevin Bacon – sixdegrees.org • Ammado - ammado.com • Sun – openeco.org
  56. Slide 56: Big social network services (in terms of accounts) • myspace.com 110,000,000 • facebook.com 98,000,000 • habbo.com 86,000,000 • spaces.live.com 40,000,000 • orkut.com 59,000,000 • hi5.com 70,000,000 • friendster.com 58,000,000 • xanga.com 40,000,000 • classmates.com 40,000,000 • flixster.com 36,000,000 • netlog.com 32,000,000 • reunion.com 28,000,000 http://en.wikipedia.org/wiki/List_of_social_networking_websites 56
  57. Slide 57: Features of social network services • Network of friends (inner circle) • Person surfing • Private messaging • Discussion forums • Events management • Blogging and commenting • Media uploading 57
  58. Slide 58: Facebook, #6 in the world 58
  59. Slide 59: The success of (and hype around) Facebook • According to Robert Scoble today, MS want to buy Facebook for $15-$20B: – http://scobleizer.com/2008/05/19/why-microsoft-will-buy-facebook-and-keep-it-closed/ • 4,000 applications have been created for Facebook’s developer interface: – 70,000 developers signed up • Active user count jumped by 70% in the four months after this contributable application layer was added • 50% of Facebook users are non-students: – People over 24 are its fastest-growing demographic
  60. Slide 60: orkut, Google’s SNS 60
  61. Slide 61: Get LinkedIn to business contacts, 15 million users 61
  62. Slide 62: OpenEco, an SNS for managing GHG emissions
  63. Slide 63: Elgg, social networking software for education 63
  64. Slide 64: Other niche SNSs • Age: – Multiply (seniors and settled); Boomj (baby boomers); Rezoom • Country of origin: – Silicon India • Gender: – CaféMom; MothersClick; Sister Woman (female friends) • Occupation: – ModelsHotel; FanLib (fiction writers); AdGabber; TheFeng.org (financial services executives); MilitarySpot (military families); Sermo (doctors and physicians) • Business and careers: – ConnectBuzz; Doostang; Execunet; Netshare; Ryze; Viadeo; Xing • Interests: – TradeKing (investors); StreetCred (hip hop); IndiePublic (art and design); PeerTrainer (health and wellbeing) 64 * Source: Paul Gibler, Wisconsin Technology Network
  65. Slide 65: Enterprise 2.0 • Web 2.0 includes applications such as blogs, wikis, RSS feeds and social networking, while Enterprise 2.0 is the packaging of those technologies in both corporate IT and workplace environments • “Enterprise 2.0 is the use of emergent social software platforms within companies, or between companies and their partners or customers”, Harvard Business School’s Professor Andrew McAfee • “There are direct enterprise equivalents [to Facebook]. You can ask people the status of their projects, what they’re working on, are they travelling, things they’ve learned. All of these things would be very valuable inside an enterprise.” 65
  66. Slide 66: Enterprise 2.0 (2) • Social media services that people have been using in everyday life on the Web are now entering organisations: – Blogs – Wikis – Social networking – Tagging • Lots of companies and products in this space: – Awareness, Mentor Scout, Contact Networks, Microsoft SharePoint, IBM Lotus Connections, SelectMinds, introNetworks, Tacit, Illumio, Jive Software, Visible Path, Leverage Software, Web Crossing, SocialText • These new deployments also face the same issues that are on the Web 66
  67. Slide 67: introNetworks 67
  68. Slide 68: Jive Software 68
  69. Slide 69: Visible Path – Visible Path powers “Hoover’s Connect” for business research company Hoover's, which lets users know how they're connected to companies and people in the Hoover's database 69
  70. Slide 70: 3. Issues with social networking services
  71. Slide 71: Problems with SNSs • Fundamental problems block their potential to access the full range of available content and networked people online • There is a need to build semantic social networking into the fabric of the next-generation Internet itself: – Interconnecting both content and people in a meaningful way
  72. Slide 72: First issue Need interesting objects to draw you back to keep on using social networking services 72 * Source: Jyri Engestrom, “Object-Centered Sociality”, Reboot 7
  73. Slide 73: Many social networking services are boring… 73 * Source: Jyri Engestrom, “Object-Centered Sociality”, Reboot 7
  74. Slide 74: Object-centred sociality can provide meaning • Users connected via a common object, e.g., their job, university, hobbies, a date… • “Another tradition of theorizing offers an explanation of why Russell linked out, and why so many YASNS ultimately fail.” • “According to this theory, people don’t just connect to each other. They connect through a shared object.” 74 * Source: Jyri Engestrom, “Why Some Social Networks Work…”
  75. Slide 75: Object-centred sociality can provide meaning (2) • “When a service fails to offer the users a way to create new objects of sociality, they turn the connecting itself into an object [LinkedIn].” • “Good services allow people to create social objects that add value.” – Flickr = photos – del.icio.us = bookmarks – Blogs = discussion posts 75 * Source: Jyri Engestrom, “Why Some Social Networks Work…”
  76. Slide 76: These are the social objects… • Discussions • Bookmarks • Annotations • Profiles • Microblogs • Multimedia … …that connect us to other people 76
  77. Slide 77: Second issue We all have too many separate profiles and sets of contacts on disconnected social networking services 77
  78. Slide 78: So many social media sites… 78 * Source: Smashcut Media, www.smashcut-media.com
  79. Slide 79: Even more services… 79
  80. Slide 80: It takes a lot of time… 80
  81. Slide 81: Filling out your profiles, re-adding your friends… 81
  82. Slide 82: Uploading posts and content items to “stovepipes”! 82
  83. Slide 83: What if I use multiple services and I want to… • Move the stuff I have on one service to another (e.g. move all my blog posts, comments, friends, etc. from WordPress.com to “Acme Blogs”) • Move all my stuff from multiple services to one third-party service • Centralise my stuff on my own service, e.g. my blog • See my stuff on a third-party service providing an aggregate view, like FriendFeed 83
  84. Slide 84: (De-)centralised me 84
  85. Slide 85: Initiatives set up to address this recently • Social network portability: – http://groups.google.com/group/social-network-portability • A bill of rights for users of the Social Web: – http://opensocialweb.org/ • DataPortability: – http://dataportability.org/ • DiSo: – http://code.google.com/p/diso/ • OpenSocial (see also Friend Connect): – http://opensocial.org/ 85
  86. Slide 86: Social network portability • Need distributed social networks and reusable profiles • Users may have many identities and sets of friends on different social networks, where each identity was created from scratch • Allow user to import existing profile and contacts, using a single global identity with different views (e.g., via FOAF, hCard, OpenID, etc.) • See also: – http://bradfitz.com/social-graph-problem/ – http://danbri.org/words/2007/09/13/194 – http://code.google.com/apis/socialgraph/ 86
  87. Slide 87: Social networking fatigue • How many general or niche SNSs are you willing to register and / or interact with? • People search engine and aggregation sites are now appearing to compensate: – SocialURL – organise your online identities – PeekYou – matching web pages with their owners – Spock – organising information around people – Rapleaf – reputation lookup and email search – Wink – free people search engine – FriendFeed – subscribe to all of your friends’ feeds 87
  88. Slide 88: Ownership, control, freedom at opensocialweb.org 88
  89. Slide 89: The DataPortability initiative • http://dataportability.org • Existing technologies • Inventing no new ones 89
  90. Slide 90: Other initiatives “near” DataPortability 90
  91. Slide 91: Fold a social networking layer into tech stacks • Make social networking a shared component across various desktop and Web applications • Rather than having a fragmented view of one’s network in each application, the social networking stack would let users employ all their person-to-person connections in any application: – See http://doi.ieeecomputersociety.org/10.1109/MIC.2007.138 91
  92. Slide 92: 4. Leveraging semantics on the Social Web
  93. Slide 93: timbl on Semantic Web / Social Web synergies “I think we could have both Semantic Web technology supporting online communities, but at the same time also online communities can support Semantic Web data by being the sources of people voluntarily connecting things together.” Sir Tim Berners-Lee, podcast interview during ISWC 2005, http://esw.w3.org/topic/IswcPodcast
  94. Slide 94: Semantics can help • By using agreed-upon semantic formats to describe people, content objects and the connections that bind them all together, social media sites can interoperate by appealing to common semantics • Developers are already using semantic technologies to augment the ways in which they create, reuse, and link profiles and content on social media sites (using FOAF, XFN / hCard, SIOC, etc.) • In the other direction, object-centered social networks can serve as rich data sources for semantic applications 94
  95. Slide 95: The (evolving) Semantic Web layer cake • http://www.w3.org/2007/03/layerCake.png 95
  96. Slide 96: A need for common semantics • Communities should provide their data in a common, machine-understandable way: – RDF (resource description framework) as a data layer – One single format for all the data – Different transport layers (RDF/XML, N3, etc.) – The base of the Semantic Web • Communities should use common semantics to define this data: – Avoiding the use of proprietary APIs – Since this means that they can talk together, exchange information, using the same modelling layer for their data – Using SIOC for representing content and actions – Using FOAF for representing people and networks 96
  97. Slide 97: FOAF (Friend-of-a-Friend) • FOAF is an ontology for describing people and the relationships that exist between them • Can be integrated with any other SW vocabularies • Some services with FOAF exports: • People can also create their own FOAF document and link to it from their homepage • FOAF documents usually contain personal info, links to friends, and other related resources 97
  98. Slide 98: A distributed social network with FOAF • Can use FOAF to describe social networks across a number of services • Picture shows data from both boards.ie and John’s hand-coded FOAF file
  99. Slide 99: The (lowercase) semantic web • Microformats: – http://microformats.org/ – “Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards.” – Embedded metadata within (X)HTML web pages 99
  100. Slide 100: 100
  101. Slide 101: Semantically-Interlinked Online Communities (SIOC) • An effort from DERI to discover how we can create and establish ontologies on the Semantic Web • Goal of the SIOC ontology is to address interoperability issues on the (Social) Web • SIOC has been adopted in a framework of 50 applications or modules deployed on over 400 sites • http://sioc-project.org 101
  102. Slide 102: Motivations for SIOC • Need to understand how to create and establish ontologies on the Web: – Social engineering is required – Model, agree, deploy, re-model • Disconnected sites on the Social Web require ontologies for interoperation: – Lots of social data, inherent semantics (chicken and egg) – Potential for high impact • In parallel, lack of integration between social software and other systems in enterprise intranets 102
  103. Slide 103: The aims of SIOC • To “semantically-interlink online communities” • To fully describe the content and structure of community sites • To create new connections between online discussion posts and items, forums and containers • To enable the integration of online community information • To browse connected Social Web items in interesting and innovative ways • To overcome the chicken-and-egg problem with the Semantic Web 103
  104. Slide 104: 104
  105. Slide 105: 105
  106. Slide 106: The steps involved 1. Develop an ontology of terms for representing rich data from the Social Web 2. Create a food chain for producing, collecting and consuming SIOC data 3. As well as dissemination via papers about SIOC, provide docs and examples at sioc-project.org • SIOC aims to enrich the Web infrastructure: – During the next upgrade cycle, gigabytes of community data become available!
  107. Slide 107: The SIOC ontology • The main classes and properties are: SIOC Specification: http://rdfs.org/sioc/spec 107
  108. Slide 108: The SIOC food chain 108
  109. Slide 109: Dissemination 109
  110. Slide 110: 110
  111. Slide 111: Quotes about SIOC • “I […] think the concept is HOT” – Robert Douglass, Drupal Developer • “It just dawned on me that the burgeoning SIOC-o-sphere (online communities exporting and exposing content via SIOC Ontology) is actually: Blogosphere 2.0” – Kingsley Idehen, Founder and CEO of OpenLink Software • “SIOC has the potential to become one of the foundational vocabularies that make Semantic Web applications useful” – Ivan Herman, W3C / ERCIM • “A project that started back in 2000 called Friend-of-a-Friend (FOAF) represents relationships between people, as well as basic contact details. SIOC does this for groups: it extends the FOAF idea to being able to talk about whole groups of people. I am excited about SIOC because you can use that information to determine trust, to let people in.” – Tim Berners-Lee, Creator of the World Wide Web 111
  112. Slide 112: SIOC metrics • SIOC documents at PTSW: – 107759 (SIOC) – 96540 (SIOC Types) • 42911 hits in Swoogle • Sites producing SIOC data: – 373 listed in PTSW pings • SIOC ontology is ranked 4th and SIOC Types module 5th in 500 ontologies at PTSW • SIOC developer mailing list: – 200 members – 900 posts [Chart: time series with a vertical axis from 0 to 120000 and dates from 01/09/2007 to 12/04/2008]
  113. Slide 113: What is required to represent a community? • Represent the data, not only documents: – From the WWW to a “GGG”, hyperlinks to semantic relationships • A model for all the aspects of a community: – User accounts, groups and roles: • Reader, reviewer, moderator – Content and types: • A blog, a blog post, a bulletin board, a wiki page, etc. – Actions between users and content: • Uldis creates a post, Alex comments on it, John moderates it • A model for the entire content: – Any data: RSS 1.0 and Atom limited to syndication / latest posts – Any user and relationship: new user, new post, replies, etc.
  114. Slide 114: Representing community data with SIOC • Using SIOC as an ontology to represent the activities of online communities on the Web: – Namespace: http://rdfs.org/sioc/ns – Five top-level classes: User / Role / Space / Container / Item – A “SIOC Types” module for Social Web content – Action: A user posts an item in a container • A Semantic Web citizen: – Reusing and interlinking existing ontologies – Not reinventing the wheel (connects to DC, FOAF, etc.): • http://www.w3.org/Submission/2007/SUBM-sioc-related-20070612/ 114
  115. Slide 115: The SIOC ontology • The main classes and properties are: SIOC Specification: http://rdfs.org/sioc/spec 115
  116. Slide 116: Example of SIOC data • Alex wrote a post on his WordPress blog: :myblogpost rdf:type sioc:Post ; dc:title "I'm blogging this" ; sioc:has_creator :alex ; sioc:has_container :mywpblog . :mywpblog rdf:type sioc:Forum .
  117. Slide 117: The same model for any website • John wrote a post on his Drupal-powered blog: :myblogpost rdf:type sioc:Post ; dc:title "Another blog post" ; sioc:has_creator :john ; sioc:has_container :mydrupal . :mydrupal rdf:type sioc:Forum .
  118. Slide 118: The same model for rich data • Uldis owns a photo gallery on Flickr: :myitempost rdf:type exif:IFD ; dc:title "Another posted item" ; sioc:has_creator :john ; sioc:has_container :myflickrgallery . :myflickrgallery rdf:type sioct:ImageGallery . • We reuse external vocabularies (e.g. EXIF) to define item types
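The three examples on slides 116 to 118 share one shape: an item has a type, a creator and a container, whichever site it comes from. A minimal sketch of that idea in plain Python follows. Tuples stand in for RDF triples, no RDF library is used, and the helper name `items_by_creator` is hypothetical. The two blog posts are given distinct local names here (`:wp_post`, `:drupal_post`) because the slides reuse `:myblogpost` for both.

```python
# Each triple is (subject, predicate, object), mirroring slides 116-118.
TRIPLES = [
    (":wp_post", "rdf:type", "sioc:Post"),
    (":wp_post", "sioc:has_creator", ":alex"),
    (":wp_post", "sioc:has_container", ":mywpblog"),
    (":mywpblog", "rdf:type", "sioc:Forum"),
    (":drupal_post", "rdf:type", "sioc:Post"),
    (":drupal_post", "sioc:has_creator", ":john"),
    (":drupal_post", "sioc:has_container", ":mydrupal"),
    (":mydrupal", "rdf:type", "sioc:Forum"),
    (":myitempost", "rdf:type", "exif:IFD"),
    (":myitempost", "sioc:has_creator", ":john"),
    (":myitempost", "sioc:has_container", ":myflickrgallery"),
    (":myflickrgallery", "rdf:type", "sioct:ImageGallery"),
]


def items_by_creator(triples, creator):
    """Return (item, container) pairs for everything a creator made."""
    made = [s for s, p, o in triples if p == "sioc:has_creator" and o == creator]
    containers = {s: o for s, p, o in triples if p == "sioc:has_container"}
    return [(item, containers.get(item)) for item in made]


johns_items = items_by_creator(TRIPLES, ":john")
print(johns_items)
```

The same lookup covers the Drupal post and the Flickr item without any site-specific code, which is the point the slides make about a shared model.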
  119. Slide 119: 119
  120. Slide 120: Interlinking communities • Since all communities can use the same model to define their data, it is easy to link them from a data point of view • Interlinking: – URIs are used to define things and created objects – A post on blog “A” can be semantically linked to a post on blog “B” • Using SPARQL to query data: – Can perform unified queries no matter where the data comes from – No need to learn new APIs from data providers – SPARQL is a W3C Recommendation for querying RDF 120
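To illustrate why a shared model allows unified queries, the sketch below pools triples from two sites and runs one pattern over the result. This is a toy pattern matcher written for illustration, not SPARQL and not any library's API. The site URLs and the `match` helper are hypothetical.

```python
def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Hypothetical exports from two different sites, using the same vocabulary.
blog_a = [
    ("http://a.example/post/1", "rdf:type", "sioc:Post"),
    ("http://a.example/post/1", "sioc:links_to", "http://b.example/post/7"),
]
blog_b = [
    ("http://b.example/post/7", "rdf:type", "sioc:Post"),
]

merged = blog_a + blog_b  # merging two graphs is just pooling their triples
all_posts = sorted(t[0] for t in match(merged, p="rdf:type", o="sioc:Post"))
cross_links = match(merged, p="sioc:links_to")
print(all_posts)
print(cross_links)
```

In SPARQL the first pattern would be written roughly as `SELECT ?post WHERE { ?post rdf:type sioc:Post }`, with the prefixes declared. The query does not need to know which site each post came from.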
  121. Slide 121: FOAF and social network connections • FOAF allows us to represent the connections between people: – A machine-readable format for social-networking • Using the foaf:knows property: – :John foaf:knows :Alex • Extensions using the RELATIONSHIP vocabulary: – http://vocab.org/relationship/ – All rel:* properties are subproperties of foaf:knows – :John rel:worksWith :Uldis – RDFS inferencing allows tools to answer queries using foaf:knows when people use rel:* alternatives 121
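The last bullet on slide 121 relies on one RDFS rule: if a property is a subproperty of another, every statement made with the first also holds for the second. A minimal sketch of that rule in plain Python follows. The function name and the dictionary encoding of `rdfs:subPropertyOf` are assumptions made for illustration, and a real RDFS reasoner covers more rules than this one.

```python
# rdfs:subPropertyOf declarations; rel:worksWith is taken from the slide.
SUBPROPERTY_OF = {"rel:worksWith": "foaf:knows"}


def infer_superproperties(triples, subproperty_of):
    """Add (s, super, o) for each (s, sub, o), following chains upward."""
    inferred = set(triples)
    for s, p, o in triples:
        seen = {p}  # guards against accidental cycles in the declarations
        while p in subproperty_of and subproperty_of[p] not in seen:
            p = subproperty_of[p]
            seen.add(p)
            inferred.add((s, p, o))
    return inferred


data = {
    (":John", "foaf:knows", ":Alex"),
    (":John", "rel:worksWith", ":Uldis"),
}
closed = infer_superproperties(data, SUBPROPERTY_OF)
known_by_john = sorted(
    o for s, p, o in closed if s == ":John" and p == "foaf:knows"
)
print(known_by_john)
```

A query for `foaf:knows` over the inferred set now returns Uldis as well as Alex, even though the data about Uldis only used `rel:worksWith`.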
  122. Slide 122: Linking people to user accounts • FOAF is the main vocabulary used to represent people: – http://foaf-project.org – foaf:Person class: • “The foaf:Person class represents people. Something is a foaf:Person if it is a person.” – foaf:holdsAccount property: • “The foaf:holdsAccount property relates a foaf:Agent to a foaf:OnlineAccount for which they are the sole account holder.” – Linking people to user accounts: • sioc:User rdfs:subClassOf foaf:OnlineAccount • Links a foaf:Person to various sioc:User(s) • As many sioc:User(s) as required can be linked to a single person • One person, various identities
  123. Slide 123: Representing users and online accounts • The sioc:User class: – An online user account – Can be thought of as a virtual representation of any person online, within the context of a given social media website or community – A subclass of foaf:OnlineAccount – Various properties: • name, avatar, email – Users create and manage content: • has_creator and has_modifier properties • :blogpost123 sioc:has_creator :john – A user can have roles on a given container: • (Moderator, Forum 1) ← User A • (Contributor, Blog 2) ← User B 123
  124. Slide 124: A person and their user accounts 124
  125. Slide 125: Add SKOS for topics and categories • Interlinking using common categories: – Share tags and topics across different content • SKOS (Simple Knowledge Organisation System): – http://www.w3.org/2004/02/skos/ – A vocabulary to describe controlled vocabularies – Used in the “Tag Ontology”: • http://www.holygoat.co.uk/projects/tags/ 125
  126. Slide 126: Interlinking content with SKOS skos:isSubjectOf sioc:topic 126
  127. Slide 127: Interlinking content items • Can create direct links between instances of sioc:Item: – Link from a blog post to a bulletin board page – sioc:related_to, sioc:links_to, sioc:has_reply • Interlinking using common categories: – Share tags and topics across different content – SKOS: Simple Knowledge Organisation System • http://www.w3.org/2004/02/skos/ • A vocabulary to describe controlled vocabularies • Used in the “Tag Ontology”: http://www.holygoat.co.uk/projects/tags/ • Interlink using existing URIs as topics – geonames.org , DBpedia, Revyu – MOAT: a process to simplify linking content to such URIs • http://moat-project.org/ 127
  128. Slide 128: Identity management across networks • Social media sites (or RDF exporters) create a new foaf:Person instance when they export their data: – TalkDigger, Revyu, Flickr exporters, etc. – There is a need to unify URIs so as to represent one's unified identity • Linked-data principles are to use owl:sameAs and rdfs:seeAlso: – See http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/ – owl:sameAs: Used to identify two resources with different URIs as being the same resource – rdfs:seeAlso: “More information about this resource can be found here”, can be used by Semantic Web tools such as Tabulator • Inference using owl:InverseFunctionalProperty: – foaf:mbox, foaf:openid, etc. can be used to identify uniqueness for a foaf:Person • Unifying aspects of a foaf:Person across networks: – All relevant sioc:User accounts may be related to one foaf:Person 128
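The inverse-functional-property inference can be illustrated with a small union-find over tuple triples: two foaf:Person URIs that share the same foaf:mbox (or foaf:openid) value are treated as one person. This is a sketch under stated assumptions; the exporter URIs and mailbox values are invented, and `smush` is a hypothetical helper name.

```python
from collections import defaultdict

# Properties treated as inverse functional in this sketch.
IFPS = {"foaf:mbox", "foaf:openid"}

def smush(triples, ifps=IFPS):
    """Group subject URIs that share a value for any inverse functional property."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    first_seen = {}  # (property, value) -> first subject seen with it
    for s, p, o in sorted(triples):
        find(s)
        if p in ifps:
            if (p, o) in first_seen:
                union(s, first_seen[(p, o)])
            else:
                first_seen[(p, o)] = s

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    # Only report URIs that were actually unified with another one.
    return [sorted(g) for g in groups.values() if len(g) > 1]

# Hypothetical foaf:Person URIs minted by three different exporters.
exported = {
    ("http://revyu.example/people/alex", "foaf:mbox", "mailto:alex@example.org"),
    ("http://talkdigger.example/user/ap", "foaf:mbox", "mailto:alex@example.org"),
    ("http://flickr.example/people/bob", "foaf:mbox", "mailto:bob@example.org"),
}
same_person = smush(exported)
print(same_person)
```

Each reported group could then be published as owl:sameAs links between the URIs.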
  129. Slide 129: Linking foaf:Person URIs for one person :alex owl:sameAs flickr:33669349@N00 ; owl:sameAs twitter:terraces 129
  130. Slide 130: Distributed social networking with FOAF • Combining networks from multiple FOAF URIs via owl:sameAs: – Decentralised social networks can represent connections for the same person – A person’s networks can be merged together – Any sub-network in the social graph can be reached from a single entry point, via the person’s URI 130
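Merging a person's networks via owl:sameAs might look roughly like the following. The owl:sameAs pairs reuse the identifiers from the earlier slide; the contact identifiers (`flickr:john`, `twitter:uldis`, `:stefan`) and both function names are assumptions made for illustration.

```python
from collections import defaultdict

def canonical_map(same_as_pairs):
    """Map every URI to one representative of its owl:sameAs cluster."""
    adjacent = defaultdict(set)
    for a, b in same_as_pairs:
        adjacent[a].add(b)
        adjacent[b].add(a)
    canon = {}
    for start in adjacent:
        if start in canon:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first walk over the sameAs links
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacent[node] - component)
        representative = min(component)
        for node in component:
            canon[node] = representative
    return canon

def merged_contacts(person, knows_edges, same_as_pairs):
    """Collect foaf:knows targets of every URI that denotes the same person."""
    canon = canonical_map(same_as_pairs)
    resolve = lambda uri: canon.get(uri, uri)
    return sorted({resolve(o) for s, o in knows_edges if resolve(s) == resolve(person)})

same_as = [(":alex", "flickr:33669349@N00"), (":alex", "twitter:terraces")]
knows = [
    ("flickr:33669349@N00", "flickr:john"),   # hypothetical contact on one site
    ("twitter:terraces", "twitter:uldis"),    # hypothetical contact on another
    (":alex", ":stefan"),                     # hypothetical contact in the FOAF file
]
contacts = merged_contacts(":alex", knows, same_as)
print(contacts)
```

Starting from any one of the person's URIs gives the same merged result, which is the "single entry point" property mentioned above.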
  131. Slide 131: Integrating social networks with FOAF Common formats, unique URIs 131 * Source: Sheila Kinsella, Applications of Social Network Analysis 2007
  132. Slide 132: Distributed social networking with FOAF 132
  133. Slide 133: Applications for browsing the social (semantic) graph • FOAFnaut, FOAF Explorer, etc. • FOAFGear: thanks to common semantics, only 100 lines of code: http://apassant.net/home/2008/01/foafgear/ 133
  134. Slide 134: Aggregation of semantic social networks • Browse / re-use your social graph in personal applications • Merge identities with pre-defined rules • Tools: – Beatnik – Knowee – SPARQLpress – Nepomuk (Social Semantic Desktop) 134
  135. Slide 135: Using OpenID with FOAF • Can link to your FOAF profile from your OpenID URL, so that services can browse your machine-readable profile when you log-in: <head> <link rel="meta" type="application/rdf+xml" title="FOAF" href="foaf.rdf" /> </head> 135
  136. Slide 136: Example of OpenID used with FOAF • Bob creates an account on Networkr, a new social networking website, using OpenID • Networkr retrieves the FOAF URI thanks to an auto-discovery link • From the FOAF file, it identifies if there are any people already subscribed to Networkr who are listed in Bob’s defined relationships: – Bob can add them as “local connections”, share data with them, etc. without having to once again search for / add his friends • Specific rules: – If I know X from Flickr, he / she can see my pictures on Networkr 136
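The autodiscovery step could be sketched as follows, using only Python's standard HTML parser. The page content and the base URL `http://bob.example.org/` are hypothetical, as is the class name; a real service would fetch the page at the user's OpenID URL first.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class RDFAutodiscovery(HTMLParser):
    """Collect <link rel="meta" type="application/rdf+xml"> targets from a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []  # list of (title, absolute URL)

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        if ((a.get("rel") or "").lower() == "meta"
                and a.get("type") == "application/rdf+xml"
                and a.get("href")):
            # Relative hrefs such as "foaf.rdf" are resolved against the page URL.
            self.links.append((a.get("title"), urljoin(self.base_url, a["href"])))

# Hypothetical page served at Bob's OpenID URL.
page = ('<html><head><link rel="meta" type="application/rdf+xml" '
        'title="FOAF" href="foaf.rdf" /></head><body></body></html>')

parser = RDFAutodiscovery("http://bob.example.org/")
parser.feed(page)
print(parser.links)
```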
  137. Slide 137: 137
  138. Slide 138: SIOC data producers • SIOC applications list: – http://rdfs.org/sioc/applications/ • > 20 applications for producing SIOC data: – Free and open source • SIOC export tools for: – Blogs and forums: WordPress, phpBB, Drupal, b2evolution – “Legacy” applications: mailing lists, IRC – New media: Twitter, Jaiku, Facebook, Flickr – Enterprise applications: CWE (collaborative work environments) 138
  139. Slide 139: Case studies • WordPress SIOC exporter: – http://sioc-project.org/wordpress – First SIOC plugin created, custom built • vBulletin and phpBB SIOC exporters: – http://wiki.sioc-project.org/index.php/VBSIOC – http://sioc-project.org/phpbb – Uses SIOC API for PHP 139
  140. Slide 140: Overview of WordPress SIOC exporter • Installation: – Download from http://sioc-project.org/wordpress – “Drop” two files into the WordPress plugins folder – Go to the administrator’s user interface – Plugins → SIOC Plugin → “Activate” • SIOC data created for every page: – Data describing all blog posts, comments, users, etc. – SIOC data can be discovered via RDF autodiscovery links: – <link rel="meta" type="application/rdf+xml" title="SIOC" href="http://www.johnbreslin.com/blog/index.php?sioc_type=site" /> • Data can be explored or crawled using existing Semantic Web applications 140
  141. Slide 141: Sample export of SIOC data from WordPress 141
  142. Slide 142: • RDF data from the WordPress SIOC Exporter, displayed in the SIOC RDF Browser 142
  143. Slide 143: SIOC export APIs • Benefits: – Hides the complexity from application developers – Can be used by people who are not Semantic Web experts – Automatically updated according to changes in the SIOC ontology and best practices documents • Existing SIOC APIs: – Java – Perl (new!) – PHP (most used) – RDFa on Rails • See “2.1 SIOC APIs” in http://rdfs.org/sioc/applications/ 143
  144. Slide 144: Overview of vBulletin and phpBB SIOC Exporters • There is a large amount of structured, related information contained within message boards, and this can be leveraged in interesting ways by exposing the semantic data for new applications • Exporters have been developed for commercial (vBulletin) and open-source (phpBB) message board systems, bringing these islands together and linking conversations on the same topics that are taking place across various sites • vBulletin and phpBB SIOC Exporters are based on the SIOC API for PHP: – http://wiki.sioc-project.org/index.php/PHPExportAPI 144
  145. Slide 145: Sample export of SIOC data from vBulletin 145
  146. Slide 146: Sample export of SIOC data from vBulletin (2) 146
  147. Slide 147: SIOC competition with boards.ie • boards.ie has been publishing social graph information online using FOAF since 2004 • With its 10 years of discussions, boards.ie can serve as a rich source of SIOC data for the Social Semantic Web: – The data to be “SIOC-ified” is already all publicly viewable, but it is difficult to leverage without any added semantics due to the fact that it is embedded in heavily-styled HTML pages • DERI are sponsoring a competition with prizes (the top prize is €3000) for whoever is judged to have produced the most interesting application(s) that makes use of the SIOC data exported from boards.ie • To enter, go to http://data.sioc-project.org 147
  148. Slide 148: Creating your own exporters • Use SIOC API(s) if possible: – Or create new APIs to contribute back to the community • Creating RDF data is easy: – Use the plugin API provided by the host system – Collect required information from the host (CMS) system – Create in-memory RDF or object model (optional) – Serialise RDF data (using RDF API or print templates) • Seek help from the SIOC developer community: – http://sioc-project.org/ or SIOC-Dev mailing list or #sioc on IRC 148
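The "collect, model, serialise" steps above can be sketched with a print template that emits Turtle. This is not the SIOC API for PHP or any existing exporter; the post dictionary, its URIs and the function names are hypothetical, and the literal escaping covers only the most common cases.

```python
PREFIXES = (
    "@prefix sioc: <http://rdfs.org/sioc/ns#> .\n"
    "@prefix dcterms: <http://purl.org/dc/terms/> .\n"
)

def turtle_literal(text):
    """Quote a string as a Turtle literal (minimal escaping only)."""
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n")
                   .replace("\r", "\\r"))
    return f'"{escaped}"'

def post_to_turtle(post):
    """Serialise one in-memory post record using a print template."""
    return (
        f"<{post['uri']}> a sioc:Post ;\n"
        f"    dcterms:title {turtle_literal(post['title'])} ;\n"
        f"    sioc:has_creator <{post['creator_uri']}> ;\n"
        f"    sioc:has_container <{post['container_uri']}> .\n"
    )

# Hypothetical record as it might be collected from the host CMS.
post = {
    "uri": "http://blog.example.org/post/123",
    "title": 'Hello "Semantic" Web',
    "creator_uri": "http://blog.example.org/user/john",
    "container_uri": "http://blog.example.org/",
}
document = PREFIXES + "\n" + post_to_turtle(post)
print(document)
```

Using an RDF API for serialisation instead of templates avoids hand-written escaping, at the cost of an extra dependency.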
  149. Slide 149: Explore more producers of SIOC data • Sioku: – SIOC data from Jaiku microblogging service – http://sioku.sioc-project.org/ • SWAML: – Exports mailing list archives in RDF – http://swaml.berlios.de/ • OpenLink DataSpaces: – Uses SIOC as a representation format for multiple social spaces – http://virtuoso.openlinksw.com/wiki/main/Main/OdsIndex/ • Use the Semantic Radar extension for Firefox for detecting / exploring SIOC data: – http://sioc-project.org/firefox 149
  150. Slide 150: 150
  151. Slide 151: Motivation for finding and reusing semantic data • There is a lot of Social Semantic Web data available: – From services – Via exporters – Hand-crafted • But it is scattered all around the Web: – How do we find, browse, query, reuse it? • These questions need to be addressed: – To provide novel applications that can leverage the interlinked nature of this data from the Social Web – To show the benefits of RDF and the Semantic Web 151
  152. Slide 152: Finding data from the Social SW • PingTheSemanticWeb: – http://pingthesemanticweb.com – A ping service for SW documents – REST or XML/RPC – Accepts, reads different formats: • RDF/XML, N3, Turtle – The “blo.gs” of the Semantic Web • Various ontologies detected by PTSW: – FOAF, DOAP, SIOC, etc. – About 1M documents, 3.7M pings • “A Scripting Architecture to Discover and Query Decentralized RDF Data”, The 3rd Workshop on Scripting for the Semantic Web (SFSW 2007), Innsbruck, Austria, June 2007 152
  153. Slide 153: Advertising RDF data to PTSW • Direct ping to PingTheSemanticWeb: – Blog engines: WordPress, Drupal, etc. – Services: Revyu, TalkDigger • “Semantic Radar” extension for Firefox: – http://sioc-project.org/firefox – Easy to set up and use (Firefox extension, auto-update) – Support for RDFa! – Architecture of participation: just browse the Web – Discover Semantic Web documents using RDF autodiscovery links (a popular practice for advertising Atom/RSS and FOAF): <head> <link rel="meta" type="application/rdf+xml" title="FOAF" href="http://example.com/people/~you/foaf.rdf"/> </head> 153
  154. Slide 154: Semantic Radar in action, sending pings to PTSW Click to view SW data. 154
  155. Slide 155: Reusing data from PTSW • PTSW acts as a central access point for RDF data: – Subscribe to the service – Ask for recent updates – Apply namespace restrictions (e.g. export FOAF only) – Get fresh Semantic Web data – Concentrate on your tools, rather than on finding the data [Diagram labels: Semantic Web Documents (RDF); Firefox Semantic Radars; Ping the Semantic Web; Web Services and Software Agents; doap:store] 155
  156. Slide 156: Existing services that can make use of PTSW • Sindice: – Lookup service for Semantic Web documents • doap:store: – DOAP-based projects directory • SWSE, Zitgist, Swoogle: – Semantic Web search engines 156
  157. Slide 157: doap:store 157
  158. Slide 158: Write your own Social Semantic Web application • Find data: – Subscribe to PTSW – Make a crontab script to regularly fetch new data • Store data: – Plain-text files – RDF stores • Query the data: – SPARQL query language and protocol, a W3C recommendation – “Trying to use the Semantic Web without SPARQL is like trying to use a relational database without SQL” - Tim Berners-Lee 158
  159. Slide 159: Storing RDF data • RDF stores: – Storage systems for triples – Better performance than distributed queries – Some support inference engines (OWL, RDFS) – Many provide an open SPARQL endpoint to let people use data • Various implementations: – YARS (Java) – ARC2 (PHP) – 3Store (C) – Virtuoso, etc. 159
  160. Slide 160: Querying RDF data • SPARQL language: – A language to query a set of triples – REST-protocol between clients and endpoint – Results in standard formats (XML or JSON) – http://www.w3.org/TR/rdf-sparql-query/ • SPARQL endpoint: – Remotely accessible data – Data openness – Easy to use, e.g. ARC2 requires just three lines of code: include_once('path/to/arc/ARC2.php'); $ep = ARC2::getStoreEndpoint(array(...)); $ep->go(); 160
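On the client side, the REST-style protocol amounts to sending the query text as a `query` parameter over HTTP and asking for a result format. The sketch below only builds the request and does not contact any server; the endpoint URL `http://example.org/sparql` is a placeholder and `sparql_get_url` is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request

def sparql_get_url(endpoint, query):
    """Build the GET URL for a SPARQL protocol query (query text is URL-encoded)."""
    return endpoint + "?" + urlencode({"query": query})

ENDPOINT = "http://example.org/sparql"  # placeholder, not a real endpoint
QUERY = """PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?friend WHERE { ?person foaf:knows ?friend } LIMIT 10"""

url = sparql_get_url(ENDPOINT, QUERY)

# Ask for JSON results; endpoints that only offer XML would need a different
# Accept value. The request is constructed here but deliberately not sent.
request = Request(url, headers={"Accept": "application/sparql-results+json"})
print(request.full_url)
```

Passing `request` to `urllib.request.urlopen` would perform the actual query against a live endpoint.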
  161. Slide 161: Semantic Web Search Engine (SWSE) • A large-scale Semantic Web search engine developed and run by DERI Galway: – http://swse.deri.org/ – Andreas Harth, Jürgen Umbrich, Aidan Hogan, Stefan Decker, “YARS2: A Federated Repository for Querying Graph Structured Data from the Web”, The 6th International Semantic Web Conference (ISWC 2007), pp. 211-224, Busan, Korea, 2007 161
  162. Slide 162: What does SWSE do? • SWSE searches and navigates factual entities collected from over 200,000 data sources • Components: – Web-scale crawling and object consolidation – Fully-distributed RDF storage and SPARQL query processing using YARS2 (already achieved 7 billion synthetically generated triples) – Advanced schema agnostic ranking – User interface with guided navigation • Features: – Ability to handle various entity types (such as people, places, proteins) and various media types – Tracking provenance of triples using context / named graphs • Search and explore the Semantic Web at: – http://swse