Drupal and the Semantic Web: from RDF to Whitehouse.gov
As we usher in an era of open data, organizations of all types are taking a cue from the administration's focus on transparency, accountability and efficiency by making raw data available to the public in platform-independent formats, and the web CMS is rapidly becoming a useful place to showcase semantic web standards.
Meanwhile, this trend is converging with a dramatic rise in the popularity of the most semantic-web-friendly open source CMS, Drupal, which features RDF as part of its core architecture. In October 2009, Whitehouse.gov, the official site of the President and flagship site for the administration, re-launched on Drupal and featured some nods to semantic web technology, with the addition of RDFa content and heavy use of taxonomies to drive content and search.
Drupal and the Semantic Web: from RDF to Whitehouse.gov - SemTech2010
1. Drupal and the Semantic Web: from RDF to WHITEHOUSE.GOV
Jeff Walpole, CEO
2. What’s Being Covered
Why Drupal Rocks the Semantic Web
Why the OpenGov movement needs Drupal and the semantic web
Deep dive case study on WhiteHouse.gov
The case for a semantic enabled distribution of Drupal for open government
3. Why We Use Drupal
Technology Extensibility
Easy Modular Enhancements
Out of the box Web 2.0
Semantic Web Friendly
Performance/Reliability
Ease of Implementation
THE COMMUNITY!
4. Why Drupal Works for Semantics
Open Source
Modular Architecture
System of Nodes and relationships
Great technical vision for the future
Touches the right type of sites (publishing, government, etc.)
Allows common users to become semantic publishers
9. RDF in Drupal 7 Core
This is how RDFa goes mainstream
Drupal 7 site content is published as RDFa
When enabled, marks up some attributes by default, including node titles, user information, taxonomy, comment information, image information, etc.
Create your own mappings from custom fields (CCK)
Lots of new modules will use this functionality and blow out semantic capabilities in the next release.
http://semantic-drupal.com/
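To make "content is published as RDFa" concrete, here is a minimal sketch in Python of how a consumer might read RDFa attributes out of a page. The sample markup, the node path, the title, the date, and the vocabulary terms shown (dc:title, sioc:has_creator, etc.) are assumptions chosen for illustration, not output captured from a real Drupal 7 site, and the parser handles only a small subset of RDFa (about, typeof, property, content, rel/href).

```python
from html.parser import HTMLParser

# Illustrative markup only: an assumption about what RDFa-annotated
# node output could look like, not captured from a real site.
SAMPLE_HTML = """
<div about="/node/1" typeof="sioc:Item foaf:Document">
  <h2 property="dc:title">Example page</h2>
  <span property="dc:date" content="2010-01-01T00:00:00Z">January 1, 2010</span>
  <a rel="sioc:has_creator" href="/user/7">editor</a>
</div>
"""


class RDFaSketchParser(HTMLParser):
    """Collects (subject, predicate, object) tuples from a small RDFa subset."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.subject = None
        self._pending = None  # predicate waiting for the element's text

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "about" in a:
            # "about" sets the subject; "typeof" adds one rdf:type per term
            self.subject = a["about"]
            for rdf_type in a.get("typeof", "").split():
                self.triples.append((self.subject, "rdf:type", rdf_type))
        if "property" in a:
            if "content" in a:
                # machine-readable value overrides the visible text
                self.triples.append((self.subject, a["property"], a["content"]))
            else:
                self._pending = a["property"]
        if "rel" in a and "href" in a:
            self.triples.append((self.subject, a["rel"], a["href"]))

    def handle_data(self, data):
        if self._pending and data.strip():
            self.triples.append((self.subject, self._pending, data.strip()))
            self._pending = None


def extract_triples(html):
    parser = RDFaSketchParser()
    parser.feed(html)
    return parser.triples


if __name__ == "__main__":
    for triple in extract_triples(SAMPLE_HTML):
        print(triple)
```

A real consumer would use a conformant RDFa processor; the point here is only that the same HTML a visitor reads also carries subject/predicate/object statements.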
10. RDF in Drupal 7 Tutorial Tomorrow
“How to Build Linked Data Sites with Drupal 7 and RDFa”
Stéphane Corlosquet,
Lin Clark,
Axel Polleres,
Alexandre Passant
Franciscan C 8:30 AM - 3:00 PM
11. What’s Being Covered
Why Drupal Rocks the Semantic Web
Why the OpenGov movement needs Drupal and the semantic web
Deep dive case study on WhiteHouse.gov
The case for a semantic enabled distribution of Drupal for open government
13. Citizens' view of what OG is...
Source: planspark. Generated by dumping the full (and slightly cleaned-up) text of Rebooting America: Ideas for Redesigning American Democracy for the Internet Age into the Wordle tag cloud generator and returning the top 80 tags.
14. Techies' view of what OG is...
Source: digiphile. Generated by putting the agenda for Transparency Camp into Wordle.
15. While they work out the details, we have many of the technical answers here to get started with Drupal
16. Technical Requirements of OGD
www.agency.gov/open
Use of “modern technology” / best practices
Open data sets
Published Open Government Plan
FOIA Plan and Information
Mechanisms for public feedback and input
Downloadable/machine-readable copies of virtually everything
17. OGD showed how hard web 2.0 thinking is for government; web 3.0 might actually be easier...
18. 8 Steps to Publishing Public Data
1. Complete: All public data is made available. Public data is data that is not subject to valid privacy, security or privilege limitations.
2. Primary: Data is as collected at the source, with the highest possible level of granularity, not in aggregate or modified forms.
3. Timely: Data is made available as quickly as necessary to preserve the value of the data.
4. Accessible: Data is available to the widest range of users for the widest range of purposes.
5. Machine processable: Data is reasonably structured to allow automated processing.
6. Non-discriminatory: Data is available to anyone, with no requirement of registration.
7. Non-proprietary: Data is available in a format over which no entity has exclusive control.
8. License-free: Data is not subject to any copyright, patent, trademark or trade secret regulation. Reasonable privacy, security and privilege restrictions may be allowed.
Source: Open Government Working Group Meeting in Sebastopol, CA, October 22, 2007
19. Open Government Needs the Semantic Web
Structured Data
Linked Open Data
Visualizations
Mashups
Dataset metrics
Semantic Archives
20.
21. Innovation is Happening
OGI is creating innovation we have not seen before by the Feds on the web (well, since they invented it at least)
22. Typical Gov Tech Stack
WCMS
Policy Stakeholders (OPA, OCIO, etc.)
Information Assurance (IT, Security, Data Quality)
Enterprise Architecture / Standards
Agency Warehouse/
Legacy Systems
Reporting Systems
23. An Open Gov Tech Stack
Data Visualizations / Mashups
Data Directories / Linked Open Data
Open Data APIs
WCMS
Collaboration / Social Media Tools
Policy Stakeholders (OPA, OCIO, etc.)
Information Assurance (IT, Security, Data Quality)
Enterprise Architecture / Standards
Agency Warehouse/
Legacy Systems
Reporting Systems
25. What’s Being Covered
Why Drupal Rocks the Semantic Web
Why the OpenGov movement needs Drupal and the semantic web
Deep dive case study on WhiteHouse.gov
The case for a semantic enabled distribution of Drupal for open government
28. Why The White House Chose Drupal
Championed from within EOP
Robust Core & Contrib Functionality
Allowed full control of platform
Open & Transparent
Ability to easily integrate new tech (like semweb)
43. What’s Being Covered
Why Drupal Rocks the Semantic Web
Why the OpenGov movement needs Drupal and the semantic web
Deep dive case study on WhiteHouse.gov
The case for a semantic enabled distribution of Drupal for open government
44. OpenGov Distribution
Helps tackle OG needs using Drupal
Government best practices
Regulatory compliance
Introduces semweb concepts
Meets security requirements
Allows for rapid site development
45. Data Directories (Data.gov) / Features Server (Apps.gov)
Linked Open Data
Government Extensions
APIs
Themes
Contrib Modules
Custom Modules (NEW)
Content Types / Views / CCK
Default Configurations
D7 Core
OpenPublic Distribution
Introduce myself
Explain Phase2
Discuss the fact that Frank is not present
New agenda: less tech/more use case driven - no examples
Explain how we are/were a development shop building custom
Found OSS CMS in 2004 and Drupal in 2005
Explain the history and uses of OpenCalais
How it works on the admin side
How configuration can be controlled
What are the other modules built and developed around it
How an API provides the engine through which we can develop new features
~ 12K downloads
~ 2,400 active sites
~ 20%
Developing quite a few great SemWeb modules too. Arto is a maniac
The RDF CCK module allows site administrators to map each content type, node title, node body and CCK field to an RDF term (class or property).
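As a rough sketch of the idea behind that mapping (not the RDF CCK module's actual code or storage format), the following Python shows a per-content-type table that pairs field names with RDF terms and turns a node into subject/predicate/object statements. The content type, the field names (such as `field_agency`), and the chosen terms are hypothetical.

```python
# Hypothetical mapping for an illustrative "press release" content type:
# each field name points at the RDF property used to publish it, and
# "rdftype" lists the RDF classes assigned to the node itself.
FIELD_MAPPING = {
    "rdftype": ["sioc:Item", "foaf:Document"],
    "title": "dc:title",
    "body": "content:encoded",
    "field_agency": "dc:publisher",  # hypothetical CCK field
}


def node_to_triples(node, mapping):
    """Return (subject, predicate, object) tuples for the mapped fields of a node."""
    subject = f"/node/{node['nid']}"
    triples = [(subject, "rdf:type", t) for t in mapping.get("rdftype", [])]
    for field, term in mapping.items():
        if field == "rdftype":
            continue
        if field in node:  # fields without a value produce no statement
            triples.append((subject, term, node[field]))
    return triples


if __name__ == "__main__":
    example_node = {"nid": 42, "title": "Hello", "field_agency": "EOP"}
    for triple in node_to_triples(example_node, FIELD_MAPPING):
        print(triple)
```

The design point is that the administrator edits only the mapping table; the publishing step stays the same for every content type.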
Drupal 7 takes RDF as a central part of the architecture. New modules are coming that will do even more
Drupal 7 RDF module maintainer: Stéphane "scor" Corlosquet
Drupal 7 RDF contributor and evangelist extraordinaire: Lin Clark
Code contributors:
Mark Birbeck
Alex Bronstein
John Breslin
Benjamin Doherty
Stefan Freudenberg
Rolf Guescini
Daniel F. Kudwien
Florian Lorétan
Frédéric Marand
Benjamin Melançon
John Morahan
So how is this being used to fuel the open gov movement and why?
This is how citizens see the concepts behind open gov
Not everyone sees gov2.0 and opengov the same way: some have interpreted it more from a data/technologist's perspective. The good news is that Drupal is equally suited to address these needs.
Enabling the public to have a two-way conversation with the government
Be pro-active in publishing to the web
Collect needs/ideas from citizens
Improve citizen services online
Be more open with information, data and policy decision making
On December 8, 2009, the Obama Administration released the Open Government Directive (OGD) memo
But OG is not just about open source or even open data
47 data sets May 2009
270K+ data sets a year later in June 2010
unlocking data unlocks opportunities
public knowledge
core mission
economic opportunity
OG is doing something very important in that it is creating innovation. Apps for America and Code for America are great examples. There are now lots of ways that developers can get engaged in helping government.
Kieran’s list has 17, but we know there are many more. The list is likely to double this year just based upon current inquiries.
The REAL Overview:
- More than scaling a website. It was scaling the delivery of Drupal websites.
- Cover project details, the site itself, go over the launch, infrastructure, and what we've been doing since
- Why replace? They only had a website before, but when it was over, we had provided them a platform to build on and to tap into (and now participate in) our vast community of creative problem solvers
Disclaimer.
- Due to NDAs, etc., I cannot go into great detail about things.
- Thrilled that I can talk about it though
Why Drupal? (it rawks!!)
- New Media was a champion of Open Source and Drupal for whitehouse.gov.
- The team had a very clear vision of what they wanted: detailed control to tell the human interest side of the Presidency. Drupal provided that.
- New functionality and improved administrative capabilities and a platform to extend.
What makes this platform we built great?
- Great design
- Drupal 6
- Performance patches
- Lots of contrib modules
- Custom features and integrations
This is a rather typical architectural approach for some of the larger Drupal-based sites.
Key Functionality: Apache Solr search w/ Faceting.
- Big benefit here and a massive improvement over the original.
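For readers unfamiliar with faceting, the sketch below builds a faceted Solr select request in Python. The query parameters (`q`, `fq`, `facet`, `facet.field`, `facet.mincount`, `wt`) are standard Solr parameters; the base URL and the field names `topic` and `type` are assumptions for illustration and say nothing about how the whitehouse.gov index was actually configured.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, keywords, facet_fields, filters=None):
    """Build a Solr select URL that returns results plus per-field facet counts."""
    params = [
        ("q", keywords),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.mincount", "1"),  # hide facet values with zero matches
    ]
    # one facet.field parameter per field to count
    params += [("facet.field", field) for field in facet_fields]
    # each selected facet value narrows the results with a filter query
    for field, value in (filters or {}).items():
        params.append(("fq", f'{field}:"{value}"'))
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical Solr location and field names
    print(build_facet_query(
        "http://localhost:8983/solr",
        "health care",
        ["topic", "type"],
        {"type": "blog"},
    ))
```

Clicking a facet value in the interface amounts to adding one more `fq` filter and re-running the same query.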
Over a quarter of a million visitor records exposed. Released monthly. Bulk import, staging, and cutover via Drush.
Key Functionality: Media Browser.
- Custom Solr Search integration
- Categorical filtering of media objects.
- AJAX enabled categorical browsing.
- Fallback HTML version for 508 compliance.
Building on that, we overhauled the handling of multimedia to take all the guesswork out. A strict process for content entry leads to far more consistent usage and rendering of imagery and media. This also leads to better 508 compliance, as content input and referencing are strictly controlled. Node Embed is now released to the public.
An HTML5 version of the site was implemented. One great feature of that is that it can now display video on my iPad.
Key Functionality: Tight integration with Akamai Cache Control Utility. Clears cache automatically on content updates, and also allows any individual page to be cleared from a button on that page. This is a more flexible utility to clear any URL directly from the CMS.
Launch: No DNS delays, etc. We were locked into the launch 4 hours prior, so it was like clicking up the track of a roller coaster waiting to go over the top. Crazy. At each hour leading to launch we were checking the status of servers/functionality & monitoring performance. Then at 1pm exactly the firehose was turned on.
My desktop monitoring each web & database server the day of launch. I was looking at top, watching replication, database connections, number of apache processes, free memory, etc.
New user functionality
More opengov responsiveness
Great data use
More RDF???
How does this apply & what does the future hold?
- This site sets a new bar for how large scale Drupal can be deployed.
- Security, Process Review, and Scalability
- Processes are not all Drupal based, but the process is key
- As Drupal moves up market this will become more and more important
- These orgs are ready for us, but we need to be ready for them
How is it being developed?
From our work with related open government efforts, we’ve developed a framework and process for implementing sites that are compliant and forward-thinking about OGD.
Why?
Because open technology can only be used to accomplish OGD goals if it’s done correctly, responsibly, and with minimal burden on agencies.
Who will use it?
Government agency technology reps required to comply with the OGD.
To accomplish what?
Immediate help with compliance, but also proactive commitment to open government shared through open technology