2. OUTLINE
• XML, DTD and XML Schema, XSLT
• Metadata standards
• Information Representation in Semantic Web
• RDF, RDFS
• Syntactic Formats
• RDF/XML, N-Triples, Turtle etc.
• Embedded Formats
• RDFa, Microformats, eRDF, HTML5, GRDDL, SPARQL
3. XML
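• An XML document consists of a prolog, elements and an epilogue
• Prolog: an XML declaration and an optional reference to external structuring documents
• Elements: the "things" the document talks about; each has tags, attributes and content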
<?xml version="1.0" encoding="UTF-16"?>
<!DOCTYPE book SYSTEM "book.dtd">
<lecturer>David Billington</lecturer>
4. XML
• Comments and Processing Instructions
<!-- This is a comment -->
<?stylesheet type="text/css" href="mystyle.css"?>
5. DTD and XML Schema
• Used for defining the structure of XML documents:
• what values an attribute may take
• which elements may or must occur within other elements, etc.
<lecturer>
<name>David Billington</name>
<phone>+61-7-3875 507</phone>
</lecturer>
<!ELEMENT lecturer (name,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
7. DTD
<order orderNo="23456" customer="John Smith" date="October 15, 2002">
<item itemNo="a528" quantity="1"/>
<item itemNo="c817" quantity="3"/>
</order>
<!ELEMENT order (item+)>
<!ATTLIST order orderNo ID #REQUIRED
customer CDATA #REQUIRED
date CDATA #REQUIRED>
<!ELEMENT item EMPTY>
<!ATTLIST item itemNo ID #REQUIRED
quantity CDATA #REQUIRED
comments CDATA #IMPLIED>
8. XML Schema
<schema xmlns="http://www.w3.org/2000/10/XMLSchema" version="1.0">
Element and Attribute Types using Data Types
• Numerical data types: integer, short etc.
• String types: string, ID, IDREF, CDATA etc.
• Date and time data types: time, month etc.
• User-defined (simple and complex)
9. XML Namespaces and XPATH
• Form:
xmlns:prefix="location"
• XPath:
Operates on the tree data model of XML and is the core of XML query languages
10. XSLT
• XSLT specifies rules with which an input XML document
is transformed into:
• another XML document
• an HTML document
• plain text
11. RDF
• It provides a uniform framework for interchange of
data and metadata between applications
• XML does not provide any means of talking about the
semantics (meaning) of data
• Statements are object-attribute-value triples
14. RDFS Core Classes
• rdfs:Resource, the class of all resources
• rdfs:Class, the class of all classes
• rdfs:Literal, the class of all literals (strings)
• rdf:Property, the class of all properties.
• rdf:Statement, the class of all reified statements
15. RDFS Core Properties
• rdf:type, which relates a resource to its class
• The resource is declared to be an instance of that class
• rdfs:subClassOf, which relates a class to one of its
superclasses
• All instances of a class are instances of its superclass
• rdfs:subPropertyOf, relates a property to one of its
superproperties
• rdfs:domain, which specifies the domain of a property
• rdfs:range, which specifies the range of a property
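The effect of rdf:type and rdfs:subClassOf can be sketched in a few lines of Python. This is a minimal illustration using plain tuples, not a real RDFS reasoner; the `ex:` class and resource names are hypothetical and do not come from the slides.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical example data: (subject, predicate, object) triples.
triples = {
    ("ex:AssociateProfessor", SUBCLASS_OF, "ex:AcademicStaff"),
    ("ex:AcademicStaff", SUBCLASS_OF, "ex:Staff"),
    ("ex:billington", RDF_TYPE, "ex:AssociateProfessor"),
}

def superclasses(cls, triples):
    """All classes reachable from `cls` by following rdfs:subClassOf."""
    found, frontier = set(), [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                frontier.append(o)
    return found

def types_of(resource, triples):
    """Declared types of `resource` plus every superclass of those types."""
    direct = {o for s, p, o in triples if s == resource and p == RDF_TYPE}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return inferred

print(sorted(types_of("ex:billington", triples)))
```

Because all instances of a class are instances of its superclasses, the resource declared as an associate professor is also reported as academic staff and staff.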
16. RDF SPARQL Query
• SPARQL is based on matching graph patterns
• Example:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?c
WHERE
{
?c rdf:type rdfs:Class .
}
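To make "matching graph patterns" concrete, the sketch below matches a single triple pattern against a small set of triples in plain Python. It is an illustration of the idea only, not a SPARQL engine; the function name `match_pattern` and the `ex:` data are assumptions made for this example.

```python
def match_pattern(pattern, triples):
    """Yield one variable binding (a dict) per triple that matches `pattern`.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must be bound to the same value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding

# Hypothetical data; prefixed names are kept as plain strings.
triples = {
    ("ex:Lecturer", "rdf:type", "rdfs:Class"),
    ("ex:Course", "rdf:type", "rdfs:Class"),
    ("ex:billington", "rdf:type", "ex:Lecturer"),
}

# Same shape as the query above: SELECT ?c WHERE { ?c rdf:type rdfs:Class . }
classes = sorted(b["?c"] for b in match_pattern(("?c", "rdf:type", "rdfs:Class"), triples))
print(classes)
```

A real query engine joins the bindings of several such patterns; this sketch handles one pattern only.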
17. RDFa
• RDFa is an extension to HTML5 that helps you mark up things like
people, places, events, recipes and reviews. Search engines and web
services use this markup to generate better search listings and give
you better visibility on the Web, so that people can find your website
more easily.
18. Microformats
• Microformats are a set of simple, open data formats built upon
existing and widely adopted standards
25. Green Turtle
• an implementation of RDFa 1.1 for browsers
• When triples are discovered in a web page, a little green turtle
will appear in the address bar. If you click on that turtle, you can
view the triple graph.
28. References
• Grigoris Antoniou and Frank van Harmelen, "A Semantic Web Primer"
• http://rdfa.info/
• https://code.google.com/p/green-turtle/
• http://microformats.org/wiki/about
Editor's Notes
Processing instructions (PIs) define procedural attachments
Comments are pieces of text that are to be ignored by the parser
Well Formed XML
Syntactically correct documents
Only one outermost element (called root element)
Each element contains an opening and a corresponding closing tag
Tags may not overlap; the following is not well-formed:
<author><name>Lee Hong</author></name>
Attributes within an element have unique names
Element and tag names must be permissible
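Several of the well-formedness rules above can be checked with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample strings are chosen for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = "<author><name>Lee Hong</name></author>"
overlapping = "<author><name>Lee Hong</author></name>"   # tags overlap
two_roots = "<a/><b/>"                                   # more than one root
duplicate_attr = '<item itemNo="a528" itemNo="c817"/>'   # attribute name repeated

for doc in (good, overlapping, two_roots, duplicate_attr):
    print(is_well_formed(doc), doc)
```

This checks well-formedness only; it does not validate against a DTD or schema.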
The tree representation of an XML document is an ordered labeled tree:
There is exactly one root
There are no cycles
Each non-root node has exactly one parent
Each node has a label.
The order of elements is important
… but the order of attributes is not important
An XML document is valid if
it is well-formed, and
it respects the structuring information it uses (a DTD or schema)
We express that a lecturer element contains either a name element or a phone element as follows:
<!ELEMENT lecturer (name|phone)>
A lecturer element contains a name element and a phone element in any order.
<!ELEMENT lecturer ((name,phone)|(phone,name))>
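A DTD content model is essentially a regular expression over child element names. The sketch below makes that explicit by translating a simple model into a Python regex and matching it against an element's children. It is a hand-rolled illustration, not a DTD validator: it handles names, `,`, `|`, parentheses and `?`, `*`, `+` only, and the function names are assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET

def content_model_to_regex(model: str) -> re.Pattern:
    """Translate a simple DTD content model into a regex over child tag names."""
    compact = re.sub(r"\s+", "", model)
    # Each element name becomes a group that matches "name " (trailing space).
    with_names = re.sub(
        r"[A-Za-z_][\w.-]*",
        lambda m: "(?:" + re.escape(m.group()) + " )",
        compact,
    )
    # A DTD sequence (a,b) is plain concatenation in a regex.
    return re.compile(with_names.replace(",", ""))

def children_match(element, model: str) -> bool:
    """True if the element's direct children satisfy the content model."""
    child_tags = "".join(child.tag + " " for child in element)
    return content_model_to_regex(model).fullmatch(child_tags) is not None

lecturer = ET.fromstring(
    "<lecturer><phone>+61-7-3875 507</phone><name>David Billington</name></lecturer>"
)
print(children_match(lecturer, "((name,phone)|(phone,name))"))
print(children_match(lecturer, "(name,phone)"))
```

The first model accepts either order, so the phone-then-name document matches it; the strict sequence `(name,phone)` does not.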
#REQUIRED
The attribute must appear in every occurrence of the element type in the XML document
#IMPLIED
The appearance of the attribute is optional
#FIXED "value"
Every element must have this attribute, and it always has the given value
"value"
This specifies the default value for the attribute
XML Schema is a significantly richer language for defining the structure of XML documents
Its syntax is based on XML itself, so it is not necessary to write separate tools
Reuse and refinement of schemas
Expand or delete already existing schemas
Sophisticated set of data types, compared to DTDs (which only support strings)
Opening tag
Element Types
<element name="email"/>
<element name="head" minOccurs="1" maxOccurs="1"/>
<element name="to" minOccurs="1"/>
Cardinality constraints:
minOccurs="x" (default value 1)
maxOccurs="x" (default value 1)
Generalizations of *,?,+ offered by DTDs
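The cardinality constraints above amount to counting child elements. The following sketch checks such a count by hand with Python's standard library; it is not a schema validator, the helper `occurs_ok` is hypothetical, and `None` is used here to stand for an unbounded maximum.

```python
import xml.etree.ElementTree as ET

def occurs_ok(parent, child_name, min_occurs=1, max_occurs=1):
    """Check that `child_name` occurs between min_occurs and max_occurs times.

    max_occurs=None means no upper bound.
    """
    count = len(parent.findall(child_name))
    return count >= min_occurs and (max_occurs is None or count <= max_occurs)

email = ET.fromstring("<email><head/><to/><to/></email>")

print(occurs_ok(email, "head", 1, 1))     # exactly one head
print(occurs_ok(email, "to", 1))          # default maximum of 1
print(occurs_ok(email, "to", 1, None))    # DTD '+': one or more
print(occurs_ok(email, "cc", 0, 1))       # DTD '?': optional
print(occurs_ok(email, "cc", 0, None))    # DTD '*': any number
```

With this encoding the DTD operators `?`, `+` and `*` correspond to the pairs (0, 1), (1, unbounded) and (0, unbounded).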
location is the address of the DTD or schema
If no prefix is specified (xmlns="location"), that location is used as the default namespace
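How prefixed and default namespaces resolve can be seen by parsing a small document. The sketch below uses Python's standard `xml.etree.ElementTree`, which reports each tag as `{namespace}localname`; both namespace URIs are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

doc = """
<uni:lecturer xmlns:uni="http://example.org/university"
              xmlns="http://example.org/default">
  <uni:name>David Billington</uni:name>
  <phone>+61-7-3875 507</phone>
</uni:lecturer>
""".strip()

root = ET.fromstring(doc)
tags = [root.tag] + [child.tag for child in root]

for tag in tags:
    print(tag)
```

The unprefixed `phone` element falls into the default namespace, while the `uni:` elements carry the namespace bound to that prefix.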
XPath is core for XML query languages
Language for addressing parts of an XML document.
It operates on the tree data model of XML
It has a non-XML syntax
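As a small illustration of addressing parts of a document, the sketch below runs a few path expressions over the order example using Python's standard `xml.etree.ElementTree`. Note that this module supports only a limited subset of XPath, so this is a taste of the syntax rather than a full XPath processor.

```python
import xml.etree.ElementTree as ET

order = ET.fromstring(
    """<order orderNo="23456" customer="John Smith" date="October 15, 2002">
  <item itemNo="a528" quantity="1"/>
  <item itemNo="c817" quantity="3"/>
</order>"""
)

all_items = order.findall("./item")                  # every item child
bulk_items = order.findall("./item[@quantity='3']")  # filter on an attribute value
second_item = order.find("./item[2]")                # positions are 1-based

print([item.get("itemNo") for item in all_items])
print([item.get("itemNo") for item in bulk_items])
print(second_item.get("itemNo"))
```

The path strings themselves show the non-XML syntax: steps separated by `/`, with predicates in square brackets.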
The output document may use the same DTD or schema, or a completely different vocabulary
XSLT can be used independently of the formatting language
Resources: a resource is an object or "thing"
For example authors, books, publishers
Every resource has a URI: a URL or some other unique identifier
Properties:
Describe relations between resources
For example "written by", "age", "title"
Properties are also identified by URIs
The value can be a resource or a literal
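The resource, property and value roles can be sketched as a small Python data structure that prints statements in an N-Triples-like form. This is an illustration under stated assumptions: the `Statement` and `Literal` types and all `example.org` URIs are hypothetical, and the literal escaping covers only backslashes and double quotes.

```python
from typing import NamedTuple, Union

class Literal(NamedTuple):
    """A plain string value, as opposed to a resource identified by a URI."""
    value: str

class Statement(NamedTuple):
    subject: str                 # URI of the resource being described
    predicate: str               # URI of the property
    obj: Union[str, Literal]     # URI of another resource, or a literal

def to_ntriples(st: Statement) -> str:
    """Render one statement as a single N-Triples-style line."""
    if isinstance(st.obj, Literal):
        escaped = st.obj.value.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    else:
        obj = f"<{st.obj}>"
    return f"<{st.subject}> <{st.predicate}> {obj} ."

book = "http://example.org/book/primer"
statements = [
    Statement(book, "http://example.org/terms/writtenBy", "http://example.org/person/antoniou"),
    Statement(book, "http://example.org/terms/title", Literal("A Semantic Web Primer")),
]

for st in statements:
    print(to_ntriples(st))
```

The first statement has a resource as its value and the second a literal, which is the distinction drawn in the notes above.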