Designing for Structure in a Sea of Unstructured Data
1. Prepared by:
Paul Kahn – Experience Design Director
February, 2013
Media Lab, Aalto University
Helsinki, Finland
Structured Data
None / Some / All
2. 1995-2012 = Gazillions of Websites
Our design problem was an evolution of visual literacy
— Readers were trained to find information in printed
books/magazines/newspapers
— Digital publications lack physical context
— Location and scope of information were invisible
Paul Kahn | 2
3. Clients = Publishers, Users = Readers
Our Design Task was to connect Readers to Content
— Adapt graphic language – type, color, image – from the page
to the screen
— Create navigation systems that help users understand what
they can find on a website
— Communicate the structure of content in flexible, repeatable
units
6. Today Users are
— Convinced they can find what they want
“on the Internet”
— Producing & managing dematerialized content: photos,
videos, music, email, compound documents
— Creators & consumers with storage/creation and
retrieval/consumption needs
— Looking for something all the time
7. Today Users want to
— Record, share, publish
— Be convinced, amused, in control
— Find, sort, sift and copy
— Mix, reorder and arrange
They don’t explicitly know what metadata is (in most cases)
They are solving problems by implicitly manipulating metadata
8. Today’s IA/UX Problem
Every IA/UX problem is a Metadata Continuum
No Structure: Leaping into a Vacuum (Raw)
Some Structure: Stepping into a Marsh (Eatable)
Complete Structure: Traversing a Field (Cooked)
9. Structured Data Value Proposition
— People want to find things, they don’t want to “learn” how to
find things
— People understand how to use Structured Data
— No one wants to create Structured Data
— It is our task to leverage the Structured Data people already
understand
10. Unstructured Data
Data Vacuum:
no metadata has been added to items
Even Data Vacuums include content & context
The 50-year-old Information Retrieval /
Library Science trade-off:
• Precision: finding only what you are looking for
• Recall: not missing anything that might contain what
you are looking for
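The trade-off above can be made concrete with a small calculation. What follows is a minimal sketch using the standard set-based definitions of the two measures. The function name precision_recall and the document IDs are hypothetical and chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search result.

    precision = relevant items retrieved / all items retrieved
    recall    = relevant items retrieved / all relevant items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document IDs: four documents actually answer the question.
relevant = {"d1", "d2", "d3", "d4"}

# A narrow search returns few results, all of them useful.
narrow_search = {"d1", "d2"}
# A broad search misses nothing but also returns noise.
broad_search = {"d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"}

narrow = precision_recall(narrow_search, relevant)
broad = precision_recall(broad_search, relevant)
print(narrow)  # (1.0, 0.5)
print(broad)   # (0.5, 1.0)
```

The narrow search is precise but misses half of the useful documents. The broad search misses nothing, but half of what it returns is noise.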
11. Data with no structure: Names
— A character-string a person, place or thing is known by
— People have many names: professional names, familiar
names, legal names
— Places and things have many names in different languages
— As data, a name presents a major problem:
it is not unique
— For example: “paul kahn”
12. There are many “paul kahn”s
Paul W. Kahn, author and Law Professor at Yale University, New Haven CT
Dr. Paul Kahn, Urologist in Plantation FL
Paul Kahn, General Partner at Himalaya Capital Ventures, Silicon Valley, CA
Roshi Paul Genki Kahn, Spiritual Director of Zen Garland in Wyckoff, NJ
Paul Kahn, Information Architect, Docent at Media Lab, Helsinki
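The ambiguity of a name as data can be sketched in a few lines. The five people are the ones listed above. The "id" values, the field names, and the word-matching rule are assumptions made for this illustration and do not describe any real directory.

```python
# People from the slide; the "id" values are hypothetical.
people = [
    {"id": "p1", "name": "Paul W. Kahn", "role": "Law Professor"},
    {"id": "p2", "name": "Dr. Paul Kahn", "role": "Urologist"},
    {"id": "p3", "name": "Paul Kahn", "role": "General Partner"},
    {"id": "p4", "name": "Roshi Paul Genki Kahn", "role": "Spiritual Director"},
    {"id": "p5", "name": "Paul Kahn", "role": "Information Architect"},
]


def find_by_name(query, records):
    """Return every record whose name contains all words of the query."""
    wanted = set(query.lower().split())
    return [
        r for r in records
        if wanted <= set(r["name"].lower().replace(".", "").split())
    ]


def find_by_id(record_id, records):
    """Return the single record with this ID, or None if there is none."""
    matches = [r for r in records if r["id"] == record_id]
    return matches[0] if matches else None


by_name = find_by_name("paul kahn", people)
by_id = find_by_id("p5", people)
print(len(by_name))   # 5
print(by_id["role"])  # Information Architect
```

The name query returns all five people, while the identifier returns exactly one. The identifier only helps if someone has assigned it and the searcher knows it.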
17. Where did I put that document?
The tools we use:
— Personal Memory
— Folder names
— Desktop search
What kinds of structure can we present?
19. Semi-Structured Data
Data Marsh: some metadata without predefined language
or requirements
— Tagging: users add uncontrolled keywords
— Profile: users intentionally add metadata about themselves
— Time / Location stamps: where and when
— Tracking: users unintentionally add metadata about
themselves as interactions are tracked
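As a rough sketch of what a Data Marsh looks like in practice, the records below carry some metadata, but no field is required and the tags follow no agreed vocabulary. All titles, tags, dates, and field names are invented for illustration.

```python
from collections import Counter

# Hypothetical items on an aggregation site. No field is mandatory.
items = [
    {"title": "IA talk slides", "tags": ["Metadata", "ux "],
     "uploaded": "2013-02-05", "place": "Helsinki"},
    {"title": "Sitemap sketch", "tags": ["metadata", "UX", "maps"],
     "uploaded": "2013-02-06"},
    {"title": "Untitled video", "tags": []},
]


def normalize_tag(tag):
    """Fold case and trim spaces so spelling variants of a tag count as one."""
    return tag.strip().lower()


# Uncontrolled keywords: the same idea appears in several spellings.
raw_tags = {t for item in items for t in item["tags"]}
tag_counts = Counter(normalize_tag(t) for item in items for t in item["tags"])

# Optional stamps: some items simply have no time information.
undated = [item["title"] for item in items if "uploaded" not in item]

print(len(raw_tags))           # 5
print(tag_counts["metadata"])  # 2
print(undated)                 # ['Untitled video']
```

Five distinct raw tags collapse to three once case and spacing are normalized. Any design built on this data has to tolerate both the variants and the missing fields.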
20. Aggregation/Reproduction Sites
— Sites that aggregate user-provided content
Slideshare / YouTube / Dailymotion / Vimeo /
SoundCloud / Flickr
— Sites where users create and republish content to
social networks
LinkedIn / Facebook / Twitter
21. Implicit metadata:
— Sort criteria
— Document type
22. Implicit metadata:
— Related
— More
25. Structured Data
Data Fields: where metadata has been explicitly added to
items according to an agreed-upon standard
— The Content is made to fit a pre-defined structure
— The required parts of the structure are completed
— Each metadata dimension qualifies and reinforces the
meaning of the content
— Many kinds of relationships can be harvested
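The points above can be sketched as a check against a predefined structure. The schema below is a hypothetical stand-in for an "agreed-upon standard". Its field names and rules are assumptions for illustration and are not taken from any real metadata standard.

```python
# Hypothetical standard: field name -> (expected type, required?)
SCHEMA = {
    "id": (str, True),
    "title": (str, True),
    "creator": (str, True),
    "created": (str, True),    # ISO date string such as "2013-02-05"
    "language": (str, False),  # optional field
}


def validate(record, schema=SCHEMA):
    """Return a list of problems; an empty list means the record fits."""
    problems = []
    for field, (expected_type, required) in schema.items():
        if field not in record:
            if required:
                problems.append(f"missing required field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(f"wrong type for field: {field}")
    for field in record:
        if field not in schema:
            problems.append(f"unknown field: {field}")
    return problems


complete = {"id": "doc-001", "title": "Designing for Structure",
            "creator": "Paul Kahn", "created": "2013-02-05"}
partial = {"title": "Notes", "mood": "happy"}

print(validate(complete))      # []
print(len(validate(partial)))  # 4
```

The complete record passes because the content was made to fit the structure. The partial record fails on three missing required fields and one field the standard does not define.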
36. Would the world be a better place if:
— Everything had a unique ID?
— Every digital object with a unique ID contained structured
data?
How does structured data affect quality-of-life questions?
37. Contact Information
Paul Kahn
Experience Design Director
pkahn@madpow.com
Mad*Pow
Portsmouth | Boston | Louisville
www.madpow.com