1. europeanasounds.eu
Metadata Ingestion Training
23-24 October 2014
NTUA, Athens
Metadata Ingestion Plan
Targets
Reporting progress
Andra Patterson
Metadata Manager, Europeana Sounds
2.
Metadata Ingestion Plan
Takes into account:
• 4 main stages of aggregation
• Needs of data providers for scheduling
• Info from Rights and metadata ingestion survey
• Info from emails, phone calls, etc.
• Targets from DoW
Flexible - may need to take into account:
• Changing needs of data providers during project
• Needs of Europeana Ingestion Team
3.
Aggregation – 4 main stages
• Stage 1: Content selection
• Stage 2: Metadata preparation
• Stage 3: Metadata ingestion
• Stage 4: Metadata curation
4.
Aggregation – Stage 1: Content selection
Select the objects for which you will provide metadata to Europeana Sounds
• According to selection guidelines in D1.1 Content Selection Policy
• According to figures in Table 0, DoW (part B, p.22-27)
Establish the correct rights statements for the objects
• Use Europeana Available Rights Statements
5.
Aggregation – Stage 2: Metadata preparation
Prepare your metadata and export it as .xml or .csv
• Check that mandatory elements are included or can be added
• Check that source metadata is well-formed
• Ensure that digital objects are accessible via links in metadata
• Ensure that objects that can be made available for re-use fit the criteria in the Europeana Content Re-use Framework
• File quality; Rights
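The first two checks above can be sketched in a few lines, assuming the export is XML. The element names in `MANDATORY` are hypothetical placeholders chosen for illustration; the real mandatory elements come from the project's EDM profile, not from this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names for illustration only; the real mandatory
# elements are defined by the project's EDM profile, not by this sketch.
MANDATORY = ("title", "rights")


def check_record(xml_text, mandatory=MANDATORY):
    """Return a list of problems found in one exported XML record."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The parser rejects anything that is not well-formed XML.
        return [f"not well-formed: {err}"]
    # Compare local names so the check works with or without namespaces.
    present = {el.tag.rsplit("}", 1)[-1] for el in root.iter()}
    return [f"missing element: {name}" for name in mandatory if name not in present]
```

An empty list means the record passed both checks; anything else is a list of human-readable problems to fix before ingestion.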
6.
Aggregation – Stage 3: Metadata ingestion
Ingest your metadata records using the MINT tool
• MINT
– Web-based tool
– Developed by NTUA
– Used to map, ingest and deliver metadata to Europeana
• Map metadata to schema defined in D1.4 EDM Profile for Sound
7.
Aggregation – Stage 4: Metadata curation
Enrich your metadata records using the MINT tool
• Normalise metadata
• Enrich metadata
• Add controlled vocabulary terms
8.
Targets
Table 0 Underlying Content (Part B, p.22-27) = what we are contracted to achieve
9.
Targets
Progress measured against Performance Monitoring Table (Part B, p.91)
“Available for re-use” Europeana definition: PDM (Public Domain Mark), CC0, CC-BY, CC-BY-SA
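As a rough illustration of the definition above, the sketch below tests whether a rights statement belongs to the re-usable subset. It assumes rights are recorded as Creative Commons URIs and matches them by prefix, which is a simplification; the function name is hypothetical.

```python
# Rights URIs are matched by prefix because version and jurisdiction
# suffixes vary (e.g. .../by/3.0/ and .../by/4.0/).
REUSABLE_PREFIXES = (
    "http://creativecommons.org/publicdomain/mark/",  # PDM
    "http://creativecommons.org/publicdomain/zero/",  # CC0
    "http://creativecommons.org/licenses/by/",        # CC-BY
    "http://creativecommons.org/licenses/by-sa/",     # CC-BY-SA
)


def is_available_for_reuse(rights_uri):
    """Return True if the rights URI is PDM, CC0, CC-BY or CC-BY-SA."""
    # Treat http and https forms of the same URI as equivalent.
    uri = rights_uri.strip().replace("https://", "http://", 1)
    return uri.startswith(REUSABLE_PREFIXES)
```

Statements such as CC-BY-NC fall outside the four prefixes and are therefore not counted towards the re-use target.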
10.
Targets
Targets for each “metadata set”
Set 1: October 2014-January 2015 (Milestone 5)
Set 2: February 2015-January 2016 (no formal Milestone)
Set 3: February 2016-July 2016 (Milestone 6)
Milestones say: “Content and metadata ready for ingestion”
11.
Targets
[Chart: required (minimum) metadata ingestion progress. Vertical axis: 0 to 800,000. Series: Audio, Audio-related, Re-use subset.]
12.
Reporting progress – what to count
• DoW requires us to count digital objects
– Digital objects must be counted the same way as in the DoW
• Audio objects
• Audio-related objects
• Objects “Freely available for re-use”
– These are a subset of the total, not additional items
• Also count metadata records
– Useful to compare what you have prepared for publication with what is actually published on Europeana
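The counting rules above can be sketched as follows. The `Record` class and its field names are hypothetical and only illustrate the distinction between metadata records and digital objects, and the rule that re-usable objects are a subset of the total.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One metadata record and the digital objects it describes (hypothetical)."""
    audio_objects: int
    audio_related_objects: int
    reusable_objects: int  # a subset of the two counts above, not extra items


def summarise(records):
    """Count metadata records and digital objects separately."""
    audio = sum(r.audio_objects for r in records)
    related = sum(r.audio_related_objects for r in records)
    reusable = sum(r.reusable_objects for r in records)
    total = audio + related
    if reusable > total:
        raise ValueError("re-usable objects must be a subset of the total")
    return {
        "metadata_records": len(records),
        "audio": audio,
        "audio_related": related,
        "total_objects": total,
        "reusable_subset": reusable,
    }
```

For example, one sound recording plus one printed score of 40 pages gives 2 metadata records but 41 digital objects.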
13.
Counting BL digitised sound
[Figure annotation: each line is a metadata record]
One metadata record usually represents one digital object
14.
No duplicates, please!
Keep track internally of what you have already supplied to Europeana, both for this project and for other Europeana projects.
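One simple way to keep such a record is to compare the identifiers in a new batch against everything already supplied. This is a minimal sketch; the function name is hypothetical, and it assumes each object has a stable identifier that is the same across projects.

```python
def split_new_and_duplicates(candidate_ids, already_supplied_ids):
    """Separate identifiers that are safe to supply from those already sent.

    Duplicates inside the candidate batch itself are also caught.
    """
    seen = set(already_supplied_ids)
    new, duplicates = [], []
    for identifier in candidate_ids:
        if identifier in seen:
            duplicates.append(identifier)
        else:
            seen.add(identifier)
            new.append(identifier)
    return new, duplicates
```

Only the `new` list would go forward for ingestion; the `duplicates` list is worth reviewing by hand.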
15.
Counting BL digitised printed scores
[Figure annotations: each line is a metadata record; number of digital objects counted for DoW Table 0]
One metadata record often represents many digital objects
16.
Reporting progress – how to record
• Record statistics in your Google or Excel spreadsheet
– See Europeana Sounds Manual for Data Providers section 3.3.3 for links to Google spreadsheets (will be active next week!)
• Update your spreadsheet by the 3rd Friday of each month
• Targets
– are based on Table 0, Metadata Ingestion Survey, emails
– are distributed across the 3 metadata sets
– are the minimum required - feel free to do more!
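To find the reporting deadline for a given month, the third Friday can be computed with the standard library. This is only a convenience sketch; the function name is hypothetical.

```python
import calendar
from datetime import date


def third_friday(year, month):
    """Return the date of the third Friday of the given month."""
    first_weekday = date(year, month, 1).weekday()  # Monday is 0
    # Days from the 1st of the month to the first Friday (0 to 6).
    offset = (calendar.FRIDAY - first_weekday) % 7
    # The third Friday is two weeks after the first one.
    return date(year, month, 1 + offset + 14)
```

For November 2014 this gives 21 November 2014.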
17.
Sample Google spreadsheet showing targets for BL – edit the orange cells!
18.
Thank you for listening!
19.
Metadata Ingestion Training
23-24 October 2014
NTUA, Athens
Metadata Quality
Meaningful metadata
Rights
Controlled vocabularies
Andra Patterson
Metadata Manager, Europeana Sounds
20.
Metadata Quality
• The richer the metadata, the better for discovery by users
• Europeana Sounds provides an opportunity for us to enhance our metadata and check quality
• EDM mandatory elements ensure a minimum metadata standard
• Metadata Quality Task Force (end 2013-mid 2014)
– Quality of metadata varies between institutions
– Need meaningful information in fields
21.
Metadata Quality – Main Issues
• To aid discovery, metadata needs to provide context for the CHO
– Include a meaningful title and/or description
• Metadata needs to be understandable to
– Humans (e.g. rich descriptions, rights information)
– Machines (e.g. UTF-8 encoding, xml:lang)
• Metadata needs to be standardised
– EDM-compliant
– Controlled vocabularies (edm:type, ebucore:hasGenre)
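The machine-readability points above can be checked before ingestion. The sketch below tests whether a file decodes as UTF-8 and lists text elements that carry no `xml:lang` attribute. The element names `title` and `description` are assumptions for illustration, and the check deliberately ignores `xml:lang` values inherited from parent elements.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the reserved "xml:" prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def is_valid_utf8(raw_bytes):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def elements_missing_lang(xml_text, local_names=("title", "description")):
    """List local names of matching elements that have no xml:lang of their own."""
    root = ET.fromstring(xml_text)
    missing = []
    for el in root.iter():
        local = el.tag.rsplit("}", 1)[-1]  # strip any namespace part
        if local in local_names and XML_LANG not in el.attrib:
            missing.append(local)
    return missing
```

A record with a language-tagged title and an untagged description would return `["description"]`.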
22.
Getting Rights Right!
• Establish the rights of your web resources
– May need to discuss with colleagues
– Use information & resources from WP3
• Important to use the most appropriate rights statement for your web resources
– Tells users what they can or can’t do with an object
– Web resources of Public Domain CHOs should be labelled as Public Domain; discuss any issues about this with Andra Patterson or Lisette Kalshoven
23.
Rights – Public Domain Works
• Europeana Public Domain Charter
– “Digitisation of Public Domain content does not create new rights over it”
• Europeana Sounds Consortium Agreement
– “… where possible … content which is in the Public Domain … will be made available without any access restriction and will be labelled as being in the Public Domain …”
• Some data providers may encounter issues with this, e.g.
– Commercial re-use considered inappropriate
• Academic, artistic, private OK; some commercial re-use considered inappropriate; sponsorship funds provided according to this (ONB)
– Desire to refinance digitisation activities
• Government funding is basic; charging fees for high quality images contributes to refinancing digitisation (ONB)
• However, non-profit institutions run the risk of losing non-profit status by earning too much from commercial users! (ONB)
– Legal
• Case law in UK is inconclusive so far (BL)
25.
Rights - EDM
edm:ProvidedCHO dc:rights
– Name of rights holder of CHO, or more general rights information
edm:WebResource dc:rights
– Name of rights holder of a particular web resource, or more general rights information
edm:WebResource edm:rights (Strongly recommended)
– Formal rights statement for a particular web resource
– Overrides statement in ore:Aggregation edm:rights (see below)
– Choose from http://pro.europeana.eu/available-rights-statements
ore:Aggregation edm:rights (Mandatory)
– Formal rights statement for a particular web resource without edm:rights (see above)
– Formal rights statement for a group of web resources without their own edm:rights, when these are attached to one CHO
– Choose with care from http://pro.europeana.eu/available-rights-statements
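The override rule described above can be expressed as a small function: a web resource's own edm:rights wins, and the mandatory ore:Aggregation edm:rights applies otherwise. This is a sketch of the rule as stated on the slide, not of how any particular tool implements it; the function name is hypothetical.

```python
def effective_rights(aggregation_rights, web_resource_rights=None):
    """Return the rights statement that applies to one web resource.

    aggregation_rights: value of ore:Aggregation edm:rights (mandatory).
    web_resource_rights: value of edm:WebResource edm:rights, if present.
    """
    if not aggregation_rights:
        raise ValueError("ore:Aggregation edm:rights is mandatory")
    # A statement on the web resource overrides the aggregation-level one.
    return web_resource_rights or aggregation_rights
```

Because the aggregation-level statement is the fallback for every web resource without its own, it is worth choosing it so that it is correct for all of them.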
26.
What is this?
Danish pastry
Wieneråtta
Wienerbrød
Kopenhagener Plunder
Dänischer Plunder
Danish
27.
Controlled Vocabularies
• Enable users to search and navigate across different metadata sets
• Important in Europeana Portal, where different data providers use different vocabularies
• Bring together using linked data where possible
– LC Linked Data Service
– VIAF (Virtual International Authority File)
28.
Controlled Vocabularies – Linked Data
VIAF Virtual International Authority File
29.
Shared, Controlled Vocabularies
• EDM vocabularies
– edm:rights
• http://pro.europeana.eu/available-rights-statements
– edm:type
• TEXT, VIDEO, SOUND, IMAGE, 3D
• Europeana Sounds new vocabularies
– dcterms:medium
• Europeana Carrier Types Vocabulary
– ebucore:hasGenre
• Europeana Music Genre/Form Vocabulary
• Europeana Non-Music Genre/Form Vocabulary
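A controlled vocabulary is easy to enforce in code. The sketch below validates an edm:type value against the five terms listed above; accepting lower-case input and surrounding whitespace is an assumption made for convenience, and the function name is hypothetical.

```python
# The five permitted edm:type values listed on the slide.
EDM_TYPES = frozenset({"TEXT", "VIDEO", "SOUND", "IMAGE", "3D"})


def normalise_edm_type(value):
    """Return the canonical edm:type term, or raise ValueError if unknown."""
    candidate = value.strip().upper()
    if candidate not in EDM_TYPES:
        allowed = ", ".join(sorted(EDM_TYPES))
        raise ValueError(f"{value!r} is not a valid edm:type; use one of: {allowed}")
    return candidate
```

A free-text value such as "Audio" is rejected rather than guessed at, so the provider can map it to SOUND explicitly.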
30.
Europeana Vocabularies – Carrier Types
[Diagram labels: Europeana Carrier Types Vocabulary; DISMARC dmFormats; RDA Carrier Types; dcterms:medium]
31.
New Europeana Vocabularies – Genre/Form
[Diagram labels: Europeana Music Genre/Form Vocabulary; Europeana Non-Music (Generic) Genre/Form Vocabulary; ebucore:hasGenre; DISMARC dmGenre; DBpedia; D1.1 Content Selection Policy broad categories; Freebase]
32.
Broad Genre/Form Concepts (Mandatory)
Broad Genre (Mandatory)
• Music
• Spoken word
• Radio
• Environment
[Diagram labels: Europeana Music Genre/Form Vocabulary; Europeana Non-Music (Generic) Genre/Form Vocabulary; ebucore:hasGenre]
33.
More About Controlled Vocabularies
• Europeana Sounds Manual for Data Providers section 4.5 has links to recommended vocabularies
– Genre/Form
– Subjects
– Places
– Carrier types
– Digital formats
– Medium of performance
– Names
– Roles
– Works
34.
Thank you for listening!
Image: Friends of Music Society, Greece, CC-BY-NC