SlideShare a Scribd company logo
1 of 15
The Europeana Newspapers
Project
IMPACT Final Event
Den Haag, 26-06-2012
Lotte Wilms
Europeana Newspapers
Why newspapers?
  • Important source of information for researchers
  • Relevant for general public

Europeana Newspapers:
  • Aims at the aggregation and refinement of newspapers for The European
    Library and Europeana.
  • Will use refinement methods for OCR, OLR (article segmentation), and named
    entity (NER) and class recognition
  • The libraries participating in the project will provide around 18 million digitised
    newspaper pages to Europeana
  • More libraries will be encouraged to contribute newspapers to Europeana and
    TEL by the project
  • Builds on work from IMPACT


                                                                                      2
Project Profile: Consortium & stakeholders

• 17 partners from 12 countries within the consortium
    • National libraries
    • University libraries
    • SME

• External partners and stakeholders:
    • Involvement of libraries outside the project consortium

• Framework:
    • Funded as a Best Practice Network in the ICT-PSP program of the
      European Commission
    • Project Duration: February 2012 – January 2015

                                                                        3
Europeana Newspapers Consortium


                                    N LE                       N LF
                   LIBER
       TEL
                              SUB HH
                                                         NLL
                                        CCS
USAL
                                                   NLP

       BL                         SBB
                      KB                   ONB

                                                                 NLT
                           UIBK
             BnF

                                              UB
                             LFT
Project Profile: Objectives
1) Selection, Refinement & Aggregation of content
   • Provision of more than 18 million newspaper pages to Europeana,
     many of those with full-text
   • Support move from images to texts in Europeana

2) Analysis of existing newspaper collections
   • Survey of newspaper holdings in Europe

3) Quality Assurance & Best practice recommendations
   • Contribute to optimised workflows
   • Provide best practice recommendations for digitisation, refinement,
     workflows, metadata etc.

4) Presentation and full-text search
   • Improve access to newspaper collections within Europeana

                                                                           5
1) Selection, Refinement & Aggregation of content

• Aggregation of 18 million pages of digitised
  newspapers to Europeana and to The
  European Library
    • 8 million pages “as is” (content providers)
    • 8 million refined pages: OCR (UIBK,
      Austria)                                      www.europeana.eu/
    • 2 million refined pages: OCR/OLR (article
      segmentation) (CCS, Germany)
• Analysis of available digital newspaper
  collections and selection of subsets
  suitable for refinement

                                                    www.theeuropeanlibrary.org/


                                                                              6
1) Refinement – OCR and OLR - UIBK

• 8 million refined pages:
 OCR using ABBYY FRE10 (UIBK,
 Austria)

   • UIBK enriches the OCR with structural
     information from the Document
     Understanding Platform (FEP)
     developed within IMPACT

   • Dedicated profiles will be produced
     which are specifically tuned to the
     characteristics of newspapers to yield
     optimal results
1) Refinement – OCR and OLR - CCS

• 2 million refined pages:
 OCR/OLR (article segmentation)
 (CCS, Germany)

   • CCS produces OCR and verification of
     column recognition, zoning, article
     segmentation, and page class
     recognition

   • CCS provides libraries with a client
     technology for manual correction of
     recognition and segmentation results

   • OCRing done with ABBYY FRE10,
     which includes improvements developed
     within IMPACT                           CCS: Column recognition, article segmentation
1) Refinement - Named Entity Recognition

• KB provides named entities recognition (NER) for material from up to
 three languages (Dutch, English, and German)
   • Pilot planned for second half of 2012




            Image by Frank Landsbergen (INL)
2 ) A n a ly s is o f e x is t in g d ig it is e d
n e w s p a p e r c o lle c t io n s

• Project partners and others are contacted to provide input until 31 July
  2012 to analyse the extent of digitised newspapers collections at their
  institutions
        • Results will be embedded in “Zeitschriftendatenbank” of
          Staatsbibliothek zu Berlin (Union Catalogue of Serials)
        • Potential new partners for the extension of the network will be
          suggested by survey
• Also useful to ascertain the technical status of digitised data


If you have a digital newspaper collection and would like to participate in
the survey  please go to: http://www.surveymonkey.com/s/BQ28579
3) Quality Assurance & Best practice recommendations


• The digitisation workflow for newspapers, including
 refinement, will be evaluation through an evaluation and
 quality assessment framework, containing tools developed
 in IMPACT
   • Document Management System
   • Ground truth production tool Aletheia
   • Evaluation tools


• Provide recommendations on best
  practices for digitisation and
  refinement of newspapers
3) Quality Assurance & Best practice recommendations


• Analysis of metadata formats in use by libraries in
 digitisation projects

• Align metadata models with the METS/ALTO
 standard

• Release best practice recommendation on how to
 apply these formats in newspaper digitisation and
 refinement

• Supports content browser
4) Presentation & Access to full-text

• Within the lifetime of the project, a content browser
 will be built within TEL portal so that users can …
  • Search full text, e.g.
     •   by search term,
     •   by named entities
     •   by collections of newspapers
     •   by date ….
  • See newspaper images
  • Be linked to relevant library sources
  • This browser will be built in TEL during the project;
    and exported to Europeana after the project
5) Dissemination

• Objectives:
   • Establishment of publicity
   • Increasing usage of Europeana
   • Awareness raising among target groups
• Tasks:
   • Media Communication
   • Workshops and conferences
   • Three main dissemination workshops
   • National information days
   • Network extension
   3. Exploitation



                                             14
Thank you for your attention!
http://www.europeana-newspapers.eu/
 Lotte Wilms
 Lotte.wilms@kb.nl

More Related Content

What's hot

Building Bridges: from Europeana Libraries to Europeana Newspapers
Building Bridges: from Europeana Libraries to Europeana NewspapersBuilding Bridges: from Europeana Libraries to Europeana Newspapers
Building Bridges: from Europeana Libraries to Europeana NewspapersLIBER Europe
 
GI2012 pekarek-liber
GI2012 pekarek-liberGI2012 pekarek-liber
GI2012 pekarek-liberIGN Vorstand
 
EuropeanaLocal: overview, progress, aggregation
EuropeanaLocal: overview, progress, aggregationEuropeanaLocal: overview, progress, aggregation
EuropeanaLocal: overview, progress, aggregationEuropeanaLocal Project
 
What library associations can do, advocacy experiences from Germany
What library associations can do, advocacy experiences from GermanyWhat library associations can do, advocacy experiences from Germany
What library associations can do, advocacy experiences from Germanynvbonline
 
Alastair Dunning, The successes of the Europeana Libraries project, The Europ...
Alastair Dunning, The successes of the Europeana Libraries project, The Europ...Alastair Dunning, The successes of the Europeana Libraries project, The Europ...
Alastair Dunning, The successes of the Europeana Libraries project, The Europ...The European Library
 
Positioning libraries in the digital preservation landscape
Positioning libraries in the digital preservation landscapePositioning libraries in the digital preservation landscape
Positioning libraries in the digital preservation landscapeLIBER Europe
 
Challenges on modeling annotations in the Europeana Sounds project
Challenges on modeling annotations in the Europeana Sounds projectChallenges on modeling annotations in the Europeana Sounds project
Challenges on modeling annotations in the Europeana Sounds projectHugo Manguinhas
 
Europeana Newspapers LFT Infoday Genereux
Europeana Newspapers LFT Infoday GenereuxEuropeana Newspapers LFT Infoday Genereux
Europeana Newspapers LFT Infoday GenereuxEuropeana Newspapers
 
Exploring comparative evaluation of semantic enrichment tools for cultural he...
Exploring comparative evaluation of semantic enrichment tools for cultural he...Exploring comparative evaluation of semantic enrichment tools for cultural he...
Exploring comparative evaluation of semantic enrichment tools for cultural he...Hugo Manguinhas
 
Europeana Newspapers - Data, Tools & Future Plans
 Europeana Newspapers - Data, Tools & Future Plans  Europeana Newspapers - Data, Tools & Future Plans
Europeana Newspapers - Data, Tools & Future Plans cneudecker
 
Linking subject labels in Cultural Heritage Metadata to MIMO vocabulary using...
Linking subject labels in Cultural Heritage Metadata to MIMO vocabulary using...Linking subject labels in Cultural Heritage Metadata to MIMO vocabulary using...
Linking subject labels in Cultural Heritage Metadata to MIMO vocabulary using...Hugo Manguinhas
 
Barcelona oldmapsonline
Barcelona oldmapsonlineBarcelona oldmapsonline
Barcelona oldmapsonlinePetr Pridal
 
Dunning seedi-2013-130517083015-phpapp02
Dunning seedi-2013-130517083015-phpapp02Dunning seedi-2013-130517083015-phpapp02
Dunning seedi-2013-130517083015-phpapp02The European Library
 
You’ve Digitised Your Collection. What Next ?
You’ve Digitised Your Collection. What Next ?You’ve Digitised Your Collection. What Next ?
You’ve Digitised Your Collection. What Next ?The European Library
 
Europeana Newspapers Transcribathon
Europeana Newspapers TranscribathonEuropeana Newspapers Transcribathon
Europeana Newspapers Transcribathoncneudecker
 
The Successes of Europeana Libraries
The Successes of Europeana LibrariesThe Successes of Europeana Libraries
The Successes of Europeana LibrariesThe European Library
 
Europeana en de digitale ontsluiting van cultureel erfgoed
Europeana en de digitale ontsluiting van cultureel erfgoedEuropeana en de digitale ontsluiting van cultureel erfgoed
Europeana en de digitale ontsluiting van cultureel erfgoedEuropeanaLocal Project
 
IFLA 2014 Europeana Newspapers Rossitza Atanassova
IFLA 2014 Europeana Newspapers Rossitza AtanassovaIFLA 2014 Europeana Newspapers Rossitza Atanassova
IFLA 2014 Europeana Newspapers Rossitza AtanassovaEuropeana Newspapers
 

What's hot (20)

Building Bridges: from Europeana Libraries to Europeana Newspapers
Building Bridges: from Europeana Libraries to Europeana NewspapersBuilding Bridges: from Europeana Libraries to Europeana Newspapers
Building Bridges: from Europeana Libraries to Europeana Newspapers
 
GI2012 pekarek-liber
GI2012 pekarek-liberGI2012 pekarek-liber
GI2012 pekarek-liber
 
EuropeanaLocal: overview, progress, aggregation
EuropeanaLocal: overview, progress, aggregationEuropeanaLocal: overview, progress, aggregation
EuropeanaLocal: overview, progress, aggregation
 
What library associations can do, advocacy experiences from Germany
What library associations can do, advocacy experiences from GermanyWhat library associations can do, advocacy experiences from Germany
What library associations can do, advocacy experiences from Germany
 
Alastair Dunning, The successes of the Europeana Libraries project, The Europ...
Alastair Dunning, The successes of the Europeana Libraries project, The Europ...Alastair Dunning, The successes of the Europeana Libraries project, The Europ...
Alastair Dunning, The successes of the Europeana Libraries project, The Europ...
 
Positioning libraries in the digital preservation landscape
Positioning libraries in the digital preservation landscapePositioning libraries in the digital preservation landscape
Positioning libraries in the digital preservation landscape
 
Ewelina Rockenbauer - WP1
Ewelina Rockenbauer - WP1Ewelina Rockenbauer - WP1
Ewelina Rockenbauer - WP1
 
Challenges on modeling annotations in the Europeana Sounds project
Challenges on modeling annotations in the Europeana Sounds projectChallenges on modeling annotations in the Europeana Sounds project
Challenges on modeling annotations in the Europeana Sounds project
 
Europeana Newspapers LFT Infoday Genereux
Europeana Newspapers LFT Infoday GenereuxEuropeana Newspapers LFT Infoday Genereux
Europeana Newspapers LFT Infoday Genereux
 
Exploring comparative evaluation of semantic enrichment tools for cultural he...
Exploring comparative evaluation of semantic enrichment tools for cultural he...Exploring comparative evaluation of semantic enrichment tools for cultural he...
Exploring comparative evaluation of semantic enrichment tools for cultural he...
 
Europeana Newspapers - Data, Tools & Future Plans
 Europeana Newspapers - Data, Tools & Future Plans  Europeana Newspapers - Data, Tools & Future Plans
Europeana Newspapers - Data, Tools & Future Plans
 
Linking subject labels in Cultural Heritage Metadata to MIMO vocabulary using...
Linking subject labels in Cultural Heritage Metadata to MIMO vocabulary using...Linking subject labels in Cultural Heritage Metadata to MIMO vocabulary using...
Linking subject labels in Cultural Heritage Metadata to MIMO vocabulary using...
 
Barcelona oldmapsonline
Barcelona oldmapsonlineBarcelona oldmapsonline
Barcelona oldmapsonline
 
Dunning seedi-2013-130517083015-phpapp02
Dunning seedi-2013-130517083015-phpapp02Dunning seedi-2013-130517083015-phpapp02
Dunning seedi-2013-130517083015-phpapp02
 
You've Digitised. What Next ?
You've Digitised. What Next ?You've Digitised. What Next ?
You've Digitised. What Next ?
 
You’ve Digitised Your Collection. What Next ?
You’ve Digitised Your Collection. What Next ?You’ve Digitised Your Collection. What Next ?
You’ve Digitised Your Collection. What Next ?
 
Europeana Newspapers Transcribathon
Europeana Newspapers TranscribathonEuropeana Newspapers Transcribathon
Europeana Newspapers Transcribathon
 
The Successes of Europeana Libraries
The Successes of Europeana LibrariesThe Successes of Europeana Libraries
The Successes of Europeana Libraries
 
Europeana en de digitale ontsluiting van cultureel erfgoed
Europeana en de digitale ontsluiting van cultureel erfgoedEuropeana en de digitale ontsluiting van cultureel erfgoed
Europeana en de digitale ontsluiting van cultureel erfgoed
 
IFLA 2014 Europeana Newspapers Rossitza Atanassova
IFLA 2014 Europeana Newspapers Rossitza AtanassovaIFLA 2014 Europeana Newspapers Rossitza Atanassova
IFLA 2014 Europeana Newspapers Rossitza Atanassova
 

Similar to IMPACT Final Event 26-06-2012 - Use of IMPACT tools in the Europeana Newspapers project by Lotte Wilms (KB)

ENP Belgrade Workshop Project Overview
ENP Belgrade Workshop Project OverviewENP Belgrade Workshop Project Overview
ENP Belgrade Workshop Project OverviewEuropeana Newspapers
 
Neudecker who-cares-about-yesterday’s-news-–-use-cases-and-requirements-for-n...
Neudecker who-cares-about-yesterday’s-news-–-use-cases-and-requirements-for-n...Neudecker who-cares-about-yesterday’s-news-–-use-cases-and-requirements-for-n...
Neudecker who-cares-about-yesterday’s-news-–-use-cases-and-requirements-for-n...cneudecker
 
EuropeanaLocal: objectives, progress and aggregation
EuropeanaLocal: objectives, progress and aggregationEuropeanaLocal: objectives, progress and aggregation
EuropeanaLocal: objectives, progress and aggregationEuropeanaLocal Project
 
Europeana Newspaper metadata LIBER2013
Europeana Newspaper metadata LIBER2013Europeana Newspaper metadata LIBER2013
Europeana Newspaper metadata LIBER2013Europeana Newspapers
 
Realising the value of Europe's newspaper heritage
Realising the value of Europe's newspaper heritage Realising the value of Europe's newspaper heritage
Realising the value of Europe's newspaper heritage Europeana Newspapers
 
Representation and Absence in Digital Resources: The Case of Europeana Newspa...
Representation and Absence in Digital Resources: The Case of Europeana Newspa...Representation and Absence in Digital Resources: The Case of Europeana Newspa...
Representation and Absence in Digital Resources: The Case of Europeana Newspa...TU Delft, Netherlands
 
Digitisation and Digital Humanities - what is the role of Libraries?
Digitisation and Digital Humanities - what is the role of Libraries?Digitisation and Digital Humanities - what is the role of Libraries?
Digitisation and Digital Humanities - what is the role of Libraries?cneudecker
 
in Europeana and the projects
in Europeana and the projectsin Europeana and the projects
in Europeana and the projectsEuropeanaConnect
 
Europeana Newspapers LFT Infoday Muehlberger
Europeana Newspapers LFT Infoday MuehlbergerEuropeana Newspapers LFT Infoday Muehlberger
Europeana Newspapers LFT Infoday MuehlbergerEuropeana Newspapers
 
Summary of Day 1
Summary of Day 1Summary of Day 1
Summary of Day 1Europeana
 
Europeana Libraries: the value of a library domain aggregator
Europeana Libraries: the value of a library domain aggregatorEuropeana Libraries: the value of a library domain aggregator
Europeana Libraries: the value of a library domain aggregatorLIBER Europe
 
Europeana Newspapers in a Nutshell
Europeana Newspapers in a NutshellEuropeana Newspapers in a Nutshell
Europeana Newspapers in a Nutshellcneudecker
 
The ABES Discovery Study
The ABES Discovery StudyThe ABES Discovery Study
The ABES Discovery StudyABES
 
Des nouvelles d’Europeana
Des nouvelles d’EuropeanaDes nouvelles d’Europeana
Des nouvelles d’EuropeanaDouglas McCarthy
 
Naple presentation danish digital library
Naple presentation danish digital libraryNaple presentation danish digital library
Naple presentation danish digital libraryJakobheide
 
2012.03.20 ihr farquhar v03
2012.03.20 ihr   farquhar v032012.03.20 ihr   farquhar v03
2012.03.20 ihr farquhar v03Digital History
 
Europeana Newspapers ICT2013 networking session
Europeana Newspapers ICT2013 networking sessionEuropeana Newspapers ICT2013 networking session
Europeana Newspapers ICT2013 networking sessionEuropeana Newspapers
 

Similar to IMPACT Final Event 26-06-2012 - Use of IMPACT tools in the Europeana Newspapers project by Lotte Wilms (KB) (20)

ENP Belgrade Workshop Project Overview
ENP Belgrade Workshop Project OverviewENP Belgrade Workshop Project Overview
ENP Belgrade Workshop Project Overview
 
Neudecker who-cares-about-yesterday’s-news-–-use-cases-and-requirements-for-n...
Neudecker who-cares-about-yesterday’s-news-–-use-cases-and-requirements-for-n...Neudecker who-cares-about-yesterday’s-news-–-use-cases-and-requirements-for-n...
Neudecker who-cares-about-yesterday’s-news-–-use-cases-and-requirements-for-n...
 
EuropeanaLocal: objectives, progress and aggregation
EuropeanaLocal: objectives, progress and aggregationEuropeanaLocal: objectives, progress and aggregation
EuropeanaLocal: objectives, progress and aggregation
 
How to Build a Digital Library
How to Build a Digital LibraryHow to Build a Digital Library
How to Build a Digital Library
 
Europeana Newspaper metadata LIBER2013
Europeana Newspaper metadata LIBER2013Europeana Newspaper metadata LIBER2013
Europeana Newspaper metadata LIBER2013
 
Realising the value of Europe's newspaper heritage
Realising the value of Europe's newspaper heritage Realising the value of Europe's newspaper heritage
Realising the value of Europe's newspaper heritage
 
Representation and Absence in Digital Resources: The Case of Europeana Newspa...
Representation and Absence in Digital Resources: The Case of Europeana Newspa...Representation and Absence in Digital Resources: The Case of Europeana Newspa...
Representation and Absence in Digital Resources: The Case of Europeana Newspa...
 
Digitisation and Digital Humanities - what is the role of Libraries?
Digitisation and Digital Humanities - what is the role of Libraries?Digitisation and Digital Humanities - what is the role of Libraries?
Digitisation and Digital Humanities - what is the role of Libraries?
 
in Europeana and the projects
in Europeana and the projectsin Europeana and the projects
in Europeana and the projects
 
Data Mining Newspapers Metadata
Data Mining Newspapers MetadataData Mining Newspapers Metadata
Data Mining Newspapers Metadata
 
Europeana Newspapers LFT Infoday Muehlberger
Europeana Newspapers LFT Infoday MuehlbergerEuropeana Newspapers LFT Infoday Muehlberger
Europeana Newspapers LFT Infoday Muehlberger
 
Summary of Day 1
Summary of Day 1Summary of Day 1
Summary of Day 1
 
Europeana Libraries: the value of a library domain aggregator
Europeana Libraries: the value of a library domain aggregatorEuropeana Libraries: the value of a library domain aggregator
Europeana Libraries: the value of a library domain aggregator
 
ENP Belgrade WS Metadata
ENP Belgrade WS MetadataENP Belgrade WS Metadata
ENP Belgrade WS Metadata
 
Europeana Newspapers in a Nutshell
Europeana Newspapers in a NutshellEuropeana Newspapers in a Nutshell
Europeana Newspapers in a Nutshell
 
The ABES Discovery Study
The ABES Discovery StudyThe ABES Discovery Study
The ABES Discovery Study
 
Des nouvelles d’Europeana
Des nouvelles d’EuropeanaDes nouvelles d’Europeana
Des nouvelles d’Europeana
 
Naple presentation danish digital library
Naple presentation danish digital libraryNaple presentation danish digital library
Naple presentation danish digital library
 
2012.03.20 ihr farquhar v03
2012.03.20 ihr   farquhar v032012.03.20 ihr   farquhar v03
2012.03.20 ihr farquhar v03
 
Europeana Newspapers ICT2013 networking session
Europeana Newspapers ICT2013 networking sessionEuropeana Newspapers ICT2013 networking session
Europeana Newspapers ICT2013 networking session
 

More from IMPACT Centre of Competence

More from IMPACT Centre of Competence (20)

Session6 01.helmut schmid
Session6 01.helmut schmidSession6 01.helmut schmid
Session6 01.helmut schmid
 
Session1 03.hsian-an wang
Session1 03.hsian-an wangSession1 03.hsian-an wang
Session1 03.hsian-an wang
 
Session7 03.katrien depuydt
Session7 03.katrien depuydtSession7 03.katrien depuydt
Session7 03.katrien depuydt
 
Session7 02.peter kiraly
Session7 02.peter kiralySession7 02.peter kiraly
Session7 02.peter kiraly
 
Session6 04.giuseppe celano
Session6 04.giuseppe celanoSession6 04.giuseppe celano
Session6 04.giuseppe celano
 
Session6 03.sandra young
Session6 03.sandra youngSession6 03.sandra young
Session6 03.sandra young
 
Session6 02.jeremi ochab
Session6 02.jeremi ochabSession6 02.jeremi ochab
Session6 02.jeremi ochab
 
Session5 04.evangelos varthis
Session5 04.evangelos varthisSession5 04.evangelos varthis
Session5 04.evangelos varthis
 
Session5 03.george rehm
Session5 03.george rehmSession5 03.george rehm
Session5 03.george rehm
 
Session5 02.tom derrick
Session5 02.tom derrickSession5 02.tom derrick
Session5 02.tom derrick
 
Session5 01.rutger vankoert
Session5 01.rutger vankoertSession5 01.rutger vankoert
Session5 01.rutger vankoert
 
Session4 04.senka drobac
Session4 04.senka drobacSession4 04.senka drobac
Session4 04.senka drobac
 
Session3 04.arnau baro
Session3 04.arnau baroSession3 04.arnau baro
Session3 04.arnau baro
 
Session3 03.christian clausner
Session3 03.christian clausnerSession3 03.christian clausner
Session3 03.christian clausner
 
Session3 02.kimmo ketunnen
Session3 02.kimmo ketunnenSession3 02.kimmo ketunnen
Session3 02.kimmo ketunnen
 
Session3 01.clemens neudecker
Session3 01.clemens neudeckerSession3 01.clemens neudecker
Session3 01.clemens neudecker
 
Session2 04.ashkan ashkpour
Session2 04.ashkan ashkpourSession2 04.ashkan ashkpour
Session2 04.ashkan ashkpour
 
Session2 03.juri opitz
Session2 03.juri opitzSession2 03.juri opitz
Session2 03.juri opitz
 
Session2 02.christian reul
Session2 02.christian reulSession2 02.christian reul
Session2 02.christian reul
 
Session2 01.emad mohamed
Session2 01.emad mohamedSession2 01.emad mohamed
Session2 01.emad mohamed
 

Recently uploaded

Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)lakshayb543
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfPatidar M
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptxmary850239
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Projectjordimapav
 
Oppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmOppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmStan Meyer
 
Using Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea DevelopmentUsing Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea Developmentchesterberbo7
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 
Multi Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP ModuleMulti Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP ModuleCeline George
 
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQ-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQuiz Club NITW
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research DiscourseAnita GoswamiGiri
 
4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptxmary850239
 
How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17Celine George
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfVanessa Camilleri
 
Mythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWMythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWQuiz Club NITW
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfPrerana Jadhav
 
Congestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationCongestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationdeepaannamalai16
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operationalssuser3e220a
 
How to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseHow to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseCeline George
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 

Recently uploaded (20)

Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdf
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Project
 
Oppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmOppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and Film
 
Using Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea DevelopmentUsing Grammatical Signals Suitable to Patterns of Idea Development
Using Grammatical Signals Suitable to Patterns of Idea Development
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
Multi Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP ModuleMulti Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP Module
 
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQ-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research Discourse
 
4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx
 
How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdf
 
Mythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWMythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITW
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdf
 
Congestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationCongestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentation
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operational
 
How to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseHow to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 Database
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 

IMPACT Final Event 26-06-2012 - Use of IMPACT tools in the Europeana Newspapers project by Lotte Wilms (KB)

  • 1. The Europeana Newspapers Project IMPACT Final Event Den Haag, 26-06-2012 Lotte Wilms
  • 2. Europeana Newspapers Why newspapers? • Important source of information for researchers • Relevant for general public Europeana Newspapers: • Aims at the aggregation and refinement of newspapers for The European Library and Europeana. • Will use refinement methods for OCR, OLR (article segmentation), and named entity (NER) and class recognition • The libraries participating in the project will provide around 18 million digitised newspaper pages to Europeana • More libraries will be encouraged to contribute newspapers to Europeana and TEL by the project • Builds on work from IMPACT 2
  • 3. Project Profile: Consortium & stakeholders • 17 partners from 12 countries within the consortium • National libraries • University libraries • SME • External partners and stakeholders: • Involvement of libraries outside the project consortium • Framework: • Funded as a Best Practice Network in the ICT-PSP program of the European Commission • Project Duration: February 2012 – January 2015 3
  • 4. Europeana Newspapers Consortium N LE N LF LIBER TEL SUB HH NLL CCS USAL NLP BL SBB KB ONB NLT UIBK BnF UB LFT
  • 5. Project Profile: Objectives 1) Selection, Refinement & Aggregation of content • Provision of more than 18 million newspaper pages to Europeana, many of those with full-text • Support move from images to texts in Europeana 2) Analysis of existing newspaper collections • Survey of newspaper holdings in Europe 3) Quality Assurance & Best practice recommendations • Contribute to optimised workflows • Provide best practice recommendations for digitisation, refinement, workflows, metadata etc. 4) Presentation and full-text search • Improve access to newspaper collections within Europeana 5
  • 6. 1) Selection, Refinement & Aggregation of content • Aggregation of 18 million pages of digitised newspapers to Europeana and to The European Library • 8 million pages “as is” (content providers) • 8 million refined pages: OCR (UIBK, Austria) www.europeana.eu/ • 2 million refined pages: OCR/OLR (article segmentation) (CCS, Germany) • Analysis of available digital newspaper collections and selection of subsets suitable for refinement www.theeuropeanlibrary.org/ 6
  • 7. 1) Refinement – OCR and OLR - UIBK • 8 million refined pages: OCR using ABBYY FRE10 (UIBK, Austria) • UIBK enriches the OCR with structural information from the Document Understanding Platform (FEP) developed within IMPACT • Dedicated profiles will be produced which are specifically tuned to the characteristics of newspapers to yield optimal results
  • 8. 1) Refinement – OCR and OLR - CCS • 2 million refined pages: OCR/OLR (article segmentation) (CCS, Germany) • CCS produces OCR and verification of column recognition, zoning, article segmentation, and page class recognition • CCS provides libraries with a client technology for manual correction of recognition and segmentation results • OCRing done with ABBYY FRE10, which includes improvements developed within IMPACT CCS: Column recognition, article segmentation
  • 9. 1) Refinement - Named Entity Recognition • KB provides named entities recognition (NER) for material from up to three languages (Dutch, English, and German) • Pilot planned for second half of 2012 Image by Frank Landsbergen (INL)
  • 10. 2 ) A n a ly s is o f e x is t in g d ig it is e d n e w s p a p e r c o lle c t io n s • Project partners and others are contacted to provide input until 31 July 2012 to analyse the extent of digitised newspapers collections at their institutions • Results will be embedded in “Zeitschriftendatenbank” of Staatsbibliothek zu Berlin (Union Catalogue of Serials) • Potential new partners for the extension of the network will be suggested by survey • Also useful to ascertain the technical status of digitised data If you have a digital newspaper collection and would like to participate in the survey  please go to: http://www.surveymonkey.com/s/BQ28579
  • 11. 3) Quality Assurance & Best practice recommendations • The digitisation workflow for newspapers, including refinement, will be evaluation through an evaluation and quality assessment framework, containing tools developed in IMPACT • Document Management System • Ground truth production tool Aletheia • Evaluation tools • Provide recommendations on best practices for digitisation and refinement of newspapers
  • 12. 3) Quality Assurance & Best practice recommendations • Analysis of metadata formats in use by libraries in digitisation projects • Align metadata models with the METS/ALTO standard • Release best practice recommendation on how to apply these formats in newspaper digitisation and refinement • Supports content browser
  • 13. 4) Presentation & Access to full-text • Within the lifetime of the project, a content browser will be built within TEL portal so that users can … • Search full text, e.g. • by search term, • by named entities • by collections of newspapers • by date …. • See newspaper images • Be linked to relevant library sources • This browser will be built in TEL during the project; and exported to Europeana after the project
  • 14. 5) Dissemination • Objectives: • Establishment of publicity • Increasing usage of Europeana • Awareness raising among target groups • Tasks: • Media Communication • Workshops and conferences • Three main dissemination workshops • National information days • Network extension 3. Exploitation 14
  • 15. Thank you for your attention! http://www.europeana-newspapers.eu/ Lotte Wilms Lotte.wilms@kb.nl