In the recently launched Europeana Newspapers project, several IMPACT applications will be used to refine digitized newspaper content.
Newspapers form an important part of library collections and are interesting to a large audience, which is why many libraries are currently digitizing their newspaper material. Despite these efforts, access to these collections is still erratic and often limited to local access points. In addition, the OCR (Optical Character Recognition) results are often unsatisfactory and there are problems with metadata and segmentation.
That is why a group of 17 European partner institutions has recently joined forces in the project Europeana Newspapers to work on solving these problems at a European level and providing better access to these newspaper collections. More than 18 million newspaper pages will be added to the Europeana service.
Europeana Newspapers started on 1st February 2012 and will run for three years. The project is funded by the European Commission as an ICT-PSP project, Best Practice Network (CIP 2007-2013). This means that the project will work on the practical application of tools and services developed in other projects, such as several applications that have been developed in IMPACT.
Quality improvement
The project aims to aggregate and refine newspapers for The European Library and Europeana, and will address challenges specific to digitized newspapers:
- use of refinement methods for OCR, OLR (Optical Layout Recognition)/article segmentation, named entity recognition (NER), and page class recognition to enhance search and presentation functionalities for Europeana customers;
- quality evaluation for automatic refinement technologies;
- transformation of local metadata to the Europeana Data Model (EDM);
- metadata standardization in close collaboration with stakeholders from the public and private sector.
Each library participating in the project will provide digitized newspapers and full texts to Europeana free of any legal restrictions. There will be a special focus on newspapers published during the First World War, thus providing a meaningful addition to the resources aggregated by the current Europeana Collections 1914-1918 project.
Project partners
- Staatsbibliothek zu Berlin (coordinator)
- Koninklijke Bibliotheek & The European Library (TEL)
- National Library of Estonia
- Österreichische Nationalbibliothek
- National Library of Finland
- Staats- und Universitätsbibliothek Hamburg
- Bibliothèque nationale de France
- National Library of Poland
- University of Salford
- CCS Content Conversion Specialists GmbH
- Stichting LIBER
- National Library of Latvia
- National Library of Turkey
- University Library of Belgrade
- University of Innsbruck
- Landesbibliothek Dr. Friedrich Tessmann
- The British Library