The DATeCH 2017 schedule is already online. For more information, visit http://ddays.digitisation.eu/schedule/
Satellite workshops
Tuesday, May 30th
9:00 – 16:30 The journey from physical to digital and advancements in cultural heritage digitisation
9:00 – 18:00 TRACER tutorial for computational text reuse detection
13:00 – 17:00 TextGrid user workshop
Wednesday, May 31st
9:00 – 16:30 Handwritten Text Recognition – Transkribus Workshop (project READ)
13:00 – 17:00 PoCoTo user workshop
9:00 – 17:00 IMPACT Members Meeting (only for IMPACT members)
Main conference
Thursday, June 1st
8:30: Registration
9:00 – 9:15: Conference Opening
9:15 – 10:45: Session 1. Transcription
(Chaired by Sinai Rusinek)
- Jesper Zedlitz and Norbert Luttenberger. 750 Volunteers Transcribing 31,000 Pages with 8.5 million Entries Online – an Evaluation
- Enrique Manjavacas and Peter Petre. Enabling Annotation of Historical Corpora in an Asynchronous Collaborative Environment
- Manuel Burghardt and Sebastian Spanner. Allegro: User-centered Design of a Tool for the Crowdsourced Transcription of Handwritten Music Scores
- Jesper Zedlitz and Norbert Luttenberger. Enhancing Human-Transcribed Records by Using OCR
10:45 – 11:15: Coffee break
11:15 – 13:15 Session 2. Natural Language Processing
(Chaired by Klaus Schulz)
- Filip Graliński, Rafał Jaworski, Łukasz Borchmann and Piotr Wierzchoń. The RetroC challenge: how to guess the publication year of a text?
- Cătălina Mărănduc, Cătălin Mititelu and Radu Simionescu. Parsing Romanian Specialized Dictionaries Structured in Nests
- Markus Paluch, Gabriela Rotari, David Steding, Maximilian Weß, Maria Moritz and Marco Büchler. Analysis of part-of-speech tagging of historical German texts
- Alessio Salomoni. Dependency Parsing on Late-18th-Century German Aesthetic Writings. A Preliminary Inquiry into Schiller and F. Schlegel.
- Gustavo Candela, Maria Pilar Escobar Esteban and Borja Navarro-Colorado. In search of Poetic Rhythm: Poetry retrieval through text and metre
13:15 – 14:00: Lunch break
14:00 – 15:30: Session 3. OCR and Postprocessing
(Chaired by Neil Fitzgerald)
- Florian Fink, Klaus U. Schulz and Uwe Springmann. Profiling of OCR’ed Historical Texts Revisited
- Alicia González Martínez, Tillmann Feige and Thomas Eich. Clear-cut methodology for Arabic OCR and post-correction with low technical skilled annotators
- Harald Hammarström, Shafqat Virk and Markus Forsberg. Poor Man’s OCR Post-Correction: Unsupervised Recognition of Variant Spelling Applied to a Multilingual Document Collection
- Manuel Ayuso. OCR of a mixed corpus: early printings and manuscripts of Martianus Capella’s work
15:30 – 16:00: Coffee break
16:00 – 17:30 Session 4. Natural Language Processing on Latin and Greek
(Chaired by Greta Franzini)
- Marco Budassi and Marco Passarotti. The Impact of Unassimilated Loanwords on Latin Lexicon. A Qualitative and Quantitative Analysis
- Corien Bary, Peter Berck and Iris Hendrickx. A Memory-Based Lemmatizer for Ancient Greek
- Herbert Lange. Implementation of a Latin Grammar in Grammatical Framework
- Eleonora Litta, Marco Passarotti and Paolo Ruffolo. Node Formation. Using Networks to Inspect Productivity in Affixal Derivation in Classical Latin
17:30 – 18:15 Poster session
- Jim Salmons. Ground Truth & Softalk Magazine: Using Aletheia Web Edition to do FactMiners’ Text-mining
- Cătălina Mărănduc, Augusto Perez and Victoria Bobicev. Building a Corpus to Study the Historical and Geographical Variation of Romanian Language
- Fotios Fitsilis, Thomas Saalfeld and Carsten Schwemmer. Content reconstruction of parliamentary questions through a combination of meta-data with an OCR process
- Karen Thöle. Digital means for the presentation and evaluation of a 15th century liturgical book
- So Miyagawa, Kirill Bulert and Marco Büchler. Running OCR on Coptic
- Jim Salmons and Timlynn Babitsky. Print-Page Number to “Leaf” ID Mapping in Support of eResearch and Machine-Learning at the Internet Archive
19:00 Dinner
Friday, June 2nd
8:30: Registration
9:00 – 10:30 Session 5. Infrastructure and Linked Open Data
(Chaired by Tomasz Parkola)
- Péter Király. Towards an extensible measurement of metadata quality
- Christophe Onambélé, Matyáš Kopp, Marco Passarotti and Jiří Mírovský. Converting Latin Treebank Data into SQL Database for Query Purposes
- Thierry Declerck and Lisa Schäfer. Porting past classification schemes for narratives to a Linked Data Framework
- Simone Rebora. A Software Pipeline for the Reception of Italian Literature in Nineteenth-Century England. Preliminary Testing
10:30 – 10:45 Best Paper Award Ceremony
10:45 – 11:15 Coffee break
11:15 – 12:45 Session 6. Digitisation & Layout Analysis
(Chaired by Apostolos Antonacopoulos)
- Christian Reul, Uwe Springmann and Frank Puppe. LAREX – A semi-automatic open-source Tool for Layout Analysis and Region Extraction on Early Printed Books
- Svetlana Cojocaru, Ludmila Malahov and Alexandru Colesnicov. Digitization of Old Romanian Texts Printed in the Cyrillic Script
- Christian Clausner, Justin Hayes, Apostolos Antonacopoulos and Stefan Pletschacher. Unearthing the Recent Past: Digitising and Understanding Statistical Information from Census Tables
- Christian Reul, Marco Dittrich and Martin Gruner. Case Study of a highly automated Layout Analysis and OCR of an incunabulum: ‘Der Heiligen Leben’ (1488)
12:45 – 13:30 Lunch break
13:30 – 15:00 Session 7. Spatial Analysis
(Chaired by Marco Büchler)
- Rebecca Benefiel, Sara Sprenkle, Holly Sypniewski and Jamie White. Ancient Graffiti Project: Geo-Spatial Visualization and Search Tools for Ancient Handwritten Inscriptions
- Gustavo Candela, Maria Pilar Escobar Esteban and Manuel Marco-Such. Semantic Enrichment on Cultural Heritage collections: A case study using geographic information
- Mariona Coll Ardanuy and Caroline Sporleder. Weakly-supervised toponym disambiguation in historical documents using semantic and geographic features
- Kimmo Kettunen and Teemu Ruokolainen. Names, Right or Wrong: Named Entities in an OCRed Historical Finnish Newspaper Collection
15:00 – 15:30: Coffee break
15:30 – 16:30 Final Panel