A TaLISMAN: Automatic Text and LIne Segmentation of historical MANuscripts

Date
2014
Publisher
The Eurographics Association
Abstract
Historical and artistic handwritten books are valuable cultural heritage (CH) items, as they provide information about tangible and intangible cultural aspects of the past. Massive digitization projects have made this kind of data available to a worldwide audience, and they pose real challenges for automatic processing. In this scenario, document layout analysis plays a significant role, being a fundamental step of any document image understanding system. In this paper, we present a completely automatic algorithm that performs robust text segmentation of old handwritten manuscripts on a per-book basis, and we show how to exploit this outcome to find two layout elements, i.e., text blocks and text lines. Our proposed technique has been evaluated on a large and heterogeneous corpus, and our experimental results demonstrate that this approach is efficient and reliable, even when applied to very noisy and damaged books.
Description

        
@inproceedings{10.2312/gch.20141302,
  booktitle = {Eurographics Workshop on Graphics and Cultural Heritage},
  editor    = {Reinhard Klein and Pedro Santos},
  title     = {{A TaLISMAN: Automatic Text and LIne Segmentation of historical MANuscripts}},
  author    = {Pintus, Ruggero and Yang, Ying and Gobbetti, Enrico and Rushmeier, Holly},
  year      = {2014},
  publisher = {The Eurographics Association},
  ISSN      = {2312-6124},
  ISBN      = {978-3-905674-63-7},
  DOI       = {10.2312/gch.20141302},
  url       = {https://diglib.eg.org/handle/10.2312/gch.20141302.035-044}
}
Citation