A new corpus platform for the Texas German Dialect Project.

Bibliographic Details
Title: A new corpus platform for the Texas German Dialect Project.
Authors: Boas, Hans C.¹ (AUTHOR) hcb@mail.utexas.edu; Schmidt, Thomas² (AUTHOR) thomas@linguisticbits.de; Blevins, Margaret¹ (AUTHOR) mblevins@utexas.edu
Source: Language Resources & Evaluation. Sep 2026, Vol. 60 Issue 3, p1-34. 34p.
Abstract: Texas German is a contact variety that is the result of dialect mixing of several German dialects brought to Texas from central Europe starting in the 1830s. Since 2001, the Texas German Dialect Project has been assembling a large collection of spoken data documenting this unique variety. The present paper describes how a substantial part of this collection was developed into an annotated corpus and how the corpus is now available through a corpus platform based on the ZuMult technology. We start with an outline of the project’s development and its established processes of data collection, transcription, and dissemination. We then explain the process by which the data were cleaned up and enriched with language tagging, orthographic normalization, lemmatization, and part-of-speech tagging. Finally, we illustrate how the new corpus platform makes these annotated data available for systematic browsing and querying. In the outlook, we sketch prospects for future development of the data and for their role in a larger landscape of comparable speech island data. [ABSTRACT FROM AUTHOR]
Copyright of Language Resources & Evaluation is the property of Springer Nature and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Engineering Source