Project
Referenzkorpus Frühneuhochdeutsch: Baumbank.UP (Reference Corpus of Early New High German: Baumbank.UP)
Project members
- Ulrike Demske, Project leader
- Ulyana Senyuk
- Dennis Pauly
- Marianna Lohmann
- Pavel Logacev
- Katrin Goldschmidt
- Iskra Fodor
Institutional details
- Project ID
- DE 677/7
- URL
- https://www.uni-potsdam.de/de/guvdds/baumbankup
- Funder
- Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)
- Institution
- Universität Potsdam, Institut für Germanistik
- Duration
- 2012-2018
Publications
- Pavel Logacev, Katrin Goldschmidt, and Ulrike Demske. 2014. POS-tagging historical corpora: the case of Early New High German. Proceedings of the thirteenth workshop on treebanks and linguistic theories (TLT 13). Tübingen, Germany, 103-112. https://tlt13.sfs.uni-tuebingen.de/tlt13-proceedings.pdf
- Dennis Pauly, Ulyana Senyuk, and Ulrike Demske. 2012. Strukturelle Mehrdeutigkeit in frühneuhochdeutschen Texten. Journal for Language Technology and Computational Linguistics 27(2), 65-82. https://jlcl.org/content/2-allissues/11-Heft2-2012/5Pauly.pdf
Creation
Creators
- Ulrike Demske, Project leader
Creation process
Original Sources
No information on original sources available for this resource.
Annotation
- Annotation mode
- Semi-automatic
- Annotation standoff
- no
- Interannotator agreement
- yes
- Annotation format
- TIGER annotation scheme
- Segmentation units
- phrase, other
- Annotation types
- Levels
- Modes
- Semi-automatic
- Tag sets
- STTS and HiTS (https://www.uni-potsdam.de/de/guvdds/referenzkorpus-fruehneuhochdeutsch-baumbankup/dokumentation)
- Annotation tools
- @nnotate (annotation tool): syntactic annotation based on constituent structure
Creation tools
No information available about the tools used to create this resource.
Documentation
Access
- Persistent Identifier (PID) of this digital object
- https://hdl.handle.net/11022/0000-0007-EAF7-B
- TALAR / Archive Contact
- Contact the archivist to get access to the data.
- Availability
- Free for academic use
- Distribution Medium
- Catalogue Link
- Price
- Licence
- Contact
- Prof. Dr. Ulrike Demske, Project leader, ulrike.demske@uni-potsdam.de
- Deployment Tool Info
- Annotate
Text Corpus
Constituent-based annotation of 26 texts from Early New High German, including different dialect areas (https://www.uni-potsdam.de/de/guvdds/referenzkorpus-fruehneuhochdeutsch-baumbankup/korpusstruktur)
- Corpus type
- treebank
- Temporal classification
- historical
- Validation
- Size information
- 600,500 tokens
- 21,430 sentences
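The token and sentence counts listed above could be cross-checked against the TigerXML release. The following is a minimal sketch, not the project's own tooling: it assumes the files follow the usual TIGER-XML layout, in which every sentence is an `<s>` element and every token a `<t>` element, and the function name `count_sentences_and_tokens` is hypothetical. The internal layout of the archive is not described on this page, so the corpus may be split across several XML files whose counts would need to be summed.

```python
import xml.etree.ElementTree as ET


def count_sentences_and_tokens(source):
    """Count <s> and <t> elements in one TIGER-XML file.

    `source` is a path or a binary file object. The element names are an
    assumption based on the common TIGER-XML layout, not on this page.
    """
    sentences = 0
    tokens = 0
    # iterparse streams the file, so large corpora need not fit in memory.
    for _event, elem in ET.iterparse(source, events=("end",)):
        # Drop a namespace prefix such as "{uri}t" if one is present.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag == "t":
            tokens += 1
        elif tag == "s":
            sentences += 1
            # The sentence is complete; release its subtree.
            elem.clear()
    return sentences, tokens
```

Totals obtained this way may differ from the published figures if, for example, punctuation marks are stored as tokens but were not counted in the size information.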
Subject languages
- Early New High German (Dominant language, Source language, Target language)
Data Files
Persistent Identifier (PID) of this resource: https://hdl.handle.net/11022/0000-0007-EAF7-B
Landing page for this resource: https://hdl.handle.net/11022/0000-0007-EAF7-B
Subordinate resources
This data set contains no subordinate resources.
Files
This data set contains the following files:
TigerXML-schema.zip (application/zip, 6.6 KB)
- Original file name
- TigerXML-schema.zip
- Persistent identifier
- https://hdl.handle.net/11022/0000-0007-EAF7-B@TigerXML-schema.zip
- MIME Type
- application/zip
- File size
- 6.6 KB
- MD5
- 95392792a7e0ead2ccd1dfb28e83a628
- SHA1
- b34c64dcdbcc968c31bebee42c8ca590fc280b55
- SHA256
- e2ee6c06c87623774aaa966f08718d30d4c4bf8cc2beaebecda7dbbc2fa2ea17
Baumbank.UP-Negra-2.11.zip (application/zip, 3.4 MB)
- Original file name
- Baumbank.UP-Negra-2.11.zip
- Persistent identifier
- https://hdl.handle.net/11022/0000-0007-EAF7-B@Baumbank.UP-Negra-2.11.zip
- MIME Type
- application/zip
- File size
- 3.4 MB
- MD5
- de999a053298eeb1188d9656df3b4d92
- SHA1
- 8b8407d07cb48e439bde1fcec343d3c0d782eab0
- SHA256
- 34495da17ce8e7b98047dbb76f3a8a9a78d667b499f0f8402365bda017ddbaa7
Baumbank.UP-TigerXML-2.11.zip (application/zip, 7.1 MB)
- Original file name
- Baumbank.UP-TigerXML-2.11.zip
- Persistent identifier
- https://hdl.handle.net/11022/0000-0007-EAF7-B@Baumbank.UP-TigerXML-2.11.zip
- MIME Type
- application/zip
- File size
- 7.1 MB
- MD5
- 027f40ced43585d1a356c1c67a450c35
- SHA1
- 86688e168a4965aedec7bd38a5ed0e120a4a8909
- SHA256
- 289bef32485bcafb736b21442fdce743dc9dda36c0f637ea6a731de9f5d57a2d
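The checksums listed above can be used to confirm that a download arrived intact. The sketch below is one possible way to do this with Python's standard library; the function names `sha256_of` and `verify_downloads` are hypothetical, and it assumes the three archives were saved under their original file names in a single directory.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file listing above.
EXPECTED_SHA256 = {
    "TigerXML-schema.zip":
        "e2ee6c06c87623774aaa966f08718d30d4c4bf8cc2beaebecda7dbbc2fa2ea17",
    "Baumbank.UP-Negra-2.11.zip":
        "34495da17ce8e7b98047dbb76f3a8a9a78d667b499f0f8402365bda017ddbaa7",
    "Baumbank.UP-TigerXML-2.11.zip":
        "289bef32485bcafb736b21442fdce743dc9dda36c0f637ea6a731de9f5d57a2d",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory):
    """Map each expected file name to True (match), False (mismatch),
    or None (file not found in `directory`)."""
    results = {}
    for name, expected in EXPECTED_SHA256.items():
        path = Path(directory) / name
        results[name] = sha256_of(path) == expected if path.is_file() else None
    return results
```

Calling `verify_downloads(".")` in the download directory returns a dictionary with one entry per archive; any `False` value suggests the file should be fetched again.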
Citation Information
Please cite the data set itself as follows:
Demske U. (2019): Baumbank.UP/Treebank.UP, version 1. Data set in Tübingen Archive of Language Resources.
Persistent identifier: https://hdl.handle.net/11022/0000-0007-EAF7-B
Individual items in the data set may be cited using their
persistent identifiers (see Data files).
For example, cite the file
Baumbank.UP-Negra-2.11.zip
as follows:
Demske U. (2019): Baumbank.UP-Negra-2.11.zip. In: Baumbank.UP/Treebank.UP, version 1. Data set in Tübingen Archive of Language Resources.
Persistent identifier: https://hdl.handle.net/11022/0000-0007-EAF7-B@Baumbank.UP-Negra-2.11.zip
General Information
Baumbank.UP is a syntactically annotated corpus of Early New High German (1350-1650). The treebank consists of 26 historical sources originating from different dialect areas in Germany. It comprises 600,500 tokens in 21,430 sentences. The analysis of POS and syntactic structure was carried out manually.
- Resource Name
- fnhd.UP
- Resource Title
- Baumbank.UP/Treebank.UP
- Resource Class
- Corpus
- Version
- 1
- Life Cycle Status
- released
- Start Year
- 2012
- Completion Year
- 2018
- Publication Date
- 2019
- Last Update
- Time Coverage
- Early New High German (1350-1650)
- Legal Owner
- Universität Potsdam
- Genre
- historical corpus of fictional and non-fictional prose, Treebank, Syntactically annotated Corpus
- Field of Research
- Location
- Germany
- Tags
- Baumbank, Deutsch, Schriftsprache, treebank, German, written language
- Modality Info
- written