Resource: fnhd.UP

Project

Referenzkorpus Frühneuhochdeutsch: Baumbank.UP

Project members

  1. Ulrike Demske, Project leader: GND
  2. Ulyana Senyuk
  3. Dennis Pauly: ORCID
  4. Marianna Lohmann
  5. Pavel Logacev: ORCID
  6. Katrin Goldschmidt
  7. Iskra Fodor

Institutional details

Project ID
DE 677/7
URL
https://www.uni-potsdam.de/de/guvdds/baumbankup
Funder
Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)
Institution
Universität Potsdam, Institut für Germanistik

Duration
2012-2018

Publications

  1. Pavel Logacev, Katrin Goldschmidt, and Ulrike Demske. 2014. POS-tagging historical corpora: the case of Early New High German. Proceedings of the thirteenth workshop on treebanks and linguistic theories (TLT 13). Tübingen, Germany, 103-112. https://tlt13.sfs.uni-tuebingen.de/tlt13-proceedings.pdf

  2. Dennis Pauly, Ulyana Senyuk, and Ulrike Demske. 2012. Strukturelle Mehrdeutigkeit in frühneuhochdeutschen Texten. Journal for Language Technology and Computational Linguistics 27(2), 65-82. https://jlcl.org/content/2-allissues/11-Heft2-2012/5Pauly.pdf

Creation

Creators

  1. Ulrike Demske, Project leader: GND

Creation process

Original Sources

No information on original sources available for this resource.

Annotation
Annotation mode
Semi-automatic
Annotation standoff
no
Interannotator agreement
yes
Annotation format
TIGER annotation scheme
Segmentation units

phrase, other

Annotation types
Levels
Modes
Semi-automatic
Tag sets
STTS and HiTS (https://www.uni-potsdam.de/de/guvdds/referenzkorpus-fruehneuhochdeutsch-baumbankup/dokumentation)
Annotation tools
  • @nnotate (annotation tool)

    syntactic annotation based on constituent structure

Creation tools

No information available about the tools used to create this resource.

Documentation

Access

Persistent Identifier (PID) of this digital object
https://hdl.handle.net/11022/0000-0007-EAF7-B
Availability
Free for academic use
Distribution Medium
Catalogue Link
Price
Licence
Contact
Prof. Dr. Ulrike Demske, Project leader
ulrike.demske@uni-potsdam.de
Deployment Tool Info
  • Annotate

Text Corpus

Constituent-based annotation of 26 Early New High German texts from different dialect areas (https://www.uni-potsdam.de/de/guvdds/referenzkorpus-fruehneuhochdeutsch-baumbankup/korpusstruktur)

Corpus type
treebank
Temporal classification
historical
Validation
Size information
  • 600,500 tokens
  • 21,430 sentences
Subject languages
  • Early New High German (Dominant language, Source language, Target language)

Data Files

Persistent Identifier (PID) of this resource: https://hdl.handle.net/11022/0000-0007-EAF7-B

Landing page for this resource: https://hdl.handle.net/11022/0000-0007-EAF7-B

Subordinate resources

This data set contains no subordinate resources.

Files

This data set contains the following files:

TigerXML-schema.zip (application/zip, 6.6 KB)
Original file name
TigerXML-schema.zip
Persistent identifier
https://hdl.handle.net/11022/0000-0007-EAF7-B@TigerXML-schema.zip
MIME Type
application/zip
File size
6.6 KB
MD5
95392792a7e0ead2ccd1dfb28e83a628
SHA1
b34c64dcdbcc968c31bebee42c8ca590fc280b55
SHA256
e2ee6c06c87623774aaa966f08718d30d4c4bf8cc2beaebecda7dbbc2fa2ea17
Baumbank.UP-Negra-2.11.zip (application/zip, 3.4 MB)
Original file name
Baumbank.UP-Negra-2.11.zip
Persistent identifier
https://hdl.handle.net/11022/0000-0007-EAF7-B@Baumbank.UP-Negra-2.11.zip
MIME Type
application/zip
File size
3.4 MB
MD5
de999a053298eeb1188d9656df3b4d92
SHA1
8b8407d07cb48e439bde1fcec343d3c0d782eab0
SHA256
34495da17ce8e7b98047dbb76f3a8a9a78d667b499f0f8402365bda017ddbaa7
Baumbank.UP-TigerXML-2.11.zip (application/zip, 7.1 MB)
Original file name
Baumbank.UP-TigerXML-2.11.zip
Persistent identifier
https://hdl.handle.net/11022/0000-0007-EAF7-B@Baumbank.UP-TigerXML-2.11.zip
MIME Type
application/zip
File size
7.1 MB
MD5
027f40ced43585d1a356c1c67a450c35
SHA1
86688e168a4965aedec7bd38a5ed0e120a4a8909
SHA256
289bef32485bcafb736b21442fdce743dc9dda36c0f637ea6a731de9f5d57a2d
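
The checksums listed above can be used to confirm that a downloaded archive is intact. The following is a minimal sketch in Python, assuming each file has been saved locally under its original file name. The function names `sha256_of` and `verify` are illustrative and not part of any archive tooling; the expected values are copied from the listing above.

```python
import hashlib
from pathlib import Path

# SHA256 values copied from the file listing in this record.
EXPECTED_SHA256 = {
    "TigerXML-schema.zip":
        "e2ee6c06c87623774aaa966f08718d30d4c4bf8cc2beaebecda7dbbc2fa2ea17",
    "Baumbank.UP-Negra-2.11.zip":
        "34495da17ce8e7b98047dbb76f3a8a9a78d667b499f0f8402365bda017ddbaa7",
    "Baumbank.UP-TigerXML-2.11.zip":
        "289bef32485bcafb736b21442fdce743dc9dda36c0f637ea6a731de9f5d57a2d",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path):
    """Return True if the file's SHA256 matches the value listed for its name."""
    path = Path(path)
    # Raises KeyError for a file name that is not in the listing (hypothetical helper).
    expected = EXPECTED_SHA256[path.name]
    return sha256_of(path) == expected
```

For example, `verify("Baumbank.UP-TigerXML-2.11.zip")` returns `True` only when the local copy matches the listed digest. The same pattern works for the MD5 and SHA1 values by swapping in `hashlib.md5` or `hashlib.sha1`.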

Citation Information

Please cite the data set itself as follows:

Demske U. (2019): Baumbank.UP/Treebank.UP, version 1. Data set in Tübingen Archive of Language Resources.
Persistent identifier: https://hdl.handle.net/11022/0000-0007-EAF7-B

Individual items in the data set may be cited using their persistent identifiers (see Data files). For example, cite the file Baumbank.UP-Negra-2.11.zip as follows:

Demske U. (2019): Baumbank.UP-Negra-2.11.zip. In: Baumbank.UP/Treebank.UP, version 1. Data set in Tübingen Archive of Language Resources.
Persistent identifier: https://hdl.handle.net/11022/0000-0007-EAF7-B@Baumbank.UP-Negra-2.11.zip
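
The item identifiers follow a simple pattern: the data set handle, an `@` sign, and the file name. As a small illustration, the sketch below builds the identifier and the citation text for any file in the data set. The helper names `item_pid` and `item_citation` are hypothetical, and the citation wording simply mirrors the example above.

```python
# Handle of the data set, as given in this record.
DATASET_PID = "https://hdl.handle.net/11022/0000-0007-EAF7-B"


def item_pid(file_name):
    """Build the persistent identifier of a single file: <handle>@<file name>."""
    return f"{DATASET_PID}@{file_name}"


def item_citation(file_name):
    """Format a citation for one file, following the example in this record."""
    return (
        f"Demske U. (2019): {file_name}. In: Baumbank.UP/Treebank.UP, version 1. "
        "Data set in Tübingen Archive of Language Resources.\n"
        f"Persistent identifier: {item_pid(file_name)}"
    )
```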

General Information

Baumbank.UP is a syntactically annotated corpus of Early New High German (1350-1650). The treebank comprises 26 historical sources from different dialect areas in Germany, totalling 600,500 tokens in 21,430 sentences. Part-of-speech tagging and syntactic analysis were carried out manually.
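
The token and sentence counts can in principle be rechecked against the TigerXML release. The following sketch assumes the usual TIGER-XML layout, in which each sentence is an `s` element whose tokens are `t` elements under `terminals`, with no XML namespace. The embedded sample is invented for illustration and is not taken from Baumbank.UP; the attribute names on `t` in the actual files may differ.

```python
import xml.etree.ElementTree as ET
from io import BytesIO

# Invented two-token sample in TIGER-XML layout; NOT taken from Baumbank.UP.
SAMPLE = b"""<corpus id="demo">
  <body>
    <s id="s1">
      <graph root="s1_500">
        <terminals>
          <t id="s1_1" word="er" pos="PPER"/>
          <t id="s1_2" word="kam" pos="VVFIN"/>
        </terminals>
        <nonterminals>
          <nt id="s1_500" cat="S">
            <edge label="SB" idref="s1_1"/>
            <edge label="HD" idref="s1_2"/>
          </nt>
        </nonterminals>
      </graph>
    </s>
  </body>
</corpus>"""


def count_sentences_and_tokens(source):
    """Stream through a TIGER-XML file and count <s> and <t> elements."""
    sentences = tokens = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "t":
            tokens += 1
        elif elem.tag == "s":
            sentences += 1
            elem.clear()  # free the finished sentence to keep memory use low
    return sentences, tokens


print(count_sentences_and_tokens(BytesIO(SAMPLE)))  # → (1, 2)
```

To run this on the real data, pass the path of an extracted XML file from Baumbank.UP-TigerXML-2.11.zip instead of the in-memory sample. Streaming with `iterparse` avoids loading the whole corpus at once.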

Resource Name
fnhd.UP
Resource Title
Baumbank.UP/Treebank.UP
Resource Class
Corpus
Version
1
Life Cycle Status
released
Start Year
2012
Completion Year
2018
Publication Date
2019
Last Update
Time Coverage
Early New High German (1350-1650)
Legal Owner
Universität Potsdam
Genre
historical corpus of fictional and non-fictional prose, treebank, syntactically annotated corpus
Field of Research
Location
Germany
Tags
Baumbank, Deutsch, Schriftsprache, treebank, German, written language
Modality Info
written