Data Provision

To get your research data archived, you need to provide:

  • primary research data
  • description of the data (metadata) according to ISO 24622-1 (CMDI)
  • information on the availability of the data for the public, moratoria, etc.
  • a data depositing agreement (only for data providers external to the University of Tübingen)

Tool Support

  • To help package research data and to support its description, researchers may use Bagman, a tool to create packages in the BagIt file packaging format.

  • CLARIN-D provides DMPTY, a data management planner.
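
To illustrate what a BagIt package contains, here is a minimal Python sketch that builds a bag by hand using only the standard library. It is not Bagman and does not reproduce Bagman's behaviour; the function name make_bag, the directory arguments, and the choice of SHA-256 for the manifest are assumptions made for this example. A real deposit would normally also carry metadata, which this sketch omits.

```python
import hashlib
import shutil
from pathlib import Path


def make_bag(source_dir: str, bag_dir: str) -> Path:
    """Copy source_dir into bag_dir/data and write the basic BagIt tag files.

    Hypothetical helper for illustration only; bag_dir must not exist yet.
    """
    source = Path(source_dir)
    bag = Path(bag_dir)
    payload = bag / "data"

    # Payload files live under data/ inside the bag.
    shutil.copytree(source, payload)

    # bagit.txt declares the BagIt version and the encoding of the tag files.
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # The payload manifest lists one checksum per payload file,
    # with paths given relative to the bag's base directory.
    lines = []
    for path in sorted(p for p in payload.rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(bag).as_posix()}\n")
    (bag / "manifest-sha256.txt").write_text("".join(lines), encoding="utf-8")
    return bag
```

Calling make_bag("my_corpus", "my_corpus_bag") would leave the original directory untouched and produce a new directory holding bagit.txt, manifest-sha256.txt, and the copied files under data/.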

For deposits of research data with the Tübingen Archive of Language Resources (TALAR), the archive recommends and accepts the following data formats. For special requirements not covered by these recommendations, researchers should contact the archivists at clarin-repository@sfs.uni-tuebingen.de.


Textual Resources

Type | Data Format | Recommended File Extension | Recommended MIME Type | Comment
Text | TXT | .txt | text/plain | recommended
TEI Documents | XML | .xml | application/xml | recommended
Document | PDF/A | .pdf | application/pdf | recommended

Specialised Linguistic Resources

Type | Data Format | Recommended File Extension | Recommended MIME Type | Comment
Treebanks | CoNLL | .csv, .txt | text/plain | recommended
Treebanks | negra | .csv, .txt | text/plain | recommended
TCF | XML | .tcf | application/xml+tcf | recommended
E-Run Experiment File | E-Run | .ebs2 | application/octet-stream | accepted
E-Merge Experiment File | E-Merge | .emrg2 | application/x-ole-storage | accepted
E-Studio Experimental File | E-Studio | .es2 | text/plain | accepted
Feature Structures | HPSG | .skip | text/plain | legacy, now recommended: TEI
Feature Structures | TDL | .tdl | text/plain | legacy, now recommended: TEI
Diverse | tusnelda | .sgml | text/sgml | legacy, now recommended: TEI

Archive Packages

The content of these packaging formats should follow the TALAR recommendations.

Type | Data Format | Recommended File Extension | Recommended MIME Type | Comment
- | GNU zip | .gz | application/gzip, application/x-gzip | recommended
- | RAR | .rar | application/x-rar | recommended
- | TAR-GZ | .tgz | application/gzip | recommended
- | TAR | .tar | application/x-tar | recommended
- | Zip | .zip | application/zip | recommended
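
As an illustration of producing one of these packages, the following Python sketch packs a directory into a TAR-GZ archive with the standard tarfile module and then lists the files it contains, so the contents can be checked against the format recommendations before deposit. The function name pack_tgz and its arguments are hypothetical and chosen only for this example; TALAR does not prescribe any particular tool for this step.

```python
import tarfile
from pathlib import Path


def pack_tgz(source_dir: str, archive_path: str) -> list[str]:
    """Pack source_dir into a gzip-compressed tar archive and list its files.

    Hypothetical helper for illustration only.
    """
    source = Path(source_dir)

    # "w:gz" writes a tar archive compressed with gzip (.tgz / .tar.gz).
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps only the directory's own name, not its full path.
        tar.add(source, arcname=source.name)

    # Re-open the archive and return the names of the regular files in it.
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())
```

The returned list can be inspected by hand or passed to a format check of your own before the archive is handed over.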

Statistics Files and Program Code

Type | Data Format | Recommended File Extension | Recommended MIME Type | Comment
R-scripts | R | .r / .R | text/plain, text/x-matlab | recommended
SPSS Statistics | SPSS | .sav | application/spss-sav | recommended
SPSS Statistics | SPSS | .spss | application/spss | recommended
SPSS Statistics | SPSS | .spv | application/x-spss-spv | recommended
Lisp Program Code | Lisp | .lsp | text/plain | recommended
Tables | CSV | .csv | text/plain | recommended
Tab Separated Data File | TAB | .tab | text/plain | recommended
Perl script | Perl Script | .pl | application/x-perl | recommended

Media Resources

Type | Data Format | Recommended File Extension | Recommended MIME Type | Comment
Image | BMP | .bmp | image/bmp | recommended
Image | JPEG | .jpg | image/jpeg | recommended
Image | PNG | .png | image/png | recommended
Image | TIFF | .tiff | image/tiff | recommended
Image | GIF | .gif | image/gif | recommended
Vector Graphic | SVG | .svg | image/svg+xml | recommended
Audio | WAVE | .wav | audio/wav | recommended
Video | M4V | .m4v | application/octet-stream | recommended
Document | PDF/A | .pdf | application/pdf | recommended
Biosemi EEG file | Biosemi | .bdf | biosig/bdf | accepted

Other Resources

Type | Data Format | Recommended File Extension | Recommended MIME Type | Comment
Stylesheet | CSS | .css | text/plain | recommended
Document Type Definition | DTD | .dtd | application/xml-dtd | legacy, now XSD
Extensible Markup Language | XML | .xml | application/xml, text/xml | recommended
Xschema | XSD | .xsd | application/xml, text/xml | recommended
Text | HTML | .html | text/html | recommended
Text | HTML | .xhtml | text/html | recommended
Database File | SQLite | .sqlite | application/x-sqlite3 | recommended
Stylesheet | XML | .xsl | application/xslt+xml | recommended
Metadata in CMDI | XML | .xml | application/xml+cmdi | recommended
Bibliography Document | BibTeX | .bib | text/plain | recommended
Jupyter Notebook | Jupyter Notebook | .ipynb | text/plain | accepted
Presentation | PowerPoint Presentation | .pptx | application/vnd.openxmlformats-officedocument.presentationml.presentation | accepted

Legacy Data

Type | Data Format | Recommended File Extension | Recommended MIME Type | Comment
Text Document | DOC | .doc | application/msword | not accepted for new data, tolerated for legacy data for the moment
Text Document | DOCX | .docx | application/vnd.openxmlformats-officedocument.wordprocessingml.document | not accepted for new data, tolerated for legacy data for the moment
Text Document | RTF | .rtf | text/plain | not accepted for new data, tolerated for legacy data for the moment
Text Document | OpenDocument Text Document | .odt | application/vnd.oasis.opendocument.text | accepted if containing formulas and other active components
Database File | Microsoft Access | .mdb | application/vnd.ms-access | not accepted for new data, tolerated for legacy data for the moment
Database File | Microsoft Access | .accdb | text/plain | not accepted for new data, tolerated for legacy data for the moment
Table | OpenDocument Spreadsheet | .ods | application/vnd.oasis.opendocument.spreadsheet | accepted if containing formulas and other active components
Table | Excel Spreadsheet | .xls | application/vnd.ms-excel | accepted if containing formulas and other active components
Table | Excel Spreadsheet | .xlsx | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | accepted if containing formulas and other active components
Presentation | PowerPoint Presentation | .ppt | application/vnd.ms-powerpoint | not accepted for new data, tolerated for legacy data for the moment
Presentation | PowerPoint Presentation | .pptm | application/vnd.ms-powerpoint.presentation.macroEnabled.12 | not accepted for new data, tolerated for legacy data for the moment
Presentation | Mac OS X package format | .key | application/vnd.apple.keynote | not accepted for new data, tolerated for legacy data for the moment
- | Variable Property Mapping | .vpm | text/plain | not accepted for new data, tolerated for legacy data for the moment
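
Before contacting the archive, it can help to check a planned deposit against the tables above. The following Python sketch looks up a file's extension in a small mapping and reports its status. The mapping covers only a hand-picked subset of the listed extensions, and the names STATUS_BY_EXTENSION and deposit_status as well as the status labels are assumptions made for this example. The check looks at the extension only: it cannot tell, for instance, whether a .pdf file is actually PDF/A, and it leaves out the formats that are accepted only under conditions.

```python
from pathlib import Path

# Hand-picked subset of the tables above (illustrative, not complete).
STATUS_BY_EXTENSION = {
    ".txt": "recommended",
    ".xml": "recommended",
    ".pdf": "recommended",  # PDF/A is meant; the variant is not checked here
    ".csv": "recommended",
    ".png": "recommended",
    ".wav": "recommended",
    ".ipynb": "accepted",
    ".pptx": "accepted",
    ".bdf": "accepted",
    ".doc": "legacy only",
    ".docx": "legacy only",
    ".rtf": "legacy only",
    ".ppt": "legacy only",
}


def deposit_status(filename: str) -> str:
    """Return the listed status for a file name, based on its extension."""
    extension = Path(filename).suffix.lower()
    return STATUS_BY_EXTENSION.get(extension, "not listed")


def summarise(filenames: list[str]) -> dict[str, list[str]]:
    """Group file names by their status so problem files stand out."""
    summary: dict[str, list[str]] = {}
    for name in filenames:
        summary.setdefault(deposit_status(name), []).append(name)
    return summary
```

Files reported as "legacy only" or "not listed" are the ones worth converting or raising with the archivists before deposit.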

Status as of 2020-02-24