Element 1: Data Types

Relevant Information


Harvard Dataverse Repository supports any data format, regardless of file extension. Tabular data uploaded in SPSS, Stata, RData, or CSV format receives additional processing: the ingest process indexes the variables and creates an archive-friendly tabular file (.tab). For the ingest process to work correctly, files must be valid and structured according to the formatting requirements described in the user guide: https://guides.dataverse.org/en/latest/user/tabulardataingest/index.html
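As an illustration of checking a file before upload, the following is a minimal Python sketch that flags CSV rows whose field count differs from the header row. It assumes that a consistent number of fields per row is among the formatting requirements; consult the user guide for the authoritative rules. The function name inconsistent_rows and the sample data are hypothetical.

```python
import csv
import io


def inconsistent_rows(csv_text):
    """Return (line_number, field_count) for rows whose width differs from the header.

    Assumption: the first line is a header row, and every later row
    should have the same number of comma-separated fields.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return []  # empty input: nothing to check
    expected = len(header)
    problems = []
    for row in reader:
        if len(row) != expected:
            # reader.line_num is the number of source lines read so far
            problems.append((reader.line_num, len(row)))
    return problems


# Hypothetical sample: the third line is missing its "city" value.
sample = "id,age,city\n1,34,Boston\n2,51\n3,28,Cambridge\n"
print(inconsistent_rows(sample))
```

An empty result suggests the file is at least structurally regular; it does not guarantee that ingest will succeed.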

About Element 1: Data Types

Briefly describe the scientific data to be managed, preserved, and shared:

1.1. Summarize the types and estimated amount of scientific data expected to be generated in the project

1.2. Describe which scientific data from the project will be preserved and shared and provide the rationale for this decision

1.3. List the metadata, other relevant data, and any associated documentation that will be made accessible to facilitate interpretation of the scientific data

Harvard Dataverse Supports

Description

Links to User Guide for Additional Information

All data types

Any data format regardless of file extension

Tabular data

For SPSS, Stata, RData, and CSV files, the ingest process extracts the data content from the user’s files and archives it in an application-neutral, easily readable format.

Note: enhanced ingest for Excel files is not available.

Sample DMP Text for Section 1.3

For tabular data: The PI will ensure that tabular data files are ingested properly by the Dataverse software. To best support archival preservation, Harvard Dataverse stores the raw data content extracted from successfully ingested tabular data files in plain-text, TAB-delimited files. The metadata that describes this content is stored separately in a relational database so that the application can access it efficiently. For archival preservation, this metadata can be exported as plain-text XML files in the standardized, open DDI Codebook format.
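For readers who want to see what working with a TAB-delimited file looks like, here is a minimal Python sketch using only the standard library. It assumes the downloaded file begins with a header row of variable names and wraps string values in double quotes; the function name read_tab and the sample content are hypothetical and are not taken from a real dataset.

```python
import csv
import io


def read_tab(text):
    """Parse TAB-delimited text into a list of dicts keyed by the header row.

    Assumption: the first line holds the variable names.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


# Hypothetical sample content standing in for a downloaded .tab file.
sample_tab = 'id\tcity\n1\t"Boston"\n2\t"Cambridge"\n'
rows = read_tab(sample_tab)
for row in rows:
    print(row["id"], row["city"])
```

Because the format is plain text, any tool that reads delimited files can open it without the original statistical software.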