Element 1: Data Types
Relevant Information
Harvard Dataverse Repository supports any data format regardless of file extension. Tabular data uploaded in SPSS, Stata, RData, or CSV format is processed further on upload to index its variables and create an archive-friendly tabular file (.tab). For the ingest process to work correctly, files must be valid and structured according to the formatting requirements described in the user guide: https://guides.dataverse.org/en/latest/user/tabulardataingest/index.html
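Before uploading a CSV file, a quick local check can catch structural problems that might prevent a clean ingest. The following is a minimal sketch, not part of the Dataverse software: the function name `check_csv_shape` is hypothetical, and the rules it checks (a comma-delimited file whose first row holds unique, non-empty variable names and whose remaining rows all have the same number of values) are assumptions made for illustration. The user guide linked above remains the authoritative statement of the formatting requirements.

```python
import csv
import io


def check_csv_shape(text):
    """Return a list of structural problems found in comma-delimited text.

    Hypothetical pre-upload check; an empty list means none of the
    assumed problems were found, not that ingest is guaranteed to succeed.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows or not rows[0]:
        return ["file is empty or has no header row"]

    header = rows[0]
    problems = []

    # Assumption: every column needs a non-empty, unique variable name.
    if any(not name.strip() for name in header):
        problems.append("header row contains an empty variable name")
    if len(set(header)) != len(header):
        problems.append("header row contains duplicate variable names")

    # Assumption: every data row has as many values as the header.
    for row_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {row_no}: expected {len(header)} values, found {len(row)}"
            )
    return problems
```

To check a file on disk, read it as text and pass the contents in, for example `check_csv_shape(open("survey.csv", newline="").read())`, where `survey.csv` is a placeholder file name. The function returns messages instead of raising so that every problem in a file is reported in one pass.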
About Element 1: Data Types
Element 1: Data Types | Briefly describe the scientific data to be managed, preserved, and shared: 1.1. Summarize the types and estimated amount of scientific data expected to be generated in the project. 1.2. Describe which scientific data from the project will be preserved and shared, and provide the rationale for this decision. 1.3. List the metadata, other relevant data, and any associated documentation that will be made accessible to facilitate interpretation of the scientific data. | |
Harvard Dataverse Supports | Description | Links to User Guide for Additional Information |
All data types | Any data format regardless of file extension | Data formats for preservation: https://www.openaire.eu/data-formats-preservation-guide Digital Preservation Handbook: https://www.dpconline.org/handbook/technical-solutions-and-tools/file-formats-and-standards |
Tabular data | For SPSS, Stata, RData, and CSV files, the ingest process extracts the data content from the user’s files and archives it in an application-neutral, easily readable format. Note: enhanced ingest for Excel files is not available. | Tabular Data File Ingest: https://guides.dataverse.org/en/latest/user/tabulardataingest/index.html |
Sample DMP Text for Section 1.3 | For tabular data: The PI will ensure that tabular data files are ingested properly by the Dataverse software. To best support archival preservation, Harvard Dataverse stores the raw data content extracted from successfully ingested tabular data files in plain-text, TAB-delimited files. The metadata that describes this content is stored separately, in a relational database, so that the application can access it efficiently. For archival preservation, the metadata can be exported as plain-text XML files in the standardized, open DDI Codebook format. | |