Data Backup & Preservation Terms
Policy and Procedures for Digital Archiving
Harvard University’s policy for digital archiving is part of the institution’s general mission to preserve all of its archival collections and to ensure their availability for current and future use. More specifically, this policy for preserving our digital data collections is meant to ensure continued access to born-digital and digitized data, to ensure their authenticity, and to maintain data quality using the best digital archival practices.
Harvard University, with support from IQSS in particular, commits to best archival practice to ensure that all materials deposited in the archive remain available and usable. This includes: preserving previously deposited versions of materials; deaccessioning (removing) datasets only when legally compelled; maintaining public access to the materials; regularly reviewing risks to materials; and reformatting materials as necessary, where possible, to avoid format obsolescence.
Preservation of Materials Deposited in the Harvard Dataverse
Harvard University supports permanent bit-level preservation of all materials directly deposited in the Harvard Dataverse. In addition, all social science data deposited in the Harvard Dataverse that is made publicly available is replicated by the Data-PASS partners for permanent preservation by the partnership.
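Bit-level preservation is commonly verified with a fixity check: a checksum recorded at deposit is compared against a checksum recomputed later. The following is a minimal sketch of that general technique in Python, not a description of how the Harvard Dataverse implements it. The function names sha256_of and is_intact, and the choice of SHA-256, are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: Path, recorded_digest: str) -> bool:
    """True if the file's current digest matches the digest recorded at deposit.

    Hypothetical helper: the recorded digest is assumed to be a hex string.
    """
    return sha256_of(path) == recorded_digest.lower()
```

A depositor could run the same kind of check locally, for example is_intact(Path("survey.tab"), recorded), where recorded is a digest saved when the file was created.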
In addition to Harvard University’s commitment to archiving and long-term access for all data published in the Harvard Dataverse, the Harvard Dataverse takes data publication very seriously (see Joint Declaration of Data Citation Principles). It encourages good curation practices through support for standards-based metadata schemas, proper documentation, and automatic extraction of metadata from FITS and tabular files to enable data discovery and reuse. Tabular files deposited in the Harvard Dataverse are reformatted into simple open-format text files (.tab format), with variable-level XML metadata based on the Data Documentation Initiative (DDI), to ensure long-term preservation of the data. Also, once a dataset is published, the repository guarantees archiving of and long-term access to that dataset through a DOI persistent identifier provided by DataCite.
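To illustrate what reformatting a tabular file into a simple open text format can look like, here is a minimal Python sketch that rewrites CSV text as tab-delimited text. It assumes that .tab denotes tab-delimited text, and the function name csv_to_tab is hypothetical. It does not reproduce the Harvard Dataverse's own ingest process, which also produces the DDI variable-level metadata described above.

```python
import csv
import io


def csv_to_tab(csv_text: str) -> str:
    """Rewrite comma-separated text as tab-delimited text (illustrative only)."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    # Tab as the delimiter; fields are quoted only when they need to be.
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


if __name__ == "__main__":
    sample = 'id,name\n1,"Smith, J"\n'
    print(csv_to_tab(sample), end="")
```

Because the delimiter is a tab, a value that contains a comma no longer needs quoting in the output, which keeps the file readable in any text editor.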
To ensure long-term accessibility of datasets in the Harvard Dataverse, a dataset cannot be unpublished once it has been published, and it can be deaccessioned only under extreme circumstances, such as a legal requirement to destroy that dataset. Even in these circumstances, a tombstone landing page with the basic citation metadata will always be accessible to the public through the persistent URL (Handle or DOI) provided in the citation for that dataset. Users will not be able to see any of the files or additional metadata that were available before deaccession.
Because some of the datasets in the Harvard Dataverse are self-curated, owners or distributors of individual datasets have control over the selection of materials, documentation, access policies, and data user agreements for their datasets. Therefore, questions about finding and using data distributed by others in the Harvard Dataverse should in general be referred to the individual dataset owners.
Changes to this Preservation Policy
Harvard Dataverse may revise this preservation policy at its sole discretion. Please check this page regularly for our current practices. If you have any questions about this preservation policy, the practices of this site, or your dealings with this site, you can contact: firstname.lastname@example.org.
This policy was last modified: 01/15/2020.