Research data storage during project implementation

Proper management of research data throughout project execution is critical to the integrity, security, and accessibility of scientific information. While FAIR principles and long-term archiving set standards for eventual data sharing, active storage during the project requires detailed operational protocols.

Storage principles during project implementation
  • The 3-2-1 Rule – remains the gold standard for research data protection. Each dataset should exist in three copies: one working copy and two backups stored on different media, with at least one backup kept in a location geographically separate from the primary workplace. Backup software automates the creation of regular copies: tools such as rsync, Bacula, or commercial solutions offer scheduling, data compression, and integrity verification.
  • Version Control – requires a systematic approach to tracking changes. Every data modification should be documented with a date stamp, authorship, and a description of the change. Version control systems like Git offer advanced capabilities for tracking change history in text files and code. Tools such as DVC (Data Version Control) enable tracking changes in large datasets, while Git LFS (Large File Storage) extends Git's capabilities to binary files.
  • Access Hierarchy – defines permission levels for research team members. Primary data should have restricted read-only access, while processed data may be available to a broader group of collaborators with editing permissions.
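The 3-2-1 rule with integrity verification described above can be sketched in Python using only the standard library (a minimal illustration of the idea, not a substitute for rsync or Bacula; all names here are illustrative):

```python
import hashlib
import shutil
from pathlib import Path

def sha256sum(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def backup_with_verification(source: Path, *backup_dirs: Path) -> None:
    """Copy every file under `source` into each backup directory and
    verify each copy against the original's checksum. For 3-2-1,
    call with two backup locations, one on a remote/mounted volume."""
    for f in source.rglob("*"):
        if not f.is_file():
            continue
        original_digest = sha256sum(f)
        for target in backup_dirs:
            dest = target / f.relative_to(source)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, dest)  # copy with metadata
            if sha256sum(dest) != original_digest:
                raise IOError(f"Checksum mismatch for {dest}")
```

In practice such a script would run on a schedule (e.g. via cron), with one target on a network or cloud mount to satisfy the geographic-separation requirement.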
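The version-tracking requirement above (date stamps, authorship, change descriptions) can also be recorded outside Git, for example as an append-only JSON-lines log (a minimal sketch; the field names are illustrative assumptions, not a standard format):

```python
import datetime
import json
from pathlib import Path

def record_change(log_path: Path, author: str, description: str) -> None:
    """Append one version-log entry (timestamp, authorship, change
    description) as a single JSON line to an append-only log file."""
    entry = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "author": author,
        "description": description,
    }
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
```

An append-only text log of this kind is easy to diff, grep, and keep alongside the data it describes.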
Storage media
  • Local Storage – encompasses SSD and HDD drives in research computers. SSD drives provide rapid access to frequently used datasets, while high-capacity HDD drives serve as local repositories for larger collections. Regular disk health monitoring and RAID system implementation are crucial for enhanced reliability.
  • Institutional Servers and Network File Systems – servers offer centralized storage with professional backup management. Universities and research institutes often provide dedicated server spaces with guaranteed availability and security parameters. Network file systems, such as NFS (Network File System) or SMB/CIFS, enable disk space sharing among multiple workstations. They provide unified data access for the entire research team while maintaining centralized permission control.
  • Cloud Computing and Synchronization Platforms – cloud solutions offer flexible, scalable storage aligned with project needs. Platforms like Amazon S3, Google Cloud Storage, or Microsoft Azure provide various storage classes adapted to data access frequency. Synchronization platforms include solutions like OneDrive or specialized scientific tools such as OSF (Open Science Framework). Automatic synchronization keeps working copies current, while version history enables restoration of previous file states. When using cloud or synchronization platforms, legal considerations regarding server locations must always be addressed, particularly for personal data. For example, OneDrive under the "A1 for faculty" license ensures that files uploaded to the service are stored on servers located within EU boundaries.
  • Optical and Tape Media – retain significance for long-term archiving of low-access-frequency data. Modern LTO drives offer capacities exceeding 18 TB with lifespans reaching 30 years under proper storage conditions.
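The trade-off running through the media list above, faster media for frequently accessed data and cheaper media for cold data, can be expressed as a simple selection rule (a sketch; the thresholds and tier names are illustrative assumptions, not any provider's actual classes):

```python
def suggest_storage_tier(accesses_per_month: float) -> str:
    """Map expected access frequency to an illustrative storage tier,
    mirroring the hot/cool/archive split of cloud storage classes."""
    if accesses_per_month >= 10:
        return "hot"      # SSD or standard cloud storage
    if accesses_per_month >= 1:
        return "cool"     # infrequent-access cloud class or HDD
    return "archive"      # tape (LTO) or cloud archive class
```

Real storage classes also differ in retrieval latency and per-request cost, so a production rule would weigh those factors as well.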

Effective data storage requires adapting the strategy to the specifics of the research discipline. Projects generating large volumes of imaging data need high-speed mass storage, while longitudinal studies require reliable long-term storage mechanisms. Documenting all storage procedures in the data management plan remains a key element: defining team member roles and responsibilities, backup schedules, and data recovery procedures ensures research continuity even during system failures. Monitoring disk space utilization and storage costs enables resource optimization, and regular dataset reviews allow identification of materials suitable for transfer to cheaper storage classes or archiving.
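A regular dataset review, as described above, can be partly automated by flagging files that have not been modified recently (a minimal sketch; the age threshold is an assumption to be set per project):

```python
import time
from pathlib import Path

def stale_files(root: Path, max_age_days: float):
    """Yield files under `root` not modified within `max_age_days`,
    i.e. candidates for a cheaper storage class or the archive."""
    cutoff = time.time() - max_age_days * 86400
    for f in root.rglob("*"):
        if f.is_file() and f.stat().st_mtime < cutoff:
            yield f
```

The flagged list would then feed a human review step, since modification time alone does not capture a dataset's scientific value.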

The TASK IT Center, as part of supporting researchers from universities participating in TASK in implementing projects funded by the National Science Centre (NCN), offers the NCNdata service, which fulfills NCN requirements for data storage and backup during research.

More information