Data Sharing Plan

DMS Plan

Working with us

The INPC is your partner for data management and sharing (DMS). Every research project should have a DMS Plan that evolves over time. The DMS Plan should be 3 to 5 pages long and follow the NIH guidelines.

When working with the INPC, your DMS Plan should account for how we process and store neuroimaging data and how we can advise on the other data you collect.

Neuroimaging

Processing

We integrate with the Magnetic Resonance Research Facility here at Iowa, and we use the Argon cluster for processing. The neuroimaging data is stored in BIDS format on a dedicated LSS drive called inc_database. Access to inc_database is allowed only through Globus. Other types of sharing are possible but take more time to set up and execute. Sharing with NDA is complicated and should be discussed early in the process.

Your data

Advising

For the data you collect in your research, we can advise on and help build workflows that produce quality data using REDCap, Excel with macros, CSV files with data dictionaries, and Bash and Python scripts.

Source Data

TREE

Source data is the original, unprocessed data collected from various sources. It encompasses all potential variables and represents the starting point in the data management process.

There are several types of source data: neuroimaging data from the scanner, single-file sources such as a REDCap export or an ASEBA download, and multiple-file sources such as the NIH Toolbox. These unprocessed sources are difficult to use directly.

Raw Data

LUMBER

Raw data is the processed version of source data. It has been cleaned, organized, and made ready for analysis, serving as the foundation for data exploration and interpretation.

For neuroimaging raw data, we use NIfTI files organized according to the BIDS standard. For phenotype data, we recommend CSV files with data dictionaries. For REDCap, this requires a detailed data dictionary and the creation of multiple reports. Other sources need a processing pipeline that creates both the data and the data dictionary.

Derivatives

CABIN

Derivatives are the outputs generated from raw data analysis. These include processed reports, visualizations, and models that provide actionable insights and drive decision-making processes.

Neuroimaging derivatives are organized by pipeline, such as our custom INC processing, Brains Auto Workup, and FreeSurfer.

Sharing with NDA is a complex task that requires several new derivatives that follow the NDA data dictionary. This applies to both neuroimaging and phenotype data.

INPC

Gather neuroimaging data

INPC will handle the collection of all neuroimaging data, ensuring it is accurately recorded and maintained.

Process neuroimaging data

Once collected, INPC will be responsible for processing this data, organizing it in a way that is usable for the Project Research Team (PRT). This includes turning sourcedata into rawdata and then processing the rawdata into derivatives.

Share neuroimaging data

INPC will coordinate with the PRT and any relevant data repositories to ensure that processed neuroimaging data is shared correctly and responsibly, in line with relevant data sharing policies and guidelines.

Joint

Cooperation and communication

Both the INPC and the PRT will need to maintain open lines of communication, regularly updating each other on progress, challenges, and changes related to data management. This collaboration is key to ensuring the data management and sharing plan is executed effectively.

Adherence to data protection and privacy guidelines

Both teams must ensure all data handling adheres to required data protection and privacy standards. This includes the de-identification of data and any necessary permissions or consents for data sharing.

Research Team

Generate additional research data

The PRT will be in charge of collecting additional data, using tools like REDCap, NIH Toolbox, and ASEBA.

Execute the DMS Plan

The PRT, under the direction of the Principal Investigator, will take the lead in executing the DMS plan. This includes ensuring all data collection and management is conducted according to the plan and coordinating with the INPC on data processing and sharing tasks.

NIH DATA

Data Repository

NIH encourages researchers to select the repository that is most appropriate for their data type and discipline.

Website: Selecting a Data Repository; Repositories for Sharing Scientific Data

NIMH Data Archive

NDA Repository

Sharing data with the NDA repository requires harmonizing your data to the NDA Data Dictionary, validating the data, and uploading the data every six months.
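The harmonizing step can be pictured as renaming local columns to NDA element names and then checking that nothing required is missing. The sketch below is a minimal illustration only: the local column names in `LOCAL_TO_NDA` are hypothetical, and the `REQUIRED` list reflects the elements that, to our understanding, NDA data structures commonly require. Confirm both against the NDA Data Dictionary for your specific structures. It does not replace NDA's own validation tooling.

```python
# Elements assumed to be required by NDA structures; verify in the NDA Data Dictionary.
REQUIRED = ("subjectkey", "src_subject_id", "interview_date", "interview_age", "sex")

# Hypothetical mapping from local column names to NDA element names.
LOCAL_TO_NDA = {
    "guid": "subjectkey",
    "participant_id": "src_subject_id",
    "visit_date": "interview_date",
    "age_months": "interview_age",
}


def harmonize(row):
    """Rename local columns to NDA element names; unknown columns pass through."""
    return {LOCAL_TO_NDA.get(name, name): value for name, value in row.items()}


def missing_required(row):
    """List required elements that are absent or empty in a harmonized row."""
    return [name for name in REQUIRED if not row.get(name)]


if __name__ == "__main__":
    local_row = {
        "guid": "NDAR_EXAMPLE",  # placeholder, not a real GUID
        "participant_id": "sub-01",
        "visit_date": "01/15/2024",
        "age_months": "120",
    }
    print(missing_required(harmonize(local_row)))
```

Running a check like this before each six-month upload makes gaps visible early, while there is still time to recover the missing values from the source data.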

Website; Data Dictionary; Webinars and Tutorials

REDCap

Database and Reporting

About

REDCap is a versatile platform that can store a variety of information for your project. We recommend using at least two REDCap databases per project to enhance data security. The first database should contain personally identifiable information for tracking purposes. The second database should contain only de-identified data.
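As a rough illustration of the two-database idea, the sketch below splits one record into a tracking part and a de-identified part that share only a study ID. The field names in `PII_FIELDS` and the `study_id` key are hypothetical, and removing a fixed list of fields is an assumption made for the example; it is not a complete de-identification procedure.

```python
# Hypothetical list of fields that identify a person.
PII_FIELDS = {"name", "date_of_birth", "phone"}


def split_record(record, key="study_id"):
    """Split a record into (tracking, deidentified) dicts linked by `key`."""
    tracking = {key: record[key]}
    deidentified = {key: record[key]}
    for field, value in record.items():
        if field == key:
            continue
        # Identifying fields go to the tracking database, everything else
        # goes to the de-identified database.
        target = tracking if field in PII_FIELDS else deidentified
        target[field] = value
    return tracking, deidentified


if __name__ == "__main__":
    record = {"study_id": "S001", "name": "Example Person", "phone": "555-0100", "score": 17}
    tracking, deidentified = split_record(record)
    print(tracking)
    print(deidentified)
```

Only the study ID appears in both outputs, so the de-identified table can be analyzed or shared without the tracking table.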

A crucial component of using REDCap is the data dictionary. It provides detailed information about the data, clarifying what each data entry represents.

When data analysis will occur outside of REDCap, you might find it beneficial to establish additional REDCap databases for the de-identified data. This is particularly useful when some data is entered manually into a form while other data can be imported directly.

While REDCap excels in data storage, it is not ideally suited for data analysis. Therefore, multiple reports should be generated from the REDCap platform and saved as CSV files. With each report, it's important to create a corresponding data dictionary, derived from the original REDCap data dictionary. This new dictionary will serve as a comprehensive guide, outlining the contents of the respective CSV file.
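Deriving a report-level data dictionary can be sketched as filtering the project data dictionary down to the columns that appear in the report. The example below assumes the dictionary CSV uses the column header "Variable / Field Name" for field names; check this against your own export. Expanded checkbox columns and other export-only columns are not handled here and are simply returned as undocumented so they can be described by hand.

```python
import csv
import io

# Assumed header of the field-name column in the exported data dictionary.
FIELD_NAME = "Variable / Field Name"


def report_dictionary(dictionary_csv, report_csv):
    """Return (kept_entries, undocumented_columns) for one report.

    kept_entries: dictionary rows for columns present in the report,
                  in report column order.
    undocumented_columns: report columns with no dictionary entry.
    """
    report_columns = next(csv.reader(io.StringIO(report_csv)))
    entries = {
        row[FIELD_NAME]: row
        for row in csv.DictReader(io.StringIO(dictionary_csv))
    }
    kept = [entries[name] for name in report_columns if name in entries]
    undocumented = [name for name in report_columns if name not in entries]
    return kept, undocumented


if __name__ == "__main__":
    dictionary_csv = (
        '"Variable / Field Name","Field Label"\n'
        "record_id,Record ID\n"
        "age,Age in years\n"
        "notes,Free-text notes\n"
    )
    report_csv = "record_id,age,redcap_event_name\n1,34,baseline_arm_1\n"
    kept, undocumented = report_dictionary(dictionary_csv, report_csv)
    print([row[FIELD_NAME] for row in kept])
    print(undocumented)
```

Saving the kept entries next to each report CSV gives every file its own dictionary while keeping the wording consistent with the original REDCap definitions.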

Excel

Spreadsheet and Macros

About

Excel is a powerful tool for managing large volumes of data, especially for quickly scanning information and making simple updates. More complex modifications can be made by hand, but that is not the most efficient method.

Excel's macro feature is particularly useful for these complex tasks. Macros are sequences of commands that can automate repetitive tasks. By programming these sequences, you can perform intricate changes to your data with a single command, which makes your work more efficient and minimizes the chance of errors.

Our team has expertise in using Excel macros. If you find yourself needing to automate complex tasks in Excel, we're here to assist. This collaboration can help ensure your data is managed effectively and your project runs smoothly.

Scripts

Bash and Python: Command-Line

About

For certain complex data tasks, traditional tools like REDCap or Excel may fall short. In such scenarios, using command-line scripts through languages like Bash and Python can be more effective. Bash is commonly used for tasks like managing files and automating repetitive processes. Python, with its powerful data processing capabilities, can handle more complex data manipulations.

Our team has experience with both Bash and Python. When necessary, we can assist in creating and running scripts to better manage and share your data. If you encounter any data tasks that require command-line operations, know that assistance is available.
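As one small example of the kind of file-management task a script handles well, the sketch below builds an inventory of every file under a directory, which can then be compared between a source copy and a shared copy. The function name and the temporary demo directory are illustrative assumptions, not part of any INPC tooling.

```python
import tempfile
from pathlib import Path


def inventory(root):
    """Return a sorted list of (relative_path, size_in_bytes) for files under root."""
    root = Path(root)
    return sorted(
        (path.relative_to(root).as_posix(), path.stat().st_size)
        for path in root.rglob("*")
        if path.is_file()
    )


if __name__ == "__main__":
    # Demo on a throwaway directory so the example runs anywhere.
    with tempfile.TemporaryDirectory() as demo:
        (Path(demo) / "sub-01").mkdir()
        (Path(demo) / "sub-01" / "notes.txt").write_text("hello")
        for relative_path, size in inventory(demo):
            print(relative_path, size)
```

Comparing two inventories with `==` is a quick way to confirm that a transfer copied every file at the expected size, although it does not verify file contents.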

BIDS

Brain Imaging Data Structure

The Brain Imaging Data Structure is a standard that organizes and formats neuroimaging data in a manner that makes it easier to understand, share, and use. By adhering to BIDS, we ensure that our neuroimaging data is uniformly organized and labeled, making it more comprehensible for users. BIDS supports a wide range of neuroimaging data types, including MRI and fMRI, and its consistent naming convention facilitates the sharing and usage of this data.
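The naming convention can be illustrated with a short sketch that assembles a BIDS-style relative path from its parts. It covers only the subject, session, task, and run entities, and the helper name `bids_path` is hypothetical; consult the BIDS specification for the full entity list and for which entities each data type allows.

```python
from pathlib import PurePosixPath


def bids_path(subject, session, datatype, suffix, task=None, run=None,
              extension=".nii.gz"):
    """Build a BIDS-style relative path from a small subset of entities."""
    entities = [f"sub-{subject}", f"ses-{session}"]
    if task is not None:
        entities.append(f"task-{task}")
    if run is not None:
        entities.append(f"run-{run}")
    # Entities are joined with underscores, followed by the suffix and extension.
    filename = "_".join(entities + [suffix]) + extension
    return PurePosixPath(f"sub-{subject}", f"ses-{session}", datatype, filename)


if __name__ == "__main__":
    print(bids_path("01", "01", "anat", "T1w"))
    print(bids_path("01", "01", "func", "bold", task="rest", run="1"))
```

Because every file name carries its own subject, session, and modality labels, a file remains interpretable even when it is copied out of its folder.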

CSV

Data Dictionary

CSV files are a simple, widely used format for storing tabular data. Accompanied by a well-structured data dictionary, a CSV file can serve as a versatile and user-friendly data source. The data dictionary describes each data element in the CSV file, including its name, meaning, and allowable values. This pairing lets users understand, analyze, and manipulate the data regardless of their field or level of technical expertise.
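One practical benefit of the pairing is that a CSV file can be checked against its data dictionary automatically. The sketch below is a minimal example under stated assumptions: the dictionary is held as a Python dict with hypothetical `description` and `allowed` keys, and the column names and values are invented for illustration.

```python
import csv
import io

# Hypothetical data dictionary: one entry per column.
# "allowed" is a set of permitted values, or None when any value is accepted.
DICTIONARY = {
    "participant_id": {"description": "Study ID", "allowed": None},
    "handedness": {
        "description": "Dominant hand",
        "allowed": {"left", "right", "ambidextrous"},
    },
}

SAMPLE = "participant_id,handedness\nsub-01,left\nsub-02,rigth\n"


def validate_rows(csv_text, dictionary):
    """Return a list of (line_number, column, problem) tuples."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Flag columns that appear in the file but not in the dictionary.
    for column in reader.fieldnames or []:
        if column not in dictionary:
            problems.append((1, column, "not in data dictionary"))
    # The header is line 1, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        for column, spec in dictionary.items():
            value = row.get(column)
            if value is None:
                problems.append((line_number, column, "column missing"))
            elif spec["allowed"] and value not in spec["allowed"]:
                problems.append((line_number, column, f"unexpected value {value!r}"))
    return problems


if __name__ == "__main__":
    for problem in validate_rows(SAMPLE, DICTIONARY):
        print(problem)
```

In the sample, the misspelled value on the third line is reported because it is not among the allowed values for `handedness`.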

NDA

Data Dictionary

The NDA Data Dictionary offers a standardized set of variables for structuring and detailing your data. This is pivotal for ensuring uniformity and transparency, allowing other researchers to more easily comprehend and interpret your data. By aligning your data with the NDA Data Dictionary, you're making it interoperable with a broad array of other datasets, and easily integrated into future research efforts.

NIH Example

A sample plan that uses both neuroimaging and phenotype data. In the document, Element 3: Standards shows the integration with the NDA data dictionary.

SAMPLE

PLAN EVOLUTION

Using your Data Management and Sharing Plan is crucial for effectively managing your data. Just like a map guides you on your journey, the DMS Plan provides clear instructions on how to handle your data at each stage. Data management and processing activities occur throughout the research project. However, there are instances where focused efforts and additional processing are needed to meet specific objectives such as data sharing or optimizing the dataset for analysis.

To keep the DMS Plan useful, it's important to update it as your dataset evolves. This ensures that your Plan remains practical and relevant, allowing you to readily analyze and share your data. The ultimate goal is to maintain a flexible dataset that can meet the changing demands of your research.

Biannual

6 Month Updates

Some repositories, like NDA, mandate data sharing every six months. This frequency provides an excellent opportunity to refresh your DMS Plan and get your dataset in order. Regular updates ensure your data remain current, relevant, and readily available for research needs.

Annual

Yearly Review

Some repositories require annual updates. This yearly review is not just a routine task, but a valuable checkpoint to ensure your project is on the right track. It's an opportunity to realign your dataset with your project goals and make any necessary adjustments.

Conditional

Maintenance

Sometimes, you may find yourself in a situation where your dataset is needed for analysis or sharing, but it isn't ready. If you have been using your DMS Plan diligently, the process of making additional updates will be relatively straightforward. However, if the DMS Plan has been neglected or ignored, the required work can become challenging and troublesome. When properly used and maintained, your DMS Plan serves as a reliable guide, even in the face of unexpected events. It provides the necessary direction and support to navigate through such circumstances.

PHASE ONE

Phase One includes gathering, assessing, and ensuring the accuracy and completeness of the source data.

PHASE TWO

Phase Two involves processing the source data into raw data and derivatives for sharing with the PI, PRT, or relevant repositories.
