Data Organization in Spreadsheets: Enabling Reproducibility and Replicability in Research
Ended Nov 5, 2021
2 credits
Full course description
Term: Fall 2021
Date: November 5th, 2021
Time: 12:00pm - 1:30pm
Location: Online Only
Instructor: Kory Trott
Presented By: Scholarly Integrity & Research Compliance (SIRC)
Description:
Reproducibility and replicability are hot topics at all levels of research. The National Academies define reproducibility as "obtaining consistent results using the same data and code as the original study (synonymous with computational reproducibility)". They define replicability as "obtaining consistent results across studies aimed at answering the same scientific question using new data or other new computational methods". Researchers must collect and organize complete data systematically in order to minimize threats to reproducibility and replicability. This seminar will provide you with guidelines on establishing a data collection plan and organizing your data for statistical analysis. The data collection portion of the seminar focuses on brainstorming and creating a formal list of the required data fields, along with their operational definitions, units, frequency of measurement, etc. It also emphasizes the importance of using comment fields to record special events that might affect the statistical analysis of your results. The spreadsheet portion of the talk will help you work smart, not hard, by doing it right the first time. No one wants to spend hours copying and pasting data for input into statistical software, and such reformatting introduces opportunities for error that compromise data integrity and research quality. After the seminar, attendees will be able to streamline their data management practices for subsequent studies that rely on spreadsheet data collection and strengthen their commitment to research reproducibility and replicability (R & R).
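As a rough illustration of the "formal list of data fields" idea, the sketch below writes a small data dictionary (field name, operational definition, unit) in Python and checks a spreadsheet exported as CSV against it. It is not taken from the seminar materials: the field names, the example rows, and the `Field` and `validate` helpers are all hypothetical, chosen only to show one row per observation, one column per field, units kept out of the cells, and special events recorded in a comments column.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One entry in a (hypothetical) data dictionary."""
    name: str
    definition: str
    unit: str
    numeric: bool  # True if every non-empty cell must parse as a number


# Hypothetical data dictionary; a real study would define its own fields.
DATA_DICTIONARY = [
    Field("subject_id", "Identifier assigned at enrollment", "none", False),
    Field("visit_date", "Date of measurement as YYYY-MM-DD", "date", False),
    Field("weight_kg", "Body weight measured on the lab scale", "kg", True),
    Field("comments", "Special events that may affect analysis", "none", False),
]


def validate(csv_text):
    """Return a list of problems; an empty list means the sheet matches."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    expected = [field.name for field in DATA_DICTIONARY]
    if reader.fieldnames != expected:
        problems.append(f"header mismatch: expected {expected}, got {reader.fieldnames}")
        return problems
    # Row 1 of the sheet is the header, so data rows start at 2.
    for row_number, row in enumerate(reader, start=2):
        for field in DATA_DICTIONARY:
            cell = row[field.name] or ""  # short rows yield None; treat as empty
            if field.numeric and cell != "":
                try:
                    float(cell)
                except ValueError:
                    problems.append(
                        f"row {row_number}: {field.name}={cell!r} is not a number"
                    )
    return problems


# Made-up example sheets: one tidy, one with a unit typed into a numeric cell.
GOOD = (
    "subject_id,visit_date,weight_kg,comments\n"
    "S01,2021-11-05,70.5,\n"
    "S02,2021-11-05,,scale recalibrated before visit\n"
)
BAD = (
    "subject_id,visit_date,weight_kg,comments\n"
    "S01,2021-11-05,70.5 kg,\n"
)

print(validate(GOOD))
print(validate(BAD))
```

A sheet that passes a check like this can usually be read by statistical software without manual copying and pasting, which is the kind of reformatting step the description warns about.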