Full course description
Term: Spring 2022
Date: January 28th, 2022
Time: 12:00pm - 1:30pm
Location: Online Only
Instructor: Kory Trott
Presented By: Scholarly Integrity and Research Compliance (SIRC)
Statistical disclosure control (SDC) has been at the forefront of data privacy for many years. The emphasis on open science and on reproducible and replicable research has precipitated the release of both raw and summary data on publicly accessible portals. Yet not everyone releasing such data is aware of the re-identification risks. Government agencies and academic researchers have advanced methodological and policy work in this area, culminating in NIST standards, journal articles, and books that address re-identification risk. Current SDC methodologies go well beyond the HIPAA Safe Harbor method and other best practices aimed at reducing the likelihood of data re-identification. In many cases, the more advanced SDC techniques from these sources should be employed to reduce that risk.
This seminar will introduce you to the SDC concepts of data re-identification, quasi-identifiers, and sensitive values. We will provide guidance on conducting data re-identification risk assessments and on the SDC techniques used to assess and mitigate this risk. These include concepts such as k-anonymity, l-diversity, and household/cluster risk. The trade-off between data utility and re-identification risk will be discussed in the context of mitigation methods such as suppression and perturbation. Available tools for implementing these analyses will be highlighted.
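As a rough illustration of two of the concepts named above, the following Python sketch computes k-anonymity and a simple "distinct" form of l-diversity on a toy dataset, then applies record suppression. It is not taken from the seminar materials: the records, the column names (`age_band`, `zip3`, `diagnosis`), and the helper functions are all hypothetical, and real assessments would normally rely on dedicated SDC tooling.

```python
from collections import defaultdict

def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same combination of quasi-identifier values."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        groups[key].append(rec)
    return groups

def k_anonymity(records, quasi_identifiers):
    """Size of the smallest equivalence class (the dataset's k)."""
    groups = equivalence_classes(records, quasi_identifiers)
    return min(len(g) for g in groups.values())

def l_diversity(records, quasi_identifiers, sensitive):
    """Fewest distinct sensitive values in any equivalence class (distinct l-diversity)."""
    groups = equivalence_classes(records, quasi_identifiers)
    return min(len({r[sensitive] for r in g}) for g in groups.values())

def suppress_small_classes(records, quasi_identifiers, k):
    """Mitigation by suppression: drop every record whose class has fewer than k members."""
    groups = equivalence_classes(records, quasi_identifiers)
    return [r for g in groups.values() if len(g) >= k for r in g]

# Hypothetical toy data: two quasi-identifiers and one sensitive value.
records = [
    {"age_band": "30-39", "zip3": "478", "diagnosis": "asthma"},
    {"age_band": "30-39", "zip3": "478", "diagnosis": "diabetes"},
    {"age_band": "30-39", "zip3": "478", "diagnosis": "asthma"},
    {"age_band": "40-49", "zip3": "479", "diagnosis": "flu"},
    {"age_band": "40-49", "zip3": "479", "diagnosis": "flu"},
    {"age_band": "50-59", "zip3": "479", "diagnosis": "asthma"},
]
QI = ["age_band", "zip3"]

print("k before suppression:", k_anonymity(records, QI))
released = suppress_small_classes(records, QI, k=2)
print("records kept:", len(released), "of", len(records))
print("k after suppression:", k_anonymity(released, QI))
print("l after suppression:", l_diversity(released, QI, "diagnosis"))
```

In this toy case, suppressing the single unique record raises k from 1 to 2 at the cost of one record of utility, yet l stays at 1 because one remaining class contains only a single diagnosis. That gap is the reason l-diversity is assessed in addition to k-anonymity.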