Dataset disclosure risk assessment

Extra support for researchers publishing datasets containing sensitive information on human participants

When publishing research data, it is important to maintain participant confidentiality. Exceptions can be made where explicit consent has been obtained to publish identifying information (for example, where quotes need to be attributed to named individuals), but for the most part research participants will expect to be anonymous in any published data. This means that there must be a low risk of identifying any individual in a published dataset, either from the information included in the dataset itself or by combining the dataset with other known information.

This can be difficult to assess, so the Research Data Service provides a dataset disclosure risk assessment service to users of the data.bris Research Data Repository. This is an extension of the standard data publication process and will happen automatically for any datasets which include information on human participants. The disclosure risk assessment process includes the following steps:

  1. Assessment of data environment and likely sources of risk outside the dataset
  2. Checks for direct identifiers (e.g. names, email addresses)
  3. Assessment of disclosure risk associated with combinations of common indirect identifiers (e.g. age, sex, geographical location, ethnicity)
  4. Suggested steps to mitigate any identified disclosure risks (for example, actions to anonymise variables within the dataset or changes to the dataset access level)
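
As an illustration of what steps 2 and 3 can involve, the sketch below scans a small table for email-like strings and then counts how many records share each combination of indirect identifiers, in the style of a k-anonymity check. It is not the Research Data Service's actual procedure: the records, column names, the threshold `k=3`, and the helper functions `find_email_addresses` and `small_groups` are all hypothetical and chosen only for the example.

```python
import re
from collections import Counter

# Hypothetical toy records; column names and values are illustrative only.
records = [
    {"age_band": "30-39", "sex": "F", "region": "South West", "notes": "prefers morning visits"},
    {"age_band": "30-39", "sex": "F", "region": "South West", "notes": "contact: jane@example.org"},
    {"age_band": "30-39", "sex": "F", "region": "South West", "notes": ""},
    {"age_band": "70-79", "sex": "M", "region": "North East", "notes": ""},
]

# A deliberately simple pattern for email-like strings (step 2).
# It will not catch every direct identifier, e.g. names in free text.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_email_addresses(rows):
    """Return (row_index, field, match) for every email-like string found."""
    hits = []
    for i, row in enumerate(rows):
        for field, value in row.items():
            for match in EMAIL_PATTERN.findall(str(value)):
                hits.append((i, field, match))
    return hits


def small_groups(rows, quasi_identifiers, k):
    """Return value combinations shared by fewer than k rows (step 3).

    Each key is a tuple of quasi-identifier values; each value is the
    number of rows with that combination. The threshold k is an
    assumption and would depend on the dataset and its context.
    """
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return {combo: n for combo, n in counts.items() if n < k}


email_hits = find_email_addresses(records)
risky = small_groups(records, ["age_band", "sex", "region"], k=3)

print(email_hits)
print(risky)
```

Here the single 70-79 / M / North East record is unique on its indirect identifiers, so it would be a candidate for mitigation under step 4, for example by widening the age band or the geographical unit. A count like this only covers risk within the dataset; it says nothing about the outside sources considered in step 1.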

If any changes are suggested in step 4, these will be communicated to the Data Steward (the researcher). No changes to the dataset or its access level will be made without the agreement of the Data Steward.

Researchers publishing via other repositories are also welcome to use the disclosure risk assessment service: please email to request an assessment.
