Publishing (anonymisation and sharing)

Several projects have published data. Because universities are obliged to keep much student data confidential, and because of the legal requirements of the Data Protection Act (1998), data needs to be anonymised before publication. This can be done, for example, by encrypting or removing user identifiers and/or IP addresses that could be used to identify an individual, or by publishing only summary or statistical information from which individual information cannot be derived. See the guide on anonymisation.
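As an illustration of the first approach, the following is a minimal Python sketch, not any project's actual method. It replaces each user identifier with a keyed hash (HMAC-SHA256, used here in place of encryption) so that the same user always maps to the same token, and it drops the IP address altogether. The field names (user_id, ip_address, resource), the sample records and the secret key are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret: it would need to be generated once, kept private,
# and never published alongside the data.
SECRET_KEY = b"replace-with-a-long-random-secret"


def pseudonymise(user_id: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable keyed hash (truncated to 16 hex characters) of a user identifier."""
    digest = hmac.new(key, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def anonymise_record(record: dict) -> dict:
    """Replace the user identifier with a token and drop the IP address."""
    cleaned = {k: v for k, v in record.items()
               if k not in ("user_id", "ip_address")}
    cleaned["user_token"] = pseudonymise(record["user_id"])
    return cleaned


# Made-up activity records (field names are assumptions).
records = [
    {"user_id": "u1234567", "ip_address": "192.0.2.10", "resource": "book:42"},
    {"user_id": "u1234567", "ip_address": "192.0.2.11", "resource": "book:77"},
    {"user_id": "u7654321", "ip_address": "198.51.100.5", "resource": "book:42"},
]

published = [anonymise_record(r) for r in records]
```

A keyed hash is used rather than a plain hash because identifiers such as student numbers are drawn from a small, guessable range, so an unkeyed hash could be reversed by hashing every candidate value. Note that the remaining fields may still allow individuals to be identified in some datasets, so replacing identifiers is only one part of anonymisation.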
Outside the UK, there have been some concerns over the re-identification of users from anonymised data. The Information and Privacy Commissioner of Ontario has produced a paper entitled Dispelling the Myths Surrounding De-identification: Anonymization Remains a Strong Tool for Protecting Privacy. Its introduction states:
"The goal of this paper is to dispel this myth - the fear of re-identification is greatly overblown. As long as proper de-identification techniques, combined with re-identification risk measurement procedures, are used, de-identification remains a crucial tool in the protection of privacy. De-identification of personal data may be employed in a manner that simultaneously minimizes the risk of re-identification, while maintaining a high level of data quality. De-identification continues to be a valuable and effective mechanism for protecting personal information, and we urge its ongoing use."
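The quotation refers to re-identification risk measurement without describing a procedure. One simple measure, offered here only as an illustrative sketch and not as the method the paper recommends, is to find the smallest group of records that share the same values for fields which could indirectly identify someone (a k-anonymity style check). The field names and sample records below are hypothetical.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values for the given quasi-identifier fields."""
    groups = Counter(
        tuple(r[field] for field in quasi_identifiers) for r in records
    )
    return min(groups.values())


# Made-up records: course and year are treated as quasi-identifiers.
records = [
    {"course": "History", "year": 1, "loans": 12},
    {"course": "History", "year": 1, "loans": 3},
    {"course": "History", "year": 2, "loans": 7},
    {"course": "Physics", "year": 1, "loans": 5},
    {"course": "Physics", "year": 1, "loans": 9},
]

k = smallest_group_size(records, ["course", "year"])
```

Here k is 1, because only one record has the combination History and year 2, so that person would stand out in the published data. Grouping by course alone gives a smallest group of 2. A low value suggests that fields should be generalised or removed before publication.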
There are many reasons for sharing data, but the primary one at the moment seems to be that it enables other people to explore the data and come up with other ways in which it can be used to support institutional work.
What projects have done
All the projects have looked at sharing their data, and some have decided that it is possible to do so in some form.
  • AEIOU - are currently sharing data only with partners, but have looked at anonymisation and methods of sharing.
  • AGtivity - at present, only sites that have agreed to receive their data have been analysed. For more public documents, such as the case studies, specific names have been removed.
  • EVAD - have published the data in anonymised form.
  • LIDP - intend to publish statistical data only.
  • OpenURL Activity Data - have anonymised, and are sharing, the data collected during the project.
  • RISE - anonymise the data and publish it using a schema based on that produced by Mosaic.
  • SALT - have not yet decided how to publish their data.
  • STAR-Trak - data is for internal use only, removing the need to consider anonymisation.
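Publishing statistical data only, as LIDP intend to do, can be sketched as follows. This is an assumed approach rather than LIDP's actual procedure: records are reduced to counts per category, and any count below a threshold is suppressed so that small groups cannot be singled out. The field name, the sample records and the threshold of 5 are hypothetical choices.

```python
from collections import Counter


def summarise(records, field, min_count=5):
    """Count records per value of `field`, replacing counts below
    `min_count` with None so that small groups are not disclosed."""
    counts = Counter(r[field] for r in records)
    return {value: (n if n >= min_count else None)
            for value, n in counts.items()}


# Made-up records: six History students and two Physics students.
records = [{"course": "History"}] * 6 + [{"course": "Physics"}] * 2

summary = summarise(records, "course")
```

With these sample records the History count of 6 is published, while the Physics count of 2 is suppressed and reported as None.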