
Re-identification and Privacy Risk. Professor John Bacon-Shone, Director, Social Sciences Research Centre & Chair, Human Research Ethics Committee, The University of Hong Kong.


Presentation Transcript


  1. Re-identification and Privacy Risk
     Professor John Bacon-Shone, Director, Social Sciences Research Centre & Chair, Human Research Ethics Committee, The University of Hong Kong
     Asian Privacy Scholars Network: July 2013

  2. Introduction
     • Ethics committees in universities generally assume that once personal data has been anonymized, it is no longer personal data, so the privacy risk is permanently addressed.
     • Recent papers suggest that this is not necessarily a wise assumption!
     • I wish to examine the issue of re-identification and what it means for privacy, confidentiality and research ethics.

  3. Anonymity
     • Anonymity seems to be the most difficult ethical concept for academics to fully grasp.
     • The dictionary says: "Anonymous: not named or identified".
     • But most people think it just means "not named". For example, if I interview you but do not record your name, they consider the interview anonymous, even if I know who you are or you make statements in the interview record that implicitly identify you.
     • What is much trickier is that anonymity may not be static: being anonymous today does not necessarily mean being anonymous tomorrow.

  4. Personal Identifier (PDPO)
     • The ordinance states that: “Personal Identifier” means an identifier that is assigned to an individual by a data user for the purpose of the operations of the user and that uniquely identifies that individual in relation to the data user, but does not include an individual's name used to identify that individual.
     • While the assignment of a “personal identifier” may provide a certain degree of anonymity, its effectiveness relies on the data user taking the necessary action. For example, if a hospital uses the patient’s ID card number to identify the patient, the desired degree of anonymity will not be attained.

  5. Personal Identifier (my version)
     • Personal Identifier means an identifier, other than name, that uniquely identifies some (but maybe not all) individuals in a specified population.
     • Clearly, the existence of a personal identifier does not mean we have anonymity for all individuals. Some privacy risk therefore exists.
     • The evaluation of such privacy risk requires knowing both the chance of re-identification of individuals and the consequences.
     • Next, let’s examine the chance of re-identification.

  6. Chance of re-identification
     This can be separated into 2 elements:
     • Chance of uniqueness
     • Ease of matching

  7. Chance of uniqueness
     • The chance of uniqueness depends on both the identifier and the population.
     • The more variables in the dataset and the more possible values for each variable, the more likely that the identifier is unique for some individuals. Hence the concern about Big Data and the development of much larger datasets.
     • A smaller population (e.g. identical twins in Hong Kong) has a much greater chance of uniqueness than a large population.
     • Note that a DNA profile may not be unique (identical twins), and the matching can be indirect, using the similarities of DNA within families.
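The effect of adding variables on uniqueness can be illustrated with a minimal sketch. The records, the choice of variables (sex, birth year, district) and the function name `unique_fraction` are all hypothetical and are not taken from the presentation; the sketch only counts how many records become unique as more variables are combined into the identifier.

```python
from collections import Counter

# Hypothetical toy records: (sex, birth_year, district). Not real data.
RECORDS = [
    ("F", 1980, "Central"),
    ("F", 1980, "Kowloon City"),
    ("M", 1980, "Central"),
    ("M", 1975, "Sha Tin"),
    ("F", 1975, "Sha Tin"),
    ("M", 1975, "Sha Tin"),
]


def unique_fraction(records, columns):
    """Fraction of records whose combined values on `columns` occur exactly once."""
    keys = [tuple(record[c] for c in columns) for record in records]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(keys)


# Combine progressively more variables into the identifier.
for columns in [(0,), (0, 1), (0, 1, 2)]:
    print(columns, unique_fraction(RECORDS, columns))
```

In this toy population no record is unique on sex alone, while combining all three variables makes most records unique. Real datasets would need the same count over the actual population, not just the sample held by the researcher.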

  8. Ease of matching
     The ease of matching means how easily we can match the identifier back to a specific person. Let us consider some examples:
     • ID card number: here the risk of matching is high, because the government has enabled leakage of matching information (e.g. Companies Registry).
     • DNA profile: the risk of matching should be low, unless you or family members have provided DNA profiles to a registry (see later discussion).
     • Date and time of admission to a specific hospital: would allow matching with hospital records, if they can be accessed.

  9. Ease of matching
     Recent publications have discussed the possibility of matching becoming easier with time, for example through data leakage:
     • Individuals make DNA profiles public, making it increasingly possible to use familial matches to match individuals or surnames.
     • Arrested individuals are often required to provide DNA profiles that are not erased even if they are innocent.
     • ID card numbers are leaked from websites, making it even easier to match to names.

  10. Ease of matching
     Linkage of the identifier to individual characteristics:
     • Hospital admission: if it is known that you were involved in a traffic accident, a hospital admission soon afterward near the accident location becomes likely, increasing the ease of matching to hospital records.
     • DNA: researchers are developing methods to predict personal characteristics from DNA profiles, such as eye, skin and hair colour, so the ease of matching is likely to increase.

  11. Value of matching
     We need to consider the reason for matching:
     • Authentication: need to be able to match against an identifier carried by the individual, such as an ID card.
     • Matching other records: need only to match internally, so there is no need to use an identifier usable externally, greatly reducing the risk of unintended matching.
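One way to picture internal-only matching is a minimal sketch in which the data user assigns each individual a random internal identifier and keeps the lookup table to itself. The class name `InternalIdRegistry`, the example ID strings and the use of random tokens are assumptions made for illustration; the presentation does not prescribe any particular mechanism.

```python
import secrets


class InternalIdRegistry:
    """Hypothetical helper: maps an external identifier to a random internal one.

    The mapping is held only by the data user, so the internal identifier
    carries no information that could be matched against outside sources.
    """

    def __init__(self):
        self._by_external = {}

    def internal_id(self, external_id):
        # Assign a fresh random token the first time an individual is seen,
        # then reuse it so that internal records can still be matched.
        if external_id not in self._by_external:
            self._by_external[external_id] = secrets.token_hex(8)
        return self._by_external[external_id]


registry = InternalIdRegistry()
first = registry.internal_id("EXAMPLE-ID-001")   # made-up external identifier
again = registry.internal_id("EXAMPLE-ID-001")
other = registry.internal_id("EXAMPLE-ID-002")
print(first == again, first == other)
```

Records released for research would carry only the internal identifier. The protection then depends on the lookup table staying with the data user, which matches the earlier point that effectiveness relies on the data user taking the necessary action.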

  12. Consequences of re-identification
     • Can range from the trivial (e.g. customer of a clothing retail outlet) to the serious (e.g. HIV status), but the full consequences cannot always be predicted.
     • While it is possible to change some identifiers (e.g. ID card number, mobile phone number), it is impossible to change other identifiers (e.g. DNA profile), so long-term risk needs to be recognized and addressed.

  13. Implications of re-identification
     Arguably unethical to promise:
     • Zero risk: mistakes can always happen.
     • That future risk is the same as current risk: technology and circumstances change, and ease of matching continues to increase, especially for biological markers.
     Need to review use of identifiers: what seemed privacy-safe in the past may not be safe in the future, so privacy risk needs to be reviewed continually.
