22 May 2018


'CDIB: The Role of the Certificate of Degree of Indian Blood in Defining Native American Legal Identity' by Paul Spruhan in (2018) 6(2) American Indian Law Journal comments
Native Americans are the only group in the United States that possess a document stating the amount of their “blood” to receive government benefits. The official name is a “Certificate of Degree of Indian or Alaska Native Blood,” or (CDIB) for short. As suggested in its name, the CDIB states the amount of “Indian” or “Alaska Native” blood possessed by the person named on the document.  It may be broken down by different tribal blood or may only state the amount of blood of a specific tribe.  It is certified by a Bureau of Indian Affairs (BIA) or tribal official authorized to issue it.  It may be printed on a standard eight and a half by eleven inch piece of paper or on a smaller card, which may or may not be laminated. 
Why does such a document exist in the United States in 2018? Simple in form, yet possessing immense bureaucratic power, the CDIB is a key that unlocks educational loans, medical services, employment preference, or other federal benefits unique to Native Americans,  and, in some circumstances, even enrollment as a member of a tribal nation. 
Simultaneously derided and coveted,  pervasive yet mysterious, the CDIB is one of the most important documents for Native Americans, but is issued with no direct statutory authority and governed by no formally published regulations. A CDIB may be issued directly by the BIA or by a tribal enrollment office operating under a “638” contract, but with no clear rules to govern how those offices grant or deny a CDIB or calculate the blood quantum listed on the document. 
This article is about the CDIB and its role in defining Native American legal identity. The purpose of the article is to describe the CDIB, its function, its statutory authority (or lack thereof), and the BIA’s recent attempts at issuing regulations, which no other article or book has done. First, I discuss its primary purpose as proof of blood quantum for specific federal statutes and regulations, and how its use has expanded to other purposes, including by tribes to define eligibility for membership. Second, I discuss its origins as an internal BIA document lacking any direct congressional authorization or published regulations and suggest several possibilities for its first appearance. I then discuss a 1986 Interior Board of Indian Appeals (IBIA) decision, Underwood v. Deputy Ass’t Secretary- Indian Affairs (Operations). In that decision, the IBIA blocked an attempt by the BIA to unilaterally alter a person’s blood quantum on a CDIB, because there were no properly issued regulations. I then discuss the BIA’s attempts at issuing regulations since 2000 and the possible reasons for why they have never been finalized. I then discuss potential remedies the BIA might consider in order to solve problems arising out of the CDIB program, including the potential misuse of CDIBs in current disenrollment conflicts within some tribes. In the conclusion, I discuss the CDIB’s role in enshrining “blood” as the dominant definition of Native American legal identity. I also argue that, for as long as the CDIB continues, the BIA has an affirmative obligation to issue clear policies that prevent its misuse in internal tribal conflicts.

Legal Pragmatism

'Three Forms of Legal Pragmatism' by Charles L. Barzun in (2018) 95(5) Washington University Law Review comments 
The term “Legal Pragmatism” has been used so often for so long that it may now seem to lack any clear meaning at all. But that conclusion is too quick. Although there are diverse strands of legal pragmatism, there is also unity among them. This essay distinguishes among three such forms of legal pragmatism. It dubs them instrumentalist, quietist, and holist strands, and it offers, as representatives of each, the views of Richard Posner, Ronald Dworkin, and David Souter, respectively. Each of these forms of pragmatism has developed as a response to the same underlying philosophical problem, namely that of justifying moral and legal values within a naturalistic, nontheological worldview. That problem is an old one and a fundamental one. And it is one felt acutely by those judges and legal theorists over the last century or more who have sought to make sense of the judge’s task when deciding hard cases. The essay does not defend any one or more of these three understandings of law and adjudication against its critics. But it does suggest that the feature they share, in virtue of which they are all plausibly classed as “pragmatist,” may also be an important and distinctive feature of law as a discipline – that is, as a form of reasoning about matters practical and theoretical.

Digital Driver Licences and the Identity Hub

The Road Transport and Other Legislation Amendment (Digital Driver Licences and Photo Cards) Bill 2018 (NSW) seeks to amend the Road Transport Act 2013 (NSW), the Photo Card Act 2005 (NSW), Gaming and Liquor Administration Act 2007 (NSW), Liquor Act 2007 (NSW) and other legislation to 'provide for the issue and use of digital driver licences and digital Photo Cards and for other purposes'.

In essence, the new regime will provide for people to hold a digital version of their licence or government-issued photo identity card on their mobile phones. The biometric image will be used by NSW Police in relation to road management and, presumably, for other law enforcement.

The expectation is that it will also have extensive use across the private sector (for example in over 14,200 venues under NSW liquor law), consistent with the driver licence being the default identity document for most adult Australians.

NSW will presumably be emulated by the other state/territory jurisdictions.

The IGA and the Hub

The Second Reading Speech understandably does not refer to sharing of images and other data with the Commonwealth Department of Home Affairs under the identity-matching services interoperability hub to be operated by that Department.

That hub is at the heart of the current Identity-matching Services Bill 2018 (Cth) - noted here - to 'facilitate the secure, automated and accountable exchange of identity information between the Commonwealth and state and territory governments' under the October 2017 Intergovernmental Agreement on Identity Matching Services (IGA).

Under the IGA, the Commonwealth, states and territories agreed to preserve or introduce legislation to support the collection, use and disclosure of facial images and related identity information between the parties, via a set of identity-matching services, for
  •  Preventing identity crime 
  •  General law enforcement 
  •  National security 
  •  Protective security 
  •  Community safety 
  •  Road safety, and 
  •  Identity verification. 
 The interoperability hub
facilitates data-sharing between agencies on a query and response basis, without storing any personal information. Passport, visa and citizenship images will continue to be held by the Commonwealth agencies that issue these documents, and that already have facial recognition systems.  
Driver licence images will be made available by the establishment of a National Driver Licence Facial Recognition Solution (NDLFRS), hosted by the Commonwealth on behalf of the states and territories in accordance with the IGA. The NDLFRS will consist of a federated database of identification information contained in government identification documents (initially driver licences) issued by state and territory authorities, and a facial recognition system for biometric comparison of facial images against facial images in the database.
The NSW Bill

The Second Reading Speech states
As at the end of 2017 there were over six million New South Wales driver licences and over 568,000 photo cards in use. 
The bill delivers on the Government's 2015 election commitment to transition to digital driver licences by 2019. It also supports the Government's digital strategy, the Premier's priority to improve government services and the State priority of 70 per cent of government transactions to be conducted by digital channels by 2019. In 2015 the New South Wales Government announced its commitment to offering the people of New South Wales a range of digital licences, including a transition to digital driver licences by 2019. Since then this Government has successfully digitised the responsible service of alcohol and responsible conduct of gambling competency cards, the recreational fishing fee, boat driver licences and recreational vessel registrations. This bill will take the next step by delivering the digital driver licence and the digital photo card. 
Digitising the driver licence and photo card is an opportunity to provide benefits for the community of New South Wales in three key areas. Firstly, for the citizens of New South Wales the digital driver licence and digital photo card will provide greater convenience, choice and security. Digital licences are also an opportunity for citizens to have more control and transparency over how the personal information on their licence is shown and shared with others. 
The reality is that a digital driver licence or digital photo card brings a multitude of additional benefits and protections for users. One example of this is when a licence is lost. If you lose a physical driver licence or you have your wallet stolen, you have no ability to stop it being used by another person for nefarious purposes. Sure, you can report it to police and to Service NSW but once a licence is lost there is no way to cancel it in the way you would a credit card because so much checking of the licence is simply sighting it rather than it being scanned. There is a risk that it can still be used. 
Then to replace a lost physical card you must attend a Service NSW centre in person and apply for a new card, which would be sent to you sometime after applying for it. This process takes time out of your busy day and is a major inconvenience. However, for a digital driver licence it is a much more secure proposition. Say you lose your phone that has your digital driver licence on it. You eventually have to go out and buy a new device but you are concerned that your digital driver licence is on there. As soon as you know your phone has been lost or stolen you can log into Service NSW and cancel your digital driver licence on that device. 
You will know if it is used by someone who is not you as you will have access to an activity log, just like you have with your Opal card. By being able to cancel their card at the click of a button the citizen is empowered to take control of their identity security and privacy and ensure that their licence cannot be used or scanned by an unauthorised person, just like they can with their credit card. To replace your digital driver licence you simply take your new device, re-download the app, accept the digital driver licence on the new phone and away you go. 
For businesses in New South Wales, digital licences present an opportunity to streamline manual processes for checking or recording licence details. This means that businesses may deliver a better experience for their customers and benefit from time and cost savings. Digital licences can also provide a greater level of assurance, reducing risks of fraud and loss. For government, this development will mean simpler and faster ways to communicate and interact with citizens—for example, digital notifications and licence renewals for those who prefer to deal with us in that way. 
The NSW photo card is an increasingly important identity product; in 2017 alone there was a 28.38 per cent increase in its adoption. This makes it a priority for digitisation. A digital photo card is also not constrained by the national driver licensing framework and therefore may be delivered in a more flexible form to enhance citizen privacy—for example, providing citizens with more control over the personal information they share, depending on the situation, such as to security staff at licensed venues. It will also give citizens a digital identity product that is independent of their authority to drive.
Private Sector use

The Speech quotes industry support
The Australian Hotels Association: The continued expansion of smartphone technology for cardless transactions will see the use of wallets as an option rather than a necessity, based on these feedback from our Dubbo members. The AHA NSW is supportive of the expansion of the digital driver licence statewide. 
The Liquor Stores Association: [The LSA] remains supportive of a full statewide rollout of the digital driver licence as it will give packaged liquor retailers, licensees and their staff at the point of purchase a safe and efficient digital service control age verification measure. 
The Restaurant and Catering Association: I am firmly of the view that this project will be of significant benefit to the approximately 14,200 café and restaurant businesses in New South Wales. The addition of the digital driver licence as a valid form of identification will provide patrons with a more seamless method of ordering alcohol in licensed cafes and restaurants. It is for this reason I have no hesitation in supporting a state-wide rollout of the digital driver licence. 
ClubsNSW: Proper implementation of digital drivers' licences will be a positive development in better equipping clubs for the digital future and the industry is excited for what these changes mean. I look forward to continuing to work closely with industry as we progress to implementation of the digital driver licence and the digital photo card and thank them for their support to date. I now go through the statewide rollout of the digital driver licence and digital photo card.
Privacy is 'sacrosanct'

The Minister comments
Once launched, the people of New South Wales will be able to opt-in to receive a digital driver licence and digital photo card. These will essentially constitute a digital representation of a person's physical driver licence or photo card. 
The digital versions will be in addition to the physical licence or card, and accessible via the MyServiceNSW app, which can be downloaded to their device, such as a smartphone. The digital driver licence and digital photo card will provide a secure and user-friendly experience and be able to be authenticated visually, by viewing the visual security features, or electronically. Citizens who opt in for the digital driver licence will have the option of carrying or producing either their digital driver licence or their physical licence card when driving in New South Wales. Citizens will also be able to show their digital driver licence or digital photo card as evidence of their age and of their identity in the liquor and gaming industry to enter pubs and registered clubs, and in a variety of ways that the driver licence and photo card is currently used.
The rhetoric ramps up, complete with reference to privacy being sacrosanct ...
As many in this House know, a mobile phone is so much more than just a digital driver licence. A phone is a person's personal property and may also be used to store and access personal and private information. To ensure appropriate privacy and a citizen's right to maintain control of their personal electronic device, a driver will only need to display their digital driver licence on their device to the police or authorised officer in order for their digital driver licence to be checked. I am pleased that the Privacy Commissioner has supported this approach, stating, "This will ensure the privacy rights of an individual who holds personal information on their phone beyond the digital driver licence is preserved." ...
The member for Cessnock also will recall how important it was that, when we debated that legislation, both sides of politics agreed that privacy was sacrosanct. I do not think there is any debate in this Chamber when it comes to putting the privacy of the citizen front and centre. Indeed, when we drafted the Data Analytics Centre legislation—the Data Sharing (Government Sector) Act 2015, as it was appropriately titled—we made sure that the Privacy Commissioner was involved from the ground up in the steering committee so that we achieved the right outcome. In preparing this legislation, we engaged the Privacy Commissioner because privacy is beyond politics. It is an absolutely enshrined right of the citizen.
One final question, which in my view is the most important of all, is: How does the digital driver licence and digital photo card ensure security of personal information and protect against fraud? To obtain a digital driver licence and digital photo card, a person is required to register for a MyServiceNSW account and establish their identity to link their account with Roads and Maritime Services. Once verified, the person's driver licensing or photo card information and photograph is securely released to the Department of Finance, Services and Innovation and Service NSW digital platforms to be processed to create the digital driver licence and digital photo card in the Service NSW app. None of the information or photographs is stored by the Department of Finance, Services and Innovation or Service NSW platforms. The digital driver licence and digital photo card are securely stored on a person's device. On top of any device PIN code or touch identification—fingerprint—the Service NSW app is also PIN code protected to ensure that the person's personal information remains safe and secure.  
Identity Crime

In relation to identity crime the Speech states that
Visually, the digital driver licence contains several features that can be sighted to ensure that it is not a screenshot or a fake. The digital driver licence can then be further verified by police using a "MobiPol" device, which scans a digital driver licence to initiate a search against backend police systems without the police officer having to manually type in the licence number.
Approximately 95 per cent of road traffic infringements issued by police are issued through MobiPol devices and the digital driver licence leverages this technology. In network blackspots where MobiPol is unable to connect to backend police systems, police may still verify the digital driver licence in the same way as a physical licence: by radioing back to station or using the terminals in their vehicles.
The digital driver licence and digital photo card include several visual security features that can be sighted to ensure that it is not a fake or a screenshot. For example, the design includes animations and a hologram. The digital driver licence and digital photo card also include a quick response code that may be scanned to verify its authenticity. Unauthorised use of a digital driver licence and digital photo card may also be detected through a device management framework and activity log, which will notify the person of logins from unrecognised devices or other unusual activity. 
This would mean that if someone living in Sydney has opted in to have a digital driver licence, whenever that digital driver licence is scanned they could be notified by email instantly of when and where that was done—just like a credit card. For example, if your card was scanned in Byron Bay by someone seeking to defraud you, you could instantly deactivate the digital driver licence and inform Service NSW and/or the police of the breach. This tangible security and fraud benefit comes with the digital driver licence and simply is not available with the physical card. I am pleased that the Privacy Commissioner supports this added level of protection, stating: "The recommendation that holders of a digital driver licence are notified of transactions including third party checks is supported".

21 May 2018


'Understanding and responding to victimisation of whistleblowers' (AIC Trends and Issues in Crime and Criminal Justice 549, 2018) by Inez Dussuyer and Russell Smith comments
Speaking out in the public interest — being a whistleblower — can be risky. Media reports and public inquiries into allegations of misconduct in the public and private sectors regularly recount the negative consequences that those who make reports in the public interest have experienced—despite the presence of legislation that seeks to prevent reprisals and retaliation for disclosing misconduct. Instances in which whistleblowers have lost employment and careers, suffered harassment and intimidation, and experienced threats or acts of violence continue to occur in Australia. 
This study sought to understand the nature of victimisation experienced by whistleblowers who had reported or attempted to report wrongdoing in their workplace. Information was obtained from in-depth interviews with 36 whistleblowers and 21 people who dealt with their reports in public and private sector organisations. The results confirm the nature of the harms that almost all whistleblowers experience as a consequence of reporting misconduct. The paper concludes by identifying ways in which whistleblowers could better be protected from victimisation and how the procedures and safeguards involved in the whistleblowing process could be strengthened.

Facial Recognition Questions

The 56 page Big Brother Watch report Face Off - The lawless growth of facial recognition in UK policing comments
 Facial recognition has long been feared as a feature of a future authoritarian society, with its potential to turn CCTV cameras into identity checkpoints, creating a world where citizens are intensively watched and tracked. However, facial recognition is now a reality in the UK – despite the lack of any legal basis or parliamentary scrutiny, and despite the significant concerns raised by rights and race equality groups. This new technology poses an unprecedented threat to citizens’ privacy and civil liberties, and could fundamentally undermine the rights we enjoy in public spaces. Police forces in the UK have rolled out automatic facial recognition at a pace unlike any other democratic nation in the world. Leicestershire Police, South Wales Police and the Metropolitan Police have deployed this technology at shopping centres, festivals, sports events, concerts, community events – and even a peaceful demonstration. One police force even used the surveillance tool to keep innocent people with mental health issues away from a public event.
In this report, we explain how facial recognition technology works, how it is being used by police in the UK, and how it risks reshaping our rights. We are seeking to raise awareness of this growing issue with parliamentarians and inform the wider public about what is happening behind the cameras.
In this report, we:
• Reveal new statistics following a series of freedom of information requests, exposing the shocking inaccuracy and likely unlawful practices within a number of police forces using automated facial recognition; 
• Analyse the legal and human rights implications of the police’s use of facial recognition in the UK; 
• Review the evidence that facial recognition algorithms often disproportionately misidentify minority ethnic groups and women; 
• Present guest contributions from allies worldwide warning about the impact of facial recognition on rights, including contributions from representatives of American Civil Liberties Union, Electronic Frontier Foundation, Georgetown Privacy Centre, and the Race Equality Foundation;
 We conclude by launching our campaign against the lawless growth of facial recognition in the UK, supported by rights groups, race equality groups, technologists, lawyers and parliamentarians.
The report's key findings:
• The overwhelming majority of the police’s ‘matches’ using automated facial recognition to date have been inaccurate. On average, a staggering 95% of ‘matches’ wrongly identified innocent people. 
• Police forces have stored photos of all people incorrectly matched by automated facial recognition systems, leading to the storage of biometric photos of thousands of innocent people. 
Metropolitan Police 
• The Metropolitan Police has the worst record, with less than 2% accuracy of its automated facial recognition ‘matches’ and over 98% of matches wrongly identifying innocent members of the public. The force has only correctly identified 2 people using the technology – neither of which was a wanted criminal. One of those people matched was incorrectly on the watch list; the other was on a mental health-related watch list. However, 102 innocent members of the public were incorrectly identified by automated facial recognition. 
• The force has made no arrests using automated facial recognition. 
South Wales Police 
• South Wales Police’s record is hardly better, with only 9% accuracy of its matches whilst 91% of matches wrongly captured innocent people. 
• 0.005% of ‘matches’ led to arrests, numbering 15 in total. 
• However, at least twice as many innocent people have been significantly affected, with police staging interventions with 31 innocent members of the public incorrectly identified by the system who were then asked to prove their identity and thus their innocence. 
• The force has stored biometric photos of all 2,451 innocent people wrongly identified by the system for 12 months in a policy that is likely to be unlawful. 
• Despite this, South Wales Police has used automated facial recognition at 18 public places in the past 11 months – including at a peaceful demonstration outside an arms fair.  
Custody images 
• Out of the 35 police forces that responded to our Freedom of Information request, not one was able to tell us how many photos they hold of innocent people in their custody image database.
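The Metropolitan Police percentages quoted above can be checked against the two counts the report gives (2 correct identifications and 102 incorrect ones). The following is a minimal sketch, assuming "accuracy" means the share of system 'matches' that identified the right person; the function and variable names are illustrative and do not come from the report.

```python
# Counts quoted in the report for the Metropolitan Police.
correct_matches = 2
incorrect_matches = 102


def match_accuracy(correct: int, incorrect: int) -> float:
    """Share of system 'matches' that identified the right person (assumed definition)."""
    total = correct + incorrect
    if total == 0:
        raise ValueError("no matches recorded")
    return correct / total


met_accuracy = match_accuracy(correct_matches, incorrect_matches)
met_error = 1 - met_accuracy
print(f"accuracy: {met_accuracy:.1%}, wrong matches: {met_error:.1%}")
```

On these counts the accuracy comes out just under 2% and the share of wrong matches just over 98%, which is consistent with the figures the report states.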

20 May 2018

Trade Secrets and US startups

'Why Do Startups Use Trade Secrets?' by David S. Levine and Ted M. Sichelman in (2018) 94 Notre Dame Law Review comments
Empirical studies of the use of trade secrecy are scant, and those focusing on startups, non-existent. In this paper, we present the first set of data — drawn from the Berkeley Patent Survey — on the use of trade secrets by U.S. startup companies in the software, biotechnology, medical device, and hardware industries. 
Specifically, we report on the prevalence of trade secrecy usage among startups. Additionally, we assess the importance of trade secrets in relation to other forms of intellectual property protection and barriers to entry, such as patents, copyrights, first-mover advantage, and complementary assets. We segment these results by a variety of factors, including industry, company business model, overall revenue, patenting propensity, funding sources, innovation types, and licensing. From this segmentation, we implement a basic regression model and report on those factors showing a statistically significant relationship in the use of trade secrets by startups. 
Our results point to three major findings. First, trade secrecy serves other important aims aside from first-mover advantage. Second, trade secrets may act both as economic complements and substitutes to patenting. Third, trade secrets may serve as important strategic assets, functioning much in the same manner as patents in terms of licensing and setting the boundaries of the firm.

18 May 2018


Protecting unit-record level personal information: The limitations of de-identification and the implications for the Privacy and Data Protection Act by Vanessa Teague, Chris Culnane and Benjamin Rubinstein for the Office of the Victorian Information Commissioner (OVIC) offers cautions about de-identification in Victoria's public and private sectors.

The report states
De-identification is a subject that has received much attention in recent years from privacy regulators around the globe. Once touted as a silver bullet for protecting the privacy of personal information, the reality is that when it involves the release of data to the public, the process of de-identification is much more complex. 
As improvements in technology increase the type and rate at which data is generated, the possibility of re-identification of publicly released data is greater than ever. Auxiliary information – or secondary information – can be used to connect an individual to seemingly de-identified data, enabling an individual’s identity to be ascertained. Auxiliary information can come from anywhere, including other publicly available sources online. 
In recent examples of successful re-identification that we have seen in Australia, it is clear that those releasing de-identified data did not appreciate the auxiliary information that would be available for re-identification – in that they did not expect re-identification would be possible. Individual data elements may be non-distinct and recognisable in many people, but a combination of them will often be unique, making them attributable to a specific individual. This is why de-identification poses a problem for unit-record level data.
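The point that individually common data elements become unique in combination can be illustrated with a small sketch. The records below are entirely hypothetical toy data (birth year, postcode, sex), and `unique_fraction` is an illustrative helper rather than anything used in the report.

```python
from collections import Counter

# Hypothetical toy records: (birth_year, postcode, sex). Not real data.
records = [
    (1980, "3000", "F"),
    (1980, "3000", "M"),
    (1980, "3052", "F"),
    (1975, "3000", "F"),
    (1975, "3052", "M"),
    (1975, "3052", "F"),
]


def unique_fraction(rows, columns):
    """Fraction of rows whose values in `columns` occur exactly once."""
    keys = [tuple(row[i] for i in columns) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(rows)


# No single attribute singles anyone out, but combinations increasingly do.
for cols in [(0,), (1,), (2,), (0, 1), (0, 1, 2)]:
    print(cols, unique_fraction(records, cols))
```

In this toy set no record is unique on any one attribute, a third are unique on birth year plus postcode, and every record is unique once all three are combined. An adversary holding those three facts about a person as auxiliary information could then pick out that person's row.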
 OVIC comments
This report is one of a number of publications on de-identification produced by, or for, the Victorian public sector. Notably, in early 2018 Victoria’s Chief Data Officer issued a de-identification guideline to point to what ‘reasonable steps’ for de-identification looks like in the context of data analytics and information sharing under the Victorian Data Sharing Act 2017 (VDS Act). This paper is not aimed at the work conducted by the Victorian Centre for Data Insights (VCDI), where information sharing occurs within government with appropriate controls, and it is not intended to inhibit that work. Rather, it speaks to the use of de-identification more broadly, in circumstances where so-called ‘de-identified’ data is made freely available through public or other less inhibited release of data sets, which occurs in so-called “open data” programs. This report should be interpreted in that context. ...
This report has been produced to demonstrate the complexities of de-identification and serve as a reminder that even if direct identifiers have been removed from a data set, it may still constitute ‘personal information’. The intention is not to dissuade the use of de-identification techniques to enhance privacy, but to ensure that those relying on and sharing de-identified information to drive policy design and service delivery, understand the challenges involved where the husbandry of that data is not managed. ... Public release of de-identified information may not always be a safe option, depending on the techniques used to treat the data and the auxiliary information that the public may have access to. Wherever unit level data – containing data related to individuals – is used for analysis, OVIC’s view is that this is most appropriately performed in a controlled environment by data scientists. Releasing the data publicly in the hope that ‘de-identification’ provides protection from a privacy breach is, as this paper demonstrates, a risky enterprise.
The authors go on to state
A detailed record about an individual that has been de-identified, but is released publicly, is likely to be reidentifiable, and there is unlikely to be any feasible treatment that retains most of the value of the record for research, and also securely de-identifies it. A person might take reasonable steps to attempt to deidentify such data and be unaware that individuals can still be reasonably identified.
The word ‘de-identify’ is, unfortunately, highly ambiguous. It might mean removing obvious identifiers (which is easy) or it might mean achieving the state in which individuals cannot be ‘reasonably identified’ by an adversary (which is hard). It is very important not to confuse these two definitions. Confusion causes an apparent controversy over whether de-identification “works”, but much of this controversy can be resolved by thinking carefully about what it means to be secure. When many different data points about a particular individual are connected, we recommend focusing instead on restricting access and hence the opportunity for misuse of that data. Secure research environments and traditional access control mechanisms are appropriate.
Aggregated statistics, such as overall totals of certain items (even within certain groups of individuals) could possibly be safely released publicly. Differential privacy offers a rigorous and strong definition of privacy protection, but the strength of the privacy parameters must be traded off against the precision and quantity of the published data.
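The trade-off between privacy parameters and precision can be illustrated with a small sketch. It is not taken from the report: it assumes a single counting query with sensitivity 1, uses the textbook Laplace mechanism, and the names `noisy_count` and `mean_abs_error` are hypothetical. Floating-point noise of this kind is for illustration only and is not a hardened implementation.

```python
import random


def noisy_count(true_count, epsilon, rng):
    """Return a count perturbed with Laplace noise of scale 1/epsilon.

    Assumption: adding or removing one person changes the count by at
    most 1 (sensitivity 1). The difference of two independent
    exponential draws with rate epsilon is Laplace-distributed with
    scale 1/epsilon.
    """
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


def mean_abs_error(epsilon, trials=2000, seed=0):
    """Average absolute error of the noisy count over many draws."""
    rng = random.Random(seed)
    true_count = 100  # made-up total, for illustration only
    total = 0.0
    for _ in range(trials):
        total += abs(noisy_count(true_count, epsilon, rng) - true_count)
    return total / trials


# Smaller epsilon means stronger protection but a less precise total.
for eps in (0.1, 1.0, 10.0):
    print(eps, round(mean_abs_error(eps), 2))
```

The average error is roughly 1/epsilon, so tightening the privacy parameter by a factor of ten costs about a factor of ten in precision for this one query.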
This paper discusses de-identification of a data set in the context of release to the public, for example via the internet, where it may be combined with other data. That context includes the concept of “open data”, in which governments make data available for any researchers to analyse in the hope they can identify issues or patterns of public benefit.
Therefore, it’s important to emphasise that this document should not be read as a general warning against data sharing within government, or in a controlled research environment where the combination of the data set with other data can be managed. It is not intended to have a chilling effect on sharing of data in those controlled environments.
In reference to statutory responsibilities, the report comments
In taking ‘reasonable steps’, a data custodian must have regard to not only the mathematical methods of de-identifying the information, but also “the technical and administrative safeguards and protections implemented in the data analytics environment to protect the privacy of individuals”.
Therefore, there is a possibility that in some circumstances, a dataset in which ‘reasonable steps’ have been taken for de-identification under the VDS Act may not be de-identified according to the PDP Act, because individuals may still be ‘reasonably identified’ if the records are released publicly outside the kinds of research environments described in the VDS Act.
In this report, we describe the main techniques that are used for de-identifying personal information. There are two main ways of protecting the privacy of data intended for sharing or release: removing information, and restricting access. We explain when de-identification does (or does not) work, using datasets from health and transport as examples. We also explain why these techniques might fail when the de-identified data is linked with other data, so as to produce information in which an individual is identifiable.
Does de-identification work? In one sense, the answer is obviously yes: de-identification can protect privacy by deleting all the useful information in a data set. Conversely, it could produce a valuable data set by removing names but leaving in other personal information. The question is whether there is any middle ground; are there techniques for de-identification that “work” because they protect the privacy of unit-record level data while preserving most of its scientific or business value?
Controversy also exists in arguments about the definitions of ‘de-identification’ and ‘work’. De-identification might mean:
• following a process such as removing names, widening the ranges of ages or dates, and removing unusual records; or 
• achieving the state in which individuals cannot be ‘reasonably identified’.
These two meanings should not be confused, though they often are. A well-intentioned official might carefully follow a de-identification process, but some individuals might still be ‘reasonably identifiable’. Compliance with de-identification protocols and guidelines does not necessarily imply proper mathematical protections of privacy. This misunderstanding has potential implications for privacy law, where information that is assumed to be de-identified is treated as non-identifiable information and subsequently shared or released publicly.
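The gap between the two meanings can be shown with a minimal sketch. It is an illustration rather than the report's method: the records are invented, and `follow_process` and `smallest_group` are hypothetical helper names. The first function follows a process (drop names, widen ages into bands); the second checks the resulting state by finding the smallest group of records that share the same remaining attributes.

```python
from collections import Counter


def follow_process(records, age_band=10):
    """Drop the name and replace exact age with a band such as '30-39'."""
    treated = []
    for r in records:
        low = (r["age"] // age_band) * age_band
        treated.append({
            "age_band": f"{low}-{low + age_band - 1}",
            "postcode": r["postcode"],
        })
    return treated


def smallest_group(records, fields):
    """Size of the smallest set of records sharing the same field values."""
    counts = Counter(tuple(r[f] for f in fields) for r in records)
    return min(counts.values())


# Invented records, for illustration only.
people = [
    {"name": "A", "age": 34, "postcode": "3000"},
    {"name": "B", "age": 37, "postcode": "3000"},
    {"name": "C", "age": 61, "postcode": "3000"},
]

treated = follow_process(people)
print(treated)
print(smallest_group(treated, ["age_band", "postcode"]))
```

Here the process has been followed faithfully, yet the smallest group has size 1: the person in the 60-69 band is still unique on the attributes that remain.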
De-identification would work if an adversary who was trying to re-identify records could not do so successfully. Success depends on ‘auxiliary information’ – extra information about the person that can be used to identify their record in the dataset. Auxiliary information could include age, place of work, medical history etc. If an adversary trying to re-identify individuals does not know much about them, re-identification is unlikely to succeed. However, if they have a vast dataset (with names) that closely mirrors enough information in the de-identified records, re-identification of unique records will be possible.
4. Can the risk of re-identification be assessed?
For a particular collection of auxiliary information, we can ask a well-defined mathematical question: can someone be identified uniquely based on just that auxiliary information?
There are no probabilities or risks here – we are simply asking what can be inferred from a particular combination of data sets and auxiliary information. This is generally not controversial. The controversy arises from asking what auxiliary information somebody is likely to have.
For example, in the Australian Department of Health's public release of MBS/PBS billing data, those who prepared the dataset carefully removed all demographic data except the patient’s gender and year of birth, therefore ensuring that demographic information was not enough on its own to identify individuals. However, we were able to demonstrate that with an individual's year of birth and some information about the date of a surgery or other medical event, the individual could be re-identified. There was clearly a mismatch between the release authority's assumptions and the reality about what auxiliary information could be available for re-identification.
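The well-defined question described above (can someone be identified uniquely from a given set of auxiliary information?) can be sketched as a simple uniqueness count. The records below are invented and bear no relation to the MBS/PBS data; the field names and the function `unique_fraction` are assumptions made for illustration.

```python
from collections import Counter


def unique_fraction(records, aux_fields):
    """Fraction of records that are unique on the given auxiliary fields."""
    keys = [tuple(r[f] for f in aux_fields) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records)


# Invented records: demographics alone do not single anyone out.
records = [
    {"gender": "F", "yob": 1975, "event_date": "2014-03-02"},
    {"gender": "F", "yob": 1975, "event_date": "2014-07-19"},
    {"gender": "M", "yob": 1980, "event_date": "2014-03-02"},
    {"gender": "M", "yob": 1980, "event_date": "2015-01-11"},
]

print(unique_fraction(records, ["gender", "yob"]))
print(unique_fraction(records, ["gender", "yob", "event_date"]))
```

In this toy data no record is unique on gender and year of birth, but every record becomes unique once the date of one event is also known. The calculation itself is uncontroversial; what is contested is which fields an adversary should be assumed to hold.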
5. How re-identification works
Re-identification works by identifying a ‘digital fingerprint’ in the data, meaning a combination of features that uniquely identify a person. If two datasets have related records, one person's digital fingerprint should be the same in both. This allows linking of a person's data from the two datasets – if one dataset has names then the other dataset can be re-identified.
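A linkage of this kind can be sketched in a few lines. This is a simplified illustration with invented data, not a description of any published attack: it assumes exact matching on shared fields, and the names `link_by_fingerprint`, `record_id` and the example fields are hypothetical.

```python
from collections import defaultdict


def link_by_fingerprint(named, anonymous, fields):
    """Map anonymous record ids to names where the fingerprint is unique
    in both datasets. Ambiguous fingerprints are left unlinked."""

    def index(rows):
        idx = defaultdict(list)
        for row in rows:
            idx[tuple(row[f] for f in fields)].append(row)
        return idx

    named_idx = index(named)
    anon_idx = index(anonymous)

    matches = {}
    for fingerprint, anon_rows in anon_idx.items():
        named_rows = named_idx.get(fingerprint, [])
        if len(anon_rows) == 1 and len(named_rows) == 1:
            matches[anon_rows[0]["record_id"]] = named_rows[0]["name"]
    return matches


# Invented auxiliary dataset that includes names.
named = [
    {"name": "Alex Example", "yob": 1975, "postcode": "3000"},
    {"name": "Sam Sample", "yob": 1980, "postcode": "3052"},
    {"name": "Kim Placeholder", "yob": 1980, "postcode": "3052"},
]

# Invented de-identified dataset with a sensitive attribute.
anonymous = [
    {"record_id": "r1", "yob": 1975, "postcode": "3000", "diagnosis": "A"},
    {"record_id": "r2", "yob": 1980, "postcode": "3052", "diagnosis": "B"},
]

print(link_by_fingerprint(named, anonymous, ["yob", "postcode"]))
```

Record `r1` is linked to a name because its fingerprint is unique in both datasets. Record `r2` stays unlinked because two named people share its fingerprint, which is why adding more fields to the fingerprint tends to increase the number of matches.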
Computer scientists have used linkage to re-identify de-identified data from various sources including telephone metadata, social network connections, health data and online ratings, and found high rates of uniqueness in mobility data and credit card transactions. Simply linking with online information can work.
Most published re-identifications are performed by journalists or academics. Is this because they are the only people who are doing re-identification, or because they are the kind of people who tend to publish what they learn? Although by definition we won’t hear about the unpublished re-identifications, there are certainly many organisations with vast stores of auxiliary information. The database of a bank, health insurer or employer could contain significant auxiliary information that could be of great value in re-identifying a health data set, for example, and those organisations would have significant financial incentive to do so. The auxiliary information available to law-abiding researchers today is the absolute minimum that might be available to a determined attacker, now or in the future.
This potential for linkage of one data set with other data sets is why the federal Australian Government's draft bill to criminalise re-identification is likely to be ineffective, and even counterproductive. If re-identification is not possible then it doesn't need to be prohibited; if re-identification is straightforward then governments (and the people whose data was published) need to find out.
The rest of this report examines what de-identification is, whether it works, and what alternative approaches may better protect personal information. After assessing whether de-identification is a myth, we outline constructive directions for where to go from here. Our technical suggestions focus on differential privacy and aggregation. We also discuss access control via secure research environments.