
Privacy-enhancing technologies and their role in privacy compliance


The collection, computation and analysis of data is vital to the progress and prosperity of all industries. However, handling data also entails significant privacy risks and exposure to cyberattacks. This is where privacy-enhancing technologies (PETs) can assist in mitigating those risks.

What are PETs?

PETs are technologies that aim to enhance the privacy and security of information contained within data. Although most kinds of PETs are still undergoing development, organisations such as the Centre for Data Ethics and Innovation in the UK consider them a valuable way for organisations to harness the power of data. PETs do this by securing sensitive information within the data whilst maintaining its utility for computation and analysis, so that the same level of insight and learning can still be achieved.

The summary below explains the benefits and risks of some of the most well-known PETs, including those within the categories of cryptographic algorithms, artificial intelligence (AI) and machine learning (ML) algorithms, and data masking techniques, and the relevance of these technologies to Australian privacy law.

Types of PETs: Benefits and risks

Cryptographic algorithms

Homomorphic encryption (HE)

What it is: An encryption method that allows computation to be performed directly on encrypted data. There are various levels of homomorphic encryption, including partially, somewhat and fully homomorphic encryption.

Example: Outsourcing the computation of data to a third party. The data is encrypted, sent to a data centre, computed on while still encrypted, and then returned to the original data owner for decryption.

Benefits:
  • Cloud computing and storage become more secure.
  • No need to rely on third parties to provide data security.
  • Enhanced trust between the data owner and the third party, potentially allowing for greater cooperation.
  • Multiple operations can be performed on the encrypted data.

Risks:
  • Significant ongoing and specialised processing and application requirements, resulting in higher costs and slower processing.
  • Data leakage is still possible.
  • Not immune from hacking.
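To make the outsourcing example above concrete, here is a deliberately insecure toy scheme in which two ciphertexts can be added without decrypting them. All values and the blinding approach are invented for this sketch; real homomorphic schemes such as Paillier or BFV are vastly more sophisticated.

```python
import random

# Toy additively homomorphic scheme: each message is blinded with a
# fresh one-time random key modulo N. Adding two ciphertexts yields a
# ciphertext of the sum, which only the key holder can decrypt.
# Illustrative only -- NOT a real or secure HE scheme.

N = 2**32

def encrypt(message: int, key: int) -> int:
    return (message + key) % N

def decrypt(ciphertext: int, key: int) -> int:
    return (ciphertext - key) % N

# The data owner encrypts two salaries with fresh random keys.
k1, k2 = random.randrange(N), random.randrange(N)
c1 = encrypt(70_000, k1)
c2 = encrypt(85_000, k2)

# An untrusted data centre adds the ciphertexts without seeing the data.
c_sum = (c1 + c2) % N

# The owner decrypts the result with the combined key.
total = decrypt(c_sum, (k1 + k2) % N)
print(total)  # → 155000
```

The data centre only ever sees blinded values, yet the owner recovers the correct sum.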

Secure multi-party computation (MPC)

What it is: An encryption method that distributes a computation by splitting the data into pieces shared amongst multiple parties, each of which computes on its pieces. No individual party learns the other parties' data or output.

Example: Computing the average salary of a group of employees without revealing their individual salaries. Each person's salary is split into three randomly generated number pieces that add up to their true salary. Each person keeps one piece and distributes the remaining two to the other participants, so the data remains protected while in use. The shared pieces are then combined to calculate the average salary without revealing any individual's actual salary.

Benefits:
  • Data can be shared with different organisations whilst preserving privacy.
  • No need to rely on third parties to provide data security.
  • Computation can occur without revealing or compromising the data itself.
  • Data computations are highly accurate.
  • Because the data is broken into pieces, encrypted and distributed amongst multiple parties, it is considered resistant to quantum attacks.

Risks:
  • Higher costs associated with processing and with securing the random number generation, known as computational overhead.
  • Distributing the data to multiple parties for computation can increase communication costs.
  • Each party's output is assumed to be correct.
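The salary-averaging example above can be sketched with additive secret sharing. The salaries and three-party setup are illustrative, and real MPC protocols add authentication and other cryptographic protections that this sketch omits.

```python
import random

MOD = 10**9  # shares are taken modulo a large number

def split_into_shares(secret: int, n_parties: int = 3) -> list[int]:
    """Split a secret into n random pieces that sum to it modulo MOD."""
    shares = [random.randrange(MOD) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MOD)
    return shares

salaries = [70_000, 85_000, 95_000]

# Each person splits their salary; party i collects the i-th share
# from everyone. Any single party's shares look like random numbers.
all_shares = [split_into_shares(s) for s in salaries]
party_totals = [sum(person[i] for person in all_shares) % MOD for i in range(3)]

# Combining the partial sums reveals only the total (hence the average),
# never any individual salary.
total = sum(party_totals) % MOD
average = total / len(salaries)
print(total, round(average, 2))  # → 250000 83333.33
```

Each party only ever holds uniformly random-looking numbers, yet the combined result is exact.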

Differential privacy (DP)

What it is: A process whereby organisations share a dataset by describing the patterns and groups that exist within it, whilst preserving the privacy of the individuals it contains. A minimal distortion, known as "noise", is introduced into the data to ensure privacy, but in such a way that analysts can still assess and compute the data. Noise is random mathematical impurity added to an output; it protects individuals' information from being revealed.

Example: Organisations use DP to gather anonymised behavioural data. Apple, for example, has used DP to gather anonymised data from Apple devices, providing insights into usage, and Amazon has used DP to identify trends in personalised shopping preferences without revealing information about individual purchases.

Benefits:
  • Privacy levels are customisable.
  • If the data were hacked, attackers would only obtain partly correct data, and the noise allows participants to plausibly deny that their data is within the dataset.
  • Helps prevent re-identification, where attackers use information from another dataset to identify individuals.

Risks:
  • DP's performance depends on the size of the data pool and whether adequate noise has been implemented.
  • Inferences can sometimes still be made about whether someone is included in a dataset, even after noise is added.
  • There is no absolute guarantee of privacy, nor immunity from attack.
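The "noise" mechanism described above can be sketched with the classic Laplace mechanism applied to a simple count query. The dataset and the epsilon (privacy budget) value are invented for illustration; real deployments involve careful sensitivity analysis and budget accounting.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) noise by inverse transform sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def private_count(records: list[bool], epsilon: float) -> float:
    """Count True records with epsilon-DP. A count query changes by at
    most 1 when one person is added or removed (sensitivity 1), so
    Laplace noise with scale 1/epsilon suffices."""
    true_count = sum(records)
    return true_count + laplace_noise(1 / epsilon)

# How many people in this (illustrative) dataset bought a given product?
purchases = [True, False, True, True, False, True, False, True]
print(private_count(purchases, epsilon=0.5))  # close to 5, but randomised
```

Smaller epsilon means more noise and stronger privacy; larger epsilon means more accurate answers.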

Zero-knowledge proofs (ZKPs)

What it is: A method that allows one party to prove to another that a statement about private data is true, and that a computation on that data was performed properly, without revealing the data itself. ZKPs are increasingly used in blockchain applications.

Example: Funds verification, proving there is enough money in a bank account to make a transaction without revealing how much money is in the account.

Benefits:
  • Enhances privacy in online public networks.
  • More efficient authentication and verification methods.
  • Reduces the amount of information passing through a system or process, increasing scalability.
  • A comparatively simple method of enhancing privacy.

Risks:
  • Costs associated with hardware and verification software.
  • Significant processing requirements.
  • All parties are assumed to be acting honestly.
  • Some schemes require a trusted setup.
  • Vulnerability to quantum computing threats.
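To make the concept concrete, here is a toy run of one classic ZKP, the Schnorr identification protocol, in which a prover demonstrates knowledge of a secret exponent without revealing it. The parameters are deliberately tiny and insecure; real deployments use cryptographically large groups or succinct proof systems.

```python
import random

# Toy Schnorr identification protocol: the prover convinces the verifier
# that she knows x with y = g^x mod p, without revealing x.
# Tiny illustrative parameters only -- NOT secure.
p, q, g = 23, 11, 2        # g = 2 has order q = 11 in Z_23*
x = 7                      # prover's secret
y = pow(g, x, p)           # public key, here 13

# 1. Prover commits to a random value.
r = random.randrange(q)
t = pow(g, r, p)

# 2. Verifier issues a random challenge.
c = random.randrange(q)

# 3. Prover responds using the secret.
s = (r + c * x) % q

# 4. Verifier checks g^s == t * y^c (mod p) without ever learning x.
assert pow(g, s, p) == (t * pow(y, c, p)) % p
print("proof verified")
```

The response s looks random to the verifier, yet the check only passes if the prover really knows x.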

AI and ML algorithms

Federated learning (FL)

What it is: A method of training AI models without transferring full datasets. The training data is kept locally on separate devices and is never exchanged via a central location; only model updates are shared.

Example: FL has been used to train a model to predict clinical outcomes for COVID-19 patients whilst maintaining the privacy of their sensitive health information.

Benefits:
  • Allows for collaboration across multiple devices without compromising the data.
  • Once the model is trained, results can be produced more quickly.
  • Information is kept secure, with added protection from cyberattacks.
  • Larger and more diverse datasets can be used to train the AI or ML model, resulting in more comprehensive models.
  • Predictions can occur almost instantaneously on the device.
  • Training can occur whether or not the device is in use or charging.
  • Particularly useful in industries such as healthcare, where insights from health data can be shared without sharing identifying information.

Risks:
  • Communication between FL participants is not entirely efficient.
  • Data aggregation is limited by heterogeneity across devices: some connect via Wi-Fi, others via 4G or 5G, and devices differ in computational and storage capabilities.
  • FL is not immune from cyberattacks; other PETs such as MPC and DP (see above) are sometimes combined with FL to further enhance privacy.
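The core idea can be sketched with federated averaging on a toy linear model: each client trains on its own private data and sends only its updated weight to the server, which averages the weights. The client data and learning rate are invented for illustration; production FL frameworks are far more involved.

```python
# Minimal sketch of federated averaging (FedAvg) on a toy model y = w * x.
# Only the weight -- never the raw data -- leaves each client.

def local_step(w: float, data: list[tuple[float, float]], lr: float = 0.01) -> float:
    """One gradient-descent step on this client's private data."""
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad

# Three clients, each holding a private shard of data following y = 2x.
clients = [
    [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],
    [(4.0, 8.0), (5.0, 10.0), (6.0, 12.0)],
    [(7.0, 14.0), (8.0, 16.0), (9.0, 18.0)],
]

w_global = 0.0
for _ in range(100):
    # Each client improves the current global model on its own data...
    local_weights = [local_step(w_global, shard) for shard in clients]
    # ...and the server aggregates the updates into a new global model.
    w_global = sum(local_weights) / len(local_weights)

print(round(w_global, 4))  # → 2.0, learned without pooling the data
```

In practice the server aggregates gradients or weights from thousands of devices per round, often combined with MPC or DP as noted above.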

Synthetic data generation (SDG)

What it is: Artificial data generated from real data, used to test systems and to train machine learning models. It maintains the statistical properties of the real data without compromising privacy.

Example: Using real images to create synthetic images that accurately mimic the real thing.

Benefits:
  • Enhanced data quality, precision, diversity and variety.
  • Where data is missing, inconsistent or erroneous, synthetic data can be generated to fill the gaps.
  • It can be implemented and scaled more easily.

Risks:
  • Real-world outliers may not be reflected in the synthetic data.
  • If the original data source is inaccurate, the synthetic data may also be inaccurate.
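A very simplistic sketch of the idea: estimate the distribution of a real numeric column and sample fresh values that preserve its statistics. The salary figures are invented, and real SDG tools use far richer models (for example GANs or copulas) that also preserve correlations between columns.

```python
import random
import statistics

# Sketch: fit a normal distribution to a real numeric column (assumed
# roughly normal for illustration) and sample synthetic values that
# mimic its mean and spread without copying any real record.

real_salaries = [52_000, 61_000, 58_500, 70_200, 66_800, 49_900, 75_300, 63_400]

mu = statistics.mean(real_salaries)
sigma = statistics.stdev(real_salaries)

synthetic_salaries = [random.gauss(mu, sigma) for _ in range(1000)]

# The synthetic column tracks the real statistics, so downstream testing
# and model training behave similarly to using the real data.
print(round(statistics.mean(synthetic_salaries)), round(mu))
```

Note how the risk described above surfaces here: a salary far outside the fitted distribution (a real-world outlier) is unlikely ever to be generated.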

Data masking techniques

Other forms of PETs include data masking techniques such as obfuscation, pseudonymisation and data minimisation.

  • Obfuscation means data is transformed into another form in order to hide the original data. An example of this includes the use of ciphers to mask the original data.
  • Pseudonymisation involves removing sensitive data, such as names, and replacing it with fictitious or unidentifiable data such as XX/XX/XXXX for a date of birth.
  • Data minimisation is a general principle of privacy that encourages organisations to consider the purpose of data collection, and to only collect, store and process essential data in accordance with that purpose. Every organisation is capable of considering data minimisation when handling personal and sensitive information.
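A sketch of simple pseudonymisation along the lines described above. The field names, salt and salted-hash approach are illustrative choices only, not a prescribed method, and salted hashing is just one of several tokenisation techniques.

```python
import hashlib

# Sketch: replace direct identifiers with consistent but unlinkable
# tokens, and fully mask the date of birth as in the XX/XX/XXXX example.
# Field names and the salt are hypothetical.

SECRET_SALT = "rotate-me-regularly"  # stored separately from the data

def pseudonym(value: str) -> str:
    """Replace an identifier with a salted, irreversible token."""
    return hashlib.sha256((SECRET_SALT + value).encode()).hexdigest()[:12]

def mask_record(record: dict) -> dict:
    masked = dict(record)  # leave the original record untouched
    masked["name"] = pseudonym(record["name"])
    masked["date_of_birth"] = "XX/XX/XXXX"
    return masked

record = {"name": "Jane Citizen", "date_of_birth": "01/02/1980", "postcode": "2000"}
print(mask_record(record))
```

Because the same name always maps to the same token, records can still be joined for analysis, which is precisely the utility-preserving trade-off pseudonymisation aims for.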

It should be noted that these data masking techniques do not prevent cyberattacks or provide a foolproof form of data security. However, they do make it more difficult for hackers to harvest information or to link information to particular individuals.

Other general benefits of data masking techniques include building customer trust and retention, and a stronger reputation for data privacy and data management.

Implications

On 12 December 2022, the Privacy Legislation Amendment (Enforcement and Other Measures) Act 2022 received royal assent in Australia. This amending Act raised the maximum penalties for serious or repeated privacy breaches to whichever is the greater of:

  • AUD50 million;
  • three times the value of any benefit obtained through the misuse of information; or
  • 30 per cent of a company's adjusted turnover in the relevant period.

Recent data breaches demonstrate that no organisation is immune from a cyberattack, and the risk of exposure to significant penalties under the Privacy Act has risen as a result of last year's penalty increases.

While certain advanced PETs are still in development and may not be accessible to some organisations, adopting them demonstrates a proactive approach to managing cyber security, privacy and data protection. Adopting appropriate data masking techniques such as data minimisation, and where possible the more advanced PETs described above, could form part of ensuring your organisation maintains strong data protection and demonstrates compliance with its obligations under the Privacy Act 1988 (Cth).

We will continue to watch the development of PETs and provide updates on how they can assist organisations with their privacy compliance obligations.

For advice and support regarding privacy and data requirements and best practice within your organisation, contact our experienced team of legal experts.


All information on this site is of a general nature only and is not intended to be relied upon as, nor to be a substitute for, specific legal professional advice. No responsibility for the loss occasioned to any person acting on or refraining from action as a result of any material published can be accepted.

Key contacts

Keely O'Dowd

Senior Associate

Rebeccah Richards

Lawyer