Data Scraping to harvest personal information is on the rise

(12 regulators from around the world have their say)

The Office of the Australian Information Commissioner, together with 11 other privacy regulators, has released a joint statement addressing the growing prevalence of unauthorised data scraping and the need for better protection of personal information.

The joint statement has four main messages:

publicly accessible personal information is still subject to privacy laws;
social media companies and websites that host publicly accessible personal data have obligations to protect personal information on their platforms;
individuals can take steps to protect their data, and social media companies ought to assist; and
data scraping incidents can be reportable data breaches.

Publicly accessible data

The joint statement reminds us that in most jurisdictions, including Australia, personal information that is available to the public online is still subject to data protection and privacy laws. The fact that it is publicly available does not necessarily remove the applicability of privacy laws that protect an individual’s data.

Individuals and businesses that scrape websites or platforms that include personal information must comply with privacy laws. In addition, the businesses that host that personal information on sites that can be accessed by the public, such as social media companies and operators of other websites, also have data protection responsibilities regarding the third parties that engage in data scraping.

Responsibility of social media companies

Social media companies are specifically called out in the joint statement as entities that have a responsibility to protect individuals’ personal information from unlawful data scraping.

In recent years, data protection authorities have seen increased reports of mass data scraping from social media and other websites that publish personal information. Social media companies need to be across the legality of different types of data scraping and the emerging technologies that enable scraping and extracting value from datasets.

Social media companies should implement multi-layered technical and procedural controls to mitigate the risks to individuals. The combination of controls should be proportionate to the sensitivity of the information, and may include technical measures to monitor accounts, limit visits to account profiles, detect scrapers by identifying ‘bot’ activity, and respond to unauthorised scraping.
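By way of illustration only, one of the technical measures mentioned above (limiting visits to account profiles) could be sketched as a sliding-window cap on profile views per account. The joint statement does not prescribe any particular implementation; the class name, method name and thresholds below are hypothetical, and a real platform would combine a control like this with other measures.

```python
from collections import defaultdict, deque


class ProfileViewLimiter:
    """Sliding-window cap on profile views per account (hypothetical sketch)."""

    def __init__(self, max_views: int, window_seconds: float):
        self.max_views = max_views            # assumed threshold, not from the statement
        self.window_seconds = window_seconds  # assumed window length
        self._views = defaultdict(deque)      # account id -> timestamps of recent views

    def allow(self, account_id: str, now: float) -> bool:
        """Return True and record the view if the account is within its cap."""
        recent = self._views[account_id]
        # Discard views that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_views:
            return False  # over the cap: a candidate for blocking or manual review
        recent.append(now)
        return True


limiter = ProfileViewLimiter(max_views=3, window_seconds=60)
decisions = [limiter.allow("account-a", t) for t in (0, 1, 2, 3)]
```

In this sketch the fourth view inside the same minute is refused, while views by other accounts, or later views once the window has passed, are still allowed. Refused requests could also feed the monitoring and response steps described above.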

The joint statement also confirms that social media companies should take appropriate legal action where data scraping is suspected or confirmed, such as sending ‘cease and desist’ letters, requiring and confirming the deletion of scraped information, and taking other legal action to enforce terms and conditions that prohibit data scraping on their websites and platforms.

Protecting individuals

Web scraping is an issue for individuals because they can lose control of their personal information when it is scraped, especially if this occurs without their knowledge and against their expectations.

Possible harms associated with unauthorised data scraping can include targeted cyberattacks, identity fraud, profiling and surveillance of individuals, unauthorised intelligence gathering and unwanted direct marketing or spam.

The joint statement suggests individuals take steps to empower themselves and better protect their personal information, including limiting the information that they post online, and being aware of privacy settings on their accounts.

Individuals, however, can only do so much. The joint statement says that social media companies and other websites also have a role to play in enabling users to engage with their services in a privacy-protective manner. They should support their users so that they can make informed decisions about how they use the platform and what personal information they share. This should also involve increasing user awareness and understanding of the privacy settings they can use.

Reportable data breaches

The joint statement confirms that mass data scraping incidents that harvest personal information can constitute reportable data breaches in many jurisdictions. So, social media and other companies that host a website and allow their data to be scraped may have a notifiable data breach on their hands.

Why now and what next?

The joint statement is the data protection authorities’ response to an increase in reports of mass data scraping of social media and other websites. The availability and capacity of data scraping technologies to collect and process individuals’ personal information from the internet, including AI-assisted technologies, have likely also contributed to the authorities’ heightened concern.

The joint statement is consistent with the Office of the Australian Information Commissioner’s recent focus on illegal data scraping, such as the recently concluded Clearview AI data scraping case, in which the Administrative Appeals Tribunal of Australia found that Clearview AI breached the Privacy Act by collecting images of individuals' faces for biometric identification. It is also consistent with its previous collaborations with international data protection authorities on global data issues, such as opening a joint investigation with the UK’s Information Commissioner’s Office into the personal information handling practices of Clearview AI Inc back in 2020.

The expectation that social media companies assist individuals to control how their personal information is shared online may foreshadow that making misleading representations to consumers about the collection and use of their personal information, and failing to provide transparent privacy information controls, will become an increasing focus for privacy and competition regulators.

The joint statement does not set any legal obligations in addition to the statutory requirements in each jurisdiction. But it does make recommendations based on the data protection authorities’ expectations.

In an interesting move by the data protection authorities, the joint statement has been provided directly to the leading social media companies with a request for comments in what seems like a very tight one-month turnaround from 23 August 2023.

The social media companies now have the substantial task of demonstrating in their responses how they comply with the data protection authorities’ expectations as outlined in the joint statement. We can expect the responses by the social media companies, which may be published, will address how unauthorised web scraping is detected, organisational and technical measures in place to prevent scraping and minimise harm, how privacy settings support individuals, and the data breach response plans in place.
