Not yet the last word on efforts to scrap scraping
13 Jan 2025
In our legal update of 3 October 2023 (link), we reported that the Privacy Commissioner for Personal Data in Hong Kong (“PCPD”) joined the international effort along with eleven other privacy authorities in issuing a joint statement in respect of illegal data scraping practices. The joint statement was circulated to major social media companies, including Facebook, Instagram, LinkedIn, X and YouTube. We foreshadowed that we could expect more on this topic in future. As Pádraig Walsh, Tara Chan and Vanessa Leung from our Data Privacy practice report, we did not have to wait long.
In October 2024, the PCPD and the other privacy authorities issued a concluding joint statement in respect of data scraping practices (“Concluding Joint Statement”). This followed feedback from social media companies and other industry stakeholders.
Data scraping refers to the extraction of data from the internet through automated processes. Unlawful data scraping, however, is the extraction of data for unauthorised purposes, such as reselling the data or using it in cyberattacks, identity fraud, unwanted direct marketing or spam messages.
The Concluding Joint Statement, which is fully endorsed by the PCPD in Hong Kong, sets expectations on how organisations are required to guard against unlawful data scraping. These include:
1. Organisations must deploy a combination of safeguarding measures against unlawful data scraping, including the use of AI. They are expected to regularly review and update those measures to keep pace with rapidly evolving scraping technologies.
2. Engaging third-party service providers to guard against data scraping does not absolve organisations of their own responsibility to protect personal data.
3. Generally, organisations should limit the amount and sensitivity of information they make publicly accessible so that they can adequately protect such data from unlawful scraping.
4. The level of safeguards ought to be appropriate and commensurate to the sensitivity of the information potentially available for unlawful scraping.
5. The obligation to protect against unlawful scraping applies to Small and Medium Enterprises (“SMEs”) just as it applies to large corporations. SMEs are expected to deploy measures, albeit at lower cost, to guard against scraping. Measures on a modest budget include bot detection, rate limiting and CAPTCHAs.
6. Organisations engaged in or permitting data scraping must be transparent about the scraping and obtain consent where required by the applicable law.
7. Organisations must ensure that any third parties who are contractually authorised to scrape data do so in compliance with the applicable data protection and privacy laws. For example, contracts with third parties should specify limitations on the information that may be scraped, the purposes for which it may be used, and the consequences of non-compliance.
8. Organisations are expected to implement measures to monitor third-party compliance with data scraping agreements and to enforce compliance when contract terms are not respected. They must not rely solely on the fact that formal contract terms have been imposed.
9. When granting third parties access to scrape data, organisations must do so in a controlled environment, for example through an application programming interface (API), to facilitate monitoring of data access.
10. When using scraped data to train AI, organisations must comply with applicable data protection, privacy and any AI-specific laws.
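By way of illustration only, rate limiting (one of the lower-cost measures mentioned in item 5 above) can be sketched in a few lines of Python. This is a minimal sketch of a common "token bucket" approach and is not drawn from the Concluding Joint Statement. The class name `RateLimiter`, its parameters and the client identifier are hypothetical, and a production deployment would normally sit alongside the other safeguards listed above.

```python
import time


class RateLimiter:
    """Hypothetical per-client token bucket (illustrative only).

    Each client starts with `capacity` tokens. Every request costs one
    token, and tokens are replenished at `refill_per_second` up to the
    capacity. A client with no tokens left is refused.
    """

    def __init__(self, capacity=5, refill_per_second=1.0, clock=time.monotonic):
        self.capacity = float(capacity)
        self.refill_per_second = float(refill_per_second)
        self.clock = clock      # injectable so the behaviour can be tested
        self.buckets = {}       # client_id -> {"tokens": float, "last": float}

    def allow(self, client_id):
        """Return True if this client's request may proceed."""
        now = self.clock()
        bucket = self.buckets.setdefault(
            client_id, {"tokens": self.capacity, "last": now}
        )
        # Replenish tokens for the time elapsed since the last request.
        elapsed = now - bucket["last"]
        bucket["tokens"] = min(
            self.capacity, bucket["tokens"] + elapsed * self.refill_per_second
        )
        bucket["last"] = now
        if bucket["tokens"] >= 1.0:
            bucket["tokens"] -= 1.0
            return True
        return False
```

In this sketch, a client identifier might be an IP address or an account ID (an assumption for illustration). A burst of requests beyond the capacity is refused until enough time has passed for tokens to be replenished, which slows automated bulk extraction while leaving ordinary browsing largely unaffected.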
Within the space of a year from the initial joint regulator statement on data scraping, there is more direct content in the Concluding Joint Statement focussing on data scraping issues arising from the wider adoption of artificial intelligence (AI) systems. There is a clear expectation that social media companies and other organisations that use scraped data sets or use data from their own platforms to train AI systems (such as large language models) must comply with data protection and privacy laws as well as any AI-specific laws where those exist. Likewise, organisations are expected to follow guidelines and principles published by regulators on the development and implementation of AI models. The PCPD directly referenced its “Guidance on the Ethical Development and Use of AI” and “AI: Model Personal Data Protection Framework” in its press release on the Concluding Joint Statement.
The engagement with social media companies is also worth noting. During the engagement process, social media companies indicated that they had implemented many of the measures that were identified in the initial statement, as well as further measures that could form part of a dynamic multi-layered approach to better protect against unlawful data scraping. The Concluding Joint Statement was sent to the relevant social media companies to provide further guidance.
Data scraping is a complex, broad and evolving issue. The issues involved are broader than personal data and privacy. Nonetheless, it is clear that this issue will stay on the radar of data protection authorities.
The full Concluding Joint Statement can be found at this link, and the media statement of the PCPD is on this link.
Pádraig Walsh, Tara Chan and Vanessa Leung
Disclaimer: This publication is general in nature and is not intended to constitute legal advice. You should seek professional advice before taking any action in relation to the matters dealt with in this publication. This article was last reviewed on 13 January 2025.