This news story was updated on September 4, 2023, to reflect an update from Duolingo regarding the cyber security incident.
The scraped data of more than 2.6 million users of the language learning app Duolingo has been posted to a dark web hacking forum.
The information was put up for sale on a dark web hacking forum on August 22 by a malicious actor, who was offering all 2.6 million records for US$1,500. The hacker claimed to have gained access to the data by scraping an exposed application programming interface (API). To demonstrate that the data was legitimate, they also offered a sample taken from 1,000 accounts.
Duolingo confirmed to news site TheRecord that the data was scraped from public profile information. The data exposed includes users’ names, usernames, email addresses and other information relevant to Duolingo’s services. It is relevant to note, however, that email addresses are not public information on Duolingo.
A Duolingo spokesperson said of the cyber security incident: “No data breach or hack has occurred. We take data privacy and security seriously and are continuing to investigate this matter to determine if there’s any further action needed to protect our learners.”
The exposed API has been public knowledge since March 2023. It allows anyone to retrieve the public information of any Duolingo profile by submitting that profile's username. Cyber security news site BleepingComputer confirmed that the API was still open in August 2023, despite Duolingo having been alerted to the issue in January 2023, after a malicious actor attempted to sell the scraped data on the now-defunct hacking forum, Breached.
On September 1, a Duolingo spokesperson contacted Cyber Security Hub with the following update: “Our investigation confirmed that this was not a breach or a hack; it was a scrape of data from public Duolingo profiles. No Duolingo systems or private user data were compromised. Regardless, as a precautionary measure we have taken some steps to limit this from happening again.
“We have put in place rate limits on the specific API endpoint to make it more difficult for attackers to abuse. We take data privacy and security seriously and will continue to constantly evaluate our security measures to ensure learner safety.”
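Duolingo has not said how its rate limits work. As a rough illustration of the general technique only, the Python sketch below shows a per-client token bucket, one common way to cap how often a single caller can hit an endpoint. The class name, parameters and client identifiers are hypothetical and are not drawn from Duolingo's systems.

```python
import time


class RateLimiter:
    """Per-client token bucket. Illustrative only; not Duolingo's implementation."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity                    # maximum burst size per client
        self.refill_per_second = refill_per_second  # sustained requests per second
        self.clock = clock                          # injectable clock, useful for testing
        self.buckets = {}                           # client_id -> (tokens, last_seen)

    def allow(self, client_id):
        """Return True if the request may proceed, False if it should be rejected."""
        now = self.clock()
        # New clients start with a full bucket.
        tokens, last_seen = self.buckets.get(client_id, (self.capacity, now))
        # Top up tokens for the time elapsed, never exceeding the capacity.
        tokens = min(self.capacity, tokens + (now - last_seen) * self.refill_per_second)
        if tokens >= 1:
            self.buckets[client_id] = (tokens - 1, now)
            return True
        self.buckets[client_id] = (tokens, now)
        return False


if __name__ == "__main__":
    limiter = RateLimiter(capacity=5, refill_per_second=1)
    # A burst of eight requests from one hypothetical client address.
    results = [limiter.allow("203.0.113.7") for _ in range(8)]
    print(results)
```

A limit of this kind slows bulk scraping by making each caller wait between lookups, but it does not stop it outright, since requests can be spread across many addresses. That is consistent with Duolingo describing the change as making the endpoint "more difficult for attackers to abuse".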