Data Digest № 015
Hey there, and welcome to the 15th edition of the Data Digest, where I offer a weekly summary of the most important happenings in the data industry. This week in review: HYP3R gets hyped on your personal data, a security researcher exposes flaws in how firms handle GDPR requests, Twitter admits to a data mishap, data silos halt AI progress, and more. Enjoy!
Instagram’s “Trusted Partner” HYP3R Has Been Caught Scraping Millions of Profiles
San Francisco-based startup HYP3R, until recently a “preferred marketing partner” of Facebook-owned Instagram, has been caught scraping huge amounts of public user data. The firm has publicly admitted that it has been hoarding “a unique dataset of hundreds of millions of the highest value consumers in the world”, revealing that “90% of its data came from Instagram.” Business Insider wrote that HYP3R took “advantage of an Instagram security lapse” that allowed users who weren’t logged in to view posts from public location pages. Using that access, the company created geofenced locations, harvested “every public post tagged with that location on Instagram,” and stored them indefinitely. It even built a tool to download Instagram Stories, which are supposed to disappear after 24 hours but were instead made effectively permanent. With this data, the firm stitched together detailed records of millions of users’ locations, personal bios and photographs posted to Stories, which let it infer accurate interest and behavioral patterns and target those users effectively with ads.
Facebook’s lax privacy policies allowed this data breach to continue under its nose for over a year; only after it was reported did the company issue HYP3R a cease and desist and kick it off the platform. More than a year after the Cambridge Analytica scandal, it comes as no surprise that Facebook is still uncovering important privacy lapses, which highlights the widespread and urgent need for open platforms to perform due diligence when it comes to users’ personal information. The total volume of user data HYP3R scraped from Instagram remains unclear. A former HYP3R employee disclosed, “It takes very little effort for Instagram to protect the location accessed by HYP3R, why they haven’t done it remains a mystery.” That raises the question: how many others are still getting away with the same thing?
GDPR Privacy Law Exploited To Reveal Personal Data
James Pavur, a University of Oxford-based researcher and security expert, presented worrying findings from a curious experiment at the Black Hat conference in Las Vegas. The experiment was intended to replicate an attack that could be carried out by someone starting with only the details found on a basic LinkedIn page or similar public profile. He contacted over 80 firms of different sizes based in the UK and US to see how they would handle a “right of access” request made in someone else’s name. In each case, he asked for all the data they held on his fiancée. Over the course of the experiment, he obtained a total of 60 distinct pieces of personal information about his partner. His findings revealed that a staggering 24% of the 83 firms supplied personal information without even verifying the requester’s identity, 16% requested an easily forged type of ID that he did not provide, and only 5% said they had no data to share. The study is the first of its kind to reveal how negligent firms’ security can be when they are faced with a request citing an EU privacy law, and it highlights the lack of clearly defined best practices for keeping up with enhanced privacy regulations. The implications of this ground-breaking research are yet to be seen, but industry-wide best practices should be established to guide companies’ compliance efforts, and stricter security measures clearly need to be implemented from the top down.
Privacy law exploited to reveal fiancee’s data
Twitter Sharing User Data With Advertisers, Even After Users Explicitly Tell Them Not To
Twitter has admitted to sharing users’ personal data with advertisers regardless of whether those users gave permission. Twitter stated that it “may have shared certain data (e.g., country code; if you engaged with the ad and when; information about the ad, etc)” with ad measurement and advertising partners. This is not the first such disclosure: back in May 2019, the company revealed that it had accidentally been sharing users’ location data during Real Time Bidding (RTB) auctions. Whether the latest admission will attract regulatory attention is yet to be seen. European regulation under GDPR mandates timely disclosure of data breaches, so the case will depend on how long ago Twitter found the bugs. GDPR also includes fines for confirmed data protection violations. If social media companies are able to get away with these security ‘mishaps’ without so much as a slap on the wrist, what’s the point of having the regulations in the first place?
Twitter ‘fesses up to more adtech leaks – TechCrunch
Data Needs To Be Controlled By Users, But Remain Usable To Others
Data silos in the field of medicine are impeding scientific innovation by preventing data from being shared across institutions. This creates huge problems for biomedical scientists like Robert Chang, a Stanford ophthalmologist, and James Zou, a Stanford professor, who aptly states that “there is a gap between the policy community and the technical community on what exactly it means to value data.” Patients and doctors alike would almost certainly be more willing to share data if they knew it would not be visible to anyone but themselves. Embracing a privacy-conscious design that enables data to be used in machine learning models, whether for medicine, technology or otherwise, would significantly help medical innovation. For data ownership to become a reality, data needs to be controlled by users but still usable by others in a privacy-preserving manner. This is possible through blockchain-based platforms that encrypt and anonymize data, like Datawallet.
AI Needs Your Data—and You Should Get Paid for It
FBI Surveillance Proposal Interferes With FTC Scrutiny of Facebook’s Privacy Policies
FBI Surveillance Proposal Sets Up Clash With Facebook
Campaign Group Exposed 6.2 Million Americans’ Email Addresses
Email addresses of 6.2 million Americans were left on an exposed server by an organization that works to elect Democratic candidates to the US Senate. According to security firm UpGuard, the data came from those “who had opted out or should otherwise be excluded” from the group’s marketing. The spreadsheet, titled “EmailExcludeClinton.csv”, was found in a similarly named cloud storage location that was not protected by a password. The file was uploaded in 2010, a year after former Democratic senator and presidential candidate Hillary Clinton, after whom the file is believed to be named, became secretary of state. Many questions arise when political data is exposed almost 10 years after it was created.
Democratic Senate campaign group exposed 6.2 million Americans’ emails – TechCrunch
Amazon Is Looking to Put Advertising Data on a Blockchain
The tech industry is realizing that the convoluted, discrepancy-ridden adtech ecosystem needs a serious overhaul. Amazon is joining a number of companies looking to use blockchain technology to remodel the way advertising dollars are distributed and recorded, showing where the money goes and identifying the middlemen taking a cut. Large sums of money currently flow through the real-time bidding (RTB) system: about $20 billion in the US alone. The general consensus is that more transparency in how data is transacted in the online advertising industry is imperative.
Amazon Is Looking to Put Advertising Data on a Blockchain - CoinDesk
What I'm Reading:
VICE - Exclusive: Critical U.S. Election Systems Have Been Left Exposed Online Despite Official Denials
Trump campaign, GOP committees halt Twitter spending after McConnell account locked
Amazon lets Alexa users disable human voice recording review
Cultivated data is the next Gold Rush – TechCrunch
Why Tomorrow's Political Leaders Must Engage With Technologists
Democrats have told Google to make its contractors permanent employees
See you next week!
Serafin
Data Digest, Consumer Privacy, Data Misuse, Data Breaches, GDPR, Industry Trends, Regulatory Updates