Data Digest № 001

Data Digest ¦ March 17th, 2019, 11:00 pm

Welcome to Datawallet’s Weekly Data Digest. Here we’ll summarize the week’s most notable data events, adding our take when we have something useful to say. Normally we’ll publish every Sunday. Please don’t read anything into the fact that we’re a day late with our first one, which was scheduled for St. Patrick’s Day. Now let’s get to it! Here are the stories you should pay attention to…

Facebook “Planning” a Move Towards More Private Messaging

The biggest announcement in the data world came from none other than the data czar himself, Mark Zuckerberg. After years of misusing and selling people’s data and stripping them of control over it, all while making hundreds of billions of dollars, Zuck has now decided that he wants to be the guardian angel of privacy.

The move has the bizarre appearance of a Hail Mary, where a touchdown in the privacy end zone would lead all of us, regulators included, to forget about the years of scandals that have plagued Facebook. It is also a wonderful spectacle that, coincidentally, comes at a time when Facebook is trying to pull off one of the most frightening data plays in recent history: unifying the Facebook, Instagram, and WhatsApp platforms. That effort has led to high-profile executive departures, such as those of the firm’s Chief Product Officer, Chris Cox, and a WhatsApp co-founder.

Mark Zuckerberg promises a newer, more private Facebook

A major new blog post about Facebook’s future

Facebook Traded Users’ Personal Information: Grand Jury Investigates

Speaking of scandals that Facebook may want to sweep under the rug: Zuckerberg’s grand privacy revelation was quickly followed by news that may have inspired his sudden change of heart, namely a criminal investigation. Rather than revisiting the Cambridge Analytica scandal, the investigation, led by prosecutors in the Eastern District of New York, focuses on partnerships that Facebook struck with more than 150 companies, giving them access to data on Facebook users without those users’ express consent or knowledge. Companies that had these data-sharing agreements with Facebook include Amazon, Apple, Microsoft, and Sony.

And even though Facebook is rightfully crucified for its data-sharing practices, we ought not to forget the companies that were on the receiving end of these data deals. They knew exactly what they were doing: Amazon grabbed troves of contact information; Bing mapped out nearly the entire social network; and Apple and other device makers hid from the users of their smartphones, tablets, etc. every indication that this data was being taken.

Facebook’s Data Deals Are Under Criminal Investigation (Published 2019)

A federal grand jury is looking at partnerships that gave major tech companies broad access to Facebook users’ information.

Weird flex of the week

Foursquare is lifting the veil on how much it knows about you, at least to a certain extent. Specifically for SXSW, it released a Hypertrending app that gives a “god view” of aggregated location data to show where everybody is hanging out in Austin, TX. Co-founder Dennis Crowley pitched the app as “provocative,” an effort to gauge consumer sentiment.

The idea of gaining people’s trust by selectively showing a fraction of what one knows about them seems like a strategy destined for failure. And that failure is doubly ensured once you realize that the people you are looking to win over are likely unaware that they are users of Foursquare, since all the data comes from a Foursquare SDK that runs in the background of many other apps. All of these clear and creepy signals suggest that this play may not have been a provocative consumer litmus test, but rather a public flexing of Foursquare’s data muscles directed at big corporations.

Foursquare's unusual pitch: The ethical data company | Engadget

It seems counter-intuitive that, in the thick of a backlash against Big Tech's data privacy abuses, Dennis Crowley is pitching location tracking technology at South By Southwest. Foursquare, which he co-founded, recently announced Hypertrending. It's an in-app feature that shows a real-time heat map of where everyone on Foursquare (and the apps that use its technology) are hanging out in Austin. The data is anonymized and aggregated so you don't see how many people are in a particular bar or park.

Utility and Privacy

“People don’t realize the sand is shifting under their feet and that we can now in fact achieve privacy and utility at the same time.”  — Ramesh Raskar, Associate Professor, MIT Media Lab

A recent article in the MIT Technology Review highlighted advances in privacy-preserving machine learning. Federated learning can train a model using data stored at multiple different hospitals without that data ever leaving a hospital’s premises or touching a tech company’s servers. Throughout the process, raw data is never exchanged. Only the models are, and according to the article they cannot be reverse-engineered to reveal that data.

As a matter of fact, several companies are actively working on tools that make the vision of federated learning possible to implement. OpenMined, for example, has open-sourced a Python library that enables converting traditional PyTorch ML code into federated learning by “changing 10 lines of code”, while being only about 2x slower than centralized learning.
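To make the idea concrete, here is a minimal sketch of federated averaging on a toy linear model, written in plain Python. It is not OpenMined’s API or the method from the article. The function names, the synthetic “hospital” data, and the learning-rate and round counts are all assumptions made for illustration. The point to notice is that each site trains on its own data and hands back only model weights, which a coordinator then averages.

```python
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one site's private data.

    Only the updated weights are returned; the raw (x, y) pairs stay local.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(updates, sizes):
    """Average the sites' weights, weighting each site by its sample count."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)


def train_federated(sites, rounds=50):
    """Coordinator loop: send out the global model, collect and average updates."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in sites]
        global_weights = federated_average(updates, [len(d) for d in sites])
    return global_weights


def make_site(n, rng):
    """Hypothetical private dataset: y = 3x + 1 plus a little noise."""
    xs = (rng.uniform(-1, 1) for _ in range(n))
    return [(x, 3.0 * x + 1.0 + rng.gauss(0, 0.01)) for x in xs]


rng = random.Random(0)
hospitals = [make_site(20, rng), make_site(50, rng), make_site(30, rng)]
w, b = train_federated(hospitals)
print(f"learned w={w:.2f}, b={b:.2f}")
```

Real deployments typically layer protections such as secure aggregation or differential privacy on top of this basic loop; the sketch leaves those out.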

A little-known AI method can train on your health data without threatening your privacy

In 2017, Google quietly published a blog post about a new approach to machine learning. Unlike the standard method, which requires the data to be centralized in one place, the new one could learn from a series of data sources distributed across multiple devices. The invention allowed Google to train its predictive text model on…

What’s in the Box? Unencrypted customer data…

Recently, there was yet another data leak: this time, a dump of millions of customers’ names and email addresses, all associated with Box accounts. Box is a cloud storage and sharing company, and although it provides some good usage standards and guidelines, it seems that not everyone cares enough to follow them. Cybersecurity researchers from Adversis found that dozens of companies, through careless use of public links, have made sensitive customer and corporate data easily discoverable. Companies involved include Amadeus, Apple, Discovery, Herbalife, Edelman, Pointcare, and (embarrassingly) Box itself.

Dozens of companies leaked sensitive data thanks to misconfigured Box accounts – TechCrunch

Security researchers have found dozens of companies inadvertently leaking sensitive corporate and customer data because staff are sharing public links to files in their Box enterprise storage accounts that can easily be discovered. The discoveries were made by Adversis, a cybersecurity firm, which …

Sen. Warren Runs on being Trust Buster in Chief

Senator Warren, who recently announced her bid for the presidency, has unveiled one of the primary themes of her candidacy: going up against the dominance of Silicon Valley. Most notably, Sen. Warren laid out a plan to break up big tech conglomerates such as Facebook, Amazon, and Google. Her plan encompasses two steps:

  1. Passing legislation that requires large tech platforms to be designated as “Platform Utilities” and broken apart from any participant on that platform;

  2. Appointing regulators committed to reversing illegal and anti-competitive tech mergers. Her plan would seek to unwind acquisitions such as Facebook’s purchase of Instagram and WhatsApp.

Good luck…

Here’s how we can break up Big Tech

By Elizabeth Warren

The argument for Facebook’s anticompetitive behavior

On that note, Roger McNamee, an early mentor of Mark Zuckerberg and now an outspoken critic of the platform, recently turned to federal investigators to help them reason about whether Facebook has been engaging in anticompetitive behavior and should, therefore, be broken up. McNamee believes that antitrust is the only way to curb the power of companies like Google, Amazon, and Facebook.

The idea of breaking up large tech corporations has the appeal of an easy fix for a complex problem. But by employing this rhetoric, and possibly even following up with action, we are focusing on a symptom, not the root cause. The fact is that people’s data are being taken, stored, and used because people have no effective right to data ownership. We should focus on giving people more power over their own data. Reining in corporations one by one may not be the right trick here.

Roger McNamee: “Why is it legal to collect data on kids, let alone sell it?”

The early Facebook advisor and outspoken critic believes that antitrust is the only way to curb the power of companies like Google, Amazon, and Facebook.

Protect your Mongo

More than 800 million records have been leaked by Verifications.io, a website that helps companies remove inactive addresses from bulk email lists. Data included in these leaks: verified emails, phone numbers, addresses, and dates of birth; Facebook, LinkedIn, and Instagram account details; and even credit scoring and mortgage data, such as the amount owing and the interest rates being charged. The reason? An unprotected MongoDB database.

(Updated) 2 Billion Unencrypted Records Leaked In Marketing Data Breach -- What To Do Next

Another day, another mega data breach. Except this one is different. More than two billion unencrypted records with very detailed information including mortgage data and credit scoring. So, what's happened and what should you do next?

This wraps the inaugural edition of the Data Digest.

Until next time,

