This was an investigation deep into the heart of the “Collection #1-5” password leaks that appeared on the web in early 2019. We showed that more than 3 million Swiss email addresses and – more disquietingly – over 20’000 email addresses of Swiss authorities and providers of critical infrastructure appear in the leak.
Aside from the usual broadcast and online channels, we also released a short YouTube video for a younger audience that explains the dangers of using a weak password. For demonstration purposes, I gained access to the Instagram account of our host, Lena, within a couple of hours.
In this project, I used a so-called “big data technology”, Spark, for the first time. While we at SRF Data usually publish the source code for our data processing, I decided against doing so in this case. Instead, I wrote a blog post that explains the process and helps other journalists tackle similar “big data” problems.