Bitwise

Who Is Reading Your Email?

Is it an algorithm or a human being? (And do you care?)

A woman on a computer in 2011 in New York City.
You’re not the only one reading your email.

Photo by Chris Hondros/Getty Images

Microsoft is reading your Hotmail! Are you surprised? That’s a serious question. Because I have a good sense of the innards of online mail and messaging services—I worked on Microsoft’s instant messenger for a time, which integrated with Hotmail—I have, paradoxically, less of a sense of what regular users expect in terms of privacy. The NSA surveillance of just-about-everything has provoked less outrage from our privacy-conscious society than I predicted, so is it so much worse if Microsoft does it?

The case here is a special one, since it involves the leaking of Microsoft trade secrets. In 2012 Alex Kibkalo was a Microsoft software architect who allegedly leaked parts of the then-unreleased Windows 8 to a French blogger via a Hotmail account. The unidentified French blogger, also up against Kibkalo for a Darwin Award, then allegedly emailed the stolen files to Microsoft Windows President Steve Sinofsky to ask if they were authentic. Microsoft pounced on the blogger’s Hotmail account, searched it, and found an email from Kibkalo; the company then searched Kibkalo’s Hotmail account for more smoking guns.

The mindboggling stupidity at work here is reflected in some of the IM transcripts released in the claim:

Kibkalo: I would leak enterprise today probably
Blogger: Hmm
         are you sure you want to do that? lol
Kibkalo: why not?

Blogger: that’s crossing a line you know pretty illegal lol
Kibkalo: I know
         :)

Leaking a product in development—even Windows 8—is indisputably a violation of pretty much any employee agreement a developer will sign. Kibkalo apparently leaked in retaliation for a poor performance review, though I’m still fuzzy on the exact benefit he sought to achieve.

So after this would-be Edward Snowden left a trail all over Hotmail servers that Microsoft owned, the company tracked him down immediately. Microsoft searched through his email, and more significantly, it searched through the email of the blogger, who was unaffiliated with Microsoft. While Microsoft’s reaction to Judas Kibkalo is understandable, was it legally justified in snooping?

John Frank, Microsoft’s deputy general counsel and vice president of legal and corporate affairs, released an eye-roll-inducing statement, which declared, “Courts do not … issue orders authorizing someone to search themselves, since obviously no such order is needed.” But before you go deleting all of your data from the cloud, Frank reassures us: “However, even we should not conduct a search of our own email and other customer services unless the circumstances would justify a court order, if one were available.” Even Microsoft shouldn’t!

Frank poses an interesting paradox, which I’ll paraphrase as: Searching never requires a court order, because one is never needed, but searching should only be performed if that nonexistent court order were available. That makes it sound like searching can never be performed, but Frank proceeds to cut the Gordian knot as follows: Microsoft will check with a “former federal judge” to see if the evidence merits the search, in what will evidently be the most one-sided legal argument heard outside the FISA warrant courts.

The Electronic Frontier Foundation has ridiculed Microsoft’s looking-glass logic. But it’s true that Microsoft’s terms of service more or less give the company carte blanche; the same can be said of Yahoo, Apple, and Google as well. It’s their data, you just live in it. As Mark Zuckerberg says, the age of privacy is over.

On Friday, TechCrunch founder and VC figure Michael Arrington claimed that Google spied on his Gmail account to identify one of his sources some years back. Arrington’s story relies on hearsay, though, and due to Arrington’s own ethical lapses and past conflicts of interest, I’m not inclined to take his word on much. But Microsoft’s actions, as well as a generally cavalier attitude toward privacy among the top tech companies, make the potential for scenarios like Arrington’s worrisome, and signal that we’re moving toward a point where corporate self-policing of user privacy is becoming insufficient.

Part of the problem is that the regulation of what a company does with its customers’ data is a tricky business, especially when high tech is involved. When Google’s Gmail first rolled out, it was with the assurance that no human being would read your email—only algorithms that would decide which ads to show you based on your email subjects. This is undoubtedly true in general, since there’s not enough time in the world to read the majority of email. But aggregate analysis of email or email metadata might not be as robustly anonymous as it appears. Let’s say a company wants to know what the most forwarded email is, and it turns out to be some right-wing crackpot missive about buying gold before the dollar collapses. A human might well see that email. Or take the more prosaic case of fixing bugs. When I worked on Microsoft’s instant messenger, there was a bug that came up that affected only users with really long account names, like billymcgrillymcgruffmcgroomcfiddlydinkmcfiddlydonkmccrasherface@hotmail.com. (To protect Billy’s privacy, I have changed the spelling of his name.) Just in the process of debugging, I saw more of Billy’s account information than would have been ideal. We clearly weren’t targeting Billy, nor did we have any idea who he (or she) really was. But the difference was in intent, not in action.

None of this justifies snooping. But it does explain why neither the government nor the tech companies have been eager to jump into the process of restricting companies’ ability to access their customers’ data at will. If explicit user approval were required each time account data was accessed for debugging, it would slow down debugging by several orders of magnitude. So some sort of preapproval in the terms of use has to be present, but the scope of it and the enforcement of its restrictions are thorny. Scholars like Frank Pasquale and Nick Diakopoulos are bringing up the need to “hold algorithms accountable”: paying closer attention to the automatic effects of corporate data management as well as manual interventions. But if the SEC is toothless and the U.S. government can barely prosecute anyone even for the blatant misdeeds that brought on the Great Recession, what are the chances of titrating policy around the unfathomable complexity of computer algorithms? The tech companies may just win this one by default.