Why is Google+ Built to Hold More Users than There Are People on the Planet?

July 28, 2011, 2:08 PM

Photo by KEXINO via Flickr.

Since Google+ launched on June 28, several Slate staffers have jumped on the bandwagon and signed up for accounts. Each account, we noticed, comes with a 21-digit user ID. That means that Google+ could conceivably accommodate a sextillion users—or more than a billion times as many people as there are on the planet. Why are those ID numbers so long?

According to a Google+ spokeswoman, user ID numbers are randomly selected for each account and are in no way related to the actual number of Google+ users. (So if your user ID is 123456789012345678901, that doesn’t mean that you’re the 123,456,789,012,345,678,901st person to have signed up for the service.) The spokeswoman wouldn’t comment any further on the topic.

A lengthy string of arbitrary digits would help protect users’ privacy, as well as Google’s business interests. The social network business model relies on monopolizing user information, so the company would want to make it as hard as possible to guess existing Google+ IDs. The various guessing techniques available, such as so-called binary search algorithms, become virtually useless when there are a few trillion or so empty slots for each assigned number.
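
To put rough numbers on that sparseness, here is a back-of-the-envelope sketch in Python. The account total is a made-up round figure chosen for illustration, not a count from Google, and the sketch assumes every 21-digit string (leading zeros included) is a possible ID.

```python
# Back-of-the-envelope arithmetic. ASSUMED_ACCOUNTS is a hypothetical
# figure chosen for illustration, not a number reported by Google.
ID_DIGITS = 21
ID_SPACE = 10 ** ID_DIGITS            # every possible 21-digit string
ASSUMED_ACCOUNTS = 1_000_000_000      # hypothetical: one billion accounts


def empty_slots_per_account(id_space: int, accounts: int) -> int:
    """Average number of unassigned IDs for each assigned one."""
    return (id_space - accounts) // accounts


def chance_random_guess_hits(id_space: int, accounts: int) -> float:
    """Probability that one uniformly random guess is an assigned ID."""
    return accounts / id_space


slots = empty_slots_per_account(ID_SPACE, ASSUMED_ACCOUNTS)
odds = chance_random_guess_hits(ID_SPACE, ASSUMED_ACCOUNTS)
print(f"{slots:,} empty slots per assigned ID")
print(f"chance that a single random guess is a real ID: {odds:.0e}")
```

Under that assumption, each real ID is surrounded by roughly a trillion unused ones, so a guesser picking numbers at random would almost never land on an account.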

According to Micah Sherr, a Georgetown professor of computer science (who did not work on the project), Google is probably using a hash function to create its lengthy identifiers. In a hash function, some input (say, a username and some additional bits of secret information, such as the name of the user’s childhood pet) is fed into a formula that converts that data first into binary code and then into a numerical user ID. (So while that ID can appear to be randomly chosen from a very large set, it’s actually being calculated in a specific way.) The user ID is thus seeded with information about the user, though it’s practically impossible to extract that information.* A good hash function would also make it extremely unlikely, though not strictly impossible, for two users to end up with the same ID.
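
A minimal sketch of the idea Sherr describes might look like the following. It is an illustration only: Google has not said how its IDs are produced, and the choice of SHA-256, the separator byte, and the function name `hashed_user_id` are all assumptions made for this example.

```python
import hashlib

ID_DIGITS = 21


def hashed_user_id(username: str, secret: str) -> str:
    """Derive a 21-digit ID from a username plus a secret (illustrative only).

    SHA-256 and the NUL separator are assumptions for this sketch,
    not a description of what Google actually does.
    """
    # Step 1: turn the inputs into raw bytes ("binary code").
    data = f"{username}\x00{secret}".encode("utf-8")
    # Step 2: hash the bytes; the digest cannot practically be reversed.
    digest = hashlib.sha256(data).digest()
    # Step 3: read the digest as one big integer and keep 21 decimal digits.
    number = int.from_bytes(digest, "big") % 10 ** ID_DIGITS
    return str(number).zfill(ID_DIGITS)


# Hypothetical inputs: the same inputs always give the same ID.
print(hashed_user_id("jane.doe", "Rex"))
print(hashed_user_id("jane.doe", "Rex") == hashed_user_id("jane.doe", "Rex"))
```

The output looks random but is fully determined by the inputs. Two different users could in principle be mapped to the same number, so a real system built this way would still need to check for duplicates.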

If Google isn’t interested in embedding IDs with information, UCLA computer science professor Peter Reiher speculates, it could simply be using a random number generator, inserting a code to make each number in the set appear only once.
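
Reiher’s alternative can be sketched just as briefly. This is one plausible reading of “a random number generator plus a uniqueness check,” not Google’s actual code; the in-memory set `taken` stands in for whatever account database a real service would consult, and `new_random_id` is a name invented for this example.

```python
import secrets

ID_DIGITS = 21


def new_random_id(taken: set) -> str:
    """Draw random 21-digit IDs until one is unused, then reserve it.

    `taken` is a hypothetical stand-in for a real account database.
    """
    while True:
        # secrets.randbelow draws from a cryptographically strong source.
        candidate = str(secrets.randbelow(10 ** ID_DIGITS)).zfill(ID_DIGITS)
        if candidate not in taken:   # the uniqueness check
            taken.add(candidate)
            return candidate


issued = set()
for _ in range(3):
    print(new_random_id(issued))
```

Because the pool of possible numbers is so large, the loop will almost always succeed on its first try; the check exists only to rule out the rare repeat.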

These ID-generating techniques aren’t unique to Google+: At least part of your credit card number is created by hash functions or random number generators. After all, if the number on your Visa corresponded directly to the number of bank customers, it would be easy to guess (and steal) other, existing credit card numbers. All you’d have to do is add one to, or subtract one from, your own. Considering how strongly we guard our credit card information, it should be reassuring that Google+ uses 21 digits, five more than credit card companies do.

* Correction, July 29: This post originally suggested that the user information used to seed ID numbers created with a hash function could be extracted; actually, hash functions are practically irreversible.