How Facebook is Adding an Identity Layer to the Internet
In what may become the next major privacy controversy for Facebook, the company has announced plans to automatically share certain information when a Facebook user visits certain “pre-approved” sites. In clarifying the feature, a spokesperson told VentureBeat that people should “think about Facebook Connect, but the user gets that experience when they arrive at the site rather than after clicking Connect.”
Given the way Facebook has repeatedly described “publicly available information” (PAI) since last fall’s privacy changes, this update is actually a logical next step for the company. Under a strict interpretation of Facebook’s policies, nothing would prevent a site from making use of such information already. Only technological barriers currently block the information flow – specifically, a site doesn’t automatically know who you are on Facebook when you visit.
At least, so it would seem. Researchers have already outlined ways that sites can infer a visitor’s social networking profile from other tracking mechanisms. In some ways, the new Facebook auto-connect simply builds on cookies and inline frames, the sources of earlier online privacy controversies. Furthermore, several security researchers have demonstrated exploits that led to data leakage. Nitesh Dhanjani demonstrated earlier this year that an authentication issue could give sites automatic access to the PAI of visitors, and just this week I reported to Facebook a vulnerability in their Platform that would allow sites to silently harvest all of a user’s profile information (details pending a patch).
Given the amount of data already flowing to Facebook applications and Facebook Connect sites (as well as their advertisers), the company’s moves towards more and more public sharing, and the history of privacy/security problems on the Facebook Platform, I’ve long argued that Facebook users should treat all of their content on the site as public. But Facebook has worked hard to maintain user trust, even making some content appear to be more private than it actually is. When I first discussed accessing public but hidden photo albums last December, I commented, “Making the albums hard to find gives an illusion of privacy and only delays any rude awakenings that may come from users who have inadvertently shared private photos.”
Now it may seem that Facebook users will finally understand the ramifications of default privacy settings. But the new system will probably be fairly subtle at first. Some users will find it creepy to be greeted on other sites by name, but such information will probably appear in a distinct, Facebook-labeled box (i.e., a Facebook Widget) to let a user know where the content comes from and make it still seem somewhat separate from the rest of the site. On the backend, though, the site will have access to the user’s public data.
What users may not realize is how much data they’re already sharing. This new style of Facebook Connect actually mirrors the behavior of Facebook itself. When you visit a Facebook application for the first time, it automatically knows who you are and can access your public data. (Correction: This only occurs in certain circumstances; more information here.) When you then click “Allow” to authorize the app, you give it access to all of your private data. Currently, an external web site knows nothing about you until you click “Connect.” If you do click, it has the same access to your private data as an authorized application. Now, Facebook is letting sites initially act like new applications by giving them access to your public data prior to full authorization.
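To make the tiers described above concrete, here is a minimal sketch in Python. It is not Facebook's implementation or API: the `Relationship` states, the `Profile` class, and `visible_fields` are all hypothetical names invented for illustration, and the model ignores the caveat in the correction above that first-visit access only occurs in certain circumstances.

```python
from dataclasses import dataclass, field
from enum import Enum


class Relationship(Enum):
    """Hypothetical states a site or app can be in relative to a user."""
    UNKNOWN = 0     # external site today: it does not know who you are
    VISITED = 1     # first visit to an app, or a pre-approved auto-connect site
    AUTHORIZED = 2  # the user clicked "Allow" or "Connect"


@dataclass
class Profile:
    """Toy profile split into public and private fields (an assumption)."""
    public: dict = field(default_factory=dict)
    private: dict = field(default_factory=dict)


def visible_fields(profile: Profile, relationship: Relationship) -> dict:
    """Return the fields a site could read under the tiers sketched in the text."""
    if relationship is Relationship.UNKNOWN:
        return {}
    if relationship is Relationship.VISITED:
        # Public data flows before any explicit authorization.
        return dict(profile.public)
    # Full authorization exposes private data as well.
    return {**profile.public, **profile.private}
```

Under this toy model, the change amounts to moving pre-approved external sites from `UNKNOWN` to `VISITED` on arrival, without the user clicking anything.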
In discussing the Facebook Platform, Anil Dash gave this analogy: “Think of the web, of the Internet itself, as water. Proprietary platforms based on the web are ice cubes. They can, for a time, suspend themselves above the web at large. But over time, they only ever melt into the water.” Depending on your perspective, either Facebook is finally melting into the water or the Web turned out to be the ice cube. With an automatic Connect system and the Open Graph API, Facebook is expanding its Platform to the rest of the Web. The only major difference between a Facebook-enabled web site and an actual Facebook application may soon be the URI.
You can start to get a sense of how this expansion may look by reading proposed changes to the service’s governing documents (see Inside Facebook’s excellent analysis):
We may also make information about the location of your computer or access device and your age available to applications and websites in order to help them implement appropriate security measures and control the distribution of age-appropriate content.
Currently, many sites hosting pornographic content will ask visitors to click a link verifying they are at least 18 or 21 before loading the material. With Facebook, the site could simply check your profile information first. Media companies worry about visitors accessing content outside of a given country; perhaps soon they can use your Facebook information to check your location.
Granted, providing fake details on your Facebook profile could easily foil some of these checks, but in many cases, that’s hardly different from lying about your age when you click or using a routing service to mask your location. Also, since you interact with friends on Facebook, you have a greater incentive to keep some information accurate. Facebook also reserves the right to terminate your account if you provide false profile information (despite also suggesting this strategy as a protection against identity theft).
My point is not to suggest that porn sites will soon be on Facebook’s “pre-approved” list or that Hulu would trust your profile over geographic IP data. I simply give these hypothetical scenarios to illustrate a larger trend: for better or for worse, your Facebook profile is becoming a virtual ID card.
Adding an identity layer to the Internet is not a new idea, but this may be the first time a system finds widespread adoption. Yet the Facebook identity model conflicts with many visions of how online identity should operate. “Open Stack” technologies, such as OpenID and OAuth, allow for federated setups. One of the first “Laws of Identity” by Kim Cameron states, “Digital identity systems must only reveal information identifying a user with the user’s consent.” Much of the consent in Facebook’s system comes from accepting the site’s terms at sign-up; many users will likely think that an opt-out Connect model violates Cameron’s principle.
And ultimately, user perception will be key to Facebook finding acceptance of its new endeavor. As social media researcher danah boyd discussed in her SXSW keynote, services with nothing technologically wrong can still disrupt social expectations (e.g. Google Buzz). (I rank the entire talk as must-read material for anyone working in the social networking space, but I’m only focusing on a few points here.) She also made a noteworthy distinction that I think will come up often as Facebook evolves:
Keep in mind that people don’t always make material publicly accessible because they want the world to see it….
Just because something is publicly accessible does not mean that people want it to be publicized. Making something that is public more public is a violation of privacy.
I think this distinction will be severely tested as the availability of Facebook data increases. I don’t dispute boyd’s evaluation, but coming from the perspective of security research, I know that when data becomes publicly available, it’s only a matter of time before it gets publicized in some way. With the wealth of information stored on Facebook’s servers, the site is becoming a favorite of both advertisers and attackers. Already we’ve seen hacks and tricks that make public Facebook data more public (see above), and each new site that integrates with Facebook is a new attack surface.
I’ve been cussed out by visitors to my site who think that by publishing weaknesses in the Facebook Platform or exposing seemingly hidden content I’m assisting those who maliciously hack people’s profiles. But much of what I post attempts to raise awareness of potential privacy and security issues before they get exploited by black hats. I can guarantee you I’m not the only one looking for Facebook weaknesses.
And that’s part of what concerns me about boyd’s distinction. The same technology that makes content “public” makes it easy to aggregate and publicize. For example, Pete Warden recently announced that he had built a dataset of 215 million Facebook profiles that he planned to publish for research purposes. Facebook eventually threatened to sue, prompting him to destroy the data, but no technology stands in the way of someone else recreating the dataset for their own purposes. In fact, with Facebook’s auto-connect system and the possibility of lighter rules for data storage, web sites may soon inadvertently recreate the dataset.
I honestly don’t think that Facebook is evil or that they care nothing about user privacy. Their new identity layer will likely bring benefits to many users and provide sites with valuable features. But just as Facebook became successful through providing users with a more private experience, the Internet became successful in large part because of its anonymity. While many users are happy with their personal Facebook account being a place “where everyone knows your name,” many users also value the rest of the Internet not knowing if they’re a dog. And as danah boyd put it so well, “No matter how many times a privileged straight white male technology executive pronounces the death of privacy, Privacy Is Not Dead.”