How Much Does Facebook Really Know About Its Users?

While recent research cross-referencing survey answers with Facebook activity might suggest that Facebook knows its users better than their friends do, it is also likely that social media identities are carefully crafted for public consumption.


Since social networks have become more driven by user data and advertising, algorithms have taken on chief importance, and Facebook is seemingly always under fire. A personality trait study on Facebook that began in 2007 recently drew to a close, and it has led to questions about whether Facebook could know us better than we know each other.

Researchers from the University of Cambridge and Stanford University surveyed 86,220 participants, and the results of 70,520 of those participants were cross-checked using an algorithm that analysed their Facebook activity.

The study reads:

Using several criteria, we show that computers’ judgments of people’s personalities based on their digital footprints are more accurate and valid than judgments made by their close others or acquaintances (friends, family, spouse, colleagues, etc.). Our findings highlight that people’s personalities can be predicted automatically and without involving human social-cognitive skills.

So if a simple learning algorithm can best close friends and family at personality analysis, there may be some cause for concern. “This gives us a cheap, massive, fake-proof algorithm to judge the personality of millions of people at once,” Michal Kosinski, a computer science professor at Stanford, told Wired.

Obviously, this study is independent of Facebook to some degree, so the network can’t be blamed for legitimate scientific studies that occur among its users. However, Facebook is steeped in algorithmic learning, as we’ve seen from the company’s recent open-sourcing of several of its tools.

While it’s important to consider how much Facebook might know about us, we’ve already passed several milestones that show just how much data the company holds on users. In the face of data mining, users are saying less and seeking out other networks. It’s possible that the natural ebb and flow of social network use could prevent any one network from really knowing too much about its users.

Maybe users are contributing to a spiral of silence by staying on Facebook and other large sites as more of their data is mined, and this could eventually lead to widespread abandonment of the site, but that seems unlikely. The more likely outcome is that users will continue to craft identities on Facebook, and that the machine learning algorithms are only learning how we use social sites, not who we are on a fundamental level.