Two recently released studies have shown that we are as transparent to others while online as we are in person -- perhaps more.
The first, done by H. Andrew Schwartz et al. of the University of Pennsylvania, is called "Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach." In it, the researchers used computer software to analyze 700 million words, phrases, and topic instances from the Facebook status updates of 75,000 volunteers, who also agreed to take a battery of personality tests. The software calculated how often each word appeared in the statuses, and then matched those word frequencies against personality markers. Here's a piece of what they concluded:
Our analyses shed new light on psychosocial processes yielding results that are face valid (e.g., subjects living in high elevations talk about the mountains), tie in with other research (e.g., neurotic people disproportionately use the phrase ‘sick of’ and the word ‘depressed’), suggest new hypotheses (e.g., an active life implies emotional stability), and give detailed insights (males use the possessive ‘my’ when mentioning their ‘wife’ or ‘girlfriend’ more often than females use ‘my’ with ‘husband’ or ‘boyfriend’).
So far, that's interesting but not particularly alarming. What I found more curious, and perhaps troubling, came up when I saw how many times word frequency could be related to other factors -- age, gender, degree of extroversion, and so on. The age breakdown, I thought, was particularly interesting. The 13 to 18 crowd unsurprisingly had "school," "homework," and "tomorrow" as their most common words; the clear winners from 19 to 22 were the various tenses and forms of the word "fuck"; by 23 to 28, there was a shift to "work," "office," "wedding," and "beer." (The absence of swear words in this age bracket is likely to reflect an awareness of how public Facebook is, and not wanting to get fired for posting something inadvisable online.)
Males of all ages have a great many macho words in their statuses, involving video games, movies, and sports. Unsurprisingly, "fuck" makes a reappearance in the male statuses. Women's statuses were almost stereotypically girly -- "shopping," "boyfriend," "love," "yummy," and "my hair" being some of the most common words.
While none of these were particularly surprising, I think this raises two questions -- one of them more serious than the other. The less serious one is whether our online presence is more revealing of who we'd like to be seen as than of who we actually are -- after all, we create these statuses, so the macho masculine statuses and girly feminine ones are just projections, ghosts of real people that we've built and then put on public display.
A more serious concern is how this sort of thing could be turned against us. Now, please don't think that I've suddenly turned conspiracy theorist; I'm not particularly worried that the government is going to start data mining my Facebook looking for some reason to lock me away. But think of the usefulness of this to marketing firms, who are always looking for ways to hook into demographic information so that they can focus their ads better.
If we reveal who we are even by the word choice in our status updates, that certainly is going to be something that advertisers are going to use.
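For the curious, here's a rough idea of what "matching word frequencies with personality markers" could look like in code. This is only a toy sketch, not the researchers' actual software: the function names, the four sample statuses, and the neuroticism scores are all made up for illustration, and I'm assuming a plain Pearson correlation as the matching step.

```python
from collections import Counter
from math import sqrt


def relative_frequencies(text):
    """Return each word's share of all the words in one person's statuses."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words)
    return {word: count / total for word, count in counts.items()}


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def word_trait_correlations(statuses, scores):
    """Correlate every word's relative frequency with a per-person trait score."""
    freqs = [relative_frequencies(text) for text in statuses]
    vocabulary = set().union(*freqs)
    return {
        word: pearson([f.get(word, 0.0) for f in freqs], scores)
        for word in vocabulary
    }


if __name__ == "__main__":
    # Invented data: one blob of statuses and one test score per volunteer.
    statuses = [
        "sick of work sick of rain",
        "love the beach love the sun",
        "so depressed and sick of it",
        "great hike great day",
    ]
    neuroticism = [0.8, 0.2, 0.9, 0.1]
    correlations = word_trait_correlations(statuses, neuroticism)
    for word in ("sick", "depressed", "love"):
        print(word, round(correlations[word], 2))
```

With only four made-up people the numbers mean nothing, of course; the point is just that words used more by high scorers come out with a positive correlation, and words used more by low scorers come out negative.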
The second study used Twitter, and the author, Burr Settles, came up with an algorithm (again based on word use) to sort out "geek" tweets from "nerd" tweets. As Settles sees it:
In my mind, “geek” and “nerd” are related, but capture different dimensions of an intense dedication to a subject:
- geek - An enthusiast of a particular topic or field. Geeks are “collection” oriented, gathering facts and mementos related to their subject of interest. They are obsessed with the newest, coolest, trendiest things that their subject has to offer.
- nerd - A studious intellectual, although again of a particular topic or field. Nerds are “achievement” oriented, and focus their efforts on acquiring knowledge and skill over trivia and memorabilia.
Similar to the study by Schwartz et al., Settles tried to group together words that seemed to indicate something about the demographic that produced them -- resulting in a graph (you can take a look at it on the link posted above) that sorts our tweets out by character.
I see this one as a bit more lighthearted than the first study, but still, it says something very interesting: that we reveal ourselves online every time we post anything, whether we want to or not.
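If you're wondering how you'd even begin to decide which words are "geeky" and which are "nerdy," one common way to measure how strongly two words keep company is pointwise mutual information (PMI). The sketch below assumes that measure and is not a reproduction of Settles's code; the `pmi_scores` function and the five sample tweets are invented for illustration.

```python
from collections import Counter
from math import log2


def pmi_scores(tweets, target):
    """Score each word by its PMI with `target`, counting co-occurrence per tweet.

    PMI(word, target) = log2( P(word and target) / (P(word) * P(target)) ).
    Positive means the word shows up alongside `target` more than chance predicts.
    """
    token_sets = [set(tweet.lower().split()) for tweet in tweets]
    with_target = [tokens for tokens in token_sets if target in tokens]
    if not with_target:
        return {}

    n = len(token_sets)
    word_count = Counter(word for tokens in token_sets for word in tokens)
    co_count = Counter(
        word for tokens in with_target for word in tokens if word != target
    )
    p_target = len(with_target) / n

    scores = {}
    for word, together in co_count.items():
        p_word = word_count[word] / n
        p_joint = together / n
        scores[word] = log2(p_joint / (p_word * p_target))
    return scores


if __name__ == "__main__":
    # Invented tweets, just to have something to count.
    tweets = [
        "geek out over my new comic collection",
        "total geek for vintage toys collection",
        "nerd alert studying physics all night",
        "proud nerd finished my physics thesis",
        "my cat is asleep",
    ]
    geek = pmi_scores(tweets, "geek")
    nerd = pmi_scores(tweets, "nerd")
    print("collection vs geek:", round(geek["collection"], 2))
    print("physics vs nerd:", round(nerd["physics"], 2))
```

Plot each word with its "geek" score on one axis and its "nerd" score on the other, and you'd get something in the spirit of the graph described above, though real results would need vastly more tweets than five.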
Of course, all of this made me go back and check my own status updates and tweets, just to see what I'd inadvertently told the world about myself. Ignoring what most of my social media activity is about -- posting links to cool stuff -- I found, in the last couple of weeks, a status mourning the death of my 16-year-old cat, Puck; a status that described my elation at finding out that my high school creative writing teacher (who, amazingly, hasn't retired yet!) is teaching one of my novels in her English class this year; and a status about how much I enjoyed getting to see Laurie Anderson in concert. As for tweets, I had to go a lot further back to hit one that wasn't just posting a link to something, but I did find one expressing frustration about the teacher evaluation scheme in New York State, and another one about the last day of school that used up a good chunk of the 140-character limit with the word "YIPPEEEEEEEEEEE!!!!"
My guess is that this trend of figuring out our demographic information from our electronic presence is only going to get more sophisticated. Should we be worried? My sense is probably not; just as with the conspiracy theorist's concern that the government is monitoring his whereabouts (and text messages and phone conversations), if they were doing this for everyone it would be such a mammoth amount of data that it would be impossible to manage. At least for now, I think this sort of thing will only be of interest to marketing firms.
And it's not like they haven't already been doing this for years, starting out with targeting advertisements to particular demographics on television (compare the ads on daytime soap operas and the Syfy channel, if you want a particularly good example). If you have a Facebook, check the ads along the sidebar -- no surprise that mine frequently have to do with travel, scuba diving, wine, and pets, is it?
So as long as we think before we post (which we should already be doing), this sort of thing may not make much difference, except in what sorts of things we're encouraged to purchase. At least I hope so. Last thing I want is the government keeping track of what concerts I go to.