Counting on Twitter: Harvard's Web Ecology Project (Part Two)

Last time, I shared with you some of the work being done by Harvard University's Web Ecology Project, specifically focusing on the use of Twitter in the aftermath of the Iran elections and around the death of Michael Jackson. Through qualitative and quantitative research, the team is seeking to develop a better understanding of the flow of ideas through the social networking world and how different participants exert influence on Twitter. My respondent last time was Dharmishta Rood, with whom I worked when I was back at MIT. Today, I am showcasing the research being conducted by three other researchers on the Web Ecology team -- Erhardt Graeff, Tim Hwang and Alex Leavitt. I asked them each to share some of their current research and explain why they think it can contribute to our understanding of the new media environment. For more on the Web Ecology Project, check out Alex Leavitt's recent post on the Convergence Culture Consortium Blog.

Erhardt Graeff: One of our hopes for Web Ecology is a fusion of quantitative and qualitative approaches to studying social media phenomena. Our goal of constructing a scientific framework for tackling quantifiable data online is only possible when we recognize the cultural contexts. In Web Ecology, we see the formalization of these contexts as web ecosystems such as LiveJournal, Facebook, and Twitter.

Inspired by the ethnographic work of a number of researchers, including danah boyd, Mimi Ito, and Keith Hampton on Netville, we are beginning to profile individual social media networks. We call the outputs of our research "Web Ecosystem Profiles". The goal of each profile is to characterize the cultural landscape of a web ecosystem. As you might expect, much of this is done through participant observation.

Of course, the boundaries of each ecosystem are negotiable, as in any study of a community. More importantly, a web ecosystem's state is in constant flux, with users joining and leaving, new features being introduced, and memes propagating through the network. Thus, Web Ecosystem Profiles must be dynamic documents. To guide our work, we rely on a few of the central tenets of Web Ecology, first laid out in Reimagining Internet Studies:

• Interdependence: code and users are part of an inseparable aggregate web phenomenon;

• Boundedness: the web is constrained by various forces and configurations;

• Significance: content on the web retains inherent value.

Here is an abbreviated version of the outline we are currently using to build profiles. The full version is on our wiki (requires registration):

• Introductory Overview of the Ecosystem

    • Common language for discussing components of the ecosystem

    • "Typical" reasons that users register / access the ecosystem

    • Technical affordances / constraints of the ecosystem

    • Requirements of site usage

    • Landmarks of the ecosystem's evolution (e.g. eternal Septembers, jumping the shark)

    • Defined user cohorts

    • Ecosystem-specific lexicon

• Phenomenology of "Typical" Sessions in an Ecosystem

    • Describe general experience of using the site

    • Key use cases

• Possibilities for Quantitative Analysis

    • Introduce available APIs

    • List of atomized site components / activity that could be quantified (e.g. tweets, likes)

    • Documentation of successful and unsuccessful approaches to this ecosystem

The last section is unique to the quantitative research Web Ecology hopes to undertake. On Twitter, this is easy: the platform provides a very open API with decent documentation, and its forms of interaction are easily quantified. For Twitter, a Web Ecosystem Profile is particularly useful for formalizing the documentation of unconventional use cases (see the excellent examples in danah boyd's draft of "Tweet, Tweet, Retweet"). Charting all the different ways users retweet can enable a better quantitative study of retweeting behavior by ensuring that we: 1) catch all of the various forms of retweets and 2) understand what the different forms might signify.
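To make the idea of charting retweet forms concrete, here is a minimal sketch in Python. It is only an illustration: the pattern list, the labels, and the function names are assumptions made for this example, not a catalogue taken from the profile or from boyd's draft, and a real profile would extend the list as new conventions are observed.

```python
import re
from collections import Counter

# Hypothetical patterns for a handful of retweet conventions.
# Each entry pairs a label with a regex that captures the credited user.
RETWEET_FORMS = [
    ("RT", re.compile(r"\bRT:?\s*@(\w+)", re.IGNORECASE)),
    ("via", re.compile(r"\bvia\s+@(\w+)", re.IGNORECASE)),
    ("retweeting", re.compile(r"\bretweet(?:ing)?\s+@(\w+)", re.IGNORECASE)),
    ("HT", re.compile(r"\bHT:?\s*@(\w+)", re.IGNORECASE)),
]


def classify_retweet(text):
    """Return (form, credited_user) for the first matching form, or None."""
    for form, pattern in RETWEET_FORMS:
        match = pattern.search(text)
        if match:
            return form, match.group(1)
    return None


def count_forms(tweets):
    """Tally how often each retweet form appears in a list of tweet texts."""
    counts = Counter()
    for text in tweets:
        result = classify_retweet(text)
        counts[result[0] if result else "not a retweet"] += 1
    return counts


# Made-up sample tweets, for illustration only.
sample = [
    "RT @alice: great post",
    "Interesting read (via @bob)",
    "retweeting @carol: hello",
    "HT @dave for the link",
    "just had lunch",
]
form_counts = count_forms(sample)
```

Keeping the forms in a single labeled list means that a newly documented convention becomes one more entry, and the per-form tally gives a starting point for asking what each form might signify.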

A more straightforward use of a Web Ecosystem Profile is when a social network has not been explored by many researchers. A few weeks ago, fellow Web Ecologist Seth Woodworth started to use the profile framework to document aspects of LibraryThing, which no one else in our community was using at the time. Did you know that the key demonym in the community is a "thingabrarian", or that one unconventional practice is the creation of fakester libraries for popular, dead authors?

Web Ecosystem Profiling is at a very early stage of preparation. But we believe the need for a peer-produceable way to continually document the contexts of social media phenomena is obvious and immediate. Hopefully, a larger community of researchers will be willing to contribute and offer feedback.

Tim Hwang: The Era of Social Media has gifted us with two Big Ironies. First, there's the Big Irony of Business, where extensive practical experience with communities online hasn't successfully translated to the emergence of a science (or even a cluster of useful, concrete, reliable methods) around building vibrant social spaces on the web. Second, there's also the Big Irony of Academia, where massive amounts of data, talent, and research on the dynamics of social networks fail to inform the day-to-day practice of businesses (or, indeed, the popular discourse).

In both worlds, the irony is the same: we do in some sense have the key information right in front of us (either in terms of practical experience or reams of qualitative and quantitative research), but a notable lack of ability to convert it into specific, actionable knowledge.

Indeed, this has led us to kind of a sorry state, where good people -- some seriously sharp, brilliant people -- can spend hours talking about the really beautiful research about the social nature of the web. But when the key questions come down the pipe ("So what can I do to foster a community?" "So what factors are responsible for promoting the propagation of culture?"), most folks are reduced to wandering generalities and the mantra-like suggestion that the person in question should really consider starting a Twitter account. Where we should be sifting through the available data and offering specific ideas, we've largely only got vague philosophies and anecdotes. Depressingly, the Emperor has no clothes. At the point we're sitting, he's not even really the Emperor, either.

And perhaps most scarily, there's a kind of superstition I feel that's starting to circle around the research, a suspicion that the whole idea of digging deep with data and getting scientific with our prescriptions is, in fact, a largely misguided idea. Social media expert Chris Brogan recently wrote about the quantitative side of things:

I'm writing this from a conference full of researches [sic]. They are all talking passionately about numbers, and I get this. I understand that they're passionate about exacting a science out of the crazy data of human passion. And yet, part of me thinks that numbers often serve us as little life rafts. [...]

We cling to numbers. In business, we use numbers as our primary gauges. But in relationships, we don't. Right? Do you count who hosted the holiday party and do you measure just how delicious the meal was on a chart? (If you do, I take it you like sleeping on couches.)

And he's dead on. But about the wrong point. It's true: you are in fact a serious jerkface if you behave in the robotic way he's talking about. But we probably wouldn't, for example, blame the host for meticulously keeping track of what people liked and didn't like -- and using it to plan the menu for the next holiday party. This is a simple way of saying that rigorous exploration isn't bad when it improves our results in a real way. And so, the responsibility for the flaw in Chris' voiced skepticism doesn't fall on him at all. I think it's a natural response to the failure of the research to actually step up to the plate and deliver some implementable knowledge beyond the generalities. If all of our experience and hard data can't come to anything practical, it's easy to believe that it might not be a worthwhile approach to rely on.

So how do we finally step up to the plate? And, before we get to that: how did we get here?

Largely, I'm willing to argue that the Big Ironies have emerged because there's no good space where people can playtest, experiment, and rapidly iterate on a variety of strategies, particularly where influencing the social space online is concerned. There's no good place to measure success, or even compare various approaches against one another to assess their usefulness. There's no way to prove that your methods and data mining can actually produce repeated success. Without that kind of lab, it's tough to take insights from both the research and business world, and try them again and again. Without trying and trying again, we never get to know how information might actually be transformed into useful, applied knowledge.

One of the big projects of the web ecology community has been to see if there's a way of providing that exact environment. Specifically, we've been talking about the concept of competitive games, and the fact that they provide the ideal social structure that we're looking for. Games create repeatable scenarios, allowing us to identify and test a given situation over and over again. Competitive games require measurable goals, and a structured way of assessing success. Finally (and, perhaps best of all), games are good experimental zones, places to try out tactics and strategies at low stakes.

Add the involvement of real people and social structures to take it out of an abstracted lab scenario, and you've gotten to an experiment that we're starting to undertake, something we call social wargaming.

The general premise is simple: beginning with a "battlefield" population of users (who are unaware that a game is going on -- indeed, revealing the existence of the game is against the rules of the game), teams compete to effect specific changes in their behavior. This goes from as simple as getting a social network to pass around a piece of content, to as complex as attempting to bridge the structural gaps between two unconnected clusters of users. We're starting out with single platforms, but the eventual idea is to level up to testing the ability of teams to create certain effects across various networks, and in the social ecosystem of the web as a whole.

The open, implicit challenge is equally simple, though perhaps provocative to the point of being considered trolling: if you're really so good at understanding what culture and community online is all about, if you're really so good at "engaging communities" and being a "trust agent," why not put the money where the mouth is and see if you can't straight up just do it?

The first iteration of this game, entitled "Triangles," builds around this premise. Essentially, teams are given a "terrain" of contested target users to study on Twitter that are connected in some way. The competition is for them to start fresh with an "ego account," which will compete with other groups to create as many tightly linked triangles of connection between their account and two other target users in a short period of time. Over a series of games, we can also change up the terrain and rules to ask other questions -- what tactics work best when trying to build new connections in an already tightly interconnected social group? Can robots achieve the same results as humans in fostering certain types of behavior?
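One way a round of "Triangles" might be scored is sketched below in Python. The scoring details are assumptions made for illustration, not the official rules: a "connection" is taken to mean a reciprocated follow, and a triangle is counted whenever the ego account and two target users are all pairwise connected. The function names and the sample data are hypothetical.

```python
from itertools import combinations


def mutual_ties(follows):
    """Build undirected ties from (follower, followed) pairs.

    Only reciprocated follows are kept (an assumed reading of "tightly linked").
    """
    directed = set(follows)
    ties = {}
    for a, b in directed:
        if a != b and (b, a) in directed:
            ties.setdefault(a, set()).add(b)
            ties.setdefault(b, set()).add(a)
    return ties


def count_triangles(ego, targets, follows):
    """Count triangles made of the ego account and two connected target users."""
    ties = mutual_ties(follows)
    # Target users the ego account is mutually connected with.
    neighbors = ties.get(ego, set()) & set(targets)
    return sum(
        1
        for a, b in combinations(sorted(neighbors), 2)
        if b in ties.get(a, set())
    )


# Made-up follow graph: ego is mutually tied to a, b, and c,
# but only a and b follow each other (c follows a without reciprocation).
follows = [
    ("ego", "a"), ("a", "ego"),
    ("ego", "b"), ("b", "ego"),
    ("a", "b"), ("b", "a"),
    ("ego", "c"), ("c", "ego"),
    ("c", "a"),
]
score = count_triangles("ego", {"a", "b", "c"}, follows)  # 1 triangle: ego-a-b
```

Changing the definition of a tie (for example, counting one-way follows or replies) only requires swapping out `mutual_ties`, which is the kind of rule variation a series of games could explore.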

The rules in more depth are available here (Social_Wargaming_Triangles.txt), and we're actively looking for participants who want to play a role in this. First round begins November 20th, and will be running during the first week of December. Definitely drop an e-mail to, if you'd like to be involved. And, with any luck, the outcome of this gaming will be something quite different from mere entertainment: experiments towards forging an applied science of cultural and community spaces online.

Alex Leavitt: A primary goal of the Web Ecology Project is to analyze how the relationship between social networking platforms and their users affects and is affected by the cultural practices of online communication and community building. To approach this goal, we strove to establish a set of first principles for the Web on which to base our future research. Our analyses of influence on the Web usually started with these first principles. For example, the smallest units of communication might be a page view or a click. Using these measurements, how could we make declarative statements about how people interacted in mediated spaces like Twitter (which structure communication based on how the programmers design the platform)?

However, designing first principles proved a bit difficult, and when I wrote "The Influentials" I realized that we would have to shape sets of "elementary particles" (like chemical atoms and molecules) for each system. Basically, because each platform controls the possible modes of communication, first principles for Facebook are inherently different from those of Google Reader, for example. For Twitter, the platform analyzed in "The Influentials," these elements begin with the ordinary tweet, out of which we see related particles, like replies, retweets, and mentions.

For the elements on Twitter, I established an operational definition of influence (meaning that our analysis is ultimately separated from any theories of influence previously researched in academic circles). Tweets became actions on which replies, retweets, and mentions were enacted. Thus, we organized our arguments around influence as a property of those messages that sustain a large number of responses.

The focus on response is key to our results. The Web Ecology Project has attempted to respond to extremely generalized analyses of social media phenomena, particularly with large amounts of quantitative evidence to support our claims. In "The Influentials," we wanted to criticize those analyses of influence that had primarily focused on follower counts, which of course are important; however, if a user has 10,000 followers and none of them respond to the user, then can we claim that this user is influential? Of course, we couldn't ignore follower counts, so we included equations and calculated graphs that accounted for both responses and numbers of followers, to weight users that had smaller follower networks.
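The contrast between follower counts and responses can be illustrated with a small Python sketch. To be clear, these are not the equations from "The Influentials"; the two ratios below (responses per tweet and responses per thousand followers) are assumed stand-ins chosen for illustration, and the class name, field names, and numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UserActivity:
    """Counts for one user over an observation window (hypothetical schema)."""
    name: str
    followers: int
    tweets: int
    replies: int
    retweets: int
    mentions: int

    @property
    def responses(self):
        # Responses are the "particles" enacted on tweets.
        return self.replies + self.retweets + self.mentions


def responses_per_tweet(user):
    """Average number of responses each tweet sustains."""
    return user.responses / user.tweets if user.tweets else 0.0


def responses_per_thousand_followers(user):
    """Responses normalized by audience size, so small networks are not buried."""
    return 1000 * user.responses / user.followers if user.followers else 0.0


# Made-up numbers: a large but silent audience versus a small, responsive one.
big = UserActivity("big", followers=10_000, tweets=50, replies=0, retweets=0, mentions=0)
small = UserActivity("small", followers=500, tweets=50, replies=30, retweets=15, mentions=5)

ranked = sorted([big, small], key=responses_per_thousand_followers, reverse=True)
```

Under either ratio, the account with 10,000 unresponsive followers scores zero, while the smaller account scores 1.0 responses per tweet and 100.0 responses per thousand followers, which is the intuition behind ranking on response rather than audience size alone.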

Probably the more interesting aspect of our initial analysis of influence on Twitter lay in our categorization of the cultural practices that underlie these interactions between popular users on Twitter and their followers. We split the ten users into three groups: celebrities, news outlets, and social media analysts. For the most part, the trends show that the members of these groups act fairly similarly (with discrepancies, of course, usually based on the number of followers).

The under-appreciated piece of our research ended up being our visualizations. We generated a colorful graph that illustrates the density of tweets and responses for each user in our report. It's intriguing to analyze our statistics visually, because you can occasionally pick out exceptional instances of response explosions. Although in our visualization our code could not parse out which responses corresponded to which original tweet, we can suppose that most of the wild groups of responses that follow occasional tweets are immediate responses that eventually ebb away.

To move beyond this initial, basic analysis of influence on Twitter, we would like to look closer at the networks of followers behind these mega-users. Looking at hypothetical extremes hints at the problems we might foresee in future research: if a user has a follower network that responds at an ordinary rate, but each of those users has an extremely active responding network (i.e., the original user's secondary follower network), then that certainly affects how we might provide ratings or levels of influence for specific users.

Erhardt Graeff is a Lead Researcher and Developer for the Web Ecology Project, and also a social scientist and entrepreneur with an MPhil in Modern Society and Global Transformations from Cambridge University and a couple of bachelor's degrees from RIT. In addition to researching social media, he has studied rural internet use and social capital, digital divides, e-government, networked public spheres, and new media literacy. Beyond the Web Ecology Project, Erhardt is the Director of Technology and Strategy for BetterGrads, a startup aimed at preparing high school students for college life, and is a research assistant at the Berkman Center for Internet & Society at Harvard, studying OER and the political economy of the textbook industry.

Tim Hwang is the Director of the Web Ecology Project and an analyst with The Barbarian Group -- where he works on issues of group dynamics and web influence. He is interested in building a science around measuring the system-wide flows of content and patterns of community formation online. He is also the founder of ROFLCon, a series of conferences celebrating and examining internet culture and celebrity. He currently Twitters @timhwang, blogs at BrosephStalin, and is in the process of watching every homemade flamethrower video on YouTube.

Alex Leavitt is a Lead Researcher for the Web Ecology Project. His interests include geographical, linguistic, and transnational subcultures; the hybridization of popular culture and online humor; and the emergent cultural practices of (un)controlled online social networks. Alex also works as a research specialist with the Convergence Culture Consortium in the Comparative Media Studies department at MIT, and has previously worked with the Digital Natives Project at the Berkman Center for Internet & Society (Harvard Law School). In addition to his weekly articles on the Convergence Culture blog, Alex writes long-form about Japanese popular culture at The Department of Alchemy and short-form on Twitter (@alexleavitt).