Zeynep Tufekci's recent paper in First Monday, "Engineering the public: Big Data, surveillance and computational politics," focuses our attention on the shifting power dynamics of data-driven targeting of political influence and the expected effects on society. The term "computational politics" itself seems to have been used here and there for at least 10 years, previously referring more narrowly to the computational study of elections and electoral systems, but in her definition it expands to:
"applying computational methods to large datasets derived from online and off-line data sources for conducting outreach, persuasion and mobilization in the service of electing, furthering or opposing a candidate, a policy or legislation."
Her arguments (summarized weakly here to be sure) focus on how big data "can undermine the civic experience," concluding that the conditions she observes lead to a fractured public sphere replaced by privatized targeting of attempts to influence, that this activity is an important result of an implied but just-invisible surveillance state, and that all of this "favors incumbents who already have troves of data," including the already-wealthy, the new platform providers, and those with the means to acquire and apply data at scale to these same ends.
With my limited training, knowledge, and experience I won't presume to offer a substantial critique beyond saying that these are compelling arguments, especially in light of the decades-long trends noted along the way. I encourage you to read it for yourself and to dig into the notes and references, which taken together might, as do many of the best works shared freely online, form an ideal sort of self-contained open online course.
The only minor quibble I'll mention is that the paper's explicit expectations of how big data affordances will continue to be used are to a degree undercut by an implied expectation that those empowered to use them will themselves act rationally. The paper describes how our understanding of the behavioral sciences has moved beyond the rational actor model to incorporate a more nuanced sense of individual and collective irrationality, which may be modeled and targeted accurately at the individual level. In reading that section, I got the sense that it assumes those wielding this power will each apply it uniformly toward singular or related objectives, such as swaying an election toward a candidate, party, or one side of a set of issues. Although this no doubt is how things work in political campaigns, I am less confident that the major platform providers of our day are likely to have such tunnel vision. If you own the platform, would you use it to shift your user base toward a consistent set of political aims at the macro scale (everyone), or, rather, would you engineer the platform to profit off all efforts to do so, at all scales? It seems to me that for-profit platform providers like Facebook and Twitter have an overwhelming motive to profit off empowering the peddling of influence and the manufacturing of consent for every micro-community that finds a voice and an audience, especially if those communities clash with each other in the spaces themselves, creating more traffic, sustaining attention, and charging emotions further. All of this activity enhances the ability of those who can afford to acquire and model all of the data to target their users, and the churn of charged messages itself obscures how only the already-rich and already-powerful can apply the most powerful models over all the data for their own aims.
Because we don't want to know (as is pointed out in the paper) that we are being manipulated, we prefer our platforms to provide at least a meaningful illusion of neutrality. What better way to provide this illusion than to allow the smaller and less rich among us to believe the platform gives us our own insights into our own more modest target audiences? There is a line beyond which the utility of a small organization's or movement's ability to tie a handful of social media accounts into an inexpensive hosted CRM to target its own members is surpassed by the cost or risk to that organization, and to thousands of others like it, when they stop believing that the platform serves their modest interests rather than those of the platform provider itself. The minute Facebook or Twitter is believed to be taking political sides, it will die like so many other platforms before it. If anything, this reinforces the paper's argument that the opacity and proprietary nature of how these platforms might choose to inject influence should be of greater concern.
I wrote some years ago that user-generated content is a reverse supply chain for information, but in that piece I considered mainly profit, rather naively. It is not lost on me how several cases described in the paper were occurring back in 2007 when I was writing my column, and how small whatever insight I might have been able to share back then seems now. In a surveillance state, my own activities define how my own future actions will be swayed, imperceptibly but precisely so and just for me, by the environments in which I choose to take those activities.
In the months following the events of 9/11 I spent a lot of my transit time on the train between New Haven and Cambridge reading a report from MITRE or RAND on the shift in paradigms from hierarchical to network and peer-to-peer command and control systems (my apologies, I have searched several times in the years since but have not found that piece again and cannot provide a reference). If I recall correctly, the main thesis of this report was that our surveillance and military systems were not up to the challenge, that our 20th-century infrastructure for intelligence gathering would not enable us to study and fight a distributed 21st-century enemy successfully, and that we needed to shift rapidly to adapt to the network model. Although we can readily question how successfully this objective has been achieved militarily, it is startling to consider how successfully every aspect of our computed lives embodies this achievement.
One last thought, more grounded in a field I do actually know something about. There are also implications here for the provision of what we blithely refer to as open data. We could argue that by giving data away in a raw form, we are empowering anyone to make what they will of it. This seems, at first, to be a positive result all around. Who doesn't want to be empowered? But the skill, experience, manpower, and computing power needed to bring diverse data together meaningfully and at scale are, unfortunately, not readily available to most of us as individuals, and this paper hammers that message home. Even if I fancy myself a competent programmer and a budding data scientist, even if I have inexpensive at-will computing power at my disposal, how can I afford to make my own targeted models of influence and persuasion and apply them meaningfully?
How can I afford not to?
I don't have answers, but it seems clear that there is a huge gap to fill in service of those who wish to study data meaningfully and at scale, and that more of us would do well to step in and fill it.