Question: What role, then, is the Internet playing in Russian media?
Answer: Elena Vartanova (Moscow State University Journalism Faculty): It really is a new part of our media system. People are increasingly consuming online news, and online news often takes the first step in agenda-setting. Only then do consumers get more analysis and commentary from print sources.
One of the functions of online media is creating an alternative news agenda. If you watch big television channels you see distilled content, which is double-checked by company managers, by people in power; you won’t find problematic material. The alternative agenda on the Internet is helping Russians see pitfalls and problems. And the Internet has become a tool for people to create public opinion, to support the “man on the street.” In Russia, when mainstream media says something, you should double-check on the Internet. It provides a different point of view.
In the above quote, Elena Vartanova echoes two key research questions we have for Russian Media Cloud:
1. Do blogs and other online media provide an alternative public sphere?
2. What role do they play in setting the news agenda?
To begin to answer these questions, we have built on the hard work by Ethan Zuckerman, Hal Roberts, David Larochelle, Yochi Benkler and Zoe Fraade-Blanar on English Media Cloud, which collects data on different sets of English language blogs and popular traditional media available online (mostly newspapers). For the Russia effort we have an even larger and more varied set of feeds, including:
1. 1000 popular Russian blogs: The Yandex Top 1000 list
2. Over 11,000 Russian language blogs divided into link-based attentive clusters, based on the results of our previous Russian blog research
3. 1000 random, or long tail, blogs based on our own spider of the Russian blogosphere
4. Top 25 ‘mainstream media’: This is currently the Google Ad Planner list of the top 25 most popular news Web sites in Russia, which we filtered to remove any sites that are not news related or not primarily about Russia (*See list at bottom of this post)
5. Russian TV news feeds from: Channel 1, Vesti, REN TV, TV Tsentra, NTV, Channel 5, Mir, Zvezda, and TV Stolitsa
6. Russian government Web sites: President Medvedev’s official site, Putin’s official site, the Russian government portal government.ru, and sites of the Ministry of Emergency Situations, Ministry of Justice, Ministry of Defense, and the Ministry of Foreign Affairs
Using the same method as Ethan describes in his blog post on calculating cosine similarity among sources and sets of sources, we are able to draw a visual map that shows how similar these different sets of feeds are to one another, based on content (as opposed to links). What this method allows us to do, and what we have done with all of the examples below, is compare the similarity of bags of words in different media sets. Media Cloud outputs alone do not say anything about the meaning behind the differences between sources. However, with additional context about what we know of the political situation and media ownership in Russia, as well as qualitative analysis of sentences within queries, we can begin to hypothesize about the possible meaning behind similarity scores, word clouds, polar maps and other automated outputs.
As Ethan writes about cosine similarity:
This is a technique computer scientists use to detect a type of similarity between documents. Basically, a computer program counts the appearances of words in a document (in this case, a week’s worth of media coverage by 25 outlets) and compares that frequency list to that of another document. If those documents are identical in word frequency – both mention Obama 23 times, Libya 5 times and basketball twice – they score a 1. If they’ve got no words in common, they score a zero.
(The actual math behind this is wonderfully cool, if slightly mind-bending. Imagine a set of documents with only two words in them – “Obama” and “NCAA”. In source A, Obama is mentioned 8 times, NCAA 2 times. Put a point on a graph at (8,2) – Obama’s our X axis, NCAA our Y axis, and draw a line that passes through 0,0 and 8,2 – that’s the vector that represents set A. In source B, Obama gets mentioned twice, NCAA 8 times – put the point at 2,8 and draw the vector for source B. The angle between vectors A and B is a measure of how similar the sets are, and taking the cosine of that angle is a simple way to scale the value to be between 0 and 1 for angles between 0 and 90 degrees. The trick, of course, is that documents contain words other than Obama and NCAA, and cosine similarity adds a new dimension to our graph for each new term. So the vectors we’re measuring when we compare all the words in 25 media sources over a week to another comparable week exist in 3000-dimensional space. Don’t bother imagining 3000-dimensional space – it will make your head hurt. Just imagine three dimensional space and think about two vectors that each emerge from 0,0,0 and each pass through an arbitrary point in positive x,y,z space – it’s easy enough to imagine measuring the angle between those two vectors. Then take it on faith that, mathematically, you can do the same thing in many-dimensional space.)
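For readers who would rather see Ethan's example as code, here is a minimal Python sketch of cosine similarity on word counts. It is an illustration only, not the actual Media Cloud implementation: the function name cosine_similarity and the dictionaries source_a and source_b are our own hypothetical names, and the counts are the ones from the Obama/NCAA example above.

```python
import math


def cosine_similarity(counts_a, counts_b):
    """Cosine of the angle between two word-count vectors.

    Each argument maps a word to the number of times it appears.
    Hypothetical helper for illustration, not Media Cloud's own code.
    """
    # Only words present in both sources contribute to the dot product.
    shared_words = set(counts_a) & set(counts_b)
    dot = sum(counts_a[w] * counts_b[w] for w in shared_words)

    # Length (Euclidean norm) of each vector.
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))

    # An empty source has no direction, so treat it as sharing nothing.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# The two-word example from the quote: points (8, 2) and (2, 8).
source_a = {"obama": 8, "ncaa": 2}
source_b = {"obama": 2, "ncaa": 8}

similarity = cosine_similarity(source_a, source_b)
angle = math.degrees(math.acos(similarity))
print(f"similarity: {similarity:.2f}, angle: {angle:.0f} degrees")
```

For these two sources the dot product is 8×2 + 2×8 = 32 and each vector has squared length 68, so the similarity is 32/68, or roughly 0.47 (an angle of about 62 degrees). Adding more words simply adds more keys to each dictionary, which is the "many-dimensional space" Ethan describes.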
Popular Blogs Compared to the Government and Traditional Media
As a first test of whether blogs differ from Russian traditional media and government information channels, the first polar map compares the Yandex Top 1000 popular blogs with Russian government feeds, TV news transcripts, and the top 25 MSM over the period from December 15, 2010 to February 21, 2011. The center node, or pole around which the map is drawn, is the collective content of Russian government feeds over that same time period. The farther a source is from the black dot in the center, the more different it is from Russian government feeds. Although the map is fairly overwhelming at first glance because of the large number of blogs, most blogs sit near the outer ring, while the government, MSM and TV sources sit closer to the center, showing that the media are more similar to the government than most blogs are. This is probably due at least in part to the fact that Russian popular blogs are not focused exclusively on politics, which we see from the content clustering (color) process.