Mapping Iran Online Big Data dmi summer school, 8 July 2011
Mapping Iran Online: Questions
Has repression suppressed voice?
Has the health of the Iranian Web (its content and its users) worsened?
Are we able to detect changes in the degrees of expression online (over time data)?
Hand out individual login-sheets
Mapping Iran Online
“Given the repressive media environment in Iran today, blogs represent the most open public communications platform for political discourse. The peer-to-peer architecture of the blogosphere is more resistant to capture or control by the state than the older, hub and spoke architecture of the mass media model, and if Yochai Benkler’s theory about the networked public sphere is correct in relation to blogs, then the most salient political and social issues for Iranians will find expression and some manner of synthesis in the Iranian blogosphere. Future research could address whether or not this is true.”
John Kelly and Bruce Etling (2008). “Mapping Iran’s Online Public: Politics and Culture in the Persian Blogosphere,” Berkman Center Research Publication No. 2008-01, p. 24.
Mapping Iran Online: Two-step procedure
1) Demarcate Iranian Web
2) Query / Analyze it
Mapping Iran Online: Two-step procedure
1) Demarcate Iranian Web
   a) Google Ad Planner (hit economy)
   b) Alexa (surfers' paths)
   c) Balatarin/Donbaleh/Sabzlink (crowd-sourced)
   d) Likekhor (like economy)
   e) Google Web search (link economy) (& regions search of .com, .net, .org etc.)
2) Query / Analyze it
Mapping Iran Online: Two-step procedure
1) Demarcate Iranian Web
2) Query / Analyze it
   a) TLDs
   b) Language
   c) Site Type
   d) Blocked
   e) Broken (does not resolve)
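As an illustration of the TLD step, the tally could be sketched roughly as follows. This is a minimal sketch, not the tooling used in the project: the function names `tld_of` and `tld_counts` and the sample URLs are hypothetical, and the "TLD" is simply taken to be the last dot-separated label of the hostname.

```python
from collections import Counter
from urllib.parse import urlsplit


def tld_of(url):
    """Return the last dot-separated label of the URL's hostname ('' if none)."""
    # urlsplit only recognises the host when a scheme or '//' is present,
    # so bare entries such as 'example.org/page' get a '//' prefix first.
    if "//" not in url:
        url = "//" + url
    host = urlsplit(url).hostname or ""
    return host.rsplit(".", 1)[-1] if "." in host else ""


def tld_counts(urls):
    """Count how many URLs fall under each TLD."""
    return Counter(tld_of(u) for u in urls)


# Hypothetical sample; a real run would read one of the demarcated URL lists.
sample = [
    "http://example.ir/",
    "https://blog.example.com/post",
    "example.org/page",
    "http://news.example.ir/a",
]
counts = tld_counts(sample)
```

Calling `counts.most_common()` lists the TLDs from most to least frequent. Hosts given as bare IP addresses would be miscounted by this rule and would need separate handling.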
Mapping Iran Online: Findings
Demarcated Iranian Webs = 11,000 URLs total
   a) Google Ad Planner (1500 URLs)
   b) Alexa (500)
   c) Donbaleh/Sabzlink (2700)
   d) Likekhor (2600)
   e) Google Web search (4300)
Only one URL appears in all collections (Webs) -
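The overlap between collections can be checked with a plain set intersection. The sketch below is an assumption about how such a comparison might be done, not the project's actual method: the collection contents are made-up placeholders, and comparing by lowercase hostname with a leading "www." removed is an assumed normalization rule.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to its lowercase host without a leading 'www.' (assumed rule)."""
    if "//" not in url:
        url = "//" + url
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def in_all(collections):
    """Return the hosts that appear in every collection."""
    sets = [{normalize(u) for u in urls} for urls in collections.values()]
    return set.intersection(*sets) if sets else set()


# Placeholder data: three tiny stand-ins for the demarcated Webs.
webs = {
    "ad_planner": ["http://www.example.ir/", "http://a.example.com/"],
    "alexa": ["example.ir", "http://b.example.org/"],
    "likekhor": ["https://example.ir/blog", "http://a.example.com/x"],
}
shared = in_all(webs)
```

With the placeholder data, `shared` contains only `example.ir`. A stricter comparison (full URL rather than host) would shrink the overlap further, so the normalization rule should be stated alongside any reported figure.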
Google Web
Likekhor (blogosphere)
Google Ad Planner
Broken web?
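One way to test for "broken" in the sense used here (the host does not resolve) is a DNS lookup per hostname. The following is a minimal sketch under that assumption; `resolves` and `broken_hosts` are hypothetical names, and the resolver is passed in as a parameter so the logic can be tried without network access.

```python
import socket


def resolves(host, resolver=socket.getaddrinfo):
    """Return True if the resolver finds an address for host, False otherwise."""
    try:
        resolver(host, None)
        return True
    except (socket.gaierror, UnicodeError):
        # gaierror: the name lookup failed; UnicodeError: the hostname
        # could not be encoded for lookup at all.
        return False


def broken_hosts(hosts, resolver=socket.getaddrinfo):
    """Return the hosts that do not resolve, preserving input order."""
    return [h for h in hosts if not resolves(h, resolver)]
```

A call such as `broken_hosts(["example.com", "no-such-host.invalid"])` performs live lookups with the default resolver. A failed lookup at one moment does not prove a site is permanently gone, so repeated checks over time would be needed before labelling a host as broken.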
Mapping Iran Online: To do
1) Demarcate Iranian Web: x
2) Query / Analyze it
   a) TLDs: x
   b) Language: x
   c) Site Type: to do
   d) Blocked: to do
   e) Broken (does not resolve): x
Mapping Iran Online: Issues
1) Big data analysis requires access to corporate data sets and corporate labs
2) Alternatives to access are time-consuming and incomplete: Google Scraper repeatedly blocked; Facebook API produces results
3) Big Data, Small Data