Social Networking sites and Indian caste system

Social Networking sites and Indian caste system
Bipin Shetty, Santosh Kalyankrishnan

Project Thesis
In this project, we will analyze information gathered from social networks to determine whether caste bias exists and, if so, what its nature is. We look at how Orkut users form friend links to find out whether there is a preference with respect to caste, and to what extent. We have been given a large amount of Orkut data (e.g., names and friend links), which we will mine for various metrics. Based on these metrics, we hope to draw conclusions about the degree of bias that exists.

Milestones completed
We identified 1136 last names along with their associated caste, religion, and language. We stored this information in XML using the tags <casteName>, <lastname>, <religion>, and <parentCaste>, and generated a MySQL script to insert the data into a database. Some last names are identified by their suffix (e.g., *reddy), which is stored in a <regex> tag. We are in the process of writing MySQL scripts that parse the last names in our data, compare them with our last-name listing, and identify the caste of each individual. We will then insert the results into tables that hold the user profiles and the caste listing.
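The lookup described above can be sketched in Python. This is only an illustration under stated assumptions, not the scripts used in the project: the XML layout (a <castes> root with <caste> entries), the placeholder names, and the functions load_rules and identify_caste are all hypothetical, and the *reddy suffix is treated as the regular expression .*reddy.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical XML layout; only the tag names casteName, lastname,
# religion, parentCaste, and regex come from the slides.
SAMPLE_XML = """
<castes>
  <caste>
    <casteName>ExampleCasteA</casteName>
    <lastname>alpha</lastname>
    <religion>ExampleReligion</religion>
    <parentCaste>ExampleParent</parentCaste>
  </caste>
  <caste>
    <casteName>ExampleCasteB</casteName>
    <regex>.*reddy</regex>
    <religion>ExampleReligion</religion>
    <parentCaste>ExampleParent</parentCaste>
  </caste>
</castes>
"""


def load_rules(xml_text):
    """Return (exact-name dict, list of (compiled pattern, caste)) from the XML."""
    root = ET.fromstring(xml_text.strip())
    exact = {}
    patterns = []
    for entry in root.findall("caste"):
        caste = entry.findtext("casteName")
        for name in entry.findall("lastname"):
            exact[name.text.strip().lower()] = caste
        for rx in entry.findall("regex"):
            patterns.append((re.compile(rx.text.strip(), re.IGNORECASE), caste))
    return exact, patterns


def identify_caste(full_name, exact, patterns):
    """Return the caste for a profile name, or None if it cannot be identified."""
    parts = full_name.strip().split()
    if len(parts) < 2:
        return None  # no last name available
    last = parts[-1].lower()
    if last in exact:
        return exact[last]
    for pattern, caste in patterns:
        if pattern.fullmatch(last):
            return caste
    return None


EXACT, PATTERNS = load_rules(SAMPLE_XML)
```

Exact last names are checked before the suffix patterns, so a specific entry takes precedence over a broader pattern. Profiles with a single-word name or an unlisted last name return None.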

Work still to be done
Maximize the number of user profiles in the Orkut profile list whose caste we can identify; we hope to identify the caste of about 20 to 30% of profiles. Calculate the bias of user profiles toward caste, religion, and language relative to the population distribution of each caste, religion, and language in the region. We then have to run appropriate correlation and dependence algorithms to measure the degree of bias for each user.
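The slides do not specify the bias metric. One simple possibility, shown here purely as an assumption, compares the share of a user's friends who belong to the user's own caste with that caste's share of the regional population. The function names and the sample numbers below are hypothetical.

```python
def same_caste_share(user_caste, friend_castes):
    """Fraction of friends with an identified caste who share the user's caste."""
    known = [c for c in friend_castes if c is not None]
    if not known:
        return None  # no friend could be identified
    return sum(1 for c in known if c == user_caste) / len(known)


def bias_ratio(user_caste, friend_castes, population_share):
    """Observed same-caste share divided by the caste's population share.

    A value near 1 suggests no preference; values above 1 suggest the user
    has more same-caste friends than the population share alone predicts.
    """
    observed = same_caste_share(user_caste, friend_castes)
    expected = population_share.get(user_caste)
    if observed is None or not expected:
        return None
    return observed / expected


# Made-up example: 3 of 4 identified friends share caste "A",
# which makes up 25% of the (hypothetical) regional population.
example = bias_ratio("A", ["A", "A", "B", None, "A"], {"A": 0.25, "B": 0.75})
```

Friends whose caste could not be identified are left out of the denominator, so a low identification rate makes the ratio noisier without pushing it toward zero.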

Tradeoffs and bottlenecks
The names of many Orkut users were not crawled, so we will not be able to identify their caste. Some Orkut users have no last name, and many last names do not map to a caste. These limitations can lead to an inaccurate calculation of the bias shift. The population distribution of castes within a specific region is difficult to gather.

Any Questions?