Published by Kelly Cummings. Modified over 7 years ago.
1
Social Networking Sites and the Indian Caste System
Bipin Shetty Santosh Kalyankrishnan
2
Project Thesis
In this project, we analyze information gathered from social networks to determine whether caste-based bias exists and, if so, its nature. We look at how Orkut users form friend links to find out whether there is a preference with respect to caste, and to what extent. We have been provided with a large amount of Orkut data (e.g., names and friend links), which we will mine for information and metrics. Based on these metrics, we hope to draw conclusions about the degree of bias that exists.
3
Milestones completed
We identified 1136 last names and their associated caste, religion, and language. We stored this information in XML using the tags <casteName>, <lastname>, <religion>, and <parentCaste>, and generated a MySQL script to insert the data into a database. Some last names are identified by their suffixes (e.g., *reddy), which are stored in <regex>. We are in the process of generating MySQL scripts that parse the last names in our data, compare them with our last-name listing, and identify the caste of each individual. We will then insert that data into tables that hold user profiles and the caste listing.
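The lookup described on this slide can be sketched in Python rather than in SQL. This is only an illustration, not the project's actual scripts: the element names come from the slide, but the nesting under `<castes>`/`<entry>`, the placeholder values (`CasteA`, `ReligionX`, and so on), and the helper names `load_rules` and `identify` are assumptions.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample. Tag names are taken from the slide; the nesting and
# all values are placeholders invented for this sketch.
SAMPLE_XML = """
<castes>
  <entry>
    <casteName>CasteA</casteName>
    <lastname>examplename</lastname>
    <religion>ReligionX</religion>
    <parentCaste>ParentA</parentCaste>
  </entry>
  <entry>
    <casteName>CasteB</casteName>
    <regex>.*reddy</regex>
    <religion>ReligionX</religion>
    <parentCaste>ParentB</parentCaste>
  </entry>
</castes>
"""


def load_rules(xml_text):
    """Return (exact-name dict, list of (compiled regex, info)) from the XML."""
    exact = {}
    patterns = []
    for entry in ET.fromstring(xml_text).findall("entry"):
        info = {
            "caste": entry.findtext("casteName"),
            "religion": entry.findtext("religion"),
            "parent": entry.findtext("parentCaste"),
        }
        name = entry.findtext("lastname")
        if name:
            exact[name.strip().lower()] = info
        rx = entry.findtext("regex")
        if rx:
            patterns.append((re.compile(rx.strip(), re.IGNORECASE), info))
    return exact, patterns


def identify(last_name, exact, patterns):
    """Look up a last name: exact match first, then suffix patterns."""
    if not last_name:
        return None  # profile has no last name
    key = last_name.strip().lower()
    if key in exact:
        return exact[key]
    for pattern, info in patterns:
        if pattern.fullmatch(key):
            return info
    return None  # last name does not map to any listed caste


exact, patterns = load_rules(SAMPLE_XML)
print(identify("Examplename", exact, patterns))
print(identify("Unlisted", exact, patterns))
```

Exact names are checked before patterns so that a specific listing takes priority over a broad suffix rule; whether the project resolves conflicts this way is not stated on the slide.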
4
Accomplishments to be done
Maximize the number of user profiles in the Orkut user profile list whose caste can be identified. We hope to identify the caste of about 20 to 30% of profiles. Calculate the bias of user profiles towards a caste, religion, or language relative to the distribution of that caste, religion, or language in the region's population. We then have to run appropriate correlation and dependence algorithms to measure the degree of bias for each user.
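One simple way to express a per-user bias relative to the population baseline is a ratio: the share of a user's labelled friends who belong to the user's own group, divided by that group's share of the regional population. The slides do not name a specific measure, so this is only one possible choice; the function name `same_group_ratio` and all of the data below are hypothetical.

```python
from collections import Counter


def same_group_ratio(user, friends, group_of, population_share):
    """Observed same-group friend share divided by the baseline share.

    A value near 1 means friendships mirror the population; a value well
    above 1 suggests a preference for the user's own group. Returns None
    when the user, the friends, or the baseline cannot be labelled.
    """
    own = group_of.get(user)
    # Only friends whose group could be identified are counted.
    labelled = [group_of[f] for f in friends if f in group_of]
    if own is None or not labelled:
        return None
    baseline = population_share.get(own)
    if not baseline:
        return None
    observed = Counter(labelled)[own] / len(labelled)
    return observed / baseline


# Hypothetical data: user IDs, group labels, and shares are placeholders.
group_of = {"u1": "A", "u2": "A", "u3": "B", "u4": "A"}
friends_of_u1 = ["u2", "u3", "u4", "u5"]  # u5 has no identified group
population_share = {"A": 0.5, "B": 0.5}

print(same_group_ratio("u1", friends_of_u1, group_of, population_share))
```

Because only a fraction of profiles can be labelled, the ratio is computed over labelled friends only, which can distort the result if identification rates differ between groups.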
5
Tradeoffs and bottlenecks
Many Orkut user names were not crawled, so we will not be able to identify the caste of those users. Some Orkut users don’t have a last name, and for many others the last name doesn’t map to a caste. These limitations can lead to an inaccurate calculation of the bias shift. The population distribution of castes within a specific region is difficult to gather.
6
Any Questions?