WHAT AND HOW CHILDREN SEARCH ON THE WEB Sergio Duarte Torres, Ingmar Weber.


Motivation

Goals of this work
  Identify and quantify search struggle of young users
  Retrace stages of child development through their web searches

What data was used?
  US Yahoo! search logs from May to August 2010
  Cleaning steps:
    User-wise: logs from users without Yahoo! accounts were removed
    Query-wise, the following were removed:
      Queries issued by a single user
      Queries with personally identifiable information
      Non-alphanumeric single-token queries
  Why the cleaning? What could be the advantages/disadvantages?
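
The cleaning steps listed on this slide can be sketched as a small filter. This is an illustration only, not the authors' pipeline: the record layout `(user_id, has_account, query)`, the function name `clean_log`, and the `PII_PATTERN` regex are all assumptions, since the slides do not say how personally identifiable information was detected. "Non-alphanumeric single-token query" is read here as a one-token query with no letter or digit in it, which is also an assumption.

```python
import re
from collections import defaultdict

# Hypothetical PII detector (e-mail addresses or long digit runs).
# The slides do not describe the real detection method.
PII_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+|\d{7,}")


def clean_log(records):
    """Filter (user_id, has_account, query) records following the slide's steps."""
    # User-wise step: drop log entries from users without an account.
    kept = [(user, query.strip().lower())
            for user, has_account, query in records if has_account]

    # Count how many distinct users issued each query.
    users_per_query = defaultdict(set)
    for user, query in kept:
        users_per_query[query].add(user)

    cleaned = []
    for user, query in kept:
        tokens = query.split()
        if not tokens:
            continue
        # Query-wise step: drop queries issued by a single user.
        if len(users_per_query[query]) < 2:
            continue
        # Query-wise step: drop queries that look like they contain PII.
        if PII_PATTERN.search(query):
            continue
        # Query-wise step: drop single-token queries without any letter or digit.
        if len(tokens) == 1 and not any(ch.isalnum() for ch in query):
            continue
        cleaned.append((user, query))
    return cleaned
```

Dropping queries seen from only one user trades coverage of rare queries for privacy and noise reduction, which is one way to approach the advantages/disadvantages question on the slide.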

An aside about the data
  Users under 13 years old required the consent of a responsible adult to register at Yahoo! (costs $.50)
  Some people may lie about their age…
  General trends are expected to be robust to noise
  People may lie about their age, but they usually tend to make themselves appear older
  Where do you think millions of children lie about their age?

Data segmentation
  Users grouped based on their reported birth year
  Age estimated as: 2010 – birth year
  The following age buckets were created:
    6-7: early elementary
    8-9: readers
    10-12: advanced readers
    13-15: teenagers
    16-18: mature teenagers
    >18: grown-ups
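
A minimal sketch of the bucketing rule described on this slide. The function name `age_bucket` and the `log_year` parameter are illustrative, not from the slides. The 16-18 range for "mature teenagers" is an assumption inferred from the neighboring buckets, because the range is missing in the transcript. Ages below 6 fall outside every bucket and return `None`.

```python
# (lowest age, highest age, label). The 16-18 range is inferred, not stated in the transcript.
BUCKETS = [
    (6, 7, "early elementary"),
    (8, 9, "readers"),
    (10, 12, "advanced readers"),
    (13, 15, "teenagers"),
    (16, 18, "mature teenagers"),
]


def age_bucket(birth_year, log_year=2010):
    """Map a reported birth year to one of the slide's age buckets."""
    age = log_year - birth_year  # age estimated as 2010 minus birth year
    for low, high, label in BUCKETS:
        if low <= age <= high:
            return label
    # Everyone older than 18 is a grown-up; younger than 6 is not covered.
    return "grown ups" if age > 18 else None
```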

Data characteristics
  Data set size

                       Below 10 years old   Above 10 years old
    Volume of queries  >100K                >1M
    Number of users    >10K                 >100K

Methodology: Micro- vs. Macro-Averages
  User A: 100x cooking, 10x science
  User B: 1x cooking, 5x science
  User C: 2x cooking, 10x science
  Micro avg.: cooking = (100 + 1 + 2) / (110 + 6 + 12) = 0.80
  Macro avg.: cooking = (100/110 + 1/6 + 2/12) / 3 = 0.41
  People search mostly for cooking. True? False?
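
The two averages on this slide can be checked with a few lines of code, using the per-user query counts given for users A, B and C. The function names are illustrative. The micro average pools all queries, so the heavy user A dominates. The macro average first computes each user's share and then averages the shares, so every user counts equally.

```python
counts = {
    "A": {"cooking": 100, "science": 10},
    "B": {"cooking": 1, "science": 5},
    "C": {"cooking": 2, "science": 10},
}


def micro_average(counts, topic):
    """Share of the topic among all queries, pooled over users."""
    topic_total = sum(user.get(topic, 0) for user in counts.values())
    grand_total = sum(sum(user.values()) for user in counts.values())
    return topic_total / grand_total


def macro_average(counts, topic):
    """Mean of the per-user shares of the topic."""
    shares = [user.get(topic, 0) / sum(user.values()) for user in counts.values()]
    return sum(shares) / len(shares)


print(round(micro_average(counts, "cooking"), 2))  # 0.8
print(round(macro_average(counts, "cooking"), 2))  # 0.41
```

With these counts, cooking looks dominant under the micro average only because one user issues most of the queries; two of the three users search mostly for science.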

Methodology: Detecting Navigational Queries
  Examples: facebook, yahoo mail, google, ...
  How would you do it?
  Editorial judgments
    Ask human judges to mark queries as navigational
    Drawbacks?
  Click entropy
    Look at the diversity of the results clicked in response
    Drawbacks?
  String similarity heuristics
    Try to find the query as a substring in the clicked domain
    Drawbacks?
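
Two of the options on this slide, click entropy and the string similarity heuristic, are easy to sketch. The code below is an illustration under assumptions, not the method used in the study: the function names are made up, clicked URLs are assumed to include a scheme such as `https://`, and the string heuristic is relaxed to "every query token appears in the clicked host name" so that "yahoo mail" matches `mail.yahoo.com`.

```python
import math
import re
from collections import Counter
from urllib.parse import urlparse


def click_entropy(clicked_urls):
    """Shannon entropy (in bits) of the clicks observed for one query.

    Low entropy means most users click the same result, which suggests
    a navigational query.
    """
    counts = Counter(clicked_urls)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def query_in_domain(query, clicked_url):
    """Heuristic: every alphanumeric query token occurs in the clicked host name."""
    host = urlparse(clicked_url).netloc.lower()  # needs a URL with a scheme
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    return bool(tokens) and all(token in host for token in tokens)
```

Both have visible drawbacks: click entropy needs enough clicks per query to be reliable, and the token heuristic fires on accidental matches (a short token can occur inside an unrelated host name) while missing brands whose domain differs from their name.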

Search Difficulty Outline
  1. Query length
  2. Natural language usage
  3. Click position bias
  4. Other signs of click position bias
  5. Children exposed to adult content
  6. Time spent on web results
  7. Sessions characteristics

Query length
  Query length increases through the age groups
  Slightly bigger gap for non-navigational queries
  Greater ambiguity in children's queries

Natural language usage (I)
  Questions instead of queries: what is the only immortal animal?
  Modal queries: I don’t want to go to school
  Factual queries: describe the parts of a cell
  Superlative queries: the fastest dog
  Targeted queries for kids: car photos for kids

Natural language usage (II)
  Greater NL usage at younger ages
  Teenagers' behavior is closer to children's behavior than to adults' behavior

Click position bias
  Other explanations?

Clicks on ads
  Children aged 6-9 are more likely to click on ads!
  Evidence of disorientation during the search process

How to evaluate search success using click data? How would you do it?

Time spent on web results
  Click duration as a signal of search success: Hassan et al. (2010), WSDM '10
  Short click (0-10 secs): unsuccessful click
  Long click (≥ 100 secs): successful click
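
The two thresholds on this slide translate directly into a small classifier. This is a sketch: the function names and the "undetermined" label for clicks between 10 and 100 seconds are assumptions, since the slide only defines the short and long cases.

```python
from collections import Counter


def classify_click(dwell_seconds):
    """Label a click by the time spent on the clicked result."""
    if dwell_seconds < 0:
        raise ValueError("dwell time cannot be negative")
    if dwell_seconds >= 100:
        return "successful"    # long click
    if dwell_seconds <= 10:
        return "unsuccessful"  # short click
    return "undetermined"      # not covered by the slide's two definitions


def summarize_clicks(dwell_times):
    """Count how many clicks fall into each label."""
    return Counter(classify_click(t) for t in dwell_times)
```

For example, `summarize_clicks([3, 8, 45, 120, 300])` counts two unsuccessful clicks, one undetermined click and two successful clicks.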

Children exposed to adult content
  Likelihood of accidental clicks on adult content
  Accidental click: the click on adult content is short and the action is immediately reverted by a click on non-adult content
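
The definition of an accidental click on this slide can be sketched as a scan over one session's clicks. The input layout (time-sorted `(timestamp_seconds, is_adult)` pairs) and the function name are assumptions. The 10-second threshold is borrowed from the short-click definition on the previous slide; the slides do not state which threshold was used for this analysis.

```python
def accidental_adult_clicks(clicks, short_threshold=10):
    """Return timestamps of adult-content clicks that look accidental.

    clicks: list of (timestamp_seconds, is_adult) tuples sorted by time.
    A click counts as accidental when it is on adult content, lasts at most
    short_threshold seconds, and is directly followed by a non-adult click.
    """
    accidental = []
    for (t, is_adult), (t_next, next_is_adult) in zip(clicks, clicks[1:]):
        dwell = t_next - t  # time until the user clicks something else
        if is_adult and not next_is_adult and dwell <= short_threshold:
            accidental.append(t)
    return accidental
```

The last click of a session has no follow-up click, so this sketch can never flag it.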

Sessions characteristics (I)
  Shorter sessions for young users
  Jump to adulthood also occurs in the group of users from 19 to 25

Sessions characteristics (II)
  Query refinding: c q q’ q
  What do refinding queries indicate?

Sessions characteristics (III)
  Click refinding: q c c’ c
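
Both refinding patterns from the session slides can be detected with the same scan, under the assumption that the patterns mean "a query q is reissued after a different query q'" and "a result c is clicked again after a different click c'". That reading of the diagram labels, and the function name, are assumptions. The input is the ordered list of queries (or clicked URLs) of a single session.

```python
def refinding_events(sequence):
    """Return items that reappear after at least one different item intervened.

    Works for query refinding (q, q', q) and click refinding (c, c', c):
    pass the session's queries or its clicked URLs in chronological order.
    """
    last_index = {}
    found = []
    for i, item in enumerate(sequence):
        # A gap larger than 1 since the last occurrence means something else
        # happened in between; immediate repeats are not counted.
        if item in last_index and i - last_index[item] > 1:
            found.append(item)
        last_index[item] = i
    return found
```

For example, `refinding_events(["q", "q'", "q"])` returns `["q"]`, while an immediate repeat such as `["q", "q", "q'"]` returns an empty list.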

Sessions characteristics (IV) Shorter sessions?

Tracing children's development on the web: Outline
  1. What do children search for?
  2. What entities are children interested in?
  3. Does the reading level of the clicks vary across ages and education?

Classifying queries into topics

Classifying queries into topics
  “sigir 2011”? computers_and_internet/programming_and_development

What do children search for?
  Children and teenager groups have few dominant topics
  Adults have more diverse query topics
  Also due to smaller vocabulary

Gender differences (I)
  Which topic is most responsible for gender differences?

Gender differences (II)

What entities are children interested in?
  Queries mapped to Wikipedia entities using site search on wikipedia.org/wiki

    Query                                           Entity
    facebook, facebook login                        en.wikipedia.org/wiki/Facebook
    back to school clothes, london schol uniforms   en.wikipedia.org/wiki/School_uniform
    Hummus recipe, ideal protein                    en.wikipedia.org/wiki/Hummus

  How to map web queries to Wikipedia pages?

What entities are children interested in? (10-12)

What entities are adults interested in? (40+)

What entities are children interested in?
  Greater use of child-oriented entities at young ages

Does the reading level of the clicks vary across ages?
  Based on Google reading level classification
  70% (kids) vs. 50% (adults) of clicks classified as basic

Does the reading level of the clicks vary across ages? (II)
  Reading level also varies according to education level
  Education level of adults according to US census
CIKM Glasgow, 26 of October

Getting demographics from US census
  User profile: Gender: Male; Birth year: 1978; ZIP code:
  Q: cheap holidays
  US Census Data (factfinder.census.gov): Expected income: $31k; Expected education: 45% BA; Race distribution: 38% w, 47% A
  Label (Q, D) with $31k, 45% BA, ...

Conclusions
  Clear behavioral differences between children and adults
    Although not as clear between teenagers and children
  Sudden jump to adulthood from 19 to 25 years old
  Stronger click position bias for children, including ads
  Assistance with question queries
  Understanding concerns expressed in their queries

THANK YOU FOR YOUR ATTENTION