Location-based web search and mobile applications


Location-based web search and mobile applications
30.9.2015
Faculty of Science and Forestry, School of Computing
PhD candidate: Andrei Tabarcea
Supervisor: Prof. Pasi Fränti

Location-based services and applications
A location-based service is "an application which integrates the user's geographical location with the general notion of service, its purpose being to provide information about a certain place or geographical location" (Schiller and Voisard 2004).
A location-based application is an application that uses such services.
Source: http://news.filehippo.com/2012/10/underutilized-smartphone-features/

Mopsi Project
Location-based applications and internet
Tools to collect, manage and process location-based data
Social network integration
Applications for web and for mobile phones
cs.uef.fi/mopsi

Publications
[P1] P. Fränti, J. Chen, A. Tabarcea, "Four aspects of relevance in location-based media: content, time, location and network", Int. Conf. on Web Information Systems and Technologies (WEBIST'11), Noordwijkerhout, Netherlands, 413–417, May 2011.
[P2] P. Fränti, A. Tabarcea, J. Kuittinen, V. Hautamäki, "Location-based search engine for multimedia phones", IEEE Int. Conf. on Multimedia and Expo (ICME'10), Singapore, 558–563, July 2010.
[P3] A. Tabarcea, V. Hautamäki, P. Fränti, "Ad-hoc georeferencing of web-pages using street-name prefix trees", Int. Conf. on Web Information Systems and Technologies (WEBIST'10), Valencia, Spain, vol. 1, 237–244, April 2010.
[P4] A. Tabarcea, N. Gali, P. Fränti, "Location-aware information extraction from the web" (manuscript), 2015.
[P5] N. Gali, A. Tabarcea, P. Fränti, "Extracting representative image from web page", Int. Conf. on Web Information Systems and Technologies (WEBIST'15), Lisbon, Portugal, May 2015.
[P6] A. Tabarcea, K. Waga, Z. Wan, P. Fränti, "O-Mopsi: Mobile Orienteering Game Using Geotagged Photos", Int. Conf. on Web Information Systems and Technologies (WEBIST'13), Aachen, Germany, 8–10 May 2013.

Location-based web search: workflow and modules

Location-based web search

General workflow
Diagram labels: User initiates search; Distance from user's location; Formatted output; Web mining using location and keyword

Motivation: simple and relevant search results
Figure labels: Address; Calculating distance; Title; Image
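The "calculating distance" step could be sketched as follows. The slides do not say which formula the system uses, so the haversine formula, the Earth radius constant and the function name here are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant, not from the slides)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def sort_by_distance(user_location, results):
    """Order search results (dicts with hypothetical 'lat' and 'lon' keys) by distance from the user."""
    lat, lon = user_location
    return sorted(results, key=lambda r: haversine_km(lat, lon, r["lat"], r["lon"]))
```

A search front end could call sort_by_distance with the user's position and the detected service locations before formatting the output.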

System architecture

Location-based web search: Address detection

Locations in web pages
Geo-tags or address tags, for example <META name="geo.position" content="62.35;29.44">:
  Less than 0.1% of Finnish websites were using geo-tags in 2004 [Vänskä 2004]
  Less than 1% of the websites related to Oldenburg, Germany, were using explicit localization in 2008 [Ahlers and Boll, 2008]
  7% of the service websites from Finland collected in MOPSI until May 2015 [P4]
Postal addresses:
  Most of the service websites have addresses
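Reading a geo.position tag like the one on this slide could look like the sketch below. It uses Python's standard html.parser module; the class and function names are illustrative and not taken from the Mopsi code.

```python
from html.parser import HTMLParser

class GeoTagParser(HTMLParser):
    """Remembers the first geo.position meta tag as a (latitude, longitude) tuple."""

    def __init__(self):
        super().__init__()
        self.position = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META ...> arrives as "meta".
        if tag != "meta" or self.position is not None:
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "geo.position":
            return
        parts = (attr.get("content") or "").split(";")
        if len(parts) == 2:
            try:
                self.position = (float(parts[0]), float(parts[1]))
            except ValueError:
                pass  # malformed coordinates: treat the page as having no geo-tag

def extract_geo_position(html):
    """Return (lat, lon) from a geo.position meta tag, or None if the page has none."""
    parser = GeoTagParser()
    parser.feed(html)
    return parser.position
```

Since few pages carry such tags, a None result would be the common case and the postal-address detection would take over.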

Geographical data sources Own gazetteer for Finland OpenStreetMap address data for rest of the world

Address detection using prefix trees
We detect street names and city names using prefix trees.
We detect other address elements (street numbers, postal codes, telephone numbers) using regular expressions.
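A prefix tree lookup of street names might be sketched as below. This is a minimal character-level trie written for illustration, not the authors' implementation; the street names are sample values and the whole-word check is an added assumption.

```python
END = ""  # marker key for "a stored name ends here"; never equal to a single character

class PrefixTree:
    def __init__(self, names):
        self.root = {}
        for name in names:
            node = self.root
            for ch in name.lower():
                node = node.setdefault(ch, {})
            node[END] = name  # keep the original spelling at the end node

    def longest_match(self, text, start):
        """Return the longest stored name that begins at text[start], or None."""
        node, best = self.root, None
        for ch in text[start:].lower():
            if ch not in node:
                break
            node = node[ch]
            if END in node:
                best = node[END]
        return best

def find_street_names(text, tree):
    """Return (position, name) pairs for stored names that occur as whole words in text."""
    hits, i = [], 0
    while i < len(text):
        at_word_start = i == 0 or not text[i - 1].isalnum()
        name = tree.longest_match(text, i) if at_word_start else None
        if name:
            end = i + len(name)
            if end == len(text) or not text[end].isalnum():
                hits.append((i, name))
                i = end
                continue
        i += 1
    return hits
```

Each lookup walks at most as many nodes as the longest street name has characters, regardless of how many names the gazetteer holds.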

Address detection
We start by detecting street names.
We search for other address elements (numbers, city names, telephone numbers) close to the street name.
We aggregate the detected address elements (street names, numbers, postal codes, telephone numbers and municipal names) into an address candidate.
We validate addresses using our gazetteers.
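The steps above can be sketched as follows, assuming the street name has already been located. The toy gazetteer, the city list, the 60-character search window and the regular expressions are all hypothetical simplifications rather than the patterns used in Mopsi.

```python
import re

# Hypothetical toy gazetteer of known (street, city) pairs and a small city list.
GAZETTEER = {("niskakatu", "joensuu"), ("kauppakatu", "joensuu")}
CITIES = {"joensuu", "kuopio"}

STREET_NUMBER = re.compile(r"\s*(\d{1,4}(?:\s?[A-Za-z])?)\b")  # e.g. "11" or "11 A"
POSTAL_CODE = re.compile(r"\b(\d{5})\b")                       # five-digit postal code
PHONE = re.compile(r"\b(0\d{2,3}[\s-]?\d{3,8})\b")             # simplified phone pattern

def build_candidate(text, street, start, window=60):
    """Aggregate address elements found shortly after the street name.

    Returns a dict for a candidate that the gazetteer confirms, otherwise None.
    """
    tail = text[start + len(street): start + len(street) + window]
    number = STREET_NUMBER.match(tail)
    postal = POSTAL_CODE.search(tail)
    phone = PHONE.search(tail)
    words = re.findall(r"[^\W\d_]+", tail)  # letter-only words, including ä and ö
    city = next((w for w in words if w.lower() in CITIES), None)
    if city is None or (street.lower(), city.lower()) not in GAZETTEER:
        return None  # validation failed: street and city do not belong together
    return {
        "street": street,
        "number": number.group(1) if number else None,
        "postal_code": postal.group(1) if postal else None,
        "city": city,
        "phone": phone.group(1) if phone else None,
    }
```

Validating the street and city together is what filters out words that merely look like street names.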

Location-based web search: Title detection

Web page and DOM Tree

Service name detection
Identify address nodes
Divide the DOM tree so that each sub-tree has one address
Figure labels: Addresses; Sub-tree with 1 address

Service name detection
Identify address nodes
Divide the DOM tree so that each sub-tree has one address
Next step: score all the text nodes
Figure (sub-tree with 1 address): tags DIV, H2, STRONG, P, A, IMG, SPAN, BR; text nodes Yhteystiedot, Niskakatu 11, Pizza Master Joensuu, Joensuu, 80100 Joensuu, Puh. 0400 281700, ma-to 10:30-22:00, pe-la 10:30-04:30, su 12:00-22:00; the address node is labelled Address
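One way to divide the tree so that each sub-tree holds a single address is to climb from every address node until the parent would contain a second address. The slides do not spell out the procedure, so this climbing rule, the Node class and the is_address predicate are assumptions used for illustration.

```python
class Node:
    """Minimal DOM-like node: a tag, optional text and child nodes."""

    def __init__(self, tag, text="", children=()):
        self.tag = tag
        self.text = text
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self

def count_addresses(node, is_address):
    """Count the address nodes in the sub-tree rooted at node."""
    return int(is_address(node)) + sum(count_addresses(c, is_address) for c in node.children)

def address_subtree(address_node, is_address):
    """Return the largest ancestor whose sub-tree contains exactly one address."""
    root = address_node
    while root.parent is not None and count_addresses(root.parent, is_address) == 1:
        root = root.parent
    return root
```

Recounting at every level is quadratic in the worst case; caching the counts per node would avoid that on large pages.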

Scoring text nodes
Score the other text nodes in the sub-tree
Select the text node with the highest score as the title node
Figure (example scores, each divided by the node distance 1, 2 or 3):
  STRONG "Yhteystiedot": 3/1 = 3
  A "Pizza Master Joensuu" (color: #222222; font-size: 18px; font-weight: 900; text-transform: uppercase): 22/2 = 11
  H2 "Joensuu" (color: #fff1c8): 26/3 = 8.66
  Address node "Niskakatu 11" (font-size: 16px; color: #00000)
Figure label: Closest common ancestor node

Score according to appearance
Score each node according to its difference from the address node
CSS attribute             Score
color, background-color   + perceptual color difference (0 to 10)
font-size                 + (node font size - address node font size)
font-weight               +3 if bold or >500
text-transform            +5 if uppercase
HTML tag                  Score
H1                        +7
H2                        +6
H3                        +5
H4, A                     +4
H5, H6, B, STRONG         +3
I, EM                     +2
Others
Figure (example): A "Pizza Master Joensuu" (color: #222222; font-size: 18px; font-weight: 900; text-transform: uppercase; partial scores +4, +2, +3, +8) compared with the address node "Niskakatu 11" (font-size: 16px; color: #00000); STRONG "Yhteystiedot", score 3; H2 "Joensuu" (color: #fff1c8; partial scores +6, +5, +9), score 26
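The scoring table can be turned into a small function, sketched below. The perceptual color difference is passed in as a ready-made number because the slides do not define how it is computed; treating H4 like A, giving unlisted tags a score of 0, and the function names are assumptions.

```python
# Tag scores from the table; "h4" and the default of 0 for other tags are assumptions.
TAG_SCORES = {
    "h1": 7, "h2": 6, "h3": 5, "h4": 4, "a": 4,
    "h5": 3, "h6": 3, "b": 3, "strong": 3,
    "i": 2, "em": 2,
}

def is_bold(font_weight):
    """True for 'bold' or a numeric CSS font-weight above 500."""
    weight = str(font_weight).strip().lower()
    return weight == "bold" or (weight.isdigit() and int(weight) > 500)

def appearance_score(tag, font_size, font_weight, uppercase,
                     color_difference, address_font_size):
    """Score a text node by how much its appearance stands out from the address node."""
    score = TAG_SCORES.get(tag.lower(), 0)
    score += color_difference               # perceptual difference, 0 to 10
    score += font_size - address_font_size  # larger text scores higher
    if is_bold(font_weight):
        score += 3
    if uppercase:
        score += 5
    return score
```

With the values from the figure, the A node gets 4 (tag) + 8 (color) + 2 (18px against 16px) + 3 (weight 900) + 5 (uppercase) = 22, which matches the score shown for it.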

Select the node with the highest score as the title
Node distance penalty: each score is divided by the node distance (1, 2 or 3 in the figure): 22/2 = 11, 3/1 = 3, 26/3 = 8.66
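The selection with the distance penalty can be checked with the numbers from the figure. The node labels below name candidates by tag and are illustrative only; how the distance itself is measured is not covered by this sketch.

```python
def penalized_score(score, distance):
    """Divide the appearance score by the node distance."""
    return score / distance

def select_title(candidates):
    """Pick the (label, score, distance) candidate with the highest penalized score."""
    return max(candidates, key=lambda c: penalized_score(c[1], c[2]))

# (label, appearance score, node distance) as shown in the figure
candidates = [
    ("A node", 22, 2),
    ("STRONG node", 3, 1),
    ("H2 node", 26, 3),
]
title = select_title(candidates)
```

The H2 node has the highest raw score (26), but after the penalty the A node wins with 22/2 = 11 against 26/3.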

Location-based web search: Representative image detection

Image categories Banner Formatting Logo Representative Icons Advertisement

Overall extraction process
Diagram labels: Web page link; Web page; Extract images; Images found; Analyze; Categorize; Rank; Representative image

Image features used
Feature        Value
src            http://www.ravintolakreeta.fi///images/banner.jpg
alt            --
title          from css
format         jpg
width          945
height         202
size           190,890 px
aspect ratio   4.67
parent tag     <div>
class          header

Summary of rules
Category               Features                           Keywords
Representative         Not in other category
Logo                                                      logo
Banner                 Ratio > 1.8                        Banner, header, Footer, button
Advertisement                                             Free, adserver, now, buy, join, click, affiliate, adv, hits, counter
Formatting and Icons   Width < 100 px, Height < 100 px    Background, bg, spirit, templates
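The rules in this table could be applied as in the sketch below. The slide does not state the order in which categories are tested, whether feature and keyword conditions are combined with "and" or "or", or which attributes are searched for keywords, so the choices made here (logo first, "or" everywhere, substring search in src, alt, title and class) are assumptions.

```python
LOGO_KEYWORDS = ("logo",)
AD_KEYWORDS = ("free", "adserver", "now", "buy", "join", "click",
               "affiliate", "adv", "hits", "counter")
BANNER_KEYWORDS = ("banner", "header", "footer", "button")
FORMAT_KEYWORDS = ("background", "bg", "spirit", "templates")  # spelled as on the slide

def has_keyword(image, keywords):
    """True if any keyword occurs in the image's src, alt, title or class text."""
    fields = ("src", "alt", "title", "class")
    text = " ".join(str(image.get(f, "")) for f in fields).lower()
    return any(word in text for word in keywords)

def categorize(image):
    """Assign one category to an image described by a dict of features."""
    width, height = image["width"], image["height"]
    ratio = width / height if height else float("inf")
    if has_keyword(image, LOGO_KEYWORDS):
        return "logo"
    if has_keyword(image, AD_KEYWORDS):
        return "advertisement"
    if ratio > 1.8 or has_keyword(image, BANNER_KEYWORDS):
        return "banner"
    if width < 100 or height < 100 or has_keyword(image, FORMAT_KEYWORDS):
        return "formatting_or_icon"
    return "representative"  # not in any other category
```

Plain substring matching is crude: short keywords such as "now" or "bg" can match unrelated file names, so a real system would probably tokenize paths first.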

Scoring images
http://ptiszai.com/imageext/
Rule                                                          Score
Image size ≥ 10,000 px                                        1
Aspect ratio ≤ 1.8
Image alt or title has a value
Keywords of alt or title appear also in <title> tag
Keywords of alt or title appear also in <h1> tag
Keywords of image path also in <title> or <h1> tags
The image is in the sub-tree of <h1> or <h2> tags
Format = jpg
Format = svg, png or gif                                      0.5

Mopsi WebIma dataset
http://cs.uef.fi/mopsi/img/
Summary of data collected:
  Websites: 1002
  Images: 2363
  Per page: Min=1, Average=2.36, Max=154
Collection details:
  Who: 117 volunteers
  When: September 2014
  What: Pages of own choice or Mopsi search
  How: Select 1-3 most representative images
Issues: Some level of subjectivity unavoidable

Results summary
Method     Accuracy   Extracted images
WebIma     64%        99%
Google+    48%        92%
Facebook   39%        90%
Lightweight method suitable for real-time applications
Unsupervised: no training, no user feedback needed
Finds the correct image in 64% of the cases; outperforms Google+ (48%) and Facebook (39%)
In use in MOPSI: Search and Service upgrade

O-Mopsi: Location-Based Mobile Orienteering Game

O-Mopsi location-based game

O-Mopsi vs. Orienteering

O-Mopsi: Web interface Single player movement simulation Multiple Players simulation (Players Competition)

SciFest feedback Feedback Very good Good Needs improvement Bad 3 6 Scifest 2012 7 2 Scifest 2013 21 Scifest 2014 8 19 Scifest 2015 9 1 Total 25 62 10

Conclusions

Main contributions
An application that identifies location-based data in web pages by detecting postal addresses
A gazetteer-based method to detect postal addresses using freely available data sources such as OpenStreetMap
A location-aware mobile game that promotes physical exercise by applying concepts from the classical game of orienteering and uses a geo-tagged photo collection created by users

Thank you for your attention! www.uef.fi