Andrei Tabarcea, Matti Mononen 6.03.2013. Joint PhD degree candidate for University of Eastern Finland and Technical University of Iasi, Romania. ECSE.

Presentation transcript:

Andrei Tabarcea, Matti Mononen

• Joint PhD degree candidate for University of Eastern Finland and Technical University of Iasi, Romania
• ECSE grant 2012 & 2013
• Proposed graduation 2014, supervisor Prof. Pasi Fränti
• Thesis: "Location-based applications"
• Research part of the Mopsi project

• A. Tabarcea, K. Waga, Z. Wan and P. Fränti, "O-Mopsi: Mobile Orienteering Game Using Geotagged Photos", Int. Conf. on Web Information Systems & Technologies (WEBIST'13), Aachen, Germany, May 2013.
• K. Waga, A. Tabarcea, R. Mariescu-Istodor and P. Fränti, "Real Time Access to Multiple GPS Tracks", Int. Conf. on Web Information Systems & Technologies (WEBIST'13), Aachen, Germany, 8-10 May 2013.
• K. Waga, A. Tabarcea, R. Mariescu-Istodor and P. Fränti, "System for real time storage, retrieval and visualization of GPS tracks", Int. Conf. System Theory, Control and Computing (ICSTCC 2012), Sinaia, Romania, Vol. 2, October 2012.
• K. Waga, A. Tabarcea, M. Chen and P. Fränti, "Detecting movement type by route segmentation and classification", IEEE Int. Conf. on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom'12), Pittsburgh, USA, 2012.
• K. Waga, A. Tabarcea and P. Fränti, "Recommendation of points of interest from user generated data collection", IEEE Int. Conf. on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom'12), Pittsburgh, USA, 2012.

How to find location-information in web-pages?

domain: uef.fi
descr: ITÄ-SUOMEN YLIOPISTO (UNIV OF EASTERN FINLAND)
descr:
address: TIETOTEKNIIKKAKESKUS (IT-CENTRE)/Jarno Huuskonen
address: PL 1627
address:
address: KUOPIO FINLAND
phone:
status: Granted
created:
modified:
expires:
nserver: ns-secondary.funet.fi [Ok]
nserver: ns1.uef.fi [Ok]
nserver: ns2.uef.fi [Ok]
dnssec: no

• geo-tags, address-tags, vCards for Google Maps etc.
Pages of Pasi Fränti

Scouts' Youth Hostel (8.3 km from Joensuu Airport)
Show map
Good, 7.4
Latest booking: January 23
Scouts' Youth Hostel is located at the outfall of River Pielisjoki, 1.5 km from Joensuu city centre. It offers free Wi-Fi and rooms with shared bathroom and kitchen facilities.
Olga, Saint-Petersburg, Russia: "Great price for the nice room. Friendly stuff, cozy atmosphere. But a bit loud."
from € 46

Input:
• user location (lat, lon)
• keywords
Output: list of services containing:
• name/title
• website
• address (street, number, city)
• location (lat, lon)
• image
• other info (opening hours, telephone etc.)
Main idea:
• preprocess the search results of an external search engine (Google, Yahoo, Bing etc.) by detecting postal addresses in order to find the location

1. Convert user location (lat, lon) into user address = Geocoding step
2. Search with the query "keyword+city" using an external search engine API and download the first k results (web pages) = Web page retrieval step
3. Detect addresses and additional information from the downloaded web pages = Data mining step
4. Rank the results (distance, relevance etc.) = Ranking step
5. Display the search results to the user
[Diagram: User (lat, lon, keywords) → 1. Geocoder → 2. Web page retrieval → web pages → 3. Data mining → result list → 4. Result ranking → 5. ranked result list → User]
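The five steps above can be sketched as one function that chains four pluggable stages. This is only an illustrative outline, not the Mopsi implementation: `Service`, `location_search`, and the `fake_*` stubs are hypothetical names, the stubs return made-up data instead of calling a real geocoder or search engine API, and the query is joined with a space rather than a literal "+".

```python
import math
from dataclasses import dataclass


@dataclass
class Service:
    """One search result (hypothetical structure for this sketch)."""
    name: str
    address: str
    lat: float
    lon: float


def location_search(user_pos, keywords, reverse_geocode, fetch_pages,
                    mine_page, distance, k=10):
    """Chain the four stages; displaying the list (step 5) is left to the caller."""
    city = reverse_geocode(user_pos)                  # 1. geocoding step
    pages = fetch_pages(f"{keywords} {city}", k)      # 2. web page retrieval step
    services = [s for page in pages for s in mine_page(page)]  # 3. data mining step
    return sorted(services,                           # 4. ranking step
                  key=lambda s: distance(user_pos, (s.lat, s.lon)))


# Stand-in stages with made-up data, so the sketch runs without network access.
def fake_reverse_geocode(pos):
    return "Joensuu"


def fake_fetch_pages(query, k):
    return [f"page for {query}"][:k]


def fake_mine_page(page):
    return [Service("Far Pizza", "Street A 1", 62.70, 29.90),
            Service("Near Pizza", "Street B 2", 62.60, 29.76)]


results = location_search((62.60, 29.76), "pizza", fake_reverse_geocode,
                          fake_fetch_pages, fake_mine_page, math.dist)
```

Passing the stages in as callables keeps each step replaceable, which mirrors the way the slides treat the geocoder and the search engine as external services.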

Geocoder: convert user location (lat, lon) into user address using:

Web page retrieval: download k web pages from the query using API of:

Result ranking
Main criterion: distance from the user's location
Future idea: relevance to user's profile and history
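Ranking by distance from the user's location could look like the sketch below. The slides do not say which distance measure is used; the haversine great-circle formula with a mean Earth radius of 6371 km is an assumption here, and `haversine_km` and `rank_by_distance` are hypothetical names.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumption of this sketch


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def rank_by_distance(user_pos, results):
    """Sort result dicts (each with 'lat' and 'lon') by distance from the user."""
    return sorted(results,
                  key=lambda r: haversine_km(user_pos, (r["lat"], r["lon"])))
```

A relevance score from the user's profile and history, mentioned as a future idea, could later be folded into the sort key alongside the distance.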

Data mining
Main idea: find location information in HTML pages by detecting postal addresses
Steps:
1. Parse and segment the HTML page
2. Identify addresses and locations
3. Identify the services the addresses are pointing to (name/title)
4. Retrieve extra information (photos, opening hours, telephone etc.)

• Extract text from HTML pages
• Segmentation of web pages using DOM tree
Example page text (Finnish):
ONLINE TILAUS
RAVINTOLAT
Ravintola Deli Istanbul
Kotiinkuljetus Nouto
Pilkkitie 1, Joensuu, Rantakylä
Avoinna - Kotiinkuljetus - Nouto
La Dolce Vita
Kotiinkuljetus Nouto
Wahlforssinkatu 6, Joensuu, Ke..
Avoinna - Kotiinkuljetus - Nouto
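A minimal sketch of text extraction with segmentation is given below. It is a simplification of the DOM-tree approach named on the slide: instead of building a tree, it cuts the text stream at block-level tags using Python's standard `html.parser`. The tag set `BLOCK_TAGS` and the names `SegmentExtractor` and `segment_html` are assumptions made for this example.

```python
from html.parser import HTMLParser

# Tags treated as segment boundaries (an assumed, incomplete set).
BLOCK_TAGS = {"div", "p", "br", "table", "tr", "td", "li", "blockquote"}


class SegmentExtractor(HTMLParser):
    """Collects visible text and starts a new segment at each block-level tag."""

    def __init__(self):
        super().__init__()
        self.segments = []
        self._buf = []

    def _flush(self):
        # Join buffered text, normalise whitespace, keep non-empty segments.
        text = " ".join(" ".join(self._buf).split())
        if text:
            self.segments.append(text)
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        self._buf.append(data)

    def close(self):
        super().close()
        self._flush()


def segment_html(html):
    """Return the list of text segments found in an HTML string."""
    parser = SegmentExtractor()
    parser.feed(html)
    parser.close()
    return parser.segments
```

Inline tags such as A or B do not split a segment, so a linked restaurant name stays together with the text around it until the next block boundary.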

• Rule-based pattern matching algorithm
• Starting point: the detection of street names
• An address-block candidate is constructed by detecting:
  • street names and numbers
  • postal codes
  • municipal names
• We will use the OpenStreetMap database for global detection
Legend: street names, street numbers, city names, telephone numbers
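The rule-based idea can be illustrated with a small matcher that starts from a known street name and then looks for a street number, a postal code, and a municipality right after it. This is a sketch under stated assumptions, not the authors' algorithm: the gazetteers `STREETS` and `CITIES` are tiny hand-made samples standing in for a real street database, postal codes are assumed to be five digits (as in Finland), and `find_addresses` is a hypothetical name.

```python
import re

# Tiny sample gazetteers (assumed data; a real system would load a street database).
STREETS = {"pilkkitie", "wahlforssinkatu", "niskakatu"}
CITIES = {"joensuu", "kuopio"}

# A token is either a run of letters (any alphabet) or a run of digits.
TOKEN = re.compile(r"[^\W\d_]+|\d+")


def find_addresses(text):
    """Return address-block candidates that start at a known street name."""
    tokens = TOKEN.findall(text)
    found = []
    i = 0
    while i < len(tokens):
        if tokens[i].lower() not in STREETS:
            i += 1
            continue
        addr = {"street": tokens[i], "number": None,
                "postal_code": None, "city": None}
        j = i + 1
        # Optional street number: a short run of digits.
        if j < len(tokens) and tokens[j].isdigit() and len(tokens[j]) <= 3:
            addr["number"] = tokens[j]
            j += 1
        # Optional postal code: exactly five digits.
        if j < len(tokens) and re.fullmatch(r"\d{5}", tokens[j]):
            addr["postal_code"] = tokens[j]
            j += 1
        # Optional municipality name from the gazetteer.
        if j < len(tokens) and tokens[j].lower() in CITIES:
            addr["city"] = tokens[j]
            j += 1
        found.append(addr)
        i = j
    return found
```

For example, `find_addresses("Ravintola Deli Istanbul Pilkkitie 1, Joensuu, Rantakylä")` yields one candidate with street `Pilkkitie`, number `1`, and city `Joensuu`.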

• blue: links (the A tag)
• red: tables (TABLE, TR and TD tags)
• green: dividers (DIV tag)
• violet: images (the IMG tag)
• yellow: forms (FORM, INPUT, TEXTAREA, SELECT and OPTION tags)
• orange: linebreaks and blockquotes (BR, P, and BLOCKQUOTE tags)
• black: HTML tag, the root node
• gray: all other tags

PizzaPojat Niinivaara Niinivaarantie Joensuu

Bosbor kebab Fiesta Miami

1. Convert HTML pages to XHTML so that XQuery can be used
2. Detect addresses and postal codes
3. Break the DOM tree into subtrees
4. Use heuristics and regular expressions to detect extra information from the subtree (service name, telephone, opening hours etc.)
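Steps 3 and 4 can be sketched as follows. The slides use XQuery over XHTML; this example swaps in Python's standard `xml.etree.ElementTree` instead, and assumes well-formed XHTML without a namespace. The container tag set, the phone pattern, the "first text is the service name" heuristic, and the names `service_subtree` and `extract_service` are all assumptions of this sketch, and the sample phone number in the usage note is made up.

```python
import re
import xml.etree.ElementTree as ET

CONTAINERS = {"div", "td", "li", "tr"}           # assumed container tags
PHONE = re.compile(r"\+?\d[\d\s-]{5,}\d")        # loose phone-number pattern


def text_of(el):
    """All text inside an element, with whitespace normalised."""
    return " ".join("".join(el.itertext()).split())


def service_subtree(root, address):
    """Deepest container element whose text contains the address.

    iter() walks in document order, so among nested containers that all
    contain the address, the last one visited is the deepest.
    """
    best = None
    for el in root.iter():
        if el.tag in CONTAINERS and address in text_of(el):
            best = el
    return best


def extract_service(root, address):
    """Heuristically pull the service name and phone from the address subtree."""
    sub = service_subtree(root, address)
    if sub is None:
        return None
    texts = [t.strip() for t in sub.itertext() if t.strip()]
    phone = PHONE.search(" ".join(texts))
    return {
        "name": texts[0],  # heuristic: first text in the subtree is the name
        "address": address,
        "phone": phone.group(0) if phone else None,
    }
```

A usage example would parse a page with `ET.fromstring(xhtml)` and call `extract_service(root, "Niskakatu 1")` for each address found in step 2. The deepest-container rule assumes each address occurs once on the page.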

Thank you

Yhteystiedot Pizza Master - Joensuu Niskakatu Joensuu Puh Avoinna Pizza Master - Joensuu Ma-To Pe-La SU