Dimitrios Katsaros* † Yannis Manolopoulos* † Aristotle University, Greece *University of Thessaly, Greece Suffix Tree Based Prediction for Pervasive Computing Environments

Presentation transcript:

Dimitrios Katsaros* † Yannis Manolopoulos* † Aristotle University, Greece *University of Thessaly, Greece Suffix Tree Based Prediction for Pervasive Computing Environments

Panhellenic Conference on Informatics, November The architecture of a PCS

Information dissemination in a PCS
[Figure labels: Information System (server), Wireless Cell, Base Station, Downlink Communication Bandwidth, Mobile Hosts (MH)]
–#MHosts >> #Servers
–Uplink bandwidth << Downlink bandwidth

Roaming: Where is the mobile?
The mobile can roam freely inside the coverage area of the cellular system.
This creates the need for location management:
–location update
–location prediction

Querying: What data will be requested?
The mobile can request any data available in the information system.
This creates the need for
–proactively pushing the data into the broadcast channel
–proactively sending the data to the next-to-visit base station

Predict: Position & Information Needs
Why is location prediction useful?
–Effective solutions to the mobility tracking/prediction problem can reduce update and paging costs, freeing the network from excessive signaling traffic [bd02].
Why is request prediction useful?
–Accurate data request prediction results in effective prefetching [nkm03], which, combined with a caching mechanism [km04], can reduce user-perceived latencies as well as server and network loads.
[bd02] A. Bhattacharya and S. K. Das, LeZi-Update: An information-theoretic framework for personal mobility tracking in PCS networks, ACM/Kluwer Wireless Networks, 8(2-3), pp. 121–135, 2002.
[nkm03] A. Nanopoulos, D. Katsaros, Y. Manolopoulos, A data mining algorithm for generalized Web prefetching, IEEE Transactions on Knowledge and Data Engineering, 15(5), pp – 1169, 2003.
[km04] D. Katsaros and Y. Manolopoulos, Web caching in broadcast mobile wireless environments, IEEE Internet Computing, 8(3), pp. 37–45, 2004.

What is prediction based on?
Both of the aforementioned problems depend on the ability of the underlying network to
–record,
–learn and, subsequently,
–predict
the mobile's “behaviour”, i.e., its movements or its information needs.
The success of the prediction presupposes, and is boosted by, the fact that mobile users exhibit some degree of regularity in their movement and/or in their access patterns.
This regularity may be apparent in the behaviour of each individual client or in client groups.

Location prediction and request prediction
These issues have been treated in isolation, but pioneering works ([vk96] and [bd02]) are paving the way for treating both problems in a homogeneous fashion.
These works use data compression methods (and are thus characterized as “information-theoretic”) to carry out prediction.
They model the respective state space as a finite alphabet of discrete symbols.
–In the mobility tracking scenario, the alphabet consists of all possible sites (cells) that the client has ever visited or might visit (assuming that the number of cells in the coverage area is finite).
–In the request prediction scenario, the alphabet consists of all the data objects requested by the client plus the objects that might be requested in the future (assuming that the objects come from a database and thus their number is finite).
[vk96] J. S. Vitter and P. Krishnan, Optimal prefetching via data compression, Journal of the ACM, 43(5), pp. 771–793, 1996.

Families of predictors
PPM: Prediction by Partial Match
LZ78: Lempel-Ziv 1978
PST: Probabilistic Suffix Tree
CTW: Context-Tree Weighting
Overheads
Family | Training | Parameterization | Storage
LZ78 | online | moderate
PPM | online/offline | moderate/heavy | large
PST | offline | heavy | low
CTW | online | moderate | large

The PPM predictor
Running sequence: aabacbbabbacbbc
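To make the idea behind PPM concrete, here is a minimal Python sketch applied to the running sequence. It is a simplification and not the predictor shown on the slide: it only counts which symbol followed each context of length 0 to `max_order` and predicts from the longest context that has been seen before, without the escape probabilities that full PPM uses. The function names and the choice `max_order = 2` are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_context_counts(seq, max_order):
    """Count, for every context of length 0..max_order, which symbol followed it."""
    counts = defaultdict(Counter)
    for i, sym in enumerate(seq):
        for k in range(max_order + 1):
            if i - k < 0:
                break
            counts[seq[i - k:i]][sym] += 1
    return counts


def predict_longest_context(seq, counts, max_order):
    """Predict the next symbol from the longest previously seen context.

    Simplification: falls back to shorter contexts without escape
    probabilities, unlike full PPM.
    """
    for k in range(min(max_order, len(seq)), -1, -1):
        context = seq[len(seq) - k:]
        if context in counts:
            return counts[context].most_common(1)[0][0]
    return None


running = "aabacbbabbacbbc"
counts = train_context_counts(running, max_order=2)
prediction = predict_longest_context(running, counts, max_order=2)
```

In this run the order-2 context "bc" has never been followed by a symbol, so the sketch falls back to the order-1 context "c", which was followed by "b" both times it occurred.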

The LZ78 predictor
Running sequence: aabacbbabbacbbc
Enhanced
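The LZ78 predictor is built on the standard LZ78 incremental parsing, which cuts the sequence into phrases, each being the shortest prefix of the remaining input that has not been seen as a phrase before. The sketch below shows only this parsing step on the running sequence; the function name is hypothetical, and the frequency bookkeeping and the enhanced variant mentioned on the slide are not reproduced.

```python
def lz78_parse(seq):
    """Split seq into LZ78 phrases.

    Returns the list of phrases and any trailing, incomplete phrase.
    """
    phrases = []
    seen = set()
    current = ""
    for sym in seq:
        current += sym
        if current not in seen:
            # Shortest string not yet in the dictionary: close the phrase.
            seen.add(current)
            phrases.append(current)
            current = ""
    return phrases, current


phrases, leftover = lz78_parse("aabacbbabbacbbc")
# phrases == ['a', 'ab', 'ac', 'b', 'ba', 'bb', 'acb', 'bc'], leftover == ''
```

Each phrase extends an earlier phrase by one symbol, which is why the phrases can be stored as paths in a tree whose nodes carry occurrence counts.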

The PST predictor
Running sequence: aabacbbabbacbbc

The CTW predictor (1/3)
Running binary sequence: 010|
Krichevsky-Trofimov estimator:
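In its usual sequential form, the Krichevsky-Trofimov estimator assigns the next bit the probability (c + 1/2) / (a + b + 1), where a and b are the numbers of zeros and ones seen so far and c is the count of the bit being predicted. The sketch below evaluates it on the running binary sequence 010; the function names are hypothetical, and the slide may have presented the estimator in a different but equivalent form.

```python
from fractions import Fraction


def kt_next_prob(zeros, ones, bit):
    """KT probability that the next bit equals `bit`, given the counts so far."""
    count = zeros if bit == 0 else ones
    # (count + 1/2) / (zeros + ones + 1), kept exact with integers.
    return Fraction(2 * count + 1, 2 * (zeros + ones + 1))


def kt_block_prob(bits):
    """KT probability of a whole bit sequence, multiplied up bit by bit."""
    zeros = ones = 0
    prob = Fraction(1)
    for bit in bits:
        prob *= kt_next_prob(zeros, ones, bit)
        if bit == 0:
            zeros += 1
        else:
            ones += 1
    return prob


p_010 = kt_block_prob([0, 1, 0])       # Fraction(1, 16)
p_next_zero = kt_next_prob(2, 1, 0)    # Fraction(5, 8)
```

The block probability depends only on the counts of zeros and ones, not on their order, which is what lets CTW store just two counters per context-tree node.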

The CTW predictor (2/3)

The CTW predictor (3/3)

Discrete Sequence Prediction Problem
At any given time instance t (meaning that t symbols x_t, x_{t-1}, ..., x_1 have appeared, listed in reverse order), calculate the conditional probability P[X_{t+1} = x_{t+1} | X_t = x_t, X_{t-1} = x_{t-1}, ..., X_1 = x_1] for every candidate next symbol x_{t+1}, where each x_i is a symbol of the alphabet.
This model defines a stationary Markov chain, since the probabilities are not time-dependent.
The outcome of the predictor is a ranking of the symbols according to their probability P.
Predictors that use this kind of prediction model are termed Markov predictors.

The STP algorithm
[em92] A. Ehrenfeucht and J. Mycielski, A pseudorandom sequence – How random is it?, American Mathematical Monthly, 99(4), pp. 373–375, 1992.

An example execution of STP
Suppose that the sequence of symbols seen so far is the following: s_1^24 = abcdefgabcdklmabcdexabcd$
The largest suffix of s_1^24 that also appears earlier in s_1^24 is ss_1^4 = abcd.
Let α = 0.5. Then sss_1^2 = cd.
The appearances of cd inside s_1^24 are located at positions 3, 10, 17 and 23.
Therefore, the marked positions are 5, 12, 19 and 25. The last one is NULL, since it holds the symbol we want to predict.
Thus, the sequence of candidate predicted symbols is e, k, e.
Since the symbol that appears most often in this sequence is e, the output of the STP algorithm, i.e., the predicted symbol at this stage, is e.

An example execution of STP
Suppose that the sequence of symbols seen so far is the following: s_1^24 = abcdefgabcdklmabcdexabcd$
The largest suffix that also appears earlier in the sequence is abcd.
Let α = 0.5; thus we use a portion of abcd, namely its last half: cd.
The appearances of cd in the sequence start at positions 3, 10, 17 and 23.
Candidate predictions: e, k, e
Since e appears most often, the final outcome of the prediction is: e
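The example execution can be reproduced with a short Python sketch. It follows only the steps spelled out in the example and is not the authors' implementation: the function names are hypothetical, rounding α times the suffix length up to an integer is an assumption, and so is breaking ties in favour of the candidate seen first.

```python
import math
from collections import Counter


def longest_repeated_suffix(seq):
    """Longest suffix of seq that also occurs ending before the last position."""
    n = len(seq)
    for length in range(n - 1, 0, -1):
        suffix = seq[n - length:]
        # Searching seq[:n - 1] excludes the suffix occurrence itself.
        if suffix in seq[:n - 1]:
            return suffix
    return ""


def stp_predict(seq, alpha=0.5):
    """Predict the next symbol following the steps of the STP example."""
    suffix = longest_repeated_suffix(seq)
    if not suffix:
        return None
    # Keep the last alpha-fraction of the suffix (rounding up is an assumption).
    keep = max(1, math.ceil(alpha * len(suffix)))
    context = suffix[-keep:]
    candidates = Counter()
    start = seq.find(context)
    while start != -1:
        following = start + len(context)
        if following < len(seq):  # the final occurrence has no known successor
            candidates[seq[following]] += 1
        start = seq.find(context, start + 1)
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


history = "abcdefgabcdklmabcdexabcd"
predicted = stp_predict(history, alpha=0.5)  # 'e'
```

This sketch rescans the sequence with plain substring search for readability; a suffix tree, as the title of the presentation suggests, would answer the same queries without rescanning.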

Proof of concept of STP (1/2)
Definition. The number of symbols returned by the predictor that match the next event/symbol in the sequence, divided by the total number of symbols returned by the predictor, defines the prediction precision.

Proof of concept of STP (2/2)
Definition. The total number of symbols returned by the predictor, divided by the total number of events/symbols of the sequence, defines the prediction overhead.
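The two definitions translate directly into code. The sketch below assumes a hypothetical interface in which the predictor is a function that receives the history seen so far and returns a (possibly empty) list of distinct predicted symbols; the interface, the function name, and the toy predictor are illustrative assumptions, not part of the presentation.

```python
def precision_and_overhead(sequence, predict):
    """Compute prediction precision and overhead over one sequence.

    `predict(history)` is assumed to return a list of distinct symbols.
    """
    returned = 0  # total number of symbols returned by the predictor
    hits = 0      # returned symbols that matched the next symbol
    for t in range(1, len(sequence)):
        predicted = predict(sequence[:t])
        returned += len(predicted)
        if sequence[t] in predicted:
            hits += 1
    precision = hits / returned if returned else 0.0
    overhead = returned / len(sequence) if sequence else 0.0
    return precision, overhead


def repeat_last(history):
    """Toy predictor: always guess that the last symbol repeats."""
    return [history[-1]]


precision, overhead = precision_and_overhead("aabab", repeat_last)
# precision == 0.25 (1 hit out of 4 returned symbols), overhead == 0.8 (4 / 5)
```

A predictor that returns several symbols per step can raise its chance of a hit, but every extra symbol increases the overhead and lowers the precision unless it matches.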

Thank you for your attention! Any questions?