Discovering the constraint-based association rules in an archive for unique Bulgarian bells Tihomir Trifonov, Tsvetanka Georgieva Department of Mathematics.


Discovering the constraint-based association rules in an archive for unique Bulgarian bells. Tihomir Trifonov, Tsvetanka Georgieva, Department of Mathematics and Informatics, University of Veliko Tarnovo “St. Cyril and St. Methodius”, Bulgaria. Eleventh International Conference on System Analysis and Information Technologies, Kyiv, Ukraine, 2009. The work was supported partially by the Bulgarian National Science Fund under Grant KIN-1009/2006.

This paper presents an application that discovers constraint-based association rules. It supports association analysis of the different characteristics of the bells. Detailed information about the examined bells is maintained in an audio and video archive of unique Bulgarian bells. The application is implemented in Java and SQL and finds association rules in the data obtained by applying digital signal processing methods to the analysis of bell sounds.

Discovering association rules is a data mining task [3, 7] whose goal is to find interesting relationships between the attributes of the analyzed data. Once found, the association rules can support decision making in different areas. In many cases the algorithms generate a large number of association rules, often thousands or even millions. It is almost impossible for end users to comprehend or validate so many rules, so limiting the results of the data mining is helpful.

Association Rules
Let $I = \{I_1, I_2, \ldots, I_n\}$ be a set of $n$ different values of attributes. Let $R$ be a relation in which each tuple $t$ has a unique identifier and contains a set of items, such that $t \subseteq I$. An association rule is an implication of the form $X \rightarrow Y$, where $X, Y \subset I$ are sets of items with $X \cap Y = \emptyset$. The set $X$ is called the antecedent and $Y$ the consequent. Two parameters are associated with a rule:
• the support of the association rule $X \rightarrow Y$ is the proportion (in percentages) of the number of tuples in $R$ that contain $X \cup Y$ to the total number of tuples in the relation;
• the confidence of the association rule $X \rightarrow Y$ is the proportion (in percentages) of the number of tuples in $R$ that contain $X \cup Y$ to the number of tuples that contain $X$.
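The two definitions can be checked on a small example. The following is a minimal Python sketch, not the authors' Java and SQL implementation; the toy relation, the item strings, and the function names are all hypothetical, and the values are returned as fractions rather than percentages.

```python
def support(relation, itemset):
    """Fraction of tuples in the relation that contain every item of itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in relation if itemset <= t) / len(relation)


def confidence(relation, antecedent, consequent):
    """Tuples containing X and Y, divided by tuples containing X."""
    x = frozenset(antecedent)
    xy = x | frozenset(consequent)
    n_x = sum(1 for t in relation if x <= t)
    if n_x == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    n_xy = sum(1 for t in relation if xy <= t)
    return n_xy / n_x


# Hypothetical toy relation: each tuple is a set of attribute=value items.
relation = [
    frozenset({"type=monastery", "second_partial=600Hz", "first_partial=320Hz"}),
    frozenset({"type=monastery", "second_partial=600Hz", "first_partial=300Hz"}),
    frozenset({"type=church", "second_partial=640Hz", "first_partial=320Hz"}),
    frozenset({"type=church", "second_partial=700Hz", "first_partial=350Hz"}),
]

x = {"second_partial=600Hz"}
y = {"first_partial=320Hz"}
rule_support = support(relation, x | y)        # 1 of 4 tuples contains both items
rule_confidence = confidence(relation, x, y)   # 1 of the 2 tuples with X also has Y
```

Multiplying either value by 100 gives the percentage form used in the definitions.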

The task of association rule mining is to generate all association rules whose support and confidence exceed the user-specified minimal support min_supp and minimal confidence min_conf, respectively. Discovering association rules therefore requires finding the sets of items whose support is greater than the predefined threshold min_supp. These sets are called frequent itemsets.
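The two-step task (find frequent itemsets, then derive rules from them) can be sketched in Python as follows. This is a brute-force illustration and not the algorithm used by the authors, which the slides do not describe. The function names and the toy data are hypothetical, and the thresholds are applied with the common "at least" convention.

```python
from itertools import combinations


def frequent_itemsets(relation, min_supp):
    """Map each itemset with support >= min_supp to its support."""
    items = sorted(set().union(*relation))
    n = len(relation)
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            c = frozenset(candidate)
            supp = sum(1 for t in relation if c <= t) / n
            if supp >= min_supp:
                result[c] = supp
                found = True
        if not found:
            break  # no frequent set of this size, so no larger one can be frequent
    return result


def generate_rules(freq, min_conf):
    """Return (antecedent, consequent, support, confidence) for qualifying rules."""
    rules = []
    for itemset, supp in freq.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                x = frozenset(antecedent)
                conf = supp / freq[x]  # every subset of a frequent set is frequent
                if conf >= min_conf:
                    rules.append((x, itemset - x, supp, conf))
    return rules


# Hypothetical toy relation with three distinct items.
relation = [
    frozenset({"material=bronze", "type=church", "condition=good"}),
    frozenset({"material=bronze", "type=church"}),
    frozenset({"material=bronze", "condition=good"}),
    frozenset({"type=church"}),
]

freq = frequent_itemsets(relation, min_supp=0.5)
rules = generate_rules(freq, min_conf=0.8)
```

On this data five itemsets are frequent, and only the rule condition=good → material=bronze reaches the confidence threshold.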

Constraint-based Association Rules
To increase the efficiency of existing data mining algorithms, constraints are applied during the mining process so that only the association rules interesting to the user are generated, instead of all association rules. Constraint-based association rule mining aims to find all rules in a given dataset that satisfy the constraints specified by the user. To discover only the rules corresponding to specific patterns, meta-rules are applied in [4]. The format of the interesting rules is defined by a template, and the algorithm generates only the rules that correspond to this template.

A meta-rule [4] is a rule template of the form $P_1 \wedge P_2 \wedge \ldots \wedge P_m \rightarrow Q_1 \wedge Q_2 \wedge \ldots \wedge Q_l$, where $P_i$ ($i = 1, \ldots, m$) and $Q_j$ ($j = 1, \ldots, l$) are instantiated predicates or predicate variables, and $p = m + l$ is the number of predicates in the rule. This paper presents an application that lets the user set constraints on the searched rules and finds constraint-based association rules. The application is used to perform association analysis on the different characteristics of the bells, information about which is kept in an archive created for this purpose.
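One simple reading of such a template can be sketched in Python. The sketch assumes that every predicate names an attribute, that an instantiated predicate fixes the attribute's value, and that a predicate variable (written as None) leaves the value open. The attribute names and the helper functions are hypothetical and do not reproduce the formalism of [4] in full.

```python
def side_matches(template_side, rule_side):
    """A rule side matches when it uses exactly the template's attributes
    and agrees with every value the template fixes (None leaves it open)."""
    if set(template_side) != set(rule_side):
        return False
    return all(v is None or rule_side[a] == v for a, v in template_side.items())


def satisfies_meta_rule(template, rule):
    """Check antecedent and consequent of a rule against the template."""
    (t_ante, t_cons), (r_ante, r_cons) = template, rule
    return side_matches(t_ante, r_ante) and side_matches(t_cons, r_cons)


# Template: Type("monastery") AND SecondPartial(?) -> FirstPartial(?)
template = ({"Type": "monastery", "SecondPartial": None}, {"FirstPartial": None})

rule_a = ({"Type": "monastery", "SecondPartial": "600 Hz"}, {"FirstPartial": "320 Hz"})
rule_b = ({"Type": "church", "SecondPartial": "600 Hz"}, {"FirstPartial": "320 Hz"})
rule_c = ({"Material": "bronze"}, {"FirstPartial": "320 Hz"})

matching = [r for r in (rule_a, rule_b, rule_c) if satisfies_meta_rule(template, r)]
```

Only rule_a is kept: rule_b contradicts the instantiated predicate and rule_c uses a different set of predicates.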

Detailed data about the analysis of bells is stored in an audio and video archive of the unique Bulgarian bells [10]. The data of the archive is accessible from the project website. For each bell, the archive maintains its unique identifier, location, type, geometrical dimensions, weight, material, condition, creator, year or period of creation, description, estimation of its historical value, digital photos, sound and video files, and spectrograms.

A MatLab [11] program analyzes the sounds of the bells using various digital signal processing (DSP) methods: spectral analysis by means of the Discrete Fourier Transform (DFT), digital filtering, and wavelet analysis. The partials of the bell sounds are found by applying a digital filter. The previously calculated partials of the sounds of the different bells are stored in the archive. The presented application discovers constraint-based association rules in the archive for unique Bulgarian bells. It is implemented in Java and SQL.
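To illustrate how spectral analysis exposes partials, the Python sketch below computes a plain DFT of a synthetic signal and picks its strongest spectral peaks. It is not the authors' MatLab program, and it uses DFT peak picking in place of the digital filter mentioned above. The synthetic signal, the sample rate, and the function names are assumptions made for the example.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude of DFT bins 0..N/2 of a real signal (direct O(N^2) sum)."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def strongest_partials(samples, sample_rate, count):
    """Frequencies (Hz) of the `count` largest local maxima of the spectrum."""
    n = len(samples)
    mags = dft_magnitudes(samples)
    peaks = [
        k for k in range(1, len(mags) - 1)
        if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]
    ]
    peaks.sort(key=lambda k: mags[k], reverse=True)
    return sorted(k * sample_rate / n for k in peaks[:count])


# Synthetic stand-in for a bell sound: two sinusoids at 320 Hz and 600 Hz.
sample_rate = 4000   # Hz, assumed
n_samples = 400      # gives a frequency resolution of 10 Hz per bin
signal = [
    math.sin(2 * math.pi * 320 * i / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 600 * i / sample_rate)
    for i in range(n_samples)
]

partials = strongest_partials(signal, sample_rate, count=2)
```

Both test frequencies fall exactly on DFT bins here; a recorded bell sound would need windowing and a finer frequency resolution.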

A user who starts the application can:
• set the attributes subject to analysis;
• set the minimal support min_supp and the minimal confidence min_conf;
• set conditions (a Boolean expression) on the attribute values that may or may not participate in the antecedent and the consequent of the searched rules.
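The three kinds of user input listed above can be pictured as one settings object that each candidate rule is checked against. The Python sketch below is only an illustration. The class, its fields, and the sample values are hypothetical, and the Boolean conditions are modelled as plain functions of an attribute name and value.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class MiningSettings:
    attributes: List[str]                      # attributes subject to analysis
    min_supp: float                            # minimal support (fraction)
    min_conf: float                            # minimal confidence (fraction)
    antecedent_ok: Callable[[str, str], bool]  # condition on antecedent items
    consequent_ok: Callable[[str, str], bool]  # condition on consequent items


def accept(settings: MiningSettings, antecedent: Dict[str, str],
           consequent: Dict[str, str], supp: float, conf: float) -> bool:
    """True when a rule meets the thresholds, uses only selected attributes,
    and satisfies the user's conditions on both sides."""
    if supp < settings.min_supp or conf < settings.min_conf:
        return False
    used = list(antecedent) + list(consequent)
    if any(a not in settings.attributes for a in used):
        return False
    return (all(settings.antecedent_ok(a, v) for a, v in antecedent.items())
            and all(settings.consequent_ok(a, v) for a, v in consequent.items()))


# Hypothetical settings: the first partial may appear only in the consequent.
settings = MiningSettings(
    attributes=["Type", "FirstPartial", "SecondPartial"],
    min_supp=0.01,
    min_conf=0.4,
    antecedent_ok=lambda attr, value: attr != "FirstPartial",
    consequent_ok=lambda attr, value: attr == "FirstPartial",
)

kept = accept(settings, {"SecondPartial": "600 Hz"}, {"FirstPartial": "320 Hz"}, 0.02, 0.5)
reversed_rule = accept(settings, {"FirstPartial": "320 Hz"}, {"SecondPartial": "600 Hz"}, 0.02, 0.5)
weak_rule = accept(settings, {"SecondPartial": "600 Hz"}, {"FirstPartial": "320 Hz"}, 0.02, 0.3)
```

Applying the conditions while candidates are generated, rather than afterwards, is what lets constraint-based mining avoid producing uninteresting rules in the first place.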

Usually the user is interested in a specific subset of attributes and wants to find interesting relationships among the selected attributes. A facility with a friendly interface should therefore be provided for specifying the set of attributes to be mined and excluding irrelevant attributes from the examination. Because the user controls the selection, the user can decide which attributes appear in the antecedent and in the consequent of the searched rules.

The support reflects the utility of a given rule.
• The minimal support min_supp that an association rule has to satisfy means that each value included in the study has to appear a significant number of times in the corresponding attribute of the initial relation.
• The user can define various values of the minimal support when the items are mined.
The confidence reflects the certainty of a discovered rule.

Figure: choosing the attributes; defining the minimal support, the minimal confidence, and the conditions for the values of the attributes; outputting the found rules.

The figure shows an example result of running the implemented program with given values of the minimal support, the minimal confidence, and the conditions for the values of the attributes. For instance, let the following rule be generated from the archive for unique Bulgarian bells: SecondPartial(“600 Hz”) → FirstPartial(“320 Hz”) with support s and confidence c = 0.5. This rule means that for monastery bells with a second partial of 600 Hz, one of the most frequent values of the first partial is 320 Hz (with 50% confidence), and that monastery bells with a second partial of 600 Hz represent 2.5% of all bells included in the study.

When the association rules are presented in a tabular view, all found rules are displayed in a table in which each row corresponds to one rule and gives its support and confidence. The rules can be shown in different orders: by the values of the attributes participating in the antecedents and consequents of the discovered rules, or by the values of support and confidence in ascending or descending order. In this way the user gets a clearer and more complete view of the rules and can more easily locate a particular rule. The tabular view makes a large number of rules easier to take in.
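The ordering options of such a tabular view amount to sorting a list of rule records by one column. The Python sketch below shows this under stated assumptions: the rule records, their values, and the column layout are hypothetical and are not output of the authors' application.

```python
def sort_rules(rules, column, descending=False):
    """Return the rules ordered by one column (text or numeric)."""
    return sorted(rules, key=lambda r: r[column], reverse=descending)


def format_table(rules):
    """Render one row per rule with its support and confidence."""
    lines = [f"{'antecedent':<28}{'consequent':<24}{'support':>8}{'confidence':>11}"]
    for r in rules:
        lines.append(
            f"{r['antecedent']:<28}{r['consequent']:<24}"
            f"{r['support']:>8.3f}{r['confidence']:>11.2f}"
        )
    return "\n".join(lines)


# Hypothetical rule records.
rules = [
    {"antecedent": "SecondPartial(600 Hz)", "consequent": "FirstPartial(320 Hz)",
     "support": 0.020, "confidence": 0.50},
    {"antecedent": "Material(bronze)", "consequent": "Type(church)",
     "support": 0.300, "confidence": 0.80},
    {"antecedent": "Type(monastery)", "consequent": "Condition(good)",
     "support": 0.100, "confidence": 0.40},
]

by_confidence = sort_rules(rules, "confidence", descending=True)
by_antecedent = sort_rules(rules, "antecedent")
table = format_table(by_confidence)
```

Printing table lists the highest-confidence rule first; sorting by "antecedent" instead groups rules that share the same left-hand side.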

Monastery “St. Transfiguration” near the town of Veliko Tarnovo

Church “St. Nikolay” in Veliko Tarnovo, view close to the bell

The connection between BellDB and MatLab makes it possible to analyze the sounds of the bells and to search for a particular bell by a sample of its sound. Figure: real sound of the given bell and its 3D spectrogram, computed in MatLab.

Data acquisition of the experimental data (PULSE 11, B&K)

References
1. R. Agrawal, T. Imielinski, A. Swami, Mining Association Rules between Sets of Items in Large Databases, In Proc. of the ACM SIGMOD International Conference on Management of Data.
2. R. Agrawal, R. Srikant, Fast Algorithms for Mining Association Rules, In Proc. of the Int. Conf. on Very Large Databases.
3. A. A. Barsegyan, M. S. Kupriyanov, V. V. Stepanenko, I. I. Holod, Technologies for data analysis: Data Mining, Visual Mining, Text Mining, OLAP, BHV-Peterburg, 2008 (in Russian).
4. Y. Fu, J. Han, Meta-rule-guided mining of association rules in relational databases, In Proc. of the Int. Workshop on Integration of Knowledge Discovery with Deductive and Object-Oriented Databases.
5. T. Georgieva, Discovering Branching and Fractional Dependencies in Databases, Data and Knowledge Engineering, Elsevier, vol. 66, no. 2.
6. M. Kamber, J. Han, J. Chiang, Using Data Cubes for Metarule-Guided Mining of Multi-Dimensional Association Rules, Technical Report CMPT-TR-97-10, School of Computing Sciences, Simon Fraser University.
7. M. Kantardzic, Data Mining: Concepts, Models, Methods, and Algorithms, John Wiley & Sons.
8. S. Kotsiantis, D. Kanellopoulos, Association Rules Mining: A Recent Overview, GESTS International Transactions on Computer Science and Engineering, vol. 32 (1).
9. R. Ng, L. Lakshmanan, J. Han, A. Pang, Exploratory Mining and Pruning Optimizations of Constrained Association Rules, In Proc. of the ACM SIGMOD Conference on Management of Data.
10. T. Trifonov, T. Georgieva, Web based approach to managing audio and video archive for unique Bulgarian bells, In Proc. of the Tenth Int. Conf. on science and technology "System Analysis and Information Technologies", Kiev.
11. T. Trifonov, T. Georgieva, The bell chime – an acoustical, mathematical and technological challenge, In Proc. of the National Scientific Conference on Acoustics, Varna.
12. The Bell Project “Research and Identification of Valuable Bells of the Historic and Culture Heritage of Bulgaria and Development of Audio and Video Archive with Advanced Technologies”, Website.

Thank you for your attention!