STEGANOGRAPHY: Data Mining:

Presentation transcript:

STEGANOGRAPHY: Data Mining: SOUNDARARAJAN EZEKIEL Department of Computer Science Indiana University of Pennsylvania Indiana, PA 15705

Steganography, Cryptography, Data Mining
Steganography: the art of hiding information in ways that prevent detection of the hidden message. The existence of the message is not known.
Cryptography: the science of writing in secret code. It encodes a message so that it cannot be understood.
Data mining: discovering hidden value in your data warehouse, that is, the extraction of hidden predictive information from large databases. It is a knowledge discovery method: the extraction of implicit and interesting patterns from large data collections.

Data Mining: Introduction
Data mining started when businesses began to store data on computers, and it has continued with improvements in technology that let users navigate through data in real time.
Single case: a web server collects data for every single click. The logs are too big and contain gibberish. There are lots of data and statistics, but what is collected is not really useful.
Multiple case: a collection of web servers with large bandwidth. Think about the size of the data we collect.

Data Mining (continued)
Data mining helps to design better and more intelligent businesses (and e-learning environments) because it is supported by massive data collection, powerful multiprocessor computers, and good data mining algorithms.
It has existed for at least 10 years, but it has become popular only recently.
Example (Winter Corporation report): Data warehouses with as much as 100 to 200 terabytes of raw data will be operational by next year, performing nearly 2,000 concurrent queries and occupying nearly 1 petabyte (1,000 terabytes) of disk space. In the same time period, transaction-processing databases will handle workloads of nearly 66,000 transactions per second.

Evolution of Data Mining
Data collection (1960s). Question: "What was my total revenue over the last few years?" Technology: computers, tapes, disks. Product providers: IBM, CDC. Characteristics: retrospective, static data delivery.
Data access (1980s). Question: "What were unit sales in India in January of last year?" Technology: relational databases (RDBMS), Structured Query Language (SQL), ODBC. Product providers: Oracle, Sybase, Informix, IBM, Microsoft. Characteristics: dynamic data delivery.
Data warehousing and decision support (1990s). Question: "What was the unit sales price in India last March?" Technology: on-line analytic processing (OLAP), multidimensional databases, data warehouses. Product providers: Pilot, Comshare, Arbor, Cognos, Microstrategy. Characteristics: dynamic data delivery at multiple levels.
Data mining (now). Question: "What will the unit price be in India next month? Why?" Technology: advanced algorithms, multiprocessor computers, massive databases. Product providers: Lockheed, IBM, SGI, and many more. Characteristics: prospective, proactive information delivery.

The Scope of Data Mining
Data mining is similar to sifting gold from an immense amount of dirt: searching for valuable information in gigabytes of data.
Automated prediction of trends and behaviors: data mining automates the process of finding predictive information in a large database.
Example: questions related to targeted marketing. Data mining can use mailing-list data and other historical data to identify the best targets.
Other examples: forecasting bankruptcy, and identifying segments of a population likely to respond similarly to given events.

Automated discovery of previously unknown patterns: data mining tools sweep through the database and identify previously hidden patterns in one step.
Examples: seemingly unrelated items that are purchased together in a store; detecting fraudulent credit card transactions.
Databases can be larger in both depth and breadth. High-performance data mining needs to analyze the full depth of a database without pre-selecting subsets. Larger samples yield lower estimation errors and variance.

Research Rank
2001: according to MIT's Technology Review, data mining is a top-10 research area.
Recently: according to a Gartner Group Advanced Technology Research Note, data mining and AI are among the top 5 key research areas.

A multi-disciplinary field with broad applicability
Applications: market based analysis, customer relationship management, fraud detection, network intrusion detection, non-destructive evaluation, astronomy (look-up data), remote sensing (look-down data), text and multimedia mining, medical imaging, automated target recognition.
Data mining combines ideas from several different fields.
My point of view of data mining: it borrows ideas from machine learning, artificial intelligence, statistics, high-performance computing, signal and image processing, mathematical optimization, pattern recognition, natural language processing, steganography, and cryptography.

General view of data mining: an iterative and interactive process
Data flow: raw data, target data, preprocessed data, transformed data, patterns, knowledge.
Data processing: data fusion, sampling, multiresolution analysis (MRA), de-noising, object identification, feature extraction, normalization, dimension reduction.
Pattern recognition: classification, clustering, regression.
Interpreting results: visualization, validation.

Our Research Is Based On
Data preprocessing: multiresolution analysis, de-noising (wavelet-based methods), object classification, feature extraction.
Pattern recognition: classification, clustering, visualization and validation.
Steganography and cryptography.

Where we are going from here
More robust, accurate, and scalable algorithms for pre-processing and pattern recognition: wavelets and fractals.
Newer data types: video and multimedia, multi-sensor data.
More complex problems: dynamic tracking in video; mining text, audio, video, and images.
Investigating steganography in images: analysis of data hiding methods, attacks against hidden information, and countermeasures to attacks against digital watermarking (detection and distortion).

How does data mining work?
How exactly is data mining able to tell you important things that you did not know, or what is going to happen next?
The technique used to perform these feats in data mining is called modeling. Modeling is simply the act of building a model in one situation where you know the answer and then applying it to another situation where you don't.
Example 1: a sunken treasure ship off the Bermuda shore. You keep the information about other ships and their paths and build a model from it. If the model is good, you find the treasure in the ocean.
Example 2: identifying telephone customers. Suppose the model says that 98% of customers who make $60K per year spend more than $80 per month on long distance. With this model, new customers can be selectively targeted.

Most commonly used techniques
Artificial neural networks: nonlinear predictive models that learn through training and resemble biological neural networks in structure.
Decision trees: tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset. Specific decision tree methods include Classification and Regression Trees (CART) and Chi-Square Automatic Interaction Detection (CHAID).
Genetic algorithms: optimization techniques that use processes such as genetic combination, mutation, and selection in a design based on the concept of evolution.
Nearest neighbor method.
Rule induction.
Our methods will be based on wavelets, fractals, steganography, and cryptography.

Steganography Methods
Let us discuss a few methods and their advantages and disadvantages.
1. Least Significant Bit (LSB) method
Idea: hide the message in the least significant bits of the pixels.
Advantage: quick and easy; works well in grayscale images.
Disadvantage: inserting into 8-bit images changes the colors and produces a noticeable change; the method is vulnerable to image processing such as cropping and compression.
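The LSB idea can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the talk: the function names `embed_lsb` and `extract_lsb` are hypothetical, the cover is assumed to be a flat list of 8-bit grayscale values, and the message length is assumed to be known to the receiver.

```python
def embed_lsb(pixels, message):
    """Hide the bytes of `message` in the least significant bits of `pixels`.

    `pixels` is a flat list of 8-bit grayscale values (assumed layout).
    One message bit is stored per pixel, most significant bit first.
    """
    bits = [(byte >> (7 - i)) & 1 for byte in message for i in range(8)]
    if len(bits) > len(pixels):
        raise ValueError("cover image is too small for the message")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return stego


def extract_lsb(pixels, n_bytes):
    """Read `n_bytes` bytes back from the least significant bits."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (pixels[8 * b + i] & 1)
        out.append(value)
    return bytes(out)


# Sixteen sample gray levels used as a toy cover (enough for two bytes).
cover = [219, 215, 214, 216, 218, 218, 217, 216,
         219, 216, 216, 216, 215, 215, 215, 215]
stego = embed_lsb(cover, b"Hi")
recovered = extract_lsb(stego, 2)
```

Each pixel changes by at most one gray level, which is why the method is hard to see in grayscale images. Cropping or lossy compression destroys the low-order bits, which is the weakness noted above.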

Redundant method: store the message more than once so that it withstands cropping.
Spread spectrum: store the hidden message everywhere.
STEGANALYSIS: seeing the unseen
Detection: the analyst observes various relationships between the cover, the message, the stego-media, and the steganography tool.
Distortion: the analyst manipulates the stego-media to render the embedded information useless or to remove it altogether.

DCT: Discrete Cosine Transformation
Encode: take the image; divide it into 8x8 blocks; apply the 2-D DCT to obtain the DCT coefficients; apply a threshold value; store the hidden message in that place; take the inverse transform and store the result as an image.
Decode: start with the modified image; apply the DCT; find the coefficients less than T; extract the bits; combine the bits to make the message.
8x8 pixel block:
219 215 214 216 218 218 217 216
219 216 216 216 215 215 215 215
217 217 218 216 212 212 213 215
215 215 215 215 211 212 214 216
217 216 214 216 215 215 217 218
216 216 215 214 215 215 215 216
215 214 210 210 211 215 215 216
218 215 211 211 213 214 216 216
DCT coefficients:
1720 1.524 7.683 1.234 1.625 0.9234 -0.07047 -1.055
5.667 3.475 -4.181 -1.524 1.152 1.637 1.016 0.3802
0.3711 -1.442 1.067 5.944 0.3943 -0.4591 0.1313 0.7812
3.888 -3.356 -1.97 3.265 0.5632 -0.939 -0.2434 0.2354
1.625 -2.279 0.4735 1.392 1.375 0.6552 -1.143 0.03459
-4.049 -1.223 0.5466 -0.5425 -1.013 -0.2651 0.5696 -0.9296
1.876 1.924 -1.369 -1.132 -0.02802 -0.4646 0.1831 0.9729
0.8995 -0.7233 0.667 0.436 0.1325 -0.03665 -0.3141 -0.4749
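The encode and decode steps can be sketched on the 8x8 pixel block from the slide. The slide does not say exactly how a bit is stored in a below-threshold coefficient, so the rule used here is an assumption made for illustration: a coefficient with magnitude below T is overwritten with +T/2 for a 1 bit and -T/2 for a 0 bit. The names `dct2`, `idct2`, `embed`, and `extract` are hypothetical, and the transform is assumed to be the orthonormal DCT-II, which matches the value 1720 in the first coefficient position.

```python
import math

# 8x8 pixel block copied from the slide.
BLOCK = [
    [219, 215, 214, 216, 218, 218, 217, 216],
    [219, 216, 216, 216, 215, 215, 215, 215],
    [217, 217, 218, 216, 212, 212, 213, 215],
    [215, 215, 215, 215, 211, 212, 214, 216],
    [217, 216, 214, 216, 215, 215, 217, 218],
    [216, 216, 215, 214, 215, 215, 215, 216],
    [215, 214, 210, 210, 211, 215, 215, 216],
    [218, 215, 211, 211, 213, 214, 216, 216],
]


def dct_matrix(n):
    """Orthonormal DCT-II matrix C; its rows are the cosine basis vectors."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct2(block):
    """2-D DCT: C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse 2-D DCT: C^T * coeffs * C (C is orthogonal)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def embed(block, bits, t=1.0):
    """Assumed rule: write each bit as +t/2 or -t/2 into a coefficient below t."""
    coeffs = dct2(block)
    pending = list(bits)
    for row in coeffs:
        for v in range(len(row)):
            if pending and abs(row[v]) < t:
                row[v] = t / 2 if pending.pop(0) else -t / 2
    if pending:
        raise ValueError("not enough small coefficients for the message")
    return idct2(coeffs)


def extract(stego, n_bits, t=1.0):
    """Read the sign of the first n_bits coefficients whose magnitude is below t."""
    bits = []
    for row in dct2(stego):
        for c in row:
            if len(bits) < n_bits and abs(c) < t:
                bits.append(1 if c > 0 else 0)
    return bits


message = [1, 0, 1, 1, 0, 0, 1, 0]
stego_block = embed(BLOCK, message)
decoded = extract(stego_block, len(message))
```

The sketch keeps the stego block in floating point. Rounding the pixels back to integers perturbs the coefficients, so a practical scheme would need a more robust rule (for example, quantization) than this sign trick.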

Wavelet Transformation
Wavelets are basis functions in continuous time. A basis is a set of linearly independent functions that can be used to produce all admissible functions f(t).
The special feature of a wavelet basis is that all of its functions are constructed from a single mother wavelet w(t). This wavelet is a small wave (a pulse). Normally it starts at time t = 0 and ends at time t = N.
Shifted k times: w(t - k). Compressed: w(2^j t). Combining both, we have w_jk(t) = w(2^j t - k).
History: 1909, Haar wavelet; 1984, theory; 1988, Daubechies; 1989, Mallat (2-D, multiresolution analysis); 1992, biorthogonal wavelets.
Haar wavelet: w(t) = 1 for 0 <= t < 1/2, w(t) = -1 for 1/2 <= t < 1, and w(t) = 0 otherwise.
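As a concrete illustration, the Haar mother wavelet, its shifted and compressed copies, and one level of the discrete Haar transform can be written as follows. This is a sketch under standard textbook definitions; the function names are hypothetical and the sample signal is simply a row of eight gray levels.

```python
import math


def haar_wavelet(t):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1:
        return -1.0
    return 0.0


def w_jk(t, j, k):
    """Compressed by 2^j and shifted by k: w(2^j t - k)."""
    return haar_wavelet((2 ** j) * t - k)


def haar_step(signal):
    """One level of the orthonormal discrete Haar transform (even length)."""
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_step: rebuild each pair of samples."""
    s = 1 / math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * s, (a - d) * s])
    return out


row = [219, 215, 214, 216, 218, 218, 217, 216]
approx, detail = haar_step(row)
rebuilt = haar_inverse(approx, detail)
```

For a smooth signal the detail coefficients are small. That is what makes thresholding, compression, and de-noising in the wavelet domain possible.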

[Figure: wavelet-based steganography pipeline. Labels: carrier, message to be hidden, wavelet transformation, thresholding, compression, stego image, error image, inverse transformation, extract the hidden message.]

Information security and data mining
Goal of intrusion detection: discover intrusions into a computer or network.
With the Internet and readily available tools for attacking networks, security becomes a critical component of a network.
Misuse detection: finds intrusions by looking for activity that corresponds to known intrusion techniques.
Anomaly detection: the system defines the expected behavior of the network in advance and flags significant deviations from it as possible intrusions.
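The two detection styles can be contrasted with a toy example. Everything in it is hypothetical and chosen only for illustration: the signature strings, the hourly request counts, and the three-standard-deviation rule are assumptions, not part of the talk.

```python
from statistics import mean, stdev

# Misuse detection: look for activity matching known attack patterns.
# These signature strings are made-up examples.
KNOWN_SIGNATURES = ("/etc/passwd", "' OR 1=1")


def matches_signature(request):
    """Return True if the request contains any known attack signature."""
    return any(sig in request for sig in KNOWN_SIGNATURES)


# Anomaly detection: define expected behavior in advance, flag deviations.
def fit_baseline(counts):
    """Summarize normal behavior as (mean, standard deviation)."""
    return mean(counts), stdev(counts)


def is_anomalous(value, baseline, k=3.0):
    """Flag values more than k standard deviations from the baseline mean."""
    mu, sigma = baseline
    return abs(value - mu) > k * sigma


# Hypothetical requests-per-hour observed during normal operation.
normal_hours = [100, 104, 98, 101, 97, 103, 99, 102]
baseline = fit_baseline(normal_hours)
```

Misuse detection only catches attacks it already has a signature for. Anomaly detection can flag new attacks, at the cost of false alarms whenever legitimate behavior changes.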

What we want
Tools to filter and classify information
Tools to find and retrieve the relevant information when you need it
Tools that adapt to your pace and needs
Tools to predict information needs
Tools to recommend tasks and information sources
Tools that can be personalized, manually or automatically

The tools should be...
Non-intrusive, secure, integrated, adaptable, controllable, and automatic or semi-automatic.
Useful for learners and for educators.
Able to integrate operational data with customer, supplier, and market data.

Profitable applications
A wide range of companies have deployed successful applications of data mining. Some application areas include:
A pharmaceutical company can analyze its recent sales force activity and results to improve its targeting of high-value physicians and to determine which marketing activities will have the greatest impact in the next few months.
A credit card company can leverage its vast warehouse of customer transaction data to identify the customers most likely to be interested in a new credit product.
A diversified transportation company with a large direct sales force can apply data mining to identify the best prospects for its services.
A large consumer packaged goods company can apply data mining to improve its sales process to retailers.

Conclusion
In this talk, we have discussed data mining related topics.
Our goals: research, software and algorithms, and applications.
Our main focus is science data, though the work is applicable to other data sets as well.
More information: check out the website http://www.cosc.iup.eud/sezekiel
Contact: sezekiel@iup.edu