1
STEGANOGRAPHY: Data Mining:
SOUNDARARAJAN EZEKIEL Department of Computer Science Indiana University of Pennsylvania Indiana, PA 15705
2
Steganography Cryptography Data Mining
Steganography: the art of hiding information in ways that prevent detection of the hidden message. The existence of the message is not known.
Cryptography: the science of writing in secret code. It encodes a message so that it cannot be understood.
Data mining: discovering hidden value in your data warehouse, that is, the extraction of hidden predictive information from large databases. It is a knowledge discovery method: the extraction of implicit and interesting patterns from large data collections.
3
Data Mining-- Introduction
Data mining started when businesses began storing data on computers, and it has continued with improvements in technology that navigates through data in real time.
Examples:
Single case: a web server collects data for every single click. The logs are too big and contain gibberish: lots of data and statistics, but much of what is collected is not really useful.
Multiple case: a collection of web servers with large bandwidth. Think about the size of the data we collect.
4
Data Mining (Continued)
Data mining helps to design better and more intelligent businesses (and e-learning environments) because it is supported by:
Massive data collection
Powerful multiprocessor computers
Good data mining algorithms
It has existed for at least 10 years, but it has become popular only recently.
Example: a Winter Corporation report. Data warehouses with as much as 100 to 200 terabytes of raw data will be operational by next year, performing nearly 2,000 concurrent queries and occupying nearly 1 petabyte (1,000 terabytes) of disk space. In the same time period, transaction-processing databases will handle workloads of nearly 66,000 transactions per second.
5
Evolution of Data mining
| Evolutionary step | Question | Technology | Product providers | Characteristics |
|---|---|---|---|---|
| Data collection (1960s) | What was my total revenue in the last few years? | Computers, tapes, disks | IBM, CDC | Retrospective, static data delivery |
| Data access (1980s) | What were unit sales in India in January of last year? | RDBMS (relational databases), SQL (Structured Query Language), ODBC | Oracle, Sybase, Informix, IBM, Microsoft | Dynamic data delivery |
| Data warehousing and decision support (1990s) | What was the unit sales price in India last March? | On-line analytic processing (OLAP), multidimensional databases, data warehouses | Pilot, Comshare, Arbor, Cognos, Microstrategy | Dynamic data delivery at multiple levels |
| Data mining (now) | What will the unit price be in India next month? Why? | Advanced algorithms, multiprocessor computers, massive databases | Lockheed, IBM, SGI, many more | Prospective, proactive information delivery |
6
The scope of Data mining
Data mining is similar to sifting gold from an immense amount of dirt: searching for valuable information in gigabytes of data.
Automated prediction of trends and behaviors: data mining automates the process of finding predictive information in a large database.
Example: questions related to targeted marketing. Data mining can use mailing list data and other previous data to find the answer.
Other examples: forecasting bankruptcy, and identifying segments of a population likely to respond similarly to given events.
7
Data base can be larger in both depth and breadth
Automated discovery of previously unknown patterns: data mining tools sweep through the database and identify previously hidden patterns in one step.
Examples: seemingly unrelated items that are purchased together in a store; detecting fraudulent credit card transactions.
Databases can be larger in both depth and breadth. High-performance data mining needs to analyze the full depth of a database without pre-selecting subsets. Larger samples yield lower estimation errors and variances.
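The "items purchased together" example can be made concrete with a small co-occurrence count. This is only a minimal sketch: the function name `pair_counts` and the sample baskets are invented for illustration, and real pattern-discovery tools use more elaborate measures than raw counts.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how often each unordered pair of items appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so that each pair has one canonical form.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


# Hypothetical purchase data (not from the talk).
baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "beer"],
]
counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
```

Pairs with unexpectedly high counts are candidates for the kind of hidden pattern the slide describes.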
8
Research rank
2001: according to MIT’s Technology Review, data mining is a top-10 research area.
Recently: according to a Gartner Group Advanced Technology Research Note, data mining and AI are among the top 5 key research areas.
9
Multi-disciplinary field with a broad applicability
Has several applications:
Market basket analysis
Customer relationship management
Fraud detection
Network intrusion detection
Non-destructive evaluation
Astronomy (look-up data)
Remote sensing (look-down data)
Text and multimedia mining
Medical imaging
Automated target recognition
Combines ideas from several different fields, borrowing from:
Machine learning
Artificial intelligence
Statistics
High-performance computing
Signal and image processing
Mathematical optimization
Pattern recognition
Natural language processing
Steganography and cryptography (my point of view of data mining)
10
General view of Data mining
[Figure: the data mining process, an iterative and interactive process. Data stages: Raw Data, Target Data, Preprocessed Data, Transformed Data, Patterns, Knowledge. Data processing steps: data fusion, sampling, multiresolution analysis (MRA), de-noising, object identification, feature extraction, normalization, dimension reduction. Pattern recognition steps: classification, clustering, regression. Interpreting results: visualization, validation.]
11
Our Research Based On Data Preprocessing Pattern Recognition
Data preprocessing: multiresolution analysis, de-noising (wavelet-based methods), object classification, feature extraction
Pattern recognition: classification, clustering
Visualization and validation
Steganography
Cryptography
12
Where we are going from here
More robust, accurate, and scalable algorithms for pre-processing and pattern recognition: wavelets and fractals
Newer data types: video and multimedia; multi-sensor data
More complex problems: dynamic tracking in video; mining text, audio, video, and images
Investigating steganography in images: analysis of data hiding methods, attacks against hidden information, and countermeasures to attacks against digital watermarking (detection and distortion)
13
How does data mining work?
How exactly is data mining able to tell you important things that you did not know, or what is going to happen next? The technique used to perform these feats is called modeling. Modeling is simply the act of building a model in one situation where you know the answer and then applying it to another situation where you don’t.
Example 1: a sunken treasure ship off the Bermuda shore. Keep information about other ships and their paths, and build a model from it. If the model is good, you find the treasure in the ocean.
Example 2: identifying telephone customers. Suppose your model says that 98% of customers who make $60K per year spend more than $80 per month on long distance. With this model, new customers can be selectively targeted.
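The telephone-customer example can be sketched in a few lines: estimate a rule from customers whose spending is known, then apply it to prospects whose spending is not. Everything here is an assumption made for illustration (the function names, the record layout, the 0.9 cut-off, and the sample data); it is not the model used in the talk.

```python
def fit_rule(history, income_cutoff=60_000, spend_cutoff=80):
    """Fraction of known high-income customers who spend above spend_cutoff."""
    high_income = [c for c in history if c["income"] >= income_cutoff]
    if not high_income:
        return 0.0
    heavy = sum(1 for c in high_income if c["spend"] > spend_cutoff)
    return heavy / len(high_income)


def select_targets(prospects, rate, income_cutoff=60_000, min_rate=0.9):
    """Apply the learned rule to prospects whose spending is unknown."""
    if rate < min_rate:
        return []  # the rule is too weak to justify targeting
    return [p["name"] for p in prospects if p["income"] >= income_cutoff]


# Hypothetical data: the "situation where you know the answer".
history = [
    {"income": 75_000, "spend": 95},
    {"income": 62_000, "spend": 110},
    {"income": 88_000, "spend": 85},
    {"income": 30_000, "spend": 20},
]
# The "situation that you don't": income known, spending unknown.
prospects = [
    {"name": "A", "income": 70_000},
    {"name": "B", "income": 25_000},
    {"name": "C", "income": 61_000},
]
rate = fit_rule(history)
targets = select_targets(prospects, rate)
```

The two functions mirror the two halves of modeling: `fit_rule` builds the model, `select_targets` applies it.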
14
Most commonly used techniques
Artificial neural networks: non-linear predictive models that learn through training and resemble biological neural networks in structure.
Decision trees: tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset. Specific decision tree methods include Classification and Regression Trees (CART) and Chi-Square Automatic Interaction Detection (CHAID).
Genetic algorithms: optimization techniques that use processes such as genetic combination, mutation, and selection in a design based on the concept of evolution.
Nearest neighbor method:
Rule induction:
OUR METHODS WILL BE BASED ON WAVELETS, FRACTALS, STEG, AND CRYPT
15
Steganography Methods
Let us discuss a few methods and their advantages and disadvantages.
1. Least significant bit (LSB) method
Idea: hide the message in the least significant bits (LSBs) of the pixels.
Example:
Advantage: quick and easy; works well in gray-scale images.
Disadvantage: inserting into 8-bit images changes colors, which can be noticeable; vulnerable to image processing such as cropping and compression.
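A minimal sketch of the LSB idea, assuming the cover image is available as a flat list of 8-bit gray values. The function names `embed_lsb` and `extract_lsb` and the sample cover are hypothetical, and the receiver is assumed to know the message length.

```python
def embed_lsb(pixels, message: bytes):
    """Write the message bits, most significant bit first, into the pixel LSBs."""
    bits = [(byte >> (7 - i)) & 1 for byte in message for i in range(8)]
    if len(bits) > len(pixels):
        raise ValueError("cover image is too small for this message")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the LSB, then set it to the bit
    return stego


def extract_lsb(pixels, n_bytes):
    """Read n_bytes back out of the pixel LSBs."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (pixels[b * 8 + i] & 1)
        out.append(value)
    return bytes(out)


cover = list(range(100, 180))  # 80 hypothetical gray values
stego = embed_lsb(cover, b"hi")
recovered = extract_lsb(stego, 2)
max_change = max(abs(a - b) for a, b in zip(cover, stego))
```

Each pixel changes by at most one gray level, which is why the method works well in gray-scale images; any processing that alters low-order bits (such as lossy compression) destroys the message.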
16
STEGANALYSIS Detection Distortion
Redundant method: store the message more than once, so that it withstands cropping.
Spread spectrum: store the hidden message everywhere.
STEGANALYSIS (seeing the unseen)
Detection: the analyst observes various relationships between the cover, message, stego-media, and steganography tool.
Distortion: the analyst manipulates the stego-media to render the embedded information useless or remove it altogether.
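The redundant method can be illustrated by writing several copies of the message bits and recovering each bit by majority vote. This is a sketch under stated assumptions: the copies are stored back-to-back in pixel LSBs, damage is simulated by blanking a region, and the names `embed_redundant` and `extract_majority` are invented. Real cropping also shifts pixel positions, which this sketch does not handle.

```python
def embed_redundant(cover, bits, copies):
    """Store `copies` back-to-back copies of the bit list in the pixel LSBs."""
    if len(bits) * copies > len(cover):
        raise ValueError("cover is too small for the requested redundancy")
    stego = list(cover)
    for c in range(copies):
        for i, bit in enumerate(bits):
            pos = c * len(bits) + i
            stego[pos] = (stego[pos] & ~1) | bit
    return stego


def extract_majority(stego, n_bits, copies):
    """Recover each bit by majority vote across the stored copies."""
    out = []
    for i in range(n_bits):
        votes = sum(stego[c * n_bits + i] & 1 for c in range(copies))
        out.append(1 if votes * 2 > copies else 0)
    return out


bits = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_redundant([128] * 60, bits, copies=3)
stego[0:8] = [0] * 8  # simulate losing the region that holds the first copy
recovered = extract_majority(stego, len(bits), copies=3)
```

With three copies, the message survives the loss of any single copy; more copies buy more robustness at the cost of capacity.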
17
DCT - Discrete Cosine Transformation
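Encode:
1. Take the image.
2. Divide it into 8x8 blocks.
3. Apply the 2-D DCT to obtain the DCT coefficients.
4. Apply a threshold value T.
5. Store the hidden message in the coefficients that fall below the threshold.
6. Take the inverse DCT and store the result as an image.
Decode:
1. Start with the modified image.
2. Apply the DCT.
3. Find the coefficients less than T.
4. Extract the bits.
5. Combine the bits to form the message.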
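The encode and decode steps can be sketched for a single 8x8 block. The slide does not say how a bit is written into a below-threshold coefficient, so this sketch makes its own assumptions: a bit is stored as the sign of the coefficient (+T/2 for 1, -T/2 for 0), the DC coefficient is skipped, pixel values stay as floats (no rounding or JPEG quantization), and the decoder knows the number of bits. All function names are hypothetical.

```python
import math

N = 8  # block size


def _alpha(k):
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


# Orthonormal DCT-II basis matrix: C[k][n] = alpha(k) * cos((2n + 1) k pi / (2N)).
C = [[_alpha(k) * math.cos((2 * n + 1) * k * math.pi / (2 * N)) for n in range(N)]
     for k in range(N)]
CT = [list(col) for col in zip(*C)]


def _matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def dct2(block):
    """2-D DCT of an 8x8 block: C * X * C^T."""
    return _matmul(_matmul(C, block), CT)


def idct2(coeffs):
    """Inverse 2-D DCT: C^T * Y * C."""
    return _matmul(_matmul(CT, coeffs), C)


def _slots(coeffs, threshold):
    """Positions (row-major, DC skipped) whose magnitude is below the threshold."""
    return [(u, v) for u in range(N) for v in range(N)
            if (u, v) != (0, 0) and abs(coeffs[u][v]) < threshold]


def embed_block(block, bits, threshold):
    coeffs = dct2(block)
    slots = _slots(coeffs, threshold)
    if len(bits) > len(slots):
        raise ValueError("not enough below-threshold coefficients in this block")
    for (u, v), bit in zip(slots, bits):
        # Assumption: the sign carries the bit; the magnitude stays below T.
        coeffs[u][v] = threshold / 2 if bit else -threshold / 2
    return idct2(coeffs)


def extract_block(stego_block, n_bits, threshold):
    coeffs = dct2(stego_block)
    slots = _slots(coeffs, threshold)[:n_bits]
    return [1 if coeffs[u][v] > 0 else 0 for (u, v) in slots]


# A smooth hypothetical block, so that most coefficients fall below T.
block = [[100.0 + 2 * i + 3 * j for j in range(N)] for i in range(N)]
bits = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0]
stego_block = embed_block(block, bits, threshold=4.0)
recovered = extract_block(stego_block, len(bits), threshold=4.0)
```

Because modified coefficients stay below T and untouched ones keep their magnitude, the decoder finds the same positions as the encoder. A practical scheme would also have to survive rounding pixels back to integers.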
18
Wavelets Transformation
Wavelets are basis functions in continuous time. A basis is a set of linearly independent functions that can be used to produce all admissible functions $f(t)$.
The special feature of the wavelet basis is that all functions are constructed from a single mother wavelet $w(t)$. This wavelet is a small wave (a pulse). Normally it starts at time $t=0$ and ends at time $t=N$.
Shifted $k$ times: $w_{0k}(t) = w(t-k)$
Compressed $j$ times: $w_{j0}(t) = w(2^{j}t)$
Combining both, we have: $w_{jk}(t) = w(2^{j}t-k)$
History: Haar; theory; 1988, Daubechies; Mallat (2-D, MRA); bi-orthogonal wavelets
Haar wavelet: $w(t) = 1$ for $0 \le t < 1/2$, $w(t) = -1$ for $1/2 \le t < 1$, and $w(t) = 0$ otherwise
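A short sketch of the Haar case, assuming the usual definitions: the mother wavelet is +1 on [0, 1/2) and -1 on [1/2, 1), and the shifted and compressed wavelets are w(2^j t - k). The one-level transform below (scaled sums and differences of neighbouring samples) is the standard discrete Haar step; the function names are hypothetical.

```python
import math


def haar(t):
    """Haar mother wavelet w(t): +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0


def w_jk(j, k, t):
    """Wavelet compressed j times and shifted k times: w(2^j * t - k)."""
    return haar((2 ** j) * t - k)


def haar_step(signal):
    """One level of the discrete Haar transform: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) * s for a, b in pairs]
    detail = [(a - b) * s for a, b in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_step exactly (up to floating-point rounding)."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * s, (a - d) * s])
    return out


signal = [4.0, 6.0, 10.0, 12.0]
approx, detail = haar_step(signal)
restored = haar_inverse(approx, detail)
```

Small detail coefficients mark smooth regions of the signal, which is what makes thresholding in the wavelet domain useful for de-noising and for choosing where to hide data.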
19
Inverse Transformation Extract the Hidden Message
[Figure: wavelet-based data hiding. Labels: Carrier, Wavelet Transformation, Thresholding, Compression, Message to be Hidden, Error Image, Inverse Transformation, Stego image, Extract the Hidden Message.]
20
Information security and data mining
Goal of intrusion detection: discover intrusions into a computer or network.
With the Internet and readily available tools for attacking networks, security becomes a critical component of the network.
Misuse detection: finds intrusions by looking for activity corresponding to known intrusion techniques.
Anomaly detection: the system defines the expected behavior of the network in advance, and significant deviations from it are flagged as possible intrusions.
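The difference between the two approaches can be shown with a toy sketch. The signature strings, the baseline statistics, the 3-sigma limit, and the function names are all assumptions made for illustration; production intrusion detection systems are far more involved.

```python
# Hypothetical signatures of known attack techniques (misuse detection).
KNOWN_SIGNATURES = ("../", "' OR 1=1", "/etc/passwd")


def misuse_alerts(requests):
    """Flag requests that contain a known attack signature."""
    return [r for r in requests if any(sig in r for sig in KNOWN_SIGNATURES)]


def anomaly_alerts(counts_per_host, baseline_mean, baseline_std, k=3.0):
    """Flag hosts whose request count exceeds the expected behavior.

    Expected behavior is defined in advance as mean + k * std of a baseline.
    """
    limit = baseline_mean + k * baseline_std
    return sorted(h for h, c in counts_per_host.items() if c > limit)


requests = [
    "GET /index.html",
    "GET /../../etc/passwd",
    "GET /login?user=a' OR 1=1",
]
counts = {"10.0.0.1": 12, "10.0.0.2": 250, "10.0.0.3": 9}

flagged_requests = misuse_alerts(requests)
flagged_hosts = anomaly_alerts(counts, baseline_mean=10.0, baseline_std=2.0)
```

Misuse detection only catches what is already in its signature list, while anomaly detection can catch new attacks at the price of false alarms whenever legitimate behavior changes.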
21
What we want
Tools to filter and classify information
Tools to find and retrieve the relevant information when you need it
Tools that adapt to your pace and needs
Tools to predict information needs
Tools to recommend tasks and information sources
Tools that can be personalized, manually or automatically
22
The tools should be…
Non-intrusive
Secure
Integrated
Adaptable
Controllable
Automatic or semi-automatic
Useful: for learners, for educators
Integrate operational data with customers, suppliers, and market --
23
Profitable application
A wide range of companies have deployed successful applications of data mining. Some application areas include:
A pharmaceutical company can analyze its recent sales force activity and results to improve targeting of high-value physicians and determine which marketing activities will have the greatest impact in the next few months.
A credit card company can leverage its vast warehouse of customer transaction data to identify customers most likely to be interested in a new credit product.
A diversified transportation company with a large direct sales force can apply data mining to identify the best prospects for its services.
A large consumer packaged goods company can apply data mining to improve its sales process to retailers.
24
Conclusion
In this talk, we have discussed data mining and related topics.
Our goals: research; software and algorithms; applications.
Our main focus is science data, though the methods are applicable to other data sets as well.
More information: check out the website.
Contact: