Published by Joseph Lawson. Modified over 9 years ago.
Slide 1: Artificial Immune Based Approach to Association Rule Mining
By: B. Hoda Helmi
Supervisor: Adel T. Rahmani
January 2008
A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Artificial Intelligence, Computer Engineering
Slide 2: Outline
◦ The Immune System: Natural and Artificial
◦ Association Rules
◦ Web Usage Mining
◦ Proposed Algorithm: AISWUM
◦ Results and Conclusion
Slide 3: Natural Immune System
◦ Immune system: a system that protects the body from foreign substances and pathogenic organisms.
◦ Antigen: a substance capable of starting a specific immune response (viruses, bacteria, fungi).
◦ Antibody: created by the immune system to match the antigens and cause the pathogens to be destroyed.
Slide 4: A High Level Overview
Slide 5: Natural Immune System
Immunity:
◦ Innate: Danger Theory
◦ Adaptive: Clonal Selection, Network Theory, Affinity Maturation, Hypermutation
Slide 6: Innate versus Adaptive IS
◦ Innate: immediately available for combat.
◦ Adaptive: antibody production specific to a given infectious agent.
Slide 7: Adaptive Immunity
Figure labels: epitope; low-affinity receptor; structurally similar, high affinity.
Slide 8: Clonal Selection & Affinity Maturation
Slide 9: Network Theory
Idiotypic network (Jerne, 1974): B cells stimulate each other, which creates an immunological memory.
Figure labels: antigen (Ag); stimulation (positive response); suppression (negative response).
Slide 10: Danger Theory
Slide 11: Artificial Immune System
A Framework for AIS: Algorithms, Affinity, Representation.
Slide 12: Association Rules
◦ Set of items: $I = \{I_1, I_2, \dots, I_m\}$
◦ Transactions: $D = \{t_1, t_2, \dots, t_n\}$, with $t_j \subseteq I$
◦ Itemset: $\{I_{i1}, I_{i2}, \dots, I_{ik}\} \subseteq I$
◦ Large (frequent) itemset: an itemset whose number of occurrences is above a threshold.
◦ Support of an itemset: the percentage of transactions that contain that itemset.
Slide 13
Given a set of items $I = \{I_1, I_2, \dots, I_m\}$ and a database of transactions $D = \{t_1, t_2, \dots, t_n\}$, where $t_i = \{I_{i1}, I_{i2}, \dots, I_{ik}\}$ and $I_{ij} \in I$, the Association Rule Problem is to identify all association rules $X \Rightarrow Y$ with a minimum support and confidence.
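The support and confidence thresholds in the problem statement can be made concrete with a small sketch. The function names and the four toy transactions below are illustrative assumptions, not part of the thesis; confidence uses the standard definition support(X ∪ Y) / support(X).

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(x: Iterable[str], y: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Confidence of the rule X => Y: support(X u Y) / support(X)."""
    x_set, y_set = frozenset(x), frozenset(y)
    support_x = support(x_set, transactions)
    if support_x == 0.0:
        return 0.0
    return support(x_set | y_set, transactions) / support_x


# Toy transaction database (hypothetical data).
transactions = [frozenset(t) for t in (
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
)]

support_ab = support({"A", "B"}, transactions)          # 2 of 4 transactions
confidence_a_b = confidence({"A"}, {"B"}, transactions)  # 0.5 / 0.75
```

A rule X ⇒ Y is reported only when both values reach the user-chosen minimums.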
Slide 14: Association Rule Mining Steps
1. Find frequent itemsets (the challenging step in association rule mining).
2. Generate rules from the frequent itemsets.
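The two steps can be sketched with a brute-force, level-by-level search. This is a textbook-style illustration under assumed names and toy data, not the immune-based method proposed in this thesis; it only shows what "frequent itemsets first, rules second" means.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Step 1: collect every itemset whose support reaches min_support."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            hits = sum(1 for t in transactions if set(candidate) <= t)
            if hits / n >= min_support:
                frequent[frozenset(candidate)] = hits / n
                found_any = True
        if not found_any:
            # No frequent itemset of this size, so none of a larger size.
            break
    return frequent


def generate_rules(frequent, min_confidence):
    """Step 2: split each frequent itemset into antecedent => consequent."""
    rules = []
    for itemset, itemset_support in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                lhs = frozenset(antecedent)
                # Every subset of a frequent itemset is itself frequent.
                conf = itemset_support / frequent[lhs]
                if conf >= min_confidence:
                    rules.append((lhs, itemset - lhs, itemset_support, conf))
    return rules


# Hypothetical toy data.
transactions = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}]
frequent = frequent_itemsets(transactions, min_support=0.5)
rules = generate_rules(frequent, min_confidence=0.6)
```

The first step dominates the cost because the number of candidate itemsets grows exponentially with the number of items, which is why it is the step that alternative search strategies target.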
Slide 15: Goal
In this project our goal is to find all the frequent itemsets in Web usage data using an artificial immune system.
Slide 16: Web Usage Mining
Web usage mining, also known as Web log mining, applies mining techniques to discover interesting usage patterns from the secondary data derived from users' interactions while surfing the Web.
Slide 17: Web Usage Mining Applications
◦ Target potential customers for electronic commerce
◦ Enhance the quality and delivery of Internet information services to the end user
◦ Improve Web server system performance
◦ Identify potential prime advertisement locations
◦ Facilitate personalization and adaptive sites
◦ Improve site design
◦ Fraud and intrusion detection
◦ Predict users' actions (allows prefetching)
Slide 18: Motivations (for choosing this application)
Slide 19: WUM Definitions
◦ Web logs: the set of all accesses to URLs of a Web site, stored on the Web server.
◦ Session: a sequence of URLs accessed by a user in one visit to the Web site (an itemset).
◦ Strong trend: crowded paths that are frequently traversed by users (frequent itemsets).
Slide 20: Web Log
O:0000002560 || T:1997/09/12-22:43:00 || U:/ || R:http://www.hyperreal.org/
O:0000002560 || T:1997/09/12-22:50:27 || U:/categories/software/ || R:http://www.hyperreal.org/music/machines/
O:0000002560 || T:1997/09/12-22:50:38 || U:/categories/software/Windows/ || R:http://www.hyperreal.org/music/machines/categories/software/
O:0000002560 || T:1997/09/12-22:50:47 || U:/categories/software/Windows/V909V03.TXT || R:http://www.hyperreal.org/music/machines/categories/software/Windows/
O:0000002560 || T:1997/09/12-22:51:06 || U:/categories/software/Windows/ || R:http://www.hyperreal.org/music/machines/categories/software/
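A record in this format can be split into fields as sketched below. The meaning assigned to each key (O as the visitor identifier, T as the timestamp, U as the requested URL, R as the referrer) is an assumption read off the sample lines, and `parse_log_line` is a hypothetical helper name.

```python
from datetime import datetime


def parse_log_line(line):
    """Split one 'O:.. || T:.. || U:.. || R:..' record into named fields."""
    fields = {}
    for part in line.split("||"):
        # Only the first ':' separates key from value; URLs contain more.
        key, _, value = part.strip().partition(":")
        fields[key] = value.strip()
    return {
        "origin": fields["O"],  # assumed: identifier of the visitor
        "time": datetime.strptime(fields["T"], "%Y/%m/%d-%H:%M:%S"),
        "url": fields["U"],
        "referrer": fields["R"],
    }


record = parse_log_line(
    "O:0000002560 || T:1997/09/12-22:50:27 || U:/categories/software/ "
    "|| R:http://www.hyperreal.org/music/machines/"
)
```

Grouping parsed records by the origin field and by gaps in the timestamps is one common way to build sessions; the exact session rule used in the thesis is not given on the slide.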
Slide 21: Session Construction
ID   URL
0    /
1    /categories/software/
2    /categories/software/Windows/
3    /categories/software/Windows/V909V03.TXT
4    /categories
5    /manufacturers
6    /samples.html/
7    /gearlists/
8    /features/
9    /ecards/
Example session:
◦ Visited URLs (binary): 1111000000
◦ Duration: 07:27 00:11 02:10 00:19 02:01 00:00
◦ Frequency: 1121000000
Slide 22: Representation
◦ Antibody (strong trends): URL1 (0/1), URL2 (0/1), ..., URLm (0/1)
◦ Antigen (incoming sessions): URL1 (0/1), URL2 (0/1), ..., URLm (0/1)
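The binary representation can be sketched as follows, using the ten URL identifiers from the Session Construction slide. The function name `encode_session` and the decision to ignore unknown URLs are assumptions made for this illustration.

```python
# URL identifiers as listed on the Session Construction slide.
URL_IDS = {
    "/": 0,
    "/categories/software/": 1,
    "/categories/software/Windows/": 2,
    "/categories/software/Windows/V909V03.TXT": 3,
    "/categories": 4,
    "/manufacturers": 5,
    "/samples.html/": 6,
    "/gearlists/": 7,
    "/features/": 8,
    "/ecards/": 9,
}


def encode_session(visited_urls, url_ids=URL_IDS):
    """Return a 0/1 vector with a 1 at the index of every visited URL."""
    vector = [0] * len(url_ids)
    for url in visited_urls:
        index = url_ids.get(url)
        if index is not None:  # assumption: URLs outside the table are skipped
            vector[index] = 1
    return vector


# The session from the Web Log slide; one URL is requested twice.
session = [
    "/",
    "/categories/software/",
    "/categories/software/Windows/",
    "/categories/software/Windows/V909V03.TXT",
    "/categories/software/Windows/",
]
antigen = encode_session(session)
antigen_bits = "".join(str(bit) for bit in antigen)
```

Antibodies use the same layout, so an antigen and an antibody can be compared position by position.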
Slide 23: Scenario
1. An antigen enters the body.
2. Determine whether the first signal is produced. (Two signals are needed for an antigen to trigger the AIS; the first signal is produced if the antigen is harmful to the body.)
3. If the first signal is produced, present the antigen to the antibodies and compute distance, weight and influence zone.
4. Determine the antibody with maximum weight.
5. If the maximum weight exceeds a threshold, compute SL and IZ for that antibody; otherwise create a new antibody by duplication.
6. Clone and mutate.
Slide 24: Danger Signal
◦ Danger Theory (two-signal approach): if the antigen is harmful, trigger an IS response; otherwise discard the antigen.
◦ In the data mining context, harmful corresponds to interesting (valid).
◦ What is the danger signal in our system? We should find a measure to determine the validity of sessions.
Slides 25-26: Validity Measure
Slide 27: Affinity Measure
What affinity measure is used in our proposed algorithm?
Slide 28: Affinity Measure
◦ The weight function decreases with distance from the antigen/data location.
◦ A scale parameter controls the decay rate of the weights along the spatial dimensions.
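The slide's formula did not survive in the transcript, so the following is only a sketch of one weight function with the stated properties. The Gaussian form, the use of Hamming distance on binary session vectors, and the parameter name `sigma` are all assumptions chosen for illustration.

```python
import math


def hamming_distance(a, b):
    """Number of positions at which two equal-length vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def weight(distance, sigma):
    """Assumed Gaussian decay: 1 at distance 0, falling off as distance grows.

    `sigma` plays the role of the scale parameter: a larger value makes
    the weight decay more slowly with distance.
    """
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))


antibody = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
antigen = [1, 0, 0, 1, 0, 0, 1, 0, 1, 0]

d = hamming_distance(antibody, antigen)
narrow = weight(d, sigma=2.0)
wide = weight(d, sigma=4.0)
```

With these definitions an antibody that matches the antigen exactly receives weight 1, and the scale parameter sets how far from the antibody an antigen can lie while still contributing noticeably.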
Slide 29: Stimulation Level
Slide 30: Weighted Stimulation
Slide 31: Network Stimulation & Suppression
Slide 32: Cloning
◦ Antibodies are cloned in proportion to their stimulation levels relative to the average network stimulation.
◦ To avoid premature proliferation of antibodies and to encourage a diverse repertoire, new antibodies do not clone before they are mature (that is, before their age exceeds a threshold).
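The cloning rule can be sketched as below. The proportionality constant `clone_factor`, the truncation to an integer, and the parameter names are assumptions; the slide states only that cloning is proportional to relative stimulation and that immature antibodies do not clone.

```python
def clone_counts(stimulations, ages, min_age, clone_factor=2):
    """Number of clones per antibody.

    An antibody clones in proportion to its stimulation divided by the
    average stimulation of the whole network, and only once its age
    exceeds `min_age`.
    """
    if not stimulations:
        return []
    average = sum(stimulations) / len(stimulations)
    counts = []
    for stimulation, age in zip(stimulations, ages):
        if age <= min_age or average == 0:
            counts.append(0)  # immature antibodies do not clone
        else:
            counts.append(int(clone_factor * stimulation / average))
    return counts


# Hypothetical network of three antibodies; the third is still immature.
counts = clone_counts(stimulations=[2.0, 1.0, 3.0], ages=[5, 5, 1], min_age=3)
```

Here the most stimulated antibody produces no clones because of its age, which is the mechanism the slide describes for keeping one early winner from taking over the repertoire.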
Slide 33: Hypermutation
◦ Somatic hypermutation is a powerful natural exploration mechanism in the IS that allows it to learn how to respond to new antigens that have never been seen before.
◦ It is a very costly and inefficient operation, since its complexity is exponential in the number of features.
◦ We model this operation in the AIS by an instant antigen duplication whenever an antigen is encountered that fails to activate the entire immune network.
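The duplication step can be sketched as follows. The activation function (one minus the normalized Hamming distance), the threshold value, and the function names are assumptions for illustration; the slide specifies only that an antigen which activates no antibody is copied into the network.

```python
def activation(antigen, antibody):
    """Assumed activation: 1.0 for identical vectors, 0.0 for opposite ones."""
    differing = sum(x != y for x, y in zip(antigen, antibody))
    return 1.0 - differing / len(antigen)


def present_antigen(antigen, antibodies, threshold):
    """Add a copy of the antigen as a new antibody if nothing is activated.

    Returns True when a duplication took place.
    """
    activated = any(activation(antigen, ab) >= threshold for ab in antibodies)
    if not activated:
        antibodies.append(list(antigen))  # instant antigen duplication
        return True
    return False


network = [[1, 1, 0, 0]]
first = present_antigen([0, 0, 1, 1], network, threshold=0.5)   # unseen pattern
second = present_antigen([1, 1, 0, 1], network, threshold=0.5)  # close to a known one
```

Copying the antigen costs one append, whereas searching for a matching antibody through random mutation would have to explore a space that grows exponentially with the number of URLs.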
Slide 34: Directed Mutation
◦ Antibodies that are added to the population via mutation are always superior individuals.
◦ In this mutation mechanism, whenever the system realizes that there are not enough good antibodies to confront the antigens, new antibodies are added to the population.
◦ It is a new form of danger theory.
◦ The directed mutation mechanism is as follows:
Slides 35-44: Directed Mutation (figure sequence)
The frames show populations of bit strings, with some positions annotated by signed values such as +2 and -9, while sessions from the Web log enter the system. Frame captions: "Web log in to the system" (slide 35); "Decide to mutate, after some time" (slide 43); "Mutation occurs, after some time" (slide 44).
Slide 45: Directed Mutation
◦ Directed mutation is not computationally complex.
◦ It does not cause antibodies to be destroyed before they have to leave the population.
◦ It makes the system intelligent: the system can decide when to create new individuals.
◦ Directed mutation happens each time T antigens have entered the system.
Slide 46: Compression
◦ Compression: cluster the antibody population into k clusters.
◦ External interactions: those occurring between an antigen (external agent) and the antibodies in the immune network.
◦ Internal interactions: those occurring between one antibody and all other antibodies in the immune network.
◦ The most expensive computation and storage overhead stems from calculating and storing all the internal network interactions (quadratic complexity with respect to the network size).
◦ After compression:
  ◦ internal interactions:
  ◦ external interactions: k
◦ Choosing an appropriate number of clusters.
Slides 47-70: Algorithm Visualization (figure sequence showing nodes numbered 1 to 50)
Slide 71: Pseudocode
Slide 72: Data
◦ Data set 1: one week of HTTP requests to the Music Machine Web site (www.hyperreal.org). 220146 requests, 19542 sessions, 4756 URLs.
◦ Data set 2: one week of HTTP requests to the University of Saskatchewan's WWW server. 44298 requests, 9188 sessions, 1519 URLs.
Slide 73: Ground Profiles
◦ To evaluate the learned profiles, it should be shown that they are good representatives of the input data, that is, the summarization ability of AISWUM.
◦ To show this ability, the distribution of the learned profiles should be compared with that of the input data, so we need some ground profiles.
◦ Ground profiles are extracted using Scalable K-Means.
Slide 74: Evaluation Metrics
Slide 75: Results (Music Machine)
Distribution of the learned antibodies that are simultaneously precise and complete, per input category at time t.
Slide 76: Precision
Distribution of precise antibodies per input category at time t.
Slide 77: Coverage
Distribution of complete antibodies per input category at time t.
Slide 78: Results (Saskatchewan University)
Distribution of the learned antibodies that are simultaneously precise and complete, per input category at time t.
Slide 79: Precision
Distribution of precise antibodies per input category at time t.
Slide 80: Coverage
Distribution of complete antibodies per input category at time t.
Slide 81: Evaluation Metrics
◦ Overall precision of the learned antibodies with respect to the input data: the ratio of learned antibodies that accurately represent the past input data to all learned antibodies.
Slide 82: Evaluation Metrics
◦ Overall coverage of the learned antibodies with respect to the input data: the ratio of past input data that are accurately summarized by antibodies to all input data.
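Both ratios can be sketched once a test for "accurately represents" is fixed. That test is not defined in the transcript, so the Hamming-distance tolerance used here, along with all names and data, is a stand-in assumption.

```python
def is_match(antibody, session, tolerance=1):
    """Stand-in test: vectors differ in at most `tolerance` positions."""
    return sum(x != y for x, y in zip(antibody, session)) <= tolerance


def overall_precision(antibodies, sessions):
    """Share of antibodies that accurately represent some past session."""
    precise = sum(
        1 for ab in antibodies if any(is_match(ab, s) for s in sessions)
    )
    return precise / len(antibodies)


def overall_coverage(antibodies, sessions):
    """Share of past sessions accurately summarized by some antibody."""
    covered = sum(
        1 for s in sessions if any(is_match(ab, s) for ab in antibodies)
    )
    return covered / len(sessions)


# Hypothetical learned antibodies and past sessions.
antibodies = [(1, 1, 0, 0), (0, 1, 0, 1)]
sessions = [(1, 1, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1), (1, 0, 1, 0)]

precision = overall_precision(antibodies, sessions)
coverage = overall_coverage(antibodies, sessions)
```

Precision penalizes antibodies that match nothing in the data, while coverage penalizes data that no antibody summarizes, so a good synopsis needs both to be high.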
Slide 83: Results (Music Machines)
◦ Ratio of learned antibodies that accurately represent past input data to all learned antibodies.
◦ Ratio of past input data that are accurately summarized by antibodies to all input data.
Slide 84: Results (Saskatchewan)
◦ Ratio of learned antibodies that accurately represent past input data to all learned antibodies.
◦ Ratio of past input data that are accurately summarized by antibodies to all input data.
Slide 85: Results
State     Maximum contentment   Minimum contentment   Average contentment of 50 users
State 1   41%                   15%                   28%
State 2   60%                   40%                   51%
State 3   67%                   45%                   56%
Configuration of each state (columns: Danger Theory, Weighted Items, Weighted Sessions):
◦ State 1: No
◦ State 2: Yes, No
◦ State 3: Yes
Slide 86: Run Time
The run time for one scan of the data, with non-optimized C++ code on a Pentium 4 PC, was:
◦ For the first dataset: less than 6 min.
◦ For the second dataset: less than 3 min.
Slide 87: Comparison with other methods
Method: AIS-WUM | SKM | DBSCAN | BIRCH | aiNet | Fuzzy AIS | SOSDM
Reliability/Insensitivity to initial condition: Yes | No | Yes | No | Yes
Noise tolerance: Yes | No | Yes | No | Yes | Moderately
Need to scan before learning: No | Yes | No
Time complexity: O(N) | O(N log(N)) | O(N) | O(N²) | O(N)
Buffer data: No | Yes
Number of clusters specified: No | Yes | No | Yes | No | Yes
Handle evolving clusters: Yes | No | Yes
Automatic scale estimation: Yes | No | Yes | No
Clustering model: Network | Centroids | Medoids | Centroids | Network
Handle different similarity measures: Yes | No | Yes | No | Yes
Density/Partition based: Density | Partition/Distance | Density | Partition | Partition/Distance | Density | Partition/Distance
Slide 88: Novelties of the proposed algorithm
◦ Low computational complexity.
◦ Danger theory in two forms: directed mutation and weighted stimulation.
◦ Learning the data in a single pass: a natural mechanism, applicable to stream data.
◦ Bi-functionality: frequent itemset mining plus finding centroids of clusters in large datasets.
◦ Clear and fast identification of outliers.
Slide 89: Conclusion
A robust and scalable algorithm for frequent itemset mining has been designed that is well suited to noisy, sparse data such as Web usage data.
Slide 90: Conclusion
The main factor behind the ability of the proposed algorithm to learn in a single pass lies in the richness of the immune network structure, which forms a dynamic synopsis of the data, and in danger theory, which decides which antigens are dangerous and when new antibodies are needed to combat antigens.
Slides 91-92: Publications
◦ B. Hoda Helmi, Adel T. Rahmani, Nona Helmi, "An Evolutionary Control Model for a Generic Multiagent System Using Artificial Immune Systems", in Proceedings of the First Joint Congress on Fuzzy and Intelligent Systems, 2007, Ferdowsi University.
◦ B. Hoda Helmi, Adel T. Rahmani, "Image Segmentation with a New Texture Feature Based on AIS", in Proceedings of the First Conference on Data Mining, AmirKabir University, 2007, Tehran, Iran. (in Farsi)
◦ B. Hoda Helmi, Adel T. Rahmani, "An AIS Algorithm for Web Usage Mining with Directed Mutation", accepted at the IEEE World Congress on Computational Intelligence, CEC division, 2008, Hong Kong.
◦ B. Hoda Helmi, Adel T. Rahmani, "An Enhanced AIS for WUM, inspired by Danger Theory", submitted to ICEE 2008, Tarbiat Modarres University, 2008, Tehran, Iran. (in Farsi)
◦ Adel T. Rahmani, B. Hoda Helmi, "EIN-WUM an AIS-based Algorithm for Web Usage Mining", submitted to the Genetic and Evolutionary Computation Conference, 2008, Atlanta, Georgia.
◦ B. Hoda Helmi, Adel T. Rahmani, "A New Web Usage Mining Method based on An Artificial Immune System Solution with Enhanced Network and Danger Theory", submitted to the International Journal of Control, Automation, and Systems.
◦ B. Hoda Helmi, Adel T. Rahmani, "Evolutionary based Combining of Evolved Neural Network Classifiers", accepted at the IASTED International Conference on Signal Processing, Pattern Recognition and Applications, 2006, Austria. (unrelated)
Slide 93: The End. Thanks
Slide 94: Somatic Hypermutation
Slide 95: Cross Reaction
Slide 96: A Framework for AIS (Algorithms, Affinity, Representation)
Representation (shape-space): binary, integer, real-valued, symbolic.
Slide 97: A Framework for AIS (Algorithms, Affinity, Representation)
Affinity: Euclidean, Manhattan, Hamming, ...
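The three distance measures named on this slide can be sketched as below. These are the standard definitions; the function names and example vectors are chosen for illustration, and the two bit strings are the first two session vectors from the backup Sessions slide.

```python
import math


def euclidean(a, b):
    """Straight-line distance, typically used with real-valued shape-spaces."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def hamming(a, b):
    """Number of differing positions, typically used with binary shape-spaces."""
    return sum(x != y for x, y in zip(a, b))


euclid = euclidean((0.0, 0.0), (3.0, 4.0))
manhat = manhattan((0.0, 0.0), (3.0, 4.0))
differing = hamming("1111000000", "1001001010")
```

Because a smaller distance means a closer match, affinity is usually taken as a decreasing function of whichever distance fits the chosen representation.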
Slide 98: A Framework for AIS (Algorithms, Affinity, Representation)
Algorithms: Bone Marrow, Clonal Selection, Negative Selection, Positive Selection, Immune Network.
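Slide 99: Sessions
◦ Session 1: visited 1111000000; durations 07:27 00:11 02:10 00:19 02:01 00:00 00:00; frequency 1121000000
◦ Session 2: visited 1001001010; durations 04:21 00:00 00:25 00:00 00:00 01:00 00:00 00:35 00:00; frequency 1002000010
◦ Session 3: visited 0100000100; durations 00:00 00:51 00:00 00:00 00:00 00:00 00:54 00:00; frequency 0100000100
◦ Session 4: visited 0010001001; durations 00:00 03:00 00:00 00:00 00:00 01:07 00:00 00:32; frequency 0010001001
◦ Session 5: visited 0100100100; durations 00:00 00:21 00:00 03:01 00:00 00:00 02:00 00:00; frequency 0100100200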
Slide 100: Affinity Measure
Slide 101: Precision versus Coverage