Published by Roderick Terry. Modified over 9 years ago.
Artificial Intelligence Project 1: Neural Networks
  Biointelligence Lab, School of Computer Sci. & Eng., Seoul National University
(C) 2000-2002 SNU CSE BioIntelligence Lab
Outline
  Classification Problems
    Task 1: Estimate several statistics on the Diabetes data set.
    Task 2: Given an unknown data set, achieve the best performance you can. The test data is hidden.
Network Structure (1)
  [Figure: network with two output units, positive and negative]
  If f_pos(x) > f_neg(x), then x is positive.
Network Structure (2)
  [Figure: network with a single output unit]
  If f(x) > thres, then x is positive.
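The two output conventions above can be sketched as decision rules. This is a minimal illustration only: the function names and the default threshold of 0.5 are assumptions, not part of the project specification.

```python
def classify_two_outputs(f_pos, f_neg):
    """Structure (1): two output units; the larger output decides the class."""
    return "positive" if f_pos > f_neg else "negative"


def classify_single_output(f, thres=0.5):
    """Structure (2): one output unit compared against a threshold.

    thres=0.5 is an assumed default; the slides leave the value to you.
    """
    return "positive" if f > thres else "negative"
```

With structure (2), raising or lowering thres trades false positives against false negatives, which is why the report asks for the value you used.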
Medical Diagnosis: Diabetes
Pima Indian Diabetes Data (768 examples)
  8 attributes
    1. Number of times pregnant
    2. Plasma glucose concentration in an oral glucose tolerance test
    3. Diastolic blood pressure (mm Hg)
    4. Triceps skin fold thickness (mm)
    5. 2-hour serum insulin (mu U/ml)
    6. Body mass index (kg/m^2)
    7. Diabetes pedigree function
    8. Age (years)
  Positive: 268, negative: 500
Report (1/4)
  Number of Epochs
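One way to collect the numbers behind an error-versus-epochs report is to record the train and test error after every epoch. The sketch below assumes a single-hidden-layer network with one sigmoid output, trained by plain stochastic gradient descent on squared error. All names, the default of 3 hidden units, and the learning rate of 0.5 are illustrative assumptions, not the required setup.

```python
import math
import random


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def init_net(n_in, n_hidden, rng):
    # Each weight vector stores its bias as the last element.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return w1, w2


def forward(net, x):
    w1, w2 = net
    h = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + ws[-1]) for ws in w1]
    y = sigmoid(sum(w * hi for w, hi in zip(w2, h)) + w2[-1])
    return h, y


def train_epoch(net, data, lr):
    """One pass of per-example gradient descent on squared error."""
    w1, w2 = net
    for x, t in data:
        h, y = forward(net, x)
        delta_out = (y - t) * y * (1.0 - y)
        for j, hj in enumerate(h):
            delta_h = delta_out * w2[j] * hj * (1.0 - hj)  # uses w2[j] before its update
            for i, xi in enumerate(x):
                w1[j][i] -= lr * delta_h * xi
            w1[j][-1] -= lr * delta_h
            w2[j] -= lr * delta_out * hj
        w2[-1] -= lr * delta_out


def error_rate(net, data, thres=0.5):
    wrong = sum((forward(net, x)[1] > thres) != (t == 1) for x, t in data)
    return wrong / len(data)


def learning_curve(train, test, n_hidden=3, epochs=50, lr=0.5, seed=0):
    """Return a list of (epoch, train_error, test_error) tuples."""
    rng = random.Random(seed)
    net = init_net(len(train[0][0]), n_hidden, rng)
    curve = []
    for epoch in range(1, epochs + 1):
        train_epoch(net, train, lr)
        curve.append((epoch, error_rate(net, train), error_rate(net, test)))
    return curve
```

Here train and test are lists of (feature_list, label) pairs with labels 0 or 1. Plotting the two error columns against the epoch number shows where the test error stops improving.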
Report (2/4)
  Number of Hidden Units
  At least 10 runs for each setting.

  # Hidden Units | Train                         | Test
                 | Average | SD | Best | Worst   | Average | SD | Best | Worst
  Setting 1      |         |    |      |         |         |    |      |
  Setting 2      |         |    |      |         |         |    |      |
  Setting 3      |         |    |      |         |         |    |      |
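The four statistics requested for each setting can be computed from the per-run accuracies as in the sketch below. The function name is an assumption, and so is the choice of the sample standard deviation (n-1 denominator); the slides do not say which SD is expected.

```python
import statistics


def summarize_runs(accuracies):
    """Summarize the accuracies of repeated runs for one setting."""
    return {
        "average": statistics.mean(accuracies),
        "sd": statistics.stdev(accuracies),  # sample SD (assumption); needs >= 2 runs
        "best": max(accuracies),
        "worst": min(accuracies),
    }
```

Call it once with the training accuracies and once with the test accuracies of the 10 or more runs to fill one row of the table.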
Report (3/4)
Report (4/4)
  Normalization method you applied
  Other parameter settings
    Learning rates
    Threshold value above which you predict an example as positive:
      if f(x) > thres, the example is positive; otherwise it is negative.
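As one example of a normalization method you might report, the sketch below applies min-max scaling to each attribute. Min-max is only one option (z-score scaling is another), and the function names are assumptions. The ranges are fitted on the training rows and then reused for the test rows, so no test information leaks into training.

```python
def fit_min_max(rows):
    """Return a (min, max) pair for each attribute column."""
    return [(min(col), max(col)) for col in zip(*rows)]


def apply_min_max(rows, ranges):
    """Scale each attribute to [0, 1] using previously fitted ranges.

    Constant columns (max == min) are mapped to 0.0 to avoid division by zero.
    """
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, (lo, hi) in zip(row, ranges)]
        for row in rows
    ]
```

Test values outside the training range will fall outside [0, 1]; that is expected with this scheme.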
Challenge (1)
  Unknown Data
    Data for you: 2000 examples (Pos: 1000, Neg: 1000)
    Test data: 600 examples (Pos: 300, Neg: 300)
    Labels are HIDDEN!
Challenge (2)
  Data
    Train.data: 2000 x 500 (2000 examples with 500 dimensions)
    Train.labels: positive 1, negative 0
    Test.data: 600 x 500 (600 examples with 500 dimensions)
    Test.labels: not given to you
  Verify your NN at http://knight.snu.ac.kr/aiproj1/ai_nn_do.asp
Challenge (3)
  K-fold Cross Validation
    The data set is randomly divided into k subsets.
    One of the k subsets is used as the test set, and the other k-1 subsets are put together to form the training set.
    This is repeated k times so that each subset serves as the test set once.
  [Figure: the data divided into subsets D1, D2, ..., D10 of 200 examples each, with a different subset held out in each round]
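The k-fold procedure described above can be sketched with index lists. The function name, the fixed seed, and the round-robin assignment of shuffled indices to folds are assumptions made for illustration.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)          # random division into subsets
    folds = [idx[i::k] for i in range(k)]     # k subsets of near-equal size
    for i in range(k):
        test = folds[i]
        train = [j for f_no, fold in enumerate(folds) if f_no != i for j in fold]
        yield train, test
```

For the 2000 training examples with k = 10, each round trains on 1800 examples and evaluates on the remaining 200. Averaging the 10 held-out scores gives an estimate of performance on the hidden test data.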
Challenge (4)
  Include the following in your report:
    The best performance you achieved
    The spec of your NN when achieving that performance
      Structure of NN
      Learning epochs
      Your techniques
      Other remarks…
    Confusion matrix

                        | True Positive class | True Negative class
    Predicted Positive  |                     |
    Predicted Negative  |                     |
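The four cells of the confusion matrix can be counted from true and predicted labels as follows. This sketch assumes the 1 = positive, 0 = negative encoding used for Train.labels; the function names and the dictionary keys are illustrative.

```python
def confusion_matrix(true_labels, predicted_labels):
    """Count TP, FN, FP, TN for labels encoded as 1 (positive) and 0 (negative)."""
    counts = {"TP": 0, "FN": 0, "FP": 0, "TN": 0}
    for t, p in zip(true_labels, predicted_labels):
        if t == 1:
            counts["TP" if p == 1 else "FN"] += 1
        else:
            counts["FP" if p == 1 else "TN"] += 1
    return counts


def accuracy(counts):
    """Fraction of examples on the diagonal of the confusion matrix."""
    return (counts["TP"] + counts["TN"]) / sum(counts.values())
```

In the table layout above, TP and FP fill the "Predicted Positive" row, and FN and TN fill the "Predicted Negative" row.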
References
  Source code
    Free software
      NN libraries (C, C++, Java, …)
      MATLAB toolbox
      Weka
  Web sites
    http://www.cs.waikato.ac.nz/~ml/weka/
Pay Attention!
  Due: October 14, 2004, by 11:59 PM
  Submission
    Results obtained from your experiments
      Compress the data
      Via e-mail
    Report: hardcopy!!
      Software used and running environment
      Results of many experiments with various parameter settings
      Analysis and explanation of the results in your own way
Optional Experiments
  Various learning rates
  Number of hidden layers
  Different k values
  Output encoding