Parallel Apriori Algorithm Using MPI Congressional Voting Records Çankaya University Computer Engineering Department Ahmet Artu YILDIRIM January 2010.


1 Parallel Apriori Algorithm Using MPI Congressional Voting Records Çankaya University Computer Engineering Department Ahmet Artu YILDIRIM January 2010

2 Efficient Association Rules Mining Using MPI: Overview
The Apriori algorithm is used to discover association rules.
Computation time is the major issue when the dataset is large.
The aim is to reduce the running time of the mining process by using several computers for parallel computation.

3 Apriori Algorithm (Example)
Min support = 50%
Min support count = 0.5 x 4 = 2
Min confidence = 0.50
Confidence({5} → {2,3}) = support({2,3,5}) / support({5}) = 2/3 ≈ 0.67
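The arithmetic on this slide can be sketched in C as follows. This is a minimal illustration, not the author's code: itemsets are stored as bitmasks, and the four transactions in example_db are a hypothetical database chosen only because it reproduces the slide's counts (item 5 occurs in 3 transactions, items 2, 3 and 5 occur together in 2). The names itemset_t, ITEM, support_count, confidence and example_db are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* An itemset is a bitmask: item k (1..31) is stored in bit k. */
typedef unsigned int itemset_t;

#define ITEM(k) (1u << (k))

/* Count the transactions that contain every item of `items`. */
int support_count(const itemset_t *db, size_t n, itemset_t items)
{
    int count = 0;
    for (size_t i = 0; i < n; i++) {
        if ((db[i] & items) == items)
            count++;
    }
    return count;
}

/* confidence(A -> B) = support(A union B) / support(A). */
double confidence(const itemset_t *db, size_t n,
                  itemset_t antecedent, itemset_t consequent)
{
    int base = support_count(db, n, antecedent);
    assert(base > 0); /* undefined if the antecedent never occurs */
    return (double)support_count(db, n, antecedent | consequent) / base;
}

/* Hypothetical 4-transaction database (assumed, not from the slides). */
const itemset_t example_db[4] = {
    ITEM(1) | ITEM(3) | ITEM(4),
    ITEM(2) | ITEM(3) | ITEM(5),
    ITEM(1) | ITEM(2) | ITEM(3) | ITEM(5),
    ITEM(2) | ITEM(5),
};
```

Calling confidence(example_db, 4, ITEM(5), ITEM(2) | ITEM(3)) divides the support count of {2,3,5} by the support count of {5}, which is the ratio 2/3 shown on the slide.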

4 Technology and Methodology
Platform: GNU/Linux 2.6.20.7 i386
Programming language: ISO C99
Cross-platform APIs: MPICH (MPI implementation) and GLib (utility library)
Compiler suite: GNU toolchain
Division methodologies:
1. Dataset division
2. Large frequent itemset division
The dataset division methodology is used.
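Dataset division means each process scans only its own share of the transactions. One common way to assign the shares is a contiguous block partition, sketched below. The slides do not say how the original program splits the data, so the function block_range and its remainder policy (the lowest ranks take one extra transaction) are assumptions.

```c
#include <assert.h>

/* Compute the half-open range [*first, *first + *count) of transaction
 * indices owned by `rank` when `total` transactions are split across
 * `nprocs` processes. Any remainder goes to the lowest ranks. */
void block_range(int total, int nprocs, int rank, int *first, int *count)
{
    assert(total >= 0 && nprocs > 0 && rank >= 0 && rank < nprocs);
    int base = total / nprocs;   /* minimum share per process */
    int extra = total % nprocs;  /* processes that get one more */
    *count = base + (rank < extra ? 1 : 0);
    *first = rank * base + (rank < extra ? rank : extra);
}
```

In an MPI program the rank and nprocs arguments would typically come from MPI_Comm_rank and MPI_Comm_size.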

5 Data Division (Merging Local Support)
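The merging step named in this slide's title can be illustrated with a serial simulation: every process counts its candidates on its own slice of the data, the per-process counts are summed element by element, and candidates whose global count reaches the minimum support count are kept. The sketch below is an assumption about the step, not the author's code, and the names merge_local_supports and filter_frequent are hypothetical. In a real MPI program the same element-wise sum is what MPI_Allreduce with MPI_SUM computes; the slides do not state which MPI call the original uses.

```c
#include <assert.h>
#include <stddef.h>

/* local[r * ncand + c] is the support count of candidate c found by
 * process r on its own slice. global[c] receives the sum over all r. */
void merge_local_supports(const int *local, int nprocs, size_t ncand,
                          int *global)
{
    assert(nprocs > 0);
    for (size_t c = 0; c < ncand; c++) {
        global[c] = 0;
        for (int r = 0; r < nprocs; r++)
            global[c] += local[(size_t)r * ncand + c];
    }
}

/* Write the indices of candidates with global[c] >= min_count into
 * `kept` (capacity ncand) and return how many were kept. */
size_t filter_frequent(const int *global, size_t ncand, int min_count,
                       size_t *kept)
{
    size_t n = 0;
    for (size_t c = 0; c < ncand; c++) {
        if (global[c] >= min_count)
            kept[n++] = c;
    }
    return n;
}
```

Summing before filtering matters: a candidate can be below the threshold on every single process and still be frequent over the whole dataset.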

6 Parallel Apriori Algorithm Flowchart

7 Dataset
1984 United States congressional voting records
Attribute information: party (democrat or republican), followed by a yes/no vote on each of:
handicapped infants, water project cost sharing, adoption of the budget resolution, physician fee freeze, El Salvador aid, religious groups in schools, aid to Nicaraguan contras, MX missile, immigration, synfuels corporation cutback, education spending, superfund right to sue, crime, duty free exports, export admin act South Africa

8 Preprocessing of Dataset
A data transformation is applied before processing.
Each attribute value is given an item number, for example: democrat = 1, republican = 2, handicapped infants yes = 3, handicapped infants no = 4, water project cost sharing yes = 5, …
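The numbering shown on this slide can be sketched as a small mapping function. The slide gives only the first five numbers, so continuing the pattern (yes = 3 + 2k and no = 4 + 2k for the k-th vote, counting from 0) is an assumption, as are the 'y' and 'n' input characters, the function name vote_item_id, and the choice to return 0 for any other answer.

```c
#include <assert.h>

enum {
    ITEM_NONE = 0,       /* assumed: no item is emitted for other answers */
    ITEM_DEMOCRAT = 1,   /* from the slide */
    ITEM_REPUBLICAN = 2  /* from the slide */
};

/* Map the answer to the vote at 0-based position `vote_index` to an
 * item number, assuming the slide's pattern continues unchanged. */
int vote_item_id(int vote_index, char answer)
{
    assert(vote_index >= 0);
    int yes_id = 3 + 2 * vote_index;
    if (answer == 'y')
        return yes_id;
    if (answer == 'n')
        return yes_id + 1;
    return ITEM_NONE;
}
```

With this encoding each voting record becomes a transaction of item numbers: one party item plus one item per answered vote.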

9 Config File and Run Command
Config file:
attributecount=34
transactioncount=435
minsupportpercent=50
minconfidencepercent=80
Command:
mpirun -np x -machinefile machines ./aprioriparallel
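A program could read the four keys of this config file and turn the support percentage into a transaction count roughly as sketched below. This is not the author's parser: the struct apriori_config, the functions parse_config_line and min_support_count, and the decision to round the count up are all assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct apriori_config {
    int attributecount;
    int transactioncount;
    int minsupportpercent;
    int minconfidencepercent;
};

/* Parse one "key=value" line into cfg.
 * Returns 1 if the key was recognised, 0 otherwise. */
int parse_config_line(const char *line, struct apriori_config *cfg)
{
    char key[32];
    int value;
    if (sscanf(line, "%31[^=]=%d", key, &value) != 2)
        return 0;
    if (strcmp(key, "attributecount") == 0)
        cfg->attributecount = value;
    else if (strcmp(key, "transactioncount") == 0)
        cfg->transactioncount = value;
    else if (strcmp(key, "minsupportpercent") == 0)
        cfg->minsupportpercent = value;
    else if (strcmp(key, "minconfidencepercent") == 0)
        cfg->minconfidencepercent = value;
    else
        return 0;
    return 1;
}

/* Smallest transaction count that reaches `percent` of `transactions`
 * (rounded up; the rounding rule is an assumption). */
int min_support_count(int transactions, int percent)
{
    assert(transactions >= 0 && percent >= 0 && percent <= 100);
    return (transactions * percent + 99) / 100;
}
```

Under the rounding-up assumption, 50% of 435 transactions gives a minimum support count of 218, and 50% of 4 transactions gives 2, as in the earlier Apriori example.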

10 Program Output

11 Rules
Rules at the 80% confidence threshold:
Democrats support
  Adoption of the budget resolution
  Aid to Nicaraguan contras
Democrats do NOT support
  Physician fee freeze

12 Rules (cont.)
Rules at the 80% confidence threshold:
Those who do not support the physician fee freeze support adoption of the budget resolution.
Those who support adoption of the budget resolution do not support the physician fee freeze.

13 Parallel Computation Speedup
Run on the Çankaya University wee cluster
Processor specs: 600 MHz CPU, 250 MB RAM
Speedup = t_s / t_p, where t_s is the serial running time and t_p is the parallel running time
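The speedup formula is simple enough to state directly in code. The sketch below also includes parallel efficiency (speedup divided by the number of processes), a standard companion measure that does not appear on the slide. The function names are assumptions, and the timings in the usage note are made-up values, not measurements from the cluster.

```c
#include <assert.h>

/* Speedup = serial running time / parallel running time. */
double speedup(double t_serial, double t_parallel)
{
    assert(t_serial > 0.0 && t_parallel > 0.0);
    return t_serial / t_parallel;
}

/* Efficiency = speedup / number of processes (not on the slide). */
double efficiency(double t_serial, double t_parallel, int nprocs)
{
    assert(nprocs > 0);
    return speedup(t_serial, t_parallel) / nprocs;
}
```

For example, a hypothetical run that takes 120 s serially and 40 s on 4 processes has a speedup of 3 and an efficiency of 0.75.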

14 Conclusion
The parallel version of the Apriori algorithm reduces running time on large datasets.
Scalability is gained by adding nodes (computers) or memory, without modifying the code.
A favorable price-performance ratio is achieved by using less powerful computers.

15 Thank You Questions?

