Download presentation
Presentation is loading. Please wait.
Published bySophie Papineau Modified over 5 years ago
1
PROJECTS SUMMARY PRESNETED BY HARISH KUMAR JANUARY 10,2018
2
OUTLINE PROJECT I: Identification of Geo Objects In Lidar Data
PROJECT II: Implementation of C4.5 Algorithm In MatLab PROJECT III: s Retrieval From account In UI Path PROJECT IV: Information Retrieval System for UNT Domain PROJECT V: Periodic Update of Account Balance To Excel Doc PROJECT VI: Implementation of Brill’s Tagger In Python PROJECT VII: Market Basket Analysis On Large Data Sets
3
PROJECT I: Identification Of Geo Objects In Lidar Data(In progress)
Objectives Proposed Strategies Challenges Tasks Performed To identify Objects in a LIDAR data collected from a sample geographical location Creation of learning model which is efficient in classification and takes less time in execution Comparing the classifier with previous classifiers for accuracy more than 95 percent Process: Extraction of attributes that helps in classification of objects in LIDAR data Use this Attributes to train the model and then test the model on testing data Tools Used: MatLab Less information in LIDAR data compared to high spectral images Large complex data which increase execution time exponentially Inconsistency and missing values in LIDAR data can cause incorrect predictions Preprocessing of LIDAR data for missing values using interpolation Identifying set of attributes for training the classifier using SVM and Neural Network Testing classifier on testing data for predictions and labels Calculating accuracy of model
4
PROJECT II: Implementation Of C4.5 Algorithm In MatLab
Objectives Proposed Strategies Challenges Tasks Performed Implement C4.5 algorithm for fraud credit card detection. Creating training and testing samples Create a learning model which can classify input into fraud and non fraud classifications. Creating a classifier with minimum false positive and false negatives Process: Create a C4.5 learning algorithm for the training data set Tools Used: MatLab Accuracy increase by 97 percent from 92 percent compared to ID3 algorithm Getting good training data set Building classifier on large data set which tends to increase in false negative and false positives No references for C4.5 algorithms in MatLab library Identifying good tributes for decision making Calculating information gain for all the attributes at each level Identification of good training sample data which enough attributes for classification Generation of decision tree based on attributes selected at each level Calculation of accuracy for testing data.
5
PROJECT III: Email Retrieval From Account In UIPath
Objectives Proposed Strategies Challenges Tasks Performed Automate the process of retrieval from an account in UIPath Creating a automation model which works independent on type of browser The goal was to automate the stuff without any functionality errors and in less execution time Transfer of recent s to an excel file. Process: Error handling techniques for different type of browsers and accounts Tools Used: UI Path editor Increase in speed of process by 60 percent compared to human doing the same process Having different logistics for different type of browsers and different systems Rewriting of old s with the new s. Request deny by some of the providers. Identified objects in different browser and different systems Constructed a flow chart for the automation process. Developed a model from above flow chart for the automation process.
6
PROJECT IV: Information Retrieval System For UNT Domain
Objectives Proposed Strategies Challenges Tasks Performed To create a Information Retrieval System For University data Creating a GUI which allows user to retrieve top information related to the search words Specifying good inbound conditions for search techniques which increases efficiency of search result and decrease in search time Process: We construct a dictionary of web links as keys and corresponding words, frequency in that web link and as Items in dictionary Then use this dictionary and a good search algorithm to retrieve the result of search by user Tools Used: Python IDLE Increase in accuracy by 97 percent compared to 92 percent using traditional mapping Identification of all the possible web links under University domain. Handling dynamic and redundant websites Denial in request of XML object for some of secure websites. Documented Project description and requirements Identified all the web links under UNT domain using web scrapping and JSON Created a dictionary for all the search key words by eliminating adjectives and pro nouns. Assigning ranks for each of he word in corresponding web link GUI for user to enter search words
7
PROJECT V: Periodic Update Of Account Balance to Excel document
Objectives Proposed Strategies Challenges Tasks Performed Automate the process of going through account balance every day. Creating a automation model which retrieves account balance periodically and stores it in a excel document The goal was to automate the stuff without any functionality errors and in less execution time Process: Ensuring security of the account details is required. Tools Used: UI Path editor Increase in speed of process by 60 percent compared to human doing the same process Flexibility for user to check his Account balance without Internet connectivity Choosing browser security options before login in to an account to view for account balance. Deciding up on the period time. Request deny for some of the connection settings. Identified objects in different browser and different Bank Account websites. Constructed a flow chart for the automation process. Developed a model from above flow chart for the automation process.
8
PROJECT VI: Implementation Of Brill’s Tagger In Python
Objectives Proposed Strategies Challenges Tasks Performed Implementation of Brill’s Tagger algorithm in Python Effectively assigning parts of speech tags to given corpus. Training good classifier model using hand written or Gold corpus Minimization of miss classifications Process: In this we train the model based on given corpus and their parts of speech. Then used this model to tag unknown corpus Tools Used: Python interpreter Achieved accuracy of 97 percent in correct prediction of parts of speech for a given unknown corpus Having a good known tagged corpus Large training corpus can increase execution time. Miss classification in training data can lead to decrease in the efficiency of model Preprocessing of training corpus. Creating model for training data Testing test data using model created Calculated the accuracy for testing corpus Submitted a detailed report and working model
9
PROJECT VII: Market Basket Analysis
Objectives Proposed Strategies Challenges Tasks Performed Main objective is to identify affinity between products that are purchased by the customer Deliverable analysis report A clear and concise report for financial assistance and marketing Process: In this, we analysis induvial customer purchases and try to globalize the common purchases and rank them induvial. This rank of subset of items helps us to identify items that are sold frequently together and to identify store layout Tools Used: JAVA Compiler JAVA Editor Collection of Large Transaction from customer and merging them as a single data unit Large set of data and identification of subsets in each transaction can lead to enormous computation power Initially we transform induvial transactions into a single data unit Removing attributes or items of less interest which decreased our computation time by 40 percent Identifying subsets of items from total list of items ranging from (2,3,4…..). We carried till 15th subset. Use this information to generate a detail report.
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.