Text Analytics and Machine Learning Workshop

Presentation transcript:

Text Analytics and Machine Learning Workshop
October 26, 2018
CPA Canada - Toronto, Canada

Text analytics use is skyrocketing
- More machine-readable text …
- Computing & data storage that is cheaper, more mobile, powerful & connected
- Machine learning
- More efficient, less error-prone text processing
- More data for business intelligence = better predictions
IntellyticsHub

Text of interest to accountants includes:
- News, e.g. earnings announcements, trade publications
- Business documents: invoices, contracts
- Meeting minutes
- Emails
- Social media
- Annual reports (MD&A, audit report, internal control report, etc.)
- Proxy statements (committee reports, director biographies)
- Contracts
- Conference call transcripts
- Interview transcripts

Text Analytics + Machine Learning: a powerful duo
- Text analytics: Extracts relevant information from text by turning unstructured text into data for analysis. Techniques range from simple searches to using deep learning (a subset of machine learning) for natural language processing (NLP).
- Machine learning: Builds systems that learn from data, identify patterns, and make predictions (and decisions) with minimal human intervention.
Structured data
SAS Institute: Hall, Dean, Kabul, and Silva: An Overview of Machine Learning
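As a rough illustration of "turning unstructured text into data for analysis", the sketch below builds a simple word-count table (often called a document-term matrix) using only the Python standard library. The sample documents and the function names `tokenize` and `document_term_matrix` are invented for this illustration and are not taken from the workshop; real projects would more likely rely on a dedicated text analytics tool.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_term_matrix(documents):
    """Return (vocabulary, rows); each row holds the word counts for one document."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    rows = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, rows


# Hypothetical sample documents, invented for illustration only.
documents = [
    "Revenue increased and margins improved.",
    "Revenue declined; the audit noted a control weakness.",
]

vocabulary, rows = document_term_matrix(documents)

# Print one line per word, with its count in each document.
for word, column in zip(vocabulary, zip(*rows)):
    print(f"{word:10s} {column}")
```

Each row of the resulting table is structured data: it can be fed to a machine learning model in the same way as any other numeric features.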

Text Analytics can…
- Visualize and describe text
- Crawl and tag social media, web resources, & other documents (e.g. contracts)
- Search words/phrases/n-grams
- Recognize entities
- Classify documents
- Identify topics in collections of text
- Cluster similar documents
- Extract text or concepts
- Analyze sentiment and tone
- Tag parts of speech (NLP)
- Structure text for analysis
Screen captures: Provalis Research - WordStat
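One of the simpler capabilities listed above, searching for words, phrases, and n-grams, can be sketched in a few lines of standard-library Python. An n-gram here means a run of n consecutive words. The sample sentences and the helper names `ngrams` and `count_phrase` are assumptions made for this example, not part of any tool shown in the workshop.

```python
import re
from collections import Counter


def ngrams(text, n):
    """Return all runs of n consecutive lowercase word tokens in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_phrase(documents, phrase):
    """Count how often the phrase occurs in each document, matching whole words."""
    n = len(phrase.split())
    return [ngrams(doc, n).count(phrase.lower()) for doc in documents]


# Hypothetical sample sentences, invented for illustration only.
documents = [
    "Internal control over financial reporting was effective.",
    "We identified a material weakness in internal control.",
    "Management discussed liquidity and capital resources.",
]

# Most frequent two-word phrase across all documents.
bigram_counts = Counter(g for doc in documents for g in ngrams(doc, 2))
print(bigram_counts.most_common(1))

# Per-document hits for a phrase of interest.
print(count_phrase(documents, "internal control"))
```

Counting phrases rather than single words keeps some context: "internal control" is a more specific signal than "internal" or "control" alone.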

Machine Learning

Types of machine learning

So many different models to choose from!
- Simple methods may underfit -> poor prediction
- Complex methods may overfit -> not generalizable
- Experiment to find the simplest one that predicts well enough
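The underfit/overfit trade-off can be made concrete with a small experiment. The sketch below is an illustration under stated assumptions, not the workshop's own example: it uses synthetic data (a straight line plus a fixed noise pattern) and compares three models of increasing flexibility: a constant mean predictor, a least-squares line, and a 1-nearest-neighbour model that memorizes the training points. All names and data values are invented for the example.

```python
def mse(model, points):
    """Mean squared error of a model over (x, y) points."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)


def fit_mean(train):
    """Too simple: always predict the average training y."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y


def fit_linear(train):
    """Ordinary least-squares straight line."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_nearest_neighbor(train):
    """Too flexible: copy the y of the closest training point, noise included."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]


# Synthetic data (an assumption for this sketch): y = 2x plus a repeating noise pattern.
noise = [3, -3, -3, 3]
data = [(x, 2 * x + noise[x % 4]) for x in range(20)]
train, test = data[0::2], data[1::2]  # even x for training, odd x held out

results = {}
for name, fit in [("mean", fit_mean), ("linear", fit_linear), ("1-nn", fit_nearest_neighbor)]:
    model = fit(train)
    results[name] = (mse(model, train), mse(model, test))
    print(f"{name:7s} train MSE = {results[name][0]:8.2f}   test MSE = {results[name][1]:8.2f}")
```

On this data the memorizing model has zero training error but does worse than the straight line on the held-out points, while the mean predictor is poor on both. Comparing training and held-out error in this way is one common method for finding the simplest model that predicts well enough.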

Choice considerations
- Complex models may handle data better
  - Outliers
  - Non-linearity
  - Correlated variables
  - Unknown interactions between variables
  - More features than observations
  - Data-driven, not theory-driven
- Machine learning is not a magic bullet
  - Relies on good data
  - Results biased by bias in data
  - “Black box” interpretation challenge
  - Requires data science skills & infrastructure that are in short supply

Deployment challenges
- Data security & privacy concerns
- Need to localize models
- Models degrade quickly
- Python, R, and other machine learning tools used to create the models may not be easy to put into production systems
“Making Netflix Machine Learning Algorithms Reliable”, International Conference on Machine Learning 2017

[Figure: Gartner Hype Cycle; labels: Machine Learning, Text Analytics]

What machine learning can’t do … YET!
Think … infer … feel … create