The experiments based on word-embedding and SVM


The experiments based on word-embedding and SVM 2018-09-10 Raymond ZHAO WENLONG

Content: Background; Updates on experiments (mean of word embeddings + SVM classifier); Metrics; Next steps

Background
Develop a new product configuration approach for the e-commerce industry to elicit customer needs.
Collect Amazon user reviews (laptops) and treat these reviews as user inputs (sentiment analysis?).
Query-to-attributes mapping: map user inputs (the functional requirements in an unstructured query) to product parameters or features (structured attributes). This is a classification problem in ML.
Example: large Screen Size laptop

The experiments
Mean of the word embeddings in each user review: use pre-trained GloVe word embeddings, which represent words as low-dimensional, fixed-size vectors and capture word relations via inner products.
SVM multi-class classifier in ML, using the scikit-learn library.
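A minimal sketch of this pipeline, not the code used in the experiments. The embedding table below is a hypothetical 3-dimensional stand-in for pre-trained GloVe vectors, and the reviews, labels, and the name `review_vector` are invented for illustration. Only `SVC` from scikit-learn is a real library class.

```python
import numpy as np
from sklearn.svm import SVC

# Toy 3-dimensional vectors standing in for pre-trained GloVe embeddings
# (hypothetical values; real GloVe vectors have 50 to 300 dimensions).
EMBEDDINGS = {
    "large": np.array([0.9, 0.1, 0.0]),
    "big": np.array([0.8, 0.2, 0.1]),
    "screen": np.array([0.7, 0.0, 0.2]),
    "display": np.array([0.75, 0.05, 0.15]),
    "fast": np.array([0.1, 0.9, 0.0]),
    "quick": np.array([0.2, 0.8, 0.1]),
    "processor": np.array([0.0, 0.7, 0.3]),
    "cpu": np.array([0.05, 0.75, 0.2]),
}
DIM = 3


def review_vector(text):
    """Average the embeddings of the known words in one review."""
    vectors = [EMBEDDINGS[w] for w in text.lower().split() if w in EMBEDDINGS]
    if not vectors:
        # No known word: fall back to the zero vector.
        return np.zeros(DIM)
    return np.mean(vectors, axis=0)


# Inner products reflect word relatedness in the toy table.
related = float(np.dot(EMBEDDINGS["large"], EMBEDDINGS["big"]))
unrelated = float(np.dot(EMBEDDINGS["large"], EMBEDDINGS["fast"]))

# Hypothetical labelled reviews: each label is a product parameter.
reviews = ["large screen", "big display", "fast processor", "quick cpu"]
labels = ["screen_size", "screen_size", "cpu", "cpu"]

X = np.vstack([review_vector(r) for r in reviews])
clf = SVC(kernel="linear")
clf.fit(X, labels)

prediction = clf.predict([review_vector("big screen")])[0]
```

Averaging discards word order, so every review becomes a fixed-size vector regardless of length, which is what the SVM needs as input.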

Metrics in ML
Accuracy = (A + D) / Total: correct predictions out of all examples.
Precision = A / (A + B): what proportion of positive identifications was actually correct? ("How much of this prediction is right.")
Recall = A / (A + C): what proportion of actual positives was identified correctly? ("How much of the positive examples this prediction covers.")
F1: seeks a balance between precision and recall.
As the formulas imply, A, B, C, and D are the counts of true positives, false positives, false negatives, and true negatives.
Reference: Zhihu (知乎)
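The metric definitions can be sketched as a small function. It assumes that A, B, C, and D are the counts of true positives, false positives, false negatives, and true negatives, which is what the formulas imply. F1 is computed as the harmonic mean of precision and recall, its standard definition. The function name and the example counts are made up for illustration.

```python
def classification_metrics(a, b, c, d):
    """Compute accuracy, precision, recall and F1 from confusion counts.

    a: true positives, b: false positives,
    c: false negatives, d: true negatives (assumed reading of A, B, C, D).
    """
    total = a + b + c + d
    accuracy = (a + d) / total if total else 0.0
    precision = a / (a + b) if (a + b) else 0.0
    recall = a / (a + c) if (a + c) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up counts for one class, e.g. the "screen size" parameter.
result = classification_metrics(a=40, b=10, c=20, d=30)
```

With these counts, accuracy is 0.7, precision is 0.8, recall is 2/3, and F1 is 8/11, or about 0.727.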

Metrics
Parameter: screen size (for example)

Metrics: the current experiments

TODO
Large-scale dataset: scrape data from the HP/DELL/eBay websites. Chinese text dataset?
DL model: LSTM model?
Reference: "Predicting Latent Structured Intents from Shopping Queries", World Wide Web, 2017.

Thanks