The experiments based on word-embedding and SVM

Presentation transcript:

The experiments based on word-embedding and SVM 2018-09-10 Raymond ZHAO WENLONG

Content: Background; Updates on experiments (mean of word embedding + SVM classifier); Metrics; Next steps

Background: Develop a new product configuration approach for the e-commerce industry to elicit customer needs. Collect Amazon user reviews (laptops) and treat these reviews as user inputs (sentiment analysis?). Query-to-attributes mapping: map user inputs (the functional requirements in an unstructured query) onto product parameters or features (structured attributes). This is a classification problem in ML. Example: "large Screen Size laptop".

The experiments: Represent each user review by the mean of its word embeddings. Use pre-trained GloVe word embeddings, which represent words as low-dimensional, fixed-length vectors and capture word relations via inner products. Train a multi-class SVM classifier with the scikit-learn library.
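The pipeline on this slide can be sketched roughly as follows. This is a minimal illustration, not the code used in the experiments: the three-dimensional `TOY_EMBEDDINGS` table is a hypothetical stand-in for pre-trained GloVe vectors, the reviews and the "large"/"small" labels are invented, and the linear kernel is an assumption, since the slides do not name one.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical 3-dimensional vectors standing in for pre-trained GloVe
# embeddings (real GloVe vectors have 50 to 300 dimensions).
TOY_EMBEDDINGS = {
    "large": np.array([0.9, 0.1, 0.0]),
    "big": np.array([0.8, 0.2, 0.0]),
    "screen": np.array([0.5, 0.5, 0.0]),
    "small": np.array([0.0, 0.1, 0.9]),
    "compact": np.array([0.1, 0.0, 0.8]),
    "light": np.array([0.0, 0.3, 0.7]),
}
DIM = 3


def mean_embedding(text, embeddings=TOY_EMBEDDINGS, dim=DIM):
    """Average the vectors of all known words in the text.

    Words missing from the embedding table are skipped; a text with no
    known words maps to the zero vector.
    """
    vectors = [embeddings[w] for w in text.lower().split() if w in embeddings]
    if not vectors:
        return np.zeros(dim)
    return np.mean(vectors, axis=0)


# Invented reviews with invented screen-size labels (illustration only).
reviews = [
    "large screen",
    "big screen",
    "large big screen",
    "small compact",
    "compact light",
    "small light screen",
]
labels = ["large", "large", "large", "small", "small", "small"]

# One fixed-length feature vector per review.
X = np.vstack([mean_embedding(r) for r in reviews])

# SVC handles more than two classes internally; the kernel is an assumption.
clf = SVC(kernel="linear")
clf.fit(X, labels)

# "laptop" is not in the toy table, so only "big" and "large" contribute.
prediction = clf.predict([mean_embedding("big large laptop")])[0]
print(prediction)
```

Averaging discards word order, so every review maps to a vector of the same length regardless of how many words it contains, which is what lets a standard SVM consume it.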

Metrics in ML (A = true positives, B = false positives, C = false negatives, D = true negatives): Accuracy = (A + D) / Total: correct predictions out of all examples. Precision = A / (A + B): what proportion of positive identifications was actually correct? ("How much of this prediction is right?") Recall = A / (A + C): what proportion of actual positives was identified correctly? ("How many of the positive examples does this prediction cover?") F1: seeks a balance between precision and recall. Reference: Zhihu (知乎).
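As a worked illustration of these formulas, the sketch below computes all four metrics from the counts A, B, C and D. The function name `binary_metrics` and the example counts are invented for illustration. F1 is computed as the harmonic mean of precision and recall, which is the standard definition; the slide only describes it as a balance between the two.

```python
def binary_metrics(a, b, c, d):
    """Compute accuracy, precision, recall and F1 from confusion counts.

    a: true positives, b: false positives,
    c: false negatives, d: true negatives.
    Ratios with a zero denominator are reported as 0.0.
    """
    total = a + b + c + d
    accuracy = (a + d) / total if total else 0.0
    precision = a / (a + b) if (a + b) else 0.0
    recall = a / (a + c) if (a + c) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented counts: 40 TP, 10 FP, 20 FN, 30 TN.
m = binary_metrics(a=40, b=10, c=20, d=30)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these counts, accuracy is 70/100, precision is 40/50 and recall is 40/60, so precision and recall disagree noticeably and F1 falls between them.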

Metrics. Parameter: screen size (for example).
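A parameter such as screen size has more than two values, so the binary counts (true positives, false positives, false negatives, true negatives) have to be derived per class. One common way, sketched here as an assumption rather than as the method used in the experiments, is one-vs-rest: treat one value as "positive" and all others as "negative". The class names and predictions below are invented.

```python
def one_vs_rest_counts(y_true, y_pred, positive):
    """Return (tp, fp, fn, tn) for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn


# Invented gold labels and predictions for the screen-size parameter.
y_true = ["13-inch", "15-inch", "17-inch", "15-inch", "13-inch", "17-inch"]
y_pred = ["13-inch", "15-inch", "15-inch", "15-inch", "17-inch", "17-inch"]

for cls in sorted(set(y_true)):
    tp, fp, fn, tn = one_vs_rest_counts(y_true, y_pred, cls)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    print(f"{cls}: precision={precision:.2f} recall={recall:.2f}")
```

Reporting the metrics per class makes it visible when a classifier does well on frequent values and poorly on rare ones, which a single accuracy figure would hide.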

Metrics: the current experiments.

TODO: Large-scale dataset: scrape data from the HP/Dell/eBay websites. Chinese text dataset? DL model: LSTM model? Reference: "Predicting Latent Structured Intents from Shopping Queries", World Wide Web, 2017.

Thanks