Experiments based on CNN
2018-10-09
Raymond ZHAO WENLONG
Contents
- Summary of the datasets
- Experiments based on CNN (on the Amazon dataset)
Dataset from HP
- ~1k reviews for 58 laptops, ~78k words in total
- An average of ~16 reviews per laptop
- An average of ~82 words per review
- See the data: hp_data.md
Dataset from flipkart (in India)
- ~32k reviews for 408 laptops, ~1.17M words in total
- An average of ~79 reviews per laptop
- An average of ~36 words per review
- See the data: flikkart_data.md
Dataset from Amazon
- ~7.2k reviews for 116 laptops, ~521k words in total
- ~60 words per review
- See the data: amazon_laptop.json
Summary of the datasets
- From flipkart: ~32k reviews for 408 laptops, ~1.17M words in total
- From HP: ~1k reviews for 58 laptops, ~78k words in total
- From Amazon: ~7.2k reviews for 116 laptops, ~521k words in total
- Combined: ~40.2k reviews for 408 + 58 + 116 = 582 laptops (the same laptop may appear in more than one source)
- Combined: ~1.77M words in total
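The combined figures can be cross-checked by summing the per-source numbers from the dataset slides. The following is a minimal sketch, assuming the rounded values quoted on those slides; the `datasets` dictionary and the `summarize` helper are illustrative names, not part of the project code.

```python
# Rounded figures copied from the per-source dataset slides (assumption:
# the "~" values are treated as exact, so every result is only approximate).
datasets = {
    "HP": {"reviews": 1_000, "laptops": 58, "words": 78_000},
    "flipkart": {"reviews": 32_000, "laptops": 408, "words": 1_170_000},
    "Amazon": {"reviews": 7_200, "laptops": 116, "words": 521_000},
}


def summarize(stats):
    """Return the per-laptop and per-review averages for one dataset."""
    return {
        "reviews_per_laptop": stats["reviews"] / stats["laptops"],
        "words_per_review": stats["words"] / stats["reviews"],
    }


# Sum each column over the three sources. Laptops are simply added up,
# so a model sold on several sites is counted more than once.
totals = {
    key: sum(d[key] for d in datasets.values())
    for key in ("reviews", "laptops", "words")
}

for name, stats in datasets.items():
    avg = summarize(stats)
    print(f"{name}: {avg['reviews_per_laptop']:.1f} reviews/laptop, "
          f"{avg['words_per_review']:.1f} words/review")
print(totals)
```

Because the inputs are rounded, the computed averages can differ by a few units from the averages quoted on the individual slides.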
Based on CNN
- See the source code: text_cnn.py (partly adapted from cnn.py)
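For readers unfamiliar with CNNs on text, the core operation is usually a 1-D convolution over the word embeddings of a review followed by max-over-time pooling. The sketch below illustrates that general idea in plain Python; it is not the contents of text_cnn.py, and the function names, filter widths, and random toy embeddings are all assumptions made for illustration.

```python
import random


def conv1d_max_pool(embeddings, kernel, bias=0.0):
    """Slide one filter over the token sequence; keep the largest ReLU activation."""
    width = len(kernel)  # number of consecutive tokens the filter covers
    if len(embeddings) < width:
        raise ValueError("sequence is shorter than the filter width")
    activations = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        # Dot product between the filter and this window of word vectors.
        score = bias + sum(
            w * x
            for kernel_row, token_vec in zip(kernel, window)
            for w, x in zip(kernel_row, token_vec)
        )
        activations.append(max(0.0, score))  # ReLU
    return max(activations)  # max-over-time pooling


def text_cnn_features(embeddings, kernels):
    """One pooled feature per filter; a classifier layer would consume this vector."""
    return [conv1d_max_pool(embeddings, kernel) for kernel in kernels]


# Toy input: a 6-token review with 4-dimensional embeddings (random values,
# standing in for real word vectors).
rng = random.Random(0)
dim = 4
review = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(6)]
# Hypothetical filters covering 2, 3 and 4 consecutive tokens.
kernels = [
    [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(width)]
    for width in (2, 3, 4)
]
features = text_cnn_features(review, kernels)
print(features)
```

In a full model the filter weights would be learned, many filters per width would be used, and the pooled feature vector would feed a softmax layer over the target classes.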
Experiments based on CNN (on the Amazon dataset)
- Slightly better than the SWEM-con algorithm
TODO
- Run the experiments on all datasets
- RNN + LSTM?
Thanks