1
Text Analysis and Search Analytics
2
Outline
Text Analysis
  Text analysis in the online environment
  Steps in text analysis
  Analysis of Otto’s reviews
Search Analytics
3
Text Analysis
a lot of unstructured data are available in the online world:
  online communities (Twitter, Facebook, etc.)
  review sites (TripAdvisor, Yelp, etc.)
  retail sites (Amazon, Alibaba, etc.)
marketers try to extract useful information contained in these online conversations by their customers;
4
Steps in text analysis
  Social data collection and database setup
  Feature generation and pruning
  Opinion word extraction
  Sentiment analysis
  Summary generation (visualization)
5
Otto’s reviews on TripAdvisor
8
Analyzing TripAdvisor reviews of Otto’s
Look at the spreadsheet Ottos.xlsx showing all the Otto’s reviews on TripAdvisor from 11/07 to 3/18:
  What numerical analyses would you perform to analyze these data? How well is Otto’s performing?
  What textual analyses would you perform to analyze these data? Be specific!
9
Word cloud of quotes
10
Word cloud of reviews
11
Creating a word cloud
  create a text file containing the text to be analyzed;
  clean the text (remove punctuation marks and numbers, eliminate common stopwords, use text stemming to reduce words to their root form, etc.);
  determine the frequency of occurrence of all the words and eliminate unwanted words;
  choose the minimum frequency of words to be used in the word cloud and the maximum number of words to be included;
  generate and plot the word cloud;
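The cleaning and counting steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stopword list is a tiny hypothetical sample, `crude_stem` is a rough stand-in for a real stemmer, and the function names and sample text are invented for the example. The resulting word counts are what a word cloud plotting tool would take as input.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "a", "an", "and", "was", "were", "is", "to", "of", "we", "it", "in"}

def crude_stem(word):
    """Very rough suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def word_frequencies(text, min_freq=1, max_words=50):
    # Lowercase, then drop punctuation and digits by keeping only runs of letters.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Remove stopwords and single-letter fragments, then stem what is left.
    tokens = [crude_stem(t) for t in tokens if t not in STOPWORDS and len(t) > 1]
    counts = Counter(tokens)
    # Keep words that reach the minimum frequency, most frequent first.
    kept = [(w, n) for w, n in counts.most_common() if n >= min_freq]
    return kept[:max_words]

# Hypothetical sample text, not taken from the Otto's reviews.
sample = "The pizza was great. We loved the pizza, and the service was great!"
top_words = word_frequencies(sample, min_freq=2)
print(top_words)
```

Raising `min_freq` or lowering `max_words` corresponds to the step of choosing the minimum frequency and the maximum number of words shown in the cloud.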
12
Sentiment analysis
if numeric ratings are available, they will (hopefully) summarize the overall emotional tone of the text;
sometimes, it is necessary to extract the emotional tone of the text from the text itself;
usually, an attempt is made to classify words in terms of their valence (or more specific emotions):
  positive words (great, good, delicious, loved, enjoyed, etc.)
  negative words (disappointed, etc.)
  ambiguous words (busy, crowded, etc.)
a sentiment lexicon can be used to classify individual words as positive or negative, and then the overall sentiment of the text can be determined;
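A minimal sketch of lexicon-based scoring, assuming a hypothetical mini-lexicon built only from the example words above (a real sentiment lexicon has thousands of entries). Each word is looked up, positive and negative hits are counted, and the difference serves as a simple overall score. Words not in the lexicon, such as the ambiguous "busy", are ignored here.

```python
import re

# Hypothetical mini-lexicon for illustration only.
LEXICON = {
    "great": "positive", "good": "positive", "delicious": "positive",
    "loved": "positive", "enjoyed": "positive",
    "disappointed": "negative",
}

def sentiment_counts(text):
    # Split the text into lowercase words, dropping punctuation and digits.
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(1 for t in tokens if LEXICON.get(t) == "positive")
    neg = sum(1 for t in tokens if LEXICON.get(t) == "negative")
    return {"positive": pos, "negative": neg, "difference": pos - neg}

# Hypothetical review text, not taken from the Otto's data.
review = "Great food and delicious desserts, but we were disappointed by the busy dining room."
result = sentiment_counts(review)
print(result)
```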
13
Sentiment analysis for Otto’s reviews
for the 702 reviews, a sentiment analysis based on the bing lexicon shows the following:
number of positive and negative words and balance of positive and negative words:
  negative    390
  positive    2177
  difference  1787
most common positive and negative words:
  positive words: 1 good 352, 2 great, 3 excellent, 4 nice, 5 friendly, 6 well, 7 like, 8 delicious, 9 best, 10 enjoyed
  negative words: 1 crowded 24, 2 disappointed 23, 3 loud 23, 4 fried 15, 5 noisy 14, 6 slow 14, 7 cold 12, 8 bad 11, 9 pale 8, 10 disappoint 7
14
Search analytics
when somebody conducts an online search, the search will usually yield both organic listings (sorted by relevance to the query) and paid-for content (product listing ads and text ads);
paid links are based on how much advertisers have bid on specific keywords associated with particular search terms and other factors (CTR, quality of the landing page, etc.);
if somebody clicks on an organic listing, a display ad (e.g., banner ad) may appear on the landing page;
by inserting tracking codes on the pages of a website, user visits to webpages and related information can be tracked and summary reports can be generated;
15
Results of a search for ‘kitchen knives’:
Product listing ads Text ads Organic listings
17
Google Analytics reports
Audience reports show you characteristics about your users like age and gender, where they’re from, their interests, how engaged they were, whether they’re new or returning users, and what technology they’re using.
Acquisition reports show you which channels (such as advertising or marketing campaigns) brought users to your site.
Behavior reports show how people engaged on your site including which pages they viewed, and their landing and exit pages.
Conversion reports allow you to track website goals based on your business objectives.
18
Audience Overview Report
“Sessions” is the total number of sessions for the given date range.
“Users” is the total number of users who visited during the given date range.
“Pageviews” is the total number of times pages that included your Analytics tracking code were displayed to users.
“Pages per session” is the average number of pages viewed during each session.
“Average session duration” is the average length of a session based on users that visited your site in the selected date range.
“Bounce rate” is the percentage of users who left after viewing a single page on your site and taking no additional action.
“Percent of new sessions” is the percentage of sessions in your date range that came from users new to your site.
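To make these definitions concrete, here is a rough sketch that derives the same metrics from a handful of made-up session records. The record layout (`user`, `pageviews`, `duration_s`, `new_user`) is a hypothetical simplification, not the format Google Analytics uses, and a bounce is approximated as a session with exactly one pageview.

```python
# Hypothetical session records for illustration only.
sessions = [
    {"user": "u1", "pageviews": 1, "duration_s": 0,   "new_user": True},
    {"user": "u1", "pageviews": 4, "duration_s": 240, "new_user": False},
    {"user": "u2", "pageviews": 3, "duration_s": 120, "new_user": True},
    {"user": "u3", "pageviews": 1, "duration_s": 0,   "new_user": True},
]

def audience_overview(sessions):
    n = len(sessions)
    pageviews = sum(s["pageviews"] for s in sessions)
    return {
        "sessions": n,
        # Distinct users across all sessions in the date range.
        "users": len({s["user"] for s in sessions}),
        "pageviews": pageviews,
        "pages_per_session": pageviews / n,
        "avg_session_duration_s": sum(s["duration_s"] for s in sessions) / n,
        # Assumption: a single-pageview session counts as a bounce.
        "bounce_rate_pct": 100 * sum(s["pageviews"] == 1 for s in sessions) / n,
        "pct_new_sessions": 100 * sum(s["new_user"] for s in sessions) / n,
    }

overview = audience_overview(sessions)
print(overview)
```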
19
Audience Overview Report (cont’d)
e.g., this provides data on languages:
20
Additional Audience Reports
21
Acquisition Reports
information about the traffic medium, specific source, and name of the marketing campaign;
different types of mediums:
  “Organic” is used to identify traffic that arrived on your site through unpaid search like a non-paid Google Search result.
  “CPC” indicates traffic that arrived through a paid search campaign like Google AdWords text ads.
  “Referral” is used for traffic that arrived on your site after the user clicked a link on a website other than a search engine.
  “Email” represents traffic that came from an email marketing campaign.
  “(none)” is applied for users that come directly to your site by typing your URL directly into a browser.
22
Behavior and Conversion Reports
information about Pageviews, Average Time on Page, and Bounce Rate; additional information includes metrics for particular pages of the website, landing page, exit page, etc.; if website goals were defined, conversion rates can be assessed;
23
Tracking marketing campaigns
by using campaign tagging, the performance of online marketing and advertising campaigns can be assessed (e.g., a monthly newsletter with a link offering a special promotion);
AdWords can be used to generate text and display ads;
advertisers have to bid on keywords, and they can assess how well keywords and individual ads are performing;
bids can also be adjusted based on certain criteria (e.g., time of day or distance to the store), and the effectiveness of these adjustments can be analyzed;
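Campaign tagging is commonly done by appending `utm_source`, `utm_medium`, and `utm_campaign` query parameters to the links used in a campaign. The sketch below shows one way to build such a link for the newsletter example; the helper name, the URL, and the parameter values are all hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def tag_campaign_url(url, source, medium, campaign):
    """Return the URL with campaign-tracking query parameters added."""
    parts = urlsplit(url)
    # Keep any query parameters the link already has.
    query = dict(parse_qsl(parts.query))
    query.update({
        "utm_source": source,
        "utm_medium": medium,
        "utm_campaign": campaign,
    })
    return urlunsplit(parts._replace(query=urlencode(query)))

# Hypothetical newsletter link offering a special promotion.
tagged = tag_campaign_url(
    "https://example.com/promo", "newsletter", "email", "march_special"
)
print(tagged)
```

Visits arriving through the tagged link can then be grouped by source, medium, and campaign name in the acquisition reports.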
24
Keyword analysis
25
Metrics for assessing online advertising effectiveness
CPM (cost per mille): cost per thousand impressions;
CTR (click-through-rate): percentage of people who see an ad and click on it to go to the website advertised (i.e., clicks on an ad over total impressions);
CPC (cost-per-click): price paid for each click on an ad;
CPA (cost per conversion or action): cost of an ad per action of interest (e.g., purchase, subscription to a newsletter);
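The four metrics above reduce to simple ratios. The following sketch computes them for one ad, using made-up figures for cost, impressions, clicks, and conversions; the function name and the numbers are illustrative assumptions.

```python
def ad_metrics(cost, impressions, clicks, conversions):
    return {
        "cpm": 1000 * cost / impressions,      # cost per thousand impressions
        "ctr_pct": 100 * clicks / impressions, # clicks over impressions, in percent
        "cpc": cost / clicks,                  # cost per click
        "cpa": cost / conversions,             # cost per conversion or action
    }

# Hypothetical campaign: $200 spent, 50,000 impressions, 1,000 clicks, 40 purchases.
m = ad_metrics(cost=200.0, impressions=50_000, clicks=1_000, conversions=40)
print(m)
```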