Retrieving Relevant Reports from a Customer Engagement Repository Dharmesh Thakkar Zhen Ming Jiang Ahmed E. Hassan School of Computing, Queen’s University,


1 Retrieving Relevant Reports from a Customer Engagement Repository Dharmesh Thakkar Zhen Ming Jiang Ahmed E. Hassan School of Computing, Queen’s University, Canada Gilbert Hamann Parminder Flora Research In Motion (RIM), Canada

2 Software Maintenance: Customer Support

3 Retrieving Relevant Reports ■ State of Practice: –No systematic techniques to retrieve and use information for future engagements –Keyword searching is limited: depends on the search skills and experience of the analyst and peculiarity of the problem

4 Customer Support Problem Statement ■ We want to find customers with similar operational and problem profiles ■ We can reuse prior solutions and knowledge
[Figure: a new customer engagement is compared against the profiles of other customers]
– Heavy Email, Light Web, Light MDS
– Light Email, Light Web, Light MDS
– Light Email, Heavy Web, Light MDS
– Heavy Email, Heavy Web, No MDS
– Light Email, Light Web, Heavy MDS

5 Using Logs for Customer Support ■ Execution logs are readily available and contain –Operational Profile: usage patterns (heavy users of email from device, or to device, or light users of calendar, etc.) –Signature Profile: specific error line patterns (connection timeout, database limits, messages queued up, etc.) ■ Find the most similar profile

6 Execution Logs ■ Contain time-stamped sequence of events at runtime ■ Readily available representatives of both feature executions and problems
Queuing new mail msgid=ABC threadid=XYZ
Instant message. Sending packet to client msgid=ABC threadid=XYZ
New meeting request msgid=ABC threadid=XYZ
Client established IMAP session emailid=ABC threadid=XYZ
Client disconnected. Cannot deliver msgid=ABC threadid=XYZ
New contact in address book emailid=ABC threadid=XYZ
User initiated appointment deletion emailid=ABC threadid=XYZ

7 Example Other Customers Compare

8 Our Technique

9 Log Lines to Event Distribution ■ Remove dynamic information –Example: Given the two log lines “Open inbox user=A” and “Open inbox user=B”, map both lines to the event “Open inbox user=?” ■ Use event percentages to compare event logs for different running lengths without bias
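The abstraction step on this slide can be sketched in Python. This is a minimal illustration, not the authors' implementation: the rule that every `key=value` token carries dynamic information, and the names `to_event` and `event_distribution`, are assumptions made for the example.

```python
import re
from collections import Counter

# Assumption: dynamic information always appears as "key=value" tokens.
DYNAMIC_VALUE = re.compile(r"(\w+)=\S+")


def to_event(log_line: str) -> str:
    """Map a log line to its event by replacing each dynamic value with '?'."""
    return DYNAMIC_VALUE.sub(r"\1=?", log_line.strip())


def event_distribution(log_lines):
    """Return the percentage of log lines that map to each event."""
    counts = Counter(to_event(line) for line in log_lines if line.strip())
    total = sum(counts.values())
    return {event: 100.0 * n / total for event, n in counts.items()}


if __name__ == "__main__":
    lines = [
        "Open inbox user=A",
        "Open inbox user=B",
        "Queuing new mail msgid=ABC threadid=XYZ",
        "Open inbox user=C",
    ]
    for event, pct in event_distribution(lines).items():
        print(f"{pct:6.2f}%  {event}")
```

Because the result is expressed in percentages rather than raw counts, a one-hour log and a one-week log with the same usage mix produce the same distribution.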

10 Compare Event Distributions ■ Kullback-Leibler Divergence ■ Cosine Distance
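Both measures can be sketched over event distributions stored as dictionaries. The slide names the measures but not the exact variants, so the choices here are assumptions: the standard one-directional Kullback-Leibler divergence, a small additive smoothing constant `eps` to avoid division by zero for events missing from one log, and cosine distance defined as one minus cosine similarity.

```python
import math


def kl_divergence(p, q, eps=1e-9):
    """D(P || Q) over the union of events; eps smoothing is an assumption."""
    events = sorted(set(p) | set(q))
    pv = [p.get(e, 0.0) + eps for e in events]
    qv = [q.get(e, 0.0) + eps for e in events]
    p_sum, q_sum = sum(pv), sum(qv)
    # Renormalise so both smoothed vectors are proper distributions.
    pv = [x / p_sum for x in pv]
    qv = [x / q_sum for x in qv]
    return sum(a * math.log(a / b) for a, b in zip(pv, qv))


def cosine_distance(p, q):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    events = set(p) | set(q)
    dot = sum(p.get(e, 0.0) * q.get(e, 0.0) for e in events)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 1.0
    return 1.0 - dot / (norm_p * norm_q)


if __name__ == "__main__":
    new_customer = {"Open inbox user=?": 0.7, "New meeting request msgid=?": 0.3}
    other_customer = {"Open inbox user=?": 0.2, "New meeting request msgid=?": 0.8}
    print("K-L divergence :", kl_divergence(new_customer, other_customer))
    print("Cosine distance:", cosine_distance(new_customer, other_customer))
```

With either measure, a smaller value means the two logs have more similar event distributions, so the repository entry with the smallest distance to the new log is the closest match.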

11 Identify Signature Events ■ Signature Events have a different frequency when compared to events in other log files –Example signature events: dropped connections, thread dumps, and full queues ■ Chi-square test identifies such events
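One way to apply a chi-square test here is a 2x2 contingency table per event: occurrences of the event versus all other events, in the target log versus the rest of the repository. The slide does not describe the test setup, so this layout, the 0.05 significance level (critical value 3.841 for one degree of freedom), and the function names are assumptions for illustration.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def signature_events(target_counts, other_counts, critical=3.841):
    """Events whose frequency in the target log differs from the other logs.

    critical=3.841 is the chi-square critical value for 1 degree of freedom
    at the 0.05 level; the level itself is an assumption of this sketch.
    """
    target_total = sum(target_counts.values())
    other_total = sum(other_counts.values())
    found = {}
    for event in set(target_counts) | set(other_counts):
        a = target_counts.get(event, 0)  # event in target log
        c = other_counts.get(event, 0)   # event in other logs
        stat = chi_square_2x2(a, target_total - a, c, other_total - c)
        if stat > critical:
            found[event] = stat
    return found


if __name__ == "__main__":
    target = {"dropped connection": 50, "open inbox": 475, "send mail": 475}
    others = {"dropped connection": 5, "open inbox": 500, "send mail": 495}
    for event, stat in signature_events(target, others).items():
        print(f"{event}: chi-square = {stat:.2f}")
```

The function takes raw event counts rather than percentages, because the chi-square statistic depends on sample size.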

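12 Measuring Performance ■ Precision = 2/4 = 50%: 2 of the 4 retrieved log files are relevant; precision is 100% if all the retrieved log files are relevant ■ Recall = 2/3 = 67%: 2 of the 3 relevant log files are retrieved; recall is 100% if all the relevant log files are retrieved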
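The arithmetic on this slide can be reproduced with a short Python sketch. The log file names are hypothetical, chosen so that 4 files are retrieved, 3 are relevant, and 2 are both.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved log files."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant files that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical file names matching the slide's 2/4 and 2/3 example.
    retrieved = ["log1", "log2", "log3", "log4"]
    relevant = ["log1", "log2", "log5"]
    precision, recall = precision_recall(retrieved, relevant)
    print(f"Precision = {precision:.0%}, Recall = {recall:.0%}")
```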

13 The Big Picture

14 Case Studies ■ Case Study I –Dell DVD Store open source application –Code instrumentation done for event logging –Built the execution log repository by applying synthetic workloads, changing the workload parameters each time ■ Case Study II –Globally deployed commercial application –More than 500 unique execution events

15 Case Study Results ■ Dell DVD Store –100% precision and recall on both operational profile based and signature profile based retrieval ■ Commercial Application –100% precision and recall for signature profile based retrieval –Results for operational profile based retrieval:

Experiment | Count of Log Files | K-L Distance Precision | K-L Distance Recall | Cosine Distance Precision | Cosine Distance Recall
Single Feature Group | 28 | 67.71% | 90.28% | 67.71% | 90.28%
Multiple Feature Groups | 28 | 60.71% | 80.95% | 75.00% | 100.00%
All Feature Groups | 12 | 72.92% | 97.22% | 62.50% | 83.33%
Real World Log Files | 12 | 54.17% | 72.22% | 68.75% | 91.67%
All the Log Files | 80 | 59.93% | 79.90% | 56.72% | 75.62%

16 Sources of Errors ■ Events that do not correspond directly to a particular operational feature, such as idle time events, server health check events, startup and shutdown events ■ Imbalance in the event logging

17 Imbalance in Event Logging

18 Related Work ■ Data mining techniques on textual information [Hui and Jha, 2000] –Cons: Limited results, depending on analyst’s search skills and peculiarity of the problem ■ Using customer usage data [Elbaum and Narla, 2004] –Cons: Customer usage data rarely exists ■ Clustering HTTP execution logs [Menascé, 1999] –Cons: Complex process, works only for HTTP logs ■ Software Agent Deployment to build operational profile [Ramanujam et al., 2006] –Cons: Intrusive, complex, costly

19 Conclusion

