Retrieving Relevant Reports from a Customer Engagement Repository Dharmesh Thakkar Zhen Ming Jiang Ahmed E. Hassan School of Computing, Queen’s University, Canada Gilbert Hamann Parminder Flora Research In Motion (RIM), Canada
Software Maintenance: Customer Support
Retrieving Relevant Reports ■ State of Practice: –No systematic techniques to retrieve and use information for future engagements –Keyword searching is limited: depends on the search skills and experience of the analyst and peculiarity of the problem
Customer Support Problem Statement ■ We want to find customers with similar operational and problem profiles ■ We can reuse prior solutions and knowledge Heavy , Light Web, Light MDS Light , Light Web, Light MDS Light , Heavy Web, Light MDS Heavy , Heavy Web, No MDS Light , Light Web, Heavy MDS Other Customers Compare New Customer Engagement
Using Logs for Customer Support ■ Execution logs are readily available and contain –Operational Profile: usage patterns (heavy users of from device, or to device, or light users of calendar, etc.) –Signature Profile: specific error line patterns (connection timeout, database limits, messages queued up, etc.) ■ Find the most similar profile
Execution Logs ■ Contain time-stamped sequence of events at runtime ■ Readily available representatives of both feature executions and problems Queuing new mail msgid=ABC threadid=XYZ Instant message. Sending packet to client msgid=ABC threadid=XYZ New meeting request msgid=ABC threadid=XYZ Client established IMAP session id=ABC threadid=XYZ Client disconnected. Cannot deliver msgid=ABC threadid=XYZ New contact in address book id=ABC threadid=XYZ User initiated appointment deletion id=ABC threadid=XYZ
Example Other Customers Compare
Our Technique
Log Lines to Event Distribution ■ Remove dynamic information –Example: Given the two log lines “Open inbox user=A” and “Open inbox user=B”, map both lines to the event “Open inbox user=?” ■ Use event percentages to compare event logs for different running lengths without bias
Compare Event Distributions ■ Kullback-Leibler Divergence ■ Cosine Distance
Identify Signature Events ■ Signature Events have a different frequency when compared to events in other log files –Example signature events: dropped connections, thread dumps, and full queues ■ Chi-square test identifies such events
Measuring Performance ■ Precision = 2/4 = 50% 100% precise if all the retrieved log files are relevant ■ Recall = 2/3 = 67% 100% recall if all the relevant log files are retrieved
The Big Picture
Case Studies ■ Case Study I –Dell DVD Store open source application –Code instrumentation done for event logging –Built the execution log repository by applying synthetic workloads, changing the workload parameters each time ■ Case Study II –Globally deployed commercial application –More than 500 unique execution events
Case Study Results ■ Dell DVD Store –100% precision and recall on both operational profile based and signature profile based retrieval ■ Commercial Application –100% precision and recall for signature profile based retrieval –Results for operational profile based retrieval: Experiment Count of Log Files K-L DistanceCosine Distance PrecisionRecallPrecisionRecall Single Feature Group %90.28%67.71%90.28% Multiple Feature Groups %80.95%75.00%100.00% All Feature Groups %97.22%62.50%83.33% Real World Log Files %72.22%68.75%91.67% All the Log Files %79.90%56.72%75.62%
Sources of Errors ■ Events that do not correspond directly to a particular operational feature, such as idle time events, server health check events, startup and shutdown events ■ Imbalance in the event logging
Imbalance in Event Logging
Related Work ■ Data mining techniques on textual information [Hui and Jha, 2000] –Cons: Limited results, depending on analyst’s search skills and peculiarity of the problem ■ Using customer usage data [Elbaum and Narla, 2004] –Cons: Customer usage data rarely exists ■ Clustering HTTP execution logs [Menascé, 1999] –Cons: Complex process, works only for HTTP logs ■ Software Agent Deployment to build operational profile [Ramanujam et. al., 2006] –Cons: Intrusive, complex, costly
Conclusion