Visualization and Evaluation at Microsoft Research
George Robertson, Mary Czerwinski, and the VIBE team
Visualization Benefits
Visualization in Microsoft Products
- Data Visualization
  - Excel charting
- Information Visualization
  - Basic hierarchy visualization (TreeView)
  - Microsoft Business Solutions
  - BizTalk Server
Visualization Research Categories: Information Management
- Data Mountain (UIST’98)
- Photo Mountain (WinHEC 2001)
- DateLens (CHI 2003)
- FacetMap (AVI 2006)
- FaThumb (CHI 2006)
- Principles: leverage spatial memory and animation, use space-filling layouts for scaling, provide tools for personalization
Document Management: Data Mountain (UIST’98)
Subject layout of 100 pages
- Size is the strongest cue
- 26% faster than IE4
- After 6 months, no performance change
- Images help, but are not required
- Faster retrieval when similar pages are highlighted
Evaluation
- Wanted to test the spatial memory hypothesis
- Also wanted to know the influence of other factors:
  - Thumbnail image
  - Audio cues
  - Text summaries
Method
- Gave subjects a “cue” to look for after they arranged their Data Mountain
- Cue had either a text summary, a thumbnail, an audio cue, or all 3
- Dependent measures: time to retrieve the right page and number of “misses”
- After 6 months, had them do it again
  - This time, 50% of the trials had the thumbnail images turned off!
Calendar Visualization: DateLens (CHI 2003)
- With Ben, U. Maryland
- Fisheye representation of dates
- Compact overviews
- User control over the view
- Integrated keyword search
- Enables overviews and fluid navigation to discover patterns and outliers
- Integrated with Outlook
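One way to picture a fisheye representation of dates is to give the focused week a larger share of the screen height and let the other weeks split what remains. The following is a minimal sketch under that assumption; the function `fisheye_heights` and its `focus_share` parameter are hypothetical and are not taken from DateLens itself.

```python
def fisheye_heights(n_rows, focus_row, total_height, focus_share=0.5):
    """Split total_height among n_rows, giving focus_row a larger share.

    Hypothetical scheme: the focused row gets `focus_share` of the height,
    and the remaining rows split the rest evenly.
    """
    if not 0 <= focus_row < n_rows:
        raise ValueError("focus_row out of range")
    if n_rows == 1:
        return [float(total_height)]
    focus_height = total_height * focus_share
    other_height = (total_height - focus_height) / (n_rows - 1)
    return [focus_height if row == focus_row else other_height
            for row in range(n_rows)]


# Five week-rows on a 320-pixel-tall screen, with the third week in focus.
heights = fisheye_heights(n_rows=5, focus_row=2, total_height=320)
print(heights)  # [40.0, 40.0, 160.0, 40.0, 40.0]
```

The unfocused rows stay visible as a compact overview, which is the property the slide highlights; a real implementation would likely distort columns as well and animate the change of focus.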
Evaluation
- First, prototyped on the desktop for a formative evaluation, and also tested against the existing UI
- Next, built onto the Pocket PC
- Gave to Pocket PC (PPC) owners for 3 days of use
- Ran benchmark tasks with them on the 4th day; collected satisfaction ratings over all 4 days
Benchmark Study: DateLens v. Microsoft’s Pocket PC 2002™
- Goals
  - 1st iteration of the UI with potential users
  - Compare its overall usability against an existing product
- Used Mary’s calendar, seeded with artificial calendar events
Figure 7: Screen shots of the Microsoft Pocket PC Calendar program that was used in the study showing day, week, month, and year views.
Methods
- 11 knowledge workers (5 female)
  - All experienced PC users, not PDA users
- 11 isomorphic browsing tasks on each calendar
  - All conditions counterbalanced
  - All tasks had a deadline of 2 minutes
  - Find the dates of specific calendar events (e.g., birthdays)
  - Determine how many Mondays a month contained
  - View all birthdays for the next 3 months
- Measures: task times, success rate, verbal protocols, user satisfaction, and preference
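Counterbalancing can be sketched as alternating which calendar each participant uses first, so that order effects are spread across both interfaces. The slide does not say how the orders were assigned, so the alternation scheme and the function name `counterbalanced_orders` below are assumptions for illustration only.

```python
from itertools import cycle

CALENDARS = ("DateLens", "Pocket PC")


def counterbalanced_orders(n_participants, conditions=CALENDARS):
    """Alternate which of two conditions each participant sees first."""
    first, second = conditions
    orders = cycle([(first, second), (second, first)])
    return [next(orders) for _ in range(n_participants)]


orders = counterbalanced_orders(11)
datelens_first = sum(1 for order in orders if order[0] == "DateLens")
print(datelens_first, len(orders) - datelens_first)  # 6 5
```

With an odd number of participants, such as the 11 here, the two orders cannot be split exactly evenly; one order gets one extra participant.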
Results: Task Times
- Tasks were performed marginally faster using DateLens, F(1,8) = 3.5, p = .08
  - Avg = 49 sec v. […] sec for the Pocket PC
- Complex tasks were significantly harder, p < .01, but were handled reliably better by DateLens (task x calendar interaction), p = .04
Task Success
- Tasks were completed successfully significantly more often using DateLens (on average, 88.2% v. 76.3% for the PPC), p < .001
- There was also a significant main effect of task, p < .001
- For the most difficult task (#11), no participant using the Pocket PC completed the task successfully
Usability Issues
- Many users disliked the view when more than 6 months were shown
- Concerns about the readability of text, which needs to be customizable
- Wanted more control over how weeks were viewed (e.g., start on Sunday or Monday?)
- Needed better visual indicators of conflicts in both calendars, e.g., red highlights and/or a “conflicts” filter
FacetMap: Faceted Search Results of Digital Bits
- Uses the metadata of your digital stuff to aid browsing
- Abstract, scalable, space-filling
- Visual more than textual
- Study showed it was favored over existing techniques for browsing tasks
Small Size
Large Size (Wall Display)
Evaluation
- Wanted to test against a textual search UI (the existing system)
- Needed to use both text search and browsing at various levels of depth
  - Targeted: Find the earliest item Gordon received from Jim Gemmell (text search for “Gemmell”).
  - Browse: Name a document that Gordon modified in the 3rd week of May, 2000.
- Also needed to test search across different kinds of dimensions (file type, date, people, etc.)
The Text Baseline
Results
Question                              FacetMap    Memex
Mental Demand                         4.0 (1.8)   4.3 (1.6)
Physical Demand                       3.6 (2.1)   3.6 (1.6)
System Response Time                  4.8 (1.4)   5.7 (1.1)
Satisfaction                          5.6 (1.4)   5.4 (0.8)
Preference over Existing Techniques   4.9 (1.2)   5.2 (1.4)
Browsing Support                      5.9 (0.9)   […]
Text Search Support                   5.9 (1.4)   5.3 (0.8)
Aesthetic Appeal                      5.3 (1.3)   4.1 (1.5)
Visualization Research Categories: Task Management
- Scalable Fabric (WinHEC 2003)
- Clipping Lists (summer 2005)
- Change Borders (summer 2005)
- Principles: leverage spatial memory and the periphery to reduce clutter and improve glanceability
  - Users stay in the flow of their tasks longer and switch more optimally
Task Management: Scalable Fabric (WinHEC 2003)
- Beyond minimization
- Manage Windows tasks using natural human skills
- Central focus area
- Periphery: windows scaled
- Cluster of windows = task
- Works on a variety of displays
- Download available (Aug): 5000 downloads in the first 2 months
Evaluation
- Similar to TG: users lay out tasks
- Simulate task switching
- Compare to the TaskBar
- Also, 3 weeks of real usage + satisfaction
Visualization Research Categories: Improved Productivity & Readability
- Clipping Lists and Change Borders
- Principles: remove content of less importance to get more info on the screen; reduce occlusion for readability
Study: compare abstraction techniques
- Change detection: signals when a change has occurred
- Semantic content extraction: pulling out and showing the most relevant content
- Scaling: shrunken version of all the content
- Which will most improve multitasking efficiency?
Our Designs
Study Design (2 x 2)
- Columns: no semantic content extraction, semantic content extraction
- Rows: no change detection, change detection
Comparing Tradeoffs
- No change detection, no semantic content extraction: + spatial layout; – no legible content
- No change detection, semantic content extraction: + most relevant task info; – detailed visuals / text
- Change detection, no semantic content extraction: + spatial layout; + simple visual cue for change; – limited info
- Change detection, semantic content extraction: + most relevant task info; + simple visual cue for change
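The two factors cross to give four conditions, which can be enumerated as in the sketch below. The mapping from each cell to a UI name is an assumption, inferred from the four UI names that appear in the user-preference results later in the talk; the slides do not state it explicitly.

```python
from itertools import product

# Assumed mapping from (semantic extraction, change detection) to UI name,
# inferred from the UI names used in the preference results of this talk.
UI_NAMES = {
    (False, False): "Scalable Fabric",
    (False, True): "Scalable Fabric + Change Borders",
    (True, False): "Clipping Lists",
    (True, True): "Clipping Lists + Change Borders",
}

conditions = [
    {"semantic_extraction": sem, "change_detection": chg, "ui": UI_NAMES[(sem, chg)]}
    for sem, chg in product([False, True], repeat=2)
]

for condition in conditions:
    print(condition)
```

Crossing the factors this way lets the effect of each technique be examined separately as well as in combination.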
User Study: Participants
- 26 users from the Seattle area (10 female)
  - Moderate to high experience using computers and Microsoft Office-style applications
User Study: Tasks
Four tasks designed to mimic real-world tasks:
- Quiz: wait for modules to load
- Uploads: wait for documents to upload
- […]: wait for quiz answers and upload task documents to arrive
- Puzzle: high-attention task done while waiting
Quiz
Uploads
Puzzle
User Study Setup
[Figure: left monitor and right monitor]
Measures
- Overall performance: task duration
- Accuracy of task resumption timing: time to resume a task (e.g., time between an upload finishing and the user clicking on the upload tool)
- Task flow: number of task switches
- Recognition of windows and reacquisition of task: number of window switches within a task
- User satisfaction: survey after each trial and after the lab session
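Two of these measures, resumption timing and task switches, can be computed from a timestamped event log. The sketch below assumes a hypothetical log format of `(seconds, event, task)` tuples with invented sample data; the study's actual logging format is not described in the slides.

```python
# Hypothetical event log: (seconds since trial start, event type, task name).
# "ready" means a waiting task finished (e.g., an upload completed);
# "focus" means the user switched to a window belonging to that task.
LOG = [
    (0, "focus", "puzzle"),
    (30, "ready", "uploads"),
    (42, "focus", "uploads"),
    (60, "focus", "puzzle"),
    (95, "ready", "quiz"),
    (100, "focus", "quiz"),
    (130, "focus", "puzzle"),
]


def resumption_lags(log):
    """Seconds between each task becoming ready and the user returning to it."""
    pending = {}  # task -> time it became ready
    lags = []
    for time, event, task in log:
        if event == "ready":
            pending[task] = time
        elif event == "focus" and task in pending:
            lags.append(time - pending.pop(task))
    return lags


def task_switches(log):
    """Number of times focus moves from one task to a different one."""
    focused = [task for _, event, task in log if event == "focus"]
    return sum(1 for prev, cur in zip(focused, focused[1:]) if prev != cur)


print(resumption_lags(LOG))  # [12, 5]
print(task_switches(LOG))    # 4
```

Shorter lags suggest the peripheral display made it easier to notice that a task was ready, and fewer switches suggest less checking on tasks that were not yet ready.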
Results: overall performance
- Clipping Lists: faster task times
- Change Borders: no significant improvement
Results: task resumption timing
- Clipping Lists: trend toward more accurate task resumption timing
Results: task flow
- Clipping Lists: reduced switches
- Change Borders: increased switches for Scalable Fabric (SF)
Results: recognition & reacquisition
- Clipping Lists: reduced window switches
- Change Borders: no significant improvement
Results: user satisfaction
- Clipping List UIs rated better than those without
- Change Border UIs rated better than those without
- Preferred UI
  - 17: Clipping Lists + Change Borders
  - 4: Scalable Fabric + Change Borders
  - 2: Clipping Lists
  - 2: Scalable Fabric
Results Summary
- Clipping Lists were most effective for all metrics
  - Overall performance speed
  - Accuracy of task resumption timing (not significant)
  - Task flow
  - Recognition of windows & reacquisition of task
  - User satisfaction
- Improvements are cumulative, adding up to a sizeable impact on daily multitasking productivity
  - Clipping Lists: 29 seconds faster on average
  - Clipping Lists + Change Borders: 44 seconds faster on average
Results: semantic content extraction…
- benefits task flow, resumption timing, and reacquisition
- improves multitasking performance more than either change detection or scaling
- Implication for the design of peripheral interfaces that support multitasking:
  - Providing enough relevant task info is more important than keeping the design very simple
Visualization Research Categories: Software Visualization
- FastDASH
- Principles: leverage usage data to expose the most important, relevant content and improve discoverability
FastDASH
- Peripheral display showing a dev team who has what checked out of a code base
- Shows individual team members, what they’ve checked out, what method they’re in, what they’ve changed, and where they may be blocked and need help
- Display devotes more screen real estate to bigger files in the code base
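Devoting more screen real estate to bigger files can be sketched as a space-filling layout in which each file's rectangle has an area proportional to its size. The single-row "slice" layout below is an assumption chosen for brevity, not FastDASH's actual layout algorithm, and the file names and sizes are invented.

```python
def slice_layout(file_sizes, width, height):
    """Return {name: (x, y, w, h)} with each strip's area proportional to file size."""
    total = sum(file_sizes.values())
    if total <= 0:
        raise ValueError("file sizes must sum to a positive number")
    rects = {}
    x = 0.0
    for name, size in file_sizes.items():
        w = width * size / total  # strip width scales with the file's share
        rects[name] = (x, 0.0, w, float(height))
        x += w
    return rects


# Hypothetical file sizes in lines of code.
sizes = {"Parser.cs": 600, "Lexer.cs": 300, "Util.cs": 100}
layout = slice_layout(sizes, width=1000, height=400)
for name, rect in layout.items():
    print(name, rect)
```

Because every strip has the same height, width alone carries the size information; a full treemap would also subdivide recursively by folder.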
Evaluation
- Developed a coding scheme to quickly document the team’s communication and display-usage behaviors
- Code 2 days without FastDASH
- Insert the FastDASH display on the 3rd day
- Code 2 days with the FastDASH display
- Pre- and post-study satisfaction and situation awareness surveys
Reduction in Use of Shared Artifacts
Increase in Certain Communications
Visualization Research Categories: Large Information Spaces
- Polyarchy (CHI 2002)
- PaperLens (InfoVis 2004, CHI 2005)
- Schema Mapper (CHI 2005)
- Treemap Vis of Newsgroup Communities
- Principles: support interactive data exploration through highlighting, transparency, animation, and focus + context techniques
Polyarchy Visualization (CHI 2002): Multiple Intersecting Hierarchies
- Show multiple hierarchies
- Show other relationships
- Search results in context
Evaluation
- Systematically explored each potential animation speed and transition style
- Also, keystroke evaluation
Topic Trends Visualization: PaperLens (InfoVis 2004, CHI 2005)
- Understanding a conference
  - InfoVis (8 years)
  - CHI (23 years)
- Helps understand:
  - Topic evolution
  - Frequently published authors
  - Frequently cited papers/authors
  - Relationships between authors
Evaluation
- Formative evaluation with target end users
- Used the information visualization contest questions to make sure the prototype satisfied the requirements
- Noted usability issues and redesigned
- Scaled up for CHI, which required massive changes
Schema Mapper (CHI 2005)
- Current techniques fail for large schemas/maps
Schema Mapper
- Emphasize relevant relationships
- De-emphasize other relationships
Evaluation
- Systematically tested each new feature addition against the shipping product on mapping tasks
- Used real schema map designers
Goals for Future
- Visual representations that:
  - Exploit human perception, pattern matching, and spatial memory
  - Summarize and scale to very large datasets
  - Use animated transitions to help retain context
  - Scale to a variety of display form factors
  - Move into collaborative/sharing task domains
- Challenges: user-centered design, creative breakthroughs, need for machine learning expertise
Thanks for your attention!