Visualization and Cluster
Visual Analysis: “the science of analytical reasoning facilitated by interactive visual interfaces” (Thomas and Cook, 2005). Users interact with data, test hypotheses, and formulate knowledge. Human intuition is reliable, but few users are well-versed in algorithms.
Clustering without Human: Users have domain knowledge for feature selection, that is, which features are more important than others. Sometimes features have different weights in different usage scenarios. Priority distribution. Users know the outliers in the dataset. Users can give an initial state.
Show Cluster with Visualization
BaVA: Approach. Bayesian Visual Analytics (BaVA) framework. Display the posterior result, using dimension reduction for display. Experts give feedback by adjusting the layout. Feedback is at the observation level rather than the dimension level. Open question: what kinds of interaction should be captured? Human input changes the underlying probabilistic model, which then updates the display.
Semantic Layout: ForceSPIRE (video: https://www.youtube.com/watch?v=I3cKKSFnePo&feature=youtu.be)
Interface design: Move document: positions are updated based on the current weights. Link documents: the weights need to be updated. Pin: fixes a document at a semantic location. Example: moving a document close to a pin is an expressive movement. Text highlighting: increase the weight of the highlighted keywords and update the layout.
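The text-highlighting step can be sketched as a weight update followed by renormalization. This is a minimal sketch, not ForceSPIRE's actual update rule: the function name `highlight_keywords`, the multiplicative `boost` factor, and the assumption that keyword weights sum to 1 are all hypothetical.

```python
def highlight_keywords(weights, highlighted, boost=2.0):
    """Return new keyword weights with the highlighted terms boosted.

    weights: dict mapping keyword -> non-negative weight (assumed to sum to 1).
    highlighted: iterable of keywords the user highlighted.
    boost: hypothetical multiplicative factor (> 1) for highlighted terms.
    """
    highlighted = set(highlighted)
    boosted = {k: (w * boost if k in highlighted else w) for k, w in weights.items()}
    total = sum(boosted.values())
    # Renormalize so the weights still sum to 1 and stay comparable.
    return {k: w / total for k, w in boosted.items()}


weights = {"flight": 0.25, "bank": 0.25, "market": 0.25, "weather": 0.25}
updated = highlight_keywords(weights, ["flight"])
```

After the update, the layout would be recomputed with the new weights, so documents sharing the highlighted keyword are pulled closer together.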
Interface design: Search: documents containing the keywords are highlighted. Document coloring: marks documents of the same group. Visual level of detail: more detail makes a document easier to reference. Annotation: gives semantic information to clusters, which also makes them easier to reference.
Interaction Feedback: The spatial information in an interaction is ambiguous (covered later). Operations are at the observation level, not the dimension level, because letting users adjust parameters directly is just a guessing game. Operations also take place in 2D space rather than in the high-dimensional space.
Interaction Feedback: two examples, PPCA and MDS.
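Probabilistic PCA: PCA projects the data onto the directions of maximum variance (equivalently, it minimizes the reconstruction error e). Problem with PCA: important structures in the data, such as clusters, may not correlate with variance. Probabilistic PCA (PPCA) recasts PCA as a probabilistic latent-variable model.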
Probabilistic PCA: Let Σ_d be the marginal variance of d and S_d the empirical variance of d. Then MAP(Σ_d) = S_d. The coordinates of each observation in the display depend on this variance, which gives the relationship between variance and coordinates.
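For reference, the standard PPCA formulation (Tipping and Bishop) is written out below. That the slides used this exact notation is an assumption: W (the loading matrix), μ (the mean), σ² (the noise variance), z (the low-dimensional coordinate), and the dimensions p and q are the usual textbook symbols rather than symbols preserved in the text above.

```latex
\begin{aligned}
d &= W z + \mu + \epsilon, \qquad z \sim \mathcal{N}(0, I_q), \quad \epsilon \sim \mathcal{N}(0, \sigma^2 I_p) \\
d &\sim \mathcal{N}(\mu, \Sigma_d), \qquad \Sigma_d = W W^\top + \sigma^2 I_p \\
\mathbb{E}[z \mid d] &= (W^\top W + \sigma^2 I_q)^{-1} W^\top (d - \mu)
\end{aligned}
```

The last line is what ties the displayed coordinates to the variance: changing the estimate of Σ_d changes W and σ², and therefore where each observation lands in the 2D view.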
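User guided PPCA: The display shows the point layout in 2D space. The user drags two observations closer together if they think the observations are similar, or farther apart if they think the observations are different.
User guided PPCA: Dragged points have both different and similar features. The adjustment is expressed as a hypothetical variance matrix f(p), with an additional weight v that shows how important the adjustment is. Together these form the parameter feedback.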
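How f(p) and v enter the model is not preserved in the text above. One simple possibility, shown purely as an assumption and not as the BaVA update rule, is a convex combination of the empirical variance S_d and the hypothetical variance f(p), with v as the mixing weight. The function name `blend_variance` and the example matrices are hypothetical.

```python
def blend_variance(empirical, hypothetical, v):
    """Convex combination of two variance matrices given as lists of lists.

    v in [0, 1] is the weight given to the user's feedback (an assumed role):
    v = 0 ignores the feedback, v = 1 trusts it completely.
    """
    if not 0.0 <= v <= 1.0:
        raise ValueError("v must lie in [0, 1]")
    return [
        [(1.0 - v) * s + v * f for s, f in zip(s_row, f_row)]
        for s_row, f_row in zip(empirical, hypothetical)
    ]


S_d = [[4.0, 1.0], [1.0, 2.0]]  # empirical variance of the data (made-up values)
f_p = [[1.0, 0.0], [0.0, 6.0]]  # hypothetical variance implied by the drag (made-up values)
updated = blend_variance(S_d, f_p, v=0.25)
```

A larger v shifts the variance estimate, and hence the recomputed layout, further toward what the user's drag implies.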
User guided PPCA (video: https://www.youtube.com/watch?v=k5NYZbP4yKQ)
Weighted MDS: Minimize the sum of the differences between the pairwise distances in the real (high-dimensional, feature-weighted) space and the pairwise distances in the embedded space.
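The objective described above can be sketched as a stress function. This is a sketch under assumptions: the weighted Euclidean distance and the use of an absolute (rather than squared) difference are choices made here, and the names `weighted_distance` and `stress` are hypothetical. The code only evaluates the objective; an actual WMDS would search for the low-dimensional layout that minimizes it.

```python
import math
from itertools import combinations


def weighted_distance(a, b, w):
    """Weighted Euclidean distance between high-dimensional points a and b."""
    return math.sqrt(sum(wk * (ak - bk) ** 2 for wk, ak, bk in zip(w, a, b)))


def stress(high, low, w):
    """Sum over all pairs of |embedded distance - weighted real-space distance|."""
    total = 0.0
    for i, j in combinations(range(len(high)), 2):
        total += abs(math.dist(low[i], low[j]) - weighted_distance(high[i], high[j], w))
    return total


high = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]   # points in the real space
w = (0.5, 0.5)                                # feature weights
s = math.sqrt(2.0)
good_layout = [(0.0, 0.0), (s, 0.0), (0.0, s)]            # preserves weighted distances
bad_layout = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]         # squashes all points together
```

Here `stress(high, good_layout, w)` is essentially zero, while `stress(high, bad_layout, w)` is clearly positive, which is what the minimization exploits.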
User guided MDS
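User guided MDS: The user adjusts the relative positions of points. Solve for the weights w so that the weighted distances between the original points d match the distances between the adjusted positions r (r: adjusted positions, d: original positions).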
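Solving for w is the inverse of the WMDS problem: the layout is now fixed by the user and the feature weights are the unknowns. The sketch below uses a brute-force grid search over two weights that sum to 1. The grid search, the two-feature restriction, and the names `layout_mismatch` and `solve_weights` are assumptions for illustration, not the optimizer used in the original work.

```python
import math
from itertools import combinations


def layout_mismatch(data, layout, w):
    """Sum over pairs of |layout distance - weighted data distance|."""
    total = 0.0
    for i, j in combinations(range(len(data)), 2):
        d_w = math.sqrt(sum(wk * (a - b) ** 2 for wk, a, b in zip(w, data[i], data[j])))
        total += abs(math.dist(layout[i], layout[j]) - d_w)
    return total


def solve_weights(data, layout, steps=100):
    """Grid-search two non-negative feature weights that sum to 1."""
    candidates = [(k / steps, 1.0 - k / steps) for k in range(steps + 1)]
    return min(candidates, key=lambda w: layout_mismatch(data, layout, w))


# Original data d: feature 0 separates points 0 and 1, feature 1 separates points 0 and 2.
d = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
# Adjusted positions r: the user pulled point 1 toward point 0 and pushed point 2 away.
r = [(0.0, 0.0), (2.0, 0.0), (0.0, math.sqrt(12.0))]
w = solve_weights(d, r)
```

In this example the search down-weights feature 0 and up-weights feature 1, because the user's layout says points 0 and 1 are more alike than equal weighting would suggest.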
V2PI: Visual to Parametric Interaction. Spatial interaction is intuitive but ambiguous. Moving a point can mean either that the user moved it toward an unmoved point, or that the user moved it around and it merely happened to end up closer to some unmoved points. A more explicit interaction is needed: use tags to distinguish the two cases.
Interaction Design: For each interaction, data are involved in two ways: explicitly and implicitly. Example: when A is moved to B, A is explicitly involved and B is implicitly involved. An unmoved point is either implicitly involved or not involved at all (ignored by the user). Goal: get the meaning of the unmoved points with minimal additional effort.
Interaction Design: Moved set. Highlighted set (points the user highlights because they find them interesting). Untouched set (ignored by the user).
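The three sets can be sketched as a simple partition of point ids. The function name `partition_points` is hypothetical, and counting a point that is both moved and highlighted as moved is an assumption made for this sketch.

```python
def partition_points(all_ids, moved, highlighted):
    """Split point ids into moved, highlighted, and untouched sets.

    A point that is both moved and highlighted is counted as moved here;
    that tie-break is an assumption made for this sketch.
    """
    moved = set(moved)
    highlighted = set(highlighted) - moved
    untouched = set(all_ids) - moved - highlighted
    return moved, highlighted, untouched


moved, highlighted, untouched = partition_points(range(6), moved=[0, 1], highlighted=[1, 3])
```

Only the moved and highlighted sets carry user intent; the untouched set is treated as ignored when the interaction is interpreted.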
Pair-wise weighting: C_ij shows the user's preference for each pair of data points. The combined value of all C_ij is 1.
Pair-wise weighting: n_h and n_m are the numbers of highlighted and moved objects in the dataset.
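The formula for C_ij is not preserved in the text above, so the following is only a hypothetical scheme that satisfies the two stated properties: the weights combine to 1, and the normalizer depends on n_m and n_h. The function name `pair_weights`, the relative weight `alpha` for moved-highlighted pairs, and the choice to give other pairs no weight are all assumptions.

```python
from itertools import combinations


def pair_weights(moved, highlighted, alpha=0.5):
    """Return {(i, j): C_ij} over moved-moved and moved-highlighted pairs.

    alpha is a hypothetical relative weight for moved-highlighted pairs.
    The moved and highlighted sets are assumed to be disjoint; pairs made
    only of highlighted or untouched points get no weight.
    """
    moved, highlighted = sorted(moved), sorted(highlighted)
    n_m, n_h = len(moved), len(highlighted)
    # Normalizer so that all C_ij together sum to 1.
    total = n_m * (n_m - 1) / 2 + alpha * n_m * n_h
    if total == 0:
        raise ValueError("no weighted pairs: move at least one more point")
    weights = {pair: 1.0 / total for pair in combinations(moved, 2)}
    for i in moved:
        for j in highlighted:
            weights[(i, j)] = alpha / total
    return weights


C = pair_weights(moved=[0, 1, 2], highlighted=[5, 6])
```

With three moved and two highlighted points, there are three moved-moved pairs and six moved-highlighted pairs, and their weights together sum to 1.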
User guided DR: Key notes: We can rely on users' domain knowledge to improve dimension reduction. Data manipulation must be intuitive and efficient.