Mining Turbulence Data. Ivan Marusic, Department of Aerospace Engineering and Mechanics, University of Minnesota. Collaborators: Victoria Interrante, George Karypis, Vipin Kumar, Graham Candler, Ellen Longmire, Sean Garrick. Acknowledgement: National Science Foundation. Mathematical Challenges in Scientific Data Mining, IPAM, January 2002.
Flow direction Solid surface Turbulent Boundary Layer (Flow visualization using Al flakes in water channel)
Outline: Turbulent boundary layers: introduction and background; Need for both simulation and experimental datasets; Visualization and feature extraction; What are the important features? What is to be “data mined”?; Difficulties with present analysis approach; New analysis strategy to investigate causal relationships; Data mining issues and challenges
Flow direction Solid surface Turbulent Boundary Layer Responsible for heat transfer, skin friction (drag), mixing of scalars
Issues in wall turbulence: Described by the Navier-Stokes equations (non-linear PDEs). Direct numerical simulation is restricted to low Re (Reynolds number), where Re = ratio of inertia to viscous forces ( U ). No. of simulation grid points ~ Re^(9/4), cost ~ Re^3. Present simulations: Re = O(10^3); required: Re = O(10^6). Experimental datasets are also needed to investigate high-Re flows. Better understanding of the physics and causal relationships would lead to more accurate modeled simulation tools (CFD) and analytical scaling laws.
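The scaling quoted on this slide can be turned into a quick back-of-envelope estimate. The sketch below assumes only the two power laws stated above (grid points ~ Re^(9/4), cost ~ Re^3) and ignores all prefactors; the function name `dns_scaling` is made up for illustration.

```python
def dns_scaling(re_now, re_target):
    """Return (grid-point factor, cost factor) for raising Re from re_now to re_target.

    Uses only the power laws quoted on the slide; prefactors are ignored.
    """
    ratio = re_target / re_now
    grid_factor = ratio ** (9 / 4)  # number of grid points ~ Re^(9/4)
    cost_factor = ratio ** 3        # computational cost ~ Re^3
    return grid_factor, cost_factor


# Present simulations at Re = O(10^3) versus the required Re = O(10^6)
grid_factor, cost_factor = dns_scaling(1e3, 1e6)
print(f"grid points grow by a factor of about {grid_factor:.2e}")
print(f"cost grows by a factor of about {cost_factor:.2e}")
```

Under these assumptions the jump from Re = 10^3 to Re = 10^6 needs roughly 5.6 million times more grid points and about a billion times the cost, which is why experimental datasets are needed at high Re.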
What features do we extract? Flow field information in (x, y, z, t): velocity u, pressure p, temperature, etc. Good candidate = coherent vortex structures.
Vortex identification using velocity gradient tensor
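The equations on this slide did not survive, so the following is only a sketch of one common way to get the two quantities visualized later (enstrophy and discriminant) from the velocity gradient tensor. It assumes incompressible flow (zero trace), the convention A[i][j] = d u_i / d x_j, and the textbook discriminant D = (27/4) R^2 + Q^3, where Q and R are the second and third invariants; D > 0 indicates a pair of complex eigenvalues (swirling local motion). The function names and the enstrophy normalization (no factor of 1/2) are assumptions, not taken from the talk.

```python
def invariants(A):
    """Second and third invariants (Q, R) of a 3x3 velocity gradient tensor A."""
    tr = A[0][0] + A[1][1] + A[2][2]
    # trace of the matrix product A*A
    tr_sq = sum(A[i][j] * A[j][i] for i in range(3) for j in range(3))
    det = (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
           - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
           + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))
    return 0.5 * (tr * tr - tr_sq), -det


def discriminant(A):
    """D = (27/4) R^2 + Q^3, valid when trace(A) = 0 (incompressible flow)."""
    Q, R = invariants(A)
    return 27.0 / 4.0 * R * R + Q ** 3


def enstrophy(A):
    """Squared magnitude of the vorticity vector (some authors include a 1/2)."""
    wx = A[2][1] - A[1][2]
    wy = A[0][2] - A[2][0]
    wz = A[1][0] - A[0][1]
    return wx * wx + wy * wy + wz * wz


rotation = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # u = -y, v = x
shear = [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]      # u = y
strain = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 0.0]]    # u = x, v = -y

for name, A in [("rotation", rotation), ("shear", shear), ("strain", strain)]:
    print(name, "D =", discriminant(A), "enstrophy =", enstrophy(A))
```

The three test tensors illustrate the design point: pure shear carries vorticity (non-zero enstrophy) but has D = 0, so a discriminant-based criterion does not flag it as a vortex, whereas solid-body rotation gives D > 0.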
Flow topology classification
Isosurfaces of:
Volume rendered visualizations of enstrophy and discriminant at decreasing threshold levels (DNS data, Re = 700)
Discriminant
Cross-section of “blue” vortex
EXPERIMENTAL WIND TUNNEL FACILITY
PIV SETUP Kodak Megaplus Cameras 1024 x 1024 pixels Pulsed Lasers Nd:YAG = 15
In-plane Vorticity
In-plane Swirl
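For planar PIV data only the four in-plane velocity gradients are available. A minimal sketch of the two quantities named on these slides, assuming the in-plane swirl is taken as the imaginary part of the complex eigenvalue of the 2x2 in-plane velocity gradient tensor; the argument names (`dudx`, `dudy`, `dvdx`, `dvdy`) and the choice of (x, y) as the measurement plane are illustrative assumptions.

```python
import math


def in_plane_vorticity(dudx, dudy, dvdx, dvdy):
    """Out-of-plane vorticity component computed from in-plane gradients."""
    return dvdx - dudy


def in_plane_swirl(dudx, dudy, dvdx, dvdy):
    """Imaginary part of the eigenvalues of [[dudx, dudy], [dvdx, dvdy]].

    Returns 0.0 when both eigenvalues are real (no local swirling motion).
    """
    tr = dudx + dvdy
    det = dudx * dvdy - dudy * dvdx
    disc = tr * tr - 4.0 * det  # discriminant of the 2x2 characteristic equation
    return 0.5 * math.sqrt(-disc) if disc < 0 else 0.0


# Solid-body rotation (u = -y, v = x): vorticity and swirl are both non-zero
print(in_plane_vorticity(0.0, -1.0, 1.0, 0.0), in_plane_swirl(0.0, -1.0, 1.0, 0.0))
# Pure shear (u = y, v = 0): vorticity is non-zero but swirl vanishes
print(in_plane_vorticity(0.0, 1.0, 0.0, 0.0), in_plane_swirl(0.0, 1.0, 0.0, 0.0))
```

In practice the gradients would come from finite differences of the PIV velocity field at each grid point; that step is omitted here.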
Difficulties with present analysis approach
Typical Turbulent Boundary Layer Simulation: O(10^8) grid points; generates >10 Terabytes per day (every day); writes to disk only 1 in every 1000 time steps (99.9% discarded); final database ~1 Terabyte; all analysis is done after the final database is obtained.
Present approach
New analysis approach
Some important trigger events associated with drag: “Bursting”; high values of Reynolds shear stress (-uw) (associated with momentum transport)
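One way such a trigger could be detected in a velocity record is sketched below. It assumes u and w are samples of streamwise and wall-normal velocity at one point, takes fluctuations about the sample mean, and flags samples where -u'w' exceeds a user-chosen level; the function name and the threshold value are hypothetical, and the talk does not specify how its triggers are actually defined.

```python
def shear_stress_triggers(u, w, threshold):
    """Return (indices, stress) where stress[i] = -u'[i] * w'[i].

    u, w      : equal-length sequences of streamwise and wall-normal velocity
    threshold : hypothetical trigger level for the instantaneous -u'w'
    """
    n = len(u)
    u_mean = sum(u) / n
    w_mean = sum(w) / n
    stress = [-(ui - u_mean) * (wi - w_mean) for ui, wi in zip(u, w)]
    indices = [i for i, s in enumerate(stress) if s > threshold]
    return indices, stress


# Toy record: sample 2 is slow fluid (u' < 0) moving away from the wall (w' > 0)
u = [10.0, 10.0, 8.0, 10.0, 12.0, 10.0]
w = [0.0, 0.0, 1.0, 0.0, 0.0, -1.0]
indices, stress = shear_stress_triggers(u, w, threshold=1.0)
print(indices)
```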
Example of bursting events
N.B. High –uw region
Panels: swirl (|λci|), Reynolds shear stress, vorticity, wall-normal velocity
Consistent with “packets of vortices” (together with other evidence):
SIMPLE SEARCH ALGORITHM: Dual threshold search routine. Define connected region only if 8 neighboring points. To search for ‘packets of hairpin vortices’, define a region if there is positive vorticity in the bottom and negative vorticity in the top. Additional search for (a) low streamwise velocity (low momentum) and (b) high Reynolds shear stress in the region adjoining the patches of vorticity.
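The slide gives only an outline, so the following is a sketch under stated assumptions rather than the routine used in the talk. It reads "dual threshold" as: grow regions from all points at or above a low threshold, and keep a region only if it contains at least one point at or above a high threshold. It reads "8 neighboring points" as 8-connectivity (horizontal, vertical and diagonal neighbors). The `min_size` parameter is an added, hypothetical option covering the other plausible reading (a minimum number of connected points).

```python
from collections import deque


def dual_threshold_regions(field, low, high, min_size=1):
    """Find 8-connected regions of field >= low that contain a value >= high.

    field : 2D list of numbers (rows x cols), e.g. vorticity on a PIV grid
    Returns a list of regions, each a list of (row, col) grid indices.
    """
    rows, cols = len(field), len(field[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or field[r0][c0] < low:
                continue
            # Breadth-first flood fill from an unvisited point above `low`
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            region, has_peak = [], False
            while queue:
                r, c = queue.popleft()
                region.append((r, c))
                if field[r][c] >= high:
                    has_peak = True
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if ((dr or dc) and 0 <= rr < rows and 0 <= cc < cols
                                and not seen[rr][cc] and field[rr][cc] >= low):
                            seen[rr][cc] = True
                            queue.append((rr, cc))
            if has_peak and len(region) >= min_size:
                regions.append(region)
    return regions


field = [
    [0, 0, 0, 0, 0],
    [0, 2, 0, 0, 1],
    [0, 0, 3, 0, 1],
    [0, 0, 0, 0, 0],
]
regions = dual_threshold_regions(field, low=1, high=3)
print(regions)
```

In the toy field, the diagonal pair (2 and 3) forms one region that passes the high threshold, while the weak pair of 1s on the right is rejected. To look for the positive-below, negative-above vorticity pattern, the same routine could be run once on the vorticity field and once on its negative, and the resulting regions compared by wall-normal position.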
Panels at z+ = 92: VORTICITY, MOMENTUM, SWIRL STRENGTH. All quantities non-dimensionalized using U and
Panels at z+ = 92: VORTICITY, u’w’. All quantities non-dimensionalized using U and
VORTICITY u’w’ MOMENTUM
Adrian, Meinhart & Tomkins (2000)
Modeling Data With Graphs: Beyond Transactions. Graphs are suitable for capturing arbitrary relations between the various objects. Mapping: Data Instance -> Graph Instance; Object -> Vertex; Object’s Attributes -> Vertex Label; Relation Between Two Objects -> Edge; Type Of Relation -> Edge Label. Discovery: Frequent Subgraph Discovery (FSG, Karypis & Kuramochi 2001)
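As an illustration of this mapping, a single flow snapshot could be stored as a small labeled, undirected graph. The class below is a minimal sketch; the class name, the vertex labels ("vortex+", "vortex-", "low_momentum") and the edge labels ("stacked", "adjacent") are all hypothetical and chosen only to echo the features discussed earlier in the talk.

```python
from dataclasses import dataclass, field


@dataclass
class LabeledGraph:
    """One data instance: objects become vertices, relations become edges."""
    vertex_labels: dict = field(default_factory=dict)  # vertex id -> label
    edge_labels: dict = field(default_factory=dict)    # frozenset({u, v}) -> label

    def add_vertex(self, vid, label):
        self.vertex_labels[vid] = label

    def add_edge(self, u, v, label):
        if u == v:
            raise ValueError("self-loops are not allowed")
        if u not in self.vertex_labels or v not in self.vertex_labels:
            raise KeyError("both endpoints must be added as vertices first")
        # A frozenset key makes the edge undirected: {u, v} == {v, u}
        self.edge_labels[frozenset((u, v))] = label


# Hypothetical snapshot: two vortex cores with a low-momentum zone next to one
snapshot = LabeledGraph()
snapshot.add_vertex(0, "vortex+")
snapshot.add_vertex(1, "vortex-")
snapshot.add_vertex(2, "low_momentum")
snapshot.add_edge(0, 1, "stacked")
snapshot.add_edge(1, 2, "adjacent")
print(snapshot)
```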
Interesting Patterns = Frequent Subgraphs. Discovering interesting patterns means finding frequent, recurrent subgraphs. Efficient algorithms must be developed that operate on and take advantage of the new representation.
Finding Frequent Subgraphs: Input and Output. Problem setting: similar to finding frequent itemsets for association rule discovery. Input: a database of graph transactions; each is an undirected simple graph (no loops, no multiple edges) with labeled edges and vertices; transactions may not be connected; a minimum support threshold σ. Output: the frequent subgraphs that satisfy the support threshold; each frequent subgraph is connected.
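To make the input/output contract concrete, the sketch below counts support for the smallest connected subgraphs only, namely single labeled edges. This is not the FSG algorithm itself; it shows only the first level of a level-wise search, where a pattern is frequent if it occurs in at least σ transactions. The transaction format (a dict of vertex labels plus a list of `(u, v, edge_label)` tuples), the function name, and the use of an absolute count for σ are assumptions made for the example.

```python
from collections import Counter


def frequent_edges(transactions, min_support):
    """Return {pattern: support} for single-edge patterns with support >= min_support.

    transactions : list of (vertex_labels, edges), where vertex_labels maps
                   vertex id -> label and edges is a list of (u, v, edge_label)
    pattern      : (vertex label, edge label, vertex label) with the two vertex
                   labels sorted so that the undirected edge has one canonical form
    """
    support = Counter()
    for vertex_labels, edges in transactions:
        patterns = set()  # each pattern counts at most once per transaction
        for u, v, edge_label in edges:
            a, b = sorted((vertex_labels[u], vertex_labels[v]))
            patterns.add((a, edge_label, b))
        support.update(patterns)
    return {p: n for p, n in support.items() if n >= min_support}


t1 = ({0: "A", 1: "B", 2: "C"}, [(0, 1, "x"), (1, 2, "y")])
t2 = ({0: "B", 1: "A"}, [(0, 1, "x")])
t3 = ({0: "A", 1: "B", 2: "B"}, [(0, 1, "x"), (0, 2, "x"), (1, 2, "y")])
result = frequent_edges([t1, t2, t3], min_support=2)
print(result)
```

Larger frequent subgraphs would be built by joining frequent smaller ones and re-counting support, which requires canonical labeling and subgraph isomorphism tests that are beyond this sketch.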
Example
Example of datasets (Database type-B) for investigation using a Frequent Subgraph Discovery scheme: (1) PIV data: in-plane swirl S(x,y) for multiple timesteps (with and without trigger signal); (2) full 3D data from simulation.
Further Challenges Temporally and Spatially evolving structures (objects change) Interactions of vortex structures