Implementing Unified Access to Scientific Data from .NET Platform
Sergey B. Berezin, Dmitriy V. Voitsekhovskiy, Vilen M. Paskonov
Moscow State University, Department of Computational Mathematics and Cybernetics
Supported by Student Laboratory of Microsoft Technologies and RFBR grants
Different languages, common tools
Viscous fluid flow visualization via vector fields and color maps
Seismic data visualization via isosurfaces
Tensor field visualization for diffusion through biological tissue
Scientific data access requirements
We need to:
Retrieve a typed data object regardless of where and how it is stored (physical data independence).
Retrieve partial data when needed (filtering and caching).
Retrieve data descriptions (metadata support).
We don’t want to:
Rewrite existing computational software, so we use existing formats.
Install new system software, so we use existing protocols.
Scientific data access today
What’s so special about scientific data?
Scientific data:
Has a complex structure.
Is parameterized by time, sampling point coordinates, and more complex parameters.
Is stored in many files of various formats.
Has very large individual data items.
Doesn’t fit the relational model well!
What is DataSet?
Example: Accessing data in C#
// Retrieve the DataSet object from a server by GUID
DataSet dataset = DataSet.Open(" "767c57b1-801e-4784-bbd fd0ec2");
// Fetching DataItem by name.
// DataItem may be either simple or composite, it doesn’t matter
DataItem xVelocity = dataset.DataItems["u-values"];
// Creating parameter corresponding to time moment = 0.0
CompositeParameter param = new CompositeParameter(
    new ParameterValue("time", 0.0d)
);
// Fetching DataItemSlice for the parameter.
// It is an instance of DataItem for specified parameter value.
DataItemSlice dataVelocity = xVelocity[param];
// Getting required data: velocity array for time = 0.0
ScalarArray3d data = dataVelocity.GetData() as ScalarArray3d;
DataRequest: communicating with server
The following DataRequest is sent to a server as the result of the previous example:
<dataRequest dataSource="…" dataSet="767c57b1-801e-4784-bbd fd0ec2" … >
<dataSource sourceName="u0000.cdf" sourceType="netCDF" sourceParameters="u" />
The following DataRequest is received from the server:
<dataRequest dataSource="…" dataSet="767c57b1-801e-4784-bbd fd0ec2" … >
Complex structures in DataSet
[Diagram: three files with scalar arrays supply three scalar array data items, which a vector array constructor combines into a vector array data item. A file with a spatial grid supplies a spatial grid data item. A data field constructor combines the vector array data item and the spatial grid data item into a vector field data item.]
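The composition step can be illustrated with a minimal sketch that builds an array of 3D vectors from three scalar arrays. The names `Vec3` and `VectorArrayConstructor.Combine` are hypothetical stand-ins chosen for this illustration; they are not the types used by the system described here, and the real constructor may work differently.

```csharp
using System;

// Hypothetical stand-in for a 3D vector value; not the library's actual type.
public readonly struct Vec3
{
    public readonly double X, Y, Z;
    public Vec3(double x, double y, double z) { X = x; Y = y; Z = z; }
}

// Hypothetical sketch of a "vector array constructor".
public static class VectorArrayConstructor
{
    // Combines three equally shaped scalar arrays (one per component)
    // into a single array of 3D vectors.
    public static Vec3[,,] Combine(double[,,] u, double[,,] v, double[,,] w)
    {
        if (!SameShape(u, v) || !SameShape(u, w))
            throw new ArgumentException("Component arrays must have the same shape.");

        int nx = u.GetLength(0), ny = u.GetLength(1), nz = u.GetLength(2);
        var result = new Vec3[nx, ny, nz];
        for (int i = 0; i < nx; i++)
            for (int j = 0; j < ny; j++)
                for (int k = 0; k < nz; k++)
                    result[i, j, k] = new Vec3(u[i, j, k], v[i, j, k], w[i, j, k]);
        return result;
    }

    private static bool SameShape(double[,,] a, double[,,] b) =>
        a.GetLength(0) == b.GetLength(0) &&
        a.GetLength(1) == b.GetLength(1) &&
        a.GetLength(2) == b.GetLength(2);
}
```

In this sketch the client never sees the three source arrays separately; it receives one composite array, which mirrors the idea that a composite data item is accessed the same way as a simple one.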
Example: Accessing composite data in C#
// Retrieve the DataSet object from a server by GUID
DataSet dataset = DataSet.Open(" "767c57b1-801e-4784-bbd fd0ec2");
// Fetching DataItem by name.
// DataItem may be either simple or composite, it doesn’t matter
DataItem velocity = dataset.DataItems["uvw-values"];
// Creating parameter corresponding to time moment = 0.0
CompositeParameter param = new CompositeParameter(
    new ParameterValue("time", 0.0d)
);
// Fetching DataItemSlice for the parameter.
// It is an instance of DataItem for specified parameter value.
DataItemSlice dataVelocity = velocity[param];
// Getting required data: velocity array for time = 0.0
Vector3dArray3d data = dataVelocity.GetData() as Vector3dArray3d;
DataRequest: composite data items
The following DataRequest is sent to a server as the result of the previous example execution: …
The following DataRequest is received from the server:
Filtering
Filtering allows only the required data to be transferred from server to client.
Filtering may be performed on both the client side and the server side of the system.
Examples of filtering are cropping and thinning of large vector fields.
[Figure labels: vector field sizes 108 MB, 12 MB, and 120 KB; cropping filter [0.4, 0.76] x [0.4, 0.76]; thinning filter (0.1, 0.1).]
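Cropping and thinning can be sketched on a plain 2D array as follows. This is only an illustration under stated assumptions: the class `FieldFilters` and its methods are hypothetical, and the filter parameters are interpreted here as fractions of each axis (crop bounds in [0, 1], thinning fraction in (0, 1]), which the slides do not confirm.

```csharp
using System;

// Hypothetical helper for illustration; not part of the described system.
public static class FieldFilters
{
    // Cropping: keeps the sub-rectangle [x0, x1] x [y0, y1],
    // where the bounds are fractions of the axis lengths (assumption).
    public static T[,] Crop<T>(T[,] field, double x0, double x1, double y0, double y1)
    {
        if (x0 < 0 || x1 > 1 || x0 >= x1 || y0 < 0 || y1 > 1 || y0 >= y1)
            throw new ArgumentOutOfRangeException(nameof(x0), "Bounds must satisfy 0 <= lo < hi <= 1.");

        int nx = field.GetLength(0), ny = field.GetLength(1);
        int i0 = (int)Math.Floor(x0 * nx), i1 = (int)Math.Ceiling(x1 * nx);
        int j0 = (int)Math.Floor(y0 * ny), j1 = (int)Math.Ceiling(y1 * ny);

        var result = new T[i1 - i0, j1 - j0];
        for (int i = i0; i < i1; i++)
            for (int j = j0; j < j1; j++)
                result[i - i0, j - j0] = field[i, j];
        return result;
    }

    // Thinning: keeps every n-th sample along each axis, where n is
    // derived from the requested fraction of samples (assumption).
    public static T[,] Thin<T>(T[,] field, double fractionX, double fractionY)
    {
        if (fractionX <= 0 || fractionX > 1 || fractionY <= 0 || fractionY > 1)
            throw new ArgumentOutOfRangeException(nameof(fractionX), "Fractions must be in (0, 1].");

        int stepX = Math.Max(1, (int)Math.Round(1.0 / fractionX));
        int stepY = Math.Max(1, (int)Math.Round(1.0 / fractionY));
        int nx = (field.GetLength(0) + stepX - 1) / stepX;
        int ny = (field.GetLength(1) + stepY - 1) / stepY;

        var result = new T[nx, ny];
        for (int i = 0; i < nx; i++)
            for (int j = 0; j < ny; j++)
                result[i, j] = field[i * stepX, j * stepY];
        return result;
    }
}
```

Under this interpretation, a thinning fraction of 0.1 along both axes keeps about one sample in a hundred, which is why applying such a filter before transfer reduces the amount of data sent to the client so sharply.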
Example: Filtering data in C#
// Initializing the DataSet object from a server by its GUID
DataSet dataset = DataSet.Open(" "767c57b1-801e-4784-bbd fd0ec2");
// Fetching DataItem by its name. It may be either simple or composite
DataItem velocity = dataset.DataItems["uvw-values"];
// Creating parameter corresponding to time moment = 0.0
CompositeParameter param = new CompositeParameter(
    new ParameterValue("time", 0.0d)
);
// Fetching DataItemSlice for the parameter.
DataItemSlice dataVelocity = velocity[param];
// Creating filter "Thinner" for a type of the velocity data item
// and setting up its parameters
IThinner3dFilter filter =
    FilterFactory.GetFilter("Thinner", dataVelocity.TypeDescriptor) as IThinner3dFilter;
filter.PercentageX = 0.05;
filter.PercentageY = 0.05;
filter.PercentageZ = 0.05;
// Getting required data: thinned out velocity array for time = 0.0
Vector3dArray3d data = dataVelocity.GetData(filter) as Vector3dArray3d;
DataRequest: communicating with server
The following DataRequest is sent to a server as the result of the previous example execution: …
In this case the returned DataRequest is similar to the one returned in the previous example.
Caching
Both the server side and the client side of the system cache the results of a successful DataRequest execution.
The server side caches filtering results.
The client side caches retrieved data items and DataRequest results.
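One possible shape for such a cache is sketched below. The class `DataRequestCache` and the use of the request text as the cache key are assumptions made for this illustration; the slides do not describe how the system identifies or stores cached requests.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache keyed by the request text; not the system's actual class.
public sealed class DataRequestCache<TResult>
{
    private readonly Dictionary<string, TResult> _results = new Dictionary<string, TResult>();
    private readonly Func<string, TResult> _execute;

    // Number of requests that had to be executed because no cached result existed.
    public int Misses { get; private set; }

    public DataRequestCache(Func<string, TResult> execute)
    {
        _execute = execute ?? throw new ArgumentNullException(nameof(execute));
    }

    // Returns the cached result if present; otherwise executes the request
    // and stores the result. If execution throws, nothing is stored, so only
    // successful executions end up in the cache.
    public TResult Get(string requestKey)
    {
        if (_results.TryGetValue(requestKey, out TResult cached))
            return cached;

        TResult result = _execute(requestKey);
        Misses++;
        _results[requestKey] = result;
        return result;
    }
}
```

The same structure could sit on either side: on the client wrapping a network call, or on the server wrapping a filter execution.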
How is a DataRequest performed?
Deployment Scenario
The simplest scenario is as follows:
A more sophisticated scenario includes the development of distributed data sources that provide scientific data.
Dedicated servers will act as data processors, performing data filtering and transformations.
Dedicated servers will act as data registries, allowing DataSet enumeration and querying across the entire global network.
This will make it possible to create dynamic data libraries of research and will enable easy data publishing.
Why .NET?
Object-oriented data access requires an object-oriented platform to be built on.
High extensibility is based on the dynamic nature of the CLR: new data types, new filters, new parsers.
No built-in data types, filters, or parsers.
.NET opens new horizons with LINQ, WPF, …
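As a rough illustration of how runtime type creation supports this kind of extensibility, the sketch below registers filter types by name and instantiates them on demand with `Activator.CreateInstance`. The names `IFilter`, `ThinnerFilter`, and `FilterRegistry` are hypothetical; this is not the `FilterFactory` used in the earlier example, whose implementation the slides do not show.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical filter contract for this sketch.
public interface IFilter
{
    string Name { get; }
}

// Hypothetical filter that could be shipped by a third party.
public sealed class ThinnerFilter : IFilter
{
    public string Name => "Thinner";
    public double Percentage { get; set; } = 1.0;
}

// Hypothetical registry: nothing is built in, every filter type is
// registered at run time and created through the CLR by its Type.
public static class FilterRegistry
{
    private static readonly Dictionary<string, Type> Types = new Dictionary<string, Type>();

    public static void Register<T>(string name) where T : IFilter, new()
    {
        Types[name] = typeof(T);
    }

    public static IFilter Create(string name)
    {
        if (!Types.TryGetValue(name, out Type type))
            throw new KeyNotFoundException("No filter registered under the name '" + name + "'.");
        return (IFilter)Activator.CreateInstance(type);
    }
}
```

A new filter is added by registering its type, for example `FilterRegistry.Register<ThinnerFilter>("Thinner")`, without changing the registry itself. The same pattern would apply to data types and parsers.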
Future work
Transferring files from a remote server is just one example of a DataProvider; extend the architecture for new types of data providers.
LINQ technology will make data access from C# much more elegant.
Development of easy-to-use data management applications for the proposed approach.
Development of an innovative visualization system, highly extensible and customizable, or integration of our approach with an existing one.
Future visualization system
Step 1. Choose object of interest
Step 2. Choose data transform
Step 3. Choose visualization algorithm
[Diagram: example layout with Controls A to D and nested views (View 1, View 1.1, View 1.2).]
Questions? Visit: Mail to: