SimDB and SimTAP Dealing with a complex data model Gerard Lemson, Nara, 2010-12-10.


1 SimDB and SimTAP Dealing with a complex data model Gerard Lemson, Nara, 2010-12-10

2 SimDB and SimDAL
Protocols to support
describing simulations
  – Simulation Data Model (SimDM): model for simulations (N-body, 3+1D, any) http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/specification/uml/SimDB_DM.png
publishing simulations
  – Simulation Database (SimDB): protocol for accessing a database built according to SimDM
finding simulations
  – SimDB/TAP
  – queryData in SimDAL
  – SimTAP
retrieving simulation data, whole, in parts, or manipulated
  – SimDAL getData services (not in this talk)
Btw: "simulation" can mean
  – simulation run
  – simulation result
  – simulation data
  – post-processing of simulation results

3 SimDB/REST
Simple access to SimDB
Uses XML representation of model
  – XML schema http://code.google.com/p/volute/source/browse/#svn/trunk/projects/theory/snapdm/specification/xsd
Examples http://code.google.com/p/volute/source/browse/#svn/trunk/projects/theory/snapdm/specification/examples
  – PDR http://code.google.com/p/volute/source/browse/#svn/trunk/projects/theory/snapdm/specification/examples/external/PDR
  – Gadget2 http://volute.googlecode.com/svn-history/r1382/trunk/projects/theory/snapdm/specification/examples/external/Gadget2/Gadget2.xml
  – TODO more (SVO)
VO-URP
  – validator http://www.g-vo.org/SimDB-browser/Validate.do
  – upload http://www.g-vo.org/SimDB-browser
  – download http://www.g-vo.org/SimDB-browser

4 SimDB/TAP
Model is complex
  – Too(?) complex for a trivial (parameter-based) query language
  – Need special navigation tools (vo-urp@gavo)
  – Need a powerful query language
Implement TAP on a database built according to SimDM
Map UML to RDB model
  – TAP_SCHEMA for SimDM (vo-urp@gavo, old) http://code.google.com/p/volute/source/browse/#svn/trunk/projects/theory/snapdm/specification/tap
  – create table + inserts
  – VODataService
VO-URP SQL query http://www.g-vo.org/SimDB-browser/Query.do
Not always easy!

5 Model complex
Normalised (see image)
General
Abstract
  – e.g. parameters must be fully defined, no assumptions
Hard to deal with quantities with a priori unknown units
  – ParameterSetting table has value AND unit attributes (Quantity datatype)
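The value-plus-unit point can be illustrated with a small sketch using Python's sqlite3 module. It is only an illustration under stated assumptions: the ParameterSetting table is reduced to four columns, the UnitFactor conversion table and the parameter id 7 are hypothetical, and none of this is taken from the actual SimDM TAP_SCHEMA. It shows why a query on a quantity has to look at the unit column as well as the value column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ParameterSetting (
    containerId        INTEGER,  -- experiment the setting belongs to
    parameterId        INTEGER,  -- which input parameter is being set
    numericValue_value REAL,     -- Quantity: numerical value
    numericValue_unit  TEXT      -- Quantity: unit string
);
-- Hypothetical helper table: factor converting each unit to solar masses.
CREATE TABLE UnitFactor (unit TEXT PRIMARY KEY, toMsun REAL);
INSERT INTO UnitFactor VALUES ('M_sun', 1.0), ('1e10 M_sun', 1e10);
-- The same mass can be published in different units.
INSERT INTO ParameterSetting VALUES
    (1, 7, 1e14, 'M_sun'),
    (2, 7, 1e4,  '1e10 M_sun'),
    (3, 7, 5e12, 'M_sun');
""")


def experiments_with_mass(conn, mass_msun, rel_tol=0.1):
    """Return ids of experiments whose parameter 7 lies within rel_tol of mass_msun."""
    rows = conn.execute(
        """
        SELECT ps.containerId
        FROM ParameterSetting ps
        JOIN UnitFactor u ON u.unit = ps.numericValue_unit
        WHERE ps.parameterId = 7
          AND ABS(ps.numericValue_value * u.toMsun - ?) <= ? * ?
        ORDER BY ps.containerId
        """,
        (mass_msun, rel_tol, mass_msun),
    ).fetchall()
    return [row[0] for row in rows]


matching = experiments_with_mass(conn, 1e14)
print(matching)
```

Comparing on numericValue_value alone would miss experiment 2, whose value is 1e4 in units of 1e10 M_sun.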

6 Example queries
Find synthetic spectra of white dwarf stars
Find cosmological simulations with Ω = 0.9, Ω_Λ = 0.7 and Ω_b = 0.02
Find all SPH simulations containing a galaxy cluster with mass around 10^14 M_sun

7
select e.*
from experiment e, targetObject t, result r, product p
where t.label = 'white_dwarf'
  and t.containerid = e.id
  and r.containerid = e.id
  and r.targetId = t.id
  and p.containerid = r.id
  and p.productType = 'spectrum'

8 Example queries
Find synthetic spectra of white dwarf stars
Find (cosmological) simulations with Ω = 0.9, Ω_Λ = 0.7 and Ω_b = 0.02
Find all SPH simulations containing a galaxy cluster with mass around 10^14 M_sun

9
select e.*
from Experiment e,
     InputParameter ip1, ParameterSetting ps1,
     InputParameter ip2, ParameterSetting ps2,
     InputParameter ip3, ParameterSetting ps3
where ps1.containerId = e.id and ps1.parameterId = ip1.id
  and ip1.label = 'omega_lambda' and ps1.numericalValue_value = 0.7
  and ps2.containerId = e.id and ps2.parameterId = ip2.id
  and ip2.label = 'omega_baryon' and ps2.numericalValue_value = 0.02
  and ps3.containerId = e.id and ps3.parameterId = ip3.id
  and ip3.label = 'omega' and ps3.numericalValue_value = 0.9

10 Example queries
Find synthetic spectra of white dwarf stars
Find (cosmological) simulations with Ω = 0.9, Ω_Λ = 0.7 and Ω_b = 0.02
Find all SPH simulations containing a galaxy cluster with mass around 10^14 M_sun

11
select e.*
from Experiment e, ExperimentRepresentationObject ero, RepresentationObjectType rot,
     TargetObject t, Property p, StatisticalSummary s
where ero.containerId = e.id
  and ero.typeId = rot.id
  and rot.label = 'sph.particle'
  and t.containerId = e.id
  and t.label = 'galaxy.cluster'
  and p.containerId = t.id
  and p.label = 'mass'
  and s.propertyId = p.id
  and s.statistic = 'value'
  and s.numericalValue_value = 1e14
  and s.numericalValue_unit = 'M_sun'

12 An example from Paris. Find typical values of the mass, x, y, z properties in a given simulation result
SELECT r.id as id, r.publisherdid as publisherdid,
       s0.numericValue_value as mass,
       s1.numericValue_value as x,
       s2.numericValue_value as y,
       s3.numericValue_value as z
FROM result r, product o,
     statisticalsummary s0, property p0,
     statisticalsummary s1, property p1,
     statisticalsummary s2, property p2,
     statisticalsummary s3, property p3
WHERE r.containerid = 6
  AND o.containerid = r.id
  and s0.containerid = o.id and s1.containerid = o.id
  and s2.containerid = o.id and s3.containerid = o.id
  and p0.publisherdid = 'mass' and s0.propertyid = p0.id and s0.statistic = 'nominal'
  and p1.publisherdid = 'x' and s1.propertyid = p1.id and s1.statistic = 'nominal'
  and p2.publisherdid = 'y' and s2.propertyid = p2.id and s2.statistic = 'nominal'
  and p3.publisherdid = 'z' and s3.propertyid = p3.id and s3.statistic = 'nominal'

13
SELECT r.id as id, r.publisherdid,
       max(case when p.publisherdid = 'mass' and s.statistic = 'nominal' then s.numericValue_value else null end) as mass,
       max(case when p.publisherdid = 'x' and s.statistic = 'nominal' then s.numericValue_value else null end) as x,
       max(case when p.publisherdid = 'y' and s.statistic = 'nominal' then s.numericValue_value else null end) as y,
       max(case when p.publisherdid = 'z' and s.statistic = 'nominal' then s.numericValue_value else null end) as z
FROM result r, product o, statisticalsummary s, property p
WHERE r.containerid = 6
  AND o.containerid = r.id
  and s.containerid = o.id
  and p.id = s.propertyid
group by r.id, r.publisherdid, o.id

14 Conclusions
Some queries can be phrased nicely
Others can be expressed in standard SQL, but the level of normalisation and abstraction means MANY joins are required
Can we simplify this a bit?

15 zoom

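16
ParameterSetting
containerId | value | unit | parameterId
123 | 0.02 |  | 456
123 | 0.7  |  | 457
123 | 0.9  |  | 458
345 | 0.04 |  | 456
345 | 0.7  |  | 457
345 | 1    |  | 458
...

InputParameter
id  | name    | label        | datatype | description
456 | omega_b | omega.baryon | real     | ...
457 | omega_l | omega.lambda | real     | ...
458 | omega   |              | real     | ...

simtap.Experiment (ParameterSetting + InputParameter)
id  | omega_b | omega_l | omega | ...
123 | 0.02    | 0.7     | 0.9   |
345 | 0.04    | 0.7     | 1     |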
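The pivot from the two normalised tables to the wide Experiment table can be sketched with Python's sqlite3 module. This is a minimal sketch under assumptions: the tables are cut down to the columns needed here, the view is called simtap_Experiment because SQLite has no simtap schema unless one is attached, and the max(case ...) construction is the one used in the earlier pivoting query, not necessarily how a real SimTAP service would build its table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE InputParameter (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE ParameterSetting (containerId INTEGER, value REAL, parameterId INTEGER);

INSERT INTO InputParameter VALUES (456, 'omega_b'), (457, 'omega_l'), (458, 'omega');
INSERT INTO ParameterSetting VALUES
    (123, 0.02, 456), (123, 0.7, 457), (123, 0.9, 458),
    (345, 0.04, 456), (345, 0.7, 457), (345, 1.0, 458);

-- Pivot: one row per experiment, one column per input parameter.
CREATE VIEW simtap_Experiment AS
SELECT ps.containerId AS id,
       MAX(CASE WHEN ip.name = 'omega_b' THEN ps.value END) AS omega_b,
       MAX(CASE WHEN ip.name = 'omega_l' THEN ps.value END) AS omega_l,
       MAX(CASE WHEN ip.name = 'omega'   THEN ps.value END) AS omega
FROM ParameterSetting ps
JOIN InputParameter ip ON ip.id = ps.parameterId
GROUP BY ps.containerId;
""")

rows = conn.execute(
    "SELECT id, omega_b, omega_l, omega FROM simtap_Experiment ORDER BY id"
).fetchall()
for row in rows:
    print(row)
```

Each CASE expression yields the value for one parameter and NULL for the others, so MAX over the group picks out the single non-NULL value per experiment.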

17 SimTAP
When the Protocol is fixed, the TAP schema can be simplified
  – parameters become columns in the simtap.Experiment table
  – property characterisations become columns in product-specific characterisation table(s)
  – ...

18 Instead of this
select e.*
from Experiment e,
     InputParameter ip1, ParameterSetting ps1,
     InputParameter ip2, ParameterSetting ps2,
     InputParameter ip3, ParameterSetting ps3
where ps1.containerId = e.id and ps1.parameterId = ip1.id
  and ip1.label = 'omega_lambda' and ps1.numericalValue_value = 0.7
  and ps2.containerId = e.id and ps2.parameterId = ip2.id
  and ip2.label = 'omega_baryon' and ps2.numericalValue_value = 0.02
  and ps3.containerId = e.id and ps3.parameterId = ip3.id
  and ip3.label = 'omega' and ps3.numericalValue_value = 0.9

19 this
select e.*
from simtap.Experiment e
where e.omegaLambda = 0.7
  and e.omegaBaryon = 0.02
  and e.omega = 0.9

20 Table definitions can be derived
From a Protocol definition
  – input parameters
  – for each Representation object type, a table with statistical summaries of properties
  – target object type à la SimDM (units in ADQL required); pivoted per project?
  – input data sets (URLs)
Pivoting queries can be generated
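How a table definition and its pivoting query might be generated from a Protocol's input parameters can be sketched in Python with sqlite3. Everything here is an assumption made for illustration: the Protocol is reduced to a list of (name, SQL type) pairs, the helper names experiment_ddl and pivot_insert are hypothetical, and the normalised tables are simplified to the columns the pivot needs.

```python
import sqlite3

# Hypothetical Protocol definition: (parameter name, SQL type) pairs.
PROTOCOL_PARAMETERS = [("omega_b", "REAL"), ("omega_l", "REAL"), ("omega", "REAL")]


def check_names(parameters):
    """Reject names that are not plain identifiers, since they are spliced into SQL."""
    for name, _ in parameters:
        if not name.isidentifier():
            raise ValueError(f"not a valid column name: {name!r}")


def experiment_ddl(table, parameters):
    """Build a CREATE TABLE statement with one column per input parameter."""
    check_names(parameters)
    columns = ", ".join(f"{name} {sqltype}" for name, sqltype in parameters)
    return f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, {columns})"


def pivot_insert(table, parameters):
    """Build the INSERT ... SELECT that pivots the normalised tables into `table`."""
    check_names(parameters)
    names = ", ".join(name for name, _ in parameters)
    cases = ", ".join(
        f"MAX(CASE WHEN ip.name = '{name}' THEN ps.value END)"
        for name, _ in parameters
    )
    return (
        f"INSERT INTO {table} (id, {names}) "
        f"SELECT ps.containerId, {cases} "
        "FROM ParameterSetting ps JOIN InputParameter ip ON ip.id = ps.parameterId "
        "GROUP BY ps.containerId"
    )


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE InputParameter (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE ParameterSetting (containerId INTEGER, value REAL, parameterId INTEGER);
INSERT INTO InputParameter VALUES (456, 'omega_b'), (457, 'omega_l'), (458, 'omega');
INSERT INTO ParameterSetting VALUES
    (123, 0.02, 456), (123, 0.7, 457), (123, 0.9, 458),
    (345, 0.04, 456), (345, 0.7, 457), (345, 1.0, 458);
""")
conn.execute(experiment_ddl("Experiment", PROTOCOL_PARAMETERS))
conn.execute(pivot_insert("Experiment", PROTOCOL_PARAMETERS))

# The simplified query: plain column predicates, no joins.
rows = conn.execute(
    "SELECT id FROM Experiment WHERE omega_l = 0.7 AND omega_b = 0.02 AND omega = 0.9"
).fetchall()
print(rows)
```

Because the column list and the CASE expressions come from the same parameter list, adding a parameter to the Protocol changes both the table definition and the pivoting query together.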

21 Proposal
SimDAL services MAY include a SimTAP service
1 SimTAP schema per Protocol
Each such schema contains
  – 1 Experiment table with columns for parameters
  – >=1 Product tables with characterisation of properties
  – possibly other tables from SimDB/TAP

