A Proposed Standard for Numerical Metadata
Victor Eijkhout and Erika Fuentes, ICL, University of Tennessee
SuperComputing 2003
2003/11 Eijkhout / Metadata / SC2003

Introduction
Many numerical routines have parameters with settings that depend on the application context of the routine. Computing the parameter settings is now part of the numerical software, or is done by human intervention. We argue that this should be done by a separate software analysis component, and automatically. This, however, requires a higher-level description of the application data. We formalize this by introducing our Numerical Metadata. Having analysis modules, and a formal, almost-semantic description of numerical data, makes Component-based Programming Frameworks possible. We also show the feasibility of the automatic analysis approach.

Traditional flow of control
- Physics application produces data
- Numerical application analyzes the data to find relevant characteristics, then uses those characteristics to decide on an algorithm and set its parameters
=> only a "data" interface needed

Improved scenario
- Physics: as before
- Analysis module: finds characteristics
- Numerical: algorithm choice and setting of parameters
=> an interface is also needed for characteristics => metadata
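The separation described in the improved scenario can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `analyze` and `choose_method`, the metadata keys, and the decision rule are hypothetical and are not taken from the authors' library. The rule relies on the standard fact that a symmetric, strictly diagonally dominant matrix with positive diagonal is positive definite.

```python
from typing import Dict, List

Matrix = List[List[float]]


def analyze(matrix: Matrix) -> Dict[str, object]:
    """Analysis module (hypothetical): derive characteristics from raw data."""
    n = len(matrix)
    nnz = sum(1 for row in matrix for v in row if v != 0.0)
    symmetric = all(
        matrix[i][j] == matrix[j][i] for i in range(n) for j in range(i)
    )
    # Strict diagonal dominance with a positive diagonal; together with
    # symmetry this is a sufficient condition for positive definiteness.
    dominant = all(
        matrix[i][i] > sum(abs(v) for j, v in enumerate(matrix[i]) if j != i)
        for i in range(n)
    )
    return {"nrows": n, "nnz": nnz, "symmetric": symmetric,
            "diag_dominant": dominant}


def choose_method(metadata: Dict[str, object]) -> str:
    """Numerical component (hypothetical): sees only metadata, not the matrix."""
    if metadata["symmetric"] and metadata["diag_dominant"]:
        return "cg"     # matrix is symmetric positive definite
    return "gmres"      # general-purpose fallback


# "Physics" stage: a small tridiagonal matrix standing in for application data.
A = [[3.0, -1.0, 0.0], [-1.0, 3.0, -1.0], [0.0, -1.0, 3.0]]
meta = analyze(A)
method = choose_method(meta)
```

The point of the sketch is the interface: `choose_method` never touches the matrix, so the analysis module can be replaced or extended without changing the numerical component.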

Usage scenario 1
Example: GMRES restart length as a function of indefiniteness

Usage scenario 2
Example: estimate fill-in, use an iterative method if the data wouldn't fit
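One way to make this scenario concrete is a bandwidth-based bound, sketched below. It assumes LU factorization without pivoting, in which case fill-in stays inside the band and the factors hold at most n * (lower + upper + 1) entries. The function names, the 8-byte entry size, and the memory budget are assumptions for illustration; the slide does not say which fill-in estimate the authors use.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def bandwidths(matrix: Matrix) -> Tuple[int, int]:
    """Return (lower, upper) bandwidth of a dense-stored square matrix."""
    lower = upper = 0
    for i, row in enumerate(matrix):
        for j, v in enumerate(row):
            if v != 0.0:
                lower = max(lower, i - j)
                upper = max(upper, j - i)
    return lower, upper


def banded_lu_bytes(n: int, lower: int, upper: int,
                    bytes_per_entry: int = 8) -> int:
    """Upper bound on factor storage for banded LU without pivoting."""
    return n * (lower + upper + 1) * bytes_per_entry


def pick_solver(n: int, lower: int, upper: int, memory_budget_bytes: int) -> str:
    """Use a direct solver only if the estimated factors fit in memory."""
    if banded_lu_bytes(n, lower, upper) <= memory_budget_bytes:
        return "direct"
    return "iterative"


# Small tridiagonal example.
T = [[2.0, -1.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0],
     [0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, -1.0, 2.0]]
low, up = bandwidths(T)
small_choice = pick_solver(len(T), low, up, memory_budget_bytes=1024)
# A large hypothetical problem: one million rows, bandwidth 1000 each side,
# checked against a 1 GiB budget.
large_choice = pick_solver(1_000_000, 1000, 1000, memory_budget_bytes=2**30)
```

The bound is pessimistic for matrices that are sparse inside the band, so a real analysis module would likely use a sharper symbolic estimate.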

Usage scenario 3

Numerical experimentation
- Numerical experimentation is held back by a lack of available characteristics
- Separately available analysis modules should remedy that
Many relevant matrix quantities are hard to compute and hard to implement: the enclosing ellipse of the spectrum, departure from normality, etc. The availability of independent analysis modules should encourage further experimentation on the part of numerical analysts.

Component-based Programming Frameworks
- Applications: large, complex scientific applications (Composite Applications) that couple a variety of single-focus scientific algorithms (Element Applications) along with other software support (e.g. visualization)
- Using behavioural metadata to assist in integrating single-focus algorithms into complex applications
- Metadata as the semantic part of the interface specification of numerical components
(with Thomas Eidson)

Practical access to metadata
- Store in XML format; use Schema for validation; XSL for display
- API for conversion between XML and internal data structure
- API for retrieval / insertion of metadata
We need two-fold access to the metadata: inside a code and in a more permanent form. The API converts between the two forms.
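The two-fold access described above can be sketched as a round trip between an in-memory structure and XML text. The sketch uses Python's standard `xml.etree.ElementTree`, not the authors' libxml-based C library, and the element and attribute names (`metadata`, `category`, `element`, `name`) are assumptions for illustration rather than the proposed schema. Schema validation is left out because the Python standard library does not provide it.

```python
import xml.etree.ElementTree as ET
from typing import Dict

Metadata = Dict[str, Dict[str, str]]


def metadata_to_xml(metadata: Metadata) -> str:
    """Internal data structure -> permanent XML form (hypothetical layout)."""
    root = ET.Element("metadata")
    for category, elements in metadata.items():
        cat = ET.SubElement(root, "category", {"name": category})
        for key, value in elements.items():
            item = ET.SubElement(cat, "element", {"name": key})
            item.text = value
    return ET.tostring(root, encoding="unicode")


def xml_to_metadata(text: str) -> Metadata:
    """Permanent XML form -> internal data structure."""
    root = ET.fromstring(text)
    return {
        cat.get("name"): {
            el.get("name"): el.text or "" for el in cat.findall("element")
        }
        for cat in root.findall("category")
    }


# Insertion and retrieval on the internal form, then a round trip through XML.
meta: Metadata = {"simple": {"nrows": "100"}}
meta["simple"]["nnz"] = "298"                     # insertion
meta.setdefault("structural", {})["symmetric"] = "true"
xml_text = metadata_to_xml(meta)
restored = xml_to_metadata(xml_text)
nnz = restored["simple"]["nnz"]                   # retrieval
```

Values are kept as strings here to keep the round trip lossless; a typed representation would need type information in the XML, which a schema could supply.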

API: creation routines

API: access routines

API: conversion routines

Proposed metadata category 1

Proposed metadata category 2

Proposed metadata category 3

Proposed metadata category 4

Further categories
- Custom categories
- Application properties: discretisation, mesh
Even though we propose a core set of categories, our storage format, and the libraries implementing it, are general and open-ended. Thus we hope that people will propose categories that are inspired by other views of the same kind of data, or by different problem areas altogether. In particular, categories that describe the application-derived properties of numerical data would be very useful in the analysis modules we proposed.

Matrix metadata, issues
- Duplication of elements (e.g., simple->nnz == matrix_market->nnz)
- Relations between elements (e.g., if M-matrix then definite)
- Inheritance / derivation (e.g., dummy rows from boundary conditions, fictitious domain)
It is clear that certain pieces of information will appear in more than one category, especially once third parties start proposing their own categories. We want to introduce mechanisms for resolving or enforcing such implied relations. Also, if one matrix is derived from another, there should be a linkage mechanism so that categories of metadata can be inherited where this is mathematically justified.
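A mechanism for resolving or enforcing such relations could look roughly like the sketch below. The two rules come from the examples on this slide (nnz duplicated across the simple and matrix_market categories; M-matrix implies definite). Everything else is an assumption for illustration: the rule tables, the function names, and the placement of `m_matrix` and `definite` in hypothetical `structural` and `spectral` categories.

```python
from typing import Dict, List, Tuple

Metadata = Dict[str, Dict[str, object]]
Path = Tuple[str, str]  # (category, element)

# Elements that must agree whenever both are present.
DUPLICATES: List[Tuple[Path, Path]] = [
    (("simple", "nnz"), ("matrix_market", "nnz")),
]
# If the first element is True, the second must be True as well.
IMPLICATIONS: List[Tuple[Path, Path]] = [
    (("structural", "m_matrix"), ("spectral", "definite")),
]


def lookup(meta: Metadata, path: Path):
    category, element = path
    return meta.get(category, {}).get(element)


def check(meta: Metadata) -> List[str]:
    """Report violated relations; missing elements are not violations."""
    problems = []
    for a, b in DUPLICATES:
        va, vb = lookup(meta, a), lookup(meta, b)
        if va is not None and vb is not None and va != vb:
            problems.append(f"{a[0]}->{a[1]}={va} but {b[0]}->{b[1]}={vb}")
    for premise, conclusion in IMPLICATIONS:
        if lookup(meta, premise) is True and lookup(meta, conclusion) is False:
            problems.append(f"{premise[1]} holds but {conclusion[1]} is False")
    return problems


def enforce(meta: Metadata) -> Metadata:
    """Fill in conclusions that are implied but not yet recorded."""
    for premise, conclusion in IMPLICATIONS:
        if lookup(meta, premise) is True and lookup(meta, conclusion) is None:
            meta.setdefault(conclusion[0], {})[conclusion[1]] = True
    return meta


meta: Metadata = {
    "simple": {"nnz": 10},
    "matrix_market": {"nnz": 12},      # deliberately inconsistent
    "structural": {"m_matrix": True},
}
problems = check(meta)
enforce(meta)
```

Keeping the rules in data tables rather than code is one way to let third parties register relations along with their own categories.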

Matrix metadata, more issues
- Extensions beyond matrices and linear systems
- Language interoperability
The current proposal was clearly inspired by linear system solving, and the proposed categories are applicable to matrices, mostly in that context. However, the storage format is general enough to cover other numerical application areas and other types of data. The library we have written uses and targets C. This obviously needs to be extended to Fortran and Java. We will use Babel for this.

Proof of concept
- Predicting partitioning/distribution of linear solve
- Analysis modules for structural, scalar, spectral categories of metadata

Proof of concept
- Heuristic: choice of permutation & partitioning before preconditioning
- Statistical analysis (parametric model, Bayesian decision rule)
- Analysis modules for features: bandwidth, sparsity, field-of-values
We ran exhaustive tests of a number of iterative methods on a collection of matrices. The results are used in a parametric model to classify the matrices. Dividing the test collection into a training set and a test set allows us to assess the predictive value of the model. Three different methods can be predicted with accuracies of 30%, 90%, and 30%. The average gain is approximately a factor of 5 (correct prediction over worst case). The misprediction penalty is only 60%, which still leaves a factor of 2 gain over the worst method.
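The general shape of such a classifier can be sketched as follows. The slide does not specify the parametric model, so this sketch assumes independent Gaussian class-conditional densities per feature, with the Bayes decision rule choosing the class of largest posterior. The feature vectors and the labels `method_a` and `method_b` are made-up toy values, not data or results from the experiments described here.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, best method)


def fit(samples: List[Sample]) -> Dict[str, dict]:
    """Fit one Gaussian per feature and class, plus class priors."""
    by_class = defaultdict(list)
    for features, label in samples:
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        columns = list(zip(*rows))
        means = [sum(c) / len(c) for c in columns]
        # A small floor on the variance avoids division by zero.
        variances = [
            max(sum((x - m) ** 2 for x in c) / len(c), 1e-6)
            for c, m in zip(columns, means)
        ]
        model[label] = {"prior": len(rows) / len(samples),
                        "mean": means, "var": variances}
    return model


def log_posterior(params: dict, features: Sequence[float]) -> float:
    """Unnormalized log posterior: log prior plus Gaussian log likelihoods."""
    score = math.log(params["prior"])
    for x, m, v in zip(features, params["mean"], params["var"]):
        score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
    return score


def predict(model: Dict[str, dict], features: Sequence[float]) -> str:
    """Bayes decision rule: choose the class with the largest posterior."""
    return max(model, key=lambda label: log_posterior(model[label], features))


def accuracy(model: Dict[str, dict], samples: List[Sample]) -> float:
    hits = sum(predict(model, f) == label for f, label in samples)
    return hits / len(samples)


# Toy features (relative bandwidth, density); values are illustrative only.
train = [((0.05, 0.010), "method_a"), ((0.07, 0.020), "method_a"),
         ((0.06, 0.015), "method_a"), ((0.60, 0.30), "method_b"),
         ((0.55, 0.25), "method_b"), ((0.65, 0.35), "method_b")]
held_out = [((0.08, 0.02), "method_a"), ((0.58, 0.28), "method_b")]
model = fit(train)
score = accuracy(model, held_out)
```

Splitting the samples into `train` and `held_out` mirrors the training/test division described above: the model is fitted on one part and its predictive value is measured on the other.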

Software
- Metadata library based on libxml
- Library, XML schema, XSL style sheet
- Currently only C support
See