SIMO Python/XML Simulator Current situation 28/10/2005 SIMO Seminar 28.10.2005 Antti Mäkinen Dept. of Forest Resource Management / University of Helsinki.

What can be calculated at the moment? Development of different variables at stand level...

What can be calculated at the moment? Diameter distributions and tree level attributes

Just for comparison with J simulator...

What can be calculated at the moment?
Estimating forest variable development at both stand level and tree level is currently possible (300+ models implemented), but:
- Forestry operations are not yet implemented in the simulator → "real world" simulations are not yet possible
- Bucking models are not ready yet
- The optimizing module is still missing

How does the simulation process work in SIMO?
Components of the simulation process: XML Files, SIMULATOR, MODEL LIBRARY, Reporter Module
- SIMULATOR: IN: data, simulation control, model chains, model definitions; OUT: results
- MODEL LIBRARY: IN: model name, input variables; OUT: model result, warnings & errors
- Reporter Module: IN: XML data; OUT: transformed XML, graphs

What is missing?
- Existing components: XML Files, SIMULATOR, MODEL LIBRARY, Reporter Module
- Missing components: Optimizer Module, Validator Module

XML Files
- Data XML
- Simulation control XML
- Model Chain XML
- Model XML
- Result XML

Model Library
- Includes all models used in the simulator
- Programmed in C as a dynamic link library (DLL)
- Models are C functions that are called from the simulator (model definitions are also in the Model XML)
- Users can add new models to the library or create additional model libraries
- Reports warnings and errors to the simulator
- Risk level models are not yet implemented
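As a rough illustration of how a C model function could be called from Python, the sketch below uses the standard ctypes module. The C signature, the status codes and the name call_model are assumptions made for this example, not SIMO's actual interface. A real model library would be loaded from the DLL (for example with ctypes.CDLL); here a Python callback stands in for the compiled function so that the sketch runs without the library.

```python
import ctypes

# Hypothetical C signature (not taken from the slides):
#   int model(const double *inputs, int n, double *result);
MODEL_FUNC = ctypes.CFUNCTYPE(
    ctypes.c_int,
    ctypes.POINTER(ctypes.c_double),
    ctypes.c_int,
    ctypes.POINTER(ctypes.c_double),
)

OK, WARNING = 0, 1  # invented status codes


def _stand_in_model(inputs, n, result):
    """Stand-in for a compiled model: writes the mean of the inputs."""
    result[0] = sum(inputs[i] for i in range(n)) / n
    return WARNING if result[0] < 0 else OK


# Wrap the Python function so it is called through the C calling convention.
mean_model = MODEL_FUNC(_stand_in_model)


def call_model(func, values):
    """Pass Python floats to a C-style model and return (status, result)."""
    c_inputs = (ctypes.c_double * len(values))(*values)
    c_result = ctypes.c_double()
    status = func(c_inputs, len(values), ctypes.byref(c_result))
    return status, c_result.value


status, value = call_model(mean_model, [10.0, 20.0, 30.0])
```

The status code returned next to the result mirrors the idea that the model library reports warnings and errors back to the simulator.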

SIMULATOR
- The first version of the simulator was programmed in C/C++
- Later the programming language was changed to Python, because of:
  - Simple and concise syntax → more readable code and faster development of the simulator
  - Good compatibility with the C language
  - A number of useful ready-made open source tools for a variety of purposes
- Code documentation is underway

SIMULATOR
- Takes in simulation control instructions, model chains, model definitions and data in XML format
- Processes the user-defined model chains for each computing unit in the data
- Calls the model library whenever a value needs to be calculated (Python/C interface: ctypes)
- Prints the resulting values into a result XML file
- Transforms the XML data from the different files into the simulator's own data structure (more efficient than the ElementTree data structure)
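The processing loop described above can be sketched roughly as follows. The XML element and attribute names, the two model names and their formulas are all invented for illustration, since the slides do not show the actual SIMO schemas; the Python functions in MODELS stand in for calls to the model library.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layouts; the real SIMO schemas are not shown in the slides.
DATA_XML = """
<data>
  <unit id="stand1"><var name="age">40</var><var name="ba">18.5</var></unit>
  <unit id="stand2"><var name="age">75</var><var name="ba">26.0</var></unit>
</data>
"""

CHAIN_XML = """
<modelchain>
  <model name="grow_ba" result="ba"/>
  <model name="add_year" result="age"/>
</modelchain>
"""

# Stand-ins for model library calls (invented formulas, illustration only).
MODELS = {
    "grow_ba": lambda v: v["ba"] * 1.02,
    "add_year": lambda v: v["age"] + 1,
}


def load_units(xml_text):
    """Convert the data XML into plain dicts keyed by computing unit id."""
    root = ET.fromstring(xml_text)
    return {
        unit.get("id"): {
            var.get("name"): float(var.text) for var in unit.findall("var")
        }
        for unit in root.findall("unit")
    }


def load_chain(xml_text):
    """Read the model chain as an ordered list of (model name, target variable)."""
    root = ET.fromstring(xml_text)
    return [(m.get("name"), m.get("result")) for m in root.findall("model")]


def run_chain(units, chain):
    """Apply every model in the chain, in order, to every computing unit."""
    for variables in units.values():
        for model_name, target in chain:
            variables[target] = MODELS[model_name](variables)
    return units


units = run_chain(load_units(DATA_XML), load_chain(CHAIN_XML))
```

Converting the parsed XML into plain dictionaries before the loop reflects the idea of using the simulator's own data structure rather than working on the ElementTree objects directly.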

Reporting Module
- Used for visualizing data and transforming the results from XML format to other formats
- Takes in data and processing instructions in XML format
- At the moment it can plot different kinds of graphs of given variables (using the matplotlib toolset)
- XML transformations to be implemented later...

Missing modules
- Optimizer module
  - Finds the best alternative from the alternatives generated by the simulator
  - Possibly many alternative optimizing methods?
- Validator module
  - Validates the XML files with XSD (Schema) files and by external rules
  - Makes sure that the XML files are well-formed and contain all necessary elements
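A minimal sketch of the second validator task (well-formedness plus required elements) is given below, using only the standard library. The element names and the REQUIRED_CHILDREN rule table are hypothetical. Note that xml.etree.ElementTree does not perform XSD validation, so schema validation would need a separate tool and is not covered here.

```python
import xml.etree.ElementTree as ET

# Hypothetical "external rules": each element tag maps to the child
# elements it must contain.
REQUIRED_CHILDREN = {"data": ["unit"], "unit": ["var"]}


def validate(xml_text):
    """Return a list of problems; an empty list means all checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # The parser rejects documents that are not well-formed.
        return [f"not well-formed: {exc}"]

    problems = []
    for element in root.iter():
        for child in REQUIRED_CHILDREN.get(element.tag, []):
            if element.find(child) is None:
                problems.append(
                    f"<{element.tag}> is missing required <{child}>"
                )
    return problems
```

For example, `validate("<data><unit></unit></data>")` reports that `<unit>` lacks a `<var>` child, while an unclosed document is reported as not well-formed.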

Strengths of SIMO XML Simulator
- Virtually any kind of model can be used in the simulations and added to the model library
- The user can define the model chains freely for different kinds of simulations
- The user can define correction/rectification factors for the models (e.g. different factors for different geographical areas)
- Extensive warning and error reporting system (risk control coming later...)
- Data levels are not confined to a strict predefined standard

Model risk management: individual variables
- Minimum and maximum limits have been defined for individual variables
- The limits are documented in the Model XML
- The limits have been coded into the model library → it issues warnings if individual parameter values are out of bounds
How are the minimum and maximum limits defined?
- Limits defined by the model's author (caused by data, model shape, …)
- Limits of the modeling data
- The model is tested with those limits using NFI data as test data: does the model function properly if individual parameter values are out of bounds?
For example: basal area growth model (Vuokila & Väliaho) for Scots pine on mineral soils
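The bounds check for individual variables can be illustrated with a short sketch. The variable names and limit values below are invented; in SIMO the limits come from the model definitions and the check itself is coded in the C model library, not in Python.

```python
# Invented limits for illustration: variable name -> (minimum, maximum).
LIMITS = {"age": (5.0, 200.0), "ba": (1.0, 60.0)}


def check_limits(inputs, limits=LIMITS):
    """Return one warning message per input value that is out of bounds."""
    warnings = []
    for name, value in inputs.items():
        if name not in limits:
            continue  # no limits defined for this variable
        low, high = limits[name]
        if not low <= value <= high:
            warnings.append(f"{name}={value} outside [{low}, {high}]")
    return warnings
```

Returning the warnings instead of raising an exception matches the idea that the model is still evaluated and the simulator decides what to do with the warning.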

Model risk management: interaction
Interaction between variables (figure axes: age, ba)
Accepted combinations of variables:
- (120, 5) not accepted
- (20, 32) not accepted
Solution alternatives:
- Logit model: probability that the estimate is in the acceptable area (linear regression, at least, was not flexible enough)
- Grid: the area of variable combinations is divided into cells; every cell records whether the estimate is acceptable or not
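The grid alternative might look like the sketch below. The cell sizes and the set of accepted cells are made up, chosen only so that the two rejected combinations from the slide, (120, 5) and (20, 32), fall outside the accepted area.

```python
# Hypothetical cell sizes for the two variables.
AGE_STEP = 20.0  # years per cell
BA_STEP = 5.0    # basal area units per cell

# Invented set of accepted (age cell, ba cell) index pairs.
ACCEPTED_CELLS = {
    (1, 1), (1, 2), (1, 3),
    (2, 2), (2, 3), (2, 4),
    (3, 3), (3, 4), (3, 5),
    (4, 4), (4, 5),
    (5, 4), (5, 5),
    (6, 5), (6, 6),
}


def cell_of(age, ba):
    """Map a combination of variables to its grid cell index."""
    return int(age // AGE_STEP), int(ba // BA_STEP)


def is_acceptable(age, ba):
    """Look up whether the cell containing (age, ba) is marked acceptable."""
    return cell_of(age, ba) in ACCEPTED_CELLS
```

A lookup is a single set membership test, so the grid is cheap at run time; the cost lies in deciding beforehand which cells to accept.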

Model risk management
Two levels:
1. Individual parameter values out of bounds
2. All individual parameter values acceptable, but is the specific combination of them acceptable?
Case 1: already in the simulator
Case 2: suggestion
1. Get the k nearest neighbours from the VMI (Finnish National Forest Inventory) data
2. Evaluate the model for the data point and the k nearest neighbours
3. If the difference in the model estimate between the data point and the neighbours is too big, generate an event of "unacceptable" model estimate
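The suggestion for case 2 could be sketched as follows. Several details are assumptions, because the slide leaves them open: neighbours are found with unscaled Euclidean distance, the data point is compared to the mean estimate of its neighbours, and "too big" is taken to mean a relative difference above max_diff. The reference plots and the model formula are invented.

```python
import math


def k_nearest(point, reference, k):
    """Return the k reference points closest to point (Euclidean distance)."""
    return sorted(reference, key=lambda p: math.dist(p, point))[:k]


def is_estimate_acceptable(model, point, reference, k=3, max_diff=0.2):
    """Compare the model estimate at point with the mean estimate of its
    k nearest reference points; False means 'unacceptable' estimate."""
    neighbours = k_nearest(point, reference, k)
    neighbour_mean = sum(model(*p) for p in neighbours) / len(neighbours)
    estimate = model(*point)
    if neighbour_mean == 0:
        return estimate == 0
    return abs(estimate - neighbour_mean) / abs(neighbour_mean) <= max_diff


# Invented reference plots (age, ba) and an invented model for illustration.
plots = [(40, 18), (45, 20), (50, 22), (60, 25)]


def toy_model(age, ba):
    return 0.01 * age + 0.1 * ba


inside = is_estimate_acceptable(toy_model, (42, 19), plots)
outside = is_estimate_acceptable(toy_model, (120, 5), plots)
```

In practice the variables would probably need scaling before distances are computed, since age and basal area are measured in different units.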

Isn't that procedure too heavy computationally?
Probably; this has not been evaluated yet.
But what if we store the risk evaluation results and use those primarily:
1. Is it safe to call ModelA with parameters (5, 6, 10) when we accept risk level X?
2. Has the risk been evaluated with parameter values (5, 6, 10) and risk level X before? If yes, get the answer from a table of risk evaluations
3. If not, get the k nearest neighbours for the data point (5, 6, 10) and evaluate the model with (5, 6, 10) and the k neighbours
4. Store the risk evaluation result and the mean model result of the k neighbours for the data point (5, 6, 10) and risk level X
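The lookup-then-evaluate procedure above amounts to caching risk evaluations. A minimal sketch, assuming an in-memory dictionary as the "table of risk evaluations" and a caller-supplied evaluation function (both hypothetical), might look like this:

```python
class RiskCache:
    """Table of risk evaluations keyed by (model, parameters, risk level)."""

    def __init__(self, evaluate):
        # evaluate(model_name, params, risk_level) is the expensive step,
        # e.g. a k-nearest-neighbour comparison. It is assumed to return
        # (is_safe, mean_model_result_of_neighbours).
        self._evaluate = evaluate
        self._table = {}
        self.evaluations = 0  # how many times the expensive step ran

    def lookup(self, model_name, params, risk_level):
        key = (model_name, tuple(params), risk_level)
        if key not in self._table:
            # Not evaluated before: run the expensive step and store it.
            self.evaluations += 1
            self._table[key] = self._evaluate(model_name, params, risk_level)
        return self._table[key]


def fake_evaluate(model_name, params, risk_level):
    """Invented stand-in for the real evaluation, for illustration only."""
    mean_result = sum(params) / len(params)
    return sum(params) <= risk_level, mean_result


cache = RiskCache(fake_evaluate)
first = cache.lookup("ModelA", (5, 6, 10), 25)
second = cache.lookup("ModelA", (5, 6, 10), 25)  # answered from the table
```

Exact-match keys only help when the same parameter values recur; rounding or binning the parameters before the lookup would be one way to raise the hit rate, at the cost of precision.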

Open questions:
When evaluating a model result, should we compare it to:
- values derived directly from the nearest VMI permanent sample plots, OR
- model estimates for the nearest VMI sample plots?

Software license for SIMO
Types of open source licenses:
- MIT & Co: “Do whatever you want”
- LGPL: “Everything you do to the original code must be open source, anything on top of that can be closed”
- GPL & Co: “Everything you do is open source, …well almost”
GPL under the hood: “derivative work” or “mere aggregation”?
- Derivative work must be open source, but aggregation can be closed source

The case of MySQL
- Double licensing: open source under the GPL, or commercial development under a commercial license that allows closed source

General software architecture
Individual components that communicate over the network:
- Validator
- Simulator (well underway)
- Optimiser
- Reporter: turns simulation results into figures and into data formats other than XML, or into a different XML format, etc.
Implications for licensing?
- What if one of the components uses a subcomponent that is published under the GPL?

Architecture continued
- TCP/IP-based communication
- Security issues?
  - secured traffic (SSL, SSH)
  - inside firewall
- Scalable