From the NEAR EARTH SPACE / SPACE WEATHER Window the BIG DATA ERA Yurdanur Tulunay, METU Dept. of Aerospace Engineering, Ankara Ersin Tulunay, METU Dept.


From the NEAR EARTH SPACE / SPACE WEATHER Window the BIG DATA ERA

Yurdanur Tulunay, METU Dept. of Aerospace Engineering, Ankara
Ersin Tulunay, METU Dept. of Electrical Engineering, Ankara

ABSTRACT

Near Earth Space processes are mostly nonlinear and time varying. Data driven models have therefore proven attractive for modelling such systems, employed in parallel with mathematical models. At present, one of the urgent issues is the development of new signal processing techniques to extract manageable, representative data from the relevant “big data” for use in the “training”, “testing” and “validation” phases of modelling. In this poster our intention is to stress that it is time to get ready!

1. INTRODUCTION

Near Earth Space processes are mostly nonlinear and time varying. Data driven models, an approach that scientists call “evidence-based decision making”, have therefore proven attractive for modelling such systems, employed in parallel with mathematical models based on first physical principles.

We have been working with data driven models since around 2000. Even then, which may be considered relatively recent, it was difficult to access independent, representative data for the “training”, “testing” and “validation” phases. As we understand it, the term “metrics” covers the well-defined criteria used to compare the performance of various data driven models of the same process.

The existence and availability of reliable data sources are vital to both scientific and technological development. In recent years, mainly owing to developments in digital electronics and space technologies, huge amounts of data have been obtained from space-based and Earth-bound measurement and monitoring campaigns. Thus “big data” has become an important issue.

Although “big data size” is a constantly moving target, it is time to develop a set of new techniques and technologies, with new models and integration, to extract representative characteristics of data sets that are diverse, complicated and of a big scale.
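One simple way to picture the extraction of a manageable subset from a very large data stream is reservoir sampling, which keeps a fixed-size uniform random sample without storing the stream. The sketch below is only an illustration under stated assumptions: the function name `reservoir_sample`, the synthetic stream and the sample size are hypothetical, and uniform random sampling is one possible notion of “representative”, not a technique named in the poster.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: int = 0) -> List[T]:
    """Keep a uniform random sample of k items from a stream of unknown length.

    Hypothetical helper: only k items are ever held in memory.
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a stored item with probability k / (i + 1).
            j = rng.randint(0, i)  # both bounds inclusive
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Synthetic stand-in for a long measurement stream (assumption).
    measurements = (0.1 * n for n in range(100_000))
    subset = reservoir_sample(measurements, k=1000)
    print(len(subset))
```

For time series with rare events, a uniform sample may under-represent exactly the intervals of interest, so stratified or event-aware sampling may be preferable in practice.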

The basic characteristics of big data are “volume” (the amount of data), “velocity” (the speed of data in and out) and “variety” (the range of data types and sources). At present, from the model making point of view, one of the urgent issues is the development of new signal processing techniques to extract manageable, representative data from the relevant big data, which ranged, as of 2012, from a few dozen terabytes to many petabytes. For example, one will need to improve “pattern recognition” techniques and identify the “outliers”, depending on the objective of the action. In this poster our intention is to stress that it is time to get ready!

2. METADATA AND BIG DATA

First, we note what we understand by Metadata (MD) and Big Data (BD). Simply put, MD is data about data, whereas BD comprises the details of the items described in the MD.

Big Data can be described by the 5Vs: Volume, Variety, Velocity, Variability and Veracity.

Cyber-physical systems may be described by the 6Cs:
Connection (sensor and networks)
Cloud (computing and data on demand)
Cyber (model and memory)
Content/context (meaning and correlation)
Community (sharing and collaboration)
Customization (personalization and value)
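To make the distinction between MD and BD concrete, the following sketch treats a metadata record as a small description of a data set that can be searched without touching the data themselves. Everything in it is an assumption made for illustration: the class `DatasetMetadata`, its fields, the `select` helper and the catalogue entries are hypothetical and do not describe any actual archive.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DatasetMetadata:
    """Hypothetical metadata record: describes a data set without holding it."""
    name: str
    parameter: str          # measured quantity (illustrative label)
    start_year: int
    end_year: int
    cadence_seconds: int    # sampling interval
    size_gigabytes: float


def select(catalogue: List[DatasetMetadata], parameter: str,
           year: int, max_gigabytes: float) -> List[DatasetMetadata]:
    """Return the records that match a query, using metadata only."""
    return [
        m for m in catalogue
        if m.parameter == parameter
        and m.start_year <= year <= m.end_year
        and m.size_gigabytes <= max_gigabytes
    ]


if __name__ == "__main__":
    # Invented catalogue entries, for illustration only.
    catalogue = [
        DatasetMetadata("station_a", "critical_frequency", 1995, 2010, 900, 2.5),
        DatasetMetadata("station_b", "critical_frequency", 2005, 2015, 60, 480.0),
        DatasetMetadata("satellite_c", "particle_flux", 2000, 2015, 1, 12000.0),
    ]
    for record in select(catalogue, "critical_frequency", 2008, 100.0):
        print(record.name)
```

A taxonomy of this kind lets a modeller decide which part of the big data is worth retrieving before any large transfer takes place.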

3. SPACE WEATHER

Based on our experience with the COST, SPECIAL, IHY, FP6, FP7 and some other EU/EC bound Actions between 1990 and 2015, we have observed that data management is important, particularly in contemporary subjects such as Near Earth Space (NES) and Space Weather (SpW). The Sun is the main driver of SpW. There are various types of solar variability, with time scales ranging from, say, milliseconds to hundreds of years. Taxonomy is therefore extremely important in data management in terms of MD and BD.

Quoting from Siscoe (2000): “To be able to talk about the network of space-vulnerable, technological entities upon which humankind is becoming increasingly dependent, I suggest referring to it as the cyber-electrosphere. The cyber-electrosphere is defined as the global totality of all space-vulnerable, electrically enabled technological systems. If we could see the network comprised of this totality by itself, see the satellite links, the cable links, the navigation and positioning links, the electric power grids, and the radio links, the image would resemble a picture of Gray’s Anatomy showing the central nervous system or the circulation system. Here the sensory, information and energy transfer network does not belong to the human body but to an abstract, interconnected global entity: the cyber-electrosphere.”

4. DATA DRIVEN MODELS

The “Middle East Technical University (METU) Data Driven Models” have proven to be powerful in forecasting the parameters of nonlinear processes, including ionospheric and magnetospheric processes. These models consist of Neural Network (NN), Neuro Fuzzy Network (NFN), Genetic Programming (GP) and Cascade Models.

Decisions based on analyses of big retrospective NES/SpW data, models and algorithms can forecast future developments only if the future is similar to the past. If the system dynamics change in the future, i.e. the process is not stationary, the past will not be able to model the future.

In a changing environment, forecasting requires a thorough understanding of the system dynamics, which in turn requires advanced theory. However, developing such a sophisticated theory is prohibitively difficult. In such cases, data driven techniques such as NNs and NFNs are proposed for process modelling as a parallel means to the analytical approach.
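The remark that a data driven model forecasts well only while the future resembles the past can be illustrated with a deliberately minimal stand-in. The sketch below is not one of the NN, NFN or GP models mentioned above: it fits a one-parameter linear model x[t+1] = a * x[t] by least squares, which is an assumption chosen for brevity, and the function names and numerical values are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def simulate(a: float, x0: float, n: int) -> List[float]:
    """Generate n samples of the process x[t+1] = a * x[t]."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(a * xs[-1])
    return xs


def fit_ar1(series: Sequence[float]) -> float:
    """Least-squares estimate of a from consecutive pairs of samples."""
    pairs = list(zip(series[:-1], series[1:]))
    return sum(p * q for p, q in pairs) / sum(p * p for p, _ in pairs)


def one_step_rmse(a_hat: float, series: Sequence[float]) -> float:
    """Root-mean-square error of one-step-ahead forecasts a_hat * x[t]."""
    pairs = list(zip(series[:-1], series[1:]))
    return sqrt(sum((q - a_hat * p) ** 2 for p, q in pairs) / len(pairs))


if __name__ == "__main__":
    past = simulate(a=0.9, x0=100.0, n=50)          # training data
    a_hat = fit_ar1(past)

    future_same = simulate(a=0.9, x0=past[-1], n=20)     # dynamics unchanged
    future_changed = simulate(a=0.5, x0=past[-1], n=20)  # dynamics changed

    print("error, stationary future:    ", one_step_rmse(a_hat, future_same))
    print("error, non-stationary future:", one_step_rmse(a_hat, future_changed))
```

The model learned from the past is essentially exact while the dynamics stay the same and degrades as soon as the governing parameter changes, which is the non-stationarity problem described above in its simplest form.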

The existence and availability of reliable data sources are vital to both scientific and technological development. In recent years, mainly owing to developments in electronics and space technologies, huge amounts of data have been obtained from space-based and Earth-bound measurement and monitoring campaigns. In the past, scarcity of data was one of the main handicaps to scientific and technological development. Today, however, one of the main problems is that the available data are not fully used, because of difficulties in organizing the collection and cataloguing of MD and in making them available via a BD environment.

5. TRAINING, TESTING, VALIDATION AND COMPARISON OF MODELS

An important aspect of data driven modelling is the availability of independent data that represent the process. BD opens an era which helps in finding useful data. However, the question “will large-scale search data help us to create better tools?” is still pending. On the other hand, although BD contains huge amounts of data, we may still come across the following situations: (i) no data, (ii) only output data and (iii) both input and output data. Special statistical procedures must be developed to cope with such situations.

Big data bring new possibilities to training, testing and validation. BD also makes it possible to discover the fine structures of the processes so that they can be included in the model. Discovering the fine structures is also very important for validation and for creating metrics for the comparison of models. However, the massive sample size and high dimensionality of BD make it necessary to develop new computational and statistical paradigms. In developing and using computational techniques in data driven modelling, storage, speed and cost issues must also be taken into account.
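The roles of independent training, validation and test data and of a common metric for comparing models can be sketched as follows. This is a minimal illustration under stated assumptions: the chronological 60/20/20 split, the root-mean-square error as the metric, and the two baseline models (`persistence` and `running_mean`) are hypothetical choices, not the procedures used by the authors.

```python
from math import sqrt
from typing import Callable, List, Sequence, Tuple

Model = Callable[[Sequence[float]], float]  # history -> next-value forecast


def chronological_split(series: Sequence[float], train_frac: float = 0.6,
                        val_frac: float = 0.2) -> Tuple[list, list, list]:
    """Split a time series in time order so the three parts stay independent."""
    n = len(series)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    return (list(series[:n_train]),
            list(series[n_train:n_train + n_val]),
            list(series[n_train + n_val:]))


def rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Root-mean-square error, used here as the common comparison metric."""
    return sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))


def persistence(history: Sequence[float]) -> float:
    """Baseline model: the next value equals the last observed value."""
    return history[-1]


def running_mean(history: Sequence[float]) -> float:
    """Baseline model: the next value equals the mean of all past values."""
    return sum(history) / len(history)


def evaluate(model: Model, history: Sequence[float], target: Sequence[float]) -> float:
    """One-step-ahead forecasts over target, extending the history as data arrive."""
    hist: List[float] = list(history)
    preds: List[float] = []
    for obs in target:
        preds.append(model(hist))
        hist.append(obs)
    return rmse(preds, target)


if __name__ == "__main__":
    series = [float(t) for t in range(100)]  # synthetic trend (assumption)
    train, val, test = chronological_split(series)

    # Choose the model on validation data, then report it once on test data.
    candidates = {"persistence": persistence, "running_mean": running_mean}
    best = min(candidates, key=lambda name: evaluate(candidates[name], train, val))
    print(best, evaluate(candidates[best], train + val, test))
```

Keeping the test portion untouched until the final report is what makes the comparison between models meaningful; with big data, the same logic applies to the representative subsets drawn for each phase.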

6. CONCLUSIONS

At present, from the training, testing and validation viewpoints in model making, and in developing metrics for model comparison, one of the urgent issues is the development of new statistical processing techniques to extract manageable, representative data from the relevant BD. Even though BD provides an opportunity to observe the process from a wider point of view, extracting meaningful results by using classical signal processing methods is not possible. New, suitable data processing approaches and methods are therefore needed, in particular for contemporary data driven approaches.

Summarising:
Sampling methods are needed which suit the nature of the huge amounts of data collected, so that representative sparse data are obtained from the dense data.
The resulting small packages are then analysed via mini-batch processing.
One will need, for example, to improve “pattern recognition” techniques and identify the “outliers”, depending on the objective of the action.
Where recording and offline processing are not suitable, online processing techniques are needed to analyse the continuously flowing data.
Metrics and validation methods need to change accordingly.

It is time to get ready!

7. ACKNOWLEDGEMENTS

The valuable discussions on the concept of Big Data with Dr. Selda Konukcu and Dr. Berk Ozer are acknowledged. Last but not least, thanks are due to METU Research Asst. Oktay Sipahigil and Dr. Mehmet Karaca for organizing the material in the form of an e-poster.

8. REFERENCES

[1] Hilbert, M., & López, P. (2011). The World’s Technological Capacity to Store, Communicate, and Compute Information. Science, 332(6025), 60-65. http://www.martinhilbert.net/WorldInfoCapacity.html
[2] Siscoe, G. (2000). The space-weather enterprise: Past, present, and future. Journal of Atmospheric and Solar-Terrestrial Physics, 62(14), 1223-1232.
[3] Boyd, D., & Crawford, K. (2012). Critical Questions for Big Data. Information, Communication and Society, 15(5).
[4] Kleijnen, J.P.C. (1999). Validation of Models: Statistical Techniques and Data Availability. Simulation Conference Proceedings, 5-8 Dec. 1999.
[5] Fan, J., Han, F., & Liu, H. (2013). Challenges of Big Data Analysis. National Science Review, (2), 293-314.