From the NEAR EARTH SPACE / SPACE WEATHER Window the BIG DATA ERA Yurdanur Tulunay, METU Dept. of Aerospace Engineering, Ankara Ersin Tulunay, METU Dept.


From the NEAR EARTH SPACE / SPACE WEATHER Window the BIG DATA ERA
Yurdanur Tulunay, METU Dept. of Aerospace Engineering, Ankara
Ersin Tulunay, METU Dept. of Electrical Engineering, Ankara

ABSTRACT
Near Earth Space processes are mostly nonlinear and time varying. Therefore, data driven models have proven to be attractive for modelling such systems, to be employed in parallel with the mathematical models. At present, one of the urgent issues is the development of new signal processing techniques to extract manageable, representative data out of the relevant “big data” for use in the “training”, “testing” and “validation” phases of modelling. In this poster our intention is to stress that it is time to get ready!

1. INTRODUCTION
Near Earth Space processes are mostly nonlinear and time varying. Therefore, data driven models, which scientists refer to as “evidence-based decision making”, have proven to be attractive for modelling such systems, to be employed in parallel with the mathematical models based on first physical principles. We have been dealing with data driven models since around [...]. Even in those times, which may be considered relatively recent, it was difficult to access independent, representative data to be employed in the “training”, “testing” and “validation” phases. As we understand it, the term “metrics” covers the well-established criteria used to compare the performance of various data driven models of the same process.

The existence and availability of reliable data sources are vital in both scientific and technological developments. In recent years, mainly due to developments in digital electronics and space technologies, huge amounts of data have been obtained as a result of space-based and Earth-bound measurement and monitoring campaigns. Thus the term “big data” has become an important issue. Although “big data size” is a constantly moving target, it is time to develop a set of new techniques and technologies, with new models and forms of integration, to extract representative characteristics of data sets that are diverse, complex and of a large scale.

The basic characteristics of big data involve “volume” (the amount of data), “velocity” (the speed of data in and out) and “variety” (the range of data types and sources). At present, from the model making point of view, one of the urgent issues is the development of new signal processing techniques to extract manageable, representative data out of the relevant big data, ranging from a few dozen terabytes to, for example, many petabytes (as of 2012). For example, one will need to improve “pattern recognition” techniques and identify the “outliers”, depending on the objective of the action. In this poster our intention is to stress that it is time to get ready!

2. METADATA AND BIG DATA
First, we would like to note what we understand by Meta Data (MD) and Big Data (BD). Simply put, MD is data about data, whereas BD is the detail contained in the items described by the MD. Big Data can be described by the 5 Vs: Volume, Variety, Velocity, Variability and Veracity. Cyber-physical systems may be described by the 6 Cs:
Connection (sensors and networks)
Cloud (computing and data on demand)
Cyber (model and memory)
Content/context (meaning and correlation)
Community (sharing and collaboration)
Customization (personalization and value)
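One standard way to extract a manageable, representative subset from a data stream that is too large to store, as called for above, is reservoir sampling. The following is a minimal sketch, not a method taken from the poster: the function names and the synthetic hourly stream are hypothetical, and it assumes that a uniform random sample is an adequate notion of "representative" for the task at hand.

```python
import random
from typing import Iterable, Iterator, List


def reservoir_sample(stream: Iterable[float], k: int, seed: int = 0) -> List[float]:
    """Draw k items uniformly at random from a stream of unknown length.

    Only k items are held in memory at any time (classic "Algorithm R").
    """
    rng = random.Random(seed)
    reservoir: List[float] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i survives with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


def synthetic_measurements(n: int) -> Iterator[float]:
    """Hypothetical stand-in for a large measurement stream (hour of day)."""
    for t in range(n):
        yield float(t % 24)


# Reduce 100,000 streamed values to a 500-value sample.
sample = reservoir_sample(synthetic_measurements(100_000), k=500)
```

A uniform sample preserves the overall distribution but can miss rare events, so a task that targets "outliers" would likely need a stratified or event-triggered variant instead.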

3. SPACE WEATHER
Based on our experience with the COST, SPECIAL, IHY, FP6, FP7 and some other EU/EC-bound Actions between [...] and [...], we have observed that data management is important, in particular in contemporary subjects such as Near Earth Space (NES) and Space Weather (SpW). The Sun is the main driver of SpW. There are various types of solar variability, with time scales ranging from, say, milliseconds to hundreds of years. Therefore, taxonomy is extremely important in data management in terms of MD and BD.

Quoting from Siscoe (2000): “To be able to talk about the network of space-vulnerable technological entities upon which humankind is becoming increasingly dependent, I suggest referring to it as the cyber-electrosphere. The cyber-electrosphere is defined as the global totality of all space-vulnerable, electrically enabled technological systems. If we could see the network comprised of this totality by itself, see the satellite links, the cable links, the navigation and positioning links, the electric power grids, and the radio links, the image would resemble a picture in Gray’s Anatomy showing the central nervous system or the circulation system. Here the sensory, information and energy transfer network does not belong to the human body but to an abstract but interconnected global entity, the cyber-electrosphere.”

4. DATA DRIVEN MODELS
The “Middle East Technical University (METU) Data Driven Models” have proven to be powerful in forecasting the parameters of nonlinear processes, including ionospheric and magnetospheric processes. These models consist of Neural Network (NN), Neuro Fuzzy Network (NFN), Genetic Programming (GP) and Cascade Models. Decisions based on analyses of big retrospective NES/SpW data, models and algorithms can forecast or predict future developments if the future is similar to the past. If the system dynamics change in the future, i.e. the process is not stationary, the past will not be able to model the future.

In the case of a changing environment, making forecasts or predictions requires a thorough understanding of the system dynamics, which in turn requires advanced theory. However, developing such a sophisticated theory is prohibitively difficult. In such cases, data driven techniques such as NNs and NFNs are proposed for process modelling, as a parallel means to the analytical approach.
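The point that a model trained on the past forecasts well only while the process stays stationary can be illustrated with a small numerical sketch. The code below is an assumption-laden toy, not the METU NN or NFN models: it uses a hypothetical first-order autoregressive process x[t+1] = a * x[t] + noise as a stand-in, fits the coefficient by least squares on "past" data, and compares one-step forecast errors on a future with unchanged dynamics and on a future whose dynamics have changed.

```python
import random
from typing import List


def simulate_ar1(a: float, n: int, rng: random.Random, x0: float = 0.0) -> List[float]:
    """Generate n samples of x[t+1] = a * x[t] + unit-variance Gaussian noise."""
    x = [x0]
    for _ in range(n - 1):
        x.append(a * x[-1] + rng.gauss(0.0, 1.0))
    return x


def fit_ar1(x: List[float]) -> float:
    """Least-squares estimate of the coefficient a from one series."""
    num = sum(x[t] * x[t + 1] for t in range(len(x) - 1))
    den = sum(x[t] ** 2 for t in range(len(x) - 1))
    return num / den


def one_step_mse(a_hat: float, x: List[float]) -> float:
    """Mean squared error of the one-step-ahead forecast a_hat * x[t]."""
    errors = [(x[t + 1] - a_hat * x[t]) ** 2 for t in range(len(x) - 1)]
    return sum(errors) / len(errors)


rng = random.Random(42)
past = simulate_ar1(0.9, 2000, rng)        # training data
a_hat = fit_ar1(past)                      # data driven model of the past

# Future with the same dynamics versus a future where the dynamics changed.
future_same = simulate_ar1(0.9, 2000, rng, x0=past[-1])
future_changed = simulate_ar1(-0.5, 2000, rng, x0=past[-1])

mse_same = one_step_mse(a_hat, future_same)
mse_changed = one_step_mse(a_hat, future_changed)
print(f"a_hat={a_hat:.3f}  mse_same={mse_same:.3f}  mse_changed={mse_changed:.3f}")
```

Under these assumptions the forecast error stays near the noise level while the dynamics are unchanged and grows once they change, which is the situation in which retraining or additional physical insight becomes necessary.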

The existence and availability of reliable data sources are vital in both scientific and technological developments. In recent years, mainly due to developments in electronics and space technologies, huge amounts of data have been obtained as a result of space-based and Earth-bound measurement and monitoring campaigns. In the past, scarcity of data was one of the main handicaps in scientific and technological development. Today, however, one of the main problems is that available data are not fully used, owing to difficulties in organizing the collection and cataloguing of MD and in making them available via a BD environment.

5. TRAINING, TESTING, VALIDATION AND COMPARISON OF MODELS
An important aspect of data driven modelling is the availability of independent data that represent the process. BD opens an era which helps in finding useful data. However, the question “will large-scale search data help us to create better tools?” is still pending. On the other hand, although BD contains huge amounts of data, we may still come across the following situations: (i) no data, (ii) only output data and (iii) both input and output data. Special statistical procedures must be developed to cope with such situations.

Big data bring new possibilities to training, testing and validation. BD also makes it possible to discover the fine structures of the processes so that they can be included in the model. Discovering the fine structures is also very important for validation and for creating metrics for the comparison of models. However, the massive sample size and high dimensionality of BD make it necessary to develop new computational and statistical paradigms. In developing and using computational techniques in data driven modelling, storage, speed and cost issues must also be taken into account.
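For time-ordered measurements, one simple way to keep the training, validation and testing data apart is to split the record chronologically rather than at random, so that the later phases only see samples recorded after the training period. The sketch below assumes this convention; the function name, the 60/20/20 proportions and the synthetic hourly series are hypothetical and are not taken from the poster.

```python
from typing import List, Sequence, Tuple


def chronological_split(
    samples: Sequence[float], train_frac: float = 0.6, val_frac: float = 0.2
) -> Tuple[List[float], List[float], List[float]]:
    """Split a time-ordered record into train, validation and test segments.

    The order of the samples is preserved, so each segment lies strictly
    later in time than the one before it.
    """
    if not (0 < train_frac < 1 and 0 < val_frac < 1 and train_frac + val_frac < 1):
        raise ValueError("fractions must be positive and sum to less than 1")
    n = len(samples)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = list(samples[:n_train])
    val = list(samples[n_train:n_train + n_val])
    test = list(samples[n_train + n_val:])  # whatever remains
    return train, val, test


# Hypothetical hourly record, already sorted by time.
hourly_values = [float(h) for h in range(100)]
train, val, test = chronological_split(hourly_values)
print(len(train), len(val), len(test))
```

Adjacent samples of a slowly varying process are strongly correlated, so a purely random split would tend to leak information from the test period into training and overstate the model's performance.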

6. CONCLUSIONS
At present, from the training, testing and validation viewpoints in model making, and in developing metrics for model comparison, one of the urgent issues is the development of new statistical processing techniques to extract manageable, representative data out of the relevant BD. Even though BD provides an opportunity to observe the process from a wider point of view, extracting meaningful results by using classical signal processing methods is not possible. Therefore, new and suitable data processing approaches and methods are needed, in particular for contemporary data driven approaches.

Summarising: sampling methods are needed that suit the nature of the huge amounts of data collected, so that representative sparse data are obtained from the dense data. The resulting small packages are then analysed via mini-batch processing. For example, one will need to improve “pattern recognition” techniques and identify the “outliers”, depending on the objective of the action. In cases where recording and offline processing are not suitable, online processing techniques are needed to analyse the continuously flowing data. Accordingly, metrics and validation methods also need to change. It is time to get ready!

7. ACKNOWLEDGEMENTS
The valuable mutual discussions on the concept of Big Data with Dr. Selda Konukcu and Dr. Berk Ozer are acknowledged. Last but not least, acknowledgements are due to METU Research Asst. Oktay Sipahigil and Dr. Mehmet Karaca for organizing the material in the form of an e-poster.

8. REFERENCES
[1] Hilbert, M., & López, P. (2011). The World’s Technological Capacity to Store, Communicate, and Compute Information. Science, 332(6025), nfoCapacity.html
[2] Siscoe, G. (2000). The space-weather enterprise: Past, present, and future. Journal of Atmospheric and Solar-Terrestrial Physics, 62(14):
[3] Boyd, D., & Crawford, K. (2012). Critical Questions for Big Data. Information, Communication and Society, Volume 15, 2012, Issue 5.
[4] Kleijnen, J.P.C. (1999). Validation of Models: Statistical Techniques and Data Availability. Simulation Conference Proceedings. 5-8 Dec
[5] Fan, J., Han, F., Liu, H. (2013). Challenges of Big Data Analysis. National Science Review. (2):

