Weather Prediction: Expert System Approaches
Bulent KISKAC, Harun YARDIMCI
Outline
- Weather Prediction Introduction
- Case-Based Weather Prediction
- Neural-Net Weather Prediction
- Hybrid Weather Prediction
11/22/2018
Weather Prediction: Introduction
Would you like to know the weather in advance? Automated prediction with distilled information allows non-meteorologist customers to reduce the threat of weather, make decisions confidently, and plan activities.
Weather Prediction: Motivation
Operational decisions in many organizations are strongly affected by meteorological phenomena. Day-to-day business processes require detailed real-time weather information and/or short-term predictions formatted to suit the user’s needs. This information has to be reliable, easily understood, and thoroughly customized.
Weather Prediction: Applications
Automated weather prediction systems:
- Pinpoint areas where weather hazards affect an organization’s operations.
- Detect, predict, and forecast weather phenomena and hazards using state-of-the-science technologies developed by and licensed from leading weather research institutions and companies.
- Provide a complete solution for generating predictions and warnings, made possible by a suite of detection and prediction algorithms that process data from multiple weather data streams.
- Use a combination of image processing, expert systems, fuzzy logic, and sophisticated statistical techniques.
- Assimilate meteorological data sets in real time, run a series of algorithms, and store the resulting analyses and data in a database for display and warning.
- Provide highly accurate prediction capability for all weather phenomena (80-95%).
Weather Predictions
Identification and prediction of the following extreme weather phenomena:
- Storms
- Hail swath
- Lightning
- Precipitation
- Tornado
- Flash flood
- Damaging wind
Seamless Suite of Forecasts
[Figure: forecast products (warnings and alert coordination, watches, forecasts, threat assessments, guidance, outlook) plotted against forecast lead time (minutes, hours, days, 1 week, 2 weeks, months, seasons, years) and forecast uncertainty, with the labels "initial conditions" and "boundary conditions". Benefit areas: protection of life and property, transportation, space operation, recreation, ecosystem, state/local planning, environment, flood mitigation and navigation, agriculture, reservoir control, energy, commerce, hydropower, fire weather, health.]
Forecast Types
- TAF (Terminal Aerodrome Forecast): hourly, 100-foot ceiling, 400-metre ground visibility
- Public forecasts (e.g., "Variable cloudiness this morning")
- Marine forecasts (e.g., "Fog patches forming this afternoon")
TAF accuracy expectations:
- Forecasts of low cloud ceiling height are expected to be accurate to within 100 feet.
- Forecasts of horizontal visibility on the ground, when there is a dense obstruction to visibility such as fog or snow, are expected to be accurate to within 400 metres.
- Forecasts of the time of change from one flying category to another are expected to be accurate to within one hour.
Application Areas: Example
- An examination of the causes and effects of flight delays at the three main airports serving New York City concluded that correctly forecasting the timing of a ceiling and visibility event (i.e., a significant change) could be expected to save approximately $480,000 per event at LaGuardia Airport (Allan et al. 2001).
- Based on a related study, the U.S. National Weather Service estimated that a 30-minute lead time for identifying cloud ceiling or visibility events could reduce the number of weather-related delays by 20 to 35 percent, saving between $500 million and $875 million annually (Valdez 2000).
Application Areas: Example (continued)
- When ceiling and visibility at a busy airport are low, the rate of planes landing is reduced in order to maximize safety.
- When ceiling and visibility at a destination airport are forecast to be low at a flight's scheduled arrival time, its departure may be delayed in order to minimize traffic congestion and related costs.
Case-Based Reasoning
- Meteorological view: CBR = analog forecasting.
- AI view: CBR = retrieval + analogy + adaptation + learning.
- CBR is a way to avoid the “knowledge acquisition problem.”
- CBR is very effective in situations “where the acquisition of the case-base and the determination of the features is straightforward compared with the task of developing the reasoning mechanism.”
- CBR and analog forecasting are recommended when models are inadequate, e.g., for ceiling and visibility, which are strongly determined by local effects below the scale of current computer models.
Classic CBR Flowchart
[Figure: classic CBR flowchart, annotated "difficult problem" and "potential endless loop".]
CBR needs methods for acquiring domain knowledge for retrieval and adaptation.
k-Nearest Neighbor(s) Technique
- “For a particular point in question, in a population of points, the k nearest points.”
- The closer the neighbors, the more useful they are for prediction.
- “It is reasonable to assume that observations which are close together (according to some appropriate metric) will have the same classification. It may be reasonable to weight the evidence of a neighbor close to an unclassified observation more heavily than the weight of another neighbor which is at a greater distance from the unclassified observation.”
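The distance-weighted idea in the quotation can be sketched in Python as follows. This is only an illustration under stated assumptions: the inverse-distance weighting, the function name weighted_knn, and the toy observations are hypothetical and are not taken from the presentation.

```python
import math
from collections import defaultdict

def weighted_knn(query, points, k=3):
    """Classify `query` from its k nearest labelled points,
    giving closer neighbours a larger vote."""
    # points: list of (feature_vector, label) pairs
    ranked = sorted(points, key=lambda p: math.dist(query, p[0]))[:k]
    votes = defaultdict(float)
    for features, label in ranked:
        d = math.dist(query, features)
        votes[label] += 1.0 / (d + 1e-9)  # inverse-distance weight (assumed scheme)
    return max(votes, key=votes.get)

# Hypothetical toy data: (temperature in C, dew point in C) -> observed condition
observations = [
    ((10.0, 9.5), "fog"),
    ((11.0, 10.5), "fog"),
    ((18.0, 8.0), "clear"),
    ((20.0, 7.0), "clear"),
    ((19.0, 9.0), "clear"),
]

print(weighted_knn((10.5, 10.0), observations, k=3))
```

With k=3 the query near the two "fog" points picks up one distant "clear" neighbour as well, but its vote is small because of the distance weighting.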
Fuzzy k-Nearest Neighbor(s) Technique
- The basic measurement technique is fuzzy.
- Avoids unrealistic absolute classification.
- “Increase the interpretability of results of retrieval because the overall degree of membership of a case in a class that provides a level of assurance to accompany the classification.”
Weather Prediction
The prediction system consists of three parts:
- Data: weather observations and model-based guidance.
- Fuzzy similarity-measuring algorithm.
- Prediction composition: fairly trivial; predictions are based on selected percentiles of cumulative summaries of the k nearest neighbors.
Data:
- Past airport weather observations.
- Recent and current observations.
- Numerical Weather Prediction (NWP) based guidance.
Algorithm: Collect Most Similar Analogs, Make Prediction
- The archive search is like a contracting hyperellipsoid centered on the present case.
- Axes measure differences in weather elements between compared cases.
- “Distances” are determined by fuzzy similarity-measuring functions, expertly tuned, all applied together simultaneously.
- Ceiling and visibility are forecast based on the outcomes of the most similar analogs.
- The spread in the analogs helps to inform about appropriate forecast confidence.
[Figure: climate archive, analog ensemble, and ceiling and visibility evolution.]
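A rough Python sketch of fuzzy analog retrieval is given here. Several things are assumptions made for illustration only: the triangular membership shape, the use of min to combine attribute similarities, the tolerance values (the presentation says these are expertly tuned but does not give them), and all names and sample cases.

```python
def fuzzy_sim(a, b, tolerance):
    """Triangular similarity: 1.0 for identical values,
    falling linearly to 0.0 once the difference reaches `tolerance`."""
    return max(0.0, 1.0 - abs(a - b) / tolerance)

# Hypothetical tolerances per attribute (placeholders for expert-tuned values).
TOLERANCE = {"ceiling_m": 300.0, "visibility_m": 2000.0, "wind_kt": 10.0, "temp_c": 5.0}

def case_similarity(present, past):
    # Apply every attribute similarity together; the weakest match
    # decides the overall similarity (min is an assumed combination rule).
    return min(fuzzy_sim(present[name], past[name], tol)
               for name, tol in TOLERANCE.items())

def most_similar(present, archive, k=2):
    """Return the k most similar past cases as (similarity, case) pairs."""
    scored = [(case_similarity(present, case), case) for case in archive]
    scored = [pair for pair in scored if pair[0] > 0.0]  # drop non-analogs
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

present = {"ceiling_m": 200, "visibility_m": 1500, "wind_kt": 8, "temp_c": 4}
archive = [
    {"id": "A", "ceiling_m": 250, "visibility_m": 1800, "wind_kt": 10, "temp_c": 5},
    {"id": "B", "ceiling_m": 900, "visibility_m": 9000, "wind_kt": 20, "temp_c": 12},
    {"id": "C", "ceiling_m": 150, "visibility_m": 1000, "wind_kt": 5, "temp_c": 2},
]

result = most_similar(present, archive)
for similarity, case in result:
    print(case["id"], round(similarity, 2))
```

Case B differs from the present case by more than the ceiling tolerance, so its similarity is zero and it is not treated as an analog.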
Knowledge Representation
Category | Attribute | Units
temporal | date | Julian date of year (wraps around)
temporal | hour | hours offset from sunrise/sunset
cloud ceiling and visibility | cloud amount(s) | tenths of cloud cover (for each layer)
cloud ceiling and visibility | cloud ceiling height | height in metres of ≥ 6/10ths cloud cover
cloud ceiling and visibility | visibility | horizontal visibility in metres
wind | wind direction | degrees from true north
wind | wind speed | knots
precipitation | precipitation type | nil, rain, snow, etc.
precipitation | precipitation intensity | nil, light, moderate, heavy
spread and temperature | dew point temperature | degrees Celsius
spread and temperature | dry bulb temperature | degrees Celsius
pressure | pressure trend | kilopascals per hour
Prediction System: Data Structure and Case Retrieval
- Compose the present case: recent observations + NWP guidance.
- Collect the most similar past cases by traversing the case base and measuring similarity.
[Figure: the present case, a(t0-p) ... a(t0) followed by guidance, laid out over recent past, time zero, and future, compared with past cases b(t0-p) ... b(t0) ... b(t0+p).]
Solution Features: Fuzzy Set
Ceiling and Visibility Forecast
Forecast: ceiling and visibility based on the 30th percentile of the analogs.
Ceiling and Visibility Forecast
Probabilistic forecast: 10th to 50th percentile of ceiling and visibility from the analogs.
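A small Python sketch of composing a forecast from analog outcomes by percentile follows. The nearest-rank percentile definition, the function name, and the analog ceiling values are assumptions for illustration; the presentation does not say how its percentiles are computed.

```python
def percentile(values, pct):
    """Nearest-rank percentile; `pct` is an integer from 1 to 100."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling of pct*n/100
    return ordered[rank - 1]

# Hypothetical ceiling heights (metres) that followed each of 10 retrieved analogs
analog_ceilings = [90, 120, 150, 150, 180, 240, 300, 450, 600, 900]

forecast = percentile(analog_ceilings, 30)           # single-value forecast
band = (percentile(analog_ceilings, 10),
        percentile(analog_ceilings, 50))             # probabilistic range

print("ceiling forecast (m):", forecast)
print("10th to 50th percentile band (m):", band)
```

Choosing a percentile below the median makes the forecast lean toward the lower, operationally more restrictive ceilings among the analogs.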
Verification Method
Forecasts are verified using a standard performance measurement method: the average accuracy of forecasts of significant flying categories in the 0-to-6-hour and 0-to-24-hour projection periods.
Artificial Neural Networks: Motivation
- Real-time operation: neural network computations can be carried out in parallel.
- Fault tolerance through redundant information coding: destroying parts of a network degrades its performance, but some network capabilities can be retained even with major network damage.
Biological Neural Networks
[Figure: essential structure of a neuron: inputs, effective connections, activation function, output.]
Biological Neural Networks
Neurons constantly receive signals from their inputs. A neuron evaluates its input voltages and fires if the evaluated value exceeds some threshold (meaning that the excitatory influences dominate the inhibitory influences acting on the neuron). When it fires, it generates a voltage signal that is output along a structure called an axon.
Basic Neuron Model
- Inputs x_i arrive through pre-synaptic connections.
- Synaptic efficacy is modeled using real weights w_i.
- The response of the neuron is a nonlinear function f of its weighted inputs.
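The basic neuron model can be written as a few lines of Python. This is a sketch only: the function name neuron and the choice of tanh as the nonlinear function f in the example are assumptions, not something the presentation specifies.

```python
import math

def neuron(inputs, weights, f):
    """Response of a single model neuron: f applied to the weighted input sum."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return f(net)

# Hypothetical inputs and weights; a negative weight acts as an inhibitory connection.
print(neuron([1.0, 0.5], [0.4, -0.8], math.tanh))
print(neuron([1.0, 1.0], [0.5, 0.5], math.tanh))
```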
Network Topology
[Figure: feedforward and feedback network topologies, each with inputs and outputs.]
Differences in Networks
Feedforward networks:
- Solutions are known
- Weights are learned
- Evolves in the weight space
- Used for: prediction, classification, function approximation
Feedback networks:
- Solutions are unknown
- Weights are prescribed
- Evolves in the state space
- Used for: constraint satisfaction, optimization, feature matching
Inputs to Neurons
- Arise from other neurons or from outside the network.
- Nodes whose inputs arise outside the network are called input nodes and simply copy values.
- An input may excite or inhibit the response of the neuron to which it is applied, depending upon the weight of the connection.
Weights
- Represent synaptic efficacy and may be excitatory or inhibitory.
- Normally, positive weights are considered excitatory and negative weights inhibitory.
- Learning is the process of modifying the weights in order to produce a network that performs some function.
Output
- The response function is normally nonlinear.
- Examples include the sigmoid and piecewise linear functions.
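The two response functions named above might look like this in Python. The logistic form of the sigmoid is the usual one; the breakpoints of the piecewise linear function (lo and hi) are hypothetical parameters chosen for illustration.

```python
import math

def sigmoid(net):
    """Logistic sigmoid: smooth, bounded between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-net))

def piecewise_linear(net, lo=-1.0, hi=1.0):
    """0 below `lo`, 1 above `hi`, and a straight line in between (assumed breakpoints)."""
    if net <= lo:
        return 0.0
    if net >= hi:
        return 1.0
    return (net - lo) / (hi - lo)

for net in (-5.0, 0.0, 5.0):
    print(net, round(sigmoid(net), 4), piecewise_linear(net))
```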
The Backpropagation Network
- The backpropagation network (BPN) is used for classification or function approximation.
- Three (sometimes more) layers of neurons: input layer, hidden layer, output layer.
- Feedforward processing only.
- Sigmoid activation functions.
[Figure: BPN units and activation functions.]
Backpropagation Preparation
- Training set: a collection of input-output patterns used to train the network.
- Testing set: a collection of input-output patterns used to assess network performance.
- Learning rate η: a scalar parameter, analogous to step size in numerical integration, that sets the rate of adjustments.
Network Error
- Total-Sum-Squared-Error (TSSE)
- Root-Mean-Squared-Error (RMSE)
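The two error measures can be sketched in Python as below. The exact definitions are not given here, so this assumes one common convention: TSSE sums the squared differences over all patterns and output neurons, and RMSE is the square root of TSSE divided by the number of summed terms. Some texts include a factor of 1/2 in TSSE; the sample targets and outputs are hypothetical.

```python
import math

def tsse(targets, outputs):
    """Total sum of squared errors over all patterns and output neurons."""
    return sum((t - o) ** 2
               for target_row, output_row in zip(targets, outputs)
               for t, o in zip(target_row, output_row))

def rmse(targets, outputs):
    """Root mean squared error: sqrt(TSSE / number of summed terms)."""
    n_terms = sum(len(row) for row in targets)
    return math.sqrt(tsse(targets, outputs) / n_terms)

# Hypothetical targets and network outputs for four patterns, one output neuron each
targets = [[0.1], [0.9], [0.9], [0.1]]
outputs = [[0.2], [0.8], [0.7], [0.1]]

print("TSSE:", round(tsse(targets, outputs), 4))
print("RMSE:", round(rmse(targets, outputs), 4))
```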
Learning in the BPN
Before the learning process starts, all weights (synapses) in the network are initialized with pseudorandom numbers. We also have to provide a set of training patterns, described as a set of ordered vector pairs {(x1, y1), (x2, y2), …, (xP, yP)}. Then we can start the backpropagation learning algorithm, which iteratively minimizes the network’s error by finding the gradient of the error surface in weight space and adjusting the weights in the opposite direction (gradient descent).
Learning in the BPN
Gradient-descent example: finding a minimum of a one-dimensional error function f(x). Starting from an initial point $x_0$, step against the slope, $x_{i+1} = x_i - \eta f'(x_i)$, and repeat this iteratively until, for some $x_i$, $f'(x_i)$ is sufficiently close to 0. Gradient descent finds a local minimum, which is not necessarily the absolute minimum.
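A one-dimensional gradient descent loop can be sketched in Python as follows. The example function f(x) = (x - 3)^2 + 1, the step size, and the stopping tolerance are hypothetical choices for illustration.

```python
def gradient_descent(df, x0, eta=0.1, tol=1e-6, max_iter=10_000):
    """Step against the derivative `df` until it is sufficiently close to 0."""
    x = x0
    for _ in range(max_iter):
        slope = df(x)
        if abs(slope) < tol:
            break
        x -= eta * slope  # move opposite to the slope
    return x

# Hypothetical error function f(x) = (x - 3)**2 + 1 with derivative f'(x) = 2 * (x - 3)
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(round(x_min, 4))
```

On this convex function the only minimum is at x = 3; on a function with several minima the result would depend on the starting point x0.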
A Pseudo-Code Algorithm
Randomly choose the initial weights
While error is too large
    For each training pattern (presented in random order)
        Apply the inputs to the network
        Calculate the output for every neuron from the input layer, through the hidden layer(s), to the output layer
        Calculate the error at the outputs
        Use the output error to compute error signals for pre-output layers
        Use the error signals to compute weight adjustments
        Apply the weight adjustments
    Periodically evaluate the network performance
Learning in the BPN
Gradients of two-dimensional functions: the gradient always points in the direction of the steepest increase of the function. To find the function’s minimum, we should always move against the gradient.
Possible Data Structures
Two-dimensional arrays:
- Weights (at least for input-to-hidden layer and hidden-to-output layer connections)
- Weight changes ($\Delta_{ij}$)
One-dimensional arrays (neuron layers):
- Cumulative current input
- Current output
- Error signal for each neuron
- Bias weights
Learning in the BPN
If we choose the type and number of neurons in our network appropriately, the trained network should behave as follows:
- If we input any of the training vectors, the network should yield the expected output vector (with some margin of error).
- If we input a vector that the network has never “seen” before, it should be able to generalize and yield a plausible output vector based on its knowledge about similar input vectors.
Apply Inputs from a Pattern
- Apply the value of each input parameter to the corresponding input node.
- Input nodes compute only the identity function.
Calculate Outputs for Each Neuron Based on the Pattern
The output from neuron j for pattern p is $O_{pj} = \frac{1}{1 + e^{-net_j}}$, where $net_j = bias_j + \sum_k W_{jk} O_{pk}$, k ranges over the input indices, and $W_{jk}$ is the weight on the connection from input k to neuron j.
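The feedforward step can be sketched in Python for a network with one hidden layer. This assumes sigmoid units with a separate bias per neuron; the function names and all weight values are hypothetical.

```python
import math

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

def layer_outputs(inputs, weights, biases):
    """weights[j][k] is the weight on the connection from input k to neuron j."""
    return [sigmoid(b + sum(w * x for w, x in zip(row, inputs)))
            for row, b in zip(weights, biases)]

def feedforward(x, w_hidden, b_hidden, w_out, b_out):
    """Input nodes just copy x; hidden and output layers apply the sigmoid."""
    hidden = layer_outputs(x, w_hidden, b_hidden)
    return hidden, layer_outputs(hidden, w_out, b_out)

# Hypothetical 2-2-1 network and input pattern
x = [0.1, 0.9]
w_hidden = [[0.5, -0.5], [0.3, 0.8]]
b_hidden = [0.0, 0.0]
w_out = [[1.0, -1.0]]
b_out = [0.0]

hidden, outputs = feedforward(x, w_hidden, b_hidden, w_out, b_out)
print("hidden:", hidden)
print("outputs:", outputs)
```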
Calculate the Error Signal for Each Output Neuron
The output neuron error signal $\delta_{pj}$ is given by $\delta_{pj} = (T_{pj} - O_{pj})\, O_{pj}\, (1 - O_{pj})$, where $T_{pj}$ is the target value of output neuron j for pattern p and $O_{pj}$ is the actual output value of output neuron j for pattern p.
Calculate the Error Signal for Each Hidden Neuron
The hidden neuron error signal $\delta_{pj}$ is given by $\delta_{pj} = O_{pj}\, (1 - O_{pj}) \sum_k \delta_{pk} W_{kj}$, where $\delta_{pk}$ is the error signal of a post-synaptic neuron k and $W_{kj}$ is the weight of the connection from hidden neuron j to the post-synaptic neuron k.
Calculate and Apply Weight Adjustments
- Compute weight adjustments $\Delta W_{ji}$ at time t by $\Delta W_{ji}(t) = \eta\, \delta_{pj}\, O_{pi}$.
- Apply weight adjustments according to $W_{ji}(t+1) = W_{ji}(t) + \Delta W_{ji}(t)$.
- Some add a momentum term $\alpha\, \Delta W_{ji}(t-1)$.
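The error-signal and weight-adjustment rules can be sketched in Python for a single training pattern. This assumes sigmoid units (hence the O(1 - O) factor); the function names, the optional momentum handling, and the example numbers are hypothetical.

```python
def output_deltas(targets, outputs):
    """Error signal of each output neuron: (T - O) * O * (1 - O)."""
    return [(t - o) * o * (1.0 - o) for t, o in zip(targets, outputs)]

def hidden_deltas(hidden, w_out, d_out):
    """Error signal of each hidden neuron; w_out[k][j] connects hidden j to output k."""
    return [h * (1.0 - h) * sum(d_out[k] * w_out[k][j] for k in range(len(d_out)))
            for j, h in enumerate(hidden)]

def update_weights(weights, deltas, pre_outputs, eta, prev_changes=None, alpha=0.0):
    """Apply W_ji += eta * delta_j * O_i (+ alpha * previous change); return the changes."""
    changes = []
    for j, row in enumerate(weights):
        change_row = []
        for i in range(len(row)):
            dw = eta * deltas[j] * pre_outputs[i]
            if prev_changes is not None:
                dw += alpha * prev_changes[j][i]  # optional momentum term
            row[i] += dw
            change_row.append(dw)
        changes.append(change_row)
    return changes

# Hypothetical state after a forward pass: two hidden neurons, one output neuron
hidden = [0.5, 0.5]
outputs = [0.5]
targets = [0.9]
w_out = [[1.0, -1.0]]

d_out = output_deltas(targets, outputs)
d_hidden = hidden_deltas(hidden, w_out, d_out)   # uses the weights before the update
update_weights(w_out, d_out, hidden, eta=0.5)
print(d_out, d_hidden, w_out)
```

The hidden error signals are computed before the hidden-to-output weights are changed, so that both layers are adjusted from the same forward pass.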
An Example: Exclusive “OR”
Training set:
- ((0.1, 0.1), 0.1)
- ((0.1, 0.9), 0.9)
- ((0.9, 0.1), 0.9)
- ((0.9, 0.9), 0.1)
Testing set
An Example (continued): Network Architecture
[Figure: network architecture with inputs and output(s).]
An Example (continued): Network Architecture
[Figure: the network annotated with a sample input and its target output.]
Feedforward Network Training by Backpropagation: Process Summary
- Select an architecture
- Randomly initialize weights
- While error is too large:
    - Select a training pattern and feed it forward to find the actual network output
    - Calculate errors and backpropagate error signals
    - Adjust weights
- Evaluate performance using the test set
An Example (continued): Network Architecture
[Figure: the same network with the actual output and intermediate values shown as unknowns ("??") still to be computed.]
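The whole training process can be sketched in plain Python on the XOR training set above. This is not the network from the slides: the number of hidden neurons (4), the learning rate, the fixed number of epochs used in place of "while error is too large", the random seed, and all names are assumptions made for illustration.

```python
import math
import random

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

def forward(x, w_h, w_o):
    # The last weight in each row is the bias weight (its input is fixed at 1.0).
    h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_h]
    o = sigmoid(sum(w * v for w, v in zip(w_o, h + [1.0])))
    return h, o

def total_error(patterns, w_h, w_o):
    """Sum of squared output errors over all patterns."""
    return sum((t - forward(x, w_h, w_o)[1]) ** 2 for x, t in patterns)

def train(patterns, n_hidden=4, eta=0.5, epochs=10000, seed=0):
    rng = random.Random(seed)
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(n_hidden)]
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    initial = total_error(patterns, w_h, w_o)
    order = list(patterns)
    for _ in range(epochs):
        rng.shuffle(order)                        # random presentation order
        for x, t in order:
            h, o = forward(x, w_h, w_o)
            d_o = (t - o) * o * (1.0 - o)         # output error signal
            d_h = [h[j] * (1.0 - h[j]) * d_o * w_o[j] for j in range(n_hidden)]
            for j, v in enumerate(h + [1.0]):     # adjust hidden-to-output weights
                w_o[j] += eta * d_o * v
            for j in range(n_hidden):             # adjust input-to-hidden weights
                for i, v in enumerate(x + [1.0]):
                    w_h[j][i] += eta * d_h[j] * v
    return w_h, w_o, initial, total_error(patterns, w_h, w_o)

xor_patterns = [([0.1, 0.1], 0.1), ([0.1, 0.9], 0.9),
                ([0.9, 0.1], 0.9), ([0.9, 0.9], 0.1)]

w_h, w_o, err_before, err_after = train(xor_patterns)
print(f"TSSE before training: {err_before:.4f}, after: {err_after:.4f}")
for x, t in xor_patterns:
    print(x, "target", t, "output", round(forward(x, w_h, w_o)[1], 3))
```

How closely the outputs approach the targets depends on the initial weights and the settings above; as the deck notes for network size, finding workable values is a matter of experiment.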
[Figures: two plots against iteration.]
Supervised Learning in ANNs
- If an ANN has too few neurons, it may not have enough degrees of freedom to precisely approximate the desired function.
- If an ANN has too many neurons, it will learn the exemplars perfectly, but its additional degrees of freedom may cause implausible behavior for untrained inputs; it then generalizes poorly.
- Unfortunately, there are no known equations that give the optimal network size for a given application; you always have to experiment.
Rainfall Forecasting Test
[Figures: rainfall forecasting test results, including a panel titled "RainFall".]
Hybrid Weather Prediction
Two basic methods to predict weather:
- Dynamical: based upon equations of the atmosphere, uses finite element techniques, and is commonly referred to as computer modeling.
- Empirical: based upon the occurrence of analogs, or similar weather situations.
In practice, hybrid methods are used: hybrid methods = models + observations.
- Statistical methods infer expected distributions under specified conditions.
- Theoretical distributions, e.g., normal distributions, are fit to scanty data.
Hybrid Forecast Decision Support Systems
Hybrid forecast system development is a current direction of the Aviation Weather Research Program (AWRP).
AWRP Terminal Ceiling and Visibility Product Development Team (PDT) project: the Consensus Forecast System, a combination of:
- A physical column model
- Statistical forecast models, local and regional
- A satellite statistical forecast model
Hybrid Forecast Decision Support Systems
AWRP National Ceiling and Visibility PDT research initiatives:
- Data fusion: intelligent integration of the output of various models, observational data, and forecaster input using fuzzy logic.
- Data mining: C5.0 pattern recognition software for generating decision trees.
- Analog forecasting using Euclidean distance.
- Development of daily climatology.
- Incorporation of AutoNowcast of weather radar.
- Incorporation of satellite image cloud-type classification algorithms.
Decision Support Systems Design
- Generic: a no-name, conceptual design that could link and integrate the most useful elements of WIND, AVISA, MultiAlert, SCRIBE, FPA, URP, and so on in an evolving WSP application.
- Modular: shows where distinct sub-tools / agents can be developed. Individual developers can work on isolated sub-problems and anticipate how to plug their results into a larger shared system. As technology improves, improved modules can be easily installed and quickly implemented.
- User-centered: forecast decision support systems seen from the forecaster's point of view, designed to increase situational awareness.
- Hybrid: combines complementary sources of knowledge, forecasters and AI, to increase the quality of input data and output information. Intelligent integration of data, information, and model output, and the use of adaptive forecasting strategies, are intrinsic to this design.
Intelligent Weather Systems
[Figure: system diagram with these components: sensor systems, real-time data preprocessing, quality control, real-time data algorithms, data assimilation, mesoscale model, model output algorithms, selective climatological input, human input (> 15 min) via a graphic user interface, fuzzy logic integration algorithm, product generator, and user, with the annotation "AI works here". Weather radar nowcasts: RAP, Thunderstorm Auto-Nowcasting, www.rap.ucar.edu/projects/nowcast]
Conclusion
Many decision makers are responsible for outdoor activities, transportation, and travel planning. When weather can seriously influence the operations of an organization, they need precise weather predictions on which to base management decisions that avoid injury, mitigate weather-related risk, and turn knowledge into competitive advantage.
End of Presentation
QUESTIONS?