Mathematics SL Internal Assessment IA Portfolio Type II Task Mathematical Modeling Mr. Wai 2012
Population Trends in China
Aim: In this task, you will investigate different functions that best model the population of China from 1950 to 1995. The following table shows the population of China from 1950 to 1995.
Year                     1950    1955    1960    1965    1970    1975    1980    1985    1990    1995
Population (millions)   554.8   609.0   657.5   729.2   830.7   927.8   998.9  1070.0  1155.3  1220.5
Define all relevant variables and parameters clearly. Use technology to plot the data points from the above table on a graph. You should define elapsed time t, in years, since 1950 as the independent variable, and population P, in millions of inhabitants, as the dependent variable. Things to note here: Identify the dependent and independent variables clearly and correctly, each with a symbol and the unit it is measured in. “The independent variable is elapsed time t, in years, since 1950” is better than “the independent variable is time”. “The dependent variable is population P, in millions of inhabitants” is better than “the dependent variable is population”. ALL variables representing a quantity, such as t and P, are italicized. Parameters will be provided when a mathematical model is proposed.
Comment on any apparent trends shown in the graph. What type of functions could model the behaviour of the graph? Explain your choices. Create a scatter plot based on your selected dependent and independent variables. Graph 1. Population in China from 1950 to 1995
You should explain the behaviour and graphical pattern observed. The population is increasing throughout. Its rate of change increases for the first half of the period, then seems to decrease. A linear function, a cubic function, or an exponential function seems reasonable based on the apparent trends shown. Things to note in this section: There is no correct answer here, as long as your choice is accompanied by a reasonable justification. The choice of mathematical model made here is only preliminary; there will be room for improvement later. Make sure the graph takes up at least half a page. Make sure all Graphs and Tables are titled and numbered (Graph 1. Population in China from 1950 to 1995). This makes it easier to refer to them.
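The claim about the rate of change can be checked directly from the table by looking at the increase over each 5-year interval. The following is a minimal sketch for illustration only; the variable names are assumptions and nothing here is part of the original task.

```python
# Population of China (millions), every 5 years from 1950 to 1995 (from the table).
years = list(range(1950, 2000, 5))
population = [554.8, 609.0, 657.5, 729.2, 830.7, 927.8, 998.9, 1070.0, 1155.3, 1220.5]

# Increase in population over each 5-year interval, rounded to 1 decimal place.
increases = [round(later - earlier, 1)
             for earlier, later in zip(population, population[1:])]

for start, inc in zip(years, increases):
    print(f"{start}-{start + 5}: +{inc} million")
```

The largest increase falls in the 1965-1970 interval, and the increases are generally smaller after that, which is consistent with a growth rate that rises and then falls.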
Analytically develop one model function that fits the data points on your graph. At this stage of the course, you have learned about many different types of functions. Suppose I have chosen a cubic function as my model. However, you have not learned exactly HOW and WHY regression works. Thus the “baby way” to analytically develop a cubic function that models population as a function of time, $P(t) = at^3 + bt^2 + ct + d$, where a, b, c and d are the parameters of the cubic model and t and P are the independent and dependent variables respectively, is to set up a system of equations in which four of the data points satisfy the cubic model.
I have chosen $P_5(5, 609.0)$, $P_{15}(15, 729.2)$, $P_{30}(30, 998.9)$ and $P_{40}(40, 1155.3)$ to be the four points. Why do you think I chose those four points?
Using the four chosen data points, we can establish a system of equations from our cubic model $P(t) = at^3 + bt^2 + ct + d$:
$P(5) = (5)^3 a + (5)^2 b + (5)c + d = 609.0$
$P(15) = (15)^3 a + (15)^2 b + (15)c + d = 729.2$
$P(30) = (30)^3 a + (30)^2 b + (30)c + d = 998.9$
$P(40) = (40)^3 a + (40)^2 b + (40)c + d = 1155.3$
Thus we have the matrix system $AX = B$, where $A$ is the coefficient matrix, $B$ is the column vector on the right-hand side, and $X$ is the column vector $(a \ b \ c \ d)^T$.
The system is solved by applying matrix algebra: $X = A^{-1}B$. Thus the cubic mathematical model obtained analytically is $P(t) = -0.009486t^3 + 0.7127t^2 + 0.8491t + 588.1$. I chose 4 significant digits for my parameters because the data is given to 4 significant digits.
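As a cross-check on the matrix step, the system for the four chosen points can also be solved numerically. The sketch below is an illustration, not the GDC procedure used on the slides: it swaps the explicit inverse for a small hand-written Gaussian elimination routine, and the helper name solve_linear_system is hypothetical.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b], so row operations act on both sides at once.
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Bring the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution, from the last unknown to the first.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x

# The four chosen data points (t, P).
points = [(5, 609.0), (15, 729.2), (30, 998.9), (40, 1155.3)]
A = [[t**3, t**2, t, 1] for t, _ in points]   # coefficient matrix
B = [p for _, p in points]                    # right-hand side

a, b, c, d = solve_linear_system(A, B)
print(f"P(t) = {a:.4g}t^3 + {b:.4g}t^2 + {c:.4g}t + {d:.4g}")
```

Rounded to 4 significant digits, the solution should agree with the parameters of the cubic model quoted above.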
Suppose I chose an exponential function instead, because I think population grows exponentially: $P(t) = ab^t$, where a and b are the parameters of the model. Using the two chosen data points, we can establish a system of equations from our exponential model $P(t) = ab^t$:
$P(5) = ab^5 = 609.0$
$P(40) = ab^{40} = 1155.3$
Solving the system of equations above gives the solution $a \approx 555.8$ and $b \approx 1.018$. Thus the exponential model obtained analytically using this “baby way” is $P(t) = 555.8(1.018)^t$
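The two-equation system can be solved by dividing one equation by the other, which eliminates a and leaves $b^{35} = 1155.3 / 609.0$. A short sketch of that arithmetic follows; the variable names are assumptions made for illustration.

```python
# The two chosen data points (t, P).
t1, p1 = 5, 609.0
t2, p2 = 40, 1155.3

# Dividing a*b^t2 = p2 by a*b^t1 = p1 eliminates a: b^(t2 - t1) = p2 / p1.
b = (p2 / p1) ** (1 / (t2 - t1))

# Substitute b back into a*b^t1 = p1 to recover a.
a = p1 / b**t1

print(f"a = {a:.4g}, b = {b:.4g}")
```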
On a new set of axes, plot your model and the original data. Comment on how well your model fits the original data. Revise your model if necessary. Graph 2 shows the original data, the cubic model (red curve) and the exponential model (green curve), both of which were created analytically. Graph 2. Mathematical models for the population of China from 1950 to 1995
We can observe that both models fit the data fairly well. The cubic function passes through more data points (at least 4, because that is how we analytically arrived at the four parameters of the cubic function). However, it is reasonable to say that the cubic model will underestimate the population of China beyond 1995, based on the behaviour of the data. The exponential function passes through fewer data points, and the remaining points appear to be further away from the curve, especially from 1970 to 1980. It is also reasonable to say that the exponential model will overestimate the population of China beyond 1995, based on the behaviour of the data.
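Reading the fit from a graph can be backed up with residuals (model value minus data value). The sketch below tabulates them for both analytic models, using the rounded 4-significant-digit parameters from the slides, so values at the chosen points are close to zero but not exactly zero. The function names are assumptions.

```python
years = list(range(1950, 2000, 5))
population = [554.8, 609.0, 657.5, 729.2, 830.7, 927.8, 998.9, 1070.0, 1155.3, 1220.5]

def cubic(t):
    """Cubic model from the slides, t in years since 1950."""
    return -0.009486 * t**3 + 0.7127 * t**2 + 0.8491 * t + 588.1

def exponential(t):
    """Exponential model from the slides, t in years since 1950."""
    return 555.8 * 1.018**t

residuals = []  # (year, cubic residual, exponential residual)
print("Year    Data    Cubic (resid)   Exponential (resid)")
for year, p in zip(years, population):
    t = year - 1950
    r_cubic = cubic(t) - p
    r_exp = exponential(t) - p
    residuals.append((year, r_cubic, r_exp))
    print(f"{year}  {p:7.1f}  {cubic(t):7.1f} ({r_cubic:+6.1f})  "
          f"{exponential(t):7.1f} ({r_exp:+6.1f})")
```

A negative residual means the model lies below the data point for that year.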
Use technology to find another function that models the data. On a new set of axes, draw both your model functions. Comment on any differences. In the task of Population Trends in China, students are instructed to use logistic regression on a Graphic Display Calculator to model the data. Doing so on the calculator, you will arrive at a logistic function as follows:
On a new set of axes, plot the logistic model and the original data. Comment on how well this model fits the original data. Graph 3. Logistic Regression model created on a GDC
Key things to note: The logistic function eliminates the main shortcoming of both the cubic function and the exponential function, which is that neither appears able to predict the population beyond 1995 satisfactorily: they contradict each other, so they cannot both be correct. We should still discuss the implications of each of the three models we examined for the future population growth of China. In the exponential model, one implication is that the population of China will continue to grow without bound, which is not probable, since resources are limited. In the cubic model, one implication is that the population will start to decline shortly after 1995 (the cubic reaches its maximum near t ≈ 51, around the year 2000), which is not an expected behaviour of any population. In the logistic model, without future data it is hard to conclude quantitatively that it is the superior model. However, it does eliminate the immediate shortcoming of both the cubic and exponential models.
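The long-run behaviour of the two analytic models can be made concrete with a little calculus and logarithms. The sketch below locates the turning point of the cubic model by solving $P'(t) = 3at^2 + 2bt + c = 0$, and computes the doubling time of the exponential model. It uses the rounded parameters from the slides; the variable names are assumptions.

```python
import math

# Cubic model from the slides: P(t) = a t^3 + b t^2 + c t + d (d does not affect P').
a, b, c = -0.009486, 0.7127, 0.8491

# P'(t) = 3a t^2 + 2b t + c = 0, solved with the quadratic formula.
qa, qb, qc = 3 * a, 2 * b, c
disc = qb**2 - 4 * qa * qc
roots = [(-qb + sign * math.sqrt(disc)) / (2 * qa) for sign in (1, -1)]

# The larger root is the local maximum, because the leading coefficient a is negative.
t_peak = max(roots)
print(f"Cubic model peaks at t = {t_peak:.1f}, i.e. about {1950 + t_peak:.0f}")

# Exponential model P(t) = 555.8 (1.018)^t doubles every ln 2 / ln 1.018 years.
doubling_time = math.log(2) / math.log(1.018)
print(f"Exponential model doubles every {doubling_time:.1f} years")
```

After its peak the cubic model only decreases, while the exponential model keeps doubling on a fixed schedule. These are the two implications discussed above.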
Here are additional data on population trends in China from the 2008 World Economic Outlook published by the International Monetary Fund (IMF). Comment on how well each of the models above fits the IMF data for the years 1983-2008.
Year                      1983    1992    1997    2000    2003    2005    2008
Population (millions)   1030.1  1171.7  1236.3  1267.4  1292.3  1307.6  1327.7
Create a graph using the original data and the additional data on population in China along with the three models that we have come up with. Then comment on how well each model fits the IMF data and the original data based on the graph.
Graph 4. Comparing all three models for population in China from 1950 to 2008
Discuss the quality of fit of each function, first with the new data points specifically, then in general (the total data pool), and lastly say whether any of them addresses the concerns we had before. For example, you can discuss how the two data points that lie within the original time range (1983 and 1992) still fit all three functions well. Beyond 1995, however, the additional data confirm our original suspicion that the exponential function overestimates the population while the cubic function underestimates it. Even though the logistic model overestimates the population beyond the year 2000, it is still the best of the three models considered. Thus…
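The comparison with the IMF data can also be tabulated for the two analytic models. The logistic model is left out of this sketch because its parameters are not reproduced in the text of these slides. The rounded model parameters come from the slides; the function names are assumptions.

```python
# IMF data: year -> population of China (millions).
imf = {1983: 1030.1, 1992: 1171.7, 1997: 1236.3, 2000: 1267.4,
       2003: 1292.3, 2005: 1307.6, 2008: 1327.7}

def cubic(t):
    """Cubic model from the slides, t in years since 1950."""
    return -0.009486 * t**3 + 0.7127 * t**2 + 0.8491 * t + 588.1

def exponential(t):
    """Exponential model from the slides, t in years since 1950."""
    return 555.8 * 1.018**t

print("Year     IMF    Cubic  Exponential")
for year, p in imf.items():
    t = year - 1950
    print(f"{year}  {p:7.1f}  {cubic(t):7.1f}  {exponential(t):11.1f}")
```

For every IMF year after 1995 the cubic value falls below the data and the exponential value lies above it, with the gap widening towards 2008.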
We should use a logistic model to find the curve that best models our data set. To improve our current model, we must include the additional data from the IMF and adjust the parameters of the logistic model. Entering the new data, the final logistic model will be as follows: However, you must note that there are still limitations to our refined model with the modified parameters. Since this model is built from a finite data set spanning 58 years, it is not reasonable to use it to predict the population of China 30 years from now. (maybe quickly explain why 30 years)
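For readers without a GDC, the refit on the combined data can be approximated in code. The sketch below is NOT the calculator's regression routine and is not expected to reproduce its parameters. It assumes the logistic form $P(t) = K / (1 + Ae^{-rt})$, tries a grid of candidate limits K (the grid range and step are arbitrary assumptions), fits A and r for each K by a straight-line fit to $\ln(K/P - 1)$, and keeps the K with the smallest sum of squared errors. The helper name fit_for_capacity is hypothetical.

```python
import math

# Original data plus IMF data: year -> population (millions).
data = {1950: 554.8, 1955: 609.0, 1960: 657.5, 1965: 729.2, 1970: 830.7,
        1975: 927.8, 1980: 998.9, 1985: 1070.0, 1990: 1155.3, 1995: 1220.5,
        1983: 1030.1, 1992: 1171.7, 1997: 1236.3, 2000: 1267.4,
        2003: 1292.3, 2005: 1307.6, 2008: 1327.7}
points = sorted((year - 1950, p) for year, p in data.items())

def fit_for_capacity(K, points):
    """For a fixed K, fit ln(K/P - 1) = ln(A) - r*t by least squares.

    Returns (A, r, sse), where sse is measured in the original P units.
    """
    ts = [t for t, _ in points]
    ys = [math.log(K / p - 1) for _, p in points]
    n = len(ts)
    t_mean, y_mean = sum(ts) / n, sum(ys) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
             / sum((t - t_mean) ** 2 for t in ts))
    intercept = y_mean - slope * t_mean
    A, r = math.exp(intercept), -slope
    sse = sum((K / (1 + A * math.exp(-r * t)) - p) ** 2 for t, p in points)
    return A, r, sse

# Candidate limits must exceed the largest observed population (1327.7).
candidates = ((K,) + fit_for_capacity(K, points) for K in range(1350, 3001, 10))
K, A, r, sse = min(candidates, key=lambda row: row[3])
print(f"P(t) = {K} / (1 + {A:.3f} e^(-{r:.4f} t)),  SSE = {sse:.0f}")
```

Because the straight-line fit minimizes error in the transformed variable rather than in P itself, the result is only a rough stand-in for a proper nonlinear regression.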
To Sum Up. Remember that IB is looking for the process: how you came to conclude that whichever model you selected in the end is reasonable and justifiable. Emphasize at each stage how well your current model fits the data, and how it can be improved. As an activity, you should go through the slides and locate where I have included items that satisfy the requirements of the following: Criterion C: correctly define variables, parameters, and constraints, and analyze them to enable the formulation of a mathematical model, so that the model can be applied to additional data while taking into consideration how well it fits the data. Criterion D: correctly and critically interpret the reasonableness of the results of the model in the context of the task, including possible limitations and modifications of the results, to the appropriate degree of accuracy.
Special Mentions Professor Carlos Abel Eslava Carrillo from Tecnológico de Monterrey – Campus Eugenio Garza Lagüera