Published by Matilda Booth. Modified over 9 years ago.
2
Air pollution is the introduction of chemicals and biological materials into the atmosphere that cause damage to the natural environment. We focused on sulfur dioxide (SO2) as a major contributor to air pollution. Sulfur dioxide is a highly reactive gas, a cause of acid rain, and a precursor to respiratory and cardiovascular problems.
Air pollution is an ongoing problem worldwide, now more than ever. We conduct a cross-sectional study of air pollution levels, measured by SO2 and related factors, for 41 US cities using means over the years 1969-1971. By running several regressions we attempt to determine the likely causes of air pollution.
3
City            SO2   Temperature   Man   Population   Wind   Rain    RainDays
Phoenix          10   70.3          213   582          6       7.05    36
Little Rock      13   61             91   132          8.2    48.52   100
San Francisco    12   56.7          453   716          8.7    20.66    67
Denver           17   51.9          454   515          9      12.95    86
Hartford         56   49.1          412   158          9      43.37   127
Wilmington       36   54             80                9      40.25   114
Washington       29   57.3          434   757          9.3    38.89   111
Jacksonville     14   68.4          136   529          8.8    54.47   116
...
The data are means over the years 1969-1971.
4
1. City: City
2. SO2: Sulfur dioxide content of air in micrograms per cubic meter
3. Temp: Average annual temperature in degrees Fahrenheit
4. Man: Number of manufacturing enterprises employing 20 or more workers
5. Pop: Population size in thousands from the 1970 census
6. Wind: Average annual wind speed in miles per hour
7. Rain: Average annual precipitation in inches
8. RainDays: Average number of days with precipitation per year
5
Histogram of sulfur levels: Since the data have a high Jarque-Bera statistic and are positively skewed, we reject the hypothesis that sulfur levels are normally distributed.
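The Jarque-Bera statistic combines sample skewness S and kurtosis K as JB = n/6 * (S^2 + (K - 3)^2 / 4); under normality it is approximately chi-squared with 2 degrees of freedom. The following is a minimal sketch in plain Python. The function name is hypothetical, and the sample is only the eight SO2 values shown in the data excerpt, not the full 41 cities, so its output will not match the slide's histogram.

```python
import math

def jarque_bera(values):
    """Return (skewness, kurtosis, JB statistic, p-value) for a sample."""
    n = len(values)
    mean = sum(values) / n
    # Central moments (population form, dividing by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # The chi-squared(2) survival function has the closed form exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return skew, kurt, jb, p_value

# Assumption: the eight SO2 values from the excerpt, for illustration only.
so2_sample = [10, 13, 12, 17, 56, 36, 29, 14]
skew, kurt, jb, p_value = jarque_bera(so2_sample)
print(f"skew={skew:.3f} kurtosis={kurt:.3f} JB={jb:.3f} p={p_value:.3f}")
```

A large JB value with a small p-value points away from normality; with only eight observations the asymptotic p-value is a rough guide at best.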
6
We ran a number of bivariate regressions to find out which independent variables significantly explain SO2 levels, both including and excluding dummy variables. Next, we ran a multivariate regression to see whether the variables we found to be significant remain significant in explaining SO2 levels when combined. We then tested for multicollinearity and, lastly, investigated an interesting problem.
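A bivariate least-squares fit of the kind described here can be sketched in plain Python. The function name is hypothetical, and the data are only the eight cities from the excerpt (temperature against SO2), so the estimates will differ from the 41-city output reported on the following slides.

```python
import math

def fit_simple_ols(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares give R-squared.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - sse / sst
    # Standard error of the slope uses n - 2 degrees of freedom.
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return {"slope": slope, "intercept": intercept,
            "r_squared": r_squared, "se_slope": se_slope}

# Assumption: eight-city excerpt only, for illustration.
temperature = [70.3, 61, 56.7, 51.9, 49.1, 54, 57.3, 68.4]
so2 = [10, 13, 12, 17, 56, 36, 29, 14]
fit = fit_simple_ols(temperature, so2)
print(fit)
```

Dividing the slope by its standard error gives the t-statistic that the slides use to judge significance.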
9
Dependent Variable: SO2
Method: Least Squares
Date: 11/22/09 Time: 14:03
Sample: 1 41
Included observations: 41

Variable        Coefficient   Std. Error   t-Statistic   Prob.
TEMPERATURE     -1.408133     0.468595     -3.005012     0.0046
C               108.5711      26.34371      4.121328     0.0002

R-squared            0.188009    Mean dependent var      30.04878
Adjusted R-squared   0.167189    S.D. dependent var      23.47227
S.E. of regression   21.42044    Akaike info criterion   9.014119
Sum squared resid    17894.58    Schwarz criterion       9.097708
Log likelihood      -182.7894    F-statistic             9.030097
Durbin-Watson stat   1.848386    Prob(F-statistic)       0.004624

Temperature significantly explains SO2 levels, as shown by the high t-statistic and low p-value. The coefficient of temperature is negative, meaning SO2 levels decrease as temperature increases.
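The figures in the temperature regression can be cross-checked against each other. In a regression with one explanatory variable, the t-statistic is the coefficient divided by its standard error, the F-statistic equals the squared t-statistic, and R-squared equals F / (F + n - 2). The short sketch below applies these standard identities to the reported numbers; the variable names are illustrative.

```python
# Reported values from the SO2-on-temperature regression.
n_obs = 41
coefficient = -1.408133
std_error = 0.468595

# t-statistic: coefficient divided by its standard error.
t_stat = coefficient / std_error

# With a single regressor, the F-statistic is the square of the t-statistic.
f_stat = t_stat ** 2

# R-squared recovered from F with (1, n - 2) degrees of freedom.
r_squared = f_stat / (f_stat + (n_obs - 2))

print(f"t = {t_stat:.4f}, F = {f_stat:.4f}, R-squared = {r_squared:.4f}")
```

The results agree with the reported t-statistic (-3.005012), F-statistic (9.030097) and R-squared (0.188009) up to rounding.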
10
Dependent Variable: SO2
Method: Least Squares
Date: 11/22/09 Time: 14:22
Sample: 1 41
Included observations: 41

Variable        Coefficient   Std. Error   t-Statistic   Prob.
MAN             0.026859      0.005099     5.267788      0.0000
C               17.61057      3.691587     4.770462      0.0000

R-squared            0.415727    Mean dependent var      30.04878
Adjusted R-squared   0.400745    S.D. dependent var      23.47227
S.E. of regression   18.17025    Akaike info criterion   8.684999
Sum squared resid    12876.16    Schwarz criterion       8.768588
Log likelihood      -176.0425    F-statistic             27.74959
Durbin-Watson stat   1.721399    Prob(F-statistic)       0.000005

Manufacturing Enterprises significantly explains SO2 levels, as shown by the high t-statistic and low p-value. The positive coefficient of MAN means that as the number of manufacturing enterprises increases, so do SO2 levels.
11
Dependent Variable: SO2
Method: Least Squares
Date: 11/22/09 Time: 14:16
Sample: 1 41
Included observations: 41

Variable        Coefficient   Std. Error   t-Statistic   Prob.
POPULATION      0.020014      0.005644     3.546111      0.0010
C               17.86832      4.713844     3.790604      0.0005

R-squared            0.243818    Mean dependent var      30.04878
Adjusted R-squared   0.224429    S.D. dependent var      23.47227
S.E. of regression   20.67121    Akaike info criterion   8.942912
Sum squared resid    16664.66    Schwarz criterion       9.026500
Log likelihood      -181.3297    F-statistic             12.57490
Durbin-Watson stat   1.791243    Prob(F-statistic)       0.001035

Population significantly explains SO2 levels, as shown by the high t-statistic and low p-value. The coefficient of population is positive, meaning that as population increases, so does SO2.
12
Dependent Variable: SO2
Method: Least Squares
Date: 11/22/09 Time: 14:07
Sample: 1 41
Included observations: 41

Variable        Coefficient   Std. Error   t-Statistic   Prob.
WIND            1.555741      2.619045     0.594011      0.5559
C               15.35652      25.00859     0.614050      0.5427

R-squared            0.008966    Mean dependent var      30.04878
Adjusted R-squared  -0.016445    S.D. dependent var      23.47227
S.E. of regression   23.66448    Akaike info criterion   9.213378
Sum squared resid    21840.30    Schwarz criterion       9.296967
Log likelihood      -186.8743    F-statistic             0.352849
Durbin-Watson stat   1.818109    Prob(F-statistic)       0.555935

Wind does not significantly explain SO2 levels, as can be seen from the low t-statistic and low R-squared. It thus makes sense to take the wind variable out of our regression model.
13
Dependent Variable: SO2
Method: Least Squares
Date: 11/22/09 Time: 14:14
Sample: 1 41
Included observations: 41

Variable        Coefficient   Std. Error   t-Statistic   Prob.
RAIN            0.108262      0.318822     0.339569      0.7360
C               26.06809      12.29492     2.120233      0.0404

R-squared            0.002948    Mean dependent var      30.04878
Adjusted R-squared  -0.022618    S.D. dependent var      23.47227
S.E. of regression   23.73623    Akaike info criterion   9.219433
Sum squared resid    21972.94    Schwarz criterion       9.303022
Log likelihood      -186.9984    F-statistic             0.115307
Durbin-Watson stat   1.820565    Prob(F-statistic)       0.736003

Rain does not significantly explain SO2 levels, as shown by the low t-statistic and low R-squared. We thus remove the rain variable from our regression model.
14
Dependent Variable: SO2
Method: Least Squares
Date: 11/22/09 Time: 14:09
Sample: 1 41
Included observations: 41

Variable        Coefficient   Std. Error   t-Statistic   Prob.
RAINYDAYS       0.327260      0.131760      2.483761     0.0174
C               -7.226963     15.39914     -0.469310     0.6415

R-squared            0.136577    Mean dependent var      30.04878
Adjusted R-squared   0.114438    S.D. dependent var      23.47227
S.E. of regression   22.08842    Akaike info criterion   9.075534
Sum squared resid    19028.03    Schwarz criterion       9.159123
Log likelihood      -184.0485    F-statistic             6.169068
Durbin-Watson stat   1.970233    Prob(F-statistic)       0.017404

RainyDays does significantly explain SO2 levels, as shown by the high t-statistic and low p-value. The coefficient of RAINYDAYS is positive, meaning SO2 levels increase as the number of rainy days increases.
15
Dependent Variable: SO2
Method: Least Squares
Date: 12/03/09 Time: 00:56
Sample: 1 41
Included observations: 41

Variable        Coefficient   Std. Error   t-Statistic   Prob.
TEMPERATURE     -0.417243     0.391666     -1.065304     0.2938
RAINYDAYS       0.127634      0.100713      1.267308     0.2132
POPULATION      -0.043929     0.015398     -2.852937     0.0071
MAN             0.068179      0.016111      4.231909     0.0002
C               33.93991      27.91632      1.215773     0.2320

R-squared            0.629094    Mean dependent var      30.04878
Adjusted R-squared   0.587882    S.D. dependent var      23.47227
S.E. of regression   15.06835    Akaike info criterion   8.376920
Sum squared resid    8173.989    Schwarz criterion       8.585892
Log likelihood      -166.7269    F-statistic             15.26491
Durbin-Watson stat   1.543633    Prob(F-statistic)       0.000000
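For a regression with k explanatory variables and an intercept, adjusted R-squared and the overall F-statistic both follow from R-squared, n and k. The sketch below applies the standard formulas to the multivariate output above (n = 41, k = 4); the function names are illustrative.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R-squared for n observations and k regressors plus an intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def overall_f_statistic(r2, n, k):
    """F-statistic for the joint significance of all k regressors."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

r2_multivariate = 0.629094  # reported R-squared of the four-variable model
adj = adjusted_r_squared(r2_multivariate, n=41, k=4)
f_stat = overall_f_statistic(r2_multivariate, n=41, k=4)
print(f"adjusted R-squared = {adj:.6f}, F = {f_stat:.4f}")
```

Both values agree with the reported adjusted R-squared (0.587882) and F-statistic (15.26491) up to rounding.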
17
Dependent Variable: RAINYDAYS
Method: Least Squares
Date: 11/22/09 Time: 14:35
Sample: 1 41
Included observations: 41

Variable        Coefficient   Std. Error   t-Statistic   Prob.
TEMPERATURE     -1.577840     0.530112     -2.976427     0.0050
C               201.8882      29.80212      6.774289     0.0000

R-squared            0.185108    Mean dependent var      113.9024
Adjusted R-squared   0.164214    S.D. dependent var      26.50642
S.E. of regression   24.23253    Akaike info criterion   9.260819
Sum squared resid    22901.40    Schwarz criterion       9.344408
Log likelihood      -187.8468    F-statistic             8.859119
Durbin-Watson stat   1.233606    Prob(F-statistic)       0.004989

Multicollinearity does exist because the two variables are significantly correlated: the temperature coefficient has a high t-statistic (p = 0.0050), although R-squared is only 0.19. RainyDays and Temperature are negatively correlated; as temperature goes up, the number of rainy days goes down.
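One way to gauge how strong this collinearity is uses the auxiliary R-squared reported above. With a single regressor, the correlation is the signed square root of R-squared, and the variance inflation factor is VIF = 1 / (1 - R-squared). The VIF is not part of the slides; it is brought in here as a common diagnostic, and the variable names are illustrative.

```python
import math

r2_auxiliary = 0.185108   # R-squared from regressing RAINYDAYS on TEMPERATURE
slope = -1.577840         # sign of the slope gives the sign of the correlation

# Pearson correlation between the two regressors.
correlation = math.copysign(math.sqrt(r2_auxiliary), slope)

# Variance inflation factor implied by the auxiliary regression.
vif = 1.0 / (1.0 - r2_auxiliary)

print(f"correlation = {correlation:.4f}, VIF = {vif:.4f}")
```

A widely used rule of thumb treats VIF values above roughly 5 to 10 as a sign of serious multicollinearity. The value here is about 1.23, which by that rule suggests the collinearity between the two variables is present but mild.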
18
Dependent Variable: SO2
Method: Least Squares
Date: 11/22/09 Time: 14:44
Sample: 1 41
Included observations: 41

Variable        Coefficient   Std. Error   t-Statistic   Prob.
TEMPERATURE     -1.094340     0.512399     -2.135717     0.0392
RAINYDAYS       0.198875      0.139720      1.423383     0.1628
C               68.42054      38.36511      1.783405     0.0825

R-squared            0.229110    Mean dependent var      30.04878
Adjusted R-squared   0.188537    S.D. dependent var      23.47227
S.E. of regression   21.14411    Akaike info criterion   9.010956
Sum squared resid    16988.80    Schwarz criterion       9.136339
Log likelihood      -181.7246    F-statistic             5.646841
Durbin-Watson stat   1.934916    Prob(F-statistic)       0.007126

Since multicollinearity exists, we cannot rely on the individual t-statistics in a regression that uses these two variables as the independent variables. We can, however, continue to use the F-statistic to determine whether the two variables jointly have a significant impact on SO2 levels. The F-statistic is significant (p = 0.0071), but we cannot tell which variable individually drives the SO2 level.
19
Box plot indicating the two outliers: Providence (94) and Chicago (110)
Smallest = 8 (Wichita)
Q1 = 12.5
Median = 26 (Richmond)
Q3 = 35.5
Largest = 110 (Chicago)
IQR = 23
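The slide does not say which rule the box plot used to flag outliers. Assuming the conventional 1.5 x IQR fences, the reported quartiles give an upper fence of 35.5 + 1.5 * 23 = 70, which both Providence and Chicago exceed. A minimal sketch, with a hypothetical function name and only the four cities named on the slide:

```python
def tukey_fences(q1, q3, k=1.5):
    """Lower and upper fences for the k * IQR outlier rule (k = 1.5 assumed)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles as reported on the slide.
lower, upper = tukey_fences(q1=12.5, q3=35.5)

# Only the cities named on the slide; the other 37 are not listed there.
named_cities = {"Wichita": 8, "Richmond": 26, "Providence": 94, "Chicago": 110}
outliers = [city for city, value in named_cities.items()
            if value < lower or value > upper]

print(lower, upper, outliers)
```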
21
Dependent Variable: SO2
Method: Least Squares
Date: 12/02/09 Time: 14:26
Sample: 1 41
Included observations: 41

Variable        Coefficient   Std. Error   t-Statistic   Prob.
TEMPERATURE     -1.046409     0.342685     -3.053562     0.0042
C2              61.31696      15.76031      3.890593     0.0004
C1              77.94481      15.73461      4.953716     0.0000
C               85.00347      19.36350      4.389881     0.0001

R-squared            0.600423    Mean dependent var      30.04878
Adjusted R-squared   0.568025    S.D. dependent var      23.47227
S.E. of regression   15.42711    Akaike info criterion   8.402598
Sum squared resid    8805.845    Schwarz criterion       8.569776
Log likelihood      -168.2533    F-statistic             18.53262
Durbin-Watson stat   1.893874    Prob(F-statistic)       0.000000

Temperature still significantly explains SO2 levels, as shown by the high t-statistic and low p-value.
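The transcript does not show how the dummy variables C1 and C2 were built. A common construction, assumed here, is one indicator per outlier city that equals 1 for that city and 0 for every other observation. The sketch below uses a hypothetical helper and a shortened city list; which of C1 and C2 corresponds to Chicago or Providence is not stated in the source, so the names below are deliberately neutral.

```python
def make_dummy(cities, target_city):
    """Indicator column: 1 for the target city, 0 for all other rows."""
    return [1 if city == target_city else 0 for city in cities]

# Shortened, illustrative city list (the study has 41 cities).
cities = ["Phoenix", "Chicago", "Providence", "Denver"]

dummy_chicago = make_dummy(cities, "Chicago")
dummy_providence = make_dummy(cities, "Providence")

print(dummy_chicago)
print(dummy_providence)
```

A dummy that is 1 for a single observation lets the regression fit that observation exactly, so the remaining coefficients are estimated as if the outlier city had been set aside.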
22
Dependent Variable: SO2
Method: Least Squares
Date: 12/02/09 Time: 14:30
Sample: 1 41
Included observations: 41

Variable        Coefficient   Std. Error   t-Statistic   Prob.
MAN             0.025836      0.007284     3.546764      0.0011
C2              68.91495      15.10630     4.562001      0.0001
C1              7.380472      26.27515     0.280892      0.7804
C               16.22323      3.724041     4.356351      0.0001

R-squared            0.626658    Mean dependent var      30.04878
Adjusted R-squared   0.596387    S.D. dependent var      23.47227
S.E. of regression   14.91206    Akaike info criterion   8.334685
Sum squared resid    8227.670    Schwarz criterion       8.501863
Log likelihood      -166.8610    F-statistic             20.70163
Durbin-Watson stat   1.877703    Prob(F-statistic)       0.000000

Manufacturing Enterprises still significantly explains SO2 levels, as shown by the high t-statistic and low p-value.
23
Dependent Variable: SO2
Method: Least Squares
Date: 12/02/09 Time: 14:31
Sample: 1 41
Included observations: 41

Variable        Coefficient   Std. Error   t-Statistic   Prob.
POPULATION      0.012183      0.007103     1.715259      0.0947
C2              72.14691      17.02945     4.236597      0.0001
C1              49.28269      26.15993     1.883900      0.0675
C               19.67230      4.719600     4.168214      0.0002

R-squared            0.536577    Mean dependent var      30.04878
Adjusted R-squared   0.499002    S.D. dependent var      23.47227
S.E. of regression   16.61396    Akaike info criterion   8.550832
Sum squared resid    10212.88    Schwarz criterion       8.718010
Log likelihood      -171.2921    F-statistic             14.28020
Durbin-Watson stat   1.799458    Prob(F-statistic)       0.000002

Population no longer significantly explains SO2 levels at the 5% level, given the lower t-statistic (1.72) and higher p-value (0.0947).
25
Dependent Variable: SO2
Method: Least Squares
Date: 12/02/09 Time: 14:28
Sample: 1 41
Included observations: 41

Variable        Coefficient   Std. Error   t-Statistic   Prob.
WIND            -0.393012     1.937653     -0.202829     0.8404
C2              68.11667      17.62874      3.863956     0.0004
C1              84.03807      17.58138      4.779946     0.0000
C               30.04926      18.40261      1.632881     0.1110

R-squared            0.500282    Mean dependent var      30.04878
Adjusted R-squared   0.459765    S.D. dependent var      23.47227
S.E. of regression   17.25229    Akaike info criterion   8.626234
Sum squared resid    11012.73    Schwarz criterion       8.793412
Log likelihood      -172.8378    F-statistic             12.34727
Durbin-Watson stat   1.712120    Prob(F-statistic)       0.000010

Wind still does not significantly explain SO2 levels, as can be seen from the low t-statistic and high p-value (0.8404). The R-squared of 0.50 here is driven by the two dummy variables rather than by wind.
26
Dependent Variable: SO2
Method: Least Squares
Date: 12/02/09 Time: 14:30
Sample: 1 41
Included observations: 41

Variable        Coefficient   Std. Error   t-Statistic   Prob.
RAIN            0.070950      0.232440     0.305241      0.7619
C2              67.21003      17.51681     3.836887      0.0005
C1              83.79963      17.46754     4.797449      0.0000
C               23.75684      8.960694     2.651228      0.0117

R-squared            0.500983    Mean dependent var      30.04878
Adjusted R-squared   0.460522    S.D. dependent var      23.47227
S.E. of regression   17.24018    Akaike info criterion   8.624830
Sum squared resid    10997.28    Schwarz criterion       8.792008
Log likelihood      -172.8090    F-statistic             12.38194
Durbin-Watson stat   1.716316    Prob(F-statistic)       0.000009

Rain still does not significantly explain SO2 levels, as shown by the low t-statistic and high p-value (0.7619). The R-squared of 0.50 here is driven by the two dummy variables rather than by rain.
27
Dependent Variable: SO2
Method: Least Squares
Date: 12/02/09 Time: 14:29
Sample: 1 41
Included observations: 41

Variable        Coefficient   Std. Error   t-Statistic   Prob.
RAINYDAYS       0.278414      0.092644      3.005193     0.0047
C2              64.41428      15.71003      4.100200     0.0002
C1              81.24952      15.69349      5.177276     0.0000
C               -5.215998     10.79510     -0.483182     0.6318

R-squared            0.597879    Mean dependent var      30.04878
Adjusted R-squared   0.565274    S.D. dependent var      23.47227
S.E. of regression   15.47614    Akaike info criterion   8.408944
Sum squared resid    8861.907    Schwarz criterion       8.576122
Log likelihood      -168.3834    F-statistic             18.33736
Durbin-Watson stat   1.897244    Prob(F-statistic)       0.000000

Rainy Days still significantly explains SO2 levels, as shown by the high t-statistic and low p-value.
28
Dependent Variable: TEMPERATURE
Method: Least Squares
Date: 12/02/09 Time: 15:38
Sample: 1 41
Included observations: 41

Variable        Coefficient   Std. Error   t-Statistic   Prob.
RAINYDAYS       -0.114168     0.040132     -2.844797     0.0072
C2              -4.720416     6.805350     -0.693633     0.4922
C1              -4.462919     6.798183     -0.656487     0.5156
C               68.99137      4.676276     14.75349      0.0000

R-squared            0.204186    Mean dependent var      55.76341
Adjusted R-squared   0.139660    S.D. dependent var      7.227716
S.E. of regression   6.704032    Akaike info criterion   6.735763
Sum squared resid    1662.930    Schwarz criterion       6.902941
Log likelihood      -134.0831    F-statistic             3.164421
Durbin-Watson stat   1.108636    Prob(F-statistic)       0.035732

Multicollinearity still exists because the two variables are significantly correlated.
29
Dependent Variable: SO2
Method: Least Squares
Date: 12/02/09 Time: 15:30
Sample: 1 41
Included observations: 41

Variable        Coefficient   Std. Error   t-Statistic   Prob.
TEMPERATURE     -0.741891     0.364337     -2.036277     0.0491
RAINYDAYS       0.193714      0.098186      1.972930     0.0562
C2              60.91225      15.17959      4.012773     0.0003
C1              77.93852      15.15345      5.143285     0.0000
C               45.96809      27.18869      1.690706     0.0995

R-squared            0.639411    Mean dependent var      30.04878
Adjusted R-squared   0.599346    S.D. dependent var      23.47227
S.E. of regression   14.85731    Akaike info criterion   8.348710
Sum squared resid    7946.626    Schwarz criterion       8.557682
Log likelihood      -166.1486    F-statistic             15.95916
Durbin-Watson stat   1.971757    Prob(F-statistic)       0.000000

According to the F-statistic, temperature and rainy days are jointly significantly related to SO2 levels. However, since multicollinearity exists, we cannot rely on the individual t-statistics and therefore do not know how significant each variable is on its own.
30
Our final model includes the two dummy variables. This regression model has a significant F-statistic and a small p-value.

Dependent Variable: SO2
Method: Least Squares
Date: 12/02/09 Time: 14:37
Sample: 1 41
Included observations: 41

Variable        Coefficient   Std. Error   t-Statistic   Prob.
TEMPERATURE     -0.618018     0.326611     -1.892218     0.0668
RAINYDAYS       0.165592      0.087842      1.885108     0.0677
MAN             0.021287      0.006593      3.228586     0.0027
C2              63.03642      13.52956      4.659161     0.0000
C1              16.02169      23.44721      0.683309     0.4989
C               33.86402      24.49322      1.382587     0.1756

R-squared            0.722158    Mean dependent var      30.04878
Adjusted R-squared   0.682467    S.D. dependent var      23.47227
S.E. of regression   13.22665    Akaike info criterion   8.136803
Sum squared resid    6123.048    Schwarz criterion       8.387570
Log likelihood      -160.8045    F-statistic             18.19420
Durbin-Watson stat   1.962021    Prob(F-statistic)       0.000000
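To see what the final model implies for a single city, the fitted equation can be evaluated directly. The sketch below uses the reported coefficients and, as an illustration, the Phoenix row of the data excerpt (temperature 70.3, 36 rainy days, 213 manufacturing enterprises, both dummies 0). The function and argument names are hypothetical.

```python
# Coefficients as reported for the final model.
FINAL_MODEL = {
    "C": 33.86402,
    "TEMPERATURE": -0.618018,
    "RAINYDAYS": 0.165592,
    "MAN": 0.021287,
    "C1": 16.02169,
    "C2": 63.03642,
}

def predict_so2(temperature, rainy_days, man, c1=0, c2=0):
    """Fitted SO2 (micrograms per cubic meter) from the final regression."""
    return (FINAL_MODEL["C"]
            + FINAL_MODEL["TEMPERATURE"] * temperature
            + FINAL_MODEL["RAINYDAYS"] * rainy_days
            + FINAL_MODEL["MAN"] * man
            + FINAL_MODEL["C1"] * c1
            + FINAL_MODEL["C2"] * c2)

# Phoenix values taken from the data excerpt; observed SO2 there is 10.
phoenix_fitted = predict_so2(temperature=70.3, rainy_days=36, man=213)
phoenix_residual = 10 - phoenix_fitted
print(f"fitted = {phoenix_fitted:.2f}, residual = {phoenix_residual:.2f}")
```

The gap between the fitted and observed value for one city is in line with the reported standard error of regression of about 13.2.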
31
Histogram of sulfur levels with dummy variables: The data have a low Jarque-Bera statistic with a high p-value and are only slightly positively skewed, so we cannot reject the hypothesis that they are normally distributed.
32
According to the figure above, there is some indication of heteroskedasticity. However, since this is a cross-sectional analysis, we judge that it does not have a significant impact on our final regression.
33
From our regression model, we find that temperature, rainy days and manufacturing all have a significant effect on SO2 levels: manufacturing at the 1% level, and temperature and rainy days at the 10% level. Together with the two dummy variables, the model explains 72% of the variation in sulfur levels. Of the three variables, manufacturing enterprises is the most significant explanatory variable.
Economic Impact: Given that SO2 is a threat to human wellbeing and the environment, lowering SO2 levels can reduce future costs. SO2 pollution is preventable as it stems from human activity. Lower SO2 levels could be achieved by future restrictions on the number of manufacturing enterprises or on the amount of SO2 they emit.