Published by Rudolph Joseph. Modified over 9 years ago.
2
In this chapter we focus on how data are constructed and where they may be found. Collecting and manipulating data is the key part of an empirical research project. A research project is nothing without adequate data and an original testable hypothesis. We, as researchers, have to be sure that there are enough data to adequately test our hypothesis. Otherwise, we may face the frustration(!) of investing a great deal of time and effort only to find that the data needed to test our painstakingly developed hypothesis are not available.
3
The vast majority of data are constructed rather than collected. For this reason, statistics is made up not only of facts but also of knowledge which is created.
4
Best (2001) identifies 3 steps in the construction of a data series: (1) defining the concept, (2) deciding how the concept will be measured, and (3) determining how to define the sample on which the data will be based.
5
Every data series is constructed for a specific purpose. However, a given data series may not be defined or measured in a way that best matches your needs. So, sometimes [or probably most of the time(!)] you may need to construct your own data.
6
Most social science statistics are based on sample data rather than populations. For example, average family income is not the average income of ALL families; rather, it is the average income of families in the sample.
7
It is important to distinguish between those organizations that collect or produce data and those that publish it.
8
Data come in 3 forms: time-series, cross-section, and longitudinal (panel) data. Time-series data give different observations or data points on the same variable at different points across time (ex.: Turkish GDP per capita over the time period 1923-2009). Cross-section data, by contrast, give different observations of a comparable variable at the same point in time (ex.: average disposable personal income across different cities of Turkey for 2009). Longitudinal (panel) data take a cross-section sample and follow it over time (ex.: a sample of family income for the same 10 families over 5 years).
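The three forms can be pictured as differently keyed tables. The following is only a minimal sketch: every number is an invented placeholder rather than a real statistic, and the names `unit_history` and `period_snapshot` are hypothetical helpers introduced for illustration. It shows that a panel contains both a time series (one unit over time) and a cross-section (all units in one period).

```python
# All numbers are invented placeholders, not real statistics.

# Time series: one variable for one unit, observed in several periods.
time_series = {2007: 100.0, 2008: 104.0, 2009: 101.0}

# Cross-section: one variable for several units, observed in one period.
cross_section = {"Ankara": 52.0, "Istanbul": 61.0, "Izmir": 57.0}

# Panel: the same units followed over several periods, keyed by (unit, year).
panel = {
    ("family_1", 2008): 30.0,
    ("family_1", 2009): 32.0,
    ("family_2", 2008): 45.0,
    ("family_2", 2009): 44.0,
}


def unit_history(data, unit):
    """Time series for one unit, extracted from a panel."""
    return {year: value for (u, year), value in data.items() if u == unit}


def period_snapshot(data, year):
    """Cross-section for one period, extracted from a panel."""
    return {u: value for (u, y), value in data.items() if y == year}


print(unit_history(panel, "family_1"))  # {2008: 30.0, 2009: 32.0}
print(period_snapshot(panel, 2009))     # {'family_1': 32.0, 'family_2': 44.0}
```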
9
Longitudinal data are an example of a micro data set, since the data points or observations are of individual economic agents such as individuals, households, or firms. Macro data, by contrast, are compiled at the national level. The frequency of data also varies: you may find daily, weekly, monthly, quarterly, or annual data.
10
A number of US governmental, international, and private organizations gather economic and social statistics. US agencies: Census Bureau (www.census.gov); Bureau of Economic Analysis (www.bea.doc.gov); Bureau of Labor Statistics (www.bls.gov); the Federal Reserve (www.federalreserve.gov). International agencies: International Monetary Fund; World Bank; OECD; Eurostat; Asian Development Bank; Inter-American Development Bank.
11
Turkish data sources: Central Bank of the Republic of Turkey (www.tcmb.gov.tr); Turkish Statistical Institute (www.tuik.gov.tr); State Planning Organization of Turkey (www.dpt.gov.tr).
12
US National Income and Product Accounts (official national accounts of the US). US Flow of Funds Accounts (data on financial flows across the US economy). US Balance of Payments Accounts and International Investment Position of the US. US Census of Population and Integrated Public Use Microdata Series. Current Population Survey. Current Employment Statistics. The Economic Census. Annual Survey of Manufactures. Current Industrial Reports. American Housing Survey. Consumer Expenditure Survey. National Longitudinal Surveys. Panel Study of Income Dynamics. Surveys of Consumers. Survey of Consumer Finances.
13
These sources are usually more user-friendly than the primary sources. Economic Report of the President. Economagic. FRED II (Federal Reserve Economic Data) (an excellent source for US macro and financial data). Stat-USA/State of the Nation. Inter-university Consortium for Political and Social Research. International Financial Statistics (principal data set of the IMF). World Economic Outlook Database. Penn World Tables. Joint BIS-IMF-OECD-WB Statistics on External Debt. Eurostat. OECD Main Economic Indicators and National Accounts.
15
Empirical research can be divided into 2 types: experimental and survey (nonexperimental). In experimental research, the data come from the experiment, and collecting the data is the major part of the study. In survey research, we use preexisting data, and researchers generally do not put the same care and effort into collecting them. This is undoubtedly a huge mistake!
16
It is a good idea to start with a search strategy. We have 2 steps:
17
The first issue is sample size: you need a sample large enough to obtain statistically valid empirical test results. The second issue is that of a random or representative sample, which will be discussed in detail in the 10th chapter. The third is obtaining data that correctly measure the concepts that your theory deems important. Once you have determined your list of desired variables, the next step is to think about where those data are likely to be found. Step 1 can be summarized by the following questions: What are the desired variables? How should each variable be defined? What data frequency, sample period, and level of analysis are needed? What are potential sources for data on each variable?
18
As you begin to investigate each data source, you need to ask several questions: What data are in fact available? If the data are not ideal, are they good enough? If the data are not acceptable, is there an available proxy (a variable that should behave roughly the same as your theoretical variable)? If there is no adequate proxy, how can the hypothesis be reformulated to make it testable, given the data available?
19
Data for any variable may be found in various forms, some of which are listed below: levels; per capita (per person); changes; rates of change (growth rates); annualized growth rates; proportions; nominal; real; index numbers.
20
The level is the most basic form. It is the actual value or size of the variable being measured (ex.: the level of Turkish GDP per capita in 2009 is TL X). Researchers often use the per capita form of the variable, which is found by dividing the level of the variable by the appropriate population.
21
Sometimes it is more useful to examine the change in a variable than the level. Ex.: say that the GDP of Turkey in 2008 and 2009 is X and Y respectively (in TL). Then the change between 2008 and 2009 would be (Y - X). A more meaningful evaluation can be made by calculating the rate of change (percentage change or growth rate). Returning to the example, the rate of change between 2008 and 2009 would be calculated as follows: G = [(Y - X) / X]*100
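The change and the rate of change can be sketched in a few lines of Python. The function names are hypothetical, and the two input values are invented for illustration; they are not actual Turkish GDP figures.

```python
def change(previous: float, current: float) -> float:
    """Absolute change between two periods: Y - X."""
    return current - previous


def rate_of_change(previous: float, current: float) -> float:
    """Percentage change: G = [(Y - X) / X] * 100."""
    if previous == 0:
        raise ValueError("rate of change is undefined when the earlier value is zero")
    return (current - previous) / previous * 100


# Hypothetical values for X (2008) and Y (2009), for illustration only.
x_2008 = 200.0
y_2009 = 250.0

print(change(x_2008, y_2009))          # 50.0
print(rate_of_change(x_2008, y_2009))  # 25.0
```

The same absolute change of 50 would be a 25% rise from 200 but only a 5% rise from 1000, which is why the rate is usually the more informative figure.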
22
For periods of time shorter than a year, the annualized growth rate is used. Let us give an example: assume that the sales of a company grow by 10% each quarter and that the beginning value for sales is TL100. Then sales are 1.10*100 = TL110 after the 1st Q, 1.10*110 = TL121 after the 2nd Q, 1.10*121 = TL133.1 after the 3rd Q, and finally 1.10*133.1 = TL146.4 after the final Q. So the annualized growth rate would be 46.4%, which is 6.4 percentage points more than the rough approximation (10%*4 = 40%).
23
To formulate this rate: Gq = [(X1/X0)^4 - 1]*100 (X0: initial value; X1: next quarter’s value). In our example: [(110/100)^4 - 1]*100 = 46.4%. If the data are monthly, we should instead raise the ratio of monthly values to the 12th power.
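The quarterly example can be checked with a short Python sketch. The function name `annualized_growth` and its `periods_per_year` parameter are assumptions made for illustration; passing 4 gives the quarterly formula and 12 gives the monthly one.

```python
def annualized_growth(x0: float, x1: float, periods_per_year: int) -> float:
    """Annualized growth rate in percent: [(X1/X0)^n - 1] * 100."""
    return ((x1 / x0) ** periods_per_year - 1) * 100


# Compound 10% growth quarter by quarter, starting from TL100.
sales = 100.0
for quarter in range(1, 5):
    sales *= 1.10
    print(f"Q{quarter}: TL{sales:.2f}")

# One quarter of data (100 -> 110) is enough to annualize.
print(f"Annualized: {annualized_growth(100.0, 110.0, 4):.1f}%")  # Annualized: 46.4%
```

The loop and the formula agree because raising the one-quarter ratio to the 4th power is the same as compounding it four times.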
24
A form of data similar to the growth rate is the proportion, also called a share, a percentage, or a fraction. Let us give a numerical example. Suppose that GDP = C + I + G + (X-M) = 7303.7 + 1593.2 + 1972.9 + (-423.6) = TL 10446.2. The proportion of consumption expenditures in GDP would be calculated as follows: C/GDP = 7303.7/10446.2 = 0.699 = 69.9%.
25
Similarly, the shares of the other components in GDP would be calculated as: I/GDP = 1593.2/10446.2 = 0.153 = 15.3%. G/GDP = 1972.9/10446.2 = 0.189 = 18.9%. (X-M)/GDP = -423.6/10446.2 = -0.041 = -4.1%.
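The share calculations can be reproduced with a small Python sketch that uses the component values from the example. The variable names are chosen here for illustration.

```python
# Expenditure components of GDP from the example (in TL).
components = {"C": 7303.7, "I": 1593.2, "G": 1972.9, "X-M": -423.6}

gdp = sum(components.values())
shares = {name: value / gdp for name, value in components.items()}

print(f"GDP = {gdp:.1f}")
for name, share in shares.items():
    print(f"{name}/GDP = {share:.3f} = {share:.1%}")
```

Because net exports are negative, their share is negative as well, and the four shares still sum to 1.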
26
Let’s remember the simple identity below: V = P x Q, where V is the nominal (or value) measure, P is the price, and Q is the real (or volume) measure. Nominal data are data measured by using the actual market prices that existed during the time period in question. Real data at the micro level refer to the actual quantities employed by a firm (labor hours), produced by a firm (number of widgets), or sold by a firm (sales volume).
27
Index numbers, unlike most other statistics, have no units. They are designed for comparison purposes. For ex., one could use an index number to compare the level of whatever the index is measuring to its level in an earlier time period known as the base period.
28
The formula is as follows: XT = (Xt/X0)*100, where Xt is the value of the raw variable in a given time period t in the series, X0 is the value of the raw variable in the base period (the period to be compared against), and XT is the resulting index number. Note that in the base period Xt = X0, so the index equals 100.
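As a minimal sketch, the index formula can be applied to a whole series at once. The function name `index_series` is hypothetical and the raw values are invented placeholders, not real data.

```python
def index_series(values: dict, base_period) -> dict:
    """Convert raw values to index numbers: (Xt / X0) * 100."""
    base = values[base_period]
    return {period: value / base * 100 for period, value in values.items()}


# Hypothetical raw series, for illustration only.
raw = {2007: 80.0, 2008: 100.0, 2009: 125.0}

print(index_series(raw, base_period=2007))
# {2007: 100.0, 2008: 125.0, 2009: 156.25}
```

The base period always maps to 100, and every other value reads directly as a percentage of the base-period level.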
29
Because prices tend to increase over time, it would be misleading to compare the nominal measurements. By using base year prices and actual year quantities, real GDP excludes the effects of changing prices over time.
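The difference between nominal and real GDP can be illustrated with a two-good economy. This is only a sketch: the goods, prices, and quantities are invented for illustration, and the helper `gdp` is a hypothetical name. Nominal GDP values each year's quantities at that year's prices, while real GDP values them at base-year prices.

```python
# Invented prices (TL per unit) and quantities for a two-good economy.
prices = {
    2008: {"bread": 2.0, "cloth": 10.0},
    2009: {"bread": 2.5, "cloth": 12.0},
}
quantities = {
    2008: {"bread": 100.0, "cloth": 20.0},
    2009: {"bread": 104.0, "cloth": 21.0},
}


def gdp(price_year: int, quantity_year: int) -> float:
    """Value the quantities of one year at the prices of another."""
    return sum(
        prices[price_year][good] * quantity
        for good, quantity in quantities[quantity_year].items()
    )


base_year = 2008
for year in (2008, 2009):
    nominal = gdp(year, year)    # current prices, current quantities
    real = gdp(base_year, year)  # base-year prices, current quantities
    print(year, nominal, real)
# 2008 400.0 400.0
# 2009 512.0 418.0
```

In this invented case nominal GDP rises by 28% while real GDP rises by only 4.5%; the gap is due entirely to higher prices.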
30
The base year is generally changed to keep it “recent”, because we care more about recent economic changes than historic ones. Let us give a numerical example. Suppose that we have the following annual CPI data:
31
Year            1991  1992  1993  1994  1995  1996  1997  1998
Base 1997                               0.85  0.95  1.00  1.15
Base 1992       0.95  1.00  1.25  1.30  1.40
Linked series   0.58  0.61  0.76  0.79  0.85  0.95  1.00  1.15
32
Now, suppose that we want to complete the series with a 1997 base year. We need to transform the values for the observations only available with a base year of 1992, so that they correctly show the change in CPI between both parts of the data. Year 1995 is the year of overlap; that is, we have 2 values for this year (1.40 with base 1992 and 0.85 with base 1997). The data with the earlier base year need to be reduced to link to the data with the later base year. The amount of the reduction is given by the ratio of the two values for 1995. So, the reduction factor would be: 0.85/1.40 = 0.607.
33
To obtain the linked series with a 1997 base year, each value with base year 1992 would be multiplied by the reduction factor (0.607) (shown in the third row).
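The linking procedure can be sketched in Python using the CPI values from the example. The function name `link_series` is hypothetical. Note that the sketch uses the unrounded factor 0.85/1.40 rather than 0.607, which gives the same linked values to two decimals.

```python
# CPI values from the example, keyed by year.
cpi_base_1992 = {1991: 0.95, 1992: 1.00, 1993: 1.25, 1994: 1.30, 1995: 1.40}
cpi_base_1997 = {1995: 0.85, 1996: 0.95, 1997: 1.00, 1998: 1.15}


def link_series(old: dict, new: dict, overlap_year: int):
    """Rescale the old-base series so it joins the new-base series."""
    factor = new[overlap_year] / old[overlap_year]
    linked = {year: value * factor for year, value in old.items()}
    linked.update(new)  # new-base values are kept unchanged
    return factor, linked


factor, linked = link_series(cpi_base_1992, cpi_base_1997, overlap_year=1995)

print(f"Reduction factor: {factor:.3f}")  # Reduction factor: 0.607
for year in sorted(linked):
    print(year, f"{linked[year]:.2f}")
```

Rescaling changes the level of the older observations but not their year-to-year growth rates, which is what makes the two segments comparable.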