Published by Frederica Walters. Modified over 6 years ago.
1
Merging data using Excel & Stata
Mark Bruyneel & Matthijs de Zwaan
Research Data Services
- Welcome to the course / this web lecture on Data retrieval skills
- My name is . . .
2
Program
Background (30 min.)
- Getting your data
- Working with data
Exercise 1: Excel (30 min.)
Exercise 2: Stata (40 min.)
3
Getting your data
Rule 1: think now, download later
4
Getting your data
Formulate a research question:
What was the influence of using governance reporting standards on earnings management of companies?
- What data do I need?
  - Variables?
  - Sample: geography? Time period?
- Is the data available?
5
What data do I need?
Research question: What was the influence of using governance reporting standards on earnings management of companies?
Variables
- What reporting standards?
- How to measure earnings management?
- Control variables: firm size, board members, …
Sample
- Which countries: USA, Europe, the Netherlands
- Time period: recent, historical?
- Company type: public, SMEs?
Model?
Relationships?
6
Remarks:
- Is the data for each database in a single currency?
- What control variables do I need?
- Do I need to download components to calculate variables I could not download (ratios etc.)?
- Is the data for each database comparable in time?
- If you need more than one database: do you need company identifiers? (see: Blackboard)
7
Which databases are relevant?
- Do I need several databases?
- Do I need to combine datasets?
- Do I need just one database?
8
Which databases are available?
9
Research Data Services
10
Data Center on Blackboard
11
Research Data Services on Blackboard
12
Research Data Services on Blackboard
13
Research Data Services on Blackboard
Help on software: manuals/websites
14
Research Data Services on Blackboard
15
Using several databases
Company identifiers: codes that uniquely identify a company in one or more databases.
(Diagram labels: Search 1, Data 1, Data 2, Search 2, Company identifiers)
16
Combining data(sets)
(Diagram labels: Company identifiers, Data 1, Data 2)
Find out which (company) identification codes are available in all relevant databases!
Examples: ISIN, SEDOL, CUSIP, tickers.
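The check described above can be sketched as a set intersection. The database names and the identifier lists below are hypothetical placeholders, not statements about any real database; look up the actual coverage in each database's documentation.

```python
# Hypothetical identifier coverage per database (illustrative values only).
available = {
    "database_1": {"ISIN", "SEDOL", "Ticker"},
    "database_2": {"ISIN", "CUSIP", "Ticker"},
    "database_3": {"ISIN", "SEDOL"},
}

# Identifiers present in every database are candidates for the merge key.
common = set.intersection(*available.values())
print(sorted(common))
```

If the intersection is empty, a conversion table between identifier types is needed before the datasets can be combined.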
17
Research Data Services on Blackboard
Additional information or tools
18
Research Data Services on Blackboard
19
Blackboard file:
20
Working with data
There are many different ways to organize data. For analysis, use “tidy” data:
- One line (row) = one observation
- One column = one variable
Notes:
- The best way to organize data depends on what you want to do. For example, the clearest way to present data in a thesis is not the best way to organize it for analysis.
- Software requires a certain structure in the data. This can differ between software packages, or even between versions.
- Try to keep the actual data/observations separate from comments etc.
- Stata expects one observation on each line and variables in columns: ‘tidy data’.
21
“Untidy” data: example 1
Untidy layout:
Name                 y2001   y2002
Alphabet             -       2
Johnson & Johnson    16      11
Pfizer               3       1

Tidy layout:
Name                 Y(ear)   Result
Alphabet             2001     -
Johnson & Johnson    2001     16
Pfizer               2001     3
Alphabet             2002     2
Johnson & Johnson    2002     11
Pfizer               2002     1

Headers combine a name and data in one cell: ‘y’ is the variable (year), while 2001 and 2002 are values.
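As a rough illustration of the reshaping in example 1, the sketch below turns each year column into its own row. It uses plain Python; the function name `wide_to_long` is made up for this example, and reading "-" as a missing value is an assumption. In Stata, the `reshape long` command serves the same purpose.

```python
# Wide ("untidy") layout from the slide: one column per year.
# Assumption: "-" on the slide means "missing", stored here as None.
wide = [
    {"Name": "Alphabet", "y2001": None, "y2002": 2},
    {"Name": "Johnson & Johnson", "y2001": 16, "y2002": 11},
    {"Name": "Pfizer", "y2001": 3, "y2002": 1},
]


def wide_to_long(rows, id_col="Name", prefix="y"):
    """Turn every <prefix><year> column into a row with Year and Result."""
    long_rows = []
    for row in rows:
        for col, value in row.items():
            if col == id_col or not col.startswith(prefix):
                continue
            year = int(col[len(prefix):])  # "y2001" -> 2001
            long_rows.append({id_col: row[id_col], "Year": year, "Result": value})
    # Sort by year, then name, to match the order of the tidy table.
    return sorted(long_rows, key=lambda r: (r["Year"], r[id_col]))


tidy = wide_to_long(wide)
for r in tidy:
    print(r)
```

Each of the three wide rows becomes two tidy rows, one per year, so the result has six observations.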
22
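“Untidy” data: example 2
Untidy layout (company and year in one column):
Company      Result
T-2001       -
MSFT-2001    16
GOOG-2001    3
T-2002       2
MSFT-2002    11
GOOG-2002    1

Untidy layout (one column per year):
Company   Year 2001   Year 2002
T         -           2
MSFT      16          11
GOOG      3           1

Tidy layout:
Company   Year   Result
T         2001   -
MSFT      2001   16
GOOG      2001   3
T         2002   2
MSFT      2002   11
GOOG      2002   1

One column holds the values of two separate variables: Name and Year.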
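A minimal sketch of the fix for example 2: split the combined "company-year" cell into two variables. It assumes the year always follows the last hyphen and that "-" as a result means missing; the function name `split_key` is invented for illustration.

```python
# First untidy layout from the slide: company and year share one cell.
# Assumption: "-" as a result means "missing", stored here as None.
combined = [
    ("T-2001", None),
    ("MSFT-2001", 16),
    ("GOOG-2001", 3),
    ("T-2002", 2),
    ("MSFT-2002", 11),
    ("GOOG-2002", 1),
]


def split_key(rows):
    """Split 'TICKER-YEAR' into separate Company and Year variables."""
    tidy_rows = []
    for key, result in rows:
        # rpartition splits on the LAST hyphen, so a ticker that itself
        # contains a hyphen is not cut in the wrong place.
        company, _, year = key.rpartition("-")
        tidy_rows.append({"Company": company, "Year": int(year), "Result": result})
    return tidy_rows


tidy_rows = split_key(combined)
for r in tidy_rows:
    print(r)
```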
23
Working with data
Basics of data merges: merging data ≠ appending data
- Merging = adding variables
- Appending = adding observations
Merge data on key variables (ID / codes). Key variables:
- Must be available in all data files / datasets
- Must uniquely identify observations (a key can be a combination of items)
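The difference between the two operations can be sketched in a few lines of plain Python. The `Sector` variable and its values are hypothetical and only serve to show a variable being added. In Stata the corresponding commands are `append` and `merge`.

```python
# Two small datasets with the same variables (values taken from the slides).
obs_2001 = [{"Name": "Pfizer", "Year": 2001, "Result": 3}]
obs_2002 = [{"Name": "Pfizer", "Year": 2002, "Result": 1}]

# Appending = adding observations: more rows, same variables.
appended = obs_2001 + obs_2002

# Hypothetical extra variable, keyed on (Name, Year), for illustration only.
sector = {("Pfizer", 2001): "Pharma", ("Pfizer", 2002): "Pharma"}

# Merging = adding variables: same rows, one more column, matched on the key.
merged = [{**row, "Sector": sector[(row["Name"], row["Year"])]} for row in appended]

print(len(appended), sorted(appended[0]))
print(len(merged), sorted(merged[0]))
```

Appending changes the number of rows; merging changes the number of columns.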
24
A tidy dataset: example 1
Name                 Year   Result
Alphabet             2001   -
Johnson & Johnson    2001   16
Pfizer               2001   3
Alphabet             2002   2
Johnson & Johnson    2002   11
Pfizer               2002   1

- ‘Name’ and ‘Year’ together uniquely identify a single observation
- The ‘Result’ column gives variable values
N.B.: Unique company ID codes are often better than the name!
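Whether a set of key variables really identifies each observation can be checked before merging. The sketch below counts how often each key combination occurs; the helper name `duplicate_keys` is invented for this example, and "-" is again read as missing. Stata offers the `isid` command for the same kind of check.

```python
from collections import Counter

# Tidy dataset from the slide (assumption: "-" means missing, stored as None).
rows = [
    {"Name": "Alphabet", "Year": 2001, "Result": None},
    {"Name": "Johnson & Johnson", "Year": 2001, "Result": 16},
    {"Name": "Pfizer", "Year": 2001, "Result": 3},
    {"Name": "Alphabet", "Year": 2002, "Result": 2},
    {"Name": "Johnson & Johnson", "Year": 2002, "Result": 11},
    {"Name": "Pfizer", "Year": 2002, "Result": 1},
]


def duplicate_keys(data, key_cols):
    """Return the key combinations that occur in more than one row."""
    counts = Counter(tuple(row[c] for c in key_cols) for row in data)
    return [key for key, n in counts.items() if n > 1]


# Name + Year is a valid key: no duplicates are reported.
print(duplicate_keys(rows, ["Name", "Year"]))
# Name alone is not: every company appears once per year.
print(duplicate_keys(rows, ["Name"]))
```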
25
Working with data
Warning: make sure to keep key ID codes intact!
Example ID values on the slide: 1324, 21234
26
Restoring the ID
Restore the original length with Excel: REPT & LEN
- REPT() = repeat
- LEN() = length: the number of characters in a cell
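The REPT and LEN idea can be sketched as follows, assuming the damage is that leading zeros were dropped and that the original code length is known. The width of 6 is an assumption for illustration, and `restore_id` is a made-up helper name. In Excel a formula of the form `=REPT("0", 6-LEN(A2))&A2` follows the same logic, with cell A2 and the width 6 as placeholders.

```python
def restore_id(value, width):
    """Pad an ID with leading zeros until it has `width` characters."""
    text = str(value)
    # Same idea as REPT("0", width - LEN(cell)) in Excel:
    # repeat "0" as often as characters are missing.
    padding = "0" * max(width - len(text), 0)
    return padding + text


# ID values shown on the slide, restored to an assumed width of 6.
for raw in (1324, 21234):
    print(restore_id(raw, 6))
```

Store the restored IDs as text; if they are converted back to numbers, the leading zeros disappear again.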
27
Merging data
Table 1:
Auditor   Year   GRI Score
John      2001   -
Jane      2001   16
Mary      2001   3
John      2002   2
Jane      2002   11
Mary      2002   1

Table 2:
Auditor   Year   Big4?
John      2001   “Yes”
Jane      2001
Mary      2001   “No”
John      2002
Jane      2002
Mary      2002
28
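Working with data: merging
Table 1:
Auditor   Year   GRI Score
John      2001   -
Jane      2001   16
Mary      2001   3
John      2002   2
Jane      2002   11
Mary      2002   1

Table 2:
Auditor   Year   Big4
John      2001   “Yes”
Jane      2001
Mary      2001   “No”
John      2002
Jane      2002
Mary      2002

“1-to-1” merge result:
Auditor   Year   GRI Score   Big4
John      2001   -           “Yes”
Jane      2001   16
Mary      2001   3           “No”
John      2002   2
Jane      2002   11
Mary      2002   1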
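A 1-to-1 merge on the key (Auditor, Year) can be sketched in plain Python as below. Only the Big4 values that are legible in the transcript are included, "-" is read as missing, and `merge_one_to_one` is a made-up helper, not a library function. In Stata the corresponding command is `merge 1:1`.

```python
# GRI scores per auditor and year (assumption: "-" means missing -> None).
scores = [
    {"Auditor": "John", "Year": 2001, "GRI Score": None},
    {"Auditor": "Jane", "Year": 2001, "GRI Score": 16},
    {"Auditor": "Mary", "Year": 2001, "GRI Score": 3},
    {"Auditor": "John", "Year": 2002, "GRI Score": 2},
    {"Auditor": "Jane", "Year": 2002, "GRI Score": 11},
    {"Auditor": "Mary", "Year": 2002, "GRI Score": 1},
]

# Only the Big4 values legible in the transcript are listed here.
big4 = [
    {"Auditor": "John", "Year": 2001, "Big4": "Yes"},
    {"Auditor": "Mary", "Year": 2001, "Big4": "No"},
]


def merge_one_to_one(master, using, key_cols):
    """Add the variables of `using` to `master`; keys must be unique in both."""
    def key_of(row):
        return tuple(row[c] for c in key_cols)

    lookup = {}
    for row in using:
        if key_of(row) in lookup:
            raise ValueError(f"key {key_of(row)} is not unique in the using data")
        lookup[key_of(row)] = row

    merged, seen = [], set()
    for row in master:
        key = key_of(row)
        if key in seen:
            raise ValueError(f"key {key} is not unique in the master data")
        seen.add(key)
        # Master rows without a match are kept, just without the new variable.
        extra = {c: v for c, v in lookup.get(key, {}).items() if c not in key_cols}
        merged.append({**row, **extra})
    return merged


merged_scores = merge_one_to_one(scores, big4, ["Auditor", "Year"])
for r in merged_scores:
    print(r)
```

This sketch keeps every row of the first table and drops rows that occur only in the second one; whether that is the right choice depends on the analysis.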
29
Working with data: merging
Table 1:
Auditor   Year   GRI Score
John      2001   -
Jane      2001   16
Mary      2001   3
John      2002   2
Jane      2002   11
Mary      2002   1

Table 2:
Name   Gender
John   “M”
Jane   “F”
Mary
30
Working with data: merging
Table 1:
Auditor   Year   GRI Score
John      2001   -
Jane      2001   16
Mary      2001   3
John      2002   2
Jane      2002   11
Mary      2002   1

Table 2:
Name   Gender
John   “M”
Jane   “F”
Mary

“many-to-1” merge result:
Auditor   Year   GRI Score   Gender
John      2001   -           “M”
Jane      2001   16          “F”
Mary      2001   3
John      2002   2
Jane      2002   11
Mary      2002   1
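A many-to-1 merge can be sketched as a lookup: many score rows share one row in the person table. The key has a different name in each table (`Auditor` versus `Name`). No gender value for Mary is legible in the transcript, so it is stored as missing; "-" is likewise read as missing. In Stata the corresponding command is `merge m:1`.

```python
# Many rows per auditor (assumption: "-" means missing -> None).
scores = [
    {"Auditor": "John", "Year": 2001, "GRI Score": None},
    {"Auditor": "Jane", "Year": 2001, "GRI Score": 16},
    {"Auditor": "Mary", "Year": 2001, "GRI Score": 3},
    {"Auditor": "John", "Year": 2002, "GRI Score": 2},
    {"Auditor": "Jane", "Year": 2002, "GRI Score": 11},
    {"Auditor": "Mary", "Year": 2002, "GRI Score": 1},
]

# One row per person; Mary's value is not legible in the transcript.
people = [
    {"Name": "John", "Gender": "M"},
    {"Name": "Jane", "Gender": "F"},
    {"Name": "Mary", "Gender": None},
]

# Build the lookup and insist that the "1" side really is unique.
gender_by_name = {}
for person in people:
    if person["Name"] in gender_by_name:
        raise ValueError(f"{person['Name']} occurs more than once")
    gender_by_name[person["Name"]] = person["Gender"]

# Every score row picks up the gender of its auditor.
with_gender = [{**row, "Gender": gender_by_name.get(row["Auditor"])} for row in scores]

for r in with_gender:
    print(r)
```

The number of rows stays at six: the person information is repeated for every year in which that auditor appears.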
31
Working with data: merging
Table 1:
Auditor   Year   GRI Score
John      2001   -
Jane      2001   16
Mary      2001   3
John      2002   2
Jane      2002   11
Mary      2002   1

Table 2:
Name   Gender
John   “M”
Jane   “F”
Mary

“1-to-m” merge result:
Auditor   Year   GRI Score   Gender
John      2001   -           “M”
Jane      2001   16          “F”
Mary      2001   3
John      2002   2
Jane      2002   11
Mary      2002   1
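Seen from the person table, the same combination is a 1-to-many merge: each person row is matched to all of that person's score rows. A minimal sketch under the same assumptions as before ("-" read as missing, no legible gender value for Mary); in Stata the corresponding command is `merge 1:m`.

```python
from collections import defaultdict

# The "1" side: one row per person (Mary's value is not legible in the transcript).
people = [
    {"Name": "John", "Gender": "M"},
    {"Name": "Jane", "Gender": "F"},
    {"Name": "Mary", "Gender": None},
]

# The "many" side (assumption: "-" means missing -> None).
scores = [
    {"Auditor": "John", "Year": 2001, "GRI Score": None},
    {"Auditor": "Jane", "Year": 2001, "GRI Score": 16},
    {"Auditor": "Mary", "Year": 2001, "GRI Score": 3},
    {"Auditor": "John", "Year": 2002, "GRI Score": 2},
    {"Auditor": "Jane", "Year": 2002, "GRI Score": 11},
    {"Auditor": "Mary", "Year": 2002, "GRI Score": 1},
]

# Group the score rows by auditor so each person can collect all matches.
scores_by_auditor = defaultdict(list)
for row in scores:
    scores_by_auditor[row["Auditor"]].append(row)

# Each person row expands into one output row per matching score row.
expanded = []
for person in people:
    for row in scores_by_auditor.get(person["Name"], []):
        expanded.append({
            "Name": person["Name"],
            "Gender": person["Gender"],
            "Year": row["Year"],
            "GRI Score": row["GRI Score"],
        })

for r in expanded:
    print(r)
```

The content of the result matches the many-to-1 case; only the roles of the two tables and the row order differ.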
32
Exercise 1: Combining data using Excel
- Compustat Global data
- Preparing the Datastream data
- Combining both datasets
33
Exercise 2: Combining data using Stata
- Introduction
- Exercise: Compustat Global & Datastream
34
Exercise 2: Combining data using Stata
36
Exercise 2: Stata – Command line
37
Exercise 2: Stata – Scripts / Do files
Basics about .do files:
- Text files with the .do file extension
- Commands are handled as if they were typed in on the Command line interface
- Typing “doedit” calls up the do-file editor
Advantages of scripting:
- Documents what you have done
- Makes finding mistakes and repairing them easier
- Add comments to your script(s): your future self and your supervisor will be grateful
38
Exercise 2: Stata – Combining the data
Let’s get to work!
- Go to the Data Center Blackboard course
- Download the data files
- Start up the program Stata
39
Need help? The library is there for you!
Website: Blackboard: