Database Clearinghouse Update
  William Johnson Miller (william.miller@gcsu.edu)
  J. Whitney Bunting College of Business, Georgia College & State University
  Department of Management
  Milledgeville, GA 31061
  Presentation to the 48th Annual Meeting of the Southeast Decision Sciences Institute, Wilmington, NC (2018)
Publicly Available (Free) Databases
  Business Analytics Teaching
  Database Clearinghouse
    Interest
    Viability
    Benefits Toward Teaching
    Benefits Toward Research
  Decision at November 2017 DSI Meeting (Washington, DC)
    Links will be made available on the DSI website
    Password protected for members only
  Metrics, Availability, and Maintenance
    Metrics
    Mechanism for Updating
DASI Database Clearinghouse
  Byproduct of Course Project
    Fall 2016, Spring 2017, Fall 2017, Spring 2018 (Eight Sections, tw)
    Mgmt 3175 Business Analytics Students Collected 700+ Database Links
  1st Stage of Project
    Teams Choose 1 or 2 Industries (Individuals Find 3 to 5 Databases)
    Each Team Chooses 3 Preferred Potential Databases
    I Meet with Each Team and Help Them Pick Their Project Database
  2nd Stage of Project
    Teams Use the Business Analytics Process to Analyze Their Project Database
      Descriptive Analytics (Descriptive Statistics, Graphs)
      Predictive Analytics (Regression, Discriminant Analysis)
      Prescriptive Analytics (Linear Programming, Decision Analysis, Heuristics, Software)
    Presentation and Formal Written Report
  Continuing in Future Classes
    Not sure
Ideal Databases
  Publicly Available, Free, Downloadable
  Flat Files
    Rows and Columns Suitable for Excel
  Query-Driven Software Exported to a Flat File
    http://ahrf.hrsa.gov/
    https://www.medicare.gov/hospitalcompare/search.html?
  Columns (Variables)
    Several Continuous Variables
    Several Categorical Variables
    Consider How Variables Interact
    Consider Adding Variables From Other Databases
  Rows (Cases)
    At Least Several Hundred (Could Justify Fewer)
    Can’t Have Too Many (Allows Working with a Subset)
  Raw Data (NEISS or Complaints or UFOs)
  Report Data (Gun Ownership, Arrests)
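The row and column criteria above can be checked mechanically once a candidate database is exported to a flat file. The following is a minimal sketch, not part of the original course materials. The thresholds (300 rows, 3 continuous, 3 categorical) are assumptions standing in for "several hundred" and "several", and the function names are hypothetical. A column is treated as continuous when every non-empty cell parses as a number, so numerically coded categories such as zip codes would be miscounted and still need a manual look.

```python
import csv
import io

# Assumed thresholds: the slide says only "several" and "at least several hundred".
MIN_ROWS = 300
MIN_CONTINUOUS = 3
MIN_CATEGORICAL = 3


def is_number(text):
    """Return True if the cell text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def profile_flat_file(handle):
    """Count rows and split columns into continuous vs. categorical."""
    reader = csv.DictReader(handle)
    columns = {name: [] for name in (reader.fieldnames or [])}
    n_rows = 0
    for row in reader:
        n_rows += 1
        for name in columns:
            value = (row[name] or "").strip()
            if value:  # skip blank cells
                columns[name].append(value)
    # Continuous: every non-empty cell is numeric. Categorical: everything else with data.
    continuous = [n for n, vals in columns.items()
                  if vals and all(is_number(v) for v in vals)]
    categorical = [n for n, vals in columns.items()
                   if vals and n not in continuous]
    return {"rows": n_rows, "continuous": continuous, "categorical": categorical}


def meets_criteria(profile):
    """Apply the assumed thresholds to a profile from profile_flat_file."""
    return (profile["rows"] >= MIN_ROWS
            and len(profile["continuous"]) >= MIN_CONTINUOUS
            and len(profile["categorical"]) >= MIN_CATEGORICAL)
```

To screen a downloaded file, open it with `open(path, newline="")` and pass the handle to `profile_flat_file`. A database that fails the row threshold may still be acceptable, since the slide allows that fewer rows could be justified.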
Database Choices Based On Industry
  File Template From Database Choices Assignment
    At Least Three Databases Per Team Member (Five Ideally)
    Teams Chose Either One or Two Industries-8/31 (8/29)
  Elements
    Industry (added Spring 2017)
    DB Name
    Description
    URL
    Data Dictionary
    #Rows
    #Continuous Variables
    #Categorical Variables
    Report (1) to Raw Data (10)
  Submit to OneDrive Shared Folder (As Developed)
  Final List Reviewed By Team
  Top 3 Final Choices Identified (9/12)
  Each Team Meets With Me in My Office by 9/14 to Decide on a Database
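The template elements above map naturally onto one record per database link. The following is a minimal sketch of such a record and a flat-file export, assuming hypothetical field names (`db_name`, `report_to_raw`, and so on). The original template was a shared file whose exact column headers are not given here. The data dictionary is assumed to be stored as a link or short note.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class DatabaseEntry:
    """One row of the database choices template (field names are hypothetical)."""
    industry: str
    db_name: str
    description: str
    url: str
    data_dictionary: str  # assumed: a link to, or note about, the data dictionary
    n_rows: int
    n_continuous: int
    n_categorical: int
    report_to_raw: int  # 1 = report data, 10 = raw data

    def __post_init__(self):
        # Enforce the 1-to-10 report/raw scale from the template.
        if not 1 <= self.report_to_raw <= 10:
            raise ValueError("report_to_raw must be between 1 and 10")


def write_entries(entries, handle):
    """Write entries as CSV with a header row, one line per database."""
    writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(DatabaseEntry)])
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
```

Keeping the scale check inside the record means a mistyped rating is caught when the entry is created rather than when the final list is reviewed.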
Databases Analyzed
  Fall 2016
    Accidents, Arrests, Faculty Salaries, Global Graduation Rates, Halloween Injuries, HIV Studies, Gun Ownership, IRS Info By Zip Code, Loan Complaints, Movies, NFL Teams, NBA Players, Poverty, UFOs
  Spring 2017
    Aviation Safety, Baseball Injuries, Beer, Building Code Violations, Drug Deaths, MLB Players, Olympic Medals, Soccer Injuries, Traffic Stops, Youth Tobacco Use
  Fall 2017
    Sports Betting, Global Films, PGA Driving Statistics, Storms, Hunting, College Athletics Revenue, Airline Delays, Fleet Data, PGA Player Statistics, National Food Expenditures, Campus Crimes
  Spring 2018
    Searching New Industries
    9 Teams
Decisions
  Name of Clearinghouse
  List of 40+ Individuals Interested in Receiving the List, Contributing Data, or Helping Organize the Project
    Several Bad Email Addresses
  DSI DASI Website (Bob McQuaid)
    What to Post on Website
    When to Post It
    Limited Access Testing Phase
  Others Contributing to List
    Vetting Links
    Archiving Databases
  Organizing List of Database Links
    Database Elements Tracked
  Cleaning Up, Certifying, and Maintaining List
    Method and Frequency
  Legal Issues
  Managing Links: Flat Files vs. Directories vs. Querying Software
Managing Links: Flat Files vs. Directories vs. Querying Software
  Gas Prices
    https://catalog.data.gov/dataset/gasoline-retail-prices-weekly-average-by-region-beginning-2007
  UFOs
    http://www.nuforc.org/webreports.html
  Gambling
    http://www.oddsshark.com/sports-betting/parlay-betting
  Injuries
    https://www.cpsc.gov/cgibin/NEISSQuery/home.aspx
  Hospital Compare
    https://www.medicare.gov/hospitalcompare/search.html
  Coffee
    http://datatopics.worldbank.org/consumption/
    http://databank.worldbank.org/data/home.aspx
  Food
    https://ndb.nal.usda.gov/ndb/search/list