Uexplore/Dexter Tutorial


Uexplore/Dexter Tutorial Part 2 More Advanced Topics And Exercises Rev. 5-16-07, jgb By John Blodgett, OSEDA, U of Missouri Columbia; under contract with the Missouri Census Data Center.

Generate a Report Showing … Population change (estimated) for Missouri cities (places) from 2000 to 2004. Show change and pct change sorted by change in population, descending. Only show cities that had growth of at least 100 people and 5% over the period. Use variable labels (rather than names) as column headings in the report.

Navigate to popests Filetype Pop Estimates is a major category with (currently) 4 filetypes. By far the most important of these is popests. Choose the Current version of popests (as opposed to the older estimates in popests2).

Finding the Relevant Dataset The Census Bureau does estimates at various levels, including nation, state, county and “subcounty”. The latter includes cities (“places”) and other sub-county governmental units. Find the relevant dataset by scanning the Datasets.html page in the popests dir. By now you should know that whenever you encounter a Datasets.html file in a data directory you should take advantage of it. It makes finding what you are looking for a lot easier.

The Winner is dataset mosc04 We got here by clicking in the Details column of the mosc04 row of the Datasets.html table.

Determine SumLev Value for Filter It turns out that this set has some non-subcounty data as well (i.e. state & county summaries). We want complete places – level 162. Level 157 would give us place-within-county.

Choose Output Format(s) We will want to switch to PDF output at the end but while creating a complex query it is best to use some other report format that does not take so long to generate.

Create the Filter First row selects complete-place summaries. 2nd row says only places with growth of 100 or more persons. 3rd row says the change must be at least 5%.
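The three filter rows act as a conjunction: a place is kept only if every row holds. A minimal Python sketch of that logic follows, assuming the rows are combined with AND (as the report specification implies). The rows and the column names (sumlev, change, pctchange) are hypothetical stand-ins; the actual variable names in mosc04 may differ.

```python
# Hypothetical rows; names and values are illustrative, not taken from mosc04.
rows = [
    {"areaname": "City A", "sumlev": "162", "change": 850, "pctchange": 12.4},
    {"areaname": "City B", "sumlev": "162", "change": 40, "pctchange": 9.0},
    {"areaname": "City C", "sumlev": "157", "change": 300, "pctchange": 7.5},
    {"areaname": "City D", "sumlev": "162", "change": 2100, "pctchange": 3.1},
    {"areaname": "City E", "sumlev": "162", "change": 120, "pctchange": 5.0},
]


def keep(row):
    """Return True only if all three filter rows hold (assumed to be ANDed)."""
    return (
        row["sumlev"] == "162"      # row 1: complete-place summaries only
        and row["change"] >= 100    # row 2: growth of at least 100 persons
        and row["pctchange"] >= 5   # row 3: growth of at least 5%
    )


# Sort the surviving rows by change in population, largest first.
selected = sorted(
    (r for r in rows if keep(r)),
    key=lambda r: r["change"],
    reverse=True,
)

for r in selected:
    print(r["areaname"], r["change"], r["pctchange"])
```

City D is dropped despite its large absolute growth because it misses the 5% threshold, and City C is dropped because it is a place-within-county (157) record.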

Choose Columns I would normally keep geocode but the report was tight for horizontal space and most users will ignore it so we left it off.

Section IV Important This Time Section IV may not be essential but it is easy to use and comes in real handy when labeling of reports is required.

Use An Option in New Sec V Section V is brand new (summer, 2005) and as of the date this slide was created we still had not created the help page for it. Macho users only. Mostly of use to MCDC personnel.

HTML Output We had to change the % character in the title to Pct because Dexter strips out special characters.

Summary Log for this Query Note the link to a “saved query file”. This is where we store all the specs resulting from your page clicks.

The Dexter Saved Query File Do not concern yourself with the details here. Just know that this is a way to codify your query. We hope at some point to allow for public queries, but for now such entities can only be re-used by authorized MCDC personnel.

Dexter Query Files Are simple text files used to encapsulate a query (i.e. save all the specs so that the query can be rerun.) Written to a temporary file which goes away within 48 hrs. For now, most users cannot replay a saved query. But authorized MCDC personnel can use these to create public queries. Veteran Dexter users may recall our earlier attempts at capturing queries using very long URLs with all the parm specs as passed to Dexter. There were technical problems with that approach which forced us to seek better alternatives, and this led us to develop these query files.

Invoking a Saved Query Can be done via a URL with a parm spec as we see in the current example. The name of a stored query can also be entered on the Dexter query form near the bottom of the page. For techies who care: Dexter looks for the file named &query.txt in the Queries subdirectory of the &path data directory.

Saved Queries Are relatively new to Dexter and not yet fully implemented. Have good potential for creating “virtual” data products. You save the query file that generates the report (and/or csv file) rather than the files. We are experimenting with adding “run-time parms” and generating query front-ends to allow customizing the query. For example, think of turning the growing cities query into one where you could specify the state rather than having it always be Missouri. We have removed several slides here that referenced a saved query that we no longer support. We are still working on a better way to utilize this tool.

Saved Queries and xsamples We are experimenting with a new kind of documentation for using Dexter. Sample Dexter queries are documented and stored in shtml files in an xsamples dir. These sample pages include links to let you view and/or invoke the saved query file. See these at http://mcdc.missouri.edu/xsamples/ As of this writing (5-16-07) we do not have an index page for the xsamples directory, but we obviously will need to create one. We are still experimenting with format and content for these pages and welcome user feedback. One of the things we would like to do at some point is accept user-contributed xsamples to be shared with others and included in this directory. We shall also need to provide linkage to these pages from the Queries subdirectories of the relevant filetype directories.

V. Advanced Options A new section on the Dexter input form, targeted at the more sophisticated user who wants more control over output. As with all Dexter sections, click on the section header to see online documentation. Easily ignored. Pretend it’s not there if you want. Many of the new features are things we want to do when we build public queries.

Advanced Features Example We will use several advanced features here including data aggregation. The dataset we shall access is the latest (thru 2004) county level estimates with components of change since 2000 for the entire country. We have added cbsa (metropolitan and micropolitan area) codes to this dataset. The latest such estimates as of 2006 are thru 2005 and are stored in datasets with names just like the ones we use in the examples except substitute “com05” wherever you see “com04”.

Use Datasets.html Page in popests We choose uscom04 because it has the latest estimates (2004), for the universe wanted (US) and at the level we need (county). When you read this it is likely that there will be a more recent dataset such as uscom05 or uscom06 with the latest estimates. We tend to do everything the same with these data each year so selecting the latest should be OK.

Query Specifications We have a dataset that has county-level data but also has CBSA (core based statistical area) codes identifying the metro area. We want to aggregate (sum up) the pop data to get cbsa-level summaries. To further complicate matters, we have a variable, cbsatype, that tells us whether it is a Metro or Micro (-politan) area. We not only want summaries for each cbsa within state, but we also want a summary for all the cbsa’s of a type (metro/micro) within each state. The states of interest are Illinois, Kansas and Mo. We want HTML output in a custom style.

Define Filter County level summaries only. Code of 99999 indicates not in a CBSA; exclude. Choose states using the postal abbreviations.

Select Columns Not interested in any variables that ID the county (e.g. County, areaname). Only want the variables by which we intend to aggregate and the variables which are to be summed.

Titles and Footnotes

The Really Hard Part Don’t feel bad if this makes no sense to you. It may or may not be easier once we actually have a help page.

Aggregation Specs Aggby: stab cbsatype cbsa indicates that you want Dexter to combine all rows that have the same value for these 3 category variables, summing all the numeric variables from these rows. Agglvl: 2 indicates that you want summaries for the rightmost 2 aggby variables. A summary will be generated for all stab/cbsatype combinations, regardless of the value of cbsa. A value of 3 would have indicated we also want a summary at the state level. A value greater than 3 would be an error. The default level is 1.
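The effect of Aggby and Agglvl can be sketched in a few lines of Python. This is one reading of the slide, not Dexter's actual implementation: level 1 groups on all the aggby variables, and each additional level drops the rightmost remaining variable. The function name, the "level" field, the CBSA codes and the population figures are all invented for illustration.

```python
from collections import defaultdict


def aggregate(rows, aggby, sum_vars, agglvl=1):
    """Sum sum_vars over groups of rows that share the same aggby values.

    Level 1 groups on every aggby variable; each further level drops the
    rightmost remaining variable (an assumed reading of Agglvl).
    """
    if not 1 <= agglvl <= len(aggby):
        raise ValueError("agglvl must be between 1 and the number of aggby variables")
    out = []
    for level in range(1, agglvl + 1):
        keys = aggby[: len(aggby) - level + 1]
        totals = defaultdict(lambda: dict.fromkeys(sum_vars, 0))
        for row in rows:
            group = tuple(row[k] for k in keys)
            for var in sum_vars:
                totals[group][var] += row[var]
        for group, sums in sorted(totals.items()):
            out.append({"level": level, **dict(zip(keys, group)), **sums})
    return out


# Hypothetical county rows (codes and populations are made up).
counties = [
    {"stab": "MO", "cbsatype": "Metro", "cbsa": "10001", "pop04": 100},
    {"stab": "MO", "cbsatype": "Metro", "cbsa": "10001", "pop04": 50},
    {"stab": "MO", "cbsatype": "Metro", "cbsa": "10002", "pop04": 30},
    {"stab": "MO", "cbsatype": "Micro", "cbsa": "10003", "pop04": 20},
    {"stab": "KS", "cbsatype": "Metro", "cbsa": "10004", "pop04": 70},
]

summary = aggregate(counties, ["stab", "cbsatype", "cbsa"], ["pop04"], agglvl=2)
for row in summary:
    print(row)
```

With agglvl=2 the output holds one row per stab/cbsatype/cbsa combination plus one row per stab/cbsatype combination; passing agglvl=3 would add one row per state, and anything larger raises an error, mirroring the rule on the slide.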

More Aggregation Specs Grand Totals? – No means you do not want Dexter to add a summary row at the end of the file with totals for all rows in the entire dataset. Means or Percents – you specify here cols. that cannot be just summed up. They have to be specially processed using something called a weighted average.

Aggregation Specs 3 Weights for Means/Pcts – this is a list of columns to correspond with the list specified just above for Means or Percents. We are saying here that we want the program to weight the value of pctchang using the value of pop00c. Each pctchang value is multiplied by pop00c (“weighted”) prior to aggregation. During the agg step the weighted values are summed. In a post-agg step the sum of the weighted values is divided by the sum of the weights.
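The three steps described here (weight, sum, divide) are the standard weighted average. A small Python sketch with two invented counties shows why it matters; the variable names pctchang and pop00c come from the slide, while the function name and the figures are hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    weighted_sum = sum(v * w for v, w in zip(values, weights))
    total_weight = sum(weights)
    return weighted_sum / total_weight


# Two hypothetical counties in one CBSA (figures invented for illustration).
pctchang = [10.0, 2.0]   # percent change for each county
pop00c = [1000, 9000]    # 2000 population base for each county

print(weighted_mean(pctchang, pop00c))   # 2.8
print(sum(pctchang) / len(pctchang))     # 6.0, the misleading simple mean
```

The weighted result agrees with computing the change directly: the two counties grow from 10,000 to 10,280 persons combined, which is 2.8%. The simple mean of 6.0 overstates growth because it gives the small county the same influence as the large one.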

Just Know That … Whenever you are aggregating and you have a column/variable that is a percentage, you need to specify it in your Means or Percents list, and the corresponding col/var to use in the “Weights for ..” list is a variable containing the value of which it is the percentage. (We call this the “universe variable”). E.g. if the variable is PctAsian (Asians as a pct of total persons) then the weight variable is TotPop – the total persons.

If you misspell a Variable Name When entering variable names in any of the boxes in Section V be extra careful to spell the name exactly. Also be certain that you select the variable in Section III – you cannot aggregate by State if you have not selected State as one of the variables to keep. You will not get a specific error message about this; instead it will just say that there were no observations selected. You can sometimes get help in locating such errors by typing the code “131” in the box near the bottom of the form labeled “Internal use only (_debug opt)”. This results in a stream of technical gobbledygook, but in that stream you can look for SAS-generated error messages shown in red. These will usually tell you where the query failed. E.g. the message might say something such as “The variable pctchange in the DROP, KEEP, or RENAME list has never been referenced.” This is your clue that you typed in a variable name incorrectly (the name of the variable is “pctchang”, without the “e” at the end).

Variables to Drop Not important (99% of the time). The program generates 2 extra variables, named _lvl_ and _nag_ , that occasionally may be useful. _lvl_ indicates the summary level (in our example it would have value 1 or 2). _nag_ keeps a count of how many rows/observations were used to form the output summary row/observation.

Advanced Report Formatting We check the option to use variable labels as column headers in the report. Not very advanced, but it did not fit elsewhere. By variables for report allows specifying one or more variables that are listed on a separate “by line” instead of as a column. ID variables for report (not specified here) are variables listed at the far left of each row to identify the observation (instead of having “Obs”, the observation #, used).

Style to Use for html/pdf Output You get to pick from a menu of 14 or so. Those followed by ** are recommended. The default (sasweb) is minimal blue & white. In this example we chose brick, one of our favorites. Names are not very mnemonic; you just have to try them to see what they look like.

This is an example of what our output looks like using the brick style. What is the data on Obs 12?

Exercise 1 Access 2000 census long-form (sample) data for census tracts in state of Nebraska. Create a csv file where each record corresponds to a census tract and the variables/columns tell us what metro area and county the tract is in and reports the total population and the number and percent of persons who were poor.

Exercise 2 Access filetype stf903x2 (under 1990 census data). Create a file (sas, dbf, or Excel – whichever you prefer) that has the number of Hispanics and pct Hispanics for all census tracts in Greene and Christian counties (MO). Use 2000 tract geography. Do a similar query using filetype sf32000x to get comparable data for 2000. (Not part of the exercise, but we hope you will have tools to merge these 2 results.)

Exercise 3 Access the beareis filetype. Pull data for all counties in the new Jefferson City metro area (CBSA). The id variables for your output should be county, LineCd and LineCdMeaning. The numeric variables here are a time series; select data for 1993 and 2003. In the Advanced Report Formatting section specify county as a by variable and linecd as an ID variable. Select electronics as the style.

Exercise 4 In the beareis filetype we have datasets that report on total transfer payments. Locate the dataset with this data for Mo. Print a report showing total transfer payments for the state of Missouri for each of the most recently-available 10 years. Extra credit option: specify that you want all the trf variables displayed using a dollar12. format.

Exercise 5 Navigate to the filetype georef under the Geography/GIS major category. Access the mocogeos dataset (Missouri county geocodes). Do a plain text report showing all counties in Mo, their FIPS codes & names along with the DED-Region, RPC and dot (MoDOT) region in which they are contained. Extra credit opt: the format code $rpcname. can be used to display names for rpc codes. Specify that you want to see rpc names instead of codes in your report.

Exercise 6 Find the 25 wealthiest counties in the United States per the 2000 census, using Median Household Income as the measure of wealth. Print them out in descending order (highest income first) – just the top 25. Hint: run Dexter twice, the 2nd time applying a filter based on medhhinc. If you’ve been paying attention you should be able to find the appropriate dataset on your own.

Exercise 7 Filetype sf12000x contains the standard extract of data from Summary File 1, 2000 census. As a bonus, we added the 1990 pop for the corresponding geography to this otherwise all-2k dataset. Access the block level dataset for Mo and generate a tab-delimited file showing the 1990 & 2000 pops along with change & % change for all blocks in Adair county.

Exercise 8 The MCDC has done many custom geographic aggregations of 2k census data -- including data for school districts and school districts/counties. In filetype sf32000x, access dataset moschlcos. Print a report showing total pop, rural pop and % rural pop for all the districts within St. Charles county.

Want More Exercises? Take the MCDC Trivia challenge at http://mcdc.missouri.edu/trivia/popests1.shtml Has ten rather challenging exercises, all involving data in the popests filetype (current population estimates). Answers included (sort of).

Thank You As usual, questions and comments are encouraged. E-mail preferred: blodgettj@missouri.edu