Getting to the PID – pitfalls along the way


1 Getting to the PID – pitfalls along the way
Denise Hills Director, Energy Investigations Geological Survey of Alabama

2 Information Mandate: Data at the GSA/OGB
Legally charged to be a repository for data relating to energy and mineral resources. As a state government agency, GSA is mandated to explore for, characterize, and report Alabama’s mineral, energy, water, and biological resources in support of economic development, conservation, management, and public policy for the betterment of Alabama’s citizens, communities, and businesses.

We have a LOT of data, including:
- Geophysical well logs (e.g., left)
- Cores, cuttings, and other physical samples, sometimes with descriptions (e.g., warehouse, top center; slabbed core (SECU D-9-8 Citronelle 11143_A), top right; hand samples, lower left; core samples, center)
- Fluid production and injection information from oil and gas wells (e.g., bottom right)
- Geologic maps

As with many agencies, GSA/OGB has challenges:
- Data discoverability is often difficult
- Much of the available information was analog
- Even digital data was not always “machine-readable”
- Lack of standardization and documentation of data and metadata
- Provenance and quality often poor or unknown

Goal: Improved Data Access

3 State of the Collection.
Essentially chaos (although we are in the process of organizing it). Not only do we have challenges with the physical state of the collection, we are also trying to get the metadata into a state in which we can actually REGISTER it.

4 Things to notice here:
- Non-unique ID
- Missing information (specific sample location)

Other things to note: this is actually one of the better spreadsheets we had. We can backtrack and get locations for most of these (by using the OGB database). What’s even more challenging is when we had NO KNOWN electronic records.
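The two problems called out on this slide (non-unique IDs and missing sample locations) can be checked for automatically before any registration attempt. The following is a minimal sketch, not the GSA's actual tooling: the field names `sample_id` and `location`, the function name `audit_records`, and the example rows are all hypothetical and made up for illustration.

```python
from collections import Counter


def audit_records(records, id_field="sample_id", required=("location",)):
    """Return (duplicate_ids, incomplete_rows) for a list of dict rows.

    duplicate_ids: sorted IDs that occur on more than one row.
    incomplete_rows: indexes of rows with an empty required field.
    Field names are assumptions; adjust them to the real spreadsheet.
    """
    counts = Counter((row.get(id_field) or "").strip() for row in records)
    duplicate_ids = sorted(i for i, n in counts.items() if i and n > 1)
    incomplete_rows = [
        index
        for index, row in enumerate(records)
        if any(not (row.get(field) or "").strip() for field in required)
    ]
    return duplicate_ids, incomplete_rows


# Made-up rows standing in for a legacy sample spreadsheet.
rows = [
    {"sample_id": "C-001", "location": "Sec. 9"},
    {"sample_id": "C-001", "location": ""},
    {"sample_id": "C-002", "location": "Sec. 12"},
]
duplicates, incomplete = audit_records(rows)
print("Duplicate IDs:", duplicates)
print("Rows missing required fields:", incomplete)
```

Rows flagged this way are the ones that need backtracking through other records (for example, the OGB database mentioned above) before they can be registered.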

5 Metadata Archaeology
"Metadata archaeology" is often what you have to do during data rescue, when the metadata you have isn't enough or isn't trusted. The archaeology part comes with figuring out that missing part.

Example: we have geologic cores here that have inconsistent or missing labels and little or no associated digital information. We've had to decipher codes on boxes and dig through other records to try to discover the metadata. We also discovered old paper records associated with these samples that no current employees knew we had and, of course, we've talked with former employees and dug through publications (both formal and in-house).

Digging through all this, layer by layer, to discover the most accurate and trusted metadata is sifting through the past. This process of sifting, discovery, and evaluation is similar to archaeology, thus "metadata archaeology". Hope that explains it :)

6 The workflow is a framework rather than something that MUST BE DONE this way (Individuals and interactions over processes and tools). Even though it’s not perfect, it generally works (Working software over comprehensive documentation).

Semi-automated workflow to capture information from current researchers with minimal effort. For example, information was maintained in several different spreadsheets. These documents contained distinct AND overlapping information. Determining the canonical files/records is a time-consuming process.

By making the legacy data rescue and preservation process as simple as possible through the development of template workflows, such as that presented here, personnel are more likely to adopt and adhere to standards. Template workflows also simplify training of additional personnel to assist in the registration process. Ultimately this increases data and metadata exposure and interoperability. (Hills, 2015)

Suggestions for data rescue plans:
- Prioritize critical data: e.g., a person is about to retire, a sample is about to lose critical information, or the data are used by lots of people
- Develop a workflow: an iterative process to refine; be aware of internal and other existing standards
- Involve the current data holder as well as those not as familiar with the information: the current archivist checks for accuracy, others check for usability
- Conduct exit surveys specific to data
- Register your samples and data: thus, people can FIND the information and use it => becomes VALUABLE

Reference: Hills, D.J., 2015. Let’s make it easy: A workflow for physical sample metadata rescue. GeoResJ, vol. 6, p doi: /j.grj
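The step of reconciling several spreadsheets with distinct AND overlapping information can be partly automated. The sketch below is one possible approach under stated assumptions, not the workflow published in Hills (2015): the function name `merge_sources`, the field names, the sheet names, and the example values are all hypothetical. It merges rows by sample ID, records which sheet each value came from, and lists disagreements for a person to resolve rather than choosing a winner automatically.

```python
def merge_sources(sources, id_field="sample_id"):
    """Merge rows from several named sources, keyed by sample ID.

    Returns (merged, conflicts):
      merged maps ID -> {field: (value, source_name)}, keeping the first
        non-empty value seen for each field along with its provenance.
      conflicts lists (ID, field, kept, rejected) wherever two sources
        disagree, so a person can decide which record is canonical.
    """
    merged = {}
    conflicts = []
    for source_name, rows in sources.items():
        for row in rows:
            key = (row.get(id_field) or "").strip()
            if not key:
                continue  # skipped here; rows without an ID need manual review
            record = merged.setdefault(key, {})
            for field, value in row.items():
                value = (value or "").strip()
                if field == id_field or not value:
                    continue
                if field in record and record[field][0] != value:
                    conflicts.append(
                        (key, field, record[field], (value, source_name))
                    )
                else:
                    record.setdefault(field, (value, source_name))
    return merged, conflicts


# Made-up spreadsheets with overlapping and conflicting information.
sources = {
    "sheet_a": [{"sample_id": "C-001", "depth_ft": "100", "county": ""}],
    "sheet_b": [{"sample_id": "C-001", "depth_ft": "110", "county": "County X"}],
}
merged, conflicts = merge_sources(sources)
for conflict in conflicts:
    print("Needs review:", conflict)
```

Keeping the source name next to each value preserves provenance, which matters when the quality of the original records is poor or unknown.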

7

8 Exit interviews SPECIFIC TO DATA would be extremely helpful down the line. It can be an “interview” or just filling out forms. A key risk is the retirement of individuals who know how to access these data, or who know the metadata needed to make the data usable by others. In fact, it might be good practice to have someone fill out this form as part of the DMP: when a project is completed, the form could be filled out and then periodically reviewed.

For guidance, look at USGS’s page on Exit Survey: Form:

Key Points from USGS
- Do not assume these questions are being asked by other groups. Even if they have policies in place for departing employees, you may find that some of the questions in the form have not been asked and the answers to them are necessary to have once the employee leaves.
- If someone other than the departing employee is filling out the Exit Survey by interviewing the employee, you may want to supply the form before you meet with them so they know what questions will be asked.
- The information captured on the form can be shared with a variety of people: co-workers, supervisors, Science Center Directors, IT personnel, data managers, Records Management personnel, library staff, publication service centers, and lab managers.
- The form is broken into questions about electronic and physical data; you only need to fill out the sections that are relevant to your data.

