Strategies for solving scientific problems using computers.


1 Strategies for solving scientific problems using computers

2 Outline
– Motivation
– A standard framework
– Which tool to use?
– Critical considerations
– Aftermath

3 Motivation
– Most (all?) problems in modern geoscience benefit strongly from computer methods
– A good hypothesis warrants a clear analytical approach
– Make large problems more tractable
– Avoid a posteriori rationalizations as much as possible
– Encourage predictions rather than diagnoses
– Our scientific thought process must be defensible, and so should our methodology

4 Overarching framework for a scientific problem
– Formulate a hypothesis
– Collect new data
– Process this data
– Interpretation
– Review existing research
– Evaluate and present your hypothesis
Where will most of your time/energy be spent?

5 A sub-framework for computer-based problems
– Load data within chosen work environment
– Format this data
– Process this data
– Visualize results
– Scientific interpretation

6 What relevant tools exist?
– Many options are free or free-ish
– Overlapping functionality
– Many are user-extendable

7 Finding the best tool for the job
– Are you already familiar with it?
– Can it already do what you need it to do, or is it conceivable that it could do so after some effort?
– Is it easy (enough) to learn? Is it intuitive?
– Is it fast enough?
– Does it support the command line, a GUI or both?
– Can you understand what it’s doing, or is it a black box?
– Does it have sufficient mathematical functionality?
– Does it have sufficient mapping functionality?
– Can it easily generate reproducible output?
– Is it popular within your field?
– Can its output be shared easily?
– Is it affordable and accessible?

8 Which tool to use for geoscience?
– MATLAB, Python, GMT and ArcGIS are the best current options

9 Intuitive/explicit processing and data visualization

10 Every platform is vulnerable but some more so

11 A sub-framework for computer-based problems
– Load data within chosen work environment
– Format this data
– Process this data
– Visualize results
– Scientific interpretation

12 A directory structure for computer-based problems
research/code/
  current_project
    data (raw)
    mat (formatted)
    fig (useful not pretty)
    old (no need to delete)
    your code
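
The layout on this slide could be created by a short script so that every project starts the same way. The following is a minimal sketch in Python (one of the tools named earlier in the deck), not the presenter's own code; the function name `create_project` is hypothetical, and the subdirectory names are taken from the slide.

```python
from pathlib import Path

# Subdirectory names follow the slide; the values describe each one's role.
PROJECT_SUBDIRS = {
    "data": "raw data, exactly as received",
    "mat": "formatted data, saved after loading",
    "fig": "working figures (useful, not pretty)",
    "old": "retired code (no need to delete)",
}


def create_project(root, name):
    """Create root/name with the standard subdirectories and return its path.

    Existing directories are left untouched, so the call is safe to repeat.
    """
    project = Path(root) / name
    for subdir in PROJECT_SUBDIRS:
        (project / subdir).mkdir(parents=True, exist_ok=True)
    return project


# Example call (paths are illustrative):
# create_project("research/code", "current_project")
```

Active code would then live directly in the project folder, alongside these four subdirectories.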

13 Incidentally, a similar manuscript structure
research/manuscript/
  current_paper
    draft (versioned)
    revised (basically inevitable)
    fig (pretty)
    final (proofs, published)
    master document

14 Loading data
– Load all necessary data first
– This step can be (but is rarely actually) a deal-breaker
– If someone or something generated it, you can almost certainly read it
– A question that will keep coming up: How often will you need to do this?
– The answer is almost always: Much more often than you think
– A valuable habit: Spend the time to record how the data are loaded (i.e., in a script, not just ad hoc at the command line) and where they came from
– Save the MATLAB/etc.-formatted data before processing
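
The habit described on this slide (record the loading step, record the source, save the formatted data before processing) might look like the sketch below. It is written in Python rather than MATLAB and assumes a comma-separated raw file with a header row and purely numeric columns; the function names and the example file names are hypothetical.

```python
import csv
import json
from pathlib import Path


def load_raw_table(raw_path):
    """Read a comma-separated file with a header row into lists of floats.

    Returns a dict mapping each column name to its list of values.
    """
    columns = {}
    with open(raw_path, newline="") as handle:
        for row in csv.DictReader(handle):
            for name, value in row.items():
                columns.setdefault(name, []).append(float(value))
    return columns


def save_formatted(columns, raw_path, formatted_path):
    """Save the loaded columns together with a note on where they came from."""
    record = {"source": str(raw_path), "columns": columns}
    Path(formatted_path).write_text(json.dumps(record, indent=2))
    return record


# Example calls (file names are illustrative):
# columns = load_raw_table("data/profile.csv")
# save_formatted(columns, "data/profile.csv", "mat/profile.json")
```

Because the loading step is a function in a file rather than a sequence of typed commands, repeating it the tenth time costs the same as the first.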

15 How often will you need to do this?
– Only once, I swear:
  – command line and save
  – import data using GUI and save
– Every time I want to do this analysis:
  – Write it down and comment
– Often and with lots of data:
  – Time to consider how to make it faster
– So often that other people will have to do it for me:
  – Consider writing a GUI, which enforces standardization

16 Format data
Data structures to use in descending order of preference:
– scalar
– vector
– matrix
– structure/object
– cell

17 Numeric vs. logical vs. string
Several different data types to consider:
– numeric (MATLAB defaults to double precision signed)
– string
– logical (true/false)

18 Most (all?) data are imperfect
– NaN: Not a Number
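
A short illustration of why NaN needs deliberate handling: it propagates through ordinary arithmetic, and it does not compare equal to itself. The sketch below uses Python's standard library; the depth values and the helper `nan_mean` are hypothetical.

```python
import math

# Hypothetical measurements with one missing value marked as NaN.
depths_m = [12.5, math.nan, 14.0, 13.5]


def nan_mean(values):
    """Mean of the values that are not NaN; NaN if none are valid."""
    valid = [v for v in values if not math.isnan(v)]
    if not valid:
        return math.nan
    return sum(valid) / len(valid)


# A single NaN turns the ordinary mean into NaN.
naive_mean = sum(depths_m) / len(depths_m)

# Skipping NaN gives the mean of the three valid values.
robust_mean = nan_mean(depths_m)

# NaN never equals itself, so test for it with math.isnan, not ==.
nan_equals_itself = math.nan == math.nan  # False
```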

19 Poor variable names
– data, index, constant, var, test, temp, i, j
– any name identical to or confusingly similar to an existing function name
– do not abuse case sensitivity
– names that are not descriptive: you will forget what “A” means
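
As an illustration of the naming advice on this slide, the sketch below contrasts a poor choice with a better one. It is written in Python, where `max` is a built-in function; the variable names and values are hypothetical.

```python
# Poor (left commented out): "data" says nothing about the contents, and
# assigning to "max" would hide Python's built-in max function.
#   data = [3.1, 4.7, 2.9]
#   max = 4.7

# Better: the name states the quantity and its units, and the built-in
# function remains available.
ice_thickness_m = [3.1, 4.7, 2.9]
max_ice_thickness_m = max(ice_thickness_m)
```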

20 Processing data
– Document what, not how, even if you think it’s just for you, because you are your own worst enemy
– Re-use shamelessly, but avoid copy/paste
– Is this a function or a script? Will you re-use it often?
– The other kind of MATLAB cell (code sections that begin with %%, not cell arrays)

21 Visualizing data
– Physically separate visualization code
– Visualize as you’re writing, but not as you’re running
– Again, use cells

22 MATLAB is trying to help you (similar to Word)
– Code could be better
– Code is wrong

23 Whitespace and indentation
– Choose a style and stick with it
(figure labels: No, Better)

24 Order of operations
– Forget it exists and use parentheses instead
(figure labels: Never, Better)
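
A small example of the kind of surprise that explicit parentheses avoid, sketched in Python. Here the exponent operator binds more tightly than the unary minus, so the unparenthesized form is the negated square rather than the square of the negative value. The variable names are illustrative.

```python
x = 2.0

# Relies on precedence: this is read as -(x ** 2), not (-x) ** 2.
ambiguous = -x ** 2

# Parentheses state the intent and remove the need to remember the rule.
squared_negative = (-x) ** 2
negated_square = -(x ** 2)
```

Here `ambiguous` and `negated_square` are both -4.0, while `squared_negative` is 4.0.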

25 Functions warrant error checking
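
One way to read this slide: a function that will be re-used should reject bad input loudly instead of returning a quietly wrong answer. The sketch below is a hypothetical Python example; the function `layer_velocity` and its checks are illustrative and do not come from the presentation.

```python
import math


def layer_velocity(thickness_m, travel_time_s):
    """Return the mean velocity (m/s) through a layer.

    thickness_m: layer thickness in meters, must be non-negative.
    travel_time_s: one-way travel time in seconds, must be positive.
    """
    inputs = (("thickness_m", thickness_m), ("travel_time_s", travel_time_s))
    for name, value in inputs:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{name} must be a real number")
        if math.isnan(value):
            raise ValueError(f"{name} is NaN")
    if thickness_m < 0:
        raise ValueError("thickness_m must be non-negative")
    if travel_time_s <= 0:
        raise ValueError("travel_time_s must be positive")
    return thickness_m / travel_time_s
```

The checks run before any arithmetic, so a failure points at the offending argument by name.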

26 Aftermath
– In the long term, do not keep failed/commented code in a working function/script
– Getting complicated and/or popular? Consider a versioning scheme or repository, e.g., GitHub or RunMyCode

