Collaboration and data sharing in the Distributed Graduate Seminars

1 Collaboration and data sharing in the Distributed Graduate Seminars
Role of NCEAS in the DGS
• Coordination
• Collaborative webspace
• Grad Research Associate
• Informatics support
• Outreach support & publicity
• Time & space for synthesis

2 Collaboration and data sharing in the Distributed Graduate Seminars
Organization of this presentation:
• Basic problems and solutions in data sharing & collaboration
• Introduction to NCEAS collaborative sites
• Overview of NCEAS computing research and support

3 Collaboration & data stewardship
Personal data management problems are magnified in collaboration
• Data organization
• Data documentation
• Data analysis
• Data & analysis preservation
Collaboration and data sharing

4 “Data Entropy” (Michener et al. 1997, Ecological Monographs)
[Figure: information content (y-axis) against time (x-axis). Labels: Time of publication; Specific details; General details; Retirement or career change; Death; Accident]

5 A personal example...

6 2 tables

7 Random notes What is this?

8 Wash Cres Lake Dec 15 Dont_Use.xls

9 What if we want to merge files?
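
One way to merge two files in R is `merge()`, which joins tables on a shared key column. The following is a minimal sketch, assuming two hypothetical tables that share a `SITE` column; the table names, column names, and values are invented for illustration and do not come from the data shown in the slides.

```r
# Hypothetical example tables; names and values are illustrative only.
fish <- data.frame(SITE = c("A", "B", "C"),
                   TROUT = c(12, 0, 5))
temps <- data.frame(SITE = c("A", "B", "D"),
                    TEMPC = c(9.5, 11.2, 13.0))

# Keep every row from both tables; unmatched cells become NA,
# so missing records stay visible instead of silently vanishing.
merged <- merge(fish, temps, by = "SITE", all = TRUE)
print(merged)
```

A merge like this only works when both files use the same key name and the same site codes, which is one reason to standardize data organization before sharing.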

10 What is this?

11 Personal data management problems are magnified in collaboration
• Data organization: standardize
• Data documentation: standardize descriptions of data (metadata)
• Data analysis: document
• Data & analysis preservation: protect

13 Example – using R for data exploration, analysis and presentation

    ### Simple Linear Regression - 0+ age Trout in Hoh River, WA against Temp Celsius

    ### Load data
    HohTrout <- read.csv("Hoh_Trout0_Temp.csv")
    ### See full metadata in Rosenberger, E.E., S.L. Katz, J. McMillan, G. Pess, and S.E. Hampton. In prep. Hoh River trout habitat associations.
    ### http://knb.ecoinformatics.org/knb/style/skins/nceas/

    ### Look at the data
    HohTrout
    plot(TROUT ~ TEMPC, data = HohTrout)

    ### Log-transform the dependent variable (y + 1); this creates a new column in the data frame
    HohTrout$LNtrout <- log(HohTrout$TROUT + 1)

    ### Plot the log-transformed y against x
    ### First open a new graphics window for the next graph; dev.new() works on any platform, unlike windows()
    dev.new()
    plot(LNtrout ~ TEMPC, data = HohTrout)

    ### Regression of log trout abundance on temperature
    mod.r <- lm(LNtrout ~ TEMPC, data = HohTrout)

    ### Add a regression line to the plot
    abline(mod.r)

    ### Check out the residuals in a new window, using a 2 x 2 layout of diagnostic plots
    dev.new()
    layout(matrix(1:4, nrow = 2))
    plot(mod.r)

    ### Check out statistics for the regression
    summary(mod.r)

16 Example – using R for data exploration, analysis and presentation: output of the regression summary

    Call:
    lm(formula = LNtrout ~ TEMPC, data = HohTrout)

    Residuals:
        Min      1Q  Median      3Q     Max
    -1.7534 -1.1924 -0.3294  0.9304  4.2231

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) -0.07545    0.18220  -0.414    0.679
    TEMPC        0.11220    0.01448   7.746 1.74e-14 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 1.365 on 1501 degrees of freedom
    Multiple R-Squared: 0.03844, Adjusted R-squared: 0.0378
    F-statistic: 60 on 1 and 1501 DF, p-value: 1.735e-14

17 Compare this method to:
• Copy and paste from Excel
• Log-transform in Excel
• Copy and paste new file
• Graph in SigmaPlot
• Analyze in Systat
• Graph in SigmaPlot
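
The scripted method can also write its own outputs, which removes the copy-and-paste steps listed above. The following is a minimal sketch using simulated data, since the Hoh River file is not included here: the column names follow the trout script, but the values, the random seed, and the output file names are assumptions made for illustration.

```r
# Simulated stand-in for Hoh_Trout0_Temp.csv; the values are made up.
set.seed(1)
HohTrout <- data.frame(TEMPC = runif(50, min = 5, max = 18))
HohTrout$TROUT <- rpois(50, lambda = exp(0.1 * HohTrout$TEMPC))

# Transform and fit in code, so each step is recorded in the script.
HohTrout$LNtrout <- log(HohTrout$TROUT + 1)
mod.r <- lm(LNtrout ~ TEMPC, data = HohTrout)

# Write the figure to a PDF file (hypothetical file name).
out_dir <- tempdir()
pdf_file <- file.path(out_dir, "trout_vs_temp.pdf")
pdf(pdf_file)
plot(LNtrout ~ TEMPC, data = HohTrout)
abline(mod.r)
invisible(dev.off())

# Write the coefficient table to a CSV file (hypothetical file name).
csv_file <- file.path(out_dir, "trout_vs_temp_coefficients.csv")
write.csv(coef(summary(mod.r)), csv_file)
```

Rerunning the script regenerates both files from the raw data, so a collaborator can check every step without needing a separate graphing or statistics program.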

18 Collaboration & data stewardship
Personal data management problems are magnified in collaboration
• Data organization: standardize
• Data documentation: standardize metadata
• Data analysis: document
• Data & analysis preservation: protect

21 Technological solutions
• Databases that protect original data structure (e.g., not “sorting” in Excel)
• Statistical software that documents transformations, sorts, analysis, etc. (e.g., R, Matlab, SAS, JMP, PRIMER)
• Free software for data management & analysis
• Morpho, free software that standardizes descriptions of data (metadata)
• Central collaborative space organizing communications
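
The point about protecting the original data structure can be illustrated in R, where sorting produces a new object and leaves the original untouched. The following is a minimal sketch with a made-up data frame; the values are invented, and only the column names echo the trout example.

```r
# Made-up data; only the column names echo the trout example.
raw <- data.frame(TROUT = c(3, 0, 7, 1),
                  TEMPC = c(12.1, 8.4, 15.0, 9.9))

# order() returns row positions; indexing with them builds a sorted copy.
by_temp <- raw[order(raw$TEMPC), ]

# The original row order is unchanged, and whole rows moved together.
print(raw)
print(by_temp)
```

Because the sort acts on whole rows, a value cannot be separated from the rest of its record, which is the usual risk when a single column is sorted in a spreadsheet.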

22 Communication for collaboration
Emphasis is on accessibility
• Open to collaborators
• Archived for the future
• Highly organized
• Searchable

23 Communication for collaboration
Emphasis is on accessibility
Tools compared: Email, Discussion Forum, Blogs, Chat
• Open to collaborators: √ √ √
• Archived for the future: √ √ √ √
• Highly organized: √
• Searchable: √ √ √ √

24 NCEAS Collaborative Websites
What we have in place:
• User-editable website
• Discussion Forums
• Realtime chat with archived transcripts
What others have used:
• Skype
• Google Docs
• Zoho
• GoToMeeting

26 NCEAS Collaborative Websites
• Easy file uploading
• PDF library
• Discussion boards
• Wiki-style collaborative page editing
• Real-time chat
All is searchable

27 NCEAS Collaborative Websites
• Making useful tools for the students and faculty is critical
• Technology helps, but many of the problems are more social than technical
• Find a mixture that helps you get your work done

29 traits-dgs.nceas.ucsb.edu
Please create an account: http://traits-dgs.nceas.ucsb.edu/
Explore a previous site: http://dgs.nceas.ucsb.edu (name: student, password: stu00)

30 General NCEAS Tech Support

31 NCEAS recycles!

32 NCEAS Tech Support

33 NCEAS collaboration support http://help.nceas.ucsb.edu

35 NCEAS Analytical Support

39 Ecoinformatics at NCEAS

43 Metadata - Describing your data
Metadata (a description of your data) makes data useful
Needed to:
• Assess & interpret data
• Provide context for data
But how to handle different styles of describing data? e.g., abundance vs. biomass
Ecological Metadata Language (EML)
• Metadata standard
• Human-readable, yet machine-interpretable
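
Before adopting a formal standard such as EML, the idea can be sketched as a small data dictionary kept alongside the data. The sketch below is plain R, not EML: the column definitions and units are assumptions inferred from the trout script, and `describe_column` is a hypothetical helper written for this illustration.

```r
# A hand-written data dictionary: one row per column of the data file.
# Definitions and units are assumptions inferred from the trout script.
dictionary <- data.frame(
  column     = c("TROUT", "TEMPC"),
  definition = c("Count of age-0+ trout", "Water temperature"),
  unit       = c("number of fish", "degrees Celsius"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up the description of one column.
describe_column <- function(name, dict = dictionary) {
  row <- dict[dict$column == name, ]
  if (nrow(row) == 0) stop("No metadata for column: ", name)
  paste0(row$column, ": ", row$definition, " (", row$unit, ")")
}

describe_column("TEMPC")
```

Recording the unit next to each column is what lets a collaborator tell an abundance from a biomass; a standard such as EML serves the same purpose in a shared, machine-readable form.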

44 Metadata - Describing your data What EML metadata looks like...

45 Metadata - Describing your data
1. Use our basic web interface
2. Install Morpho desktop software (free)
• Create & save metadata using wizards
  a. Manage data on your own computer
  b. Share metadata/data with the broader community
  c. Share with specific colleagues: set access privileges
• Search and locate data and metadata on the network

46 Metadata - Describing your data

47 Search knb.ecoinformatics.org: NCEAS, LTER Network, OBFS, PISCO, ESA, UC Natural Reserve System, and more

48 NCEAS recycles!

