1
Science Gateways and their tremendous potential for science and engineering
Nancy Wilkins-Diehr, TeraGrid Area Director for Science Gateways
2
Outline History of internet NSF involvement Gateways Impact on science
Changes in web development Impact on public perception of science NSF involvement CDI PACI, ITR, build up to gateways Gateways Just beginning to imagine the promise of gateways Gateways and TeraGrid TG overview (disciplines, usage, users) Current gateway development Gateway successes Nancy Wilkins-Diehr
3
Thank You for the Invitation to Speak
To such a distinguished audience in such a beautiful location
Many similarities between Banff and Gateways
- Both are about connections
- National park created due to sea-to-sea railway connection
- Trail guides lead the way
- “Peyto assumes a wild and picturesque, though somewhat tattered attire” describes Banff trail guides and gateway developers!
4
Phenomenal Impact of the Internet on Worldwide Communication and Information Retrieval
Only 15 years since the release of Mosaic! Implications for the conduct of science are still evolving.
- 1980s: Early gateways. National Center for Biotechnology Information BLAST server, search results sent by email, still a working portal today
- 1989: First ftp archive (archie) created at McGill
- 1992: Mosaic web browser developed
- 1995: “International Protein Data Bank Enhanced by Computer Browser”
- 2004: TeraGrid project director Rick Stevens recognized growth in scientific portal development and proposed the Science Gateway Program
Simultaneous explosion of digital information
- Analysis needs in a variety of scientific areas
- Sensors, telescopes, satellites, digital images and video
- #1 machine on Top500 today is 300x more powerful than all combined entries on the first list in 1993
5
1998 Workshop Highlights Early Impact of Internet on Science
- Shared access to geographically dispersed resources
- Assembling the best minds to tackle the toughest problems regardless of location
- Tackling the same problems differently, but also tackling different problems
- Not only the scope, but the process of scientific investigation is changed
“As the chemical applications and capabilities provided by collaboratories become more familiar, researchers will move significantly beyond current practice to exciting new paradigms for scientific work”
Requirements for future success include:
- Development of interdisciplinary partnerships of chemists and computer scientists
- Flexible and extensible frameworks for collaboratories
- Means to deploy, support, and evaluate collaboratories in the field
6
Finholt: Internet challenges status quo in chemistry research
Since the birth of modern chemistry in the early 19th century there has been tremendous growth in the knowledge and the practical application of chemical principles. However, in many important ways, the practice of chemistry research and teaching has remained unchanged. The advent of the Internet as a worldwide mechanism for conducting scientific communication challenges this status quo. Specifically, innovations like collaboratories, or network-based virtual laboratories, remove constraints of distance and time on scientific collaboration. Collaboratories increase access to scarce instruments, accelerate the flow of information, and place new demands on senior scientists to mentor students. Nancy Wilkins-Diehr
7
Bair: Internet revolutionizes not only scope, but process of scientific investigation
High-speed computation provides the means to examine and simulate systems at unprecedented levels of detail and accuracy Large-scale databases enable analysis of the prodigious volumes of data Coupling technologies with communications revolutionizes not only the scope but also the process of scientific investigation Distributed computing and communications technologies enable researchers to access data, instruments, and expertise independent of their location Abilities of geographically distributed research teams include organization, close-knit interaction, and rapid response, needed to address increasingly challenging research problems. Reduction in travel and equipment costs, increased access to large facilities also a plus Nancy Wilkins-Diehr
8
Potential and Criteria for Success Even More Pronounced Today
As the chemical applications and capabilities provided by collaboratories become more familiar, researchers will move significantly beyond current practice to exciting new paradigms for scientific work Requirements for future success include: Development of interdisciplinary partnerships of chemists and computer scientists Flexible and extensible frameworks for collaboratories Means to deploy, support, and evaluate collaboratories in the field Ray Bair, Argonne National Lab Nancy Wilkins-Diehr
9
Rapid Advances in Web Usability
First generation Static Web pages Second generation Dynamic, database interfaces, cgi Lacked the ease of use of desktop applications Third generation True networked and internetworked applications that enable dynamic two-way, even multi-way, communication and collaboration on the Web. Remarkable new uses of the Web in the organizational workplace and on the Internet Source: Screen Porch White Paper, The University of Western Ontario (1996) Nancy Wilkins-Diehr
10
What’s Next?
“Prediction is hard. Especially about the future.” Yogi Berra
Scientists of tomorrow are familiar with media we don’t even know about
Not using the full power of the internet by any means today
Data and knowledge are handled differently
- Linking publications and data referenced in those publications
- Annotation, data provenance
- Inability to create discourse around a piece of data
Ability to keep up with knowledge generation
- 16,000 papers a week into PubMed
- 50,000 papers a week in biology
- Right now have a choice between reading the abstract or the paper; might add a 10-minute author clip
How can science motivate in the way YouTube can?
- Streaming video to view simulations, using visual and sound media
- iPods everywhere, but not exploited for science
Web 2.0
- Science was an earlier internet adopter, now overtaken by business
- Now a big difference between commercial and scientific sites
- Noticeable efforts to keep users on commercial sites
Source: 5/14/07 interview with Dr. Philip Bourne, Protein Data Bank
11
The Internet as a Resource for News and Information about Science
John B. Horrigan, Associate Director. November 20, 2006
Summary of Findings at a Glance
- 40 million Americans rely on the internet as their primary source for news and information about science.
- For home broadband users, the internet and television are equally popular as sources for science news, and the internet leads the way for young broadband users.
- The internet is the source to which people would turn first if they need information on a specific scientific topic.
- The internet is a research tool for 87% of online users. That translates to 128 million adults.
- Consumers of online science information are fact-checkers of scientific claims. Sometimes they use the internet for this, other times they use offline sources.
- Convenience plays a large role in drawing people to the internet for science information.
- Happenstance also plays a role in users’ experience with online science resources. Two-thirds of internet users say they have come upon news and information about science when they went online for another reason.
- Those who seek out science news or information on the internet are more likely than others to believe that scientific pursuits have a positive impact on society.
- Internet users who have sought science information online are more likely to report that they have higher levels of understanding of science.
- Between 40% and 50% of internet users say they get information about a specific topic using the internet or through email.
- Search engines are far and away the most popular source for beginning science research among users who say they would turn first to the internet to get more information about a specific topic.
- Half of all internet users have been to a website which specializes in scientific content.
- Fully 59% of Americans have been to a science museum in the past year.
- Science websites and science museums may serve effectively as portals to one another.
- The convenience of getting scientific material on the web opens doors to better attitudes and understanding of science.
Key points for gateways
- 87% of internet users use it for research
- Half of all internet users have been to a website that specializes in scientific content
- Those who seek out science information on the internet are more likely to believe that scientific pursuits have a positive impact on society (EOT components of gateways are important)
12
NSF (my sponsor) has long recognized the importance of science and technology interactions
Interdisciplinary programs did much to facilitate application-technology integration and develop standard tools 1997 PACI Program Marriage of technologists and application scientists A few groups served as path finders and benefited tremendously NPACI neuroscience thrust in 1997 leads to Telescience portal and BIRN in 2001 Information Technology Research (ITR) NSF Middleware Initiative (NMI) Plug and play tools so more groups can benefit Nancy Wilkins-Diehr
13
NSF Continues Its Leadership Today What Will Lead to Transformative Science?
“Virtual environments have the potential to enhance collaboration, education, and experimentation in ways that we are just beginning to explore.” “In every discipline, we need new techniques that can help scientists and engineers uncover fresh knowledge from vast amounts of data generated by sensors, telescopes, satellites, or even the media and the Internet.” Gateways are a terrific example of interfaces that can support transformative science Nancy Wilkins-Diehr
14
Flagship US$52M CDI Program Launched in 2008
Cyber-enabled Discovery and Innovation (CDI) is “NSF’s bold five-year initiative to create revolutionary science and engineering research outcomes made possible by innovations and advances in computational thinking.” Program announced October 1 Bold multidisciplinary activities that, through computational thinking, promise radical, paradigm-changing research findings Far-reaching, high-risk science and engineering research and education agendas that capitalize on innovations in, and/or innovative use of, computational thinking Partnerships to involve investigators from academe, industry and may include international entities Growth to US$250M recommended by 2012 Funded across NSF directorates Birds-of-a-feather session at SC07 in Reno, NV The power of new information and communications allows us to investigate phenomena of increasing complexity, scale and scope. But researchers are finding it increasingly difficult to cope with the flood of data from improved observational tools, to assimilate different data formats and ontologies--atomic to the cosmic--and to find ways to store and archive petabyte-sized databases. In 2008, NSF will invest $52 million in a new initiative we call Cyber-enabled Discovery and Innovation, or CDI. CDI will explore a new generation of computationally-based discovery concepts and tools at the intersection of the computational world and the physical and biological worlds. In every discipline, we need new techniques that can help scientists and engineers uncover fresh knowledge from vast amounts of data generated by sensors, telescopes, satellites, or even the media and the Internet. Understanding complex interactions in systems ranging from living cells to binary star systems, or from computer networks to societies, also present challenges. We need improved simulation and other dynamic modeling techniques to support experiments with complex systems--from earthquakes to brains--that are not feasible to perform in the physical world. 
Finally, virtual environments have the potential to enhance collaboration, education, and experimentation in ways that we are just beginning to explore. CDI educational research efforts will center on a combination of virtual environments and advanced cyberinfrastructure. CDI will tackle all of these challenging research problems. Nancy Wilkins-Diehr
15
Don’t use Knowledge extraction Complex interactions
Data mining, visualization, petascale computational power, etc. to assist scientists and engineers extract most important information from the almost infinite amounts of data from sensors, telescopes, satellites, the media, the Internet, surveys, etc. Complex interactions Scaling from the quantum- to the nano- to the macro-scales, large number of interacting elements, non-linearity of interactions Computational experimentation Simulation allows experimentation with complex systems like tornado development and brain surgery unavailable in the real world Virtual environments Collaboration among diverse populations spread across geographic distances and time zones Educating researchers and students Integration of computational discovery techniques into the basic education of all scientists and engineers stated as an explicit goal to realize the full potential of this program Nancy Wilkins-Diehr
16
Three Thematic Areas Offer Diversity
From Data to Knowledge Enhancing human cognition and generating new knowledge from a wealth of heterogeneous digital data Data mining, visualization, petascale computational power, etc. to assist scientists and engineers extract most important information from the almost infinite amounts of data from sensors, telescopes, satellites, the media, the Internet, surveys, etc. Understanding Complexity in Natural, Built, and Social Systems Deriving fundamental insights on systems comprising multiple interacting elements Simulate and predict complex stochastic or chaotic systems Explore and model nature’s interactions, connections, complex relations, and interdependencies, scaling from sub-particles to galactic, from subcellular to biosphere, and from the individual to the societal Building Virtual Organizations Facilitate creative, cyber-enabled boundary-crossing collaborations, including those with industry and international dimensions Advance the frontiers of science and engineering and broaden participation in science, technology, engineering and math fields Nancy Wilkins-Diehr
17
Exciting Canadian Activities
September 13, 2007 announcement of $30M CANARIE program Network-Enabled Platforms (NEP) Collaborative projects that accelerate the development of, and participation in, national and international cyberinfrastructure and e-Research platforms. Participants in the Program can be from both the public and private sectors. Infrastructure Extension Program (IEP) Extensions to Canada's research and education network that will enhance and accelerate research, enable national and international collaboration, improve access to knowledge, and contribute to the development of cyberinfrastructure and e-research in Canada. Nancy Wilkins-Diehr
18
Science Gateways are a Natural Extension of Internet Developments
3 common types of gateway Web portal with users in front and services in back Client server model where application programs running on users' machines (i.e. workstations and desktops) and accesses services Bridges across multiple grids, allowing communities to utilize both community developed grids and shared grids Continued rapid changes ahead, must be adaptable, gateways can provide some nimbleness Scientific gateways can have varying goals and implementations. Some expose specific sets of community codes so that anonymous scientists can run them. Others may serve as a community portal that brings a broad range of new services and applications to the community. Some may provide access to data collections or the ability to create data products by analyzing data in a collection. Some provide remote visualization. A common trait of all gateways is their interaction with the TeraGrid through the various service interfaces that TeraGrid provides. Nancy Wilkins-Diehr
19
Gateway Idea Resonates with Scientists
Capabilities provided by the Web are easy to envision because we use them in every day life Researchers can imagine scientific capabilities provided through a familiar interface Groups resonate with the fact that gateways are designed by communities and provide interfaces understood by those communities But also provide access to greater capabilities on the back end without the user needing to understand the details of those capabilities Scientists know they can undertake more complex analyses and that’s all they want to focus on But this seamless access doesn’t come for free. It all hinges on very capable developers Nancy Wilkins-Diehr
20
Trust and Reliability are Fundamental to Success
Fundamental in business applications Fundamental for science too The public gains confidence in internet sites that provide accurate information reliably Pub Med National Cancer Institute Google Paypal For scientists it takes far longer to build this confidence Scientists will not rely on gateway tools to conduct their analysis and store their research results unless they have ultimate confidence in the interfaces Proven track record Run by reputable organization Have been in existence “a long time” Provide accurate results Work repeatedly Confidence in PDB developed over 30 years, started with community mandate that proteins must be deposited before publications would be accepted Nancy Wilkins-Diehr
21
How can we build interfaces that scientists will trust?
Expertise Simple web pages are easy to design Complex capabilities, particularly those involving grid access, take knowledgeable developers to create a production product LEAD, nanoHUB show what investment can do Sustained funding Most science groups have money for research, not portal building or ongoing support for portals Knowledge transfer Must take advantage of industry advancements Investments must result in building blocks that other applications can use Many gateways have similar issues Data access Analysis capabilities User work environments Workflow capabilities Nancy Wilkins-Diehr
22
Tremendous Opportunities Using the Largest Shared Resources - Challenges too!
What’s different when the resource doesn’t belong just to me? Resource discovery Accounting Security Proposal-based requests for resources (peer-reviewed access) Code scaling and performance numbers Justification of resources Gateway citations Tremendous benefits at the high end, but even more work for the developers Potential impact on science is huge Small number of developers can impact thousands of scientists But need a way to train and fund those developers and provide them with appropriate tools Nancy Wilkins-Diehr
23
What is the TeraGrid? A unique combination of fundamental CI components
- Dedicated high-speed, cross-country network
- Staff & Advanced Support
- 20 Petabytes Storage
- 2 PetaFLOPS Computation
- Visualization
24
What is the TeraGrid?
NSF-funded facility to offer high end compute, data and visualization resources to the nation’s academic researchers
- 300+ Teraflops Computation
- Visualization
- 20+ Petabytes Storage
- Dedicated cross-country network
25
Opportunities and Challenges as a Virtual Organization
Full vision of cyberinfrastructure Data, compute, visualization, workflows But need to do a better job of representing the capabilities to researchers Creating prototypes for others to follow Never underestimate the value in keeping things SIMPLE Work with top notch people regardless of location Better for end users Single request process for all types of resources Single place for documentation But must work harder To sustain momentum in projects Set a few high-level goals Clear management structure Individual responsibility Project accountability To provide clarity for users Nancy Wilkins-Diehr
26
TeraGrid Resources Available for all Domain Scientists At no cost to them!
Integrated, persistent, pioneering resources Significantly improve the ability and capacity to gain new insights into the most challenging research questions and societal problems Peer-reviewed, proposal-based access Targeted support available as well Dedicated staff investment to really make a difference on complex problems Transformational science Must have PI commitment Make lessons learned available for all Nancy Wilkins-Diehr
27
TeraGrid Usage
[Chart: Compute Cycles Delivered, in Normalized Units (millions), split into Specific Allocations and Roaming Allocations; ~50% annual growth; June 2006 marked]
TeraGrid users are awarded “allocations” of time based on peer review that takes place on a quarterly basis. New users can apply for development allocations, allowing them to begin computing within 2-3 weeks of initial request. This slide shows overall TeraGrid usage from Jan 2004 through June 2007, including the addition of NCSA and SDSC resources in April. Roaming usage refers to allocations that allow the user to compute on any TeraGrid resource; the other usage shown is from allocations on specific resources. TeraGrid currently delivers an average of 420,000 cpu-hours per day -> ~21,000 DC every hour
Source: Dave Hart Nancy Wilkins-Diehr Charlie Catlett
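The usage figures on this slide lend themselves to a quick back-of-the-envelope check. The sketch below is illustrative only: the two constants are taken from the slide, while the function names and the assumption that ~50% annual growth simply continues are hypothetical additions, not part of the talk. Dividing the daily average by 24 gives 17,500 cpu-hours per wall-clock hour; the slide's "~21,000 DC every hour" uses a unit the slide does not define, so it is not reproduced here.

```python
# Back-of-the-envelope arithmetic for the usage figures quoted on the slide.
# The two constants come from the slide; everything else is an illustrative
# assumption (in particular, that ~50% annual growth simply continues).

CPU_HOURS_PER_DAY = 420_000   # "average of 420,000 cpu-hours per day"
ANNUAL_GROWTH = 0.50          # "~50% annual growth"


def cpu_hours_per_hour(cpu_hours_per_day: float) -> float:
    """Average cpu-hours delivered in each wall-clock hour."""
    return cpu_hours_per_day / 24


def projected_cpu_hours_per_day(current: float, years: int,
                                growth: float = ANNUAL_GROWTH) -> float:
    """Compound the daily figure forward by a fixed annual growth rate."""
    return current * (1 + growth) ** years


if __name__ == "__main__":
    print(cpu_hours_per_hour(CPU_HOURS_PER_DAY))
    for years in range(4):
        print(years, projected_cpu_hours_per_day(CPU_HOURS_PER_DAY, years))
```

Under these assumptions, two more years at the same rate would slightly more than double the daily delivery (a factor of 2.25), which gives a rough sense of how quickly demand from new communities such as gateways could add up.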
28
TeraGrid User Community
[Chart: TeraGrid user community; labels: June 2006, Gateways, Growth Target]
Source: Dave Hart Nancy Wilkins-Diehr Charlie Catlett
29
Easy TeraGrid Gateway True and False Test Answers Provided
- Any PI can request an allocation and use it to develop a gateway (T)
- Gateway design is community-developed and that is the core strength of the program (T)
- TeraGrid staff are alerted to gateway work when a proposal is reviewed or when a community account is requested (T)
- Limited TeraGrid support can be provided for targeted assistance to integrate an existing gateway with TeraGrid (T)
- TeraGrid selects all gateways (F)
- TeraGrid designs all gateways (F)
- TeraGrid limits the number of gateways (F)
- All gateways need TeraGrid funding to exist (F)
30
TeraGrid RATs (Requirements Analysis Teams)
Spring, Science Gateway Requirements Analysis Team (RAT) Identification of common needs across the gateways Goal is production use of TG resources in the gateway as well as development of process and policy within TG for scalable gateway program and services Tremendous sharing of experiences amongst talented developers Nancy Wilkins-Diehr
31
2006 – Implementing Common Gateway Requirements
Web Services
- GT4 deployment, identification of remaining capabilities
- Information services, WebMDS
Auditing
- Need to retrieve job usage info on production resources
- GRAM audit deployed in test mode in September, inclusion in CTSSv4
Community Accounts
- Policy finalized, security approaches being tested by RPs
- Attribute-based authentication testing
Allocations
- Changes in allocation procedures, the mechanisms used to evaluate science impact, and models for identity management, authentication and authorization that are more tuned to virtual organizations
Scheduling
- Metascheduling RAT
- On-demand via SPRUCE framework
Outreach
- Talks, schools/workshops (NVO, GISolve), major project demonstrations (LEAD)
- SURA, HASTAC, GEON, CI-Channel, SC, Grace Hopper, MSI-CI2, Lariat, Science Workflows and On Demand Computing for Geosciences Workshop
Primer
- Living document in wiki, provides up-to-date overview and instructions for new gateway developers (“how to make your portal a TeraGrid science gateway”)
32
Current Activities – Moving Forward!
Extend development of general gateway services React to and anticipate community needs Streamlined TeraGrid integration means more interest and more science Building Blocks for Science Gateways ( Continue targeted work with selected projects SidGrid, CReSIS Stay ahead of technology changes Well, at least not get too far behind… Build on burgeoning interest in gateways for education Navajo Technical College TeraGrid EOT supplemental funding Nancy Wilkins-Diehr
33
Planning for the Future of TeraGrid
Activity led by U Michigan School of Information
Gateway (June) and user (August) workshops held
Report due February, 2008
Recommendations from gateway workshop include:
- Support interaction and cross-fertilization among Science Gateway development communities
- Sharing code and successful solutions
- Financial and professional support for developing gateways
- Develop gateway framework templates built upon toolkits which may already exist
- Training, education, workshops, generalized & standardized basic services, documentation
- End-to-end support for Virtual Organizations
- Operating more effectively as a community in order to better support the education and development needs of gateway developers
34
Selected Gateway Highlights
nanoHUB Linked Environments for Atmospheric Discovery (LEAD) GridChem Biomedical Informatics Research Network (BIRN) Center for Remote Sensing of Polar Icesheets (CReSIS) Nancy Wilkins-Diehr
35
Highlights: NanoHub Explosive User Growth
In past 12 months 26,000 users 50% of usage from U.S. 10 courses viewed by over 6,000 users 165 podcasts downloaded by over 4,000 users 1400 online meetings Short clip from Gerhard Klimeck Nancy Wilkins-Diehr
36
Highlights: LEAD Inspires Students Advanced capabilities regardless of location
A student gets excited about what he was able to do with LEAD
“Dr. Sikora: Attached is a display of 2-m T and wind depicting the WRF's interpretation of the coastal front on 14 February. It's interesting that I found an example using IDV that parallels our discussion of mesoscale boundaries in class. It illustrates very nicely the transition to a coastal low and the strong baroclinic zone with a location very similar to Markowski's depiction. I created this image in IDV after running a 5-km WRF run (initialized with NAM output) via the LEAD Portal. This simple 1-level plot is just a precursor of the many capabilities IDV will eventually offer to visualize high-res WRF output. Enjoy! Eric” ( , March 2007)
37
Highlights: GridChem’s Client-Server Approach Provides Power and a Rich Feature Set
200 users, 500,000 CPU hours delivered
National Center for Supercomputing Applications
Source: Sudhakar Pamidighantam, NCSA
38
Biomedical Informatics Research Network (BIRN)
BIRN is a National Center for Research Resources (NCRR) initiative aimed at creating a testbed to address biomedical researchers
Source: Anthony Kolasny, Johns Hopkins
39
Source: Anthony Kolasny, Johns Hopkins
Shape Analysis - A Morphometry BIRN Project 4 JHU CIS-KKI Shape Analysis of Segmented Structures 3 MGH Segmentation 5 BWH Visualization TeraGrid Supercomputing Data Donor Sites Storage Goal: comparison and quantification of structures’ shape and volumetric differences across patient populations 1 De-identification And upload 2 Source: Anthony Kolasny, Johns Hopkins
40
BIRN uses SSHFS to mount TeraGrid filesystems locally
August 2005 CIS has 87TB of local storage. /cis/net lists network drives. 220TB through CIS portal using autofs, samba, smbwebclient. Nancy Wilkins-Diehr Source: Anthony Kolasny, Johns Hopkins University Charlie Catlett
41
CReSIS (Center for Remote Sensing of Ice Sheets)
Awarded CI-TEAM funding to build a Polar Gateway International Polar Year Led by Geoffrey Fox, IU and Linda Hayden, Elizabeth City State CReSISGrid Build a TeraGrid Science Gateway Provide broad-based educational and training activity in Cyberinfrastructure for remote sensing and ice sheet dynamics Lessons learned in remote data gathering can be applied to fields Nancy Wilkins-Diehr
42
When is a gateway appropriate?
Researchers using defined sets of tools in different ways Same executables, different input GridChem, CHARMM Creating multi-scale workflows Datasets Common data formats National Virtual Observatory Earth System Grid Some groups have invested significant efforts here caBIG, extensive discussions to develop common terminology and formats BIRN, extensive data sharing agreements Difficult to access data/advanced workflows Sensor/radar input LEAD, GEON Nancy Wilkins-Diehr
43
Tremendous Potential for Gateways
In only 15 years, the Web has fundamentally changed human communication Science Gateways can leverage this amazingly powerful tool to: Transform the way scientists collaborate Streamline conduct of science Influence the public’s perception of science Like e-commerce, Science Gateways need to build trust in the infrastructure, tools, and methods that they use Unlike the public or commercial arena, scientists will be vested in these gateways Science Gateways will need to build trust in the organization behind them. Gateways need to have continuity High end resources can have a profound impact The future is very exciting! Nancy Wilkins-Diehr
44
Thank you for your attention
Enjoy the Summit! Please contact me for further information. Nancy Wilkins-Diehr