Distributed Geospatial Information Processing (DGIP) Prof. Wenwen Li School of Geographical Sciences and Urban Planning 5644 Coor Hall
Outline 1. Centralization & Distributed 2. Distributed system 3. Distributed process 4. Distributed Geospatial Information Processing (DGIP) 5. Computing platforms 6. Summary
Centralization & Distributed. Why distributed? Digital Earth (DE); Internet techniques; Global Earth Observation Service (GEOS); online data analysis; collaboration; distributed data sources; collaborative analysis; parallel computing; cloud computing.
Distributed System: Definition. A distributed system is a collection of independent computers that appears to its users as a single coherent system. Two key elements: independent computers and middleware. *
Distributed System: Middleware. Middleware is software that connects different applications; it is the layer that lies between the operating system and the applications. Middleware is often available as off-the-shelf software. Middleware enables communication and input/output. Middleware supports complex, distributed applications. Middleware is commonly built on Extensible Markup Language (XML), Simple Object Access Protocol (SOAP), Web services, service-oriented architecture (SOA), Web 2.0 infrastructure, and Lightweight Directory Access Protocol (LDAP).
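Distributed System. Distributed computing system: cluster computing and grid computing. Distributed information system: integrates network-based applications that have poor interoperability. Distributed pervasive system: unlike the systems above, whose nodes are fixed and have a more or less permanent connection to a network, its nodes are typically small, mobile devices with wireless connections.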
Distributed Processing: Definition. Distributed processing refers to using multiple computers (or computing resources) to run an application. Parallel processing refers to using multiple CPUs in one computer to run an application.
Distributed Geospatial Information Processing (DGIP): Advantages of DGIP. An infrastructure and platform built on the principles of DGIP architecture and algorithms; interoperable approaches for sharing widely distributed and heterogeneous geospatial resources; a computing infrastructure for sharing widely distributed computing resources; intelligent approaches for sharing knowledge using ontology and distributed resources; best practices for the GEOSS societal benefit areas.
DGIP: Characteristics of DGIP. Transparency; openness; concurrency; scalability; flexibility; sharing and communication.
DGIP: Types of DGIP. Multiprocessor architectures; client-server architectures; distributed object architectures; inter-organizational computing.
DGIP: Multiprocessor architecture*. The simplest distributed system model. The system is composed of multiple processes, which may run on different processors. It is the architectural model of many large real-time systems. The distribution of processes to processors may be pre-ordered or may be under the control of a dispatcher. * Sommerville, I. (2001). Software Engineering,
DGIP: Multiprocessor architecture (figure)
DGIP: Client-server architectures. The model consists of service providers (servers) and service clients. Clients know of servers, but servers need not know of clients. Clients and servers are logical processes. The mapping of processors to processes is not necessarily 1:1. * Sommerville, I. (2001). Software Engineering,
DGIP: Client-server architectures (figure)
DGIP: Architecture layers: presentation layer; application processing layer; data management layer. Thin and fat clients: in the thin-client model, the server carries out application processing and data management, and the client is responsible only for the presentation software. In the fat-client model, the server carries out only data management, and the client is responsible for application processing and the presentation software.
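The difference between the two client models comes down to where the application processing layer runs. The following is a minimal in-process sketch, not a networked implementation: the `Server`, `ThinClient`, and `FatClient` classes and the station elevations are hypothetical and only illustrate the split of responsibilities.

```python
class Server:
    """Hypothetical server holding station elevations in metres (made-up values)."""

    def __init__(self):
        self._elevations = {"A": 120.0, "B": 340.0, "C": 200.0}

    def fetch_records(self):
        # Data management layer: hand back a copy of the raw records.
        return dict(self._elevations)

    def mean_elevation(self):
        # Application processing performed on the server (used by the thin client).
        values = self._elevations.values()
        return sum(values) / len(values)


class ThinClient:
    """Presentation only; processing and data management stay on the server."""

    def __init__(self, server):
        self.server = server

    def show(self):
        return f"Mean elevation: {self.server.mean_elevation():.1f} m"


class FatClient:
    """Application processing and presentation; the server only manages data."""

    def __init__(self, server):
        self.server = server

    def show(self):
        records = self.server.fetch_records()
        mean = sum(records.values()) / len(records)  # processing on the client
        return f"Mean elevation: {mean:.1f} m"


server = Server()
thin_output = ThinClient(server).show()
fat_output = FatClient(server).show()
```

Both clients present the same result; what changes is how much data crosses the client-server boundary and which side needs the computing capacity.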
DGIP: Client-server architectures (figure: 2-tier architecture and 3-tier architecture)
DGIP: Distributed object architectures*. In a distributed object architecture there is no distinction between clients and servers. Each distributable entity is an object that provides services to other objects and receives services from other objects. Objects communicate through a middleware system called an object request broker (ORB). However, distributed object architectures are more complex to design than client-server (C/S) systems. * Sommerville, I. (2001). Software Engineering,
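The role of the object request broker can be illustrated with a toy, single-process stand-in. This sketch assumes nothing about any real ORB product: `ObjectRequestBroker`, `Geocoder`, `LatitudeService`, and the coordinates are hypothetical, and a real broker would also handle network transport and object location.

```python
class ObjectRequestBroker:
    """Toy in-process stand-in for an ORB: routes calls to objects by name."""

    def __init__(self):
        self._objects = {}

    def register(self, name, obj):
        self._objects[name] = obj

    def invoke(self, name, method, *args):
        # Look up the target object and forward the request to it.
        target = self._objects[name]
        return getattr(target, method)(*args)


class Geocoder:
    """Provides a lookup service to other objects (illustrative coordinates)."""

    def __init__(self):
        self._places = {"Tempe": (33.43, -111.94)}

    def lookup(self, place):
        return self._places[place]


class LatitudeService:
    """Provides a service and also consumes the Geocoder's service via the broker."""

    def __init__(self, orb):
        self._orb = orb

    def is_north_of(self, place, latitude):
        lat, _lon = self._orb.invoke("geocoder", "lookup", place)
        return lat > latitude


orb = ObjectRequestBroker()
orb.register("geocoder", Geocoder())
orb.register("latitude", LatitudeService(orb))
result = orb.invoke("latitude", "is_north_of", "Tempe", 30.0)
```

Note that `LatitudeService` is both a provider and a consumer of services, which is the sense in which this architecture has no fixed client or server role.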
DGIP: Distributed object architectures (figure)
DGIP: Advantages of distributed object architectures*. They allow the system designer to delay decisions on where and how services should be provided. They are open and support resource sharing. They are flexible and scalable. The system can be reconfigured dynamically. * Sommerville, I. (2001). Software Engineering,
DGIP: Inter-organizational computing. For security and interoperability reasons, most distributed computing has been implemented at the enterprise level, where local standards, management, and operational processes apply. Newer models of distributed computing have been designed to support inter-organizational computing, where different nodes are located in different organizations. * Sommerville, I. (2001). Software Engineering,
DGIP: Inter-organizational computing. Decentralized peer-to-peer (P2P) architecture; semi-centralized peer-to-peer (P2P) architecture; service-oriented architecture (SOA).
Computing platforms: High performance computing (HPC). High performance computing aggregates the computing power of clustered machines to support complex, large-scale computation.
Computing platforms: High performance computing (HPC). Parallel computing simultaneously uses multiple compute resources to run a specific application. A coordination mechanism schedules the computing tasks. A complex task is broken into separate sub-tasks that can be solved concurrently. Each sub-task is further broken into a series of instructions. The instructions of different sub-tasks are executed simultaneously on different processors.
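The decomposition described above (split a task into sub-tasks, solve them concurrently, combine the results) can be sketched with Python's standard `concurrent.futures` module. This is an illustration under stated assumptions: the helper names and the elevation values are hypothetical, and a thread pool is used only so the sketch runs anywhere; for CPU-bound work a `ProcessPoolExecutor` would normally be swapped in to use multiple processors.

```python
from concurrent.futures import ThreadPoolExecutor


def split(values, n_parts):
    """Break the task into at most n_parts contiguous sub-tasks."""
    if not values:
        raise ValueError("values must not be empty")
    size = -(-len(values) // n_parts)  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def partial_sum(chunk):
    """Sub-task: return the sum and the count for one chunk."""
    return sum(chunk), len(chunk)


def parallel_mean(values, n_workers=4):
    """Coordinate the sub-tasks and combine their partial results."""
    chunks = split(values, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


elevations = list(range(1, 101))  # made-up elevation samples
mean_elevation = parallel_mean(elevations)
```

The executor plays the part of the coordination mechanism: it schedules each chunk on a worker and returns the partial results in order, so combining them is a simple reduction.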
Computing platforms: High performance computing (HPC): Parallel computing (figure)
Computing platforms: Grid computing. Grid computing combines computing resources from distributed locations to support a specific application. The grid provides the connection between the different computing resources.
Computing platforms: Grid computing. Advantages: increases the utilization efficiency of distributed computing resources; supports parallel computing; satisfies complex computing demands; enables use of different types of data; supports collaborative research and analysis.
Computing platforms: Grid computing. Architecture: Fabric layer: offers the computing resources; Connectivity layer: defines the communication and authorization for the grid network; Resource layer: defines protocols, APIs, and software development kits (SDKs); Collective layer: manages the interaction between different computing resources; Application layer: enables user operations.
Computing platforms Grid computing Key components: Grid portal: the interface between local computer and computing resources Security: grid security infrastructure (GSI) Broker: monitoring and discovery services (MDS) Scheduler *
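How a broker and a scheduler cooperate can be shown with a small simulation. This sketch does not use or imitate any real grid toolkit API: the `Resource`, `Broker`, and `Scheduler` classes, the site names, and the least-loaded placement rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical compute site advertised to the broker."""
    name: str
    cores: int
    queued_jobs: int = 0


class Broker:
    """Toy resource directory standing in for a discovery service."""

    def __init__(self, resources):
        self.resources = list(resources)

    def discover(self, min_cores):
        # Return every site that satisfies the job's core requirement.
        return [r for r in self.resources if r.cores >= min_cores]


class Scheduler:
    """Places each job on the least-loaded site the broker reports (assumed rule)."""

    def __init__(self, broker):
        self.broker = broker

    def submit(self, job_name, min_cores):
        candidates = self.broker.discover(min_cores)
        if not candidates:
            raise RuntimeError(f"no resource can run {job_name}")
        chosen = min(candidates, key=lambda r: r.queued_jobs)
        chosen.queued_jobs += 1
        return chosen.name


broker = Broker([Resource("site-a", 8), Resource("site-b", 32), Resource("site-c", 32)])
scheduler = Scheduler(broker)
placements = [
    scheduler.submit("reproject", 16),
    scheduler.submit("mosaic", 16),
    scheduler.submit("clip", 4),
]
```

Separating discovery from placement mirrors the component list above: the broker only answers "which resources qualify?", while the scheduler decides where the job actually runs.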
Computing platforms Cloud computing Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. * *
Computing platforms Cloud computing Essential Characteristics*: On-demand self-service: A consumer can unilaterally provision computing capabilities as needed automatically without requiring human interaction with each service provider. Broad network access: Capabilities are available over the network and accessed through standard mechanisms that promote use by heterogeneous client platforms. Resource pooling: The provider’s computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to consumer demand. *
Computing platforms Cloud computing Essential Characteristics*: Rapid elasticity: Capabilities can be elastically provisioned and released, in some cases automatically, to scale rapidly outward and inward commensurate with demand. Measured service: Cloud systems automatically control and optimize resource use by leveraging a metering capability at some level of abstraction appropriate to the type of service. Resource usage can be monitored, controlled, and reported, providing transparency for both the provider and consumer of the utilized service. *
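Rapid elasticity and measured service can be illustrated together with a small calculation. This is a sketch under assumptions, not any provider's autoscaling API: the `desired_instances` function, the `Meter` class, the per-instance capacity, and the load figures are hypothetical.

```python
import math


def desired_instances(requests_per_second, capacity_per_instance,
                      min_instances=1, max_instances=10):
    """Scale outward or inward with demand, within fixed bounds (assumed policy)."""
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


class Meter:
    """Measured service: accumulate instance-seconds for reporting."""

    def __init__(self):
        self.instance_seconds = 0

    def record(self, instances, seconds):
        self.instance_seconds += instances * seconds


meter = Meter()
history = []
for load in [20, 450, 5000]:  # made-up requests per second, one reading per minute
    n = desired_instances(load, capacity_per_instance=100)
    history.append(n)
    meter.record(n, seconds=60)
```

The instance count follows demand up and back down, while the meter gives both provider and consumer the same usage figure to monitor and report.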
Computing platforms Cloud computing Service models* Software as a Service (SaaS): The capability provided to the consumer is to use the provider’s applications running on a cloud infrastructure. The applications are accessible from various client devices through either a thin client interface, such as a web browser, or a program interface. Platform as a Service (PaaS): The capability provided to the consumer is to deploy onto the cloud infrastructure consumer-created or acquired applications created using programming languages, libraries, services, and tools supported by the provider. Infrastructure as a Service (IaaS): The capability provided to the consumer is to provision processing, storage, networks, and other fundamental computing resources where the consumer is able to deploy and run arbitrary software, which can include operating systems and applications. *
Computing platforms Cloud computing Deployment models* Private cloud: The cloud infrastructure is provisioned for exclusive use by a single organization comprising multiple consumers. Community cloud: The cloud infrastructure is provisioned for exclusive use by a specific community of consumers from organizations that have shared concerns. Public cloud: The cloud infrastructure is provisioned for open use by the general public. Hybrid cloud: The cloud infrastructure is a composition of two or more distinct cloud infrastructures (private, community, or public) that remain unique entities, but are bound together by standardized or proprietary technology that enables data and application portability. *
Summary. Distributed system: distributed computing system; distributed information system; distributed pervasive system. Distributed processing and parallel processing. Distributed Geospatial Information Processing (DGIP): client-server architectures; distributed object architectures; inter-organizational computing. Computing platforms: high performance computing (HPC); grid computing; cloud computing.