Grid Platform for Geospatial Applications & Fine Granule Scheduler
Presented by Bin Zhou
Bin Zhou, Jibo Xie, Chaowei Yang
Joint Center for Intelligent Spatial Computing, George Mason University
Agenda
➲ Grid Computing Introduction
➲ CISC & SURA Grid
➲ Geospatial Applications Require Grid
➲ CISC Fine Granule Scheduler
➲ Architecture, Strategy
➲ Progress Status
Grid Computing Introduction
➲ Definition
   Grid computing is an emerging computing infrastructure that treats all resources as a collection of manageable entities with common interfaces to such functionality as lifetime management, discoverable properties and accessibility via open protocols (Wikipedia)
➲ Popular Grid Middleware
   Condor
   Globus
   Condor-G
   Unicore
GMU grid environment: SURAgrid
➲ The GMU CISC Grid can access the computing resources contributed by SURAgrid member universities
GMU grid environment: LambdaRail
➲ The GMU CISC Grid can set up a 1-10 Gbps connection to any of the LambdaRail-supported universities, agencies, and centers, such as GSFC and SDSC
CISC Computing Pool
Geospatial Requirements
➲ Large Data Sets
   Map data and sensor data, in terabytes
➲ Reliability, Interoperability
   Collaboration
➲ Intensive Computation
   More complex algorithms
   Adaptive algorithms
   Intelligent processing
Grid Computing Could Satisfy These Requirements
➲ Reliable File Transfer
➲ Resource Management and Allocation
➲ Authorization & Control
➲ Job Control
➲ Web Service Oriented
Detecting Watersheds from Multi-Scale DEM
➲ Watershed boundaries are not known before the massive data is processed
➲ Extract coarse watershed boundaries from the multi-scale DEM
➲ Use the boundaries to decompose the massive data, with some redundancy
[Figure: resample, extraction (Xie 2006)]
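The decomposition step can be pictured with a minimal sketch. It assumes each coarse watershed is summarized by a bounding box in DEM grid cells and that "redundancy" means padding each box with a margin of extra cells. The names `Box`, `expand_with_redundancy`, and `margin` are hypothetical and do not come from Xie 2006.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Half-open cell range [row0, row1) x [col0, col1) in the DEM grid."""
    row0: int
    row1: int
    col0: int
    col1: int


def expand_with_redundancy(box: Box, margin: int, n_rows: int, n_cols: int) -> Box:
    """Pad a coarse watershed box by `margin` cells on every side.

    The padded box is clipped to the DEM extent, so neighbouring
    sub-domains overlap without reaching outside the grid.
    """
    return Box(
        max(0, box.row0 - margin),
        min(n_rows, box.row1 + margin),
        max(0, box.col0 - margin),
        min(n_cols, box.col1 + margin),
    )


# Hypothetical coarse boxes for two neighbouring watersheds in a 100 x 100 DEM.
coarse = [Box(0, 50, 0, 100), Box(50, 100, 0, 100)]
padded = [expand_with_redundancy(b, margin=5, n_rows=100, n_cols=100) for b in coarse]
```

Each padded box can then be processed as an independent job. The overlap is what lets a job resolve a boundary that the coarse extraction placed slightly wrong.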
Use 24 units to test the speed up (each unit is 3.08M) (Xie 2006)
CISC Test Applications
Real Time Routing Test
[Table: Job Amount, CPUs, Executing Time, Speed Up, Efficiency]
Result: Efficiency decreases as the number of CPUs grows because the overhead increases, but the major problem is that Condor cannot handle small jobs efficiently. This demonstrates the need for a fine granule scheduler.
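For reference, the Speed Up and Efficiency columns are assumed here to follow the usual definitions: speed-up is the one-CPU time divided by the n-CPU time, and efficiency is speed-up divided by the CPU count. The sketch below also adds a simple, hypothetical model of why very short jobs suffer under a fixed per-job scheduling overhead. The function names and the numbers are illustrative, not measurements from the test.

```python
def speedup(t_one_cpu: float, t_n_cpus: float) -> float:
    """Speed-up S = T(1) / T(n)."""
    return t_one_cpu / t_n_cpus


def efficiency(t_one_cpu: float, t_n_cpus: float, cpus: int) -> float:
    """Efficiency E = S / n, where n is the number of CPUs used."""
    return speedup(t_one_cpu, t_n_cpus) / cpus


def useful_fraction(run_time: float, overhead_per_job: float) -> float:
    """Share of wall-clock time spent computing when every job pays a
    fixed scheduling overhead (a simplifying assumption)."""
    return run_time / (run_time + overhead_per_job)


# Illustrative numbers only: a 100 s workload finishing in 25 s on 8 CPUs.
s = speedup(100.0, 25.0)        # 4.0
e = efficiency(100.0, 25.0, 8)  # 0.5

# A 1 s job that waits 3 s to be scheduled computes for a quarter of its time.
u = useful_fraction(1.0, 3.0)   # 0.25
```

Under this model, shrinking the per-job overhead matters far more for second-scale jobs than for minute-scale ones, which is the case a fine granule scheduler targets.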
Specific Applications: Fine-Grained, Near Real Time Jobs
➲ Fine-Grained
   Very short executing time
   Huge number of jobs
   Job similarity
➲ Near Real Time
   Sensitive to scheduling latency
   Examples: real-time routing, short-time stock prediction
➲ Condor cannot be used for tasks that require less than 3.5 min to complete (Gregg Cooke, IT Technical Council, "Evaluating Condor for Enterprise Use: A UBS Case Study")
CISC Scheduler
➲ Purpose
   Improve the response time of near real time jobs
   Improve the throughput of large numbers of fine-granularity jobs
➲ Scheduling Strategy
   Short communication messages
   Simple match-making function
   Dynamic index
   Multi-dispatch
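The slide does not define its strategy terms, so the following is only one possible reading. It assumes "multi-dispatch" means handing several queued jobs to a worker in a single message, and "simple match-making" means pairing jobs with any idle worker in FIFO order. The function `multi_dispatch` and its parameters are hypothetical, not the CISC scheduler's API.

```python
from collections import deque
from typing import Deque, Dict, List


def multi_dispatch(
    jobs: Deque[str], idle_workers: List[str], batch_size: int
) -> Dict[str, List[str]]:
    """Give each idle worker up to `batch_size` queued jobs, in FIFO order.

    Jobs that are assigned are removed from `jobs`; the rest stay queued.
    """
    assignment: Dict[str, List[str]] = {}
    for worker in idle_workers:
        if not jobs:
            break  # nothing left to hand out
        take = min(batch_size, len(jobs))
        assignment[worker] = [jobs.popleft() for _ in range(take)]
    return assignment


# Five small jobs, two idle workers, two jobs per dispatch message.
queue = deque(["j1", "j2", "j3", "j4", "j5"])
plan = multi_dispatch(queue, ["w1", "w2"], batch_size=2)
```

Batching in this way amortizes one scheduling round trip over several jobs, which is in keeping with the goal of raising throughput for very short jobs.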
System Architecture
[Diagram labels: TCP/UDP Socket, File Transfer, Process, Other Lib, Services, Abstract Interface / APIs, Message Passing, Memory, System Function, Dispatcher, Collector, Container, Resource Manager, Submitter, Algorithm Module, Central Manager, Worker, User Interface]
Components
Job Work Flow
Prototype Overhead Test
➲ Test Case
   Insertion sort of 200,000 integers
   Dataset: 5.56M
   Executable file: 1.8M
➲ Test Platform
   OS: Ubuntu 6.10
   Network: 100 Mbps
   CPU: Celeron M 1.6 GHz
   Memory: 1 GB

Job Amount   File Transfer Time   Job Executing Time   Other Overhead   Communicating Overhead   Efficiency
1            1s                   27s                  0.4s             18ms                     95.1%
5            4s                   154s                 1.2s             20ms                     98.9%
Thanks Questions?