ETRI Site Introduction Han Namgoong, 2009. 6. 8.


1 ETRI Site Introduction Han Namgoong, nghan@etri.re.kr 2009. 6. 8

2 ETRI
- Government-sponsored research institute
- 3,000 staff, 500M USD (year 2009)
- Focus on technologies of broadcasting, software and contents, IT convergence, and convergence components and materials

3 ETRI Cluster Topology (1/2)

4 ETRI Cluster Topology (2/2)
Diagram labels: Server Pool; +Agent; +10,000 Nodes Monitoring; +Provisioning Proxy; +DHCP; +Agents; +256 nodes; +Provisioning Server; +LVS; +DB; +40 Group Masters; Cluster Master; Database Master; File System Master; Group Master; +Global Service Dispatcher; +Disaster Recovery; +100 Data Centers; Distributed Processing Master

5 ETRI Cluster Software Stack and Services (2009.6.8)
Video based Internet Application Services
- UGC Search Service
- IPTV Service
- e-Learning Service
Internet Services Common Components
Video Management Components
- Production, Tagging, Store, Retrieval, Delivery
Security
- Device/kernel authentication
- User/service authentication
Large Scale Parallel Processing
- Job Partition and Merge
- Distributed Job Scheduling
Large Scale Data Mgmt. Service
- Data Management
- Distributed Data Store
- Data Access and Recovery
Global File System
- File Metadata Management
- File Store and Replication
- File Remote Backup/Archiving
Cluster Management
- Cluster Orchestration
- Provisioning
- Resources Monitoring
- Service Mgmt.
Platform OS and HW
- Low Power OS
- Node Manager
- Low Power HW

6 Research Topics (2009.6.8)
1. Monitoring Tool for Large Cluster System
- Current monitoring SW → heavy CPU/memory overhead → small, lightweight monitoring tool
2. Management of Big Video Feature Data
- Google YouTube (2006): uploads: 70,000 per day; viewing: 100 million plays per day
- Keyword-based retrieval (vague, imprecise, ...)
- Content-based retrieval (interface not simple, slow results) → integrated query (keyword + content based)
3. Elimination of Duplicated Video Data
- Many identical video files occupy storage space. → File (NOT data) deduplication is strongly required.
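Topic 3 (file-level deduplication) can be illustrated with a minimal sketch. It assumes duplicates are byte-identical files and detects them by grouping on file size and then on a SHA-256 content digest. The function names `file_digest` and `find_duplicate_files` are hypothetical and not taken from the slides; the slides do not say which method ETRI planned to use.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_files(root):
    """Return groups of paths under root whose contents are byte-identical."""
    # Cheap first pass: files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for p in Path(root).rglob("*"):
        if p.is_file():
            by_size[p.stat().st_size].append(p)

    # Second pass: hash only files that share a size with another file.
    by_digest = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for p in paths:
            by_digest[file_digest(p)].append(p)

    return [sorted(group) for group in by_digest.values() if len(group) > 1]
```

Each returned group could then be reduced to one stored copy plus references. Re-encoded copies of the same video would not match under this scheme, because only exact byte-for-byte duplicates hash to the same digest.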

7 Schedule (2009.6.8)
1. Phase 1: 2009.9.1 ~ 2010.2.28
Cloud stack (OSS) for evaluation
- System management/monitoring tool
- Middleware (Web/AP/DB server)
- Linux (CentOS, ...)
- Virtualization (Xen, KVM)
- Distributed file system/DB (Hadoop, HBase)
- Authentication (OpenLDAP)
Evaluation points
- Error recovery procedure, configuration, structure
- Adding resources (planned, unexpected)
- Removing resources when load decreases, and migration
- Overhead of virtualization, distributed file system, distributed DB
- Authentication between systems
Source: Tomomi Suzuki, Status report of Cloud Computing activity, Japan OSS Promotion Forum, 2009.6.4

8 Schedule (2009.6.8)
2. Phase 2: 2010.3.1 ~ 2011.2.28
Selection of requirements; develop, test and deployment
- Monitoring Tool for Large Cluster System
- Management of Big Video Feature Data
- Elimination of Duplicated Video Data
Distributed data management based on Hadoop/HBase
- Multi-dimensional map model
- Support for a composite row key
- Column-group-based storage model
- Distribute partitions split by a composite row key
- Data access control by user and privilege management
………
Distributed processing
- Fail-over of task execution node and job management node
- Distributed task processing based on data location
- Configurable job scheduling: 9 policies
……..
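The composite row key item can be illustrated with a small sketch. It assumes a two-field key (a video identifier and a frame number, both hypothetical fields chosen for illustration) and encodes them so that the byte order of the keys matches the field-by-field order of the values, which is what lets partitions be split along key ranges. The slides do not describe the actual encoding.

```python
import struct

SEPARATOR = b"\x00"  # assumed not to occur inside the string field


def encode_row_key(video_id: str, frame_no: int) -> bytes:
    """Encode (video_id, frame_no) as bytes that sort in field order."""
    if "\x00" in video_id:
        raise ValueError("video_id must not contain a NUL character")
    if not 0 <= frame_no < 2**64:
        raise ValueError("frame_no must fit in an unsigned 64-bit integer")
    # Big-endian fixed-width integers compare like the numbers themselves.
    return video_id.encode("utf-8") + SEPARATOR + struct.pack(">Q", frame_no)


def decode_row_key(key: bytes):
    """Invert encode_row_key."""
    video_id, _, rest = key.partition(SEPARATOR)
    (frame_no,) = struct.unpack(">Q", rest)
    return video_id.decode("utf-8"), frame_no
```

With this layout, all rows of one video are contiguous in key order, so a range scan over the prefix `video_id + b"\x00"` retrieves them without touching other videos.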

9 Plans, Expectations (1/3)
Columns per category: Hadoop/MapReduce, What, Expectations
Parallel Processing Model
- Hadoop/MapReduce: Map/Reduce programming model; I/O source: HDFS, LFS, HBase
- What: Map/Reduce programming model; I/O source: + new-FS, new-DB
- Expectations: Enlargement of parallel processing target
Cluster Size
- Hadoop/MapReduce: Thousand nodes; manually configured
- What: Thousand nodes; automatically configured
- Expectations: Easy to manage parallel processing cluster
Job Control
- Hadoop/MapReduce: None
- What: Execution control based on user
- Expectations: Access control to parallel processing cluster
Job Scheduling
- Hadoop/MapReduce: Direct priority, FIFO; priority management by job
- What: 9 configurable scheduling policies; priority management by job, group and user
- Expectations: Support of various jobs
Task Distribution
- Hadoop/MapReduce: Consideration of data location and node position
- What: Consideration of data location, node position and node resource
- Expectations: Increase of node utilization
High Availability
- Hadoop/MapReduce: Fail-over of task execution node
- What: Fail-over of job management node
- Expectations: Increased availability; reduction of job execution time
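The Task Distribution row (data location, node position, node resource) can be sketched as a simple node-selection rule. This is an illustrative assumption, not the scheduler described in the slides: the `Node` type, its fields, and the ranking order (data-local first, then same rack, then any node, with ties broken by free slots) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    rack: str        # stands in for "node position"
    free_slots: int  # stands in for "node resource"


def pick_node(nodes, replica_nodes):
    """Pick a node for a task whose input block lives on replica_nodes.

    replica_nodes is a set of node names holding a copy of the input data.
    Returns None when no node has a free slot.
    """
    replica_racks = {n.rack for n in nodes if n.name in replica_nodes}

    def score(n):
        if n.name in replica_nodes:
            locality = 0  # data-local
        elif n.rack in replica_racks:
            locality = 1  # same rack as a replica
        else:
            locality = 2  # remote
        # Lower tuples win: best locality, then most free slots, then name.
        return (locality, -n.free_slots, n.name)

    candidates = [n for n in nodes if n.free_slots > 0]
    return min(candidates, key=score) if candidates else None
```

A data-local node with no free slot is skipped, so the task falls back to a node in the same rack before going to a remote one.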

10 Plans, Expectations (2/3)
Columns per category: HBase, What, Expectations
Data Model
- HBase: Multi-dimensional map; row key: single field
- What: Multi-dimensional map; row key: composite field
- Expectations: Easy to construct key
Video Manage
- HBase: None
- What: High-dimensional index management; k-NN search
- Expectations: Provide large-scale video content-based retrieval
Data Storage Model
- HBase: Column oriented; per column
- What: Column oriented; per column group
- Expectations: Performance enhancement
Data Distribution
- HBase: Distribute partitions split by row key
- What: Distribute clusters by high-dimensional index
- Expectations: Performance enhancement of key-based/content-based retrieval
Access Control
- HBase: None
- What: User management; privilege management of table/column
- Expectations: Provide data security
High Availability
- HBase: Fail-over of partition management node → serial processing of log file and parallel recovery
- What: Fail-over of partition management node → parallel processing of log file and parallel recovery; fail-over of master node
- Expectations: Increased availability; reduction of down time
Query Language
- HBase: Use in shell
- What: Use in application
- Expectations: Easy to develop application
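The k-NN search named in the Video Manage row can be illustrated with a brute-force sketch over video feature vectors. It assumes Euclidean distance and an in-memory dictionary of features, both chosen for illustration; the high-dimensional index mentioned in the slide would replace this linear scan, and the slides do not specify the distance measure.

```python
import heapq
import math


def knn(query, features, k):
    """Return the k entries of features closest to query.

    features maps a video id to a feature vector of the same length as query.
    The result is a list of (video_id, distance) pairs, nearest first.
    """
    distances = ((vid, math.dist(query, vec)) for vid, vec in features.items())
    return heapq.nsmallest(k, distances, key=lambda pair: pair[1])
```

For example, `knn((0.9, 0.0), {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (5.0, 5.0)}, 2)` ranks `"b"` ahead of `"a"`, since `"b"` lies closer to the query point.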

11 Plans, Expectations (3/3)
Columns per function: What, OSCAR
Cluster Orchestration
- Structure | What: Hierarchical | OSCAR: Flat
- Scalability | What: Automatic reconfiguration | OSCAR: PXE+DHCP
- Availability | What: Independent HA tool | OSCAR: Active-active (2 head nodes)
- Management Interface | What: Web | OSCAR: X-GUI, Command (C3)
- Communication | What: XML | OSCAR: XDR, XML
- IP Management | What: Server configuration | OSCAR: DHCP auto/static
- Maximum Nodes | What: 10,000 per data center / max. 1,000,000 | OSCAR: Oscar 440
- Load Balancing | What: Front-end LVS, back-end new-DP | OSCAR: Front-end PBS, TORQUE, MAUI
Service Management
- Node Reconfiguration by Load Balancing | What: Yes | OSCAR: None
Master Management
- Master Node Configuration | What: Hierarchy (Key Master, Cluster Master, Group Master) | OSCAR: Head node
Resources Monitoring
- Monitoring Tool | What: Proprietary | OSCAR: Ganglia
Provisioning
- Provisioning (image) | OS imaging
- Provisioning (streaming) | What: SW streaming | OSCAR: SW tar/rpm

12 Thank you

