Best practises and experiences in user support

Presentation transcript:

Best practises and experiences in user support – A case study at GARUDA
Mr Santhosh J, santhoshj@cdac.in
Centre for Development of Advanced Computing, Bangalore, India

That holds true for most businesses and services. And it holds true for Grid or Cloud computing environments as well. 3/22/2018 ISGC 2018

GARUDA
GARUDA is India's first national grid initiative, bringing together academic, scientific and research communities for their data- and compute-intensive applications of national importance. The GARUDA grid is an aggregation of resources comprising computational nodes and mass storage distributed across the country.
No. of GARUDA partners: 75
Connectivity: NKN (National Knowledge Network) at 10 Gbps
About GARUDA and its complex environment: distributed administrative domains; varied types of users (general grid users, developers, remote admins, application enablement, VO managers, users from different domains); multiple applications and VOs; PKI infrastructure.

GARUDA – computational resources
Distributed resources, distributed administrative domains and varied architectures.
Heterogeneous resources:
Hardware: Opteron, POWER5 (AIX, Linux), Xeon, etc.
OS: Linux (various distributions such as RHEL and CentOS), AIX, etc.
Schedulers: SGE, PBS, PBS Pro, LoadLeveler

High Level System Components of GARUDA
About GARUDA and its complex environment: multiple applications and VOs; PKI infrastructure.

Indian Grid Certification Authority (IGCA)
IGCA is an accredited member of APGridPMA. It issues X.509 certificates to support a secure environment for the grid. Certificates are issued for users and resources of the GARUDA grid, for institutes in India that do research in grid computing, and for foreign institutes that collaborate with GARUDA. http://ca.garudaindia.in

Interoperability with International Grids
Integrating the technological components of GARUDA and EGI: gLite and Globus. Customizing the GridWay meta-scheduler to run real-life applications across both infrastructures.

Grid support challenges
Support for interoperable grids: EU-India Grid, CHAIN.
Lack of a knowledge base.
Lack of tracking, prioritization and work allocation.
Different types of users: grid users, developers, remote admins, VO managers, grid certificate requests, application enablement.
Distributed resources and administrators.
Decentralized support requests: 80% of users request support via email, 20% make phone calls.
Distributed support teams: network, R&D, admin, security, HPC admins, application enablement.
Incident management: incidents, attacks and recovery.
Release management: software, portal and application updates.
Change management: addition and removal of resources, maintenance of resources.
A support system is needed to handle all these challenges. It boils down to addressing each challenge/module.

Transforming Grid Support
Centralized support system: moving from decentralized to centralized support.
Integrating distributed support teams: bringing all the support teams under one umbrella.
Tracking, prioritizing, categorizing and assigning to the right team.
Automating grid operations: service recovery, testing of job submission (daily cron jobs), automated data backup and recovery, setting up high availability, user creation, DN mapping, certificate renewal notices, etc.
Integrated FAQs and knowledge base: creating a knowledge base.
Integrated reporting and analytics: which service is getting more tickets, how the tickets are being resolved, and how long an issue takes. This is followed by a review meeting to resolve pending cases as soon as possible and to share experiences.
Weekly review meetings: held with remote administrators over video conference.
Effective user support: the aim of this transformation is to build effective user support.
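
One of the automated operations listed above is the certificate renewal notice. The following is a minimal sketch of how such a notice could be generated, not the GARUDA implementation: the `Certificate` record, the 30-day notice window and the message wording are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Certificate:
    """Hypothetical record of an issued certificate (illustrative fields only)."""
    subject_dn: str   # certificate distinguished name
    email: str        # contact address for the notice
    not_after: date   # expiry date


def certificates_due_for_renewal(certs, today, notice_days=30):
    """Return (certificate, days_left) pairs expiring within notice_days, soonest first."""
    cutoff = today + timedelta(days=notice_days)
    due = [
        (cert, (cert.not_after - today).days)
        for cert in certs
        if today <= cert.not_after <= cutoff
    ]
    return sorted(due, key=lambda pair: pair[1])


def renewal_notice(cert, days_left):
    """Build the text of a renewal reminder (wording is an assumption)."""
    return (
        f"To: {cert.email}\n"
        f"Your grid certificate {cert.subject_dn} expires in {days_left} day(s) "
        f"on {cert.not_after.isoformat()}. Please request a renewal."
    )
```

A daily scheduled job could call `certificates_due_for_renewal` with the current date and send one notice per returned pair.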

Transforming Grid Support
Integrated ticketing system: convert all incoming emails, calls and chats into tickets. Prioritize, categorize and assign them to the right people.
Integrating support teams: the GARUDA grid has many support teams distributed across institutes, including VO managers, HPC administrators, grid administrators, portal and PSE developers, application enablers and security handling groups. The integrated ticketing system brings all of those distributed groups into a unified team. It enables team collaboration and avoids collisions.
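
The flow described here (turn every incoming message into a ticket, then prioritize, categorize and assign it) can be sketched as follows. This is an illustrative sketch only: the keyword routing table, the team names and the two priority levels are hypothetical and are not taken from the GARUDA ticketing system.

```python
import re
from dataclasses import dataclass
from itertools import count

# Hypothetical keyword -> team routing rules, checked in order.
ROUTING_RULES = [
    (("certificate", "proxy", "dn"), "security"),
    (("job", "queue", "scheduler"), "hpc-admin"),
    (("portal", "login"), "portal-dev"),
]
DEFAULT_TEAM = "grid-admin"                  # fallback team (assumption)
URGENT_WORDS = ("down", "urgent", "attack")  # words that raise priority (assumption)

_ticket_ids = count(1)  # simple in-memory ticket number generator


@dataclass
class Ticket:
    ticket_id: int
    channel: str   # "email", "phone" or "chat"
    sender: str
    text: str
    team: str
    priority: str  # "high" or "normal"


def open_ticket(channel, sender, text):
    """Convert one incoming message into a categorized, prioritized ticket."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    team = DEFAULT_TEAM
    for keywords, candidate in ROUTING_RULES:
        if words.intersection(keywords):
            team = candidate
            break
    priority = "high" if words.intersection(URGENT_WORDS) else "normal"
    return Ticket(next(_ticket_ids), channel, sender, text, team, priority)
```

Because every channel goes through the same `open_ticket` function, emails, calls and chats end up in one queue with the same fields, which is what lets distributed teams work from a single view.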

Automating Grid operations
Setting up a Grid Operation Center to monitor resources and events, including monitoring the job flow by submitting test jobs.
Monitoring and recovering services automatically, using scripts.
Simplifying registrations, certificate issuance, VO subscription, credential/proxy management and the mapping of certificate DNs across resources.
Incident management and notifications.
Configuration management: Chef.io or Puppet.
Automated tests for compliance, security and other policies: https://www.inspec.io/
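
Automatic monitoring and recovery of services could look roughly like the sketch below. It is a simplified illustration under stated assumptions: the health check and restart actions are passed in as functions because the actual GARUDA scripts are not described in the slides, and the retry limit of two attempts is arbitrary.

```python
def check_and_recover(services, is_healthy, restart, max_attempts=2):
    """Check each service and try to restart the unhealthy ones.

    is_healthy(name) -> bool and restart(name) are supplied by the caller
    (for example, wrappers around site-specific scripts).
    Returns a dict mapping service name to "ok", "recovered" or "failed".
    """
    report = {}
    for name in services:
        if is_healthy(name):
            report[name] = "ok"
            continue
        status = "failed"
        for _ in range(max_attempts):
            restart(name)
            if is_healthy(name):
                status = "recovered"
                break
        report[name] = status
    return report
```

Services that end up as "failed" are the natural candidates for an automatically raised incident ticket, while "recovered" entries can simply be logged for the weekly review.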

Integrated FAQs and Knowledge Base
Frequently asked questions and a knowledge base are integrated into the ticketing system. This reduces the volume of tickets raised. We saw up to 40% of users using the knowledge base to solve their issues.

Reporting and analytics
A ticketing system with integrated reporting and analytics helped us measure and understand the entire user experience. It also helps to compare support channels by the volume of tickets raised.
ISO standards: adhering to ISO standards ensures users get reliable, timely and efficient grid services.
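
The kind of summary such reporting produces (ticket volume per service and per channel, and time to resolution) can be sketched as below. The ticket fields `service`, `channel` and `hours_to_resolve` are assumed names chosen for this example, not the schema of any particular ticketing product.

```python
from collections import Counter, defaultdict
from statistics import mean


def summarize(tickets):
    """Summarize tickets given as dicts with 'service', 'channel' and
    'hours_to_resolve' (None while the ticket is still open)."""
    tickets = list(tickets)  # allow any iterable, since we pass over it several times
    hours_by_service = defaultdict(list)
    for ticket in tickets:
        if ticket["hours_to_resolve"] is not None:
            hours_by_service[ticket["service"]].append(ticket["hours_to_resolve"])
    return {
        "tickets_per_service": Counter(t["service"] for t in tickets),
        "tickets_per_channel": Counter(t["channel"] for t in tickets),
        "mean_hours_to_resolve": {s: mean(h) for s, h in hours_by_service.items()},
        "open_tickets": sum(1 for t in tickets if t["hours_to_resolve"] is None),
    }
```

The `open_tickets` count and the slowest services in `mean_hours_to_resolve` give a ready agenda for a review meeting on pending cases.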

Weekly review meetings
We conducted weekly review meetings to understand the issues. All remote support teams participate via video conference. The meetings help to discuss pending issues and to share expertise and suggestions.

Manual Support Vs Ticketing system
How a ticketing system can improve user support when compared with manual support.
Before (manual support): identify, explain and assign to the right group; manual communication; monitor status; respond to users. Timeline: 10 mins engage, 5 mins assign, 5 mins follow up, 5 mins resolve.
After (ticketing system): ticket raised and automatically assigned to the right group; automated response to users; automated communication. Timeline: 2 mins assign, 1 min resolved.
Fix the SLA in the ticketing system.
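
Fixing an SLA in the ticketing system means each ticket carries a deadline that can be checked automatically. The sketch below illustrates that check; the priority names and the target durations are hypothetical values chosen for the example and are not GARUDA's actual SLA.

```python
from datetime import datetime, timedelta

# Hypothetical SLA targets per priority (assumed values for illustration).
SLA_TARGETS = {
    "high": timedelta(hours=4),
    "normal": timedelta(hours=24),
}


def sla_status(created, priority, now, resolved=None):
    """Classify a ticket against its SLA deadline.

    Returns "met" if it was resolved in time, "on track" if it is still
    open and the deadline has not passed, and "breached" otherwise.
    """
    deadline = created + SLA_TARGETS[priority]
    if resolved is not None:
        return "met" if resolved <= deadline else "breached"
    return "on track" if now <= deadline else "breached"
```

Running this check periodically over open tickets makes it possible to notify the assigned team before a deadline passes, instead of following up manually.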

No. of Tickets – 20,000+ (stats from 2007 to 2016)

GARUDA stats (from 2007 to 2016)
No. of jobs executed: 1,10,687
No. of VOs: 10
Total no. of users: 2500
Active users: 500+

Lessons learnt
Even Grid and HPC support systems can learn from industry best practices.
Automation of grid operations helps to reduce the volume of issues raised.
Knowledge bases and FAQs helped users solve issues themselves.
The ticketing system helped in offering effective user support.
Collaboration is crucial to get everyone working on an issue.
No tool provides all solutions. Hence, certain modules were developed in-house to deploy a complete ticketing system: knowledge base, phone to tickets, chat to tickets, automation of grid operations, etc.
Finally, we have found that a proper monitoring methodology, automation and adherence to quality standards give good results in providing high grid availability, and hence effective user support.

Conclusion
Our paper aims to share our experience in offering user support for a highly complex Grid computing environment, GARUDA. However, the approach will suit any HPC computing environment.

Thank you