IT Infrastructure for a Data Science Campus

Craig Pritchard: Technical Architect
David Pugh: Senior Data Scientist
https://datasciencecampus.ons.gov.uk/
@DataSciCampus

Introduction

Data Science Campus: a hub for the whole of the UK public and private sectors to gain practical advantage from the increased investment in data science research and capability building.

Goal: explore how new data sources and data science techniques can improve our understanding of the UK's economy, communities and people.

Challenges: ingesting data, technology, security.

Our goals can pose a challenge within the organisation: how do we ingest and store data, what technology do we want to use, and how do we ensure security? How do we do this while sitting within the ONS network, in an organisation going through significant transformation?

Technology and Data Transformation

Digital Services Technology and Data Transformation, 2016 to 2019:
Architecture relocation to secure, redundant datacentres
Corporate data migrated from Lotus Notes into secure SharePoint zones
Microsoft Office upgrade from 2007 to 2016
Operating system upgrade on laptops from Windows 7 to Windows 10
Email system upgrade
Rollout of VDI

Timeline labels: Datacentre Refresh, Exchange, SharePoint, Office 2016, Virtual Desktop Infrastructure, Windows 10, Hardware Refresh, Legacy Uplift, Skype for Business, Campus Network, ONS Data Service.

Significant technology and data transformation has been ongoing since 2016:
Migration to Exchange, SharePoint, Skype and Office
Hardware improvements: replacement and upgrade of network switches and servers
Development of platforms for data analysis
Campus Network created, spanning two data centres and isolated from the corporate ONS network
Creation of the ONS Data Service, providing a secure environment to ingest and process sensitive data from multiple sources
Replacement of legacy Java applications
Adoption of Microsoft Skype for Business and replacement of the legacy telephone system

Security - Network Zones

Core network redesign and upgrade. Benefits: increases in performance, reliability and resiliency.

Services are isolated from the core network into zones, managed under strict change control using firewall rules.

Zones are:
Service orientated
Isolated from the core network
Secure by default

Diagram labels: Core Network, Data Ingestion, SharePoint, Exchange, Data Service, Skype, CI Pipeline.
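The "secure by default" zoning on this slide, where traffic between zones is permitted only by explicit firewall rules, can be sketched as a default-deny lookup. This is an illustrative sketch only: the zone names come from the slide, but the specific rules, the port numbers and the is_allowed helper are hypothetical and do not describe the actual ONS firewall configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One explicitly approved flow between two zones (hypothetical)."""
    source: str
    destination: str
    port: int


# Hypothetical allow-list; in practice each entry would be added
# through change control rather than edited ad hoc.
ALLOWED_FLOWS = frozenset({
    Rule("Core Network", "SharePoint", 443),
    Rule("Core Network", "Exchange", 443),
    Rule("Data Ingestion", "Data Service", 22),
})


def is_allowed(source: str, destination: str, port: int) -> bool:
    """Secure by default: deny unless a rule explicitly permits the flow."""
    return Rule(source, destination, port) in ALLOWED_FLOWS


print(is_allowed("Core Network", "SharePoint", 443))  # True
print(is_allowed("SharePoint", "Data Service", 443))  # False
```

Because the check is a set membership test, anything not listed is denied, including the reverse direction of an allowed flow.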

ONS Data Service

"Enable teams to transform by providing access to support, data and technology services"

Ingest data, provide technology and ensure security.

Diagram labels: Ingest and Secure Data Platform; Standards; Methodology; Training and Support.

Stages: Acquire, Ingest, Prepare, Explore, Production, Export.

Data is core to ONS. As well as technology transformation, we also implemented data transformation, and a key part of this is the ONS Data Service. It provides the support, data and technology services, together with training and support, that help teams transform how they use data, including tools for the preparation and exploration of data.
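The stages named on this slide can be written down as an ordered enumeration. The sketch below assumes that data moves through the stages strictly in the order listed and one step at a time; the slide only names the stages, so that rule and the helper names next_stage and can_advance are assumptions made for illustration.

```python
from enum import IntEnum
from typing import Optional


class Stage(IntEnum):
    """Stages named on the slide, numbered in the order they appear."""
    ACQUIRE = 1
    INGEST = 2
    PREPARE = 3
    EXPLORE = 4
    PRODUCTION = 5
    EXPORT = 6


def next_stage(current: Stage) -> Optional[Stage]:
    """Return the stage after `current`, or None once EXPORT is reached."""
    if current is Stage.EXPORT:
        return None
    return Stage(current + 1)


def can_advance(current: Stage, target: Stage) -> bool:
    """Assumed rule: data may only move forward by exactly one stage."""
    return next_stage(current) is target


print(next_stage(Stage.INGEST).name)                # PREPARE
print(can_advance(Stage.ACQUIRE, Stage.PREPARE))    # False
```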

ONS Data Service

Data science with open source tools:
Can pose a security risk
Can take many weeks or months for updates to be installed on the corporate network
Not all packages and techniques are supported

This can limit innovation and constrain the ability of data scientists to implement and experiment with new tools and develop new techniques. Examples include NLP, where having models and corpora available is an issue, and deep learning (availability of TPUs, image storage and processing, geospatial). Also, some data sets are better off not in HDFS, and we can select and optimise storage as required.

The Data Science Campus network (DSCN) has been created as separate infrastructure to provide users with the IT services and tool sets required to investigate more advanced techniques and produce the next generation of statistics.

Why a separate network?

Corporate Network: highly secure and controlled (sensitive data).
Campus Network: innovation (non-sensitive data).

Diagram labels:
Internet; internal/external users
Data ingestion: APIs, SFTP, email, HTTPS
Ingest Zone: scanned for viruses and malware
Data Lake: data ingested into data lake
Core ONS network; ONS users; remote access
Web proxy; less restrictive internet access
Access tightly controlled and monitored
Isolated from the corporate ONS network
VDI Zone: virtual desktops provide users with applications and tools. Local admin rights, no group policies!
Virtual machines; production environment

The ONS Data Science platform allows data science to be performed on sensitive data. It is zoned and secure, with restricted access. However, the increased security also constrains and restricts the libraries, models and tools that can be used. This can limit innovation and constrain the ability of data scientists to implement and experiment with new tools and develop new techniques.

The Data Science Campus network (DSCN) has been created as separate infrastructure to provide users with the IT services and tool sets required to investigate more advanced techniques and produce the next generation of statistics. It gives much more freedom to develop the systems and services required for cutting-edge techniques and pipelines. These can be refined and developed for future use on the ONS Data Service.
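The split described on this slide, with sensitive data staying on the corporate network and non-sensitive data passing through a scanned ingest zone into the campus data lake, can be sketched as two small checks. This is a hypothetical illustration: the Dataset fields, the function names and the idea of a boolean scan result are assumptions, not a description of the real ingest tooling.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Dataset:
    name: str
    sensitive: bool
    scanned_clean: bool = False  # result of a (hypothetical) malware scan


def target_network(dataset: Dataset) -> str:
    """Route by sensitivity, using the two network labels from the slide."""
    return "Corporate Network" if dataset.sensitive else "Campus Network"


def ingest_to_data_lake(dataset: Dataset, data_lake: List[str]) -> None:
    """Admit a dataset to the campus data lake only if it is
    non-sensitive and has passed the ingest-zone scan."""
    if dataset.sensitive:
        raise ValueError(f"{dataset.name}: sensitive data stays on the corporate network")
    if not dataset.scanned_clean:
        raise ValueError(f"{dataset.name}: must be scanned for viruses and malware first")
    data_lake.append(dataset.name)


lake: List[str] = []
ingest_to_data_lake(Dataset("street_images", sensitive=False, scanned_clean=True), lake)
print(lake)  # ['street_images']
```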

Data Science Campus Network - Overview

The Campus Network spans 2 data centres and is isolated from the corporate ONS network. It is accessible from corporate and external networks.

Diagram labels: Data centre 1 (10 Gbps), Data centre 2 (10 Gbps), 20 Gbps inter-link.

Platform for innovation:
The network is equipped with many services required for data science and can be easily extended to meet users' needs
Users build their own systems as required, e.g. virtual Windows or Linux instances
Users are free to install open source packages
Variety of storage mechanisms depending on data and need
Ability to use TPUs
Ability to develop data visualisation and geospatial analysis
Ability to create web services and APIs using a variety of coding languages, e.g. Python, R, JavaScript, C
Also includes a sandbox for training
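As a small illustration of the "web services and APIs" point, the sketch below shows a minimal JSON service built only on the Python standard library (WSGI via wsgiref). The /health endpoint, the port and the serve helper are hypothetical choices made for the example; the slide does not say which frameworks Campus users actually use.

```python
import json
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """Tiny WSGI application exposing one hypothetical endpoint, /health."""
    if environ.get("PATH_INFO") == "/health":
        status, payload = "200 OK", {"status": "ok"}
    else:
        status, payload = "404 Not Found", {"error": "not found"}
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(host: str = "127.0.0.1", port: int = 8000) -> None:
    """Run the app until interrupted; host and port are arbitrary defaults."""
    with make_server(host, port, app) as server:
        server.serve_forever()
```

Calling serve() starts the service on localhost; a request to /health then returns a small JSON document, and any other path returns a 404 with a JSON error body.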

Project and infrastructure consumption

The Campus Network spans 2 data centres and is isolated from the corporate ONS network. It is a platform for innovation, used for a variety of applications.

Campus Network tools and capabilities: Computer Vision, Natural Language Processing, OCR prototyping, Geospatial, Git, Apache Spark, Deep Learning, Python, Machine Learning, Develop, Training, TPUs, Rapids.

Projects and teams: Patent Data - Emerging Trends, Data Science Campus, Green Spaces, National Accounts and Economic statistics, Mapping the Urban Forest, Data Architecture, Optimus, Rwanda International Development, Access to Services - propeR, Sustainable Development Goals.

Example Projects

propeR: access to services using multimodal transport networks
https://datasciencecampus.ons.gov.uk/access-to-services-using-multimodal-transport-networks/

Mapping the Urban Forest: estimating density of trees and vegetation at street level
https://datasciencecampus.ons.gov.uk/mapping-the-urban-forest-at-street-level/

Optimus: advanced NLP pipeline to turn free text lists into hierarchical datasets
https://datasciencecampus.ons.gov.uk/o-p-t-i-m-u-s-turning-free-text-lists-into-hierarchical-datasets/

Data Science Campus Network - Summary

ONS Network and Data Service
An office-wide ONS Data Service provides access to the support, data and technology services that enable teams to transform.
It controls the ingestion, technology and security tools that allow data science to be performed on sensitive data.
However, the increased security also constrains and restricts the libraries, models and tools that can be used.
This can limit innovation and constrain the ability of data scientists to implement and experiment with new tools and develop new techniques.

Data Science Campus Network
The Data Science Campus network (DSCN) has been created as separate infrastructure to provide users with the IT services and tool sets required to investigate more advanced techniques and produce the next generation of statistics.
It gives much more freedom to develop the systems and services required for cutting-edge techniques and pipelines.
These can be refined and developed for future use on the ONS Data Service.

A number of successful projects have been completed using both platforms.