Software Installation Deck: Big Data Workshop, Saturday, March 10th, 2012



Outline
Local Installation
– Python
– Word Count Code and Files
– R and R-Studio
– Hadoop Local Installation
Cloud Access
– Amazon Web Services Account
– Cloud-Based Software Demos
– R and R-Studio in the Cloud
– Cloudera Virtual Manager
– Virtualization Software
– R and Hadoop: rmr

Local

Python Installation
Mac/Linux: Python comes preinstalled, so you should be able to run it right away.
Windows: use the following website to download and install Python: –
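A quick way to confirm that Python runs after installation is to have the interpreter report on itself. The snippet below is a minimal sketch, not part of the workshop materials; the function name describe_python is a hypothetical choice for illustration.

```python
import platform


def describe_python():
    """Return a one-line summary of the interpreter running this script."""
    return "Python {} on {}".format(platform.python_version(), platform.system())


if __name__ == "__main__":
    # Prints the interpreter version and the operating system name.
    print(describe_python())
```

If this prints a version line, the local Python installation is usable for the word count scripts.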

Python Wikipedia Word Count Files
What | URL
Python Word Count Script | https://s3.amazonaws.com/com.hadoopinboston.scripts/seq.py
Very Small File: 10 lines, 251 words |
Small: lines, 1.65M words (10 MB) |
Large: lines, 12M words (76 MB) |
Very Large: 85 million lines (8 GB) |
Mapper.py – mapper in Python | https://s3.amazonaws.com/com.hadoopinboston.scripts/mapper.py
Reducer.py – reducer in Python | https://s3.amazonaws.com/com.hadoopinboston.scripts/reducer-all.py
Mapper in R | https://s3.amazonaws.com/com.hadoopinboston.scripts/mapper.R
Reducer in R | https://s3.amazonaws.com/com.hadoopinboston.scripts/reducer.R
The four files of different sizes were created by Vipin to test how long each one takes to run locally.
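For readers who have not seen a map/reduce word count before, the idea can be sketched in a few lines of Python. This is an illustrative sketch only, not the contents of the workshop's mapper.py or reducer-all.py; the function names and the in-memory sort standing in for Hadoop's shuffle step are assumptions made for the example.

```python
from itertools import groupby


def mapper(lines):
    """Emit a (word, 1) pair for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def reducer(pairs):
    """Sum the counts for each word; pairs must already be sorted by word."""
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield word, sum(count for _, count in group)


def word_count(lines):
    """Run map, sort (standing in for the shuffle), and reduce in memory."""
    return dict(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    sample = ["the quick brown fox", "the lazy dog and the fox"]
    for word, total in sorted(word_count(sample).items()):
        print("{}\t{}".format(word, total))
```

Running the same logic over files of increasing size is one way to see how local run time grows before moving a job to the cloud.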

R and R-Studio Local Installation
R: http://lib.stat.cmu.edu/R/CRAN/
R-Studio: http://rstudio.org/

Hadoop Installation Mac/Linux
Macbook
– Install the ports package to get Hadoop (
– sudo port install hadoop (DONE!)
Linux
– Use the yum/apt-get package manager to get Hadoop.
– sudo yum install hadoop (your mirror should have Hadoop binaries)
Please note that the local installation is for test and debug, and that production jobs will be run on the cloud.
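After a package install, it helps to confirm that the hadoop command is actually reachable from the shell. The sketch below uses Python's standard shutil.which for that; the helper name find_tool is hypothetical, and the check assumes the package manager placed the hadoop launcher on PATH.

```python
import shutil


def find_tool(name):
    """Return the full path of a command on PATH, or None if it is absent."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool("hadoop")
    if path is None:
        print("hadoop not found on PATH")
    else:
        print("hadoop found at " + path)
```

A result of None usually means either the install did not finish or the install directory still needs to be added to PATH.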

Hadoop Installation Windows
Microsoft is working with Hortonworks on contributing to the Apache Hadoop project for Windows. Microsoft is working on a Community Technology Preview for Hadoop on Windows Azure ( and the release for on-premises installation is forthcoming. Those interested in running Hadoop on their own Windows hardware can follow technologies/business-intelligence/big-data-solution.aspx to sign up for the preview when it's available.
TODAY, it is possible to install Hadoop on Windows, but those distributions require Cygwin, whereas the upcoming release will not. There are some instructions for Windows (see for instance apache.html) that people can try.
Please note that the local installation is for test and debug, and that production jobs will be run on the cloud.

Cloud

Cloud Account
The first example will be through Amazon's Elastic Map/Reduce. Similar in nature to:

Cloud-Based Software Packages (Demos)
Cloud Numerics: 12/02/07/cloud-numerics-example-analyzing-demographics-data-from-windows-azure-marketplace.aspx
MortarData

R and R-Studio Cloud Access (No VM)
R-Studio in the Cloud:
R or R-Studio in the Cloud:

Virtual Manager with Hadoop
Cloudera Hadoop Package: op+Demo+VM
There are 3 options, each tied to a different Virtualization Software, one of which also needs to be installed (next slide).
SSH Software (Windows): ad.html
Please note that these are 64-bit versions, and that the Virtualization Software requires a laptop that supports virtualization. If you are unsure, one way to check is to look at your BIOS and see whether Virtualization is enabled. Most chips support virtualization; however, a handful of manufacturer-installed BIOSes do not enable it.
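On an x86 Linux laptop, the CPU side of the virtualization question can also be checked without rebooting into the BIOS. The sketch below scans the text of /proc/cpuinfo for the vmx (Intel VT-x) or svm (AMD-V) flag. It is an assumption-laden illustration: the function name is hypothetical, the file exists only on Linux, and a present flag shows CPU support only, so the BIOS setting may still need to be enabled.

```python
def cpu_supports_virtualization(cpuinfo_text):
    """Return True if any 'flags' line lists vmx (Intel) or svm (AMD)."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags":
            flags = set(value.split())
            if "vmx" in flags or "svm" in flags:
                return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            text = handle.read()
    except OSError:
        # Not Linux, or the file is unreadable: fall back to the BIOS check.
        print("/proc/cpuinfo not available; check the BIOS settings instead")
    else:
        print("virtualization flag present:", cpu_supports_virtualization(text))
```

On Mac or Windows the file is absent, so the BIOS check described on the slide remains the route to take there.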

Virtual Manager with Hadoop
VMware Player: Jeffrey uses this one in his session.
KVM:
VirtualBox: Jim uses this one.
Jeffrey will be walking through this process.

Session 6: R and Hadoop: rmr
Jeffrey will be walking through this process. We realize the VM, R, and Hadoop parts are very detailed, and that there may be questions on other workshop parts. Following the last session we will try to have a post-workshop help session.