Hola Hadoop.

0. Danger! (Peligro!) Please, please be careful of what you are doing! Think twice before: rm, mv, cp, kill, emacs/vim/… configuration files

cluster.dcc.uchile.cl

1. Download tools: http://aidanhogan.com/teaching/cc5212-1-2015/tools/
Unzip them somewhere you can find them.

2. PuTTY: Log in (the slide's screenshot marks the steps 1, 2, 3)

3. Open DFS Browser http://cluster.dcc.uchile.cl:50070/

4. PuTTY: See state of DFS
    hdfs dfsadmin -report

5. PuTTY: Create folder
    hdfs dfs -ls /
    hdfs dfs -ls /uhadoop
    hdfs dfs -mkdir /uhadoop/[username]
[username] = first letter of your first name followed by your last name (e.g., “ahogan”)

6. PuTTY: Upload Data
    cd /data/2014/uhadoop/shared/
Then
    hdfs dfs -copyFromLocal /data/2014/uhadoop/shared/es-wiki-abstracts.txt /uhadoop/[username]/
OR
    hdfs dfs -copyFromLocal /data/2014/uhadoop/shared/es-wiki-abstracts.txt.gz /uhadoop/[username]/

Note on namespace: if you need to disambiguate local/remote files, prefix the path with its scheme.
    HDFS file: hdfs://cm:9000/uhadoop/…
    Local file: file:///data/hadoop/...
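A minimal sketch of how the scheme prefix tells the two kinds of path apart. It uses only `java.net.URI` from the standard library, not the Hadoop API. The class name `NamespaceSketch`, the method `classify`, and the file name `example.txt` are hypothetical and were chosen for illustration.

```java
import java.net.URI;

class NamespaceSketch {
    // Classifies a path by its URI scheme: "hdfs" for the cluster file system,
    // "file" for the local disk, and no scheme when the path is left unqualified.
    static String classify(String path) {
        String scheme = URI.create(path).getScheme();
        if (scheme == null) {
            return "unqualified";
        }
        if (scheme.equals("hdfs")) {
            return "HDFS";
        }
        if (scheme.equals("file")) {
            return "local";
        }
        return "other";
    }

    public static void main(String[] args) {
        // example.txt is a made-up file name, used only to complete the paths.
        System.out.println(classify("hdfs://cm:9000/uhadoop/example.txt"));
        System.out.println(classify("file:///data/hadoop/example.txt"));
        System.out.println(classify("/uhadoop/example.txt"));
    }
}
```

An unqualified path such as `/uhadoop/example.txt` carries no scheme, which is why the explicit prefix is needed when a command could mean either file system.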

7. Let’s Build Our First MapReduce Job
Hint: Use Monday’s slides for “inspiration”: http://aidanhogan.com/teaching/cc5212-1-2016/ (also copied in CitationCount.java)
    Implement the map(.,.,.,.) method
    Implement the reduce(.,.,.,.) method
    Implement the main(.) method
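As a warm-up before writing the real job, here is a minimal sketch of the word-count logic in plain Java, with no Hadoop classes at all. It is not the code from the slides or from CitationCount.java: the class name `WordCountSketch`, the simplified `map`, `reduce` and `run` signatures, and the two sample sentences are all assumptions made for illustration. The real Hadoop methods take more arguments and write their output through the framework rather than returning it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WordCountSketch {
    // "Map" step: emit one (word, 1) pair per whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                pairs.add(Map.entry(token, 1));
            }
        }
        return pairs;
    }

    // "Reduce" step: sum every count that was emitted for one word.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Stand-in for the framework: group the map output by word (the "shuffle"),
    // then call reduce once per word.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            result.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up input lines, standing in for the abstracts file.
        Map<String, Integer> counts =
                run(List.of("la casa de papel", "el fin de la historia"));
        counts.forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

On a cluster the grouping in `run` is done by Hadoop between the map and reduce phases, so only the two per-record steps and the job configuration in `main` have to be written by hand.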

8. Eclipse: Build jar
Right-click build.xml > dist (you might need to make a dist folder first)

9. WinSCP: Copy .jar to Master Server
Log in (the slide's screenshot marks the steps 1 to 4). Don’t save your password!
Create dir: /data/2014/uhadoop/[username]/
Copy your mdp-lab4.jar into it.

10. PuTTY: Run Job
    hadoop jar /data/2014/uhadoop/[username]/mdp-lab4.jar WordCount /uhadoop/[username]/es-wiki-abstracts.txt[.gz] /uhadoop/[username]/wc/
All one command!

11. PuTTY: Look at output
    hdfs dfs -ls /uhadoop/[username]/wc/
    hdfs dfs -cat /uhadoop/[username]/wc/part-r-00000 | more
    hdfs dfs -cat /uhadoop/[username]/wc/part-r-00000 | grep -P "^de\t" | more
The last one is all one command! Look for “de”: it had 4916432 occurrences in the local run.
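The grep pattern `^de\t` only matches lines whose key is exactly “de”, because the tab that separates key and count ends the match. A minimal Java sketch of the same lookup follows, assuming each output line has the form word, tab, count. The class name `OutputLookupSketch`, the method `lookup`, and the sample lines are hypothetical, and the counts in them are made up rather than taken from the real output.

```java
import java.util.List;
import java.util.OptionalLong;

class OutputLookupSketch {
    // Returns the count for `word` from lines shaped like "word<TAB>count".
    // Matching on word + tab mirrors the anchored pattern ^de\t, so a longer
    // word that merely starts with "de" is not picked up.
    static OptionalLong lookup(List<String> lines, String word) {
        String prefix = word + "\t";
        for (String line : lines) {
            if (line.startsWith(prefix)) {
                String count = line.substring(prefix.length()).trim();
                return OptionalLong.of(Long.parseLong(count));
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Invented sample lines; the real part-r-00000 file is far larger.
        List<String> sample = List.of("del\t5", "de\t2", "desde\t1");
        System.out.println(lookup(sample, "de"));
    }
}
```

Without the tab in the prefix, “del” would be returned first, which is the mistake the anchored pattern on the slide avoids.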

12. Look at output through browser http://cluster.dcc.uchile.cl:50070/