Tutorial: Running the MapReduce EEMD Code with Hadoop on FutureGrid
by Rewati Ovalekar
● Step 1:
  – Download the code from: %2Ftrunk%2Fproject%2Fspring2011%2FEEMDAnalysis%2FEEMDJava
● Step 2:
  – Create a FutureGrid account.
  – For further details, refer to the FutureGrid Tutorial.
● Step 3:
  – Log in to FutureGrid:
      ssh
  – The following message is displayed on a successful login:
● Step 4:
  – Create a jar file.
● Step 5:
  – To transfer the jar file and the input file:
      sftp
      put /../filepath
● Step 6:
  – To run Hadoop on FutureGrid, create a Eucalyptus account.
  – For further details refer:
● Step 7:
  – Once the account is approved, load the Eucalyptus tools:
      module load euca2ools
● Step 8:
  – Make sure that the jar file and the input file are in the same directory as the username.private key.
  – Run the image that has Hadoop on it:
      euca-run-instances -k rovaleka -t c1.xlarge emi-D778156D
    -k indicates the key name
    -t indicates the type of instance
    emi-D778156D indicates the image name
    -n indicates the number of instances to run
● Step 8 (continued):
  – Check the status using:
      euca-describe-instances
  – Keep checking until the status is "running". Once it is running, you can log in to run Hadoop. The status is displayed as below:
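The "keep checking" step can be automated with a small polling loop. The sketch below assumes that the line printed for an instance contains both its ID and the word "running" once it is up; the instance ID, the 10-second interval and the retry limit are illustrative values, not part of the tutorial. The DESCRIBE_CMD variable is a hypothetical hook that lets the command be swapped out for testing.

```shell
# Poll the instance listing until the given instance reports "running".
# Defaults (10 s interval, 60 tries) are assumptions for illustration.
wait_for_running() {
    instance_id="$1"
    interval="${2:-10}"      # seconds between polls
    max_tries="${3:-60}"     # give up after this many polls
    describe_cmd="${DESCRIBE_CMD:-euca-describe-instances}"
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        # Keep only the line for this instance, then look for the state word.
        if "$describe_cmd" | grep -F "$instance_id" | grep -qw "running"; then
            echo "$instance_id is running"
            return 0
        fi
        tries=$((tries + 1))
        sleep "$interval"
    done
    echo "$instance_id did not reach the running state" >&2
    return 1
}

# Usage (hypothetical instance ID):
#   wait_for_running i-12345678
```

The function returns a non-zero status if the instance never reaches the running state, so it can gate the later login step in a script.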
● Step 9:
  – Transfer the input file and the jar file to the required VM using:
      scp -i username.private filename
    (Make sure that the address is the same as the address assigned to you; otherwise it will ask for a password.)
  – Log in using:
      ssh -i username.private
    (Make sure the address is the same as the address assigned to you.)
SINGLE NODE
● Step 10:
  – The above message is displayed on a successful login.
  – Retrieve the transferred files and move them into the Hadoop folder:
      cd /..
      mv filename /opt/hadoop
      cd /opt/hadoop
● Step 11:
  – To run Hadoop:
      cd /opt/hadoop
      bin/start-all.sh
  – To check that everything has started:
      jps
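Reading the jps listing by eye is easy to get wrong, so the check can be scripted. The sketch below assumes the image runs a Hadoop 0.20/1.x style single-node setup, where start-all.sh launches NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker; the tutorial does not state the Hadoop version, so adjust the list if the image differs. The function name is hypothetical.

```shell
# Daemons expected after bin/start-all.sh on a single node
# (assumes a Hadoop 0.20/1.x layout; not stated in the tutorial).
EXPECTED_DAEMONS="NameNode DataNode SecondaryNameNode JobTracker TaskTracker"

# Read jps output on stdin ("<pid> <name>" per line) and print every
# expected daemon that is absent, one per line.
missing_daemons() {
    jps_output=$(cat)
    for daemon in $EXPECTED_DAEMONS; do
        # Compare the second field exactly, so "NameNode" does not
        # match "SecondaryNameNode".
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Usage:
#   jps | missing_daemons
```

An empty result means all five daemons were found; any name printed is a daemon whose log under the logs folder is worth checking before continuing.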
● Step 12:
  – Transfer the input file to HDFS:
      bin/hadoop dfs -copyFromLocal inputfile name_in_HDFS
  – To check that it is present on HDFS:
      bin/hadoop dfs -ls
  NOTE: The input file must be transferred again whenever Hadoop is started.
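Because the upload has to be repeated after every Hadoop start, it may help to wrap the copy and the listing in one function. This is a minimal sketch using only the two commands from this step; the function name and the HADOOP variable (defaulting to bin/hadoop, so it should be run from /opt/hadoop) are hypothetical conveniences.

```shell
# Copy a local file into HDFS, then list it to confirm the upload.
upload_input() {
    local_file="$1"
    hdfs_name="$2"
    hadoop_cmd="${HADOOP:-bin/hadoop}"
    # Fail early if the local file is missing.
    if [ ! -f "$local_file" ]; then
        echo "no such local file: $local_file" >&2
        return 1
    fi
    "$hadoop_cmd" dfs -copyFromLocal "$local_file" "$hdfs_name" || return 1
    "$hadoop_cmd" dfs -ls "$hdfs_name"
}

# Usage (file names are placeholders):
#   upload_input inputfile name_in_HDFS
```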
● Step 13:
  – To run the code:
      bin/hadoop jar [jarFile] EEMDHadoop [inputfilename] [required_output_file]
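When the job is launched from a script, it is useful to check its exit status rather than assume success. The sketch below wraps the command from this step; the function name and the HADOOP variable are hypothetical, and the hint about logs/userlogs simply repeats the location the tutorial gives for debugging.

```shell
# Run the EEMD job and report success or failure based on the exit status.
run_eemd() {
    jar_file="$1"
    input_name="$2"
    output_name="$3"
    hadoop_cmd="${HADOOP:-bin/hadoop}"
    if "$hadoop_cmd" jar "$jar_file" EEMDHadoop "$input_name" "$output_name"; then
        echo "job finished; output in HDFS at $output_name"
    else
        status=$?   # exit status of the hadoop command above
        echo "job failed with status $status; see logs/userlogs" >&2
        return "$status"
    fi
}

# Usage (names are placeholders):
#   run_eemd eemd.jar name_in_HDFS eemd_output
```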
● Step 14:
  – Retrieve the output:
      bin/hadoop dfs -copyToLocal [outputFileName] [outputfileNameToBeGiven]
    (The output will be available in the part file.)
  – To check the logs and debug the code, go to the folder logs/userlogs.
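If the job writes more than one part file, they can be concatenated after the copy to the local disk. This is a sketch that assumes the local output directory contains files whose names start with "part-" (for example part-00000); the function name and the merged file name are hypothetical.

```shell
# Concatenate the part files of an output directory that has already been
# copied out of HDFS with "bin/hadoop dfs -copyToLocal".
merge_parts() {
    output_dir="$1"
    merged_file="$2"
    # The shell expands part-* in sorted order, so part-00000 comes first.
    set -- "$output_dir"/part-*
    # If nothing matched, the pattern is left unexpanded and does not exist.
    if [ ! -e "$1" ]; then
        echo "no part files in $output_dir" >&2
        return 1
    fi
    cat "$@" > "$merged_file"
}

# Usage (names are placeholders):
#   merge_parts outputfileNameToBeGiven eemd_result.txt
```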
● Step 15:
  – Stop Hadoop:
      bin/stop-all.sh
      exit
Thank you!