Apache Bigtop Working Group Cluster stuff
Cloud computing
Bigtop Administration
– Make sure you are signed up on the bigtop-dev mailing list. A lot of information is posted there and will never be repeated if you miss it.
– Mailing lists: bigtop-user, bigtop-dev
Bigtop Administration
– Sign up for JIRA
Bigtop Administration
– Registration: join Biocurious. Registration pays for the space; nobody takes a cut of this.
– Free drinks
– Registration = AWS credits
– Cancelling IntelliJ. Expires end of April.
Newbie Slide
Structure:
– Do the labs
    Lab 1: Modified to take 1-2 weeks. Update the wiki with your findings
    Lab 2: Build Bigtop 0.3.0. Can start projects here, do JIRA tickets
    Lab 3: MapReduce program
    Lab 4: Run the unit tests under the component downloads
    Lab 5: Run the integration tests
    Lab 6: Puppet, deploy and run
    Lab 7: Port a module
– Labs are changing; this is not a class. Requires a time commitment
– Demo: doesn't need to be working; it is for your benefit, not ours
Lab 1
Install Bigtop. Web search for "apache bigtop", then go to the wiki links:
IGTOP/Index
IGTOP/How+to+install+Hadoop+distribution+from+Bigtop
Lab 1
– Install Bigtop and run all the components: Hive, HBase, Pig, Hadoop, Mahout, Oozie
– There are bugs; document them
– Add the sample programs in the quickstart to the wiki. Not all are included yet
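A quick way to see which components made it onto a machine before running them is to check for their command-line launchers. This is only a sketch: it assumes each component installs a launcher with the same lowercase name as the component, which may not hold for every package version.

```shell
#!/usr/bin/env bash
# Sketch: report which Bigtop component launchers are on the PATH.
# Assumption: launcher names match the lowercase component names.

components="hadoop hive hbase pig mahout oozie"

check_components() {
    # Print one "name: found" or "name: MISSING" line per argument.
    for name in "$@"; do
        if command -v "$name" >/dev/null 2>&1; then
            echo "$name: found"
        else
            echo "$name: MISSING"
        fi
    done
}

# Word splitting of $components is intentional here.
check_components $components
```

A launcher being present only shows that the package installed; each component still needs to be run, and any bugs documented on the wiki.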
Lab 1: Update the wiki
– Sqoop: open (user group meeting next week)
– Flume/Flume NG: open/nothing
– ZooKeeper: open/nothing
Hadoop Components
– Old: don't stop at running Pi as a test of HDFS
– Still missing: run Terasort in Hadoop; needs a cluster
  IGTOP/How+to+install+Hadoop+distribution+from+Bigtop
– Whirr may need a patch depending on where you run it from
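Going beyond Pi could look like the sketch below, which chains the Pi job with TeraGen, TeraSort and TeraValidate from the Hadoop examples jar. The jar location, row count and HDFS working directory are assumptions, not values from the lab; adjust them to the installation. The script only prints the commands unless RUN=1 is set.

```shell
#!/usr/bin/env bash
# Sketch: Pi smoke test followed by TeraGen, TeraSort and TeraValidate.
# EXAMPLES_JAR is an assumed location; adjust it to your install.
EXAMPLES_JAR="${EXAMPLES_JAR:-/usr/lib/hadoop/hadoop-examples.jar}"
ROWS="${ROWS:-100000}"            # TeraGen writes 100-byte rows
BASE="${BASE:-/tmp/terasort}"     # hypothetical HDFS working directory

hadoop_example_cmds() {
    # Print the commands in the order they should run.
    echo "hadoop jar $EXAMPLES_JAR pi 10 1000"
    echo "hadoop jar $EXAMPLES_JAR teragen $ROWS $BASE/in"
    echo "hadoop jar $EXAMPLES_JAR terasort $BASE/in $BASE/out"
    echo "hadoop jar $EXAMPLES_JAR teravalidate $BASE/out $BASE/report"
}

# Dry run by default; set RUN=1 on a machine with a working cluster.
hadoop_example_cmds | while read -r cmd; do
    if [ "${RUN:-0}" = "1" ]; then
        # stdin is redirected so the job cannot swallow the remaining lines.
        $cmd </dev/null || exit 1
    else
        echo "would run: $cmd"
    fi
done
```

TeraValidate reads the sorted output and writes a report directory, so a clean run exercises HDFS writes, the shuffle, and HDFS reads rather than just job submission.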
Mahout
– Don't run the jar like in Hadoop
– Scripts handle downloading, clustering, demo, etc. They live under /examples/bin
– Bigtop puts example/bin under /usr/share/doc/mahout. Is this correct? These are not documentation
– Add documentation to the wiki
– Ticket filed
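To find out which example scripts a package actually shipped, something like the following sketch can list them. The default directory is the one named on the slide; MAHOUT_DOC_DIR is a hypothetical override variable introduced here for illustration.

```shell
#!/usr/bin/env bash
# Sketch: list the shell scripts shipped below a Mahout examples directory.

list_example_scripts() {
    # Print every *.sh file below the given directory, sorted.
    dir="$1"
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    find "$dir" -type f -name '*.sh' | sort
}

# MAHOUT_DOC_DIR is a hypothetical override, not a Mahout setting.
list_example_scripts "${MAHOUT_DOC_DIR:-/usr/share/doc/mahout}" || true
```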
Oozie
– Oozie runs; ignore the error message
– Set to the highest version
Flume/Flume NG
– New patch checked in for Flume NG
– Testing
Whirr
– Install: sudo apt-get install whirr
– Run as: whirr launch-cluster --config /usr/lib/whirr/recipes/mahout-ec2.properties
– If successful, you will see a directory under ~/.whirr
– whirr.log
– mvn clean install
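The launch-and-check step can be wrapped in a small script. This is a sketch under assumptions: the recipe is passed as the first argument, it sets whirr.cluster-name, and the state directory under ~/.whirr is named after that cluster. EC2 recipes typically read credentials from the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables, so those would need to be exported first.

```shell
#!/usr/bin/env bash
# Sketch: launch a Whirr cluster from a recipe, then list its state directory.
# Assumption: the recipe defines whirr.cluster-name.

cluster_name_from_recipe() {
    # Print the value of the first whirr.cluster-name line in the recipe.
    sed -n 's/^whirr\.cluster-name=//p' "$1" | head -n 1
}

if [ -n "${1:-}" ] && command -v whirr >/dev/null 2>&1; then
    # On success, Whirr records cluster details under ~/.whirr.
    whirr launch-cluster --config "$1" &&
        ls "$HOME/.whirr/$(cluster_name_from_recipe "$1")"
else
    echo "usage: $0 RECIPE.properties (requires whirr on the PATH)"
fi
```

When finished, `whirr destroy-cluster --config` with the same recipe shuts the instances down so they stop consuming AWS credits.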
Puppet
– sudo apt-get install puppet
– facter fails
Ticket Questions/Demo
– Should the Bigtop install include stable for Ubuntu? What is the difference between stable and Bigtop incubating? There used to be a difference.
– Monitoring: metrics.properties -> metrics2. Ganglia or JMX? Applies to all components with a daemon. Bruno has Ganglia recipes to monitor the status of the cluster.
– Hadoop monitoring: performance and functionality. Hooked up to Kerberos; the commercial version is Cloudera Manager.
– Networking, I/O, block sizes, swap space, disk space.
– Stable vs. incubating?
– Anwar: LogMining (M/R, clickstream and FE log data, exceptions on a day-to-day basis)
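For the Ganglia option under metrics2, a configuration could be generated as in the sketch below. It assumes a Hadoop version that ships the metrics2 Ganglia sinks and a Ganglia 3.1 or newer collector; the host name is a placeholder, and the file is written to a temporary location rather than into the Hadoop configuration directory.

```shell
#!/usr/bin/env bash
# Sketch: write a sample hadoop-metrics2.properties that reports to Ganglia.
# Assumption: the installed Hadoop includes the metrics2 GangliaSink31 class.

write_ganglia_sink_config() {
    # $1 = output file, $2 = Ganglia collector as host:port
    cat > "$1" <<EOF
# Use the Ganglia 3.1 sink for every daemon, reporting every 10 seconds
*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
*.sink.ganglia.period=10
namenode.sink.ganglia.servers=$2
datanode.sink.ganglia.servers=$2
jobtracker.sink.ganglia.servers=$2
tasktracker.sink.ganglia.servers=$2
EOF
}

# ganglia.example.com is a placeholder host; 8649 is the usual gmond port.
sample="${TMPDIR:-/tmp}/hadoop-metrics2.properties.sample"
write_ganglia_sink_config "$sample" "ganglia.example.com:8649"
cat "$sample"
```

The daemons read this file at startup, so they would need a restart after the sample is reviewed and copied into place.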