BigML “Warren Buffett is one of the best learning machines on this earth. The turtles which outrun the hares are learning machines. If you stop learning.

1 BigML “Warren Buffett is one of the best learning machines on this earth. The turtles which outrun the hares are learning machines. If you stop learning in this world, the world rushes right by you.” Lucas Remmerswaal

2 What is BigML? www.bigml.com
A predictive analytics tool offered as software-as-a-service (SaaS). In particular, it uses machine learning algorithms to create decision trees.

3 About BigML “The service can be used in production mode or development mode. Development mode is free but limited in the size of tasks that can be completed. Production mode is a paid mode and credits can be purchased ad hoc in blocks or on a subscription basis. This is a familiar pattern from other cloud based services like storage or compute servers” Source:

4 Interesting Fact BigML was evaluated by Gartner but excluded from its analysis:
“Among those evaluated by Gartner but excluded from the analysis: BigML, Business-Insight, Dataiku, Dato, H2O.ai, MathWorks, Oracle, Rapid Insight, Salford Systems, Skytree and TIBCO.”

5 Industry Competition “The primary goal of many machine learning models is to make accurate predictions from unseen data.” How well does BigML do in this respect? Source:

6 Pros / Cons
Pros:
Ease of use (no need to write code to make predictions)
Ability to store models locally and make predictions offline
Functionality (fast predictions offline)
Cons:
No batch predictions (one per API call)
Individual predictions through the API are very slow
Cost ($45,000/year for 1 server with 8 cores)
Source: https://bigml.com/pricing

7 BigML & MIST 5620 One assignment: BigML Project (Group) – 10 points
This assignment familiarizes you with BigML. You will first complete training, answer a few questions about the 2012 presidential election and churn, and then use a dataset to create a predictive machine learning model of your choice. Finally, you will turn in a PDF file of your work.

8 BigML & MIST 5620 Step 1: Register to use BigML
Its use is free, but you need to register here.
Step 2: Learn about BigML
Features can be learned here. More details and a complete example are available here. Read more about BigML here. At each node in a decision tree, BigML reports how confident it is that the prediction at that node is correct. To understand the meaning of confidence, read an article here.
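The confidence reported at a node can be illustrated with a small calculation. The sketch below assumes confidence is the lower bound of a Wilson score interval, which is how BigML's documentation describes it for classification trees; check the linked article before relying on this. The function name and the 95% z-value are illustrative choices, not part of BigML.

```python
import math

def wilson_lower_bound(correct, total, z=1.96):
    """Lower bound of the Wilson score interval for a proportion.

    correct: training instances at the node that match the predicted class
    total:   all training instances at the node
    z:       1.96 corresponds to a 95% interval (an assumed default)
    """
    if total == 0:
        return 0.0
    p = correct / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - margin) / (1 + z * z / total)

# Same 90% hit rate, but the node with more instances earns more confidence.
small_node = wilson_lower_bound(9, 10)
large_node = wilson_lower_bound(90, 100)
```

Under this definition, a node supported by few instances gets a noticeably lower confidence than a node with the same proportion of correct cases but more instances.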

9 BigML & MIST 5620 Step 3: Your Project Description
Part A. In the 2012 Presidential Election, Obama and the Democrats received considerable recognition for the use of analytics to understand the electorate and get out the vote. Use the 2012 Presidential Election Winner dataset in BigML to answer the following questions:
Which single variable best predicts whether a person voted for Obama?
Which path has the highest confidence level in predicting whether a person voted for Obama or Romney?

10 BigML & MIST 5620 Step 3: Your Project Description
Part B. Churn is a major problem for telecommunications firms. It is not unusual for 20 percent of a company’s customers to not renew their contracts. Because of this, telcos are using analytics to identify the customers who are most likely to churn so they can intervene and try to persuade these customers to stay, for example with attractive renewal offers or the promise of better service through a new cell tower. Use the Telco churn dataset to develop a model to predict churn. Exclude any variables that are unlikely to be related to churn and explain your reasoning. Also check for any variables that need to be recategorized from numerical to categorical and discuss why. What is the single most important variable in predicting churn?
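To see why recategorizing matters, consider a code-like field stored as a number: a model would otherwise treat it as a quantity that can be ordered and averaged. The sketch below is a minimal illustration outside BigML; the column names `area_code` and `minutes` and the sample rows are hypothetical, not taken from the Telco churn dataset.

```python
def recategorize(rows, column):
    """Return copies of the rows with `column` converted from a number to a text label."""
    return [{**row, column: str(row[column])} for row in rows]

# Hypothetical rows; "area_code" and "minutes" are made-up column names.
customers = [
    {"area_code": 415, "minutes": 120.5, "churn": "no"},
    {"area_code": 408, "minutes": 310.0, "churn": "yes"},
]

# An area code is an identifier, not a quantity, so it becomes a category.
prepared = recategorize(customers, "area_code")
```

After the conversion, a split can only ask whether the value equals a given label, which matches what the field actually means. Genuine quantities such as `minutes` are left as numbers.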

11 BigML & MIST 5620 Step 3: Your Project Description
Part C. Using one of the other datasets provided on the BigML site, create a model, and then develop questions that you answer using the Prediction feature. In designing your model, first build it using part of the data (say 80%) and then test it using the remaining data. How good is your model? Provide a discussion.
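The 80/20 procedure in Part C can be sketched in plain Python. This is an illustration of the idea, not BigML's own workflow: the helper names, the sample rows, and the majority-class "model" are all hypothetical stand-ins for the model you would build in BigML.

```python
import random

def train_test_split(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    if not actual:
        raise ValueError("no test rows to evaluate")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)

# Hypothetical labelled rows of the form (feature, label).
rows = [(i, "churn" if i % 4 == 0 else "stay") for i in range(20)]
train, test = train_test_split(rows)

# Stand-in model: always predict the most common label seen in training.
train_labels = [label for _, label in train]
majority = max(set(train_labels), key=train_labels.count)
predictions = [majority for _ in test]

# Accuracy is measured only on rows the model never saw.
test_accuracy = accuracy([label for _, label in test], predictions)
```

A majority-class baseline like this is a useful reference point when discussing how good your model is: a real model should beat it on the held-out 20%.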

12 BigML & MIST 5620 Step 3: Your Project Data & Timeline
You will use data sources provided on the BigML site. The datasets are small but will give you the experience and skills needed to take on larger projects. You have nearly one month to turn in your PDF file on eLc (the due date is Dec 2 at 11:59 pm); see guidelines here.

13 Conclusion
BigML is BI software designed to help you incorporate machine learning into your organization.
It offers a variety of analyses (ranging from regression to cluster analysis and anomaly detection) in ways that are accessible and easy for users with no technical knowledge.
Good accuracy (~0.96).
It can be expensive, though, and it lacks batch predictions.

