5-star Ratings & Recommendations with Mahout

Presentation transcript:

5-star Ratings & Recommendations with Mahout. Robin Bramley, Chief Scientific Officer, Ixxus

We are a leading global provider of end-to-end custom-built content solutions.

Our Alfresco Credentials: Long-standing Platinum Alfresco partner in US and UK. Working with Alfresco since Alfresco v0.6. Excellent Alfresco knowledge and highly trained and experienced staff. We are trusted to deliver some of the largest Alfresco projects in the world. Alfresco Million $ Club (May 2012). Best Solution Partner (Nov 2013).

Award-winning projects: Contributed to:

Presented at: Published in:

Discovering existing knowledge: How did we find answers 30 years ago? How was that information organised? Encyclopædia, Library, Bookshop. Printed 1768 - 2010

The landscape changed: “Updating dozens of books every two years now seems so pedestrian. The younger generation consumes data differently now, and we want to be there.” Jorge Cauz, Britannica, 2012

Number 6: “What do you want?” Number 2: “We want information.” The Prisoner

Discoverability: Metadata is key. Permits discovery through multiple dimensions.

Finding stuff in Alfresco A quick recap

Wordle: Browse, Keyword search, Advanced search, Faceted navigation, Workflow, Taxonomy, Folksonomy tags, Dashlets, Image browsing, Association relationships, Favourites, Likes

Wanted to use the Anthrax Anti-Social single cover here – copyright stopped play. Audience participation exercise.

Social content

Recommendations in a nutshell: Alice and Barbara. “I love my new iPhone 6.” “Me too! Alfresco on iOS is great, isn’t it?” “If you like Alfresco you should check out Robin’s Summit talk…”

Collaborative filtering: user similarity. Recommendations in a nutshell. [Diagram: users A, B and C; items 1 to 5]
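User-similarity collaborative filtering can be illustrated with a small, self-contained sketch. The ratings below are invented for three users A, B and C and five items; they are not data from the talk. The function names `pearson` and `recommend` are hypothetical. The sketch uses Pearson correlation as the similarity measure, which is one common choice and not necessarily the one used in the demo, and it does not call Mahout.

```javascript
// Hypothetical 5-star ratings: user -> { item -> stars }. Not data from the talk.
var ratings = {
  A: { "1": 5, "2": 3, "3": 4 },
  B: { "1": 4, "2": 2, "3": 5, "4": 5 },
  C: { "1": 1, "2": 5, "5": 4 }
};

// Pearson correlation over the items both users have rated.
// Returns 0 when there are fewer than two shared items or no variance.
function pearson(a, b) {
  var shared = Object.keys(a).filter(function (item) { return item in b; });
  var n = shared.length;
  if (n < 2) return 0;
  var meanA = 0, meanB = 0;
  shared.forEach(function (i) { meanA += a[i] / n; meanB += b[i] / n; });
  var cov = 0, varA = 0, varB = 0;
  shared.forEach(function (i) {
    var da = a[i] - meanA, db = b[i] - meanB;
    cov += da * db;
    varA += da * da;
    varB += db * db;
  });
  if (varA === 0 || varB === 0) return 0;
  return cov / Math.sqrt(varA * varB);
}

// Score items the user has not rated as a similarity-weighted average
// of the ratings given by positively correlated users.
function recommend(user, data) {
  var mine = data[user], totals = {}, weights = {};
  Object.keys(data).forEach(function (other) {
    if (other === user) return;
    var sim = pearson(mine, data[other]);
    if (sim <= 0) return; // ignore dissimilar users
    Object.keys(data[other]).forEach(function (item) {
      if (item in mine) return; // already rated by this user
      totals[item] = (totals[item] || 0) + sim * data[other][item];
      weights[item] = (weights[item] || 0) + sim;
    });
  });
  return Object.keys(totals)
    .map(function (item) { return { item: item, score: totals[item] / weights[item] }; })
    .sort(function (x, y) { return y.score - x.score; });
}

var recsForA = recommend("A", ratings);
```

With this invented data, A correlates positively with B and negatively with C, so only B's rating of item 4 contributes to A's recommendations and item 5 is left out.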

Alfresco 5-star ratings: A 5-star rating scheme is supported by the Ratings Service. Not exposed in Share. Nod to metaversant / Jeff Potts’ 5-star Share extension.

Demo time

Overview Diagram needs to be made clearer for projection

Technical details UML class diagram here?

5 stars give us preference level Taste
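Taste works with preferences expressed as (user, item, value) triples, and its file-based data model reads comma-separated lines of numeric user ID, item ID and preference value. A minimal sketch of turning star ratings into such lines follows. The record shape (`user`, `nodeRef`, `stars`), the sample node references and the helper names are assumptions for illustration, not the talk's implementation.

```javascript
// Hypothetical rating records; the field names and values are assumptions.
var starRatings = [
  { user: "alice", nodeRef: "workspace://SpacesStore/doc-1", stars: 5 },
  { user: "barbara", nodeRef: "workspace://SpacesStore/doc-1", stars: 4 },
  { user: "alice", nodeRef: "workspace://SpacesStore/doc-2", stars: 2 }
];

// Assign each distinct string a stable numeric ID, starting at 1.
function makeIdMapper() {
  var ids = {}, next = 1;
  return function (key) {
    if (!Object.prototype.hasOwnProperty.call(ids, key)) ids[key] = next++;
    return ids[key];
  };
}

// Produce "userID,itemID,preference" lines, using the star count as the preference value.
function toPreferenceLines(records) {
  var userId = makeIdMapper(), itemId = makeIdMapper();
  return records.map(function (r) {
    return [userId(r.user), itemId(r.nodeRef), r.stars].join(",");
  });
}

var preferenceLines = toPreferenceLines(starRatings);
```

In a real integration the mapping between user names, node references and numeric IDs would need to be persisted so that recommended item IDs can be translated back into documents.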

The elephant in the room

Hadoop: Hadoop was named after a stuffed toy elephant owned by the son of Doug Cutting, who started the project. Hadoop was extracted from the Nutch crawler Lucene sub-project and provides a scalable batch data processing framework using Map-Reduce on top of a distributed file system (HDFS). The use of Hadoop is beyond the scope of this session.

Mahout started off as a sub-project of Apache Lucene. Portions of Mahout were* built on top of Hadoop. The name is a Hindi word referring to an elephant driver. * The project is moving over to Apache Spark.

Recommendations: user or item similarity. Clustering: grouping similar documents. Classification: reduce manual burden of assigning categories.

RDBMS data source

Back to the demo

Overview Diagram needs to be made clearer for projection

Technical details

Sample Code

    script:
    {
        // Extract the AVM store id and path from the URL extension.
        var fullpath = url.extension.split("/");
        // split() on an empty string yields [""], so also test the first element.
        if (fullpath.length == 0 || fullpath[0] == "")
        {
            status.code = 400;
            status.message = "Store id has not been provided.";
            status.redirect = true;
            break script;
        }
        var storeid = fullpath[0];
        var path = (fullpath.length == 1 ? "/" : "/" + fullpath.slice(1).join("/"));
    }

Questions

Image credits
Land Rover Discovery 3: http://www.flickr.com/photos/klausnahr/2572689595/
Encyclopædia: http://www.flickr.com/photos/stewart/461099066/
Dewey Decimal: http://www.flickr.com/photos/brewbooks/4467301505/
Book store: http://www.flickr.com/photos/brewbooks/6541665609/
Anti-social sign: https://www.flickr.com/photos/ell-r-brown/6937806186