Automated Recipe Completion using Multi-Label Neural Networks
Alexander Politowicz
12/12/2018
Executive Summary
- Goal: using a partial list of ingredients, predict the missing ingredients.
- Completed: the model attempts to predict the missing ingredients.
- Unfortunately, accuracy is not great, so there is room for improvement.
- Example ingredients (from the slide figure): sesame oil, Sichuan peppercorns, cherry tomatoes, paprika
Approaches and Data Analysis
- Initial problem: both the input and the output vary in length, which is difficult for most machine learning (ML) models.
- Simplification: use ML to predict the cuisine, then retrieve the best-matching ingredients for that cuisine.
- Data:
  - Inputs: TF-IDF vectorized partial list of ingredients
  - Outputs: cuisine -> missing ingredients
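
The TF-IDF input encoding can be sketched in plain Python as follows. This is a minimal illustration, not the talk's implementation: the function name `tfidf_vectorize`, the smoothed IDF formula, and the toy recipes are all assumptions, and the slides do not say which library or weighting variant was used.

```python
import math
from collections import Counter


def tfidf_vectorize(recipes):
    """Return (vocabulary, vectors) for a list of ingredient lists.

    Hypothetical helper: each ingredient string is treated as one term.
    """
    # Vocabulary: every distinct ingredient, sorted so columns are stable.
    vocab = sorted({ing for recipe in recipes for ing in recipe})
    index = {ing: i for i, ing in enumerate(vocab)}
    n_recipes = len(recipes)

    # Document frequency: number of recipes that contain each ingredient.
    doc_freq = Counter(ing for recipe in recipes for ing in set(recipe))

    vectors = []
    for recipe in recipes:
        vec = [0.0] * len(vocab)
        for ing, count in Counter(recipe).items():
            tf = count / len(recipe)
            # Smoothed IDF (an assumed variant): rarer ingredients weigh more.
            idf = math.log((1 + n_recipes) / (1 + doc_freq[ing])) + 1.0
            vec[index[ing]] = tf * idf
        vectors.append(vec)
    return vocab, vectors


# Toy data for illustration only, not the dataset used in the talk.
recipes = [
    ["sesame oil", "sichuan peppercorns", "soy sauce"],
    ["cherry tomatoes", "paprika", "olive oil"],
    ["soy sauce", "sesame oil", "ginger"],
]
vocab, vectors = tfidf_vectorize(recipes)
```

Every recipe becomes a fixed-length vector regardless of how many ingredients it lists, which is what removes the variable-length input problem.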
Methodology
- Neural network:
  - Input: TF-IDF vectorized list of ingredients
  - Hidden: 300 nodes
  - Hidden: 100 nodes
  - Output: Number of missing ingredients
- Post-processing:
  - Given the predicted cuisine, retrieve its most common ingredients
  - Recommend by density of main ingredients (e.g. chicken legs, carrots) and spices (e.g. salt, vegetable oil)
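
The post-processing step could look roughly like the sketch below. It assumes the cuisine has already been predicted and ranks candidates purely by how often they appear in that cuisine; the split between main ingredients and spices mentioned on the slide is not modeled here. The function names, the parameter `k`, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def build_cuisine_counts(labelled_recipes):
    """Count, per cuisine, how many recipes contain each ingredient."""
    counts = defaultdict(Counter)
    for cuisine, ingredients in labelled_recipes:
        counts[cuisine].update(set(ingredients))
    return counts


def recommend_missing(cuisine, partial, cuisine_counts, k=3):
    """Suggest up to k common ingredients of the cuisine not already present."""
    have = set(partial)
    counter = cuisine_counts.get(cuisine, Counter())
    # Sort by descending frequency, then by name so ties are deterministic.
    ranked = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
    return [ing for ing, _ in ranked if ing not in have][:k]


# Toy data for illustration only.
labelled = [
    ("chinese", ["soy sauce", "sesame oil", "ginger"]),
    ("chinese", ["soy sauce", "sichuan peppercorns", "sesame oil"]),
    ("chinese", ["soy sauce", "garlic"]),
    ("italian", ["olive oil", "cherry tomatoes", "garlic"]),
]
counts = build_cuisine_counts(labelled)
suggestions = recommend_missing("chinese", ["sesame oil", "ginger"], counts, k=2)
```

A frequency-only ranking like this always favors the most generic ingredients of a cuisine, so it would tend to miss recipes that are unusual for their cuisine.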
Results
- Neural network performance: 71.6%
- Algorithm performance: 62.4%
Discussion
- The neural network likely overfits.
- The post-processing technique is not powerful enough to catch unique cases.
- Future improvements:
  - Look into a better neural network structure and dropout to combat overfitting
  - Investigate better methods of encoding ingredients (or similar data) for prediction
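
For reference, dropout itself is a small operation. The sketch below shows the common "inverted dropout" variant on a plain list of activations. It is an illustration only: the talk does not specify a framework or a dropout rate, and in practice a deep learning library's built-in dropout layer would be used instead of this hypothetical `dropout` function.

```python
import random


def dropout(activations, rate, training=True, rng=random):
    """Inverted dropout: zero each unit with probability `rate` during
    training and rescale the survivors so the expected value is unchanged."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # At inference time the activations pass through untouched.
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Example: apply 50% dropout to the output of a hidden layer (toy values).
hidden = [1.0] * 1000
dropped = dropout(hidden, rate=0.5, rng=random.Random(0))
```

Because a different random subset of units is silenced on each training step, the network cannot rely on any single unit, which is why dropout is a common remedy for overfitting.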