1
Deploy Tensorflow on PySpark
April 2016, AILab
2
1. Hyperparameter Tuning
2. Deploying models at scale
3
Hyperparameter Tuning
Machine learning practitioners rerun the same model multiple times with different hyperparameters in order to find the best set. This is a classical technique called hyperparameter tuning.
4
Hyperparameter Tuning
We can use Spark to broadcast the common elements, such as the data and the model description, and then schedule the individual repetitive computations across a cluster of machines. (Diagram: distributing the cross-validation sets, Fold 1, Fold 2, and Fold 3, across the cluster.)
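As a rough sketch of that idea, the snippet below distributes (hyperparameter set, fold) pairs across the cluster; load_folds() and train_and_evaluate() are hypothetical placeholders for the talk's data-loading and TensorFlow training code, and sc is the SparkContext used throughout the talk.

param_grid = [{"lr": lr, "hidden": h} for lr in (0.01, 0.001) for h in (64, 128)]   # hypothetical grid

# Broadcast the common elements (the cross-validation folds / model description)
# once, so every worker reuses the same copy instead of re-shipping it per task.
folds = load_folds(k=3)                      # hypothetical: returns [(train, valid), ...]
folds_bc = sc.broadcast(folds)

def evaluate(task):
    params, fold_id = task
    train, valid = folds_bc.value[fold_id]
    score = train_and_evaluate(params, train, valid)   # hypothetical: runs TensorFlow locally
    return (tuple(sorted(params.items())), score)

# Schedule the repetitive computations (one task per hyperparameter set and fold)
# across the cluster, then average the fold scores for each setting.
tasks = [(params, i) for params in param_grid for i in range(len(folds))]
scores = (sc.parallelize(tasks, numSlices=len(tasks))
            .map(evaluate)
            .reduceByKey(lambda a, b: a + b)
            .mapValues(lambda total: total / len(folds))
            .collect())
best_params, best_score = max(scores, key=lambda kv: kv[1])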
5
Hyperparameter
6
Deploying Models at Scale
(Diagram: the broadcast trained model is applied to the ImageNet data in parallel; Batch 1, Batch 2, and Batch 3 each go through the model to produce their own labels.) ImageNet is a large collection of images from the internet that is commonly used as a benchmark in image recognition tasks.
7
Deploying Models at Scale
We are now going to take an existing neural network model that has already been trained on a large corpus (the Inception v3 model), and we are going to apply it to images downloaded from the internet. (Example output for one image: military uniform; suit, suit of clothes; academic gown, academic robe, judge's robe; bearskin, busby, shako; pickelhaube.) Inception v3 is a CNN image classifier developed by Google.
8
Deploying Models at Scale
The model is first distributed to the workers of the cluster using Spark's built-in broadcasting mechanism; it is then loaded on each node and applied to images. (Diagram: sc.broadcast ships the model to worker1, worker2, …)
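A minimal sketch of that broadcast step, assuming the Inception v3 GraphDef has already been saved to a local protobuf file on the driver (the file name is an assumption):

with open("classify_image_graph_def.pb", "rb") as f:   # assumed local path
    model_data = f.read()

model_bc = sc.broadcast(model_data)   # shipped once to every worker

# On each worker the graph is rebuilt from the broadcast bytes, e.g.:
#   graph_def = tf.GraphDef()
#   graph_def.ParseFromString(model_bc.value)
#   tf.import_graph_def(graph_def, name='')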
9
Read Test Data (ImageNet)
image_batch_size = 3
>>> batched_data = read_file_index()
(Diagram: the (id, url) index grouped into batch1, batch2, …)
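read_file_index() is only named on the slide; below is one plausible sketch, assuming an index file with one tab-separated "id url" pair per line (the file name and format are assumptions):

image_batch_size = 3

def read_file_index(path="imagenet_urls.tsv"):
    with open(path) as f:
        pairs = [tuple(line.strip().split("\t", 1)) for line in f if line.strip()]
    # Group the (id, url) pairs into batches of image_batch_size.
    return [pairs[i:i + image_batch_size]
            for i in range(0, len(pairs), image_batch_size)]

batched_data = read_file_index()
# batched_data[0] -> [(id1, url1), (id2, url2), (id3, url3)]   # batch1
# batched_data[1] -> [(id4, url4), (id5, url5), (id6, url6)]   # batch2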
10
Distribute Input data
>>> urls = sc.parallelize(batched_data)
(Diagram: the test images' (id, url) pairs, split into batches such as [(id1, url1), (id2, url2), (id3, url3)] for batch1 and [(id4, url4), (id5, url5), (id6, url6)] for batch2, become an RDD of batches via sc.parallelize; each mapper receives one batch alongside the broadcast model.)
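One natural design choice (an assumption, not shown on the slide) is to create one partition per batch, so that each mapper handles exactly one batch of URLs:

urls = sc.parallelize(batched_data, numSlices=len(batched_data))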
11
Splitting Input data
>>> labeled_images = urls.flatMap(apply_batch)
On each mapper, apply_batch gets the model data (the CNN TensorFlow graph) and predicts the images in its batch based on the trained model. A broadcast decoding dictionary for the labels is used to load a human-readable English name for each softmax node (1008 output nodes): because there are so many nodes in the output layer (each class is a node), the model uses encoded output labels for memory efficiency, and the dictionary decodes them.
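A sketch of what apply_batch might look like on each worker, using the TensorFlow 1.x API of the era; model_bc and labels_bc are the broadcast GraphDef bytes and label-decoding dictionary from the earlier slides, and run_image() is sketched under the Prediction slide below:

import tensorflow as tf

def apply_batch(batch):
    # Rebuild the CNN graph on this worker from the broadcast model data.
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(model_bc.value)
    with tf.Graph().as_default():
        tf.import_graph_def(graph_def, name='')
        with tf.Session() as sess:
            # One (id, predicted labels) record per image in the batch.
            return [(img_id, run_image(sess, url, labels_bc.value))
                    for img_id, url in batch]

labeled_images = urls.flatMap(apply_batch)   # flatMap flattens the per-batch lists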
12
GraphDef
The foundation of computation in TensorFlow is the Graph object. This holds a network of nodes, each representing one operation, connected to each other as inputs and outputs. After you've created a Graph object, you can save it out by calling as_graph_def(), which returns a GraphDef object.
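A small illustration of the Graph to GraphDef relationship (TensorFlow 1.x style):

import tensorflow as tf

g = tf.Graph()
with g.as_default():
    a = tf.constant(2.0, name="a")
    b = tf.constant(3.0, name="b")
    c = tf.add(a, b, name="c")

graph_def = g.as_graph_def()                    # serializable protobuf view of the graph
print([node.name for node in graph_def.node])   # ['a', 'b', 'c']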
13
Prediction
Fetch an image from the web and use the trained CNN to infer the topics of that image:
1. Download the image from the internet.
2. Run the CNN: compute forward propagation from image_data (the input nodes/data) using the trained weights; softmax_tensor is the output_layer tensor of the CNN.
3. Sort the probabilities.
4. Decode the labels.
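Those steps could look roughly like the following (a sketch; the tensor names 'softmax:0' and 'DecodeJpeg/contents:0' follow TensorFlow's classify_image example for Inception v3, and node_lookup is the broadcast decoding dictionary):

import numpy as np
from six.moves import urllib

def run_image(sess, url, node_lookup, num_top_predictions=5):
    # 1. Download the image from the internet.
    image_data = urllib.request.urlopen(url, timeout=10).read()
    # 2. Run the CNN: forward propagation from the JPEG bytes to the softmax
    #    output layer, using the trained weights baked into the graph.
    softmax_tensor = sess.graph.get_tensor_by_name('softmax:0')
    predictions = sess.run(softmax_tensor, {'DecodeJpeg/contents:0': image_data})
    # 3. Sort the probabilities and keep the top-k output nodes.
    predictions = np.squeeze(predictions)
    top_k = predictions.argsort()[-num_top_predictions:][::-1]
    # 4. Decode the node ids into human-readable labels.
    return [(node_lookup.get(node_id), float(predictions[node_id]))
            for node_id in top_k]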
14
Prediction
predictions is the output_layer tensor of the CNN (1008 output nodes).
predictions = np.squeeze(predictions)   # remove one dimension (2 dimensions to 1 dimension)
top_k = predictions.argsort()[-5:][::-1]   # indexes of the top 5 labels, sorted by their probabilities
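A tiny worked example of the squeeze/argsort trick on a made-up 6-class output:

import numpy as np

predictions = np.array([[0.02, 0.10, 0.50, 0.05, 0.30, 0.03]])  # shape (1, 6)
predictions = np.squeeze(predictions)                           # shape (6,)
top_k = predictions.argsort()[-5:][::-1]
print(top_k)   # [2 4 1 3 5], class indexes from most to least probable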
15
Get Result
(Diagram: each mapper applies apply_batch to its batch using the broadcast model, and the resulting labels from batch1, batch2, … are gathered back to the driver with collect.)
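Pulling the labelled results back to the driver could be as simple as the following (a sketch; for very large result sets one would typically write to storage instead of using collect()):

local_labels = labeled_images.collect()
for img_id, top5 in local_labels[:3]:
    print(img_id, top5)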