DeepFont: Identify Your Font from An Image


DeepFont: Identify Your Font from An Image Zhangyang (Atlas) Wang Joint work with Jianchao Yang, Hailin Jin, Jon Brandt, Eli Shechtman, Aseem Agarwala, and Thomas Huang

DeepFont: Product & Press

Problem Definition Ever seen a font in use and wanted to identify what it is?

Problem Definition
Font recognition: automatically recognize the font style (typeface, slope, weight, etc.) from real-world photos.
Why it matters: a highly desirable feature for designers
- Design library collection
- Design inspiration
- Text editing

Challenges
- An extremely large-scale recognition problem: myfonts.com claims over 100,000 fonts in its collection.
- Beyond object recognition: recognizing subtle design styles.
- Labeled real-world training data is extremely difficult to collect, so we have to rely on synthetic training data, which creates a BIG mismatch between synthetic training and real-world testing.

Solution
Why a deep convolutional neural network?
- Effective at large-scale recognition
- Effective at fine-grained recognition
- Data-driven
Problem: the huge mismatch between synthetic training and real-world testing. Our remedies:
- Task-specific data augmentation
- Decomposition-based deep CNN for domain adaptation

The AdobeVFR Dataset
Synthetic training set:
- 2,383 fonts from the Adobe Type Library (extended to 4,052 classes later)
- 1,000 synthetic English word images per font, ~2.4M training images
Real-world testing set:
- 4,383 labeled images, covering 671 of the 2,383 fonts
- Over 100K unlabeled images
The first large-scale benchmark for this task, with both synthetic and real-world text images. Also useful for fine-grained classification, domain adaptation, and understanding design styles.

Deep Convolutional Neural Network We follow a standard benchmark CNN structure.

Domain Mismatch
Direct training on synthetic data and testing on real-world data (top-5 accuracy):

           Synthetic   Real-World
Training   99.16%      N/A
Testing    98.97%      49.24%

We need domain adaptation to close the gap between synthetic training and real-world testing!

Data Augmentation
Common degradations: noise, blur, warping, shading, compression artifacts, etc.
Special degradations:
- Aspect-ratio squeezing: squeeze the image horizontally by a random ratio in [1.5, 3.5].
- Random character spacing: render training text images with random character spacing.
Inputs to the network: random 105x105 crops.
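To make the pipeline concrete, the squeezing and cropping steps can be sketched as below. This is a minimal NumPy sketch, not the production pipeline: the 105x105 crop size and the [1.5, 3.5] squeeze ratio come from the slides, while the image sizes, noise level, and nearest-neighbour resampling are made up for illustration (the real pipeline also applies blur, warping, shading, and compression artifacts).

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img, crop=105):
    """Apply a DeepFont-style degrade-squeeze-crop sequence to a
    grayscale text image (illustrative subset of the degradations)."""
    # 1. Additive Gaussian noise (one of the common degradations).
    img = img.astype(np.float32) + rng.normal(0, 3, img.shape)
    # 2. Aspect-ratio squeezing: shrink width by a random ratio in [1.5, 3.5].
    ratio = rng.uniform(1.5, 3.5)
    h, w = img.shape
    new_w = max(crop, int(w / ratio))
    cols = np.linspace(0, w - 1, new_w).astype(int)  # nearest-neighbour resample
    img = img[:, cols]
    # 3. Random 105x105 crop fed to the network.
    y = rng.integers(0, img.shape[0] - crop + 1)
    x = rng.integers(0, img.shape[1] - crop + 1)
    return img[y:y + crop, x:x + crop]

patch = augment(np.zeros((120, 600)))
print(patch.shape)  # (105, 105)
```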

Effects of Data Augmentation
- Synthetic 1-4: common degradations
- Synthetic 5-6: special degradations
- Synthetic 1-6: all degradations
On the right: maximum mean discrepancy (MMD) between network responses on synthetic and real-world data.
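For reference, the MMD between two sets of feature responses can be estimated as below. This is an illustrative NumPy sketch on toy Gaussian features (the dimensions, sample counts, and the RBF bandwidth `gamma` are all made up), not the paper's actual measurement setup.

```python
import numpy as np

def mmd_rbf(X, Y, gamma=0.05):
    """Squared maximum mean discrepancy with an RBF kernel.
    A small MMD means the two feature distributions are close."""
    def k(A, B):
        d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
synth = rng.normal(0.0, 1, (200, 8))      # e.g. activations on synthetic data
real_near = rng.normal(0.1, 1, (200, 8))  # real data with a small domain shift
real_far = rng.normal(2.0, 1, (200, 8))   # real data with a large domain shift
print(mmd_rbf(synth, real_near) < mmd_rbf(synth, real_far))  # True
```

A smaller MMD after augmentation indicates that the augmented synthetic responses have moved closer to the real-world ones, which is exactly what the figure on this slide measures.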

Beyond Data Augmentation
Problems:
- We cannot enumerate all possible degradations, e.g., backgrounds and font decorations.
- Augmentation may introduce degradation bias into training.
Can we design the learning algorithm itself to be robust to domain mismatch?
- The mismatch already appears in the low-level features.
- Tons of unlabeled real-world data are available.

Network Decomposition for Domain Adaptation
Decompose the CNN into two sub-networks:
- An unsupervised cross-domain sub-network Cu (the first N layers)
- A supervised domain-specific sub-network Cs (the remaining 7-N layers)

Network Decomposition for Domain Adaptation
1. Train sub-network Cu in an unsupervised way, as stacked convolutional auto-encoders, on both synthetic data and unlabeled real-world data.
2. Fix sub-network Cu, and train sub-network Cs in a supervised way on the labeled synthetic data.
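The two-stage procedure can be sketched at a toy scale. In this NumPy sketch a closed-form linear auto-encoder (PCA) stands in for the stacked convolutional auto-encoders Cu, and logistic regression stands in for Cs; the data, dimensions, and learning rates are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, domain_shift):
    """Toy 2-class 'font' data: class signal in dims 0-7,
    a synthetic-vs-real domain gap in dims 8-15."""
    y = rng.integers(0, 2, n)
    X = rng.normal(0, 1, (n, 16))
    X[:, :8] += y[:, None] * 1.5
    X[:, 8:] += domain_shift
    return X, y

X_syn, y_syn = make_data(500, 0.0)    # labeled synthetic
X_real, y_real = make_data(500, 0.5)  # real-world (labels used only to evaluate)

# Stage 1: learn the frozen sub-network Cu unsupervised on BOTH domains.
# A closed-form linear auto-encoder (PCA) stands in for the stacked
# convolutional auto-encoders of the talk.
X_all = np.vstack([X_syn, X_real])
mu = X_all.mean(0)
_, _, Vt = np.linalg.svd(X_all - mu, full_matrices=False)
We = Vt[:8].T                          # encoder: 16 -> 8, shared across domains

# Stage 2: with Cu fixed, train the supervised sub-network Cs
# (here logistic regression) on labeled synthetic data only.
H_syn = (X_syn - mu) @ We
w, b = np.zeros(8), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-np.clip(H_syn @ w + b, -30, 30)))
    g = (p - y_syn) / len(y_syn)
    w -= 0.5 * H_syn.T @ g
    b -= 0.5 * g.sum()

# The cross-domain low-level representation transfers to real-world data.
p_real = 1 / (1 + np.exp(-np.clip((X_real - mu) @ We @ w + b, -30, 30)))
acc = ((p_real > 0.5) == y_real).mean()
print(f"real-world accuracy: {acc:.2f}")
```

The point of the sketch is the division of labor: the encoder sees both domains (so its representation does not overfit synthetic quirks), while only the frozen encoding of the labeled synthetic data is used to fit the classifier.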

Quantitative Evaluation
4,383 real-world test images collected from font forums.

Model     Augmentation?  Decomposition?  Top-1    Top-5
LFE       Y              N/A             42.56%   60.31%
DeepFont  N              N               42.49%   49.24%
DeepFont  Y              N               66.70%   79.28%
DeepFont  Y              Y               71.42%   81.79%

Varying the layer number K of the unsupervised sub-network Cu:

K         0        1        2        3        4        5
Training  91.54%   90.12%   88.77%   87.46%   84.79%   82.12%
Testing   79.28%   79.69%   81.79%   81.04%   77.48%   74.03%

Successful Examples

Failure Examples

Model Compression
For a typical CNN, about 90% of the storage is taken up by the densely connected (fully connected) layers.
Matrix factorization methods can compress the parameters of linear models by exploiting the nearly low-rank property of parameter matrices.
(Figure: eigenvalue plot for the fc6 layer weight matrix in DeepFont; this densely connected layer alone takes up 85% of the total model size.)

Model Compression
During training, we add a low-rank constraint (rank < k) on the fc6 layer.
In practice, we apply very aggressive compression to all fc layers and obtain a mini-model of ~40 MB in storage, a compression ratio of >18x, with a top-5 performance loss of ~3%.
Take-home points:
- FC layers can be highly redundant; compressing them aggressively MIGHT work well.
- Joint training-compression performs notably better than the two-stage approach.
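The low-rank idea can be illustrated post hoc with a truncated SVD. This NumPy sketch uses made-up layer sizes and a synthetic nearly-low-rank matrix; note that the talk's joint training-compression instead imposes the rank constraint during training, which it reports works better than factorizing an already-trained weight matrix like this.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for an fc6-style weight matrix: nearly low-rank plus noise.
# (Real DeepFont fc layers are far larger; these sizes are made up.)
W = rng.normal(0, 1, (512, 32)) @ rng.normal(0, 1, (32, 1024)) \
    + 0.01 * rng.normal(0, 1, (512, 1024))

# Truncated SVD: keep only the top-k singular directions (rank < k).
U, s, Vt = np.linalg.svd(W, full_matrices=False)
k = 64
A = U[:, :k] * s[:k]   # 512 x k factor
B = Vt[:k, :]          # k x 1024 factor
W_hat = A @ B          # low-rank approximation of W

# Storing A and B instead of W gives the compression.
ratio = W.size / (A.size + B.size)
err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
print(f"compression ratio: {ratio:.1f}x, relative error: {err:.4f}")
```

At inference time the fc layer computes `(x @ A) @ B`, i.e., two thin matrix multiplies in place of one dense one, so the factorization saves compute as well as storage.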

In Adobe Product: Recognize Fonts from Images

In Adobe Product: Photoshop Prototype

Text Editing Inside Photoshop


In Adobe Product: Discover Similarity between Fonts Font inspiration, browsing, and organization


Thank you! For more information & dataset download: atlaswang.com/deepfont.html