DeepFont: Identify Your Font from An Image


DeepFont: Identify Your Font from An Image Zhangyang (Atlas) Wang Joint work with Jianchao Yang, Hailin Jin, Jon Brandt, Eli Shechtman, Aseem Agarwala, and Thomas Huang

DeepFont: Product & Press

Problem Definition Ever seen a font in use and wanted to identify what it is?

Problem Definition
Font recognition: automatically recognize the font style (typeface, slope, weight, etc.) from real-world photos.
Why it matters: a highly desirable feature for designers
- Design library collection
- Design inspiration
- Text editing

Challenges
- An extremely large-scale recognition problem: myfonts.com claims over 100,000 fonts in its collection.
- Beyond object recognition: recognizing subtle design styles.
- Labeled real-world training data is extremely difficult to collect, so we have to rely on synthetic training data, which creates a BIG mismatch between synthetic training and real-world testing.

Solution
Why a deep convolutional neural network?
- Effective at large-scale recognition
- Effective at fine-grained recognition
- Data-driven
Problem: the huge mismatch between synthetic training and real-world testing. Our remedies:
- Task-specific data augmentation
- Decomposition-based deep CNN for domain adaptation

The AdobeVFR Dataset
Synthetic training set:
- 2,383 fonts from the Adobe Type Library (extended to 4,052 classes later)
- 1,000 synthetic English word images per font, ~2.4M training images
Real-world testing set:
- 4,383 labeled images, covering 671 of the 2,383 fonts
- Over 100K unlabeled images
The first large-scale benchmark for this task, with both synthetic and real-world text images. Also useful for fine-grained classification, domain adaptation, and understanding design styles.

Deep Convolutional Neural Network We follow a standard benchmark CNN structure.

Domain Mismatch
Direct training on synthetic data and testing on real-world data (top-5 accuracy):

           Synthetic   Real-World
Training   99.16%      N/A
Testing    98.97%      49.24%

We need domain adaptation to close the gap between synthetic training and real-world testing!

Data Augmentation
Common degradations: noise, blur, warping, shading, compression artifacts, etc.
Special degradations:
- Aspect-ratio squeezing: squeeze the image horizontally by a random ratio in [1.5, 3.5].
- Random character spacing: render training text images with random character spacing.
Inputs to the network: random 105x105 crops.
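To make the pipeline concrete, the squeezing and cropping steps can be sketched as below. This is a minimal NumPy sketch, not the production pipeline: the 105x105 crop size and the [1.5, 3.5] squeeze ratio come from the slides, while the image sizes, noise level, and nearest-neighbour resampling are made up for illustration (the real pipeline also applies blur, warping, shading, and compression artifacts).

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img, crop=105):
    """Apply a DeepFont-style degrade-squeeze-crop sequence to a
    grayscale text image (illustrative subset of the degradations)."""
    # 1. Additive Gaussian noise (one of the common degradations).
    img = img.astype(np.float32) + rng.normal(0, 3, img.shape)
    # 2. Aspect-ratio squeezing: shrink width by a random ratio in [1.5, 3.5].
    ratio = rng.uniform(1.5, 3.5)
    h, w = img.shape
    new_w = max(crop, int(w / ratio))
    cols = np.linspace(0, w - 1, new_w).astype(int)  # nearest-neighbour resample
    img = img[:, cols]
    # 3. Random 105x105 crop fed to the network.
    y = rng.integers(0, img.shape[0] - crop + 1)
    x = rng.integers(0, img.shape[1] - crop + 1)
    return img[y:y + crop, x:x + crop]

patch = augment(np.zeros((120, 600)))
print(patch.shape)  # (105, 105)
```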

Effects of Data Augmentation
- Synthetic 1-4: common degradations
- Synthetic 5-6: special degradations
- Synthetic 1-6: all degradations
On the right: maximum mean discrepancy (MMD) between network responses on synthetic and real-world data.
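For reference, the MMD between two sets of feature responses can be estimated as below. This is an illustrative NumPy sketch on toy Gaussian features (the dimensions, sample counts, and the RBF bandwidth `gamma` are all made up), not the paper's actual measurement setup.

```python
import numpy as np

def mmd_rbf(X, Y, gamma=0.05):
    """Squared maximum mean discrepancy with an RBF kernel.
    A small MMD means the two feature distributions are close."""
    def k(A, B):
        d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
synth = rng.normal(0.0, 1, (200, 8))      # e.g. activations on synthetic data
real_near = rng.normal(0.1, 1, (200, 8))  # real data with a small domain shift
real_far = rng.normal(2.0, 1, (200, 8))   # real data with a large domain shift
print(mmd_rbf(synth, real_near) < mmd_rbf(synth, real_far))  # True
```

A smaller MMD after augmentation indicates that the augmented synthetic responses have moved closer to the real-world ones, which is exactly what the figure on this slide measures.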

Beyond Data Augmentation
Problems:
- We cannot enumerate all possible degradations, e.g., backgrounds and font decorations.
- Augmentation may introduce degradation bias into training.
Can we design the learning algorithm itself to be robust to domain mismatch?
- The mismatch already appears in the low-level features.
- Tons of unlabeled real-world data are available.

Network Decomposition for Domain Adaptation
Decompose the CNN into two sub-networks:
- An unsupervised cross-domain sub-network Cu (the first N layers)
- A supervised domain-specific sub-network Cs (the remaining 7-N layers)

Network Decomposition for Domain Adaptation
1. Train sub-network Cu in an unsupervised way, as stacked convolutional auto-encoders, on both synthetic data and unlabeled real-world data.
2. Fix sub-network Cu, and train sub-network Cs in a supervised way on the labeled synthetic data.
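The two-stage procedure can be sketched at a toy scale. In this NumPy sketch a closed-form linear auto-encoder (PCA) stands in for the stacked convolutional auto-encoders Cu, and logistic regression stands in for Cs; the data, dimensions, and learning rates are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, domain_shift):
    """Toy 2-class 'font' data: class signal in dims 0-7,
    a synthetic-vs-real domain gap in dims 8-15."""
    y = rng.integers(0, 2, n)
    X = rng.normal(0, 1, (n, 16))
    X[:, :8] += y[:, None] * 1.5
    X[:, 8:] += domain_shift
    return X, y

X_syn, y_syn = make_data(500, 0.0)    # labeled synthetic
X_real, y_real = make_data(500, 0.5)  # real-world (labels used only to evaluate)

# Stage 1: learn the frozen sub-network Cu unsupervised on BOTH domains.
# A closed-form linear auto-encoder (PCA) stands in for the stacked
# convolutional auto-encoders of the talk.
X_all = np.vstack([X_syn, X_real])
mu = X_all.mean(0)
_, _, Vt = np.linalg.svd(X_all - mu, full_matrices=False)
We = Vt[:8].T                          # encoder: 16 -> 8, shared across domains

# Stage 2: with Cu fixed, train the supervised sub-network Cs
# (here logistic regression) on labeled synthetic data only.
H_syn = (X_syn - mu) @ We
w, b = np.zeros(8), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-np.clip(H_syn @ w + b, -30, 30)))
    g = (p - y_syn) / len(y_syn)
    w -= 0.5 * H_syn.T @ g
    b -= 0.5 * g.sum()

# The cross-domain low-level representation transfers to real-world data.
p_real = 1 / (1 + np.exp(-np.clip((X_real - mu) @ We @ w + b, -30, 30)))
acc = ((p_real > 0.5) == y_real).mean()
print(f"real-world accuracy: {acc:.2f}")
```

The point of the sketch is the division of labor: the encoder sees both domains (so its representation does not overfit synthetic quirks), while only the frozen encoding of the labeled synthetic data is used to fit the classifier.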

Quantitative Evaluation
4,383 real-world test images collected from font forums.

Model     Augmentation?  Decomposition?  Top-1    Top-5
LFE       Y              N/A             42.56%   60.31%
DeepFont  N              N               42.49%   49.24%
DeepFont  Y              N               66.70%   79.28%
DeepFont  Y              Y               71.42%   81.79%

Varying the layer number K of the unsupervised sub-network Cu:

K         0        1        2        3        4        5
Training  91.54%   90.12%   88.77%   87.46%   84.79%   82.12%
Testing   79.28%   79.69%   81.79%   81.04%   77.48%   74.03%

Successful Examples

Failure Examples

Model Compression
For a typical CNN, about 90% of the storage is taken up by the densely connected (fully connected) layers.
Matrix factorization methods can compress the parameters of linear models by exploiting the nearly low-rank property of parameter matrices.
(Figure: eigenvalue plot for the fc6 layer weight matrix in DeepFont; this densely connected layer alone takes up 85% of the total model size.)

Model Compression
During training, we add a low-rank constraint (rank < k) on the fc6 layer.
In practice, we apply very aggressive compression to all fc layers and obtain a mini-model of ~40 MB in storage, a compression ratio of >18x, with a top-5 performance loss of ~3%.
Take-home points:
- FC layers can be highly redundant; compressing them aggressively MIGHT work well.
- Joint training-compression performs notably better than the two-stage approach.
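The low-rank idea can be illustrated post hoc with a truncated SVD. This NumPy sketch uses made-up layer sizes and a synthetic nearly-low-rank matrix; note that the talk's joint training-compression instead imposes the rank constraint during training, which it reports works better than factorizing an already-trained weight matrix like this.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for an fc6-style weight matrix: nearly low-rank plus noise.
# (Real DeepFont fc layers are far larger; these sizes are made up.)
W = rng.normal(0, 1, (512, 32)) @ rng.normal(0, 1, (32, 1024)) \
    + 0.01 * rng.normal(0, 1, (512, 1024))

# Truncated SVD: keep only the top-k singular directions (rank < k).
U, s, Vt = np.linalg.svd(W, full_matrices=False)
k = 64
A = U[:, :k] * s[:k]   # 512 x k factor
B = Vt[:k, :]          # k x 1024 factor
W_hat = A @ B          # low-rank approximation of W

# Storing A and B instead of W gives the compression.
ratio = W.size / (A.size + B.size)
err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
print(f"compression ratio: {ratio:.1f}x, relative error: {err:.4f}")
```

At inference time the fc layer computes `(x @ A) @ B`, i.e., two thin matrix multiplies in place of one dense one, so the factorization saves compute as well as storage.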

In Adobe Product: Recognize Fonts from Images

In Adobe Product: Photoshop Prototype

Text Editing Inside Photoshop


In Adobe Product: Discover Similarity between Fonts Font inspiration, browsing, and organization


Thank you! For more information & dataset download: atlaswang.com/deepfont.html