
1 READINGS IN DEEP LEARNING 4 Sep 2013

2 ADMINISTRIVIA
New course numbers (11-785/786) are assigned
– Should be up on the hub shortly
Lab assignment 1 is up
– Due date: 2 weeks from today
Google group: is everyone on?
Website issues..
– WordPress not yet an option (CMU CS setup)
– Piazza?

3 Poll for next 2 classes
Monday, Sep 9
– The perceptron: A probabilistic model for information storage and organization in the brain (Rosenblatt)
  Not really about the logistic perceptron, more about the probabilistic interpretation of learning in connectionist networks
– Organization of behavior (Donald Hebb)
  About the Hebbian learning rule

4 Poll for next 2 classes
Wed, Sep 11
– Optimal unsupervised learning in a single-layer linear feedforward neural network (Terence Sanger)
  Generalized Hebbian learning rule
– The Widrow-Hoff learning rule (Widrow and Hoff)
  Will be presented by Pallavi Baljekar

5 Notices
The success of the course depends on good presentations
Please send in your slides 1-2 days before your presentation
– So that we can ensure they are OK
You are encouraged to discuss your papers with us/your classmates while preparing for them
– Use the Google group for discussion

6 A new project
Distributed large-scale training of NNs..
Looking for volunteers

7 The Problem: Distributed data
Training enormous networks
– Billions of units
From large amounts of data
– Billions or trillions of instances
– Data may be localized..
– Or distributed

8 The problem: Distributed computing
A single computer will not suffice
– Need many processors
– Tens or hundreds or thousands of computers
  Of possibly varying types and capacity

9 Challenge
Getting the data to the computers
– Tons of data to many computers
  Bandwidth problems
  Timing issues
– Synchronizing the learning

10 Logistic Challenges
How to transfer vast amounts of data to processors
Which processor gets how much data.. and which data
– Not all processors equally fast
– Not all data take equal amounts of time to process
– Data locality

11 Learning Challenges
How to transfer parameters to processors
– Networks are large, billions or trillions of parameters
– Each processor must have the latest copy of parameters
How to receive updates from processors
– Each processor learns on local data
– Updates from all processors must be pooled
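
A minimal sketch of the pooling step, assuming synchronous averaging of one gradient per worker over a flat parameter vector; the function name pool_updates and the fixed learning rate are illustrative choices, not part of the project.

    import numpy as np

    def pool_updates(params, worker_grads, lr=0.01):
        """Average the gradients reported by all workers and take one step.
        params       : current global parameter vector
        worker_grads : list of gradient vectors, one per worker
        """
        avg_grad = np.mean(worker_grads, axis=0)   # pool updates from all processors
        return params - lr * avg_grad              # new copy to broadcast back out

    # Toy usage: three workers report gradients computed on their local data
    params = np.zeros(5)
    grads = [np.random.randn(5) for _ in range(3)]
    params = pool_updates(params, grads)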

12 Learning Challenges
Synchronizing processor updates
– Some processors slower than others
– Inefficient to wait for slower ones
  In order to update parameters at all processors
Requires asynchronous updates
– Each processor updates when done
– Problem: different processors now have different sets of parameters
  Other processors may have updated parameters already
Requires algorithmic changes
– How to update asynchronously
– Which updates to trust
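
One common way to handle asynchronous updates, sketched under the assumption of a central parameter copy with a version counter; damping an update by its staleness is just one heuristic, not necessarily the algorithmic change this project would adopt.

    import numpy as np

    class AsyncParams:
        """Apply each worker's gradient as it arrives; trust stale ones less."""

        def __init__(self, dim, lr=0.01):
            self.params = np.zeros(dim)
            self.version = 0                   # bumped on every update
            self.lr = lr

        def pull(self):
            # A worker fetches the current parameters and their version
            return self.params.copy(), self.version

        def push(self, grad, pulled_version):
            # Staleness = how many updates landed since this worker pulled
            staleness = self.version - pulled_version
            self.params -= self.lr * grad / (1.0 + staleness)
            self.version += 1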

13 Current Solutions
Faster processors: GPUs
– GPU programming required
Large simple clusters
– Simple distributed programming
Large heterogeneous clusters
– Techniques for asynchronous learning

14 Current Solutions
Still assume data distribution not a major problem
Assume relatively fast connectivity
– Gigabit Ethernet
Fundamentally cluster-computing based
– Local area network

15 New project
Distributed learning
Wide area network
– Computers distributed across the world

16 New project
Supervisor/Worker architecture
One or more supervisors
– May be a hierarchy
A large number of workers
Supervisors in charge of resource and task allocation, gathering and redistributing updates, synchronization
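
A toy sketch of the supervisor/worker split described above, with a single supervisor handing out data shards and pooling the returned updates; the class names, queue-based messaging, and the stand-in "gradient" are all assumptions made for illustration.

    import queue
    import numpy as np

    class Supervisor:
        """Allocates data shards to workers, gathers their updates,
        pools them, and redistributes the new parameters."""

        def __init__(self, dim, shards):
            self.params = np.zeros(dim)
            self.tasks = queue.Queue()
            self.updates = queue.Queue()
            for shard in shards:
                self.tasks.put(shard)          # task allocation

        def gather_and_redistribute(self, n_expected, lr=0.01):
            grads = [self.updates.get() for _ in range(n_expected)]
            self.params -= lr * np.mean(grads, axis=0)
            return self.params                 # sent back to the workers

    def worker(sup):
        """Fetches shards, computes a local update, and reports it."""
        while not sup.tasks.empty():
            shard = sup.tasks.get()
            sup.updates.put(shard.mean(axis=0))    # stand-in for a real gradient

    sup = Supervisor(dim=4, shards=[np.random.randn(10, 4) for _ in range(3)])
    worker(sup)                                # in practice many workers run in parallel
    new_params = sup.gather_and_redistribute(n_expected=3)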

17 New project Challenges
Data allocation
– Optimal policy for data distribution
  Minimal latency
  Maximum locality
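
One possible (and deliberately simple) allocation policy, sketched under the assumption that shard-to-node transfer latency is known up front; a greedy nearest-node rule like this is only a starting point for the "minimal latency, maximum locality" goal, not the project's actual policy.

    import numpy as np

    def allocate_shards(latency, capacity):
        """Greedily place each data shard on the lowest-latency node that still has room.
        latency  : (n_shards, n_nodes) transfer-cost matrix
        capacity : per-node limit on the number of shards
        """
        n_shards, _ = latency.shape
        load = np.zeros(len(capacity), dtype=int)
        assignment = np.full(n_shards, -1, dtype=int)
        for s in range(n_shards):
            for node in np.argsort(latency[s]):    # closest node first
                if load[node] < capacity[node]:
                    assignment[s] = node
                    load[node] += 1
                    break
        return assignment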

18 New project Challenges
Computation allocation
– Optimal policy for learning
  Compute load proportional to compute capacity
  Reallocation of data/task as required
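
A minimal sketch of "compute load proportional to compute capacity", assuming capacity can be summarized as a single relative speed per worker; the function name and the remainder-handling rule are illustrative.

    def proportional_split(n_items, capacities):
        """Split n_items across workers in proportion to their relative speeds."""
        total = sum(capacities)
        shares = [int(n_items * c / total) for c in capacities]
        # Hand the remainder left by truncation to the fastest workers
        leftover = n_items - sum(shares)
        for i in sorted(range(len(shares)), key=lambda i: -capacities[i])[:leftover]:
            shares[i] += 1
        return shares

    print(proportional_split(1000, [1.0, 2.0, 0.5]))   # [286, 572, 142]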

19 New project Challenges
Parameter allocation
– Do we have to distribute all parameters?
– Can learning be local?

20 New project Challenges
Trustable updates
– Different processors/LANs have different speeds
– How do we trust their updates?
  Do we incorporate or reject?
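
One simple accept/reject rule, given as an assumption rather than a proposal: drop updates that are too stale and down-weight the rest.

    def weigh_update(staleness, max_staleness=10):
        """Decide whether to incorporate an update and how much to trust it.
        staleness : global updates applied since the worker pulled parameters
        Returns (accept, weight).
        """
        if staleness > max_staleness:
            return False, 0.0                  # reject: too far behind
        return True, 1.0 / (1.0 + staleness)   # incorporate with reduced trust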

21 New project
Optimal resynchronization: how much do we transmit?
– Should not have to retransmit everything
– Entropy coding?
– Bit-level optimization?
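
A sketch of avoiding full retransmission, under the assumption that only parameters which moved by more than a tolerance need to be sent; the resulting sparse (index, value) message is the kind of payload that entropy coding or bit-level tricks would then compress further.

    import numpy as np

    def encode_delta(old_params, new_params, tol=1e-3):
        """Return only the entries that changed by more than tol."""
        delta = new_params - old_params
        idx = np.nonzero(np.abs(delta) > tol)[0]
        return idx, delta[idx]

    def apply_delta(params, idx, values):
        out = params.copy()
        out[idx] += values
        return out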

22 Possibilities
Massively parallel learning
Never-ending learning
Multimodal learning
GAIA..

23 Asking for Volunteers
Will be an open-source project
Write to Anders

24 Today
Bain’s theory: Lars Mahler
– Linguist, mathematician, philosopher
– One of the earliest people to propose a connectionist architecture
– Anticipated many modern ideas
McCulloch and Pitts: Kartik Goyal
– Early model of the neuron: threshold gates
– Earliest model to consider excitation and inhibition
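
For reference, a tiny sketch of a McCulloch-Pitts threshold gate as usually described: any active inhibitory input vetoes firing, otherwise the unit fires when enough excitatory inputs are active; the function name and example thresholds are illustrative only.

    def mcculloch_pitts(excitatory, inhibitory, threshold):
        """Binary threshold gate with absolute inhibition."""
        if any(inhibitory):                    # a single inhibitory input blocks firing
            return 0
        return 1 if sum(excitatory) >= threshold else 0

    print(mcculloch_pitts([1, 1], [], 2))   # AND of two inputs -> 1
    print(mcculloch_pitts([1, 0], [1], 1))  # inhibited -> 0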

