Integrated Logistics PROBE Princeton University, 10/31-11/1.


2 Presentation Outline  Defining Logistics  Applications and Key Problems  Facility Location  Known Results  Open Problems  Hierarchical Network Design  Known Results  Open Problems

3 Defining Logistics  Given service demands, must satisfy  “transporting products” from A to B  Goal is to minimize service cost  Aggregation problems

4 Facility Location Problems  Open facilities  Each demand must be near some facility  Minimize sum or max of distances  Some restriction on which facilities to open  NP-hard (approximation lower bound: 1.46)
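
To make the objective on this slide concrete, here is a minimal Python sketch of the "sum of distances" variant with facility opening costs. The names `fl_cost` and `brute_force_fl`, the opening costs, and the distance matrix are all hypothetical and chosen for illustration; the brute-force search is only practical for tiny instances and is not one of the methods discussed at the workshop.

```python
from itertools import combinations

def fl_cost(open_facilities, opening_cost, dist):
    """Opening costs plus each demand's distance to its nearest open facility."""
    if not open_facilities:
        return float("inf")  # every demand needs some open facility
    opening = sum(opening_cost[f] for f in open_facilities)
    service = sum(min(row[f] for f in open_facilities) for row in dist)
    return opening + service

def brute_force_fl(opening_cost, dist):
    """Try every non-empty subset of facilities (exponential time)."""
    n = len(opening_cost)
    best_cost, best_set = float("inf"), ()
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            cost = fl_cost(subset, opening_cost, dist)
            if cost < best_cost:
                best_cost, best_set = cost, subset
    return best_cost, best_set

# Hypothetical instance: 3 candidate facilities, 4 demands.
# dist[j][i] is the distance from demand j to facility i.
opening_cost = [3, 3, 10]
dist = [
    [1, 5, 2],
    [4, 1, 2],
    [5, 1, 2],
    [1, 6, 2],
]
best_cost, best_set = brute_force_fl(opening_cost, dist)
print(best_cost, best_set)  # 10 (0, 1)
```

Opening facilities 0 and 1 costs 6 and serves every demand at distance 1, for a total of 10; the central but expensive facility 2 is never worth opening here.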

5 Hierarchical Aggregation  More than one level of “cluster”  Basically building a tree or forest  Solve FL over and over… but don’t want to pay much!

6 App: Trucking Service

7  Talk by Ted Gifford  Schneider Logistics  Multi-billion-dollar industry  Solve FL problems  Difficult to determine costs, constraints  Often solve problems exactly (IP)  Usually ~500-1000 nodes

8 Open Problems: Trucking  Often multi-commodity FL  Hierarchical, but typically only 3-4 levels  Need extremely accurate solutions  “average case” bounds?

9 App: Databases

10  Talk by Sudipto Guha  U. Penn, AT&T Research  Distributed databases  Determining data placement on network  Database clustering  Many models, measures  Many different heuristics!

11 Open Problems: Databases  Databases can be VERY large  “polynomial-time” not good enough  Streaming/sampling based approaches  Data may change with time  Need fast “update” algorithm  No clear measure of quality  “quick and dirty” may be best

12 App: Genetics

13  Talk by Kamesh Munagala  Stanford University, Strand Genomics  Finding patterns in DNA/proteins  Known DNA code, but proteins mysterious  Can scan protein content of cells fast  Scan is not very accurate though  Find patterns in healthy vs. tumor cells

14 Open Problems: Genetics  Huge amounts of data!  Also, not very accurate, many “mistakes”  Try to find separating dimension  Potentially many clusterings, find “best”  Really two-step problem  Find best “dimension” of exp. combinations  Cluster it, see if it separates

15 Results: Facility Location  Talk by David Shmoys  Cornell University  Three main paradigms  Linear Program Rounding  Primal-Dual Method  Local Search
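
For reference, the linear program that the rounding and primal-dual paradigms typically start from can be written as follows. This is the standard textbook relaxation of uncapacitated facility location, not something shown on the slide; the symbols $F$ (facilities), $D$ (demands), $f_i$ (opening cost), $c_{ij}$ (connection cost), $y_i$ and $x_{ij}$ are assumed notation.

```latex
\begin{align*}
\text{minimize}\quad   & \sum_{i \in F} f_i\, y_i \;+\; \sum_{i \in F} \sum_{j \in D} c_{ij}\, x_{ij} \\
\text{subject to}\quad & \sum_{i \in F} x_{ij} \ge 1          && \text{for all } j \in D, \\
                       & x_{ij} \le y_i                       && \text{for all } i \in F,\ j \in D, \\
                       & x_{ij} \ge 0,\quad y_i \ge 0         && \text{for all } i \in F,\ j \in D.
\end{align*}
```

Restricting $y_i$ and $x_{ij}$ to $\{0,1\}$ gives the integer program; the ratio between the integer and fractional optima is the "LP gap" raised in the later slides.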

16 Results: Facility Location  Talk by Kamal Jain  Microsoft Research  Talk by Mohammad Mahdian  MIT  Best approximation: 1.52  Primal-dual based “greedy” algorithm  Solve LP to find “worst-case” approx

17 Results: Facility Location  Talk by Martin Pal  Cornell University  Problem of FL with hard capacities  O(1) via local search  Open: O(1) via primal-dual or LP?  What is LP gap?  Often good to have “lower bound”

18 Results: Facility Location  Talk by Ramgopal Mettu  Dartmouth College  FAST approximations for k-median  O(nk) constant approx  Repeated sampling approach  Compared to DB clustering heuristics  Slightly slower, much more accurate

19 Open Problems: FL  Eliminate the gap!  1.52 vs. 1.46, VERY close  Analysis of Mahdian is tight  Maybe time to revisit lower bound?  k-median problem  Local search gives 3, improve?  Load-balanced problem  Exact on the lower bounds?
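
As an illustration of the local search idea mentioned for k-median, here is a minimal single-swap sketch in Python. It is a simplified variant written under assumptions: every point is both a demand and a candidate median, the start solution is arbitrary, and any improving swap is accepted. The factor 3 quoted on the slide is not claimed for this simplified version, and the names `kmedian_cost` and `local_search_kmedian` are hypothetical.

```python
def kmedian_cost(medians, dist):
    """Sum over all points of the distance to the nearest chosen median."""
    return sum(min(row[m] for m in medians) for row in dist)

def local_search_kmedian(dist, k):
    """Swap one median in and one out while the cost strictly decreases."""
    n = len(dist)
    medians = set(range(k))  # arbitrary starting solution (assumption)
    cost = kmedian_cost(medians, dist)
    improved = True
    while improved:
        improved = False
        for out in sorted(medians):
            for inn in range(n):
                if inn in medians:
                    continue
                candidate = (medians - {out}) | {inn}
                c = kmedian_cost(candidate, dist)
                if c < cost:  # accept the first improving swap
                    medians, cost = candidate, c
                    improved = True
                    break
            if improved:
                break
    return sorted(medians), cost

# Hypothetical instance: six points on a line forming two clusters.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in points] for p in points]
medians, cost = local_search_kmedian(dist, k=2)
print(medians, cost)  # [1, 4] 4
```

The loop terminates because the cost strictly decreases with every accepted swap and there are finitely many solutions. On this instance it ends at the two cluster centers, which is also the optimum.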

20 Results: Network Design  Talk by Adam Meyerson  CMU  O(log n) for single-sink  O(log n log log n) for one function  O(1) for one sink, one function

21 Results: Network Design  Talk by Kunal Talwar  UC Berkeley  Improved O(1) for one sink, one function  LP rounding

22 Results: Network Design  Connected Facility Location  Talks by Anupam Gupta  Lucent Research, CMU  Chaitanya Swamy  Cornell University  Give 9-approx for the problem  Greedy, primal-dual approaches

23 Results: Network Design  Talk by Amitabh Sinha  CMU  Combining buy-at-bulk with FL  O(log n) immediate, but what about O(1)?  O(1) for one cable type, small constant  O(1) in general  What about capacitated? k-median?
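
The economies of scale behind buy-at-bulk can be seen in a small Python sketch of the cost of a single edge. The cable types (capacity, price per unit length) and the function name `cheapest_cable_cost` are hypothetical, and for simplicity the sketch buys copies of only one cable type per edge, whereas the general problem may allow mixing types.

```python
import math

def cheapest_cable_cost(demand, length, cable_types):
    """Cheapest way to carry `demand` over an edge using copies of one cable type.

    cable_types is a list of (capacity, price_per_unit_length) pairs.
    """
    if demand <= 0:
        return 0
    return min(
        math.ceil(demand / capacity) * price * length
        for capacity, price in cable_types
    )

# Hypothetical cable types: larger cables cost less per unit of capacity.
cables = [(1, 1), (10, 4), (100, 15)]

separate = 2 * cheapest_cable_cost(4, 2, cables)  # two demands of 4, routed apart
together = cheapest_cable_cost(8, 2, cables)      # the same demands aggregated
print(separate, together)  # 16 8
```

Aggregating the two demands onto one edge halves the cost here, which is why these problems reward building a tree that merges traffic early.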

24 Open Problems: ND  Multi-commodity, multiple function  No nontrivial approximations known!  O(1) for single sink?  LP gap not even known!  O(1) for single function?  Cannot depend on tree embedding  Make the constants reasonable!  Euclidean problem: easier?

25 Conclusions  Many applications and open problems!  Must get in touch with DB community…  Workshop was a success, but…  Need more OR participation  Too short notice for faculty?  Plan another workshop, late March  Hope to have some more solutions!

26 Thanks to Princeton  Local arrangements by Moses Charikar and Mitra Kelly

