
1 Optimal bounds on approximation of submodular and XOS functions by juntas
Vitaly Feldman and Jan Vondrák, IBM Research - Almaden

2 Outline
Prelims: submodular and XOS functions; learning
Approximation by juntas
Applications to learning

3 Submodularity: "diminishing marginal returns"
[n] = {1,2,…,n}; f: 2^[n] → ℝ, or equivalently f: {0,1}^n → ℝ ({0,1}^n ⇔ 2^[n] by associating S with its indicator 1_S).
Discrete partial derivative: ∂_i f(S) ≝ f(S ∪ {i}) − f(S).
Definition: f is submodular if for every i, ∂_i f is monotone decreasing: S ⊆ T ⇒ ∂_i f(T) ≤ ∂_i f(S).
Equivalently, for all i ≠ j, ∂_j ∂_i f(S) = ∂_{i,j} f(S) ≤ 0: a discrete analogue of concavity.
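To make the definition concrete, here is a minimal Python sketch (not from the slides) that brute-forces the second-derivative criterion ∂_{i,j} f ≤ 0 on a small ground set; f is any function on 0/1 tuples:

```python
import itertools

def partial(f, i, x):
    """Discrete partial derivative of f at x in direction i."""
    x1 = list(x); x1[i] = 1
    x0 = list(x); x0[i] = 0
    return f(tuple(x1)) - f(tuple(x0))

def is_submodular(f, n):
    """Brute-force check over all 2^n points (illustration only):
    f is submodular iff partial_{i,j} f(x) <= 0 for all x and i != j."""
    for x in itertools.product([0, 1], repeat=n):
        for i, j in itertools.combinations(range(n), 2):
            x1 = list(x); x1[i] = 1
            x0 = list(x); x0[i] = 0
            if partial(f, j, tuple(x1)) - partial(f, j, tuple(x0)) > 1e-12:
                return False
    return True
```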

4 Submodularity: examples
Linear functions: f(S) = Σ_{i∈S} a_i, i.e., f(x) = Σ_{i∈[n]} a_i x_i
Coverage functions: for A_1, A_2, …, A_n ⊆ U, f(S) = |⋃_{i∈S} A_i|
Graph cut functions
Mutual information of sets of random variables
Matroid rank functions
[Figure: marginal values f(S∪{i})−f(S) vs. f(T∪{i})−f(T) for S ⊆ T, illustrating diminishing returns]
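For instance, a coverage function over a toy universe (hypothetical sets, chosen only for illustration) can be fed to the checker sketched above:

```python
# Hypothetical sets A_1..A_4 over the universe {0,...,5}.
A = [{0, 1}, {1, 2, 3}, {3, 4}, {4, 5}]

def coverage(x):
    """f(S) = |union of A_i over i in S|, with S encoded as a 0/1 tuple x."""
    covered = set()
    for i, bit in enumerate(x):
        if bit:
            covered |= A[i]
    return len(covered)

# is_submodular(coverage, 4) -> True
```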

5 Submodularity: applications
Operations research
TCS: approximation algorithms
Machine learning
Algorithmic game theory / economics: representing valuation functions

6 Fractionally-subadditive (XOS)
Def: f is subadditive if f(S ∪ T) ≤ f(S) + f(T) for all S, T.
Def: f is fractionally-subadditive if 1_T ≤ Σ_i α_i·1_{S_i} implies f(T) ≤ Σ_i α_i f(S_i), where 1_T ≤ Σ_i α_i·1_{S_i} means that Σ_{i: ℓ∈S_i} α_i ≥ 1 for every ℓ ∈ T (with α_i ≥ 0).
Def: f is XOS if f(x) = max_i Σ_j a_{i,j} x_j, where all a_{i,j} ≥ 0.
Fact: for f such that f(∅) = 0, monotone submodular ⊆ fractionally-subadditive = XOS.
Example: 1_{111} ≤ ½(1_{110} + 1_{101} + 1_{011}), hence f(111) ≤ ½(f(110) + f(101) + f(011)).
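A hedged sketch of evaluating an XOS function as a max of nonnegative linear functions; the coefficient matrix is made up for illustration, and the assertion checks the slide's fractional-subadditivity example:

```python
# Hypothetical nonnegative coefficients a[i][j]; f(x) = max_i sum_j a[i][j]*x[j].
a = [
    [0.5, 0.0, 0.3],
    [0.1, 0.6, 0.0],
    [0.2, 0.2, 0.2],
]

def xos(x):
    return max(sum(row[j] * x[j] for j in range(len(x))) for row in a)

# The slide's example: 1_{111} <= (1/2)(1_{110} + 1_{101} + 1_{011}), hence
assert xos((1, 1, 1)) <= 0.5 * (xos((1, 1, 0)) + xos((1, 0, 1)) + xos((0, 1, 1)))
```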

7 Learning model: PAC [Valiant'84]
Learner has access to random examples (x, f(x)), where x is drawn from a distribution D over {0,1}^n and f ∈ C (e.g., C = submodular or XOS functions).
For every ε > 0, f ∈ C and D, given access to random examples, output a hypothesis h such that ‖f − h‖₂ = (E_{x∼D}[(f(x) − h(x))²])^{1/2} ≤ ε.
Multiplicative version (PMAC) [Balcan, Harvey'11]: Pr_{x∼D}[h(x) ≤ f(x) ≤ α·h(x)] ≥ 1 − δ.
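The ℓ2 error in the PAC guarantee is itself easy to estimate from random examples; a small sketch, where `sample` is any hypothetical draw from D:

```python
import random

def empirical_l2_error(f, h, sample, m=10000):
    """Estimate ||f - h||_2 = sqrt(E_{x~D}[(f(x) - h(x))^2]) from m examples."""
    s = sum((f(x) - h(x)) ** 2 for x in (sample() for _ in range(m)))
    return (s / m) ** 0.5

# Uniform distribution over {0,1}^n, for example:
# sample = lambda: tuple(random.randint(0, 1) for _ in range(4))
```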

8 Prior work
Unrestricted D, PMAC:
Submodular: √n-factor approx. in time poly(n, 1/δ); no poly-time algo with Ω̃(n^{1/3})-factor [Balcan,Harvey'11]
XOS: √n-factor approx. in time poly(n, 1/δ) [Balcan,Constantin,Iwata,Wang'12]; no poly-time algo with Ω̃(√n)-factor [Badanidiyuru,Dobzinski,Fu,Kleinberg,Nisan,Roughgarden'12]
Uniform U, range [0,1] (value queries: for any x, get f(x)):
Submodular: concentration gives learning [Balcan,Harvey'11]; ℓ2-error ε in time n^{O(1/ε²)} with value queries [Gupta,Hardt,Roth,Ullman'11]; ℓ2-error ε in time n^{O(1/ε²)} [Cheraghchi,Klivans,Kothari,Lee'12]; ℓ2-error ε in time n²·2^{O(1/ε⁴)}, and ℓ2-error ε needs 2^{Ω(ε^{−2/3})} examples [F.,Kothari,Vondrák'13]
{0,1,…,k}-valued: k^{O(k log(k/ε))} time with value queries [Raskhodnikova,Yaroslavtsev'13]
Coverage: ℓ2-error ε in time n^{O(log(1/ε))}; (1+γ)-factor w.p. 1−δ in time poly(n, 1/γ, 1/δ) [F.,Kothari'13]

9 Approximation by juntas
ℓ-junta: a function that depends on at most ℓ of its variables.
{0,1,…,k}-valued: (k log(1/ε))^{O(k)}-junta [Blais,Onak,Servedio,Yaroslavtsev'13]; poly(2^k/ε)-junta [FKV'13]
Coverage: O(1/ε²)-junta [F.,Kothari'13]
Theorem [F.,Kothari,Vondrák'13]: for every submodular function f: {0,1}^n → [0,1] and ε > 0 there exists a 2^{O(1/ε²)}-junta h such that ‖f − h‖₂ ≤ ε.

10 Questions
Can broader classes of functions (XOS) be approximated by juntas?
Is 2^{O(1/ε²)} optimal for submodular functions?
Can techniques for Boolean functions be used?

11 Answers
XOS functions can be approximated by a 2^{O(1/ε²)}-junta; XOS functions require a 2^{Ω(1/ε)}-junta.
Submodular functions can be approximated by an O((1/ε²)·log(1/ε))-junta (even linear functions require an Ω(1/ε²)-junta).
Junta sizes by class:
Monotone linear: Ω(1/ε²)
Coverage: O(1/ε²) [FK'13]
Monotone submodular: O((1/ε²)·log(1/ε))
XOS: 2^{Ω(1/ε)}
Constant total ℓ1-influence: 2^{O(1/ε²)}

12 New learning results
Unrestricted D, PMAC:
Submodular: √n-factor approx. in time poly(n, 1/δ); no poly-time algo with Ω̃(n^{1/3})-factor [Balcan,Harvey'11]
XOS: √n-factor approx. in time poly(n, 1/δ) [Balcan,Constantin,Iwata,Wang'12]; no poly-time algo with Ω̃(n^{1/2})-factor [Badanidiyuru,Dobzinski,Fu,Kleinberg,Nisan,Roughgarden'12]
Uniform/product U, range [0,1]:
Submodular (prior): ℓ2-error ε in time n^{O(1/ε²)} with value queries [Gupta,Hardt,Roth,Ullman'11]; ℓ2-error ε in time n^{O(1/ε²)} [Cheraghchi,Klivans,Kothari,Lee'12]; ℓ2-error ε in time n²·2^{O(1/ε⁴)}, needs 2^{Ω(ε^{−2/3})} examples [F.,Kothari,Vondrák'13]
XOS (new): ℓ2-error ε in time n·2^{O(1/ε⁴)}
Submodular (new): ℓ2-error ε in time n²·2^{Õ(1/ε²)}; (1+γ)-factor w.p. 1−δ in time n²·2^{Õ(1/(γ²δ²))}

13 Approximation by juntas for Boolean functions
Well-studied: [Nisan,Szegedy'92] [Friedgut'98] [Bourgain'01] [Friedgut,Kalai,Naor'01] [Kindler,Safra'04] [Alon,Dinur,Friedgut,Sudakov'04] [Dinur,Friedgut,Kindler,O'Donnell'06] [O'Donnell,Servedio'07] [Diakonikolas,Servedio'09]
Friedgut's theorem: for every f: {0,1}^n → {0,1} and ε > 0, there exists a 2^{O(Infl(f)/ε)}-junta that is ε-close to f, where Infl(f) = Σ_{i=1}^n Infl_i(f) = Σ_{i=1}^n Pr[f(x) ≠ f(x ⊕ e_i)].

14 Approximation by juntas
For real-valued functions: Infl²_i(f) = E[(f(x) − f(x ⊕ e_i))²] and Infl²(f) = Σ_i Infl²_i(f).
Note: for {0,1}-valued f, Pr[f(x) ≠ f(x ⊕ e_i)] = E[(f(x) − f(x ⊕ e_i))²]; in Fourier terms, Infl_i(f) = Σ_{A∋i} f̂(A)² and Infl²_i(f) = Σ_{A∋i} f̂(A)².
Friedgut's theorem does not hold for Infl²(f): there exists a function with Infl²(f) = 1 that requires Ω(n) variables to approximate within a constant [O'Donnell,Servedio'06].
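Both influence notions can be estimated by sampling under the uniform distribution; a minimal sketch, assuming oracle access to f:

```python
import random

def influences(f, n, m=20000):
    """Monte Carlo estimates of Infl^1_i(f) = E|f(x) - f(x xor e_i)| and
    Infl^2_i(f) = E[(f(x) - f(x xor e_i))^2], x uniform over {0,1}^n."""
    inf1, inf2 = [0.0] * n, [0.0] * n
    for _ in range(m):
        x = [random.randint(0, 1) for _ in range(n)]
        fx = f(tuple(x))
        for i in range(n):
            x[i] ^= 1                      # flip coordinate i
            d = fx - f(tuple(x))
            x[i] ^= 1                      # restore
            inf1[i] += abs(d) / m
            inf2[i] += d * d / m
    return inf1, inf2
```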

15 Our approach
Analogue of Friedgut's theorem for real-valued functions: for every f: {0,1}^n → ℝ and ε > 0, there exists a 2^{O(Infl²(f)/ε²)}·(Infl¹(f))^{O(1)}-junta h s.t. ‖f − h‖₂ ≤ ε, where Infl¹(f) = Σ_{i=1}^n Infl¹_i(f) = Σ_{i=1}^n E[|f(x) − f(x ⊕ e_i)|]. Moreover, h is a polynomial of degree O(Infl²(f)/ε²).
Influence bounds: for every XOS f: {0,1}^n → [0,1], Infl¹(f) ≤ 2; for every submodular f: {0,1}^n → [0,1], Infl¹(f) ≤ 4.
Note: range [0,1] ⇒ Infl²(f) ≤ Infl¹(f).

16 Bounding influence
Claim: for every XOS f: {0,1}^n → [0,1], Infl¹(f) ≤ 2.
Proof: by monotonicity of XOS functions, Infl¹(f) = 2·E_S[Σ_{i∈S} (f(S) − f(S\{i}))] = 2·E_S[|S|·f(S) − Σ_{i∈S} f(S\{i})].
Fractional subadditivity: 1_S = Σ_{i∈S} (1/(|S|−1))·1_{S\{i}} ⇒ f(S) ≤ Σ_{i∈S} (1/(|S|−1))·f(S\{i}).
Therefore |S|·f(S) − Σ_{i∈S} f(S\{i}) ≤ f(S), and Infl¹(f) ≤ 2·E[f(S)] ≤ 2.
The argument also applies to submodular functions and, more generally, to all self-bounding functions [Boucheron,Lugosi,Massart'00].

17 Juntas for submodular functions
How to select important variables? Select variables with a high enough maximum marginal value [Gupta,Hardt,Roth,Ullman'11]:
Let j be such that max_x |∂_j f(x)| ≥ β; continue recursively for x_j = 0 and x_j = 1.
No such variable means f is β-Lipschitz ⇒ √β-close to a constant; use β = ε².
2^{1/β}/poly(ε) variables suffice [F.,Kothari,Vondrák'13], which gives a 2^{O(1/ε²)}-junta.

18 The solution (monotone case)
For a set J and δ ∈ [0,1], let J(δ) = a random subset of J with each element included independently w.p. δ.
Start with J = ∅. Whenever there is an element i ∉ J s.t. Pr[∂_i f(J(δ)) ≥ β] > 1/2, add i to J; here β = ε²/4.
How many variables are selected? E[f(J(δ))] ≥ β·E[|J(δ)|]/2 = βδ|J|/2, and f ≤ 1 forces |J| ≤ 2/(βδ).
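A sketch of this selection rule, assuming value-oracle access to a monotone f; the probability is estimated by sampling, and `trials` and the loop structure are illustrative choices, not the paper's exact procedure:

```python
import random

def select_junta(f, n, beta, delta, trials=200):
    """Greedily add any i not in J whose marginal on a random subset J(delta)
    is >= beta with empirical probability > 1/2."""
    J = []
    changed = True
    while changed:
        changed = False
        for i in range(n):
            if i in J:
                continue
            hits = 0
            for _ in range(trials):
                x = [0] * n
                for j in J:
                    if random.random() < delta:        # sample J(delta)
                        x[j] = 1
                x1 = list(x); x1[i] = 1
                if f(tuple(x1)) - f(tuple(x)) >= beta:  # marginal of i
                    hits += 1
            if hits > trials / 2:
                J.append(i); changed = True
    return J
```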

19 Boosting lemma [Goemans,Vondrák'04]
J(1/2) has the same distribution as J(δ) ∪ J(δ) ∪ ⋯ ∪ J(δ), a union of t = 1/log(1/(1−δ)) ≈ 1/δ independent copies.
By submodularity, ∂_i f(S_1 ∪ S_2 ∪ ⋯ ∪ S_t) ≤ min_{k≤t} ∂_i f(S_k).
For all i ∉ J, Pr[∂_i f(J(δ)) ≥ β] ≤ 1/2 ⇒ Pr[∂_i f(J(1/2)) ≥ β] = Pr[∂_i f(J(δ) ∪ ⋯ ∪ J(δ)) ≥ β] ≤ 2^{−1/δ}.
Let δ = 1/log(2n/ε); then Pr[∃ i ∉ J: ∂_i f(J(1/2)) ≥ β] ≤ ε/2.
So for all but an ε/2-fraction of settings of the variables in J, f restricted to the setting is β-Lipschitz; β = ε²/4 ⇒ ε/2-close to a constant.

20 Junta approximation for submodular functions
|J| ≤ 2/(βδ) ⇒ |J| = O((1/ε²)·log(n/ε)).
The best approximating function on J is the averaging projection: for S ⊆ J, f_J(S) = E_{T⊆[n]\J}[f(S ∪ T)].
f_J is submodular ⇒ the approximation can be repeated: e.g., |J_1| = O((1/ε²)·log(|J|/ε)) = O((1/ε²)·log(log(n)/ε)), and so on.
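A sampling sketch of the averaging projection (exact computation would sum over all 2^{n−|J|} complements; sampling T suffices for an estimate):

```python
import random

def avg_projection(f, J, n, S_bits, m=5000):
    """Estimate f_J(S) = E_{T subset of [n] minus J}[f(S u T)] by sampling T.
    S_bits is a 0/1 tuple whose coordinates in J define S; the remaining
    coordinates are overwritten by the random T."""
    outside = [i for i in range(n) if i not in J]
    total = 0.0
    for _ in range(m):
        x = list(S_bits)
        for i in outside:
            x[i] = random.randint(0, 1)   # uniform T outside J
        total += f(tuple(x))
    return total / m
```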

21 Applications to learning
XOS function f: {0,1}^n → [0,1]: choose J = {i : Infl¹_i(f) ≥ β}, where β = 2^{−O(1/ε²)}. There exists a polynomial h over J of degree d = O(1/ε²) such that ‖f − h‖₂ ≤ ε.
For monotone f, Infl¹_i(f) = E[f(x_{i←1}) − f(x_{i←0})] = 2·f̂({i}), which can be estimated using random examples.
Given J, use least-squares regression over all monomials of degree ≤ d on J: |J|^d = 2^{O(1/ε⁴)} monomials in total.
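A minimal sketch of the regression step, assuming a list of random examples `(x, f(x))` with x ∈ {0,1}^n; it fits coefficients for all parity monomials χ_A, A ⊆ J, |A| ≤ d, by least squares (numpy assumed available):

```python
import itertools
import numpy as np

def learn_low_degree(examples, J, d):
    """Least-squares fit over monomials chi_A(x) = prod_{i in A} (1 - 2 x_i)
    for A subset of J with |A| <= d, as on this slide."""
    monomials = [A for r in range(d + 1) for A in itertools.combinations(J, r)]
    X = np.array([[np.prod([1.0 - 2 * x[i] for i in A]) for A in monomials]
                  for x, _ in examples])
    y = np.array([fx for _, fx in examples])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return lambda x: float(sum(c * np.prod([1.0 - 2 * x[i] for i in A])
                               for c, A in zip(coef, monomials)))
```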

22 Learning submodular functions
How to exploit the small junta? The selection rule Pr[∂_i f(J(δ)) ≥ β] > 1/2 cannot be evaluated using uniform random examples! What can be estimated, and what do we get for |J| and ‖f − f_J‖₂?
Lemma: let f: {0,1}^n → [0,1] be a submodular function that is ε-approximated by a k-junta in ℓ2, and let J = {i : |f̂({i})| ≥ ε/k} ∪ {i : ∃j, ‖∂_{i,j} f‖₁ ≥ ε²/k²}. Then |J| = O(k²/ε²) and ‖f − f_J‖₂ ≤ 3ε.

23 Estimation and the size of J
J = {i : |f̂({i})| ≥ ε/k} ∪ {i : ∃j, ‖∂_{i,j} f‖₁ ≥ ε²/k²}.
Since ∂_{i,j} f ≤ 0 for submodular f: ‖∂_{i,j} f‖₁ = E[|∂_{i,j} f(x)|] = −E[∂_{i,j} f(x)] = −4·E[f(x)·χ_{{i,j}}(x)] = −4·f̂({i,j}).
Σ_i f̂({i})² + Σ_{i,j} f̂({i,j})² ≤ 1 ⇒ |J| = O(k⁴/ε⁴).
Using Infl¹(f) ≤ 4 one can get |J| = O(k²/ε²), and an O(1/ε⁶)-junta for submodular functions.
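Since ‖∂_{i,j} f‖₁ = −4·f̂({i,j}) for submodular f, the selection criterion can be estimated from uniform random examples; a one-function sketch:

```python
def d_ij_l1(examples, i, j):
    """Estimate ||d_{i,j} f||_1 = -4 * E[f(x) * chi_{{i,j}}(x)] from uniform
    random examples [(x, f(x))], using chi_{{i,j}}(x) = (1-2x_i)(1-2x_j)."""
    est = sum(fx * (1 - 2 * x[i]) * (1 - 2 * x[j]) for x, fx in examples)
    return -4.0 * est / len(examples)
```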

24 Bounding ‖f − f_J‖₂
J = {i : |f̂({i})| ≥ ε/k} ∪ {i : ∃j, ‖∂_{i,j} f‖₁ ≥ ε²/k²}. Given that ‖f − f_I‖₂ ≤ ε for some I with |I| = k, it suffices to prove ‖f_I − f_{I∩J}‖₂ ≤ 2ε, since ‖f − f_J‖₂ ≤ ‖f − f_{I∩J}‖₂ ≤ ‖f − f_I‖₂ + ‖f_I − f_{I∩J}‖₂ ≤ 3ε.
Writing f_I(x) = Σ_{A⊆I} f̂(A)·χ_A(x):
‖f_I − f_{I∩J}‖₂² = Σ_{A⊆I, A∩(I\J)≠∅} f̂(A)² = Σ_{i∈I\J} f̂({i})² + Σ_{i∈I\J, j∈I} Σ_{A⊆I\{i,j}} f̂(A∪{i,j})² ≤ ε² + ε²/4 ≤ (2ε)²,
using Σ_{A⊆[n]\{i,j}} f̂(A∪{i,j})² = ‖∂_{i,j} f/4‖₂² ≤ ‖∂_{i,j} f‖₁/4 ≤ ε²/(4k²) for each i ∈ I\J.

25 PMAC learning
Multiplicative (PMAC) [Balcan, Harvey'11]: Pr_{x∼D}[h(x) ≤ f(x) ≤ (1+γ)·h(x)] ≥ 1 − δ.
Assume max_x f(x) = 1. Given the O(1/ε⁶) candidate variables, find an O(1/ε²)-junta f_J by trying all subsets of size O(1/ε²); this takes 2^{Õ(1/ε²)} time and gives ‖f − f_J‖₁ ≤ ‖f − f_J‖₂ ≤ ε.

26 PMAC algorithm
‖f − f_J‖₁ ≤ ε
[Figure: examples x = (S, T) with S ⊆ J and T ⊆ [n]\J, grouped by their restriction S to the junta]

27–29 When the conditions do not hold for some S ⊆ J:
Define, for T ⊆ [n]\J, f_{J←S}(T) = f(S ∪ T) (again submodular).
Rescale f_{J←S} so that max_{T⊆[n]\J} f_{J←S}(T) ∈ [1/4, 1].
Solve recursively for the rescaled f_{J←S} by filtering the examples to those agreeing with S on J.
[Figure: ‖f − f_J‖₁ ≤ ε; the cube split by settings S ⊆ J, each spawning a rescaled subproblem]

30 For submodular f: {0,1}^n → ℝ₊, E[f] ≥ max_x f(x)/4
[Feige'06][Feige,Mirrokni,Vondrák'07]
E[f_J] = E[f] ≥ 1/4 ⇒ Pr_{S⊆J}[f_J(S) ≥ 1/8] ≥ 1/8.
Conditions on S: f_J(S) ≥ 1/8 and E_{T⊆[n]\J}[|f_J(S) − f(S∪T)|] ≤ γδ/16.
Both hold for at least a 1/8 − 1/16 = 1/16 fraction of S ⊆ J, and can be (approximately) verified using random examples.

31 PMAC algorithm
When the conditions do not hold for S ⊆ J: define, for T ⊆ [n]\J, f_{J←S}(T) = f(S ∪ T) (submodular); rescale f_{J←S} so that max_{T⊆[n]\J} f_{J←S}(T) ∈ [1/4, 1]; PMAC-learn f_{J←S} recursively by filtering examples, but stop after O(log(1/δ)) levels of recursion.
Successful on a 1 − δ fraction of the points. Running time: O(1/ε)^{O(log(1/δ))} = 2^{Õ(1/(γ²δ²))}.
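A loose Python skeleton of the recursion's control flow, illustrative only: the inner `estimate_fJ` and `conditions_hold` are crude stand-ins for the sampling-based procedures of the previous slides, and the real algorithm also rescales and re-selects a junta in each subproblem:

```python
def pmac_learn(examples, J, gamma, depth):
    """Recurse on restrictions f_{J<-S} by filtering examples, stopping
    after `depth` levels, as on this slide."""
    def estimate_fJ(exs):
        # Average f over examples sharing each junta pattern (stand-in).
        buckets = {}
        for x, v in exs:
            buckets.setdefault(tuple(x[i] for i in J), []).append(v)
        return {S: sum(vs) / len(vs) for S, vs in buckets.items()}

    def conditions_hold(fJ, S):
        # Stand-in for: f_J(S) >= 1/8 and f concentrated around f_J(S).
        return fJ.get(S, 0.0) >= 1.0 / 8

    if depth == 0:
        return lambda x: 0.0                      # give up below the cutoff
    fJ = estimate_fJ(examples)
    def h(x):
        S = tuple(x[i] for i in J)
        if conditions_hold(fJ, S):
            return fJ[S] / (1 + gamma)
        sub = [(y, v) for (y, v) in examples      # filter examples matching S on J
               if all(y[i] == x[i] for i in J)]
        return pmac_learn(sub, J, gamma, depth - 1)(x) if sub else 0.0
    return h
```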

32 Conclusions and open problems
Also in the paper: applications to agnostic learning, proper learning, and testing; a multiplicative junta approximation for monotone submodular functions:
For any monotone submodular f: {0,1}^n → ℝ₊ and γ, δ > 0 there exists J with |J| = O((1/γ²)·log(1/γ)·log(1/(γδ))) such that Pr_{x∼U}[(1−γ)·f_J(x) ≤ f(x) ≤ (1+γ)·f_J(x)] ≥ 1 − δ.
Open problems:
Is this true for non-monotone submodular functions? What about XOS?
Can the dependence on δ in the running time of the PMAC algorithm be improved from 2^{Õ(1/(γ²δ²))}?
Learning and structure of submodular functions under more general families of distributions.

