Sorting Networks
Uri Zwick, Tel Aviv University
Started: May 2015. Last update: February 5, 2017.

A comparator takes two inputs $x$ and $y$ and outputs $\min(x,y)$ on its top wire and $\max(x,y)$ on its bottom wire. Can we use comparators to build efficient sorting networks?

Comparator networks A comparator network is a network composed of comparators. There are 𝑛 input wires, each feeding a single comparator. Each output of a comparator is either an output wire, or feeds a single comparator. The network must be acyclic. Number of output wires is also 𝑛.
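To make the definition concrete, here is a minimal sketch (not from the slides; function and variable names are illustrative) that represents a comparator network as an ordered list of wire pairs and applies it to an input vector, with each comparator leaving the minimum on its lower-numbered wire:

```python
def apply_network(network, values):
    """Apply a comparator network (a list of wire pairs) to a list of values."""
    wires = list(values)
    for i, j in network:                     # comparator between wires i and j, i < j
        if wires[i] > wires[j]:              # min goes to wire i, max to wire j
            wires[i], wires[j] = wires[j], wires[i]
    return wires

# A 5-comparator, 3-level sorting network on 4 wires
# (one standard choice; the 5-comparator network drawn in the slides may differ).
SIMPLE_4 = [(0, 1), (2, 3),    # level 1
            (0, 2), (1, 3),    # level 2
            (1, 2)]            # level 3

print(apply_network(SIMPLE_4, [3, 1, 4, 2]))    # [1, 2, 3, 4]
```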

“Standard form” For each compare/exchange, the smaller item goes up, the larger item goes down. Exercise: Show that any comparator network is equivalent to a network in standard form with the same number of comparators.

A simple sorting network 5 comparators 3 levels

Insertion sort [figure: recursive construction of the $Sort(n)$ network]

Selection/bubble sort [figure: recursive construction of the $Sort(n)$ network]

Selection/bubble sort. Size $= \frac{n(n-1)}{2}$, Depth $= 2n-1$.

Exercise: Any sorting network that only compares adjacent lines must be of size at least $\frac{n(n-1)}{2}$. Exercise: Prove that the odd-even transposition sort, shown on the next slide, is a sorting network.

Odd-even transposition sort. Size $= \frac{n(n-1)}{2}$, Depth $= n$.

The 0-1 principle. Theorem: If a network sorts all 0-1 inputs, then it sorts all inputs.

The 0-1 principle. Lemma: Let $f$ be a monotone non-decreasing function. Then, if a network maps $x_1, x_2, \dots, x_n$ to $y_1, y_2, \dots, y_n$, it maps $f(x_1), f(x_2), \dots, f(x_n)$ to $f(y_1), f(y_2), \dots, f(y_n)$. Proof: By induction on the number of comparisons, using $f(\min(a,b)) = \min(f(a), f(b))$ and $f(\max(a,b)) = \max(f(a), f(b))$.

The 0-1 principle. Proof: Suppose that a network is not a sorting network. It then maps some $x_1, x_2, \dots, x_n$ to $y_1, y_2, \dots, y_n$ where $y_i > y_{i+1}$ for some $1 \le i < n$. Let $f(x) = 1$ if $x \ge y_i$ and $f(x) = 0$ otherwise. Since $f$ is monotone non-decreasing, by the lemma the network maps $f(x_1), f(x_2), \dots, f(x_n)$ to $f(y_1), \dots, f(y_i) = 1, f(y_{i+1}) = 0, \dots, f(y_n)$. Thus, the network does not sort all 0-1 inputs.
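The theorem yields a simple, if exponential, way to test a small network: by the 0-1 principle it suffices to try all $2^n$ 0-1 inputs. A minimal sketch (illustrative names, not from the slides):

```python
from itertools import product

def apply_network(network, values):
    wires = list(values)
    for i, j in network:
        if wires[i] > wires[j]:
            wires[i], wires[j] = wires[j], wires[i]
    return wires

def is_sorting_network(network, n):
    """By the 0-1 principle, checking the 2^n 0-1 inputs suffices."""
    return all(apply_network(network, x) == sorted(x)
               for x in product((0, 1), repeat=n))

SIMPLE_4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]
print(is_sorting_network(SIMPLE_4, 4))         # True
print(is_sorting_network(SIMPLE_4[:-1], 4))    # False: the last comparator is needed
```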

Sorting by merging [figure: $Sort(n)$ and $Sort(m)$ followed by $Merge(n,m)$]

Batcher's odd-even merge [figure: the odd-indexed and even-indexed items of $a_1, \dots, a_n$ and $b_1, \dots, b_m$ are merged recursively by $M(\lceil n/2\rceil, \lceil m/2\rceil)$ and $M(\lfloor n/2\rfloor, \lfloor m/2\rfloor)$, producing $o_1, o_2, o_3, \dots$ and $e_1, e_2, \dots$]

Batcher's odd-even merge [figure: the recursive structure of $M(n,m)$, with the two sub-mergers followed by a final level of comparators]

Batcher's odd-even merge. To merge $a_1, a_2, \dots, a_n$ with $b_1, b_2, \dots, b_m$: split $a_1, a_2, \dots, a_n$ into $a_1, a_3, \dots$ and $a_2, a_4, \dots$; split $b_1, b_2, \dots, b_m$ into $b_1, b_3, \dots$ and $b_2, b_4, \dots$; merge the odd-indexed items to create $o_1, o_2, \dots$; merge the even-indexed items to create $e_1, e_2, \dots$; compare/swap $(e_1, o_2), (e_2, o_3), \dots$ Why does it work?
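When the two sequences sit in one array of wires whose total length is a power of two, the same recursion has a well-known compact form in terms of strides. The sketch below (illustrative names; it covers only the power-of-two case, not the general $M(n,m)$ recursion) generates the comparators and checks them against all 0-1 inputs with sorted halves:

```python
from itertools import product

def oddeven_merge(lo, hi, r=1):
    """Comparators that merge the subsequence on wires lo, lo+r, lo+2r, ..., hi,
    assuming its first and second halves are sorted (wire count a power of two).
    With r=1 this merges the two sorted halves of wires lo..hi."""
    step = r * 2
    if step < hi - lo:
        yield from oddeven_merge(lo, hi, step)        # merge the odd-indexed items (the o's)
        yield from oddeven_merge(lo + r, hi, step)    # merge the even-indexed items (the e's)
        for i in range(lo + r, hi - r, step):         # final level: (e_1, o_2), (e_2, o_3), ...
            yield (i, i + r)
    else:
        yield (lo, lo + r)

def apply_network(network, values):
    wires = list(values)
    for i, j in network:
        if wires[i] > wires[j]:
            wires[i], wires[j] = wires[j], wires[i]
    return wires

# 0-1 check: any input whose two halves are sorted must come out fully sorted.
n = 8                                                 # merge two sorted runs of length 4
net = list(oddeven_merge(0, n - 1))
halves = [h for h in product((0, 1), repeat=n // 2) if list(h) == sorted(h)]
assert all(apply_network(net, a + b) == sorted(a + b) for a in halves for b in halves)
print(len(net))                                       # 9 comparators, i.e. M(4,4) = 9
```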

Batcher's odd-even merge. Direct proof (without the 0-1 principle; assume distinct items).
Claim 1: $o_{i+1}$ should be in position $2i$ or $2i+1$ of the merged sequence.
Claim 2: $e_i$ should be in position $2i$ or $2i+1$ of the merged sequence.
Thus, the compare/swap of $e_i$ and $o_{i+1}$ puts all items in the correct place.
Proof of Claim 1: Suppose $o_{i+1}$ comes from $a$. (The other case is similar.)
If $o_{i+1} = a_{2i+1}$, then $a_1 < \dots < a_{2i+1} = o_{i+1} < b_1 < \dots$, so $o_{i+1}$ should be in position $2i+1$.
If $o_{i+1} = a_{2j-1}$ (the $j$-th odd item in $a$) with $j \le i$, then $a_1 < \dots < a_{2j-1} = o_{i+1}$ and $b_1 < \dots < b_{2(i-j)+1} < o_{i+1} < b_{2(i-j)+3}$, where $b_{2(i-j)+1}$ is the $(i-j+1)$-st odd item in $b$, and $b_{2(i-j)+3}$ may or may not exist.
Thus, since $2i = (2j-1) + (2(i-j)+1)$, $o_{i+1}$ should be in position $2i$ if $o_{i+1} < b_{2(i-j)+2}$, or in position $2i+1$ if $b_{2(i-j)+2} < o_{i+1}$.
The proof of Claim 2 is similar (exercise).

Batcher's odd-even merge [figure: the networks $M(1,1)$ and $M(2,2)$]

Batcher’s odd-even merge - 𝑀(4,4)

Batcher’s odd-even merge - 𝑀(4,4)

Odd-even merge → Odd-even sort

Odd-even merge for $m=1$ [figure: the networks $M(2,1)$ on inputs $a_1, a_2, b_1$ and $M(3,1)$ on inputs $a_1, a_2, a_3, b_1$]

Odd-even merge for $m=1$ [figure: recursive structure using $M(\lceil n/2\rceil, 1)$]

Batcher's odd-even merge. Proof using the 0-1 principle. Suppose that $a_1, a_2, \dots, a_n$ starts with $n_0$ 0's and that $b_1, b_2, \dots, b_m$ starts with $m_0$ 0's. Then $o_1, o_2, \dots$ starts with $\lceil n_0/2\rceil + \lceil m_0/2\rceil$ 0's, and $e_1, e_2, \dots$ starts with $\lfloor n_0/2\rfloor + \lfloor m_0/2\rfloor$ 0's. The difference is either 0, 1 or 2! There is a problem only if the difference is 2, and the last level of comparators fixes it.

Batcher's odd-even merge [figure: three 0-1 examples of the sorted even ($e$) and odd ($o$) subsequences, in which their numbers of 0's differ by 0, 1 and 2 respectively]. Exercise: Justify the use of the 0-1 principle for merging networks.

Batcher's odd-even merge. Size (number of comparators):
$M(n,0) = M(0,m) = 0$, $M(1,1) = 1$
$M(n,m) = M(\lceil n/2\rceil, \lceil m/2\rceil) + M(\lfloor n/2\rfloor, \lfloor m/2\rfloor) + \lfloor (n+m-1)/2 \rfloor$
$M(n,n) = 2M(n/2, n/2) + (n-1)$
$M(2^k, 2^k) = k\,2^k + 1$
$M(n,n) = n \lg n + O(n)$
$M(n,m) = \frac{n+m}{2} \lg m + O(n)$, for $n \ge m$
No better merging networks are known for any $n, m$! Are the odd-even merge networks optimal?
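The recurrence is easy to evaluate directly. A short sketch (illustrative names) that also confirms $M(2^k, 2^k) = k\,2^k + 1$:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def merge_size(n, m):
    """Number of comparators in Batcher's odd-even merge, from the recurrence."""
    if n == 0 or m == 0:
        return 0
    if n == 1 and m == 1:
        return 1
    return (merge_size((n + 1) // 2, (m + 1) // 2)    # odd-indexed items
            + merge_size(n // 2, m // 2)              # even-indexed items
            + (n + m - 1) // 2)                       # final compare/swap level

for k in range(1, 11):
    assert merge_size(2 ** k, 2 ** k) == k * 2 ** k + 1

print([merge_size(2 ** k, 2 ** k) for k in range(1, 6)])   # [3, 9, 25, 65, 161]
```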

Bitonic sequences. A sequence is strict bitonic iff it is a concatenation of a decreasing sequence and an increasing sequence. A sequence is bitonic iff it is a cyclic shift of a strict bitonic sequence. Examples: 8 6 4 1 2 5 7 9 is strict bitonic; 4 1 2 5 7 9 8 6 is bitonic (a cyclic shift of the previous sequence); 1 4 2 3 is not bitonic.

A bitonic sorter. A (strict) bitonic sorter is a network that sorts every (strict) bitonic sequence. If $a_1, a_2, \dots, a_n$ and $b_1, b_2, \dots, b_m$ are sorted, then $a_n, a_{n-1}, \dots, a_1, b_1, b_2, \dots, b_m$ is strict bitonic, since $a_n \ge a_{n-1} \ge \dots \ge a_1$ and $b_1 \le b_2 \le \dots \le b_m$. Thus, a strict bitonic sorter can serve as a merging network.

Batcher's bitonic sorter [figure: $B_n$ built recursively from two copies of $B_{n/2}$]. Is this different from odd-even merge?

Batcher's bitonic sorter. Simple proof using the 0-1 principle. A strict binary bitonic sequence has the form $1^k 0^\ell 1^{n-k-\ell}$. The odd and even subsequences are also strict binary bitonic sequences. By induction, they are sorted correctly. The difference between the number of 0's in the two sorted subsequences is $-1$, $0$ or $1$. The final level of comparators fixes the problem.

Batcher's bitonic sorter for $n = 2^k$ [figure: the network on wires labelled $000, 001, \dots, 111$]. Very regular structure! When $n = 2^k$, there are $k$ levels with $n/2$ comparators each. Lines $i$ and $j$ are compared at level $\ell$ iff they differ only in the $\ell$-th most significant bit.
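A short sketch (illustrative names) that builds the $k$ levels directly from this bit rule and verifies, over all 0-1 inputs, that every binary bitonic sequence is sorted; a 0-1 sequence is bitonic exactly when, read cyclically, it changes value at most twice:

```python
from itertools import product

def bitonic_sorter_levels(k):
    """Level l (1-based) compares wires i and j that differ only in the l-th most significant bit."""
    n = 1 << k
    levels = []
    for lvl in range(1, k + 1):
        bit = 1 << (k - lvl)              # the l-th most significant of the k address bits
        levels.append([(i, i | bit) for i in range(n) if not (i & bit)])
    return levels

def is_binary_bitonic(x):
    """A 0-1 sequence is bitonic iff it changes value at most twice cyclically."""
    return sum(x[i] != x[(i + 1) % len(x)] for i in range(len(x))) <= 2

def apply_levels(levels, values):
    wires = list(values)
    for level in levels:
        for i, j in level:
            if wires[i] > wires[j]:
                wires[i], wires[j] = wires[j], wires[i]
    return wires

k = 3
levels = bitonic_sorter_levels(k)
assert all(apply_levels(levels, x) == sorted(x)
           for x in product((0, 1), repeat=1 << k) if is_binary_bitonic(x))
print([len(level) for level in levels])   # [4, 4, 4]: k levels of n/2 comparators each
```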

Batcher's bitonic sorter for $n = 2^k$ [figure: a half cleaner followed by two copies of $B_{n/2}$]. Alternative recursive definition for $n = 2^k$. Sorts general bitonic sequences (but only when $n = 2^k$). The first level is called a "half cleaner".

Batcher's bitonic merger for $n = 2^k$ [figure: sorted inputs $a_1, a_2, \dots, a_n$ and $b_1, b_2, \dots, b_n$, an adjusted first level, then two copies of $B_n$]. Instead of reversing the first sequence, simply adjust the first level.

Batcher's bitonic merge [figure: $B(4,4)$ on inputs $a_1, \dots, a_4$ and $b_1, \dots, b_4$]

Batcher's merging and sorting networks. Assume that $n = 2^k$.
Odd-even merge: $M(n,n) = 2M(n/2, n/2) + (n-1)$, so $M(n,n) = n\lg n + 1$. Smaller depth for $m \ne n$. Smallest merging network known!
Bitonic merge: $M(n,n) = 2M(n/2, n/2) + n$, so $M(n,n) = n\lg n + n$. Very regular structure.
In both cases $S(n) = 2S(n/2) + M(n/2, n/2)$:
Odd-even sort: $S(n) = \frac{1}{4} n \lg n (\lg n - 1) + (n-1)$.
Bitonic sort: $S(n) = \frac{1}{4} n \lg n (\lg n - 1) + \frac{1}{2} n \lg n$.
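Unrolling $S(n) = 2S(n/2) + M(n/2, n/2)$ with the two merge sizes gives the closed forms above. A quick numeric check for powers of two (illustrative names):

```python
def lg(n):                    # exact log2 for powers of two
    return n.bit_length() - 1

def sort_size(n, bitonic=False):
    """S(n) = 2 S(n/2) + M(n/2, n/2), with M(m,m) = m lg m + 1 (odd-even)
    or m lg m + m (bitonic)."""
    if n == 1:
        return 0
    m = n // 2
    merge = m * lg(m) + (m if bitonic else 1)
    return 2 * sort_size(m, bitonic) + merge

def closed_form(n, bitonic=False):
    extra = n * lg(n) // 2 if bitonic else n - 1
    return n * lg(n) * (lg(n) - 1) // 4 + extra

for k in range(1, 12):
    n = 2 ** k
    assert sort_size(n) == closed_form(n)
    assert sort_size(n, bitonic=True) == closed_form(n, bitonic=True)

print(sort_size(8), sort_size(16))   # 19 and 63 comparators for the odd-even sort
```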

Lower bound for merging networks. Let $M(n,n)$ be the smallest possible size of an $(n,n)$-merging network. Consider inputs with $a_1, b_1, \dots, a_{n/2}, b_{n/2} < a_{n/2+1}, b_{n/2+1}, \dots, a_n, b_n$: the top and bottom parts of the network must then both be $(n/2,n/2)$-merging networks. Consider inputs with $b_1, \dots, b_n < a_1, \dots, a_n$: there must be at least $n/2$ comparisons between the top and the bottom. Thus:
$M(n,n) \ge 2M(n/2, n/2) + n/2$ and $M(n,n) \ge \frac{1}{2} n \lg n - O(n)$ [Floyd (1969)]
$M(n,m) \ge \frac{1}{2}(n+m) \lg m - O(m)$ [Miltersen-Paterson-Tarui (1996)]

Batcher’s Odd-even sort 𝑛=8 Optimal size!

Batcher’s Odd-even sort 𝑛=16 Size = 63 Optimal? No!

Smallest known network for 𝑛=16 [Green (1969)] Knuth, Vol. 3, p. 227 Size = 60 Depth = 10

Smallest known sorting networks Knuth (1998) Vol. 3, p. 227

Smallest known sorting networks. For $n \le 8$, Batcher's odd-even networks are optimal:

n        5   6   7   8
Batcher  9  12  16  19

For $n > 8$, Batcher's networks are not optimal:

n            9  10  11  12  13  14  15  16
Batcher     26  31  37  41  48  53  59  63
Upper bound 25  29  35  39  45  51  56  60

Lower bound (shown only for some $n$): 33, 49. (See [Codish-(Cruz-Filipe)-(Schneider-Kamp) (2014)].)

Fastest (smallest-depth) sorting networks [figures]: $n=6$: depth 5; $n=10$: depth 7; $n=12$: depth 8; $n=16$: depth 9.

[Codish-(Cruz-Filipe)-Ehlers-Müller-(Schneider-Kamp) (2015)]: $n=17$: depth 10 (optimal!); $n=20$: depth 11, with a lower bound of depth $\ge 10$.

The AKS sorting networks [Ajtai-Komlós-Szemerédi (1983)]. There are sorting networks of depth $O(\log n)$, and hence size $O(n\log n)$. Simplifications and improvements by [Paterson (1990)], [Pippenger (1990)], [Chvátal (1991)], [Seiferas (2009)]. The constant factors are very large: $1830 \lg n - 58657$ for $n \ge 2^{78}$ [Chvátal (1991)]. Our presentation follows [Paterson (1990)] and [Seiferas (2009)], with a focus on simplicity, not necessarily efficiency.

Sorting by splitting [figure: $Halver(n)$ followed by $Sort(n/2)$ on each half]. Problem: halving requires $\Omega(\log n)$ depth. Furthermore, prior to AKS, even halvers of depth $O(\log n)$ were not known…

$\varepsilon$-Halver [figure: $\varepsilon$-$H(m)$ with $m$ inputs, $m/2$ top outputs and $m/2$ bottom outputs]. Definition: For every input, and every $k \le m/2$, at most $\varepsilon k$ of the $k$ smallest items end up at the bottom, and at most $\varepsilon k$ of the $k$ largest items end up at the top.

$(\varepsilon,\beta)$-Halver [figure: as above]. Definition: For every input, and every $k \le \beta m/2$, at most $\varepsilon k$ of the $k$ smallest items end up at the bottom, and at most $\varepsilon k$ of the $k$ largest items end up at the top.
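The property depends only on where the $k$ smallest and $k$ largest items can end up, so, by the same monotone-function argument as the 0-1 principle, it suffices to check 0-1 inputs: in an input with $k$ zeros the zeros are the $k$ smallest items, and in an input with $k$ ones the ones are the $k$ largest. A brute-force sketch for small $m$ (illustrative names; the single-matching example at the end is ours, not from the slides):

```python
from itertools import product

def apply_network(network, values):
    wires = list(values)
    for i, j in network:
        if wires[i] > wires[j]:
            wires[i], wires[j] = wires[j], wires[i]
    return wires

def is_halver(network, m, eps, beta=1.0):
    """Brute-force check of the (eps, beta)-halver property over all 0-1 inputs.
    Wires 0..m/2-1 are the top outputs, wires m/2..m-1 the bottom outputs."""
    half = m // 2
    for x in product((0, 1), repeat=m):
        out = apply_network(network, x)
        k_small, k_large = x.count(0), x.count(1)        # 0's are smallest, 1's are largest
        if k_small <= beta * half and out[half:].count(0) > eps * k_small:
            return False
        if k_large <= beta * half and out[:half].count(1) > eps * k_large:
            return False
    return True

m = 8
one_round = [(i, i + m // 2) for i in range(m // 2)]     # a single matching: one parallel round
print(is_halver(one_round, m, 0.5))    # True: one round already gives a (1/2)-halver
print(is_halver(one_round, m, 0.4))    # False: this single round is not an eps-halver for eps < 1/2
```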

$(\alpha,\beta)$-Expander. A bipartite graph $G = (U, V, E)$, where $|U| = |V| = n$, is an $(\alpha,\beta)$-expander if and only if for every $X \subseteq U$ or $X \subseteq V$ with $|X| \le \beta n$, we have $|N(X)| > \alpha |X|$. Theorem: For every $\beta < 1$ and $\alpha < 1/\beta$, there exists $\Delta_{\alpha,\beta}$ such that for every $n \ge 1$ there exists an $(\alpha,\beta)$-expander with $n$ vertices on each side and degree at most $\Delta_{\alpha,\beta}$. Constant degree that does not depend on $n$!

Probabilistic construction of expanders. Let $G_\Delta = (U, V, E)$, where $|U| = |V| = n$, be obtained as the union of $\Delta$ independently chosen random perfect matchings of $U$ and $V$. Theorem: If $\Delta > \Delta_{\alpha,\beta} = \frac{H(\beta) + H(\alpha\beta)}{\beta \lg\frac{1}{\alpha\beta}}$, then with positive probability $G_\Delta = (U, V, E)$ is an $(\alpha,\beta)$-expander. ($H(p) = -p\lg p - (1-p)\lg(1-p)$ is the entropy function.)
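Plugging in numbers (an illustrative sketch, not from the slides) shows how large $\Delta$ must be for the expanders used later: the halver construction on the following slides needs a $\left(\frac{1-\varepsilon}{\varepsilon}, \beta\varepsilon\right)$-expander, and already for $\varepsilon = 1/100$, $\beta = 1$ the bound asks for more than a thousand matchings, one reason the constant factors are so large:

```python
from math import log2

def entropy(p):
    """Binary entropy H(p)."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def min_degree(alpha, beta):
    """Delta_{alpha,beta} = (H(beta) + H(alpha*beta)) / (beta * lg(1/(alpha*beta)))."""
    return (entropy(beta) + entropy(alpha * beta)) / (beta * log2(1 / (alpha * beta)))

# Expander needed for an eps-halver (next slides): alpha = (1-eps)/eps, beta' = beta*eps.
# With eps = 1/100 and beta = 1, note that alpha < 1/beta' as required (99 < 100).
eps = 0.01
alpha, beta_exp = (1 - eps) / eps, 1.0 * eps
print(round(min_degree(alpha, beta_exp)))   # about 1114 matchings, i.e. depth ~1114 per halver
```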

Probabilistic construction of expanders. For $A \subseteq U$, $B \subseteq V$, let $p_{A,B}$ be the probability that $N(A) \subseteq B$ in $G_\Delta$. Then $p_{A,B} = \left(\binom{b}{a} / \binom{n}{a}\right)^\Delta \le \left(\frac{b}{n}\right)^{a\Delta}$, where $|A| = a$, $|B| = b$. The probability that $G_\Delta$ is not an $(\alpha,\beta)$-expander is at most
$2 \sum_{k=1}^{\beta n} \sum_{|A| = k,\, |B| = \alpha k} p_{A,B} = 2 \sum_{k=1}^{\beta n} \binom{n}{k} \binom{n}{\alpha k} \left(\frac{\alpha k}{n}\right)^{k\Delta}$.

Probabilistic construction of expanders 2 𝑘=1 𝛽𝑛 𝐴 =𝑘 𝐵 = 𝛼𝑘 𝑝 𝐴,𝐵 =2 𝑘=1 𝛽𝑛 𝑛 𝑘 𝑛 𝛼𝑘 𝛼𝑘 𝑛 𝑘Δ 𝑛 𝑘 < 2 𝐻 𝑘 𝑛 𝑛 𝑛 𝑘 𝑛 𝛼𝑘 𝛼𝑘 𝑛 𝑘Δ ≤ 2 𝐻 𝑘 𝑛 𝑛+𝐻 𝛼𝑘 𝑛 𝑛− 𝑘 lg 𝑛 𝛼𝑘 Δ ≤ 2 𝐻 𝛾 +𝐻 𝛼𝛾 − 𝛾 lg 1 𝛼𝛾 Δ n 𝑘=𝛾𝑛 , 𝛾≤𝛽 Δ > 𝐻 𝛾 +𝐻 𝛼𝛾 𝛾 lg 1 𝛼𝛾 Increasing in 𝛾.

$\left(\frac{1-\varepsilon}{\varepsilon}, \beta\varepsilon\right)$-Expander $\Rightarrow$ $(\varepsilon,\beta)$-Halver. A $\left(\frac{1-\varepsilon}{\varepsilon}, \beta\varepsilon\right)$-expander $G = (U, V, E)$, where $|U| = |V| = m/2$, composed of $\Delta$ matchings, can be used to construct an $(\varepsilon,\beta)$-halver on $m$ inputs of depth $\Delta$. Each matching describes $m/2$ comparisons that can be performed in parallel in one round. [figure: the lines of $U$ on top, the lines of $V$ at the bottom, and rounds $1, 2, \dots, \Delta$]

$\left(\frac{1-\varepsilon}{\varepsilon}, \beta\varepsilon\right)$-Expander $\Rightarrow$ $(\varepsilon,\beta)$-Halver. For a certain input and $k \le m$, let $A \subseteq U$ and $B \subseteq V$ be the sets of lines in which the $k$ largest items end up. Lemma: The network does not contain a comparison between a line in $A$ and a line not in $B$. Proof: Items on lines of $U$ decrease with time; items on lines of $V$ increase with time. Items in $A$ must be among the $k$ largest at all times. Items on lines of $V \setminus B$ must not be among the $k$ largest at any time. If a line in $A$ is compared with a line in $V \setminus B$, we get a contradiction.

$\left(\frac{1-\varepsilon}{\varepsilon}, \beta\varepsilon\right)$-Expander $\Rightarrow$ $(\varepsilon,\beta)$-Halver. Suppose that for a certain input, at least $\varepsilon k$ of the $k \le \frac{\beta m}{2}$ largest items end up in $U$. Let $A \subseteq U$ be the lines/registers in $U$ in which $\varepsilon k$ of these items end up; $|A| = \varepsilon k \le \beta\varepsilon\,\frac{m}{2}$. Let $B \subseteq V$ be the lines in $V$ containing the at most $(1-\varepsilon)k$ remaining items among the $k$ largest. Lines in $A$ could only have been compared with lines in $B$. Thus, $N(A) \subseteq B$ and $|N(A)| \le |B| \le (1-\varepsilon)k = \frac{1-\varepsilon}{\varepsilon}|A|$, so $G$ is not a $\left(\frac{1-\varepsilon}{\varepsilon}, \beta\varepsilon\right)$-expander.

$(\lambda,\varepsilon,\varepsilon_0)$-Separator [Paterson (1990)]. [figure: the $m$ outputs are split into a left fringe $F_1$ of size $\lambda m/2$, a center $C_1$ of size $(1-\lambda)m/2$, a center $C_2$ of size $(1-\lambda)m/2$, and a right fringe $F_2$ of size $\lambda m/2$.] For every $k \le \lambda m/2$, at most $\varepsilon k$ of the $k$ smallest items are not in $F_1$, and at most $\varepsilon k$ of the $k$ largest items are not in $F_2$. For every $k \le m/2$, at most $\varepsilon_0 k$ of the $k$ smallest items are not in $F_1 \cup C_1$, and at most $\varepsilon_0 k$ of the $k$ largest items are not in $F_2 \cup C_2$.

$\varepsilon_0$-Halvers $\Rightarrow$ $(\lambda,\varepsilon,\varepsilon_0)$-Separator. [figure: an $\varepsilon_0$-Halver($m$), then $\varepsilon_0$-halvers on the outer $m/2$, then on the outer $m/4$, and so on, for $p$ levels in total; the outputs split as $F_1, C_1, C_2, F_2$ of sizes $\lambda m/2, (1-\lambda)m/2, (1-\lambda)m/2, \lambda m/2$.] This gives $\lambda = 2^{1-p}$ and $\varepsilon = 1 - (1-\varepsilon_0)^p < p\,\varepsilon_0$.

The simplified AKS sorting network [Paterson (1990)] [figure: time $t = 0$; $n = 2^\ell$ items, all at the root]

The simplified AKS sorting network [Paterson (1990)] [figure: time $t = 1$; $n/2$ items at each child of the root]

The simplified AKS sorting network [Paterson (1990)] [figure: time $t = 2$]. Some items are sent back to the root. Capacity of a node at depth $i$ at time $t$: $b_{i,t} = n A^i \nu^t$, where $A > 1$, $\nu < 1$.

The simplified AKS sorting network [Paterson (1990)] [Seiferas (2009)]. [figure: a node of capacity $b$ holding $m \le b$ items; $\min(\lfloor\lambda b\rfloor, \lfloor m/2\rfloor)$ items go up on each side, and $\lfloor m/2\rfloor - \lfloor\lambda b\rfloor$ items go down to each child.] If $m$ is odd, send an arbitrary item up. At a leaf we should have $m \le 2\lfloor\lambda b\rfloor + 1$, as items cannot be sent down.

The simplified AKS sorting network [Paterson (1990)] [Seiferas (2009)]. The number of items $n = 2^\ell$ is a power of two. Three parameters (so far) determine the network: $A > 1$, $\nu < 1$, $\lambda < 1/2$. The network is modeled on a binary tree of depth $\ell - 1 = \lg n - 1$. $b_{i,t} = n A^i \nu^t$ is the capacity of a node at depth $i$ at time $t$. The number of items in each node at each time is at most the capacity.
At the root, split the items evenly and send them to the two children. (The number of items in the root is always even.)
At a non-root node containing $m$ items: If $\lfloor\lambda b\rfloor \ge \lfloor m/2\rfloor$, send all items up. Otherwise: if $m$ is odd, send an item up and let $m \leftarrow m - 1$; apply a separator with $\lambda'$ such that $\lambda'\frac{m}{2} = \lfloor\lambda b\rfloor$; send the $2\lfloor\lambda b\rfloor$ fringe items up; send the $\frac{m}{2} - \lfloor\lambda b\rfloor$ central left/right items to the left/right children.

Depth and time ($n = 2^\ell$). The depth of the tree is $\ell - 1 = \lg n - 1$. The process goes on until the capacity of a leaf drops below $1/\lambda$: $b_{\ell-1,T} = n A^{\ell-1}\nu^T = \frac{1}{A}(2A)^\ell\nu^T < 1/\lambda$, which gives $T \le \log_{1/\nu}(2A)\cdot\lg n + \log_{1/\nu}(\lambda/A) + 1$. The number of time steps is $O(\log n)$.

Termination ($n = 2^\ell$). At termination, the capacity of the leaves is $< 1/\lambda$. Let $k$ be such that $\frac{1}{\lambda}\left(\frac{1}{A}\right)^k < 1$. Nodes at levels above $\ell - k$ then have capacity less than 1, hence they are all empty. All items are in small subtrees of height $k - 1 = O(1)$. Each such subtree contains exactly $2^k$ items. If all items are in the correct subtree, we can finish off by sorting each set of $2^k$ items.

Strangers [AKS (1983)]. The number of items $n = 2^\ell$ is a power of two. Each node of the binary tree corresponds naturally to a subset of the items. (E.g., the left child of the root corresponds to the $n/2$ smallest items.) If the items are $0, 1, \dots, n-1$, and a node $u$ at level $i$ is represented by an $i$-bit string $x_1 x_2 \dots x_i$, then the subset of $u$ contains all items whose binary representation starts with $x_1 x_2 \dots x_i$. An item currently in a node is native if it belongs to the set of items associated with the node. Otherwise, it is a stranger. An item is a $j$-stranger at a node if it has to move at least $j$ levels up the tree to become native. If a $j$-stranger, $j \ge 1$, is sent down, it becomes a $(j+1)$-stranger. If a $j$-stranger, $j \ge 1$, is sent up, it becomes a $(j-1)$-stranger. If a native is sent to the correct child, it remains a native. If a native is sent to the wrong child, it becomes a 1-stranger.

Invariants and correctness [AKS (1983)] [Paterson (1990)] [Seiferas (2009)]. At even/odd times only even/odd levels are non-empty. Nodes at the same level contain the same number of items. The number of items in each node does not exceed the capacity. The number of $j$-strangers in a node of capacity $b$ is at most $\lambda\varepsilon^{j-1}\cdot b$. At termination, the capacity of the leaves, and hence of all nodes, is $< 1/\lambda$, so the number of strangers in each node is $\le \lambda b < \lambda\cdot\frac{1}{\lambda} = 1$, i.e., there are no strangers! Total depth $\le$ [depth of a $(2\lambda,\varepsilon,\varepsilon)$-separator] $\cdot \log_{1/\nu}(2A)\cdot\lg n$, and the bracketed factor is a constant. The network sorts, and has logarithmic depth!

Capacity invariant. [figure: a node of capacity $b$, its parent of capacity $b/A$, its children of capacity $bA$; after the step the node may hold at most $\nu b$ items.] Each child sends up at most $2\lfloor\lambda bA\rfloor + 1$ items (the $+1$ when its number of items $m$ is odd), and the parent sends down at most $\frac{b}{2A}$ items, so we need $2(2\lfloor\lambda bA\rfloor + 1) + \frac{b}{2A} \le \nu b$. If $b < A$, then all nodes above are empty and $m$ must be even. Otherwise $1 \le b/A$, and the requirement follows from $4\lambda A + \frac{5}{2A} \le \nu$.

Strangeness invariant, $j \ge 2$. [figure: node $B$ of capacity $b$, its parent of capacity $b/A$, its children of capacity $bA$.] After a step, the $j$-strangers at $B$ come from two sources: $(j+1)$-strangers sent up from the children of $B$ (at most $\lambda\varepsilon^j\cdot 2bA$ of them), and $(j-1)$-strangers at the parent of $B$ (at most $\lambda\varepsilon^{j-2}\cdot\frac{b}{A}$), of which only an $\varepsilon$ fraction is sent to $B$. The invariant is maintained if $\lambda\varepsilon^j\cdot 2bA + \varepsilon\cdot\lambda\varepsilon^{j-2}\cdot\frac{b}{A} \le \lambda\varepsilon^{j-1}\cdot\nu b$, i.e., if $2\varepsilon A + \frac{1}{A} \le \nu$.

Strangeness invariant, $j = 1$. [figure: $B$ of capacity $b$, its parent $P$ of capacity $b/A$, its sibling $C$, and $B$'s children of capacity $bA$.] Added complication: more than half of the items in $P$ may belong to the set associated with $C$. Thus, even if the split at $P$ is perfect, some items native to $P$ (and $C$) will be sent to $B$ and become 1-strangers. How many extra items native to $C$ can there be at $P$?

Excess. Let $I$ be the set of items native to $C$. Consider an "ideal" distribution of the items of $I$ in the tree: items of $I$ fill all the descendants of $C$ (up to their current sizes), one-half of $P$, one-eighth of $P$'s grandparent, etc. Compare the current distribution of the items of $I$ with the "ideal" distribution. If more than half of the current items in $P$ are from $I$, this must be compensated by nodes currently containing fewer items of $I$ than in the "ideal" distribution. Exercise: Show that the "ideal" distribution does indeed exist.

Excess. Let $I$ be the set of items native to $C$. Items in descendants of $C$ that do not belong to $I$ are strangers. Their number, therefore, is at most
$2\lambda\varepsilon\, bA + 8\lambda\varepsilon^3 bA^3 + 32\lambda\varepsilon^5 bA^5 + \dots < \frac{2\lambda\varepsilon bA}{1 - (2\varepsilon A)^2}$.
The number of items of $I$ contained in the "ideal" distribution in all ancestors of $P$ is
$\frac{1}{8}\frac{b}{A^3} + \frac{1}{32}\frac{b}{A^5} + \dots < \frac{b}{8A^3 - 2A}$.

Strangeness invariant, $j = 1$. After a step, the 1-strangers at $B$ come from: 2-strangers at the children of $B$ (at most $\lambda\varepsilon\cdot 2bA$); 1-strangers at $P$, "pushing" natives of $C$ to $B$ (at most $\lambda\cdot\frac{b}{A}$); classification errors of the separator at $P$ (at most $\varepsilon\cdot\frac{b}{2A}$); and the excess items in $P$ that are native to $C$ (at most $\frac{2\lambda\varepsilon bA}{1-(2\varepsilon A)^2} + \frac{b}{8A^3 - 2A}$, from the previous slide). The invariant is maintained if
$\lambda\varepsilon\cdot 2bA + \lambda\cdot\frac{b}{A} + \varepsilon\cdot\frac{b}{2A} + \frac{2\lambda\varepsilon bA}{1-(2\varepsilon A)^2} + \frac{b}{8A^3-2A} \le \lambda\cdot\nu b$,
i.e., if $2\lambda\varepsilon A + \frac{\lambda}{A} + \frac{\varepsilon}{2A} + \frac{2\lambda\varepsilon A}{1-(2\varepsilon A)^2} + \frac{1}{8A^3-2A} \le \lambda\nu$.
(Note: one of these terms is stated incorrectly in Seiferas' paper.)

Required inequalities [Paterson (1990)] [Seiferas (2009)]:
$A > 1$, $\nu < 1$, $\varepsilon > 0$, $\lambda < 1/2$
$4\lambda A + \frac{5}{2A} \le \nu$
$2\varepsilon A + \frac{1}{A} \le \nu$
$2\lambda\varepsilon A + \frac{\lambda}{A} + \frac{\varepsilon}{2A} + \frac{2\lambda\varepsilon A}{1-(2\varepsilon A)^2} + \frac{1}{8A^3-2A} \le \lambda\nu$
Sample parameters: $A = 10$, $\lambda = \varepsilon = \frac{1}{100}$, $\nu = \frac{13}{20}$. With this choice $\log_{1/\nu}(2A) \approx 6.95$.
Total depth $\le$ [depth of a $(2\lambda,\varepsilon,\varepsilon)$-separator] $\cdot \log_{1/\nu}(2A)\cdot\lg n$.
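A quick sanity check of the three inequalities and of the resulting depth constant with the sample parameters (an illustrative sketch using exact rational arithmetic):

```python
from fractions import Fraction as F
from math import log

A, lam, eps, nu = F(10), F(1, 100), F(1, 100), F(13, 20)

assert 4 * lam * A + F(5) / (2 * A) <= nu                      # capacity invariant
assert 2 * eps * A + 1 / A <= nu                               # strangeness invariant, j >= 2
assert (2 * lam * eps * A + lam / A + eps / (2 * A)
        + 2 * lam * eps * A / (1 - (2 * eps * A) ** 2)
        + 1 / (8 * A ** 3 - 2 * A)) <= lam * nu                # strangeness invariant, j = 1

# Time steps per lg(n): log_{1/nu}(2A), roughly 6.95 for these parameters.
print(log(float(2 * A)) / log(float(1 / nu)))
```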