Agenda NSF SDCI Project Review, Oct. 29, 2012 –9:00-9:20: Overview, MV, UVA –9:20-9:50: Details, Zhenyang Liu, UVA –9:50-10:00: GUI, Tyler Clinch, UVA.


1 Agenda
NSF SDCI Project Review, Oct. 29, 2012
–9:00-9:20: Overview, MV, UVA
–9:20-9:50: Details, Zhenyang Liu, UVA
–9:50-10:00: GUI, Tyler Clinch, UVA
–10:00-10:10: Traffic isolation, Zhenzhen Yan, UVA
–10:10-10:30: Bob Russell, UNH
–10:30-10:40: Break
–10:40-11:00: Tim Carlin, UNH
–11:00-11:20: John Dennis, NCAR
–11:20-12:00: Discussion
–12:00-1:30: Lunch/Break
–4:00-4:30: Diversity activities, Carolyn Vallas, CDE, UVA
Supported by NSF grants: OCI-1127340, OCI-1127228, OCI-1127341
Questions on this slide set: Malathi Veeraraghavan, mv5g@virginia.edu

2 Year 1 Accomplishments
Wide-area data movement traffic characterized (UVA and NCAR)
–GridFTP logs obtained and analyzed
–Logins obtained on NERSC and SLAC data-transfer nodes, and experiments conducted for throughput variance studies
–Published an SC paper; developing a GUI for broader impact
Experiments on DOE ANI WAN 100GE and LIMAN testbeds
–Developed tools for controlled data collection for variance studies
–Studied TCP behavior on 100 Gbps paths (impact of bit errors)
–Compared RoCE over an L2 circuit vs. TCP over an IP-routed path (UNH and UVA)
Engineering solutions:
–GridFTP integrated with RoCE and IDC client (UNH and UVA)
Datacenter networking: background acquired and problems identified (all 3)
Established a wide network of collaborators

3 Acknowledgment
NSF OCI & co-PIs: Kevin Thompson, Bob Russell and John Dennis
ESnet: Chris Tracy, Brian Tierney, Joe Burrescia, Jon Dugan, Andy Lake, Tareq Saif, and Eric Pouyoul
ANL: Ian Foster, Raj Kettimuthu, and Linda Winkler
NERSC: Brent Draney, Jason Hick, Jason Lee
SLAC: Yee-Ting Li and Wei Yang
Internet2: Jason Zurawski and Eric Boyd
V. Tech: Jeff Crowder, John Nichols, John Lawson
UCAR: Pete Siemsen, Steve Emmerson, Marla Meehl
Boston U: Chuck Von Lichtenberg & David Starobinski
GridFTP data: BNL (Scott Bradley and John Bigrow), NICS (Victor Hazelwood), ORNL (Galen Shipman and Scott Atchley)
RoCE: Ezra Kissel (IU), D. K. Panda (OSU)
LIGO FDT: Ashish Mahabal (Caltech)

4 Wide-area data movement
Questions we asked (“science” phase):
–Are science data transfer rates high, or still low?
–Is there significant variance in throughput?
–In spite of increasing rates (which mean shorter transfer durations), are transfer sizes large enough to justify VC setup overhead?
Method used to answer:
–Obtained GridFTP logs from four sources
–Wrote R statistical programs to analyze the logs (supporting tools: shell, awk, JavaScript, SQL)
“Scientists discover that which is; Engineers create that which never was”

5 Wide-area data movement
Answers:
–Transfer rates: across the 4 analyzed paths, the highest maximum rate found was 4.3 Gbps, and on all 4 paths the maximum rate was at least 2.5 Gbps; this is a significant fraction of link capacity (10 Gbps)
–Throughput variance is significant: coefficient of variation of 30%-75% (4 paths). Causes?
  –transfer parameters (e.g., parallel streams, closing and reopening TCP connections between files, striping)
  –competition for server resources by concurrent transfers
  –not the network (link utilization is low, and packet losses are rare)
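The coefficient of variation quoted above is the standard deviation of per-transfer throughput divided by its mean. The project's analysis was written in R; the Python sketch below is only an illustration of the statistic. The sample values are invented, the function name is hypothetical, and the use of the sample (n-1) standard deviation is an assumption, since the slides do not say which estimator was used.

```python
import statistics


def coefficient_of_variation(throughputs_mbps):
    """Return sample standard deviation divided by the mean (dimensionless)."""
    if len(throughputs_mbps) < 2:
        raise ValueError("need at least two throughput samples")
    mean = statistics.mean(throughputs_mbps)
    if mean <= 0:
        raise ValueError("mean throughput must be positive")
    # Assumption: sample (n-1) standard deviation; the slides do not specify.
    return statistics.stdev(throughputs_mbps) / mean


# Hypothetical per-transfer throughputs on one path, in Mbps (not project data).
samples = [400.0, 600.0, 800.0, 1000.0, 1200.0]
cv = coefficient_of_variation(samples)
print(f"coefficient of variation: {cv:.1%}")
```

With real GridFTP logs, each sample would presumably be a transfer's size divided by its duration, grouped by path before the statistic is computed.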

6 Answers
In spite of increasing rates (which mean shorter transfer durations), are transfer sizes large enough to justify VC setup overhead?
Yes: a significant fraction of transfers occur in sessions whose durations are longer than 10 times the VC setup delay
To make this determination, durations were computed from a hypothetical third-quartile transfer rate rather than taken from the actual logged durations
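The determination above can be sketched as follows: compute each session's duration as its size divided by the third-quartile rate, then count the sessions lasting longer than 10 times the VC setup delay. This is a minimal sketch under stated assumptions, not the project's R code. The session sizes, observed rates, the 60-second setup delay, and the function name are all hypothetical, and the sketch counts sessions rather than the transfers inside them.

```python
import statistics


def fraction_justifying_vc(session_sizes_bytes, observed_rates_bps,
                           vc_setup_delay_s, factor=10.0):
    """Return (fraction of sessions longer than factor * setup delay, Q3 rate)."""
    # Third quartile of the observed rates (default "exclusive" method).
    q3_rate_bps = statistics.quantiles(observed_rates_bps, n=4)[2]
    threshold_s = factor * vc_setup_delay_s
    # Hypothetical duration: session size in bits divided by the Q3 rate.
    durations_s = [8.0 * size / q3_rate_bps for size in session_sizes_bytes]
    long_sessions = sum(1 for d in durations_s if d > threshold_s)
    return long_sessions / len(durations_s), q3_rate_bps


# All values below are invented for illustration (not project data).
rates_bps = [1e9, 2e9, 3e9, 4e9, 5e9, 6e9, 7e9]
sizes_bytes = [1e9, 1e11, 5e11, 1e12]
fraction, q3 = fraction_justifying_vc(sizes_bytes, rates_bps,
                                      vc_setup_delay_s=60.0)
print(f"Q3 rate: {q3 / 1e9:.1f} Gbps, sessions justifying VC setup: {fraction:.0%}")
```

Using a high (third-quartile) rate gives shorter hypothetical durations than a typical rate would, so sessions that still exceed the threshold do so under a conservative estimate.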

7 Experiments on DOE ANI WAN 100GE and LIMAN testbeds
Leveraged our ESnet DOE project relationship to gain login access to the ANI 100 GbE WAN and Long Island MAN (LIMAN) testbeds
Experiments run:
–Developed tools on the LIMAN testbed for controlled data collection for variance studies
  –Impact of competition for CPU and of packet losses
  –Deployed these tools on production NERSC and SLAC GridFTP servers and collected data for controlled transfers; analysis ongoing
–Reserved the whole 100 GE testbed (NERSC and ANL) for a weekend and ran continuous GridFTP/TCP transfers
  –No losses! TCP throughput stayed close to 96 Gbps
  –Why? Bit errors were corrected by FEC (thanks to Chris Tracy)
–RoCE over L2 circuit vs. TCP over IP-routed path
  –Results in UNH presentations

8 ANI 100G Testbed
Source: Brian Tierney, DOE PI meeting, March 1-2, 2012

9 Engineering solutions
Moving forward with GridFTP+RoCE+IDC client integration
Testing on DYNES and Internet2 ION planned
UVA proposal for DYNES approved: awaiting delivery
Remote logins obtained:
–FRGP (regional, so no FDT server, but an IDC controller)
–Boston University (CNS project collaboration)
–Requests made: Julio Ibarra, Martin Swany
–Plan is to give each other logins for wide-area tests
Leveraging collaboration with ESnet in DOE project
–Testing IDC Java client with ANI testbed OSCARS IDC
[Needs further discussion]

10 Intra-datacenter networking/apps
John Dennis has identified interesting topology/routing problems
–With fewer than 1000 cores, the application is CPU limited, but with more than 1000 cores, it is network limited
UNH and NCAR will run MPI apps and collect data
UVA will analyze and design datacenter networking solutions

