Commodity Node Procurement Process Task Force: Status Stephen Wolbers Run 2 Computing Review September 13, 2005
Outline
  Charge and Organization
  Research and Discovery Topics
  Plans for Completion
Some Background
  Procurement of thousands of PCs takes a large amount of effort.
  Housing all of these machines takes a huge amount of power (megawatts).
    – Should we be thinking about “performance per watt”?
  It is useful to examine the procedure used in recent years:
    – Vendor evaluation to qualify vendors.
    – Limited bids.
    – 30-day burn-in for acceptance.
    – Integrated PC/rack (Fermilab specifies rack configuration).
    – No real consideration for power, cooling, space in bid evaluations.
  Some recent acquisitions have had problems.
    – Technical problems/failures during burn-in.
    – These lead to delays in getting the computing into production.
    – Can we do better?
  Many ideas for improving the process exist and may benefit all of us.
Charge and Organization
  The charge to the task force was sent to department heads on June 2, 2005 by Vicky White.
    – CD-doc-886
  Steve Wolbers was asked to lead the task force.
  Mark Fischler was asked to serve and to help formulate economic models.
  Departments were asked to nominate people, and the task force has membership from CSS, CMS, Running Experiment, CEPA, CCF.
  Work began in June and was interrupted once or twice by vacations.
Charge
  The task force is asked to:
    1) Consider the existing procurement strategy and its pros and cons.
    2) Hear stakeholder and provider ideas about possible modifications to the procurement process, including input from procurement and facilities providers.
    3) Consider the economic model of what it actually costs us to procure, install and run systems over their lifetime. Here factors such as space, power, repairs, vendor liaison and visits, time spent on installs or vendor education, risk, integration costs, and more might be taken into account in a full economic model.
    4) Consider which aspects of the economic model might in some way be considered in evaluating the value of a vendor’s response to a bid.
    5) Consider whether the acceptance process is optimal for rejecting systems. Since in reality it is hard to send systems back, the acceptance process has turned out to be merely the first step in the long process of owning systems and making them run reliably enough, including working with the vendor to address deficiencies.
    6) Recommend a procurement and acceptance strategy for the future.
  The goal is to maximize the utility of the computers while minimizing the total cost, including costs associated with the procurement and operation of the systems.
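Item 3 of the charge lists the ingredients of a lifetime cost model without giving a formula. The following is a minimal sketch of how such a model might be added up for a single node. Everything in it is an assumption for illustration: the field names, the idea of expressing cooling as extra watts per watt of compute, and the grouping of repairs, space and staff time into yearly costs are hypothetical and do not come from the task force.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass
class NodeCostInputs:
    """Hypothetical per-node inputs; none of these values come from the task force."""
    purchase_price: float        # dollars per node
    power_watts: float           # average electrical draw of the node
    lifetime_years: float        # how long the node stays in service
    dollars_per_kwh: float       # electricity price
    cooling_overhead: float      # extra watts of cooling per watt of compute
    space_cost_per_year: float   # dollars per node per year for floor space
    repair_cost_per_year: float  # dollars per node per year for repairs
    staff_cost_per_year: float   # installs, vendor liaison, vendor education


def lifetime_cost(n: NodeCostInputs) -> float:
    """Purchase price plus power, cooling, space, repair and staff costs over the lifetime."""
    # Energy used by the node and its share of cooling, in kWh.
    kilowatts = n.power_watts * (1.0 + n.cooling_overhead) / 1000.0
    energy_kwh = kilowatts * HOURS_PER_YEAR * n.lifetime_years
    energy_cost = energy_kwh * n.dollars_per_kwh

    # Costs that recur every year the node is in service.
    yearly = n.space_cost_per_year + n.repair_cost_per_year + n.staff_cost_per_year
    recurring_cost = yearly * n.lifetime_years

    return n.purchase_price + energy_cost + recurring_cost


if __name__ == "__main__":
    # Illustrative numbers only.
    example = NodeCostInputs(
        purchase_price=2000.0, power_watts=250.0, lifetime_years=3.0,
        dollars_per_kwh=0.05, cooling_overhead=1.0,
        space_cost_per_year=50.0, repair_cost_per_year=30.0,
        staff_cost_per_year=20.0,
    )
    print(f"Lifetime cost per node: ${lifetime_cost(example):,.2f}")
```

Risk and integration costs, which the charge also mentions, are left out here because they are harder to express as a simple yearly figure.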
Deliverables and Timescales (1)
  From the charge:
    – “Recommendation for either maintaining the current process or making some short term do-able modifications to it. We will need these before the end of June.”
  Committee’s recommendation (June 20, 2005):
    1) Lattice QCD should use their standard process.
    2) Run 2 and GP Farms can use the current process with necessary updates. Changes to take into account vendors, IPMI infrastructure, power, cooling and space needs, etc. are all within the boundaries of the current procurement methodology.
  The task force considered the possibility of recommending that the process used for the FY05 CMS node procurement be used for other FY05 procurements. However, the FY05 CMS process won’t be finished until August-September, so it is too early to evaluate that procurement at this time.
  Even though the task force cannot recommend the use of the FY05 CMS procurement process for remaining FY05 purchases, the task force believes that it should be an option for those purchases.
Deliverables and Timescales (2)
  From the charge:
    – “Recommendation for procurement and acceptance processes for future procurements. We will need this before October 1.”
  We won’t make it by October 1, but we plan to be finished before the FY06 procurement cycle begins.
Topics Covered or Scheduled
  Computer Room Facilities Cost
    – Space, Power, Cooling
  Vendor and Hardware Qualification
  Concepts for Modeling Node Procurement
  PC Farm Acquisitions at Other Labs
    – Argonne, BNL, CERN, JLAB
  Lattice QCD Procurement
  Economic Models for Bid Evaluation Formula
  Meeting with Fermilab Procurement Department
  CMS Procurement Strategy FY05 and Plans and Ideas for the Future
  Moore’s Law
  Racking/Packaging
  * All of these are documented in CD docDB
Recent Construction Costs
Some Observations
  Space, power, and cooling are important and expensive.
    – Performance/watt has become an important metric for computers.
  It would be wise to do as much as possible in common: to save effort, to learn from each other, and to gain some leverage from all the efforts.
  Fermilab is not significantly better or worse than any other place in how we acquire commodity PCs.
Issues to Be Resolved
  Weight to be put on various costs and benefits in the bid evaluation formula.
    – Performance.
    – Infrastructure costs.
    – Other lifetime costs.
  Vendor evaluation process.
  How to speed up acquisitions.
  Delivery schedule and acceptance process:
    – All at once vs. a few racks at a time.
    – 2 weeks vs. 30 days.
  Racking strategies.
  Commonality of acquisition process across the Division.
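The first open issue, how much weight to give each cost in the bid evaluation formula, can be made concrete with a small sketch. The form used here (performance divided by a weighted sum of costs) is only one possible choice and is an assumption, not the task force's formula; the `Bid` fields, the weights `w_infra` and `w_other`, and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bid:
    """Hypothetical summary of one vendor's response to a bid."""
    vendor: str
    performance: float          # benchmark score for the whole order
    purchase_price: float       # dollars
    infrastructure_cost: float  # estimated power, cooling and space cost, dollars
    other_lifetime_cost: float  # estimated repairs, staff time, etc., dollars


def bid_value(bid: Bid, w_infra: float = 1.0, w_other: float = 1.0) -> float:
    """Performance per weighted dollar; higher is better."""
    effective_cost = (
        bid.purchase_price
        + w_infra * bid.infrastructure_cost
        + w_other * bid.other_lifetime_cost
    )
    if effective_cost <= 0:
        raise ValueError("effective cost must be positive")
    return bid.performance / effective_cost


def rank_bids(bids: List[Bid], w_infra: float = 1.0, w_other: float = 1.0) -> List[Bid]:
    """Return the bids sorted from best to worst value under the given weights."""
    return sorted(bids, key=lambda b: bid_value(b, w_infra, w_other), reverse=True)


if __name__ == "__main__":
    # Illustrative numbers only: A is cheaper to buy, B is cheaper to run.
    bids = [
        Bid("A", performance=100.0, purchase_price=1000.0,
            infrastructure_cost=1000.0, other_lifetime_cost=0.0),
        Bid("B", performance=100.0, purchase_price=1200.0,
            infrastructure_cost=200.0, other_lifetime_cost=0.0),
    ]
    for w in (0.0, 1.0):
        order = [b.vendor for b in rank_bids(bids, w_infra=w)]
        print(f"w_infra={w}: ranking {order}")
```

With these made-up numbers, ignoring infrastructure costs (`w_infra=0`) favors the vendor with the lower purchase price, while counting them fully (`w_infra=1`) favors the vendor whose systems are cheaper to power and cool, which is why the choice of weights matters.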
The Plan/Conclusions
  We will hear some more detailed reports over the next couple of weeks.
  Then we will work on recommendations, with the goal of having them to Vicky well before the large FY06 purchases and certainly as early as possible given the lead time needed for evaluation, requisition-writing, approvals, etc.