1 Hardware-Based Implementations of Factoring Algorithms Factoring Estimates for a 1024-Bit RSA Modulus A. Lenstra, E. Tromer, A. Shamir, W. Kortsmit,

Slides:



Advertisements
Similar presentations
AKS Implementation of a Deterministic Primality Algorithm
Advertisements

I/O Management and Disk Scheduling
Hash Tables CS 310 – Professor Roch Weiss Chapter 20 All figures marked with a chapter and section number are copyrighted © 2006 by Pearson Addison-Wesley.
CSCI 4717/5717 Computer Architecture
Factoring of Large Numbers using Number Field Sieve Matrix Step Chandana Anand, Arman Gungor, and Kimberly A. Thomas ECE 646 Fall 2006.
Michael Alves, Patrick Dugan, Robert Daniels, Carlos Vicuna
Lecture Implementations. The efficiency of a particular cryptographic scheme based on any one of the algebraic structures will depend on a number.
Parallel Database Systems
RSA & F ACTORING I NTEGERS BY: MIKE NEUMILLER & BRIAN YARBROUGH.
Memory Hierarchy. Smaller and faster, (per byte) storage devices Larger, slower, and cheaper (per byte) storage devices.
Factoring 1 Factoring Factoring 2 Factoring  Security of RSA algorithm depends on (presumed) difficulty of factoring o Given N = pq, find p or q and.
Foundations of Network and Computer Security J J ohn Black Lecture #13 Sep 26 th 2007 CSCI 6268/TLEN 5831, Fall 2007.
1 Factoring Large Numbers with the TWIRL Device Adi Shamir, Eran Tromer.
Foundations of Network and Computer Security J J ohn Black Lecture #12 Sep 23 rd 2009 CSCI 6268/TLEN 5550, Fall 2009.
1 Hardware-Based Implementations of Factoring Algorithms Factoring Large Numbers with the TWIRL Device Adi Shamir, Eran Tromer Analysis of Bernstein’s.
Foundations of Network and Computer Security J J ohn Black Lecture #10 Sep 29 th 2005 CSCI 6268/TLEN 5831, Fall 2005.
1 Special Purpose Hardware for Factoring: the NFS Sieving Step Adi Shamir Eran Tromer Weizmann Institute of Science.
Faculty of Computer Science © 2006 CMPUT 229 Accelerating Performance The RISC Revolution.
Techniques for Efficient Processing in Runahead Execution Engines Onur Mutlu Hyesoon Kim Yale N. Patt.
These slides are additional material for TIES4451 Lecture 5 TIES445 Data mining Nov-Dec 2007 Sami Äyrämö.
1 Section 2.3 Complexity of Algorithms. 2 Computational Complexity Measure of algorithm efficiency in terms of: –Time: how long it takes computer to solve.
1 Hardware Assisted Solution of Huge Systems of Linear Equations Adi Shamir Computer Science and Applied Math Dept The Weizmann Institute of Science Joint.
Algebra 2 Second semester review. Examples of solving quadratics by factoring.
1 Special Purpose Hardware for Factoring: The Linear Algebra Part Eran Tromer and Adi Shamir Applied Math Dept The Weizmann Institute of Science SHARCS.
RSA Question 2 Bob thinks that p and q are primes but p isn’t. Then, Bob thinks ©Bob:=(p-1)(q-1) = Á(n). Is this true ? Bob chooses a random e (1 < e
Reduced Instruction Set Computers (RISC) Computer Organization and Architecture.
Foundations of Network and Computer Security J J ohn Black Lecture #14 Oct 1 st 2007 CSCI 6268/TLEN 5831, Fall 2007.
Algorithm Analysis (Big O)
May 29, 2008 GNFS polynomials Peter L. Montgomery Microsoft Research, USA 1 Abstract The Number Field Sieve is asymptotically the fastest known algorithm.
Copyright © Cengage Learning. All rights reserved.
CS 312: Algorithm Analysis Lecture #3: Algorithms for Modular Arithmetic, Modular Exponentiation This work is licensed under a Creative Commons Attribution-Share.
October,2006 Higher- Degree Polynomials Peter L. Montgomery Microsoft Research and CWI 1 Abstract The Number Field Sieve is asymptotically the fastest.
Computer System Overview Chapter 1. Operating System Exploits the hardware resources of one or more processors Provides a set of services to system users.
Copyright, Yogesh Malhotra, PhD, 2013www.yogeshmalhotra.com SPECIAL PURPOSE FACTORING ALGORITHMS Special Purpose Factoring Algorithms For special class.
Algorithm Efficiency CS 110: Data Structures and Algorithms First Semester,
CSC 201 Analysis and Design of Algorithms Lecture 04: CSC 201 Analysis and Design of Algorithms Lecture 04: Time complexity analysis in form of Big-Oh.
1 Chapter 1 Analysis Basics. 2 Chapter Outline What is analysis? What to count and consider Mathematical background Rates of growth Tournament method.
©Silberschatz, Korth and Sudarshan13.1Database System Concepts Chapter 13: Query Processing Overview Measures of Query Cost Selection Operation Sorting.
Quine-McCluskey (Tabular) Minimization Two step process utilizing tabular listings to: Identify prime implicants (implicant tables) Identify minimal PI.
Implementation of Finite Field Inversion
Cryptography Lecture 7: RSA Primality Testing Piotr Faliszewski.
Copyright © Curt Hill Query Evaluation Translating a query into action.
Factorization of a 768-bit RSA modulus Jung Daejin Lee Sangho.
SNFS versus (G)NFS and the feasibility of factoring a 1024-bit number with SNFS Arjen K. Lenstra Citibank, New York Technische Universiteit Eindhoven.
B. Ramamurthy.  12 stage pipeline  At peak speed, the processor can request both an instruction and a data word on every clock.  We cannot afford pipeline.
Precise Bounds for Montgomery Modular Multiplication and Some Potentially Insecure RSA Moduli Colin D. Walter formerly: (Manchester,
Copyright © 2014 Curt Hill Growth of Functions Analysis of Algorithms and its Notation.
Optimizing Robustness while Generating Shared Secret Safe Primes Emil Ong and John Kubiatowicz University of California, Berkeley.
16.7 Completing the Physical- Query-Plan By Aniket Mulye CS257 Prof: Dr. T. Y. Lin.
9/22/15UB Fall 2015 CSE565: S. Upadhyaya Lec 7.1 CSE565: Computer Security Lecture 7 Number Theory Concepts Shambhu Upadhyaya Computer Science & Eng. University.
Precise Bounds for Montgomery Modular Multiplication and Some Potentially Insecure RSA Moduli Colin D. Walter formerly: (Manchester,
Algorithm Analysis (Big O)
A Survey on Factoring Large Numbers ~ 巨大数の因数分解に関する調査 ~ Kanada Lab. M Yoshida Hitoshi.
R ANDOM N UMBER G ENERATORS Modeling and Simulation CS
Percolation Percolation is a purely geometric problem which exhibits a phase transition consider a 2 dimensional lattice where the sites are occupied with.
1 IP Routing table compaction and sampling schemes to enhance TCAM cache performance Author: Ruirui Guo, Jose G. Delgado-Frias Publisher: Journal of Systems.
BITS Pilani Pilani Campus Data Structure and Algorithms Design Dr. Maheswari Karthikeyan Lecture1.
Announcement We will have a 10 minutes Quiz on Feb. 4 at the end of the lecture. The quiz is about Big O notation. The weight of this quiz is 3% (please.
CS 1410 Intro to Computer Tecnology Computer Hardware1.
Full Design. DESIGN CONCEPTS The main idea behind this design was to create an architecture capable of performing run-time load balancing in order to.
Analyzing Memory Access Intensity in Parallel Programs on Multicore Lixia Liu, Zhiyuan Li, Ahmed Sameh Department of Computer Science, Purdue University,
Bounded Nonlinear Optimization to Fit a Model of Acoustic Foams
Design and Analysis of ALGORITHM (Session 1)
Cache Memory Presentation I
RSA Cryptosystem Bits PCs Memory MB ,000 4GB 1,020
Homework 3 As announced: not due today 
Parallel Quadratic Sieve
Discrete Math for CS CMPSC 360 LECTURE 12 Last time: Stable matching
Morgan Kaufmann Publishers Memory Hierarchy: Cache Basics
Factoring RSA Moduli: Current State of the Art J
Presentation transcript:

1 Hardware-Based Implementations of Factoring Algorithms Factoring Estimates for a 1024-Bit RSA Modulus A. Lenstra, E. Tromer, A. Shamir, W. Kortsmit, B. Dodson, J. Hughes, and P. Leyland Springer On-line, Lecture Notes in CS 2894, pp (2003) (E. Tromer’s presentation)

2 Bicycle chain sieve [D. H. Lehmer, 1928]

3 The Quadratic Sieve How to find S such that is a square? Look at the factorization of f 1 (a): f 1 (0)= 102 f 1 (1)= 33 f 1 (2)= 1495 f 1 (3)= 84 f 1 (4)= 616 f 1 (5)= 145 f 1 (6)= 42  This is a square, because all exponents are even = 113 = = 732 = = 295 = 732 = 

4 Comparison: Number Field Sieve (NFS): e (α+o ( 1 ) )·(log n) 1/3 ·(log log n) 2/3 Quadratic Sieve (QS): (log n)^(1/2)*(log log n)^(1/2) L_a(n): Exp{ (c +o(1))* (log n)^a * (log log n)^(1-a)}, Then a = 0 polynomial, a=1 exponential.

5 The Sieving Problem Input: a set of arithmetic progressions. Each progression has a prime interval p and value log p. OOO OOO OOOOO OOOOOOOOO OOOOOOOOOOOO Output: indices where the sum of values exceeds a threshold.

6 Example: handling large primes Primary consideration: efficient storage between contributions. Each memory+processor unit handle many progressions. It computes and sends contributions across the bus, where they are added at just the right time. Timing is critical. Memory Processor Memory Processor

7 Handling large primes (cont.) The memory used by past events can be reused. Think of the processor as rotating around the cyclic memory: By appropriate choice of parameters, we guarantee that new events are always written just behind the read head. There is a tiny (1:1000) window of activity which is “twirling” around the memory bank. It is handled by an SRAM-based cache. The bulk of storage is handled in compact DRAM. Processor

8 Rational vs. algebraic sieves We actually have two sieves: rational and algebraic. We are looking for the indices that accumulated enough value in both sieves. The algebraic sieve has many more progressions, and thus dominates cost. We cannot compensate by making s much larger, since the pipeline becomes very wide and the device exceeds the capacity of a wafer. rationalalgebraic

9 Estimating NFS parameters Predicting cost requires estimating the NFS parameters (smoothness bounds, sieving area, frequency of candidates etc.). Methodology: [Lenstra,Dodson,Hughes,Leyland] Find good NFS polynomials for the RSA-1024 and RSA-768 composites. Analyze and optimize relation yield for these polynomials according to smoothness probability functions. Hope that cycle yield, as a function of relation yield, behaves similarly to past experiments.

bit NFS sieving parameters Smoothness bounds: Rational:3.5 £ 10 9 Algebraic: 2.6 £ Region: a 2 {-5.5 £ 10 14,…,5.5 £ } b 2 {1,…,2.7 £ 10 8 } Total: 3 £ ( £ 6/  2 )

11 TWIRL for 1024-bit composites A cluster of 9 TWIRLS can process a sieve line (10 15 indices) in 34 seconds. To complete the sieving in 1 year, use 194 clusters. Initial investment (NRE): ~$20M After NRE, total cost of sieving for a given 1024-bit composite: ~10M $  year (compared to ~1T $  year). A R R R R R R R R

12.