Genome Revolution: COMPSCI 006G 1.1 FOCUS COMPSCI 006G Genome Revolution Owen Astrachan

Slides:



Advertisements
Similar presentations
Counting the bits Analysis of Algorithms Will it run on a larger problem? When will it fail?
Advertisements

Design & Analysis of Algoritms
Introductory Lecture. What is Discrete Mathematics? Discrete mathematics is the part of mathematics devoted to the study of discrete (as opposed to continuous)
CSC 2400 Computer Systems I Lecture 3 Big Ideas. 2 Big Idea: Universal Computing Device All computers, given enough time and memory, are capable of computing.
Chapter Chapter Goals Describe the layers of a computer system Describe the concept of abstraction and its relationship to computing Describe.
©The McGraw-Hill Companies, Inc. Permission required for reproduction or display. slide 1 CS 125 Introduction to Computers and Object- Oriented Programming.
A Computer Science Tapestry 1.1 A Computer Science Tapestry Exploring Programming and Computer Science with C++ Second Edition Owen Astrachan Duke University.
Computability to Practical Computing - and - How to Talk to Machines.
CS /29/2004 (Recitation Objectives) and Computer Science and Objects and Algorithms.
1. By the end of this lecture you should be able … To describe in general terms how computers function To describe the fetch-execute cycle To explain.
A Computer Science Tapestry 1.1 Wake up with CPS 006 Program Design and Methodology I Jeff Forbes
1 An Extremely Short Introduction to Computer Programming Mike Scott.
Chapter 1 The Big Picture Chapter Goals Describe the layers of a computer system Describe the concept of abstraction and its relationship to computing.
Principles of Computer Science
Chapter 0: Introduction CSCI-UA 0002 – Introduction to Computer Programming Mr. Joel Kemp.
CPS Today’s topics l Computability ä Great Ideas Ch. 14 l Artificial Intelligence ä Great Ideas Ch. 15 l Reading up to this point ä Course Pack.
CPS Welcome! Principles of Computer Science CPS 1 LSRC B101 M, W, F 1:10-2:00 Professor Dietolf (Dee) Ramm
Chapter 01 Nell Dale & John Lewis.
Compsci 101, Spring COMPSCI 101, Spring 2012 Introduction to Computer Science Owen Astrachan
Computer Science 1000 Introduction. What is Computer Science? the study of computers? not quite rather, computers provide a tool for which to carry out.
Welcome Aboard – Chapter 1 COMP 2610 Dr. James Money COMP
CPS What is Computer Science? What is it that distinguishes it from the separate subjects with which it is related? What is the linking thread.
CompSci Welcome! Principles of Computer Science CompSci 1 LSRC B101 M, W, F 1:30-2:20 Professor Jeff Forbes.
Introduction Welcome to COSC 135 – Introduction to Computer Science Dr. Donald Simon.
High level & Low level language High level programming languages are more structured, are closer to spoken language and are more intuitive than low level.
Duke University Computer Science 1 Why Computer Science is Uglier and Prettier than Mathematics Owen Astrachan Duke University
1 TOPIC 1 INTRODUCTION TO COMPUTER SCIENCE AND PROGRAMMING Topic 1 Introduction to Computer Science and Programming Notes adapted from Introduction to.
Overview of Computing. Computer Science What is computer science? The systematic study of computing systems and computation. Contains theories for understanding.
Introduction CSE 1310 – Introduction to Computers and Programming Vassilis Athitsos University of Texas at Arlington 1.
CompSci Welcome! Principles of Computer Science CompSci 1 LSRC B101 M, W, F 10:20-11:10 Professor Jeff Forbes See
Compsci 06/101, Fall COMPSCI 06/101, Fall 2010 Owen Astrachan Robert Duvall
David Evans Turing Machines, Busy Beavers, and Big Questions about Computing.
Chapter 1 The Big Picture.
The Turing machine Olena Lastivka. Definition Turing machine is a theoretical device that manipulates symbols on a strip of tape according to a table.
Introduction CSE 1310 – Introduction to Computers and Programming Vassilis Athitsos University of Texas at Arlington 1.
Automata, Computability, and Complexity Lecture 1 Section 0.1 Wed, Aug 22, 2007.
CompSci 1: Principles of Computer Science Lecture 1 Course Overview.
CompSci Welcome! Principles of Computer Science CompSci B101 LSRC M W F 1:30-2:20 Dietolf (Dee) Ramm
CompSci Topics since midterm l Java  Methods  Sound  Graphics l Software design  Recursion  Arrays  Copyright issues l Computer systems.
Lecture on Computer Science as a Discipline. 2 Computer “Science” some people argue that computer science is not a science in the same sense that biology.
The Beauty and Joy of Computing Lecture #3 : Creativity & Abstraction UC Berkeley EECS Lecturer Gerald Friedland.
CPS Topics since last test l Recursion l Software design  Object-oriented  Copyright issues l Computer systems  Hardware  Architecture  Operating.
CPS Welcome! Principles of Computer Science CPS 1 LSRC B101 M, W, F 10:30-11:20 Professor Jeff Forbes.
CompSci Welcome! Principles of Computer Science CompSci 1 SocSci 136 W, F 10:05-11:20 Professor Jeff Forbes See
CompSci CompSci 6 Programming Design and Analysis I Dietolf (Dee) Ramm
Computer Science and Everything 1 Big Ideas in Computer Science l “Mathematics is the Queen of the Sciences” Carl Friedrich Gauss l What is Computer Science?
Halting Problem Introduction to Computing Science and Programming I.
The Nature of Computing INEL 4206 – Microprocessors Lecture 3 Bienvenido Vélez Ph. D. School of Engineering University of Puerto Rico - Mayagüez.
The Nature of Computing INEL 4206 – Microprocessors Lecture 2 Bienvenido Vélez Ph. D. School of Engineering University of Puerto Rico - Mayagüez.
CPS Topics since last test l Recursion l Sound l Graphics l Software design  Object-oriented  Copyright issues l Computer systems  Hardware 
Software Design 14.1 CPS 108 l Object oriented design and programming of medium-sized projects in groups using modern tools and practices in meaningful.
CS 127 Introduction to Computer Science. What is a computer?  “A machine that stores and manipulates information under the control of a changeable program”
Data Structures and Algorithms Dr. Tehseen Zia Assistant Professor Dept. Computer Science and IT University of Sargodha Lecture 1.
CPS What is Computer Science? What is it that distinguishes it from the separate subjects with which it is related? What is the linking thread.
CS1315 Introduction to Media Computation Introduction: Why study computer science at all?!?
Introduction CSE 1310 – Introduction to Computers and Programming Vassilis Athitsos University of Texas at Arlington 1.
Duke University Computer Science 1 Topics since last test l More invariants l Pixmap l Sorting & Templates  Selection sort  Insertion sort  Quicksort.
CPS What is Computer Science? What is it that distinguishes it from the separate subjects with which it is related? What is the linking thread.
Introductory Lecture. What is Discrete Mathematics? Discrete mathematics is the part of mathematics devoted to the study of discrete (as opposed to continuous)
MARC: Developing Bioinformatics Programs Alex Ropelewski PSC-NRBSC Bienvenido Vélez UPR Mayaguez Reference: How to Think Like a Computer Scientist: Learning.
Theory of Computation. Introduction We study this course in order to answer the following questions: What are the fundamental capabilities and limitations.
Mohammed I DAABO COURSE CODE: CSC 355 COURSE TITLE: Data Structures.
CompSci 6 Introduction to Computer Science
Principles of Computer Science
Introduction CSE 1310 – Introduction to Computers and Programming
Turing Machines, Busy Beavers, and Big Questions about Computing
CSC Classes Required for TCC CS Degree
Topics since last test More invariants Pixmap Sorting & Templates
CompSci 1: Principles of Computer Science Lecture 1 Course Overview
Topics since last test More invariants Pixmap Sorting & Templates
Presentation transcript:

Genome Revolution: COMPSCI 006G 1.1 FOCUS COMPSCI 006G Genome Revolution Owen Astrachan

Genome Revolution: COMPSCI 006G 1.2 Where are we going? l What is computer science? l What is biology? l What is computational biology? l What is bioinformatics? l What tools do scientists need? l What is a scientist?

Genome Revolution: COMPSCI 006G 1.3 What is Bioinformatics? Synonym?: computational biology The application of computational techniques to the management and analysis of biological information. Bioinformatics is a rapidly developing branch of biology and is highly interdisciplinary, using techniques and concepts from informatics, statistics, mathematics, chemistry, biochemistry, physics, and linguistics. It has many practical applications in different areas of biology and medicine.

Genome Revolution: COMPSCI 006G 1.4 What is Bioinformatics? Bioinformatics applies principles of information sciences and technologies to make the vast, diverse, and complex life sciences data more understandable and useful. Computational biology uses mathematical and computational approaches to address theoretical and experimental questions in biology. Although bioinformatics and computational biology are distinct, there is also significant overlap and activity at their interface. Because of the great success of genome-sequencing projects, the quantity of DNA sequence data that are now available greatly exceeds the tools that are available to process those data. Consequently the analysis of those data presents one of today's great scientific challenges.

Genome Revolution: COMPSCI 006G 1.5 What is Computer Science? l Computer science is no more about computers than astronomy is about telescopes. Edsger Dijkstra l Computer science is not as old as physics; it lags by a couple of hundred years. However, this does not mean that there is significantly less on the computer scientist's plate than on the physicist's: younger it may be, but it has had a far more intense upbringing! Richard Feyneman

Genome Revolution: COMPSCI 006G 1.6 Scientists and Engineers l Scientists build to learn, engineers learn to build –Fred Brooks

Genome Revolution: COMPSCI 006G 1.7 Computer Science and Programming l Computer Science is more than programming  The discipline is called informatics in many countries  Elements of both science and engineering  Elements of mathematics, physics, cognitive science, music, art, and many other fields l Computer Science is a young discipline  Fiftieth anniversary in 1997, but closer to forty years of research and development  First graduate program at CMU (then Carnegie Tech) in 1965 l To some programming is an art, to others a science, to others an engineering discipline

Genome Revolution: COMPSCI 006G 1.8 What is Computer Science? What is it that distinguishes it from the separate subjects with which it is related? What is the linking thread which gathers these disparate branches into a single discipline? My answer to these questions is simple --- it is the art of programming a computer. It is the art of designing efficient and elegant methods of getting a computer to solve problems, theoretical or practical, small or large, simple or complex. C.A.R. (Tony)Hoare

Genome Revolution: COMPSCI 006G 1.9 Algorithms as Cornerstone of CS l Step-by-step process that solves a problem  more precise than a recipe  eventually stops with an answer  general process rather than specific to a computer or to a programming language l Searching: for phone number of G. Samsa, whose number is , or for the person whose number is  Are these searches different? l If the phone book has 8 million numbers in it (why are there only 7.9 million phone numbers with area code 212?)  How many queries to find phone number of G. Samsa?  How many queries to find person with number  What about IP addresses?

Genome Revolution: COMPSCI 006G 1.10 Search, Efficiency, Complexity l Think of a number between 1 and 1,000  respond high, low, correct, how many guesses needed? l Look up a word in a dictionary  Finding the page/word, how many words do you look at? l Looking up a phone number in the Manhattan, NY directory  How many names are examined? l How many times can 1,024 be cut in half?  2 10 = 1,024, 2 20 = 1,048,576

Genome Revolution: COMPSCI 006G 1.11 Sorting Experiment: why do we sort? l Groups of four people are given a bag containing strips of paper  on each piece of paper is an 8-15 letter English word  create a sorted list of all the words in the bag  there are 100 words in a bag l What issues arise in developing an algorithm for this sort?  l Can you write a description of an algorithm for others to follow?  Do you need a support line for your algorithm?  Are you confident your algorithm works?

Genome Revolution: COMPSCI 006G 1.12 Themes and Concepts of CS l Theory  properties of algorithms, how fast, how much memory  average case, worst case: sorting cards, words, exams  provable properties, in a mathematical sense l Language  programming languages: C++, Java, C, Perl, Fortran, Lisp, Scheme, Visual BASIC,...  Assembly language, machine language,  Natural language such as English l Architecture  Main memory, cache memory, disk, USB, SCSI,...  pipeline, multi-processor

Genome Revolution: COMPSCI 006G 1.13 Theory, Language, Architecture l We can prove that in the worst case quicksort is bad  doesn’t matter what machine it’s executed on  doesn’t matter what language it’s coded in  unlikely in practice, but worst case always possible l Solutions? Develop an algorithm as fast as quicksort in the average case, but has good worst case performance  quicksort invented in 1960  introsort (for introspective sort) invented in 1996 l Sometimes live with worst case being bad  bad for sorting isn’t bad for other algorithms, needs to be quantified using notation studied as part of the theory of algorithms

Genome Revolution: COMPSCI 006G 1.14 Abstraction, Complexity, Models l What is an integer?  In mathematics we can define integers easily, infinite set of numbers and operations on the numbers (e.g.,+, -, *, /) {…-3, -2, -1, 0, 1, 2, 3, …}  In programming, finite memory of computer imposes a limit on the magnitude of integers. Possible to program with effectively infinite integers (as large as computation and memory permit) at the expense of efficiency At some point addition is implemented with hardware, but that’s not a concern to those writing software (or is it?) C++ doesn’t require specific size for integers, Java does l Floating-point numbers have an IEEE standard, it’s more expensive to do arithmetic with than with 2

Genome Revolution: COMPSCI 006G 1.15 Alan Turing ( ) l Instrumental in breaking codes during WW II l Developed mathematical model of a computer called a Turing Machine (before computers)  solves same problems as a Pentium III (more slowly) l Church-Turing thesis  All “computers” can solve the same problems l Showed there are problems that cannot be solved by a computer l Both a hero and a scientist/ mathematician, but lived in an era hard for gay people

Genome Revolution: COMPSCI 006G 1.16 Complexity: What’s hard, what’s easy? l What is a prime number?  2, 3, 5, 7, 11, 13, …  Largest prime? l l How do we determine if these numbers are prime?  Test 3, 5, 7, …  If we can test one million numbers a second, how long to check a 100 digit #? l Why do we care? is not prime, I can prove it but I can’t give you the factors. l Finding factors is “hard”, determining primality is “easy”  What does this mean?  Why do we care? l Encryption depends on this relationship, without encryption and secure web transactions where would we be?

Genome Revolution: COMPSCI 006G 1.17 C.A.R. (Tony) Hoare (b. 1934) l Won Turing award in 1980 l Invented quicksort, but didn’t see how simple it was to program recursively l Developed mechanism and theory for concurrent processing l In Turing Award speech used “Emporer’s New Clothes” as metaphor for current fads in programming “ Beginning students don’t know how to do top-down design because they don’t know which end is up ”

Genome Revolution: COMPSCI 006G 1.18 What is digital? l What’s the difference between  Vinyl LP and CD/DVD?  Rolex and Timex? l Sampling analog music for CD’s  44,100 samples/channel/second * 2 channels * 2 bytes/sample * 74 minutes * 60 seconds/minute = 783 million bytes l How does MP3 help?

Genome Revolution: COMPSCI 006G 1.19 Chips, Central Processing Unit (CPU) l CPU chips  Pentium (top)  G4 (bottom)  Sound, video, … l Moore’s Law  chip “size” (# transistors) doubles every months (formulated in 1965)  2,300 transistors Intel 4004, 42 million Pentium 4

Genome Revolution: COMPSCI 006G 1.20 Why is programming fun? What delights may its practitioner expect as a reward? First is the sheer joy of making things Second is the pleasure of making things that are useful Third is the fascination of fashioning complex puzzle-like objects of interlocking moving parts Fourth is the joy of always learning Finally, there is the delight of working in such a tractable medium. The programmer, like the poet, works only slightly removed from pure thought- stuff. Fred Brooks