High-level programming with Python. Lecture 7: Introduction to NumPy, the library for numerical computing. Dr. Gabriele Mencagli, Department of Computer Science, University of Pisa. 20/11/2018
Information
Dr. Gabriele Mencagli, Department of Computer Science, University of Pisa, Room 287. Office hours by appointment (send an email). Email: mencagli@di.unipi.it Web: www.di.unipi.it/~mencagli
Organization: introductory course, with 16 hours of lectures and 14 hours of individual lab exercises.
Duration: 2 months.
Exam: a final project of variable complexity.
The homework assignments are important (typically one per week).
What is NumPy?
Python is a fabulous language:
Easy to extend;
Great syntax which encourages code that is easy to write and maintain;
Incredibly large standard library and set of third-party tools;
However, it has no built-in multi-dimensional arrays (although it supports the syntax needed to extract elements from one).
NumPy provides a fast object (ndarray), which is a multi-dimensional array of a homogeneous data type. It offers Matlab-like capabilities within Python. NumPy replaces Numeric and Numarray (older libraries). It was initially developed by Travis Oliphant (building on the work of dozens of others). NumPy is very fast (much more efficient than standard lists) because its array operations are implemented in compiled C code instead of being executed element by element by the Python interpreter.
NumPy arrays (ndarray)
A NumPy array (ndarray) is an N-dimensional homogeneous collection of "items" of the same "kind". The kind can be any arbitrary structure and is specified using the data type (dtype). The main difference from standard lists is that the elements of a NumPy array have to be of the same type, usually float or int. NumPy arrays are far more efficient than standard Python lists. In principle, an array can be seen as a list with the following differences:
All elements have to be of the same type, e.g. integer, float (real) or complex numbers;
The number of elements has to be known a priori, i.e. when the array is created. It cannot be changed afterwards (static size).
NumPy arrays (2)
The array header contains the strides attribute: a tuple with one entry per axis, giving the number of bytes to skip in memory to move to the next element along that axis. In the example below, 24 bytes separate the first row from the second and 8 bytes separate two consecutive elements of a row. Changing the strides changes the way the same memory is viewed.
>>> a = np.arange(9)
>>> a = a.reshape(3,3)
>>> a
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> a.dtype
dtype('int64')
>>> a.strides
(24, 8)
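The relation between strides, item size and shape can be checked directly. The following is a minimal sketch; the dtype is fixed to int64 explicitly (an assumption made here so that the numbers match the slide on every platform).

```python
import numpy as np

# A 3x3 array of 64-bit integers: each item occupies 8 bytes.
a = np.arange(9, dtype=np.int64).reshape(3, 3)

print(a.itemsize)   # 8
print(a.strides)    # (24, 8): a row is 3 items * 8 bytes, an item is 8 bytes

# The transpose is a view on the same buffer with the strides swapped.
t = a.T
print(t.strides)    # (8, 24)
```

No data is moved when transposing: only the header (shape and strides) changes.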
Ndarray data types
NumPy supports many data types for building an array: not only the ones introduced in our first lecture but several others.
Terminology
As said, NumPy's main object is the homogeneous multidimensional array, the ndarray: a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers. Typical examples of multidimensional arrays include vectors, matrices, images and spreadsheets. Dimensions are usually called axes, and the number of axes is the rank.
>>> a = numpy.array([1,3,5,7,9])
>>> b = numpy.array([3,5,6,7,9])
>>> c = a + b
>>> print(c)
[ 4  8 11 14 18]
[7, 5, -1] is an array of rank 1: it has 1 axis of length 3.
[[1.5, 0.2, -3.7], [0.1, 1.7, 2.9]] is an array of rank 2: it has 2 axes, the first of length 2 and the second of length 3 (a matrix with 2 rows and 3 columns).
Example with ndarrays
A NumPy array is a homogeneous collection of "items" of the same data type (dtype). As another example, we can create a NumPy array of floats and print its data type.
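The original slide showed this example as a picture. A minimal sketch of what such an example might look like (the values are chosen here for illustration) is:

```python
import numpy as np

# A list of Python floats becomes an array of 64-bit floats by default.
a = np.array([1.5, 2.0, 3.25])
print(a.dtype)              # float64

# The dtype can also be requested explicitly.
b = np.array([1, 2, 3], dtype=np.float32)
print(b.dtype, b.itemsize)  # float32 4
```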
Ndarray attributes
Here is a list of the most important attributes of an ndarray object:
ndarray.ndim: the number of axes (dimensions) of the array, i.e. the rank.
ndarray.shape: the dimensions of the array. This is a tuple of integers indicating the size of the array in each dimension. For a matrix with n rows and m columns, shape will be (n,m). The length of the shape tuple is therefore the rank, or number of dimensions, ndim.
ndarray.size: the total number of elements of the array, equal to the product of the elements of shape.
ndarray.dtype: an object describing the type of the elements in the array. One can create or specify dtypes using standard Python types. NumPy provides many more, for example bool_, int_, int8, int16, int32, int64, float16, float32, float64, complex64, complex128, object_ (there is no float8 type; the float_ and complex_ aliases were removed in NumPy 2.0).
ndarray.itemsize: the size in bytes of each element of the array. E.g., for elements of type float64, itemsize is 8 (=64/8), while complex64 also has itemsize 8 (=64/8). It is equivalent to ndarray.dtype.itemsize.
ndarray.data: the buffer containing the actual elements of the array. Normally, we won't need to use this attribute because we will access the elements in an array using indexing facilities.
Ndarray creation and use
Summary of the main basic operations on NumPy arrays (the examples assume the NumPy names were imported with from numpy import *):
SIMPLE ARRAY CREATION
>>> a = array([0,1,2,3])
>>> a
array([0, 1, 2, 3])
CHECKING THE TYPE
>>> type(a)
<class 'numpy.ndarray'>
NUMERIC 'TYPE' OF ELEMENTS
# the default integer type is platform dependent: int32 here, int64 on many 64-bit systems
>>> a.dtype
dtype('int32')
BYTES PER ELEMENT
>>> a.itemsize
4
ARRAY SHAPE
# shape returns a tuple listing the length of the array along each dimension.
>>> a.shape
(4,)
# a single axis of length 4: a 1-D array, not a matrix with 1 row and 4 columns
>>> shape(a)
(4,)
ARRAY SIZE
# size reports the total number of elements in an array.
>>> a.size
4
>>> size(a)
4
Example
An example of ndarray creation and use through NumPy calls, with simple indexing and slicing of the array (as with regular lists and tuples).
>>> import numpy as np
>>> lst = [[1, 2, 3], [3, 6, 9], [2, 4, 6]]   # create a list
>>> a = np.array(lst)                          # convert a list into an array
>>> print(a)
[[1 2 3]
 [3 6 9]
 [2 4 6]]
>>> a.shape
(3, 3)
>>> print(a.dtype)      # get the type of an array (platform dependent)
int32
>>> print(a[0])         # this is just like a list of lists
[1 2 3]
>>> print(a[1, 2])      # arrays can be given comma separated indices
9
>>> print(a[1, 1:3])    # and slices
[6 9]
>>> print(a[:,1])       # second column
[2 6 4]
Setting array elements
How to access and modify elements of a NumPy array:
ARRAY INDEXING
>>> a[0]
0
>>> a[0] = 10
>>> a
array([10, 1, 2, 3])
BEWARE OF TYPE COERCION
>>> a.dtype
dtype('int32')
# assigning a float into an int32 array truncates the decimal part.
>>> a[0] = 10.6
>>> a
array([10, 1, 2, 3])
# fill has the same behavior
>>> a.fill(-4.8)
>>> a
array([-4, -4, -4, -4])
FILL
# set all values in an array.
>>> a.fill(0)
>>> a
array([0, 0, 0, 0])
# This also works, but may be slower.
>>> a[:] = 1
>>> a
array([1, 1, 1, 1])
Example
An example of modifying the 3x3 array a created earlier. We also see how to create an array of zeros.
>>> a[1, 2] = 7
>>> print(a)
[[1 2 3]
 [3 6 7]
 [2 4 6]]
>>> a[:, 0] = [0, 9, 8]
>>> print(a)
[[0 2 3]
 [9 6 7]
 [8 4 6]]
>>> b = np.zeros(5)
>>> print(b)
[0. 0. 0. 0. 0.]
>>> b.dtype
dtype('float64')
>>> n = 1000
>>> my_int_array = np.zeros(n, dtype=int)    # numpy.int was removed from recent NumPy versions; use int
>>> my_int_array.dtype                       # platform dependent
dtype('int32')
Multi-dimensional arrays
Examples of use of multi-dimensional arrays in NumPy:
MULTI-DIMENSIONAL ARRAYS
>>> a = array([[ 0, 1, 2, 3], [10,11,12,13]])
>>> a
array([[ 0,  1,  2,  3],
       [10, 11, 12, 13]])
(ROWS,COLUMNS)
>>> a.shape
(2, 4)
>>> shape(a)
(2, 4)
ELEMENT COUNT
>>> a.size
8
>>> size(a)
8
NUMBER OF DIMENSIONS
>>> a.ndim
2
GET/SET ELEMENTS
# the first index selects the row, the second the column
>>> a[1,3]
13
>>> a[1,3] = -1
>>> a
array([[ 0,  1,  2,  3],
       [10, 11, 12, -1]])
ADDRESS A ROW USING A SINGLE INDEX
# a[1] is the second row
>>> a[1]
array([10, 11, 12, -1])
Example
How to create an array of ones and how to use the function arange:
>>> c = np.ones(4)
>>> print(c)
[1. 1. 1. 1.]
>>> d = np.arange(5)      # just like range()
>>> print(d)
[0 1 2 3 4]
>>> d[1] = 9.7
>>> print(d)              # arrays keep their type even if elements are changed
[0 9 2 3 4]
>>> print(d*0.4)          # operations create a new array, with a new type
[0.  3.6 0.8 1.2 1.6]
>>> d = np.arange(5, dtype=float)    # numpy.float was removed from recent NumPy versions; use float
>>> print(d)
[0. 1. 2. 3. 4.]
>>> np.arange(3, 7, 0.5)  # arbitrary start, stop and step
array([3. , 3.5, 4. , 4.5, 5. , 5.5, 6. , 6.5])
Operations on arrays…
Other useful methods on ndarray provided by the NumPy library:
BYTES OF MEMORY USED
# returns the number of bytes used by the data portion of the array.
>>> a.nbytes
16
NUMBER OF DIMENSIONS
>>> a.ndim
1
CONVERSION TO LIST
# convert a numpy array to a python list.
>>> a.tolist()
[0, 1, 2, 3]
# For 1D arrays, list also works equivalently, but is slower.
>>> list(a)
[0, 1, 2, 3]
ARRAY COPY
# create a copy of the array
>>> b = a.copy()
>>> b
array([0, 1, 2, 3])
Array slicing
Slicing of NumPy arrays. In these examples a is a 6x6 array whose element a[i,j] equals 10*i + j.
SLICING WORKS MUCH LIKE STANDARD PYTHON SLICING
>>> a[0,3:5]
array([3, 4])
>>> a[4:,4:]
array([[44, 45],
       [54, 55]])
>>> a[:,2]
array([ 2, 12, 22, 32, 42, 52])
STRIDES ARE ALSO POSSIBLE
>>> a[2::2,::2]
array([[20, 22, 24],
       [40, 42, 44]])
Fortran-order and C-order An instance of class ndarray consists of a contiguous one-dimensional segment of memory, combined with an indexing scheme that maps integers into the location of an item in the segment. A segment of memory is inherently 1-dimensional, and there are different schemes for arranging the items of an N-dimensional array in a 1-dimensional block. Two schemes are widely used: Fortran-order (column major) and C-order (row major). Both the C and Fortran orders are contiguous, i.e., single-segment, memory layouts, in which every part of the memory block can be accessed by some combination of the indices.
Fortran-order and C-order (2)
In this example we create an ndarray with some elements and a specific ordering in memory, and we inspect several properties of the array for each of the two memory orderings.
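The slide's example was a picture; the following is a minimal sketch of the same idea. The array contents and the int64 dtype are assumptions made here for illustration.

```python
import numpy as np

data = [[1, 2, 3], [4, 5, 6]]
c = np.array(data, dtype=np.int64, order='C')   # row major
f = np.array(data, dtype=np.int64, order='F')   # column major

# Same values and shape, different layout in memory.
print(c.flags['C_CONTIGUOUS'], f.flags['F_CONTIGUOUS'])   # True True
print(c.strides)   # (24, 8)
print(f.strides)   # (8, 16)

# ravel(order='K') lists the items in the order they are stored.
print(c.ravel(order='K'))   # [1 2 3 4 5 6]
print(f.ravel(order='K'))   # [1 4 2 5 3 6]
```

Indexing gives the same result for both arrays (c[i, j] == f[i, j]); only the strides differ.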
Views vs. Copies A slicing operation creates a VIEW on the original array, which is just a way of accessing array data. Thus the original array is not copied in memory. When modifying the view, the original array is modified as well!
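A short sketch of this behaviour, using np.shares_memory to check whether two arrays use the same buffer:

```python
import numpy as np

a = np.arange(10)

view = a[2:5]          # slicing returns a view on a's data
view[0] = 100          # this also changes a[2]
print(a[2])            # 100

copy = a[2:5].copy()   # an explicit copy owns its data
copy[0] = -1           # a is not affected
print(a[2])            # 100

print(np.shares_memory(a, view), np.shares_memory(a, copy))   # True False
```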
Fancy indexing
Different kinds of indexing mechanisms are possible in NumPy:
INDEXING BY POSITION
>>> a = np.arange(0,80,10)
# fancy indexing
>>> y = a[[1, 2, -3]]
>>> print(y)
[10 20 50]
# using take
>>> y = np.take(a,[1,2,-3])
INDEXING WITH BOOLEANS
>>> mask = np.array([0,1,1,0,0,1,0,0],
...                 dtype=bool)
# fancy indexing
>>> y = a[mask]
>>> print(y)
[10 20 50]
# using compress
>>> y = np.compress(mask, a)
Fancy indexing in 2D
Fancy indexing can be applied to multi-dimensional arrays too (a is a 6x6 array whose element a[i,j] equals 10*i + j):
>>> a[[0,1,2,3,4],[1,2,3,4,5]]
array([ 1, 12, 23, 34, 45])
>>> a[3:,[0, 2, 5]]
array([[30, 32, 35],
       [40, 42, 45],
       [50, 52, 55]])
>>> mask = array([1,0,1,0,0,1], dtype=bool)
>>> a[mask,2]
array([ 2, 22, 52])
Unlike slicing, fancy indexing creates copies instead of views into the original arrays!
Data types (dtype)
Basic type: available NumPy types. Comments.
Boolean: bool. Elements are 1 byte in size.
Integer: int8, int16, int32, int64, int. The size of int is platform dependent.
Unsigned integer: uint8, uint16, uint32, uint64, uint. The size of uint is platform dependent.
Float: float32, float64, float, longfloat. float is always a double precision floating point value (64 bits). longfloat represents extended precision floats; its size is platform dependent.
Complex: complex64, complex128, complex. The real and imaginary parts of a complex64 are each represented by a single precision (32 bit) value, for a total size of 64 bits.
Strings: str, unicode. Unicode is always UTF32 (UCS4).
Object: object. Represents items in the array as Python objects.
Records: void. Used for arbitrary data structures in record arrays.
Built-in "scalar" types
Hierarchy of types used in NumPy. There are 21 "built-in" (static) data-type objects. New (dynamic) data-type objects are created to handle:
Alteration of the byte order;
Change in the element size (for string, unicode, and void built-ins);
Addition of fields;
Change of the type object (C-structure arrays).
Universal functions
Universal functions (ufuncs) are objects that rapidly evaluate a function element by element over an array. The core piece is a 1-d loop written in C that performs the operation over the largest dimension of the array. For 1-d arrays a ufunc is equivalent to, but much faster than, a list comprehension.
>>> import math
>>> import numpy as np
>>> type(np.exp)
<class 'numpy.ufunc'>
>>> x = np.array([1,2,3,4,5])
>>> print(np.exp(x))
[  2.71828183   7.3890561   20.08553692  54.59815003 148.4131591 ]
>>> print([math.exp(val) for val in x])
[2.718281828459045, 7.38905609893065, 20.085536923187668, 54.598150033144236, 148.4131591025766]
Array calculation methods
SUM FUNCTION
>>> a = np.array([[1,2,3], [4,5,6]], float)
# sum defaults to summing *all* array values.
>>> np.sum(a)
21.0
# supply the keyword axis to sum along the 0th axis.
>>> np.sum(a, axis=0)
array([5., 7., 9.])
# sum along the last axis.
>>> np.sum(a, axis=1)
array([ 6., 15.])
SUM ARRAY METHOD
# a.sum() defaults to summing *all* array values.
>>> a.sum()
21.0
# supply an axis argument to sum along a specific axis.
>>> a.sum(axis=0)
array([5., 7., 9.])
PRODUCT
# product along columns.
>>> a.prod(axis=0)
array([ 4., 10., 18.])
# functional form.
>>> np.prod(a, axis=0)
array([ 4., 10., 18.])
Min and Max methods
MIN
>>> a = array([2.,3.,0.,1.])
>>> a.min(axis=0)
0.0
# use NumPy's amin() instead of Python's builtin min() for speed on multi-dimensional arrays.
>>> np.amin(a, axis=0)
0.0
MAX
>>> a.max(axis=0)
3.0
# functional form
>>> np.amax(a, axis=0)
3.0
ARGMIN
# find the index of the minimum value.
>>> a.argmin(axis=0)
2
# functional form
>>> np.argmin(a, axis=0)
2
ARGMAX
# find the index of the maximum value.
>>> a.argmax(axis=0)
1
# functional form
>>> np.argmax(a, axis=0)
1
Simple statistics methods
MEAN
>>> a = array([[1,2,3], [4,5,6]], float)
# mean value of each column
>>> a.mean(axis=0)
array([2.5, 3.5, 4.5])
>>> mean(a, axis=0)
array([2.5, 3.5, 4.5])
>>> average(a, axis=0)
array([2.5, 3.5, 4.5])
# average can also calculate a weighted average
>>> average(a, weights=[1,2],
...         axis=0)
array([3., 4., 5.])
STANDARD DEV./VARIANCE
# standard deviation
>>> a.std(axis=0)
array([1.5, 1.5, 1.5])
# variance
>>> a.var(axis=0)
array([2.25, 2.25, 2.25])
>>> var(a, axis=0)
array([2.25, 2.25, 2.25])
Other array methods
CLIP
# limit values to a range
>>> a = array([[1,2,3], [4,5,6]], float)
# set values < 3 equal to 3 and values > 5 equal to 5.
# clip returns a new array; a itself is not modified.
>>> a.clip(3,5)
array([[3., 3., 3.],
       [4., 5., 5.]])
POINT TO POINT
# calculate max - min for the array along columns
# (in NumPy 2.0 and later the ptp method was removed: use np.ptp(a, axis=0))
>>> a.ptp(axis=0)
array([3., 3., 3.])
# max - min for the entire array.
>>> a.ptp(axis=None)
5.0
ROUND
# round values in an array. NumPy rounds to even, so 1.5 and 2.5 both round to 2.
>>> a = array([1.35, 2.5, 1.5])
>>> a.round()
array([1., 2., 2.])
# round to the first decimal place.
>>> a.round(decimals=1)
array([1.4, 2.5, 1.5])
Summary
A partial list of the most common array attributes and methods in NumPy:
BASIC ATTRIBUTES
a.dtype – Numerical type of array elements: float32, uint8, etc.
a.shape – Shape of the array: (m,n,o,...)
a.size – Number of elements in the entire array.
a.itemsize – Number of bytes used by a single element in the array.
a.nbytes – Number of bytes used by the entire array (data only).
a.ndim – Number of dimensions in the array.
SHAPE OPERATIONS
a.flat – An iterator to step through the array as if it were 1D.
a.flatten() – Returns a 1D copy of a multi-dimensional array.
a.ravel() – Same as flatten(), but returns a 'view' if possible.
a.resize(new_size) – Change the size/shape of an array in-place.
a.swapaxes(axis1, axis2) – Swap the order of two axes in an array.
a.transpose(*axes) – Swap the order of any number of array axes.
a.T – Shorthand for a.transpose()
a.squeeze() – Remove any length=1 dimensions from an array.
Summary (2)
A partial list of the most common array attributes and methods in NumPy:
FILL AND COPY
a.copy() – Return a copy of the array.
a.fill(value) – Fill the array with a scalar value.
CONVERSION / COERCION
a.tolist() – Convert the array into nested lists of values.
a.tobytes() – Raw copy of the array memory into a Python bytes object (called a.tostring() in older NumPy versions).
a.astype(dtype) – Return the array coerced to the given dtype.
a.byteswap(False) – Convert byte order (big <-> little endian).
COMPLEX NUMBERS
a.real – Return the real part of the array.
a.imag – Return the imaginary part of the array.
a.conjugate() – Return the complex conjugate of the array.
a.conj() – Return the complex conjugate of the array (same as conjugate).
Summary (3)
A partial list of the most common array attributes and methods in NumPy:
SAVING
a.dump(file) – Store the binary array data out to the given file.
a.dumps() – Return the binary pickle of the array as a string.
a.tofile(fid, sep="", format="%s") – Formatted ASCII output to a file.
SEARCH / SORT
a.nonzero() – Return indices for all non-zero elements in a.
a.sort(axis=-1) – In-place sort of array elements along axis.
a.argsort(axis=-1) – Return indices for element sort order along axis.
a.searchsorted(b) – Return the indices where elements from b would go in a.
ELEMENT MATH OPERATIONS
a.clip(low, high) – Limit values in the array to the specified range.
a.round(decimals=0) – Round to the specified number of digits.
a.cumsum(axis=None) – Cumulative sum of elements along axis.
a.cumprod(axis=None) – Cumulative product of elements along axis.
Reduction methods
A list of the reduction methods of NumPy arrays. All the following methods "reduce" the size of the array by 1 dimension by carrying out an operation along the specified axis. If axis is None, the operation is carried out across the entire array and a scalar is returned.
a.sum(axis=None) – Sum up values along axis.
a.prod(axis=None) – Find the product of all values along axis.
a.min(axis=None) – Find the minimum value along axis.
a.max(axis=None) – Find the maximum value along axis.
a.argmin(axis=None) – Find the index of the minimum value along axis.
a.argmax(axis=None) – Find the index of the maximum value along axis.
a.ptp(axis=None) – Calculate a.max(axis) - a.min(axis).
a.mean(axis=None) – Find the mean (average) value along axis.
a.std(axis=None) – Find the standard deviation along axis.
a.var(axis=None) – Find the variance along axis.
a.any(axis=None) – True if any value along axis is non-zero (or).
a.all(axis=None) – True if all values along axis are non-zero (and).
Array operations
SIMPLE ARRAY MATH
>>> a = array([1,2,3,4])
>>> b = array([2,3,4,5])
>>> a + b
array([3, 5, 7, 9])
NumPy defines the following constants: pi = 3.14159265359, e = 2.71828182846
MATH FUNCTIONS
# create an array from 0. to 10.
>>> x = arange(11.)
# multiply the entire array by a scalar value
>>> a = (2*pi)/10.
>>> a
0.6283185307179586
>>> a*x
array([ 0.,0.628,…,6.283])
# in-place operations
>>> x *= a
>>> x
array([ 0.,0.628,…,6.283])
# apply functions to the array.
>>> y = sin(x)
Outer operations
op.outer(a,b) forms all possible combinations of elements between a and b using op. The shape of the resulting array results from concatenating the shapes of a and b (order matters!). op can be any binary ufunc (e.g. add, subtract, multiply, divide, etc.).
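A minimal sketch with two small arrays chosen here for illustration:

```python
import numpy as np

a = np.array([1, 2, 3])      # shape (3,)
b = np.array([10, 20])       # shape (2,)

s = np.add.outer(a, b)       # s[i, j] = a[i] + b[j]
print(s.shape)               # (3, 2)
print(s)

p = np.multiply.outer(a, b)  # p[i, j] = a[i] * b[j]

# Order matters: swapping the arguments swaps the shape.
print(np.add.outer(b, a).shape)   # (2, 3)
```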
Concatenate ndarrays
concatenate((a0,a1,…,aN), axis=0): the input arrays (a0,a1,…,aN) are concatenated along the given axis. They must have the same shape along every axis except the one given.
>>> concatenate((x,y),0)
>>> concatenate((x,y),1)
>>> array((x,y))
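The figures of the slide are not available; the sketch below shows the resulting shapes for two hypothetical 2x3 arrays x and y.

```python
import numpy as np

x = np.array([[0, 1, 2], [10, 11, 12]])
y = np.array([[50, 51, 52], [60, 61, 62]])

rows = np.concatenate((x, y), 0)   # stack below: shape (4, 3)
cols = np.concatenate((x, y), 1)   # stack side by side: shape (2, 6)
stacked = np.array((x, y))         # new leading axis: shape (2, 2, 3)

print(rows.shape, cols.shape, stacked.shape)
```

Note that array((x,y)) does not concatenate along an existing axis: it creates a new first axis.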
Dot, cross and outer products
NumPy provides functions for many operations, including the classic products between vectors (ndarrays): the dot, cross and outer products.
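A minimal sketch of the three products, using np.dot, np.cross and np.outer on two vectors chosen here for illustration:

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

d = np.dot(u, v)      # scalar: 1*4 + 2*5 + 3*6
c = np.cross(u, v)    # vector orthogonal to both u and v
o = np.outer(u, v)    # 3x3 matrix with o[i, j] = u[i] * v[j]

print(d)              # 32
print(c)              # [-3  6 -3]
print(o.shape)        # (3, 3)
```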
Matrix-vector product
The same functions also compute the product between a matrix (a 2-D ndarray) and a vector.
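A short sketch with a hypothetical 2x2 matrix; np.dot and the @ operator give the same result here.

```python
import numpy as np

M = np.array([[1, 2],
              [3, 4]])
v = np.array([1, 1])

print(np.dot(M, v))   # [3 7]: each row of M times v
print(M @ v)          # same result with the @ operator
print(v @ M)          # [4 6]: v times each column of M
```

Note that M * v is not a matrix product: it multiplies element by element (with broadcasting).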
Determinant of a matrix
NumPy provides functions over matrices besides the ones on one-dimensional arrays. For example, we can compute the determinant of a matrix.
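A minimal sketch using np.linalg.det on a matrix chosen here for illustration:

```python
import numpy as np

M = np.array([[4.0, 3.0],
              [6.0, 3.0]])

det = np.linalg.det(M)   # 4*3 - 3*6 = -6, up to rounding error
print(det)
```

The result is computed in floating point, so it should be compared with a tolerance (e.g. np.isclose) and not with ==.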
Inverse of a matrix
The inverse of a matrix can also be easily found.
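A minimal sketch using np.linalg.inv on a matrix chosen here for illustration:

```python
import numpy as np

M = np.array([[4.0, 7.0],
              [2.0, 6.0]])

M_inv = np.linalg.inv(M)
print(M_inv)

# M times its inverse gives the identity matrix, up to rounding error.
print(np.allclose(M @ M_inv, np.eye(2)))   # True
```

np.linalg.inv raises numpy.linalg.LinAlgError when the matrix is singular.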
Eigenvectors and eigenvalues
Given a matrix M, the eigenvalues b and eigenvectors V of the matrix are defined by the equation: MV = bV
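A minimal sketch using np.linalg.eig on a symmetric matrix chosen here for illustration. The function returns the eigenvalues and a matrix whose columns are the corresponding eigenvectors.

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])

b, V = np.linalg.eig(M)   # b[i] is the eigenvalue of the column V[:, i]
print(sorted(b))          # eigenvalues 1 and 3, up to rounding error

# Check the definition M V = b V for all eigenpairs at once:
# V * b multiplies column i of V by b[i].
print(np.allclose(M @ V, V * b))   # True
```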
Solving system of equations
Consider a system of three linear equations. We can write the three equations in matrix form as Mx = b, so we can solve for x = M^-1 b, where M^-1 is the inverse of M.
Solving system of equations (2)
In NumPy, the calculation is straightforward (take note of how matrix multiplication works here).
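The system shown on the original slide is not available, so the sketch below uses an illustrative system chosen here (3x + 6y - 5z = 12, x - 3y + 2z = -2, 5x - y + 4z = 10); it is not necessarily the one of the lecture.

```python
import numpy as np

# Coefficient matrix M and right-hand side b of the illustrative system.
M = np.array([[3.0, 6.0, -5.0],
              [1.0, -3.0, 2.0],
              [5.0, -1.0, 4.0]])
b = np.array([12.0, -2.0, 10.0])

# x = M^-1 b: the @ operator is the matrix product (M * b would be elementwise).
x = np.linalg.inv(M) @ b
print(x)    # approximately [1.75 1.75 0.75]

# np.linalg.solve gives the same solution without forming the inverse.
x2 = np.linalg.solve(M, b)
print(np.allclose(x, x2))   # True
```

In practice np.linalg.solve is usually preferred to computing the inverse explicitly, since it is faster and numerically more accurate.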
Mathematical binary operators
Operations between NumPy arrays and/or with scalars:
a + b   add(a,b)
a - b   subtract(a,b)
a * b   multiply(a,b)
a / b   divide(a,b)
a % b   remainder(a,b)
a ** b  power(a,b)
ELEMENT BY ELEMENT ADDITION
>>> a = array([1,2])
>>> b = array([3,4])
>>> a + b
array([4, 6])
ADDITION USING AN OPERATOR FUNCTION
>>> add(a,b)
array([4, 6])
IN PLACE OPERATION
# overwrite the contents of a: same as a += b
>>> add(a,b,a)
array([4, 6])
>>> a
array([4, 6])
MULTIPLY BY A SCALAR
>>> a = array((1,2))
>>> a*3.
array([3., 6.])
Comparison and logical operators
NumPy supports operators for comparing two arrays and logical operators:
equal (==), not_equal (!=), greater (>), greater_equal (>=), less (<), less_equal (<=)
logical_and, logical_or, logical_xor, logical_not
2D EXAMPLE
>>> a = array(((1,2,3,4),(2,3,4,5)))
>>> b = array(((1,2,5,4),(1,3,4,5)))
>>> a == b
array([[ True,  True, False,  True],
       [False,  True,  True,  True]])
# functional equivalent
>>> equal(a,b)
array([[ True,  True, False,  True],
       [False,  True,  True,  True]])
Bitwise operators
Bitwise operators on NumPy arrays:
bitwise_and (&), bitwise_or (|), bitwise_xor (^), invert (~), right_shift(a,shifts), left_shift(a,shifts)
BITWISE EXAMPLES
>>> a = array((1,2,4,8))
>>> b = array((16,32,64,128))
>>> bitwise_or(a,b)
array([ 17,  34,  68, 136])
# bit inversion
>>> a = array((1,2,3,4), uint8)
>>> invert(a)
array([254, 253, 252, 251], dtype=uint8)
# left shift operation
>>> left_shift(a,3)
array([ 8, 16, 24, 32], dtype=uint8)
Trig. and other functions
NumPy supports several classes of universal functions that can be applied to NumPy arrays:
TRIGONOMETRIC
sin(x), cos(x), arcsin(x), arccos(x), arctan(x), arctan2(y,x)
Hyperbolic: sinh(x), cosh(x), arcsinh(x), arccosh(x), arctanh(x)
OTHERS
exp(x), log(x), log10(x), sqrt(x), absolute(x), conjugate(x), negative(x), ceil(x), floor(x), fabs(x), hypot(x,y), fmod(x,y), maximum(x,y), minimum(x,y)
hypot(x,y): element-by-element distance calculation, i.e. sqrt(x**2 + y**2).
Broadcasting
The term broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is "broadcast" across the larger array so that they have compatible shapes. NumPy operations are usually done on pairs of arrays on an element-by-element basis. In the simplest case, the two arrays must have exactly the same shape:
>>> a = np.array([1.0, 2.0, 3.0])
>>> b = np.array([2.0, 2.0, 2.0])
>>> a * b
array([2., 4., 6.])
NumPy's broadcasting rule relaxes this constraint when the arrays' shapes meet certain conditions. The simplest broadcasting example occurs when an array and a scalar value are combined in an operation:
>>> a = np.array([1.0, 2.0, 3.0])
>>> b = 2.0
>>> a * b
array([2., 4., 6.])
Broadcasting (2)
We can think of the scalar b being stretched during the arithmetic operation into an array with the same shape as a. The stretching analogy is only conceptual: NumPy is smart enough to use the original scalar value without actually making copies. When operating on two arrays, NumPy compares their shapes dimension by dimension. It starts with the trailing dimensions and works its way forward. Two dimensions are compatible when:
they are equal, or
one of them is 1.
Otherwise a ValueError is raised (recent NumPy versions report "operands could not be broadcast together"; older versions reported "frames are not aligned").
In the previous example, a has shape (3,) while b is a scalar, which is treated as an array with shape () and stretched to the shape of a.
Broadcasting example
Another example of broadcasting between two arrays with different shapes:
x has shape (4,); the ufunc sees it as having shape (1,4)
y has shape (3,1)
The ufunc result has shape (3,4)
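The shapes listed above can be reproduced with the following sketch; the contents of x and y are chosen here for illustration.

```python
import numpy as np

x = np.arange(4)                  # shape (4,), seen as (1, 4)
y = np.arange(3).reshape(3, 1)    # shape (3, 1)

z = x + y                         # z[i, j] = y[i, 0] + x[j]
print(z.shape)                    # (3, 4)
print(z)
```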
Broadcasting more…
Arrays do not need to have the same number of dimensions. This example shows the concept in more detail.
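The example of the slide was a picture. As a sketch of the same concept, a hypothetical 3-D array is combined below with a 1-D array: the shapes (2, 4, 3) and (3,) are compared starting from the trailing dimension.

```python
import numpy as np

data = np.ones((2, 4, 3))              # hypothetical 3-D data
scale = np.array([1.0, 0.5, 0.25])     # shape (3,): one factor per last-axis item

result = data * scale                  # scale is stretched to shape (2, 4, 3)
print(result.shape)                    # (2, 4, 3)
print(result[0, 0])                    # [1.   0.5  0.25]
```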
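Summary on broadcasting
[Figure: three broadcasting cases. A 4x3 array combined with a 4x3 array (no stretching); a 4x3 array combined with an array of length 3 (stretched along the rows); a 4x1 array combined with an array of length 3 (both stretched).]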
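Broadcasting rules
The trailing axes of both arrays must either be 1 or have the same size for broadcasting to occur. Otherwise a ValueError is raised (recent NumPy versions report "operands could not be broadcast together"; older versions reported "frames are not aligned").
Mismatch: a 4x3 array cannot be combined with an array of length 4. For the same reason a + b fails for the arrays below, whose shapes are (4,) and (3,). Inserting a new axis gives a the shape (4,1), which is compatible with (3,):
>>> a = array((0,10,20,30))
>>> b = array((0,1,2))
>>> y = a[:, None] + b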
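A short sketch showing both the failing operation and the version that works:

```python
import numpy as np

a = np.array([0, 10, 20, 30])   # shape (4,)
b = np.array([0, 1, 2])         # shape (3,)

try:
    a + b                       # trailing axes 4 and 3 do not match
    failed = False
except ValueError as err:
    failed = True
    print("broadcast error:", err)

y = a[:, None] + b              # (4, 1) with (3,) gives (4, 3)
print(y.shape)                  # (4, 3)
```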
END
Searching and indexing
NumPy provides a fast way to search (and extract) individual elements of a NumPy array. It is much faster than using a for loop or a list comprehension!
>>> a = np.array([1, 3, 0, -5, 0], float)
>>> np.where(a != 0)
(array([0, 1, 3]),)
>>> a[a != 0]
array([ 1.,  3., -5.])
>>> len(a[np.where(a > 0)])
2
>>> x = np.arange(9.).reshape(3, 3)
>>> x
array([[0., 1., 2.],
       [3., 4., 5.],
       [6., 7., 8.]])
>>> np.where( x > 5 )
(array([2, 2, 2]), array([0, 1, 2]))
>>> b = np.array([10,20,30,40,50])
>>> i = np.where(a > 0)
>>> i
(array([0, 1]),)
>>> b[i]
array([10, 20])
>>> b
array([10, 20, 30, 40, 50])
Fancy indexing creates copies instead of references (views)!
Indexing with newaxis
newaxis is a special index that inserts a new axis in the array at the specified location. Each newaxis increases the array's dimensionality by 1.
>>> a = array((0,1,2))
>>> shape(a)
(3,)
>>> y = a[newaxis,:]
>>> shape(y)
(1, 3)
>>> y = a[:, newaxis]
>>> shape(y)
(3, 1)
>>> y = a[:, newaxis, newaxis]
>>> shape(y)
(3, 1, 1)
>>> b = array((0,10,20,30))
>>> y = a + b[:, newaxis]
>>> y
array([[ 0,  1,  2],
       [10, 11, 12],
       [20, 21, 22],
       [30, 31, 32]])
Newaxis example
This is a very simple example in which we use broadcasting and newaxis to execute a simple operation between ndarrays.
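The slide's own example is not available; as a sketch of the same technique, the code below builds a table of pairwise differences between the elements of a vector (an operation chosen here for illustration).

```python
import numpy as np

a = np.array([0, 1, 2])

# Shapes (3, 1) and (1, 3) broadcast to (3, 3): d[i, j] = a[i] - a[j]
d = a[:, np.newaxis] - a[np.newaxis, :]
print(d)
```

No explicit loop is needed: newaxis only changes the shape of the view, and broadcasting takes care of the repetition.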
Example
Let us implement the prime number sieve (sieve of Eratosthenes) using NumPy arrays and views. Views are not copies: this makes it possible to save memory and to perform the operations more efficiently.
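One possible implementation is sketched below; the function name primes_up_to is chosen here and is not taken from the lecture. The key point is that assigning to a slice writes through a view into the original array.

```python
import numpy as np

def primes_up_to(n):
    """Return an array with all primes <= n (assumes n >= 2)."""
    is_prime = np.ones(n + 1, dtype=bool)
    is_prime[:2] = False                     # 0 and 1 are not prime
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            # The strided slice is a view: this marks all multiples of p
            # (from p*p on) in is_prime itself, without copying data.
            is_prime[p * p::p] = False
    return np.nonzero(is_prime)[0]

print(primes_up_to(30))   # [ 2  3  5  7 11 13 17 19 23 29]
```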