Python Lists and NumPy Arrays
NumPy is a Python package used for numerical calculations, working with arrays of homogeneous values, and scientific computing. This section introduces NumPy arrays, then explains the difference between Python lists and NumPy arrays.
Python Lists
NumPy is used to construct homogeneous arrays and perform mathematical operations on arrays. A NumPy array is different from a Python list. The data types stored in a Python list can all be different.
python_list = [ 1, -0.038, 'gear', True]
The Python list above contains four different data types: 1 is an integer, -0.038 is a float, 'gear' is a string, and True is a boolean.
The code below prints the data type of each value stored in python_list.
python_list = [1, -0.038, 'gear', True]
for item in python_list:
    print(type(item))
NumPy Arrays
The values stored in a NumPy array must all share the same data type. Consider the NumPy array below:
np.array([1.0, 3.1, 5e-04, 0.007])
All four values stored in the NumPy array above share the same data type: 1.0, 3.1, 5e-04, and 0.007 are all floats.
The code below prints the data type of each value stored in the NumPy array above.
import numpy as np
for value in np.array([1.0, 3.1, 5e-04, 0.007]):
    print(type(value))
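The shared data type can also be read directly from the array instead of looping over its elements. The sketch below is a minimal illustration; the variable name float_array is an assumption introduced here, not a name from the text above.

```python
import numpy as np

# float_array is an illustrative name for the array shown above
float_array = np.array([1.0, 3.1, 5e-04, 0.007])

# dtype reports the single data type shared by every element
print(float_array.dtype)

# each element is a NumPy floating-point value
print(all(isinstance(value, np.floating) for value in float_array))
```

Because every element has the same type, one dtype value describes the whole array.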
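When values of different types are passed to np.array, NumPy converts them to a single common type. In the next code section, all four items are converted to type '<U32', which is a string data type in NumPy (the U refers to Unicode strings; all strings in Python are Unicode by default).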
np.array([1, -0.038, 'gear', True])
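One way to confirm the conversion is to inspect the resulting array. The sketch below assumes the illustrative variable name mixed_array; it checks the array's dtype and the Python type of each stored item.

```python
import numpy as np

# mixed_array is an illustrative name for the array built from mixed types
mixed_array = np.array([1, -0.038, 'gear', True])

# the dtype kind 'U' marks a Unicode string data type
print(mixed_array.dtype)
print(mixed_array.dtype.kind)

# after conversion, every item comes back as a Python str
for item in mixed_array.tolist():
    print(type(item))
```

Since the numbers are now stored as text, the array no longer behaves like a numeric array.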
Multiplying a Python list by an integer does not multiply each element; it repeats the list. The code below demonstrates list repetition using the multiplication operator, *.
lst = [1, 2, 3, 4]
lst*2
To multiply each number in the list by 2, a loop can be used:
lst = [1, 2, 3, 4]
for i, item in enumerate(lst):
    lst[i] = lst[i]*2
lst
The same operation can be completed without a loop by using a NumPy array.
Array Multiplication
An entire NumPy array can be multiplied by a scalar in one step. The scalar multiplication operation below produces an array with each element multiplied by the scalar 2.
nparray = np.array([1,2,3,4])
2*nparray
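To make the contrast concrete, the sketch below applies the same multiplication to a list and to an array built from that list. The names repeated and doubled are illustrative and do not appear in the text above.

```python
import numpy as np

lst = [1, 2, 3, 4]
nparray = np.array(lst)

# on a list, * repeats the sequence
repeated = lst * 2
print(repeated)          # [1, 2, 3, 4, 1, 2, 3, 4]

# on a NumPy array, * multiplies every element
doubled = 2 * nparray
print(doubled.tolist())  # [2, 4, 6, 8]
```

The list grows to eight items, while the array keeps its four elements and only the values change.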
Timing Arrays
Jupyter notebooks have a built-in magic command to time how long a line of code takes to execute. In a Jupyter notebook, when a line starts with %timeit followed by code, the kernel runs the line of code multiple times and outputs the average time spent executing it.
We can use %timeit to compare a mathematical operation on a Python list using a for loop to the same mathematical operation on a NumPy array.
lst = list(range(10000))
%timeit for i, item in enumerate(lst): lst[i] = lst[i]*2
nparray = np.arange(0,10000,1)
%timeit 2*nparray
For larger lists of numbers, the speed increase using NumPy is considerable.
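The %timeit command is only available in Jupyter and IPython. In a plain Python script, a similar comparison can be sketched with the standard-library timeit module. The helper double_with_loop and the choice of 100 repetitions are assumptions made for this example, and the measured times depend on the machine.

```python
import timeit

import numpy as np


def double_with_loop(values):
    """Return a new list with every element doubled (illustrative helper)."""
    result = []
    for item in values:
        result.append(item * 2)
    return result


lst = list(range(10000))
nparray = np.arange(0, 10000, 1)

runs = 100  # number of repetitions, chosen arbitrarily for this sketch

# timeit.timeit returns the total time in seconds for all repetitions
loop_time = timeit.timeit(lambda: double_with_loop(lst), number=runs)
numpy_time = timeit.timeit(lambda: 2 * nparray, number=runs)

print(f"for loop: {loop_time / runs:.2e} s per run")
print(f"NumPy:    {numpy_time / runs:.2e} s per run")
```

The helper builds a new list instead of modifying lst in place, so every timed run works on the same values.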