This blog is enough to get you started with hands-on NumPy, because it covers the topics in the sections below.

### What Is Numpy?

NumPy, short for Numerical Python, is a free and open source library for working with n-dimensional arrays (ndarrays). It is often used alongside other libraries such as SciPy, Matplotlib, pandas, and scikit-learn for scientific computing in data science and machine learning applications. NumPy is partly written in Python, with the rest in C and C++.

### Getting Started With Numpy

NumPy does not come preinstalled with Python, so you will need to install it with a package manager such as pip or pipenv. With pip available on your system, the following command is enough to get started with NumPy:

    pip install numpy

Or, if you are using a Python distribution or environment such as Anaconda or Spyder, NumPy comes preinstalled and you can start by importing it in your environment.

Throughout the blog I will use the alias np for NumPy, so I will import it this way:

    import numpy as np
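A quick way to confirm that the installation and the import both worked is to print the installed version and build a tiny array. This is only a minimal sketch; the variable name `arr` is illustrative.

```python
import numpy as np

# The version string depends on what is installed on your machine.
print(np.__version__)

# Build a small array to confirm the import works end to end.
arr = np.array([1, 2, 3])
print(type(arr))  # <class 'numpy.ndarray'>
```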

### Arrays In Numpy

When you pass a list of lists to the array function, the number of inner lists is the number of rows, and the number of elements in each inner list is the number of columns. The numpy.array() function also takes an optional dtype argument, which you can use to define the data type of the array's elements; by default, NumPy infers the dtype from the elements. For example:

    n = np.array([[1, 2, 3], [1, 3, 4]])
    print(n)
    # output:
    # [[1 2 3]
    #  [1 3 4]]
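The optional dtype argument mentioned above can be sketched as follows. The variable names are illustrative. The integer dtype NumPy infers depends on the platform, so the sketch only checks that it is an integer type.

```python
import numpy as np

# Let NumPy infer the dtype from the elements (an integer type here).
inferred = np.array([[1, 2, 3], [1, 3, 4]])

# Request 64-bit floats explicitly through the optional dtype argument.
as_float = np.array([[1, 2, 3], [1, 3, 4]], dtype=np.float64)

# Mixing ints and floats makes NumPy infer a float dtype.
mixed = np.array([1, 2.5, 3])

print(inferred.shape)                              # (2, 3): two rows, three columns
print(np.issubdtype(inferred.dtype, np.integer))   # True
print(as_float.dtype)                              # float64
print(mixed.dtype)                                 # float64
```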

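    # 2x2 array filled with zeros
    print(np.zeros((2, 2)))
    # output:
    # [[0. 0.]
    #  [0. 0.]]

    # 2x2 array filled with ones
    print(np.ones((2, 2)))
    # output:
    # [[1. 1.]
    #  [1. 1.]]

    # 2x2 array filled with a chosen value (np.full, not np.fill)
    print(np.full((2, 2), 4))
    # output:
    # [[4 4]
    #  [4 4]]

    # 3x3 identity matrix
    print(np.eye(3))
    # output:
    # [[1. 0. 0.]
    #  [0. 1. 0.]
    #  [0. 0. 1.]]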

### Python Lists Vs Numpy Arrays

In Python we usually use lists rather than arrays to store multiple values, because arrays can only hold homogeneous data, that is, elements of the same data type. Then why use arrays instead of lists?

Array elements are stored at contiguous memory locations and take up less memory than the equivalent list, which makes arrays more efficient to work with.
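The memory claim can be checked roughly with a sketch like the one below. It assumes CPython, where `sys.getsizeof` reports the size of the list object and of each int object separately, so the list figure is only an approximation. The names `py_list` and `np_arr` are illustrative.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# A list holds pointers to separate int objects, so count both parts.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(v) for v in py_list)

# An ndarray stores the raw 8-byte values side by side in one buffer.
arr_bytes = np_arr.nbytes

print("list (approx. bytes):", list_bytes)
print("array data (bytes):", arr_bytes)                 # 8000
print("C-contiguous:", np_arr.flags["C_CONTIGUOUS"])    # True
```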

Consider the following example, which adds the numbers 0 to 99999 to themselves element by element, first in pure Python and then with NumPy arrays, and uses the time module to compare how long each version takes.

    import time
    import numpy as np

    vector = 100000

    def pure_python_ver():
        t = time.time()
        x = range(vector)
        y = range(vector)
        z = [x[i] + y[i] for i in range(len(y))]
        return time.time() - t

    def numpy_ver():
        t = time.time()
        x = np.arange(vector)
        y = np.arange(vector)
        z = x + y
        return time.time() - t

    t1 = pure_python_ver()
    t2 = numpy_ver()
    print('time taken by python version:', t1)
    print('time taken by numpy version:', t2)
    print("Numpy is in this example " + str(t1 / t2) + " faster!")

    # output (timings will vary from machine to machine):
    # time taken by python version: 0.19698405265808105
    # time taken by numpy version: 0.0011696815490722656
    # Numpy is in this example 168.40827558092133 faster!
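A single time.time() measurement can be noisy. As an alternative sketch (not the method used above), the standard library's timeit module can repeat each version several times and keep the best run. The function names here are illustrative, and the numbers will differ on every machine.

```python
import timeit
import numpy as np

vector = 100_000

def pure_python_add():
    x = range(vector)
    y = range(vector)
    return [x[i] + y[i] for i in range(len(y))]

def numpy_add():
    x = np.arange(vector)
    y = np.arange(vector)
    return x + y

# Run each version 10 times per measurement, take the best of 3
# measurements, and divide by 10 to get seconds per call.
t_python = min(timeit.repeat(pure_python_add, number=10, repeat=3)) / 10
t_numpy = min(timeit.repeat(numpy_add, number=10, repeat=3)) / 10

print("python seconds per call:", t_python)
print("numpy seconds per call:", t_numpy)
print("ratio:", t_python / t_numpy)
```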


### Array Indexing

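    # 1-D array: one index per element
    arr = np.array([1, 2, 3, 5, 11])
    print(arr[2], arr[3])  # outputs: 3 5

    # 2-D array: arr[row, column]
    arr = np.array([[1, 2, 3], [6, 7, 8]])
    print('3rd element from row 1:', arr[0, 2])  # outputs: 3rd element from row 1: 3

    # 3-D array: chained and comma-separated indexing give the same element
    arr = np.array([[[0, 1, 1], [3, 3, 4]], [[4, 4, 5], [3, 3, 5]]])
    print(arr[0][1][2])  # outputs 4
    print(arr[0, 1, 2])  # outputs 4

    # stepping through the 3-D array one index at a time
    print(arr)
    print(arr[0])
    print(arr[0][1])
    print(arr[0][1][2])
    # output:
    # [[[0 1 1]
    #   [3 3 4]]
    #
    #  [[4 4 5]
    #   [3 3 5]]]
    # [[0 1 1]
    #  [3 3 4]]
    # [3 3 4]
    # 4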

### Slicing Elements from the array

NumPy arrays use the same slicing syntax as lists, [start:end:step]:

    arr = np.array([1, 3, 5, 7, 11, 13, 17])
    print(arr[1:7:2])  # output: [ 3  7 13]

    # 2-D array: pick row 1, then columns 2 to 4
    arr = np.array([[1, 1, 2, 3, 5], [1, 2, 3, 5, 7]])
    print(arr[1, 2:5])  # outputs [3 5 7]

    # 2-D array: rows 0 to 1, columns 2 to 3
    arr = np.array([[1, 1, 2, 3, 5], [1, 2, 3, 5, 7]])
    print(arr[0:2, 2:4])
    # output:
    # [[2 3]
    #  [3 5]]
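As with lists, any part of [start:end:step] can be left out. The sketch below illustrates the defaults and a negative step on the same 1-D array; the variable names are illustrative.

```python
import numpy as np

arr = np.array([1, 3, 5, 7, 11, 13, 17])

first_three = arr[:3]      # start omitted: begin at index 0
from_fourth = arr[3:]      # end omitted: run to the last element
every_other = arr[::2]     # only the step is given
reversed_arr = arr[::-1]   # a negative step walks backwards

print(first_three)    # [1 3 5]
print(from_fourth)    # [ 7 11 13 17]
print(every_other)    # [ 1  5 11 17]
print(reversed_arr)   # [17 13 11  7  5  3  1]
```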

### Other Numpy operations

    # shape: the size of the array along each dimension
    arr = np.array([[1, 1, 2, 3, 5], [1, 2, 3, 5, 7], [1, 2, 3, 4, 5]])
    print(arr.shape)  # output: (3, 5)

    # reshape: -1 lets NumPy work out that dimension from the array's size
    arr = np.array([1, 1, 2, 3, 5, 8, 13, 21])
    newarr_1 = arr.reshape(-1, 2)  # unknown value passed for row
    newarr_2 = arr.reshape(2, -1)  # unknown value passed for column
    print(newarr_1)
    print()
    print(newarr_2)
    # output:
    # [[ 1  1]
    #  [ 2  3]
    #  [ 5  8]
    #  [13 21]]
    #
    # [[ 1  1  2  3]
    #  [ 5  8 13 21]]

    # ravel: flatten the array to one dimension
    arr = np.array([[1, 2], [3, 4], [5, 6]])
    new_arr = arr.ravel()
    print(new_arr)  # output: [1 2 3 4 5 6]

### Copy And View In Numpy

The difference between a view and a copy of an array is that a view is a new array object that shares its data with the parent array, so changing an element in either the parent or the view is reflected in both. A copy, by contrast, is like a deep copy of a list: it owns its own data, so changing an element of either the parent or the copy does not change the other.
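The behaviour described above can be sketched with the `view()` and `copy()` methods of an array. The names `parent`, `view`, and `copy` are illustrative.

```python
import numpy as np

parent = np.array([1, 2, 3, 4, 5])

view = parent.view()   # shares the parent's data buffer
copy = parent.copy()   # owns an independent buffer

parent[0] = 99         # visible through the view, not through the copy
view[1] = 42           # writing through the view changes the parent too

print(parent)  # [99 42  3  4  5]
print(view)    # [99 42  3  4  5]
print(copy)    # [1 2 3 4 5]

# The base attribute tells the two apart: a view points back at its parent.
print(view.base is parent)  # True
print(copy.base is None)    # True
```

Checking `base` is a handy way to find out whether an operation gave you a view or a copy.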

### Math operations on arrays

Addition, multiplication, division, and square roots work much as they do in plain Python with two or more operands, except that NumPy applies each operation element by element across the whole array. I will put these operations to use in the program below.

Consider the following example, which converts temperatures from Celsius to Fahrenheit, first with a list and then with an array.

    cel_list = [20.2, 20.4, 22.9, 21.5, 23.7, 25.3, 21.8, 24.2, 20.9, 22.1]
    cel_arr = np.array(cel_list)
    # or create it directly:
    # cel_arr = np.array([20.2, 20.4, 22.9, 21.5, 23.7, 25.3, 21.8, 24.2, 20.9, 22.1])
    print(cel_arr)
    # output: [20.2 20.4 22.9 21.5 23.7 25.3 21.8 24.2 20.9 22.1]

    # with a list, a comprehension converts one element at a time
    f_list = [x * (9 / 5) + 32 for x in cel_list]
    print(f_list)
    # output:
    # [68.36, 68.72, 73.22, 70.7, 74.66, 77.53999999999999, 71.24000000000001, 75.56, 69.62, 71.78]

    # with an array, the same expression applies to every element at once
    feh_arr = cel_arr * (9 / 5) + 32
    print(feh_arr)
    # output: [68.36 68.72 73.22 70.7  74.66 77.54 71.24 75.56 69.62 71.78]
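The other operations mentioned at the start of this section, division and square roots, follow the same element-wise pattern. Here is a minimal sketch with two made-up arrays, `a` and `b`, chosen so the results are easy to check by eye.

```python
import numpy as np

a = np.array([1.0, 4.0, 9.0, 16.0])
b = np.array([2.0, 2.0, 3.0, 4.0])

total = a + b        # element-wise addition
product = a * b      # element-wise multiplication
quotient = a / b     # element-wise division
roots = np.sqrt(a)   # square root of every element

print(total)     # [ 3.  6. 12. 20.]
print(product)   # [ 2.  8. 27. 64.]
print(quotient)  # [0.5 2.  3.  4. ]
print(roots)     # [1. 2. 3. 4.]
```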

### Statistics with Numpy

The mean of a data set is the average value, that is, the sum of the values divided by the number of values. We can calculate the mean of any data set with numpy.mean(). Now let us find the average speed of the given 10 people:

    speed = [91, 20, 81, 86, 120, 86, 91, 122, 91, 78]
    x = np.mean(speed)
    print(x)  # output: 86.6

    speed = [91, 20, 81, 86, 120, 86, 91, 122, 91, 78]

    # median: the middle value of the sorted data
    x = np.median(speed)
    print(x)  # output: 88.5

    # 34th percentile: the value below which 34% of the data falls
    x = np.percentile(speed, 34)
    print(x)  # output: 86.0

    # standard deviation (population form, which is NumPy's default)
    x = np.std(speed)
    print(x)  # output: 26.3977271748914

    # variance: the square of the standard deviation
    x = np.var(speed)
    print(x)  # output: 696.8399999999999
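To see how these statistics relate to one another, the sketch below recomputes the mean, variance, and standard deviation by hand and compares them with NumPy's results. The `*_by_hand` names are illustrative. Note that np.std and np.var divide by the number of values by default; passing ddof=1 gives the sample versions instead.

```python
import numpy as np

speed = np.array([91, 20, 81, 86, 120, 86, 91, 122, 91, 78])

# Mean: the sum of the values divided by the number of values.
mean_by_hand = speed.sum() / speed.size

# Variance: the mean of the squared deviations from the mean.
var_by_hand = ((speed - mean_by_hand) ** 2).mean()

# Standard deviation: the square root of the variance.
std_by_hand = np.sqrt(var_by_hand)

print(np.isclose(mean_by_hand, np.mean(speed)))  # True
print(np.isclose(var_by_hand, np.var(speed)))    # True
print(np.isclose(std_by_hand, np.std(speed)))    # True

# The sample standard deviation divides by n - 1, so it is slightly larger.
sample_std = np.std(speed, ddof=1)
print(sample_std > std_by_hand)  # True
```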

### Random Data Distribution With Numpy and Matplotlib

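Let us create 2 arrays of random values drawn from normal distributions with a specified mean and standard deviation:

    import numpy as np

    # np.random.normal(mean, standard deviation, number of samples)
    set_1 = np.random.normal(10.0, 1.0, 100)
    set_2 = np.random.normal(17.0, 2.5, 100)

We can then plot one set against the other as a scatter diagram:

    import matplotlib.pyplot as plt

    # Jupyter only: show the diagram inline instead of opening a new window
    %matplotlib inline

    plt.scatter(set_1, set_2)
    plt.show()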
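Because the values are random, every run produces different arrays. One way to make the data reproducible is a seeded generator, as sketched below; this is an alternative to the np.random.normal calls above, and the seed value 42 is an arbitrary assumption. The sketch also checks that the sample mean and standard deviation land near the requested values.

```python
import numpy as np

# A seeded generator yields the same "random" numbers on every run.
rng = np.random.default_rng(42)

# normal(mean, standard deviation, number of samples)
set_1 = rng.normal(10.0, 1.0, 100)
set_2 = rng.normal(17.0, 2.5, 100)

# With only 100 samples these are close to, not equal to, the targets.
print("set_1 mean/std:", set_1.mean(), set_1.std())
print("set_2 mean/std:", set_2.mean(), set_2.std())
```

The arrays can be passed to plt.scatter exactly as before.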

This was a quick-start tutorial for NumPy and its uses. To get in-depth knowledge of Python along with its various applications and real-time projects, you can enroll in Python Training in Chennai or Python Training in Bangalore by FITA, or enroll in a Data science course in Chennai or Data science course in Bangalore, which includes *Supervised, Unsupervised machine learning algorithms, Data Analysis Manipulation and visualization, reinforcement testing, hypothesis testing* and much more to train an industry-ready data scientist at an affordable price, with certification and career guidance support.