Introduction to NumPy
Fast, compact arrays for numerical computing: the foundation of data science, machine learning, and scientific Python.
"NumPy is the reason Python became the language of science. A Python list of a million integers takes tens of megabytes once each element's object overhead is counted. A NumPy array of the same values takes about 8MB, and vectorized operations on it are often 10 to 100 times faster."
— Shurai
What is NumPy and Why Use It?
NumPy (Numerical Python) is a library for fast number crunching. It stores numbers in a compact ndarray (n-dimensional array) and runs operations on entire arrays at once, in highly optimised C code:
Python list:
Each item has overhead
Loops are slow
No built-in math ops
NumPy array:
Tiny memory footprint
Operations are vectorized (no loop needed)
Built-in maths, stats, linear algebra
Install NumPy with pip:
pip install numpy
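The memory difference can be checked directly. The following is a minimal sketch, assuming 64-bit CPython and an int64 array; the exact list size varies by Python version, and the variable names are illustrative.

```python
import sys
import numpy as np

n = 1_000_000
py_list = list(range(n))
arr = np.arange(n, dtype=np.int64)

# A list stores pointers; each int is a separate Python object with its own overhead.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An ndarray stores raw 8-byte values in one contiguous buffer.
array_bytes = arr.nbytes

print(f"list:  {list_bytes / 1e6:.1f} MB")
print(f"array: {array_bytes / 1e6:.1f} MB")  # 8.0 MB
```

The list figure counts both the pointer table and the integer objects it points to, which is where most of the extra memory goes.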
Creating Arrays
import numpy as np # np is the universal alias
# From a list
a = np.array([1, 2, 3, 4, 5])
print(a) # [1 2 3 4 5]
print(a.dtype) # int64 on most platforms; all elements share one type
print(a.shape) # (5,) — 5 elements, 1 dimension
# Convenience creation functions
zeros = np.zeros(5) # [0. 0. 0. 0. 0.]
ones = np.ones(3) # [1. 1. 1.]
rng = np.arange(0, 10, 2) # [0 2 4 6 8] — like range() but array
linsp = np.linspace(0, 1, 5) # [0. 0.25 0.5 0.75 1.] — 5 evenly spaced
rand = np.random.rand(4) # 4 random floats between 0 and 1
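Two details are worth a quick sketch here: the creation functions accept a shape tuple and a dtype argument, and newer NumPy code usually draws random numbers from a Generator created with np.random.default_rng rather than calling np.random.rand. The seed value 42 and the variable names are arbitrary choices for illustration.

```python
import numpy as np

# Choose the element type explicitly instead of relying on the platform default
floats = np.array([1, 2, 3], dtype=np.float64)
print(floats)         # [1. 2. 3.]
print(floats.dtype)   # float64

# Pass a tuple to build a multi-dimensional array
table = np.zeros((2, 3), dtype=np.int32)
print(table.shape)    # (2, 3)

# Seeded generator: the same seed reproduces the same numbers
gen = np.random.default_rng(42)
samples = gen.random(4)   # 4 floats in the half-open interval [0, 1)
print(samples.shape)      # (4,)
```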
Vectorized Operations — No Loops Needed
The big win with NumPy: operations apply to every element at once, without writing a loop:
scores = np.array([55, 72, 88, 91, 63])
# Apply to all elements — no for loop!
print(scores + 5) # [60 77 93 96 68] — add 5 to every score
print(scores * 2) # [110 144 176 182 126]
print(scores >= 75) # [False False True True False]
print(scores[scores >= 75]) # [88 91] (boolean indexing keeps only the True positions)
# Math on two arrays — element-wise
a = np.array([1, 2, 3])
b = np.array([10, 20, 30])
print(a + b) # [11 22 33]
print(a * b) # [10 40 90]
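To see what vectorization replaces, the sketch below computes the same result twice, once with an explicit Python loop and once with array operations. The 5-point bonus capped at 100 and the score values are invented for illustration.

```python
import numpy as np

scores = np.array([55, 72, 88, 91, 98])

# Loop version: handle one element at a time
curved_loop = []
for s in scores:
    curved_loop.append(min(int(s) + 5, 100))

# Vectorized version: one expression over the whole array
curved = np.minimum(scores + 5, 100)

print(curved_loop)       # [60, 77, 93, 96, 100]
print(curved.tolist())   # [60, 77, 93, 96, 100]

# Comparisons produce a boolean array; summing it counts the True values
print(int((curved >= 75).sum()))   # 4
```

Both versions give the same numbers; the vectorized one pushes the per-element work into NumPy's compiled code instead of the Python interpreter.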
Useful Built-in Functions
data = np.array([4, 7, 2, 9, 1, 5, 8])
print(np.sum(data)) # 36
print(np.mean(data)) # 5.142...
print(np.std(data)) # standard deviation
print(np.min(data)) # 1
print(np.max(data)) # 9
print(np.sort(data)) # [1 2 4 5 7 8 9]
print(np.argmax(data)) # 3 — index of the max value
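A few related calls are worth knowing alongside these. The sketch below reuses the same data array; note that np.std computes the population standard deviation by default, and passing ddof=1 switches to the sample version.

```python
import numpy as np

data = np.array([4, 7, 2, 9, 1, 5, 8])

# Most functions are also available as array methods
print(data.sum())        # 36
print(data.argmin())     # 4 (index of the smallest value, 1)

# Population vs. sample standard deviation
pop_std = np.std(data)             # divides by n
sample_std = np.std(data, ddof=1)  # divides by n - 1
print(round(float(pop_std), 2), round(float(sample_std), 2))

# argsort returns the indices that would sort the array
order = np.argsort(data)
top3 = data[order[-3:][::-1]]      # three largest values, largest first
print(top3)              # [9 8 7]
```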
2D Arrays — Matrices
# 3 rows, 3 columns
matrix = np.array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
])
print(matrix.shape) # (3, 3)
print(matrix[0]) # [1 2 3] — first row
print(matrix[1, 2]) # 6 — row 1, col 2
print(matrix[:, 1]) # [2 5 8] — all rows, col 1
# Reshape: change dimensions without changing data
flat = np.arange(12) # [0 1 2 ... 11]
grid = flat.reshape(3, 4) # 3 rows, 4 cols
print(grid)
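The sketch below expands on slicing and reshape. Two behaviours it relies on are worth stating: passing -1 asks NumPy to infer that dimension, and reshaping a contiguous array such as the result of np.arange returns a view that shares memory with the original.

```python
import numpy as np

flat = np.arange(12)
grid = flat.reshape(3, 4)
print(grid)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

# -1 lets NumPy work out the missing dimension (12 / 6 = 2 rows)
wide = flat.reshape(-1, 6)
print(wide.shape)               # (2, 6)

# Slice rows and columns together: first two rows, last two columns
print(grid[:2, -2:].tolist())   # [[2, 3], [6, 7]]

# The reshaped array is a view here, so a write shows up in the original
grid[0, 0] = 99
print(flat[0])                  # 99
```

If an independent copy is needed, calling .copy() on the reshaped array avoids this sharing.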
Real Example — Student Grade Analysis
import numpy as np
# Each row = one student, each column = one subject
grades = np.array([
[85, 92, 78], # Riya
[70, 65, 80], # Arjun
[95, 88, 91], # Sneha
[60, 72, 55], # Vikram
])
# axis=1 averages across the columns of each row (one average per student)
student_avgs = np.mean(grades, axis=1)
print("Student averages:", student_avgs)
# [85. 71.67 91.33 62.33] (rounded to two decimals)
# axis=0 averages down the rows of each column (one average per subject)
subject_avgs = np.mean(grades, axis=0)
print("Subject averages:", subject_avgs)
# [77.5 79.25 76.]
# Boolean indexing: which students passed (avg >= 75)?
names = np.array(["Riya", "Arjun", "Sneha", "Vikram"])
print("Passed:", names[student_avgs >= 75])
# Passed: ['Riya' 'Sneha']
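The same array can answer a few more questions with argmax and a boolean mask. This continues the example as a sketch; the 75-point threshold is reused from the pass check, and the new variable names are illustrative.

```python
import numpy as np

names = np.array(["Riya", "Arjun", "Sneha", "Vikram"])
grades = np.array([
    [85, 92, 78],   # Riya
    [70, 65, 80],   # Arjun
    [95, 88, 91],   # Sneha
    [60, 72, 55],   # Vikram
])

# Top scorer in each subject: argmax down the rows (axis=0)
top_per_subject = names[np.argmax(grades, axis=0)]
print(top_per_subject)                 # ['Sneha' 'Riya' 'Sneha']

# Student with the highest overall average
student_avgs = grades.mean(axis=1)
print(names[student_avgs.argmax()])    # Sneha

# Number of subjects in which each student scored 75 or more
strong_counts = (grades >= 75).sum(axis=1)
print(strong_counts)                   # [3 1 3 0]
```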
"Once you get comfortable with NumPy arrays and vectorized operations, you'll never want to write a for loop over numbers again. The code is shorter, faster, and clearer."
— Shurai
🧠 Quiz — Q1
What is the main advantage of a NumPy array over a Python list for numbers?
🧠 Quiz — Q2
What does np.arange(0, 10, 2) return?
🧠 Quiz — Q3
scores = np.array([55, 72, 88]). What does scores + 5 return?
🧠 Quiz — Q4
In a 2D array m, what does m[:, 1] select?