Dataclasses

Cut class boilerplate in half: @dataclass auto-generates __init__, __repr__, and __eq__ from your field annotations.

"@dataclass is Python saying: 'I know you just want to store data in a class. I'll write the boring parts for you.'"

— ShurAI

The Boilerplate Problem

A plain class that holds data requires a lot of repetitive code. Look how much you have to write just to store three fields:

python — tedious old way
class Point:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    def __repr__(self):
        return f"Point(x={self.x}, y={self.y}, z={self.z})"

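    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented   # let Python handle comparisons with other types
        return (self.x, self.y, self.z) == (other.x, other.y, other.z)
# Over a dozen lines for 3 fields. Imagine 10 fields.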
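
To see why those extra methods are worth writing at all, here is a minimal sketch of what happens without them. BarePoint is a hypothetical name used only for this illustration; it has the same three fields but no __repr__ or __eq__.

```python
class BarePoint:
    """Hypothetical class: same fields as Point, but no __repr__ or __eq__."""

    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

a = BarePoint(1, 2, 3)
b = BarePoint(1, 2, 3)

same_values_equal = (a == b)
print(same_values_equal)   # False: the default == compares identity, not field values
print(a == a)              # True: an object is only equal to itself
print(repr(a))             # default repr: class name plus a memory address
```

Without a custom __eq__, two objects holding identical data still compare as different, and the default repr tells you nothing about the stored values.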

@dataclass — All of That in 4 Lines

python
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float
    z: float

# Python auto-generates __init__, __repr__, __eq__ for free
p1 = Point(1.0, 2.0, 3.0)
p2 = Point(1.0, 2.0, 3.0)

print(p1)         # Point(x=1.0, y=2.0, z=3.0)
print(p1 == p2)   # True
print(p1.x)       # 1.0

What @dataclass auto-generates:

__init__: constructor with all fields as parameters
__repr__: readable string like Point(x=1.0, y=2.0, z=3.0)
__eq__: equality check comparing all fields
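
The generated methods can be inspected directly. The sketch below reuses the Point class from above; init_params and loose are illustrative names, not part of the lesson. It also shows one thing worth knowing: the annotations drive code generation, but Python does not check them at runtime.

```python
import inspect
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float
    z: float

# Parameter names of the generated __init__, in declaration order
init_params = list(inspect.signature(Point).parameters)
print(init_params)                      # ['x', 'y', 'z']

p = Point(1.0, 2.0, 3.0)
print(repr(p))                          # Point(x=1.0, y=2.0, z=3.0)
print(p == Point(1.0, 2.0, 3.0))        # True: same class, same field values
print(p == (1.0, 2.0, 3.0))             # False: a tuple is not a Point

# Annotations are not enforced at runtime: this is accepted without error
loose = Point("a", "b", "c")
print(loose)                            # Point(x='a', y='b', z='c')
```

If you need the types checked, that has to come from a static type checker or from your own validation code.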

Default Values and field()

python
from dataclasses import dataclass, field

@dataclass
class Player:
    name:   str
    level:  int   = 1           # simple default
    score:  float = 0.0
    items:  list  = field(default_factory=list)  # mutable default MUST use field()

p = Player("Riya")
print(p)   # Player(name='Riya', level=1, score=0.0, items=[])

p.items.append("Sword")
print(p)   # Player(name='Riya', level=1, score=0.0, items=['Sword'])

⚠️ Always use field(default_factory=list) for mutable defaults

Never write items: list = [] in a dataclass. In a plain function, a default like that is shared across ALL calls, which is the classic mutable default argument bug. @dataclass guards against it: a list, dict, or set default raises ValueError as soon as the class is defined. Use field(default_factory=list) so each instance gets its own fresh list.
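
Both halves of that warning can be checked in a few lines. In this sketch, BadPlayer and the second player name "Arjun" are hypothetical, added only for the demonstration.

```python
from dataclasses import dataclass, field

# Attempting a bare list default fails at class-definition time
try:
    @dataclass
    class BadPlayer:
        name: str
        items: list = []
except ValueError as exc:
    error_message = str(exc)
    print(error_message)   # the message points you to default_factory

# With default_factory, list() is called once per new instance
@dataclass
class Player:
    name: str
    items: list = field(default_factory=list)

a = Player("Riya")
b = Player("Arjun")
a.items.append("Sword")

print(a.items)   # ['Sword']
print(b.items)   # []
```

Because the factory runs for every new instance, appending to one player's items never touches another's.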

frozen=True — Immutable Dataclasses

python
from dataclasses import dataclass

@dataclass(frozen=True)    # makes instances immutable + hashable
class Coordinate:
    lat:  float
    lon:  float

c = Coordinate(28.6, 77.2)
print(c)        # Coordinate(lat=28.6, lon=77.2)

# c.lat = 99  ← would raise FrozenInstanceError

# Frozen dataclasses are hashable when every field is hashable, so they can be dict keys or set members
locations = {Coordinate(28.6, 77.2): "Delhi"}
print(locations[Coordinate(28.6, 77.2)])   # Delhi
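
Here is a runnable sketch of the frozen behaviour, including one common way to get a "changed" copy: dataclasses.replace, which builds a new instance and leaves the original alone. The new latitude 19.1 and the names moved and unique are illustrative assumptions.

```python
from dataclasses import dataclass, replace, FrozenInstanceError

@dataclass(frozen=True)
class Coordinate:
    lat: float
    lon: float

c = Coordinate(28.6, 77.2)

# Assigning to a field of a frozen instance is rejected
try:
    c.lat = 99
except FrozenInstanceError as exc:
    print(f"Blocked: {exc}")

# replace() returns a new instance with the given fields changed
moved = replace(c, lat=19.1)
print(c)       # Coordinate(lat=28.6, lon=77.2)
print(moved)   # Coordinate(lat=19.1, lon=77.2)

# Equal frozen instances hash the same, so a set keeps only one of them
unique = {c, Coordinate(28.6, 77.2), moved}
print(len(unique))   # 2
```

The original c is untouched throughout, which is exactly what makes frozen instances safe to use as dict keys.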

Real Example — Product Catalogue

python
from dataclasses import dataclass, field

@dataclass
class Product:
    name:     str
    price:    float
    category: str
    in_stock: bool  = True
    tags:     list  = field(default_factory=list)

    def discounted_price(self, pct):
        return self.price * (1 - pct / 100)

p1 = Product("Laptop", 50000, "Electronics", tags=["tech", "new"])
p2 = Product("Rice",   80,    "Groceries",   in_stock=False)

print(p1)
print(f"After 10% off: {p1.discounted_price(10)}")
print(p2)
output
Product(name='Laptop', price=50000, category='Electronics', in_stock=True, tags=['tech', 'new'])
After 10% off: 45000.0
Product(name='Rice', price=80, category='Groceries', in_stock=False, tags=[])

"Use @dataclass whenever your class is primarily about holding data. Save hand-written __init__ for classes with complex setup logic where you genuinely need full control."

— ShurAI

🧠 Quiz — Q1

What does @dataclass automatically generate for a class?

🧠 Quiz — Q2

You have a dataclass with a list field. Why should you write items: list = field(default_factory=list) instead of items: list = []?

🧠 Quiz — Q3

What does @dataclass(frozen=True) do?

🧠 Quiz — Q4

When is it better to use a regular class with a hand-written __init__ instead of @dataclass?