14 May 2026

Type-safe Python design: patterns a Scala developer uses to stop runtime surprises

I’ve been writing Scala for years. At some point the compiler stops feeling like an obstacle and starts feeling like a colleague - one who reads every line you write and says “no, that doesn’t make sense” before you ship it. That’s a good feeling.

Then a project came in. Real-time signal processing, embedded hardware, weeks to prototype. The kind of system that would run unattended in remote locations where you can’t push a hotfix at 4am. The team had already decided on Python. I had no say.

My first reaction was the usual one. My second was to think: okay, Python’s type system exists. Most people treat it as optional decoration. What if I didn’t?

This post is what I built over the next few weeks - a system of patterns that brought Scala-level design discipline into plain Python. No third-party libraries. Just the standard library, mypy in CI, and twenty-four patterns I now use on every Python project that matters.

Type safety is non-negotiable. Runtime surprises in production aren’t a personality quirk of dynamically typed languages - they’re a sign that the code wasn’t designed to be trusted.


Why this matters

Here’s the bug that started it.

The system consumed audio from a sensor at a certain sample rate (Hz) and also used window sizes in samples. Both were float or int. Both were passed around as parameters. At one point, deep in a feature extraction function, someone passed 44100 where the code expected a sample count, not a rate. Same number, different meaning. Python accepted it without complaint. The output was plausible enough to pass casual inspection.

The error surfaced hours later. Silently wrong output is worse than a crash - a crash tells you where to look.

In Scala, this bug is literally impossible:

case class Hertz(value: Double) extends AnyVal
case class SampleCount(value: Int) extends AnyVal

def extractFeatures(rate: Hertz, windowSize: SampleCount): Array[Float] = ???

// Compiler error - cannot pass Hertz where SampleCount is expected
extractFeatures(Hertz(44100.0), Hertz(44100.0))

That’s it. The types encode the domain meaning. The compiler enforces it. The Saturday debugging session never happens.

Python can do this. Not at the language runtime level - but with mypy running in CI, you get most of the same safety net. The patterns below are how.


What we’ll build

Twenty-four patterns, grouped by what they prevent. Each one is:

  • stdlib only (no pip install)
  • annotated with which Python version it needs
  • shown with the Scala original so the translation is clear
  • usable independently - you don’t need to adopt all twenty-four at once

At the end there’s a version compatibility table and a standalone design example that combines the core patterns.

flowchart TD
    ROOT["Production problem"] --> C1["Unit confusion\nPart 1: NewType"]
    ROOT --> C2["Mutable / invalid state\nParts 2, 13: Frozen dataclass, Smart ctors"]
    ROOT --> C3["Swallowed errors\nParts 3, 14: Result, Validated"]
    ROOT --> C4["Inheritance coupling\nParts 4, 15: Protocol, Composition"]
    ROOT --> C5["Missing match cases\nPart 5: Exhaustive match"]
    ROOT --> C6["Implicit side effects\nParts 19–22: Reader, Writer, State, Monoid"]

    style ROOT fill:#e1f5ff,stroke:#0066cc,color:#000
    style C1 fill:#e1ffe1,stroke:#2d7a2d,color:#000
    style C2 fill:#e1ffe1,stroke:#2d7a2d,color:#000
    style C3 fill:#ffe1e1,stroke:#cc0000,color:#000
    style C4 fill:#f0e1ff,stroke:#8800cc,color:#000
    style C5 fill:#fff4e1,stroke:#cc8800,color:#000
    style C6 fill:#f0e1ff,stroke:#8800cc,color:#000

Part 1 - Domain NewTypes: stop mixing units

Why. When multiple parameters share the same base type (float, int, str), a caller can pass them in the wrong order. The function runs. Something is silently wrong.

How. NewType creates a named subtype of the base type that mypy treats as distinct at check time. At runtime Hertz(x) simply returns x, so the cost is a single function call and no wrapper object. It’s Python’s closest equivalent of Scala 2’s AnyVal value classes or Scala 3’s opaque types.

// Scala 2
case class Seconds(value: Double) extends AnyVal
case class Hertz(value: Double) extends AnyVal

// Scala 3
opaque type Seconds = Double
opaque type Hertz = Double

# Python 3.5.2+  (typing module)
from typing import NewType

Seconds = NewType("Seconds", float)
Hertz   = NewType("Hertz", float)
SampleCount = NewType("SampleCount", int)
Probability = NewType("Probability", float)

What you get:

def build_windows(
    sample_rate: Hertz,
    window_duration: Seconds,
    step_duration: Seconds,
) -> list[list[float]]:
    ...

# mypy error: Argument 1 to "build_windows" has incompatible type "Seconds"; expected "Hertz"
build_windows(Seconds(44100.0), Hertz(0.5), Seconds(0.25))

# Correct - explicit construction makes the intent visible
build_windows(Hertz(44100.0), Seconds(0.5), Seconds(0.25))

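Cross-domain arithmetic is where the tag falls away. Hertz and Seconds are both subtypes of float, so mypy accepts rate * duration - but the result is a plain float, and it cannot be used where a tagged type is expected without an explicit re-wrap:

rate: Hertz = Hertz(44100.0)
duration: Seconds = Seconds(0.5)

# mypy accepts this, but infers plain float - the unit tag is gone
raw = rate * duration

# mypy error: Incompatible types in assignment
#             (expression has type "float", variable has type "SampleCount")
bad: SampleCount = rate * duration

# Explicit re-wrap is the correct path - forces you to document the conversion
samples = SampleCount(int(rate * duration))

Python version notes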
NewType is available from Python 3.5.2. In Python 3.10+, NewType became a class (it was a plain function before) - same behaviour, slightly better error messages. No workaround needed for older versions; the typing module is the entire API.
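
One way to keep unit conversions honest is to funnel them through a single helper. The sketch below is an assumption of mine rather than code from the original project: samples_in is a hypothetical name, and rounding to the nearest sample is an arbitrary choice.

```python
from typing import NewType

Seconds = NewType("Seconds", float)
Hertz = NewType("Hertz", float)
SampleCount = NewType("SampleCount", int)


def samples_in(rate: Hertz, duration: Seconds) -> SampleCount:
    """Hypothetical helper: the one place where Hz * s becomes a sample count."""
    # rate * duration is a plain float to mypy, so the result is re-wrapped explicitly.
    return SampleCount(round(rate * duration))


window = samples_in(Hertz(44100.0), Seconds(0.5))

# At runtime NewType adds no wrapper: the values are ordinary floats and ints.
assert type(Hertz(44100.0)) is float
assert window == 22050
```

Swapping the two arguments is a mypy error, and every other function can now ask for a SampleCount instead of redoing the multiplication.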

Part 2 - Frozen dataclasses as sealed ADTs

Why. Domain objects that can be mutated after construction are a liability. An Order with a negative price shouldn’t exist at all - not as a “partially initialised object” waiting for validation, not as something that passed construction and got corrupted later. Valid construction should be the only kind.

How. @dataclass(frozen=True) gives you an immutable value type. __post_init__ is the smart constructor - validation happens at build time, not scattered through the codebase.

// Scala: case class + sealed trait; require plays the role of __post_init__
sealed trait ValidationError
case class NegativePrice(price: Double) extends ValidationError

case class Order(itemId: String, price: Double, qty: Int) {
  require(price >= 0, s"NegativePrice($price)")
  require(qty > 0, "qty must be positive")
}

# Python 3.7+
from dataclasses import dataclass

@dataclass(frozen=True)
class Order:
    item_id: str
    price: float
    qty: int

    def __post_init__(self) -> None:
        if self.price < 0:
            raise ValueError(f"price must be non-negative, got {self.price}")
        if self.qty <= 0:
            raise ValueError(f"qty must be positive, got {self.qty}")

What you get:

  • Order("SKU-1", -5.0, 10) raises immediately at construction
  • Fields cannot be reassigned after construction (FrozenInstanceError)
  • __eq__ and __hash__ are generated from field values - the object behaves like a value
  • Two Order("SKU-1", 9.99, 5) objects compare equal and hash identically - safe for set/dict keys
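
The four guarantees above can be checked directly. This is a small sketch that repeats the Order class so it runs on its own; rejects_negative_price and rejects_mutation are illustrative helper names, not part of the pattern.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Order:
    item_id: str
    price: float
    qty: int

    def __post_init__(self) -> None:
        if self.price < 0:
            raise ValueError(f"price must be non-negative, got {self.price}")
        if self.qty <= 0:
            raise ValueError(f"qty must be positive, got {self.qty}")


def rejects_negative_price() -> bool:
    # Construction itself fails, so no invalid Order ever exists.
    try:
        Order("SKU-1", -5.0, 10)
    except ValueError:
        return True
    return False


def rejects_mutation() -> bool:
    order = Order("SKU-1", 9.99, 5)
    try:
        order.price = 1.0  # type: ignore[misc]  # mypy flags this line as well
    except FrozenInstanceError:
        return True
    return False


a = Order("SKU-1", 9.99, 5)
b = Order("SKU-1", 9.99, 5)

assert rejects_negative_price()
assert rejects_mutation()
assert a == b and hash(a) == hash(b)
assert len({a, b}) == 1  # value semantics: the set keeps a single entry
```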

For sum types (types with exactly N variants), combine frozen=True with @final:

# Python 3.8+ for @final
from dataclasses import dataclass
from typing import TypeVar, Generic, final

T = TypeVar("T")
E = TypeVar("E")

class Result(Generic[T, E]):
    """Sealed base. Only Ok and Err are valid variants."""
    __slots__ = ()

@final
@dataclass(frozen=True)
class Ok(Result[T, E]):
    value: T

@final
@dataclass(frozen=True)
class Err(Result[T, E]):
    error: E

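@final tells mypy that Ok and Err cannot be subclassed. Python has no sealed keyword, though, so the Result base class itself stays open: a checker cannot prove that Ok and Err are the only variants unless the value is typed as the union Ok[T, E] | Err[T, E]. With that union in place, match (Part 5) gives exhaustive dispatch - the Python analogue of Scala’s sealed trait.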

Python version notes
@dataclass requires Python 3.7+. The @final decorator requires Python 3.8+. slots=True as a dataclass argument (no per-instance __dict__, so smaller frozen objects) requires Python 3.10+. For 3.7-3.9 you can define __slots__ manually alongside the dataclass, as long as no field has a default value.

Part 3 - The Result type: explicit error channels

Why. Python’s standard error model is exceptions. A function’s signature says nothing about what it can fail with, and callers have no obligation to handle failures at all. This leads to missing try/except, inconsistent error propagation, and bugs that surface only in production edge cases.

The Scala solution - Either[L, R] or Try[T] - makes the error part of the return type:

def parseConfig(path: String): Either[String, Config] =
  Try(readFile(path)).toEither.left.map(_.getMessage)

The caller cannot ignore it. The signature tells the whole story.

How. Build a Result[T, E] type using Generic and frozen dataclasses:

# Python 3.8+ (dataclasses need 3.7, @final needs 3.8). Python 3.10+ for match syntax.
from __future__ import annotations
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar, final

T = TypeVar("T")
U = TypeVar("U")
E = TypeVar("E")
F = TypeVar("F")

class Result(Generic[T, E]):
    __slots__ = ()

    def map(self, f: Callable[[T], U]) -> Result[U, E]:
        raise NotImplementedError

    def flat_map(self, f: Callable[[T], Result[U, E]]) -> Result[U, E]:
        raise NotImplementedError

    def map_error(self, f: Callable[[E], F]) -> Result[T, F]:
        raise NotImplementedError

    def get_or_else(self, default: T) -> T:  # type: ignore[misc]
        raise NotImplementedError

    def get_or_raise(self) -> T:
        raise NotImplementedError

@final
@dataclass(frozen=True)
class Ok(Result[T, E]):
    value: T

    def map(self, f):         return Ok(f(self.value))
    def flat_map(self, f):    return f(self.value)
    def map_error(self, f):   return Ok(self.value)
    def get_or_else(self, d): return self.value
    def get_or_raise(self):   return self.value

@final
@dataclass(frozen=True)
class Err(Result[T, E]):
    error: E

    def map(self, f):         return Err(self.error)
    def flat_map(self, f):    return Err(self.error)
    def map_error(self, f):   return Err(f(self.error))
    def get_or_else(self, d): return d
    def get_or_raise(self):   raise RuntimeError(str(self.error))

What you get. Functions that can fail announce it:

import json
from pathlib import Path

def try_load_config(path: str) -> Result[dict, str]:
    try:
        return Ok(json.loads(Path(path).read_text()))
    except FileNotFoundError:
        return Err(f"file not found: {path}")
    except json.JSONDecodeError as exc:
        return Err(f"invalid JSON: {exc}")

Callers can’t reach the value without deciding what happens on failure. They choose how to handle it:

# Option A: structural pattern match (Python 3.10+)
match try_load_config("config.json"):
    case Ok(value=cfg):
        start_service(cfg)
    case Err(error=msg):
        raise SystemExit(f"startup failed: {msg}")

# Option B: combinator chain (all versions, works like Scala's for-comprehension)
result = (
    try_load_config("app.json")
    .flat_map(try_validate_schema)
    .flat_map(try_build_pipeline)
    .map_error(lambda e: f"pipeline init: {e}")
)

The flat_map chain is railway-oriented: the first Err short-circuits everything after it. No nested try/except. No if result is None. The error path is explicit and unavoidable.

The rule I follow: pure functions return Result. Effectful functions (I/O, network, filesystem) can raise. The try_* wrapper converts the exception to an Err at the boundary.
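
The boundary conversion can be factored into one generic helper, so each try_* wrapper does not need its own try/except. This is a sketch under simplifying assumptions: attempt is a hypothetical name, and Ok and Err here are cut-down stand-ins for the Part 3 types, with the error fixed to str.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Type, TypeVar, Union

T = TypeVar("T")


@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T


@dataclass(frozen=True)
class Err:
    error: str


def attempt(thunk: Callable[[], T], *catch: Type[Exception]) -> Union[Ok[T], Err]:
    """Run thunk; turn the listed exception types into Err, let the rest propagate."""
    try:
        return Ok(thunk())
    except catch as exc:  # catch is a tuple of exception classes
        return Err(f"{type(exc).__name__}: {exc}")


parsed = attempt(lambda: int("42"), ValueError)
failed = attempt(lambda: int("forty-two"), ValueError)

assert parsed == Ok(42)
assert isinstance(failed, Err) and failed.error.startswith("ValueError")
```

Listing the exception types explicitly is deliberate: anything not named is a bug rather than an expected failure, and it should still crash loudly.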

flowchart LR
    L["try_load_config()"] --> D1{"Ok or Err?"}
    D1 -->|"Ok(config)"| V["try_validate_schema()"]
    D1 -->|"Err(msg)"| EXIT["Err - chain stops"]
    V --> D2{"Ok or Err?"}
    D2 -->|"Ok(valid)"| B["try_build_pipeline()"]
    D2 -->|"Err(msg)"| EXIT
    B --> D3{"Ok or Err?"}
    D3 -->|"Ok(pipeline)"| RUN["run(pipeline)"]
    D3 -->|"Err(msg)"| EXIT

    style L fill:#e1f5ff,stroke:#0066cc,color:#000
    style D1 fill:#fff4e1,stroke:#cc8800,color:#000
    style D2 fill:#fff4e1,stroke:#cc8800,color:#000
    style D3 fill:#fff4e1,stroke:#cc8800,color:#000
    style V fill:#f0e1ff,stroke:#8800cc,color:#000
    style B fill:#f0e1ff,stroke:#8800cc,color:#000
    style RUN fill:#e1ffe1,stroke:#2d7a2d,color:#000
    style EXIT fill:#ffe1e1,stroke:#cc0000,color:#000

Part 4 - Protocol as structural type class

Why. The classical OOP solution for “this function should accept any X that has method Y” is inheritance: define a base class, subclass it. The problem is tight coupling. Your BaseExtractor, once exported, is part of your public API forever. Every library that wants to plug in must import it. A plain function or lambda never qualifies even if the signature matches perfectly.

Scala’s answer is type classes and structural types:

trait Extractor[A] {
  def extract(window: Array[Float], config: A): Array[Float]
}
// Any type with an instance of Extractor[A] in scope is valid

Python’s answer is Protocol - structural subtyping:

# Python 3.8+ (typing.Protocol)
from typing import Protocol, runtime_checkable
import numpy as np

@runtime_checkable
class Extractor(Protocol):
    def __call__(
        self,
        window: np.ndarray,
        config: object = None,
    ) -> np.ndarray: ...

What you get. Any callable with the right signature satisfies Extractor automatically, without inheriting from it:

def build_table(
    run: RunData,
    extractor: Extractor = default_extract,
) -> list[dict]:
    ...

# All of these work - no base class needed
build_table(run)
build_table(run, extractor=mfcc_extract)
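# The lambda mirrors the Protocol's parameter names and keeps config optional
build_table(run, extractor=lambda window, config=None: np.array([window.mean(), window.std()]))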
build_table(run, extractor=EnergyExtractor())  # any callable class instance

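@runtime_checkable lets you use isinstance(fn, Extractor) for defensive checks at boundaries. The runtime check only confirms that a __call__ attribute exists; it does not compare signatures, so mypy remains the real gatekeeper.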
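
A quick way to see how shallow the runtime check is. The sketch below swaps np.ndarray for List[float] so it stays stdlib-only; mean_extract and wrong_signature are made-up names for illustration.

```python
from typing import List, Protocol, runtime_checkable


@runtime_checkable
class Extractor(Protocol):
    # Simplified stand-in for the np.ndarray version of the protocol.
    def __call__(self, window: List[float], config: object = None) -> List[float]: ...


def mean_extract(window: List[float], config: object = None) -> List[float]:
    return [sum(window) / len(window)]


def wrong_signature() -> str:
    return "takes no window at all"


assert isinstance(mean_extract, Extractor)
# Still True: isinstance only looks for a __call__ attribute, not its signature.
assert isinstance(wrong_signature, Extractor)
assert not isinstance(42, Extractor)
```

So isinstance is good for rejecting obvious non-callables at a plugin boundary, and nothing more. Passing wrong_signature as an Extractor is still a mypy error.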

Why this is the right design: the pipeline owns the Protocol definition. Implementations don’t know about the pipeline. You can add new implementations from a different module, a different team, even a third-party package - without touching the pipeline. This is the Open-Closed Principle through structural typing, not class hierarchies.

Python version notes
Protocol requires Python 3.8+. For 3.7, install typing_extensions - from typing_extensions import Protocol. @runtime_checkable is also in typing_extensions. The rest works unchanged.

Part 5 - Exhaustive match on sealed types

Why. Adding a new variant to an enum or sealed type should force every switch/match statement on that type to be updated. In Scala, the compiler enforces this on sealed trait hierarchies - you get a warning if a match is non-exhaustive. Most Python code today uses if isinstance(x, A): ... elif isinstance(x, B): ... chains where forgetting a case is silent.

How. Python 3.10 structural pattern matching on @final frozen dataclasses:

# Python 3.10+
from result import Ok, Err, Result  # assumed: a local result.py holding the Part 3 types

def handle(result: Result[Config, str]) -> None:
    match result:
        case Ok(value=cfg):
            start(cfg)
        case Err(error=msg):
            log.error(msg)
    # Checkers verify exhaustiveness only when the subject is a closed union such as
    # Ok[Config, str] | Err[Config, str]; a plain Result base class stays open to them

Enums work the same way:

# StrEnum requires Python 3.11+. For 3.10, use str, Enum instead:
# class Status(str, Enum): ...
from enum import StrEnum

class Status(StrEnum):
    PENDING  = "pending"
    ACTIVE   = "active"
    ARCHIVED = "archived"

def describe(s: Status) -> str:
    match s:
        case Status.PENDING:  return "waiting for approval"
        case Status.ACTIVE:   return "live"
        case Status.ARCHIVED: return "read-only"
    # Delete any case above and mypy reports: Missing return statement

Match goes further than switch. You can destructure, add guards, and match on nested structure:

@dataclass(frozen=True)
class Request:
    method: str
    path: str
    body: str | None

def route(req: Request) -> str:
    match req:
        case Request(method="GET", path="/health"):
            return "ok"
        case Request(method="POST", path="/infer", body=payload) if payload:
            return infer(payload)
        case Request(method=m, path=p):
            return f"404: {m} {p}"

  • Guards (if payload) run after the structural pattern matches
  • Destructuring with body=payload binds the field value to a local name
  • case _: is your explicit fallthrough - omitting it and adding # type: ignore is a code smell

Pre-3.10 alternative. If you’re stuck on Python 3.9 or earlier, isinstance chains get you most of the way. You lose guard syntax and structured destructuring, but mypy still narrows types after each isinstance check:

# Python 3.8+ (the oldest version the Part 3 Result type supports) - no match, but type narrowing still works
def handle_legacy(result: Result[Config, str]) -> None:
    if isinstance(result, Ok):
        start(result.value)  # mypy knows result.value is Config here
    elif isinstance(result, Err):
        log.error(result.error)  # mypy knows result.error is str here

Part 6 - Literal types for controlled values

Why. A function that accepts method: str will accept anything - "GET", "GETT", "Bananas". There’s no way to say “only these four strings are legal” without runtime validation that could silently get missed.

Scala 3 expresses this with a union of literal types:

val method: "GET" | "POST" = "GET"
// "BANANAS" does not compile

How. Python’s Literal does the same:

# Python 3.8+
from typing import Literal

HttpMethod = Literal["GET", "POST", "PUT", "DELETE"]

def fetch(url: str, method: HttpMethod = "GET") -> bytes:
    ...

fetch("https://api.example.com/data", method="GETT")
# mypy: Argument "method" to "fetch" has incompatible type "Literal['GETT']"; expected "Literal['GET', 'POST', 'PUT', 'DELETE']"

Literal composes well with Union:

from typing import Union

# A result is either a status code or an error string
ApiResult = Union[Literal[200, 201, 204], str]

def call(url: str) -> ApiResult:
    ...

And it’s useful for string options that would otherwise be typed as bare str:

LogLevel = Literal["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]

def log(msg: str, level: LogLevel = "INFO") -> None:
    ...

When Literal is overkill. If the set of values changes at runtime (loaded from a database, config file, or external API), Literal won’t help - it’s a static type. Validate those values at runtime instead, for example against an Enum or a set built at startup. If the set is fixed at design time, Literal is the right tool.
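
Strings arriving from outside still need one runtime check before they can be treated as a Literal. A minimal sketch, assuming the HttpMethod alias from above; parse_method is a hypothetical helper name. typing.get_args (Python 3.8+) reads the allowed values back out of the alias, so the list is written only once.

```python
from typing import Literal, Optional, get_args

HttpMethod = Literal["GET", "POST", "PUT", "DELETE"]


def parse_method(raw: str) -> Optional[HttpMethod]:
    """Hypothetical boundary check: return raw only if it is an allowed literal."""
    for allowed in get_args(HttpMethod):
        if raw == allowed:
            return allowed
    return None


assert get_args(HttpMethod) == ("GET", "POST", "PUT", "DELETE")
assert parse_method("GET") == "GET"
assert parse_method("GETT") is None
```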

Python version notes
Literal requires Python 3.8+. For 3.7, use typing_extensions.Literal. For older versions, use Enum - it provides runtime enforcement at the cost of some ergonomics.

Part 7 - TypedDict for structured data contracts

Why. dict[str, Any] is the Object of Python - you can put anything in it and nothing warns you when you access a key that doesn’t exist or has the wrong type. Across API boundaries, between services, or in event-driven pipelines, these untyped dicts are where bugs hide.

Scala models this with case classes. The Python equivalent for existing dict-shaped data is TypedDict:

# Python 3.8+
from typing import TypedDict

class UserEvent(TypedDict):
    user_id: str
    event_type: str
    timestamp: float
    metadata: dict[str, str]

def process_event(event: UserEvent) -> None:
    # mypy knows event["user_id"] is str
    # mypy errors on event["nonexistent_key"]
    ...

For partial dicts (not all keys required), use total=False or mix Required/NotRequired:

# Python 3.11+ for Required/NotRequired in typing (backported to older versions by typing_extensions)
from typing import TypedDict, Required, NotRequired

class CreateUserRequest(TypedDict):
    name: Required[str]
    email: Required[str]
    display_name: NotRequired[str]    # optional
    role: NotRequired[str]            # optional

TypedDict is structural - any dict with the right keys and types satisfies it. You don’t need to call a constructor. This makes it the right choice for external data (JSON from an API, database rows, config files) where you receive plain dicts but want type-checked access.

TypedDict vs dataclass
Use TypedDict when you’re consuming dicts from outside your code (APIs, JSON, db rows). Use @dataclass(frozen=True) for your own domain objects. Once you’ve validated external data into a TypedDict, convert it to a frozen dataclass as early as possible.
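
The hand-off from dict to domain object might look like the sketch below. Event and from_dict are names I made up for illustration, and storing metadata as a sorted tuple of pairs is just one way to keep the frozen object hashable.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, TypedDict


class UserEvent(TypedDict):
    user_id: str
    event_type: str
    timestamp: float
    metadata: Dict[str, str]


@dataclass(frozen=True)
class Event:
    """Hypothetical domain object built from an already-checked UserEvent."""
    user_id: str
    event_type: str
    timestamp: float
    metadata: Tuple[Tuple[str, str], ...]  # sorted (key, value) pairs, hashable

    @classmethod
    def from_dict(cls, raw: UserEvent) -> "Event":
        return cls(
            user_id=raw["user_id"],
            event_type=raw["event_type"],
            timestamp=float(raw["timestamp"]),
            metadata=tuple(sorted(raw["metadata"].items())),
        )


raw: UserEvent = {
    "user_id": "u-1",
    "event_type": "login",
    "timestamp": 1715700000.0,
    "metadata": {"ip": "203.0.113.7"},
}
event = Event.from_dict(raw)

assert event.metadata == (("ip", "203.0.113.7"),)
assert hash(event) == hash(Event.from_dict(raw))
```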

Part 8 - Generics, TypeVar, and variance

Why. A Stack that accepts Any is not a stack - it’s a bag of surprises. You push an int, you pop a str. No warning. Scala’s generics with variance annotations prevent this at compile time.

How. Python’s TypeVar with Generic:

# Python 3.5+ for Generic/TypeVar. Python 3.12+ for class Stack[T] syntax.
from typing import TypeVar, Generic

T = TypeVar("T")

class Stack(Generic[T]):
    def __init__(self) -> None:
        self._items: list[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()

    def peek(self) -> T:
        return self._items[-1]

stack: Stack[int] = Stack()
stack.push(42)        # ok
stack.push("hello")   # mypy: Argument 1 to "push" has incompatible type "str"; expected "int"

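Variance controls whether a container of Dog can stand in for a container of Animal. The mutable Stack[T] above is invariant: it both accepts and returns T, so neither substitution is safe. Read-only and write-only containers can relax that:

# Covariance: ReadOnlyStack[Dog] is-a ReadOnlyStack[Animal]  (Scala: class ReadOnlyStack[+T])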
T_co = TypeVar("T_co", covariant=True)

class ReadOnlyStack(Generic[T_co]):
    def peek(self) -> T_co: ...

# Contravariance: Sink[Animal] is-a Sink[Dog]  (Scala: class Sink[-T])
T_contra = TypeVar("T_contra", contravariant=True)

class Sink(Generic[T_contra]):
    def consume(self, item: T_contra) -> None: ...

TypeVar with bounds - like Scala’s T <: Animal:

from typing import Any, Protocol, TypeVar

class SupportsLessThan(Protocol):
    # Any rather than object: int.__lt__ only accepts int, so "other: object" would reject int
    def __lt__(self, other: Any) -> bool: ...

Comparable = TypeVar("Comparable", bound=SupportsLessThan)

def minimum(a: Comparable, b: Comparable) -> Comparable:
    return a if a < b else b

Python 3.12+ shorthand drops the TypeVar boilerplate:

# Python 3.12+
class Stack[T]:
    def push(self, item: T) -> None: ...
    def pop(self) -> T: ...

def minimum[T: SupportsLessThan](a: T, b: T) -> T:
    return a if a < b else b

This is PEP 695 - same semantics, much less ceremony.


Part 9 - Overloaded signatures with @overload

Why. Some functions behave differently based on input type - not because they’re poorly designed, but because the domain genuinely calls for it. json.loads returns Any, even when you know that your payloads always decode to a dict. @overload lets you encode the relationship between input type and return type precisely.

# Python 3.5+
import json
from typing import overload

@overload
def parse(data: str) -> dict: ...
@overload
def parse(data: bytes) -> dict: ...
@overload
def parse(data: dict) -> dict: ...

def parse(data):
    if isinstance(data, (str, bytes)):
        return json.loads(data)
    return data

The @overload stubs are type-checker-only - they never execute. The final implementation without @overload is what runs. Mypy sees the stubs and enforces the return type per input type.
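
The pattern pays off most when the overloads return different types. A hedged sketch of that case: read_payload and its as_text flag are hypothetical, chosen only to show a return type that depends on a Literal argument.

```python
from typing import Literal, Union, overload


@overload
def read_payload(raw: bytes, as_text: Literal[True]) -> str: ...
@overload
def read_payload(raw: bytes, as_text: Literal[False] = ...) -> bytes: ...


def read_payload(raw: bytes, as_text: bool = False) -> Union[str, bytes]:
    """Hypothetical helper: decode to str only when the caller asks for text."""
    return raw.decode("utf-8") if as_text else raw


text = read_payload(b"ok", True)  # mypy infers str
blob = read_payload(b"ok")        # mypy infers bytes

assert text == "ok"
assert blob == b"ok"
```

Callers of read_payload never see the Union; each call site gets exactly the type its arguments imply.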

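Where this shines: any function whose return type depends on its input. In the parse example every overload returns dict, so the gain is three precise signatures instead of one union-typed parameter. When the return types differ, the alternative is a union return such as -> str | bytes | dict, which forces callers to check types themselves. With @overload, the correct type flows through:

result = parse(b'{"key": "value"}')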
# mypy knows result: dict - no isinstance check needed downstream

Part 10 - TypeGuard for safe narrowing

Why. isinstance(x, str) narrows the type inside the if block. But custom validation functions don’t narrow anything - mypy has no idea that is_valid_email(s) implies s: str. TypeGuard bridges that gap.

# Python 3.10+
from typing import TypeGuard

def is_non_empty_string(val: object) -> TypeGuard[str]:
    return isinstance(val, str) and len(val) > 0

data: object = get_user_input()

if is_non_empty_string(data):
    # mypy knows data: str here
    print(data.upper())

This is especially useful for custom validators and parsers at I/O boundaries:

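def is_event_dict(obj: object) -> TypeGuard[UserEvent]:
    # mypy trusts this function blindly, so check every key UserEvent declares
    return (
        isinstance(obj, dict)
        and isinstance(obj.get("user_id"), str)
        and isinstance(obj.get("event_type"), str)
        and isinstance(obj.get("timestamp"), (int, float))  # JSON numbers may decode as int
        and isinstance(obj.get("metadata"), dict)
    )

raw = json.loads(response.text)
if is_event_dict(raw):
    process_event(raw)  # mypy: raw is UserEvent here

Python version notes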
TypeGuard is in typing from Python 3.10. For 3.9 and earlier, use typing_extensions.TypeGuard - same API, backported. No other workaround needed.

Part 11 - Final constants and ClassVar

Why. MAX_RETRIES = 3 at module level can be reassigned by anyone, anywhere. In Scala, val MAX_RETRIES = 3 is immutable by definition. Python’s Final gives the same guarantee at the type-checker level.

# Python 3.8+
from typing import Final, ClassVar
from dataclasses import dataclass

MAX_RETRIES: Final = 3
BASE_URL: Final[str] = "https://api.example.com"

# mypy error: Cannot assign to final name "MAX_RETRIES"
MAX_RETRIES = 5

ClassVar marks attributes that belong to the class itself, so @dataclass does not turn them into fields:

@dataclass(frozen=True)
class Config:
    DEFAULT_TIMEOUT: ClassVar[float] = 30.0   # class attribute, not a field
    host: str
    port: int
    timeout: float = 30.0

Without ClassVar, @dataclass would treat DEFAULT_TIMEOUT as an instance field with a default value (and then reject the class, because host has no default and follows it). With it, Config.DEFAULT_TIMEOUT is valid but Config("localhost", 8080).DEFAULT_TIMEOUT = 60.0 is a type error.
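
dataclasses.fields makes the distinction visible at runtime. The sketch repeats the Config class so it runs standalone.

```python
from dataclasses import dataclass, fields
from typing import ClassVar


@dataclass(frozen=True)
class Config:
    DEFAULT_TIMEOUT: ClassVar[float] = 30.0  # class attribute, not a field
    host: str
    port: int
    timeout: float = 30.0


field_names = [f.name for f in fields(Config)]
cfg = Config("localhost", 8080)

# The ClassVar is not a constructor parameter and not a field...
assert field_names == ["host", "port", "timeout"]
# ...but it is still readable through the class and through instances.
assert Config.DEFAULT_TIMEOUT == 30.0
assert cfg.DEFAULT_TIMEOUT == 30.0
```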


Part 12 - Phantom types: compile-time state machines

Why. Some operations must happen in a fixed order. You should never call .build() on a pipeline builder before .add_source(). The naive protection is a runtime ValueError - and honestly, most codebases stop there. The better protection is making the wrong order unrepresentable - the type checker rejects it before the code ever runs.

This is the Builder + Phantom Types pattern from Scala. Phantom types exist only at the type-checker level; they have zero runtime overhead.

// Scala: phantom type builder
sealed trait BuilderState
sealed trait Empty     extends BuilderState
sealed trait WithSource extends BuilderState
sealed trait Complete  extends BuilderState

case class PipelineBuilder[S <: BuilderState] private (steps: List[Step]) {
  def addSource(src: Source): PipelineBuilder[WithSource] = ???
  def addSink(sink: Sink)(implicit ev: S =:= WithSource): PipelineBuilder[Complete] = ???
  def build(implicit ev: S =:= Complete): Pipeline = ???
}

Python translation uses Generic[S] where S is a Literal type acting as the phantom:

# Python 3.10+
from __future__ import annotations
from dataclasses import dataclass, field
from typing import Generic, Literal, TypeVar

# Phantom state tags - never instantiated, just used as type parameters
Empty      = Literal["empty"]
WithSource = Literal["with_source"]
Complete   = Literal["complete"]

S = TypeVar("S")

@dataclass
class PipelineBuilder(Generic[S]):
    _source: str | None = field(default=None, repr=False)
    _sink:   str | None = field(default=None, repr=False)

    @staticmethod
    def new() -> PipelineBuilder[Literal["empty"]]:
        return PipelineBuilder()

    def add_source(
        self: PipelineBuilder[Literal["empty"]],
        source: str,
    ) -> PipelineBuilder[Literal["with_source"]]:
        return PipelineBuilder(_source=source)

    def add_sink(
        self: PipelineBuilder[Literal["with_source"]],
        sink: str,
    ) -> PipelineBuilder[Literal["complete"]]:
        return PipelineBuilder(_source=self._source, _sink=sink)

    def build(
        self: PipelineBuilder[Literal["complete"]],
    ) -> str:
        return f"pipeline({self._source} -> {self._sink})"

What you get:

# Correct order - type checks pass
pipeline = PipelineBuilder.new().add_source("csv").add_sink("db").build()

# Wrong order - mypy rejects the call at the offending site
PipelineBuilder.new().add_sink("db")
# mypy: the receiver is PipelineBuilder[Literal['empty']], but add_sink's
#       self annotation requires PipelineBuilder[Literal['with_source']]

PipelineBuilder.new().add_source("csv").build()
# mypy: the receiver is PipelineBuilder[Literal['with_source']], but build's
#       self annotation requires PipelineBuilder[Literal['complete']]

The phantom type S is never stored, never allocated, never checked at runtime. It only exists so mypy can track builder state transitions. The runtime cost is zero.

flowchart LR
    N["PipelineBuilder.new()\nstate = empty"] -->|add_source| S["PipelineBuilder\nstate = with_source"]
    S -->|add_sink| C["PipelineBuilder\nstate = complete"]
    C -->|build| R["pipeline(source → sink)"]

    N -. "add_sink() ✗" .-> E1["mypy error:\nexpected with_source"]
    S -. "build() ✗" .-> E2["mypy error:\nexpected complete"]

    style N fill:#e1f5ff,stroke:#0066cc,color:#000
    style S fill:#fff4e1,stroke:#cc8800,color:#000
    style C fill:#e1ffe1,stroke:#2d7a2d,color:#000
    style R fill:#e1ffe1,stroke:#2d7a2d,color:#000
    style E1 fill:#ffe1e1,stroke:#cc0000,color:#000
    style E2 fill:#ffe1e1,stroke:#cc0000,color:#000

Python version notes
The explicit self: PipelineBuilder[...] annotation is not a 3.10 feature; it works wherever Generic and Literal do (3.8+). What ties the listing to 3.10 is the str | None syntax. On 3.8-3.9 the same pattern works: keep from __future__ import annotations so the self annotations are not evaluated at runtime, and write Optional[str] instead of str | None. On 3.7, where Literal is missing from typing, an empty marker class per state does the same job.

Part 13 - Smart constructors: parse, don’t validate

Why. There’s a subtle but important difference between “validate” and “parse.” Validation checks whether data is valid but returns the same unproven type. Parsing produces a new type that carries proof of the validation.

With __post_init__ (Part 2), an Order that passes construction is valid - but the type is still called Order and any function that receives an Order can’t tell from the type whether it’s been checked. This is fine for small codebases. In larger ones, you want the type itself to be proof.

// Scala: private constructor + smart constructor returning Either
final case class Email private (value: String)
object Email {
  def parse(raw: String): Either[String, Email] =
    if (raw.contains("@")) Right(Email(raw))
    else Left(s"invalid email: $raw")
}

Python translation with a private constructor pattern:

# Python 3.7+ (dataclasses); Result, Ok and Err come from Part 3
from __future__ import annotations
from dataclasses import dataclass

@dataclass(frozen=True)
class Email:
    _value: str  # convention: treat as private

    @classmethod
    def parse(cls, raw: str) -> Result[Email, str]:
        raw = raw.strip().lower()
        if "@" not in raw or raw.count("@") != 1:
            return Err(f"invalid email format: {raw!r}")
        local, domain = raw.split("@")
        if not local or not domain or "." not in domain:
            return Err(f"invalid email structure: {raw!r}")
        return Ok(cls(_value=raw))

    @property
    def value(self) -> str:
        return self._value

    def domain(self) -> str:
        return self._value.split("@")[1]

Usage - the caller always handles both outcomes:

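# register and send_welcome are illustrative names
def register(raw: str) -> Result[None, str]:
    match Email.parse(raw):
        case Ok(value=email):
            send_welcome(email)
            return Ok(None)
        case Err(error=msg):
            return Err(f"registration failed: {msg}")
    raise AssertionError("unreachable: Result has only Ok and Err")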

The key difference from __post_init__. __post_init__ raises. The function’s type signature doesn’t tell you it can fail. parse returns Result[Email, str] - callers know at a glance that construction can fail and what the failure type is. The type signature is the contract.

Stricter enforcement with __new__. If you want to truly block direct construction:

class Email:
    __slots__ = ("_value",)

    def __new__(cls, *args, **kwargs):
        raise TypeError("use Email.parse(raw) to construct")

    @classmethod
    def _trusted(cls, value: str) -> Email:
        obj = object.__new__(cls)
        object.__setattr__(obj, "_value", value)
        return obj

    @classmethod
    def parse(cls, raw: str) -> Result[Email, str]:
        ...
        return Ok(cls._trusted(raw))

This is heavier - use it when you genuinely want to prevent accidental direct construction.
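
To see the blocked constructor end to end, here is a minimal runnable sketch. The Ok and Err classes are simplified stand-ins for the Result types used elsewhere in this post, and the parsing rule is deliberately crude; treat both as assumptions for illustration, not as a real email validator.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


# Simplified stand-ins for the Ok/Err types used throughout this post.
@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    error: E


class Email:
    __slots__ = ("_value",)

    def __new__(cls, *args: object, **kwargs: object) -> Email:
        # Direct construction is always rejected.
        raise TypeError("use Email.parse(raw) to construct")

    @classmethod
    def _trusted(cls, value: str) -> Email:
        # Bypass the blocked __new__ by going straight to object.__new__.
        obj = object.__new__(cls)
        object.__setattr__(obj, "_value", value)
        return obj

    @classmethod
    def parse(cls, raw: str) -> Union[Ok[Email], Err[str]]:
        cleaned = raw.strip().lower()
        local, sep, domain = cleaned.partition("@")
        if not sep or "@" in domain or not local or "." not in domain:
            return Err(f"invalid email: {raw!r}")
        return Ok(cls._trusted(cleaned))

    @property
    def value(self) -> str:
        return self._value


parsed = Email.parse("  Alice@Example.com ")
rejected = Email.parse("not-an-email")
```

Calling Email("alice@example.com") directly raises TypeError, so the only way to hold an Email is to have gone through parse. Note that _trusted is still reachable by anyone who ignores the underscore; Python gives you a strong hint here, not a hard guarantee.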


Part 14 - Accumulating errors: the Validated pattern

Why. Result[T, E] is fail-fast. The first Err short-circuits the chain. That’s exactly right for sequential logic - if step 1 fails, step 2 can’t proceed.

But form validation is different. When a user submits a registration form with an invalid email, a short password, and a missing name, you want all three errors at once - not just the first one. Fail-fast would mean three round trips to fix one form.

Scala’s Cats library solves this with Validated[E, A], where E has a Semigroup instance (so errors can be combined):

def validateUser(name: String, email: String, age: Int):
    Validated[List[String], User] =
  validateName(name)
    .product(validateEmail(email))
    .product(validateAge(age))
    .map { case ((n, e), a) => User(n, e, a) }

Python translation - pure stdlib, no cats:

# Python 3.7+ (dataclasses)
from __future__ import annotations
from dataclasses import dataclass
from typing import Any, Generic, TypeVar

T = TypeVar("T")
U = TypeVar("U")

@dataclass(frozen=True)
class Valid(Generic[T]):
    value: T
    errors: tuple[str, ...] = ()

    @staticmethod
    def ok(value: T) -> Valid[T]:
        return Valid(value=value)

    @staticmethod
    def err(*errors: str) -> Valid[Any]:
        # value is only a placeholder here; Valid[Any] lets check functions
        # annotated Valid[str] or Valid[int] return this without a type error
        return Valid(value=None, errors=errors)

    @property
    def is_valid(self) -> bool:
        return len(self.errors) == 0

    def and_then(self, other: Valid[U]) -> Valid[tuple[T, U] | None]:
        combined = self.errors + other.errors
        if combined:
            return Valid(value=None, errors=combined)  # type: ignore[arg-type]
        return Valid(value=(self.value, other.value))  # type: ignore[arg-type]


def check_name(name: str) -> Valid[str]:
    cleaned = name.strip()  # measure the stripped name, not the raw input
    if not cleaned:
        return Valid.err("name cannot be empty")
    if len(cleaned) < 2:
        return Valid.err("name must be at least 2 characters")
    return Valid.ok(cleaned)

def check_email(email: str) -> Valid[str]:
    if "@" not in email:
        return Valid.err("email must contain @")
    return Valid.ok(email.lower())

def check_age(age: int) -> Valid[int]:
    if age < 0 or age > 150:
        return Valid.err(f"age {age} is out of range [0, 150]")
    return Valid.ok(age)


# Collect all errors at once
name_result  = check_name("")
email_result = check_email("not-an-email")
age_result   = check_age(-5)

combined = name_result.and_then(email_result).and_then(age_result)
print(combined.errors)
# ("name cannot be empty", "email must contain @", "age -5 is out of range [0, 150]")
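
On the success path, chaining and_then twice leaves you with a nested tuple, ((name, email), age), which you still have to unpack into a domain object. One way to sidestep that is a small helper that combines three independent checks and calls a builder only when all of them passed. The sketch below is self-contained; map3 and the simplified Valid are hypothetical names for illustration, and it assumes None is never a legitimate payload.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Generic, Optional, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")
R = TypeVar("R")


@dataclass(frozen=True)
class Valid(Generic[A]):
    value: Optional[A]
    errors: Tuple[str, ...] = ()


def map3(
    va: Valid[A], vb: Valid[B], vc: Valid[C], build: Callable[[A, B, C], R]
) -> Valid[R]:
    """Combine three independent checks; call build only if all passed."""
    errors = va.errors + vb.errors + vc.errors
    if errors:
        return Valid(value=None, errors=errors)
    # No errors, so every value is present (assumes None is never a payload).
    assert va.value is not None and vb.value is not None and vc.value is not None
    return Valid(value=build(va.value, vb.value, vc.value))


@dataclass(frozen=True)
class User:
    name: str
    email: str
    age: int


def check_name(name: str) -> Valid[str]:
    cleaned = name.strip()
    if not cleaned:
        return Valid(None, ("name cannot be empty",))
    return Valid(cleaned)


def check_email(email: str) -> Valid[str]:
    if "@" not in email:
        return Valid(None, ("email must contain @",))
    return Valid(email.lower())


def check_age(age: int) -> Valid[int]:
    if not 0 <= age <= 150:
        return Valid(None, (f"age {age} is out of range [0, 150]",))
    return Valid(age)


good = map3(check_name("Ada"), check_email("ada@example.com"), check_age(36), User)
bad = map3(check_name(""), check_email("nope"), check_age(-5), User)
```

Here good.value is a fully built User, while bad.value is None and bad.errors carries all three messages in order.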

When to use Valid vs Result. The rule is about dependency:

  • If step B needs the output of step A, use Result (fail-fast, sequential)
  • If steps A, B, C are independent checks, use Valid (collect-all, parallel)

Typical split: Valid at I/O boundaries (form input, config parsing, API request validation). Result inside domain logic.

flowchart TD
    Q{"Do the checks\ndepend on each other?"}
    Q -->|"Yes - step B needs step A output"| USE_R["Use Result\nfail-fast railway"]
    Q -->|"No - independent checks"| USE_V["Use Valid\ncollect all errors"]

    USE_R --> EX_R["load config\n→ validate schema\n→ build pipeline"]
    USE_V --> EX_V["check name\n+ check email\n+ check age\n→ all errors at once"]

    style Q fill:#fff4e1,stroke:#cc8800,color:#000
    style USE_R fill:#e1f5ff,stroke:#0066cc,color:#000
    style USE_V fill:#f0e1ff,stroke:#8800cc,color:#000
    style EX_R fill:#e1ffe1,stroke:#2d7a2d,color:#000
    style EX_V fill:#e1ffe1,stroke:#2d7a2d,color:#000

Part 15 - Protocol composition: small interfaces that combine

Why. Scala’s strength is fine-grained traits that compose:

trait Readable   { def read: String }
trait Writable   { def write(s: String): Unit }
trait Seekable   { def seek(pos: Int): Unit }
type ReadWrite   = Readable with Writable
type ReadWriteSeek = Readable with Writable with Seekable

Python’s Protocol composes the same way - Interface Segregation Principle enforced by the type system, not by convention.

# Python 3.8+
from typing import Protocol, runtime_checkable

@runtime_checkable
class Readable(Protocol):
    def read(self, n: int = -1) -> bytes: ...

@runtime_checkable
class Writable(Protocol):
    def write(self, data: bytes) -> int: ...

@runtime_checkable
class Seekable(Protocol):
    def seek(self, pos: int, whence: int = 0) -> int: ...
    def tell(self) -> int: ...

# Compose: a type that is both Readable and Writable
class ReadWrite(Readable, Writable, Protocol): ...

# A type that is all three
class ReadWriteSeek(Readable, Writable, Seekable, Protocol): ...

Now functions declare exactly the capability they need:

def copy_data(src: Readable, dst: Writable, chunk: int = 4096) -> int:
    total = 0
    while data := src.read(chunk):
        total += dst.write(data)
    return total

def random_access_read(src: ReadWriteSeek, pos: int, n: int) -> bytes:
    src.seek(pos)
    return src.read(n)

open("file", "rb") satisfies Readable. io.BytesIO satisfies ReadWriteSeek. A custom network stream satisfies only Readable. None of them inherit from your Protocol. They just match the shape.

The ISP payoff. copy_data doesn’t ask for Seekable - it doesn’t need it. If tomorrow you need to copy from a network stream that can’t seek (say, a socket wrapped with socket.makefile("rb")), no change is required. The function already accepts any Readable. Contrast with an inheritance-based approach where you’d need to either break the hierarchy or add a no-op seek() to the stream class.
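
A quick way to convince yourself: feed copy_data a source that has read() and nothing else. The sketch below repeats the two protocols so it runs on its own; ChunkSource is a hypothetical stand-in for a network stream, not a real socket wrapper.

```python
import io
from typing import Iterable, Protocol


class Readable(Protocol):
    def read(self, n: int = -1) -> bytes: ...


class Writable(Protocol):
    def write(self, data: bytes) -> int: ...


def copy_data(src: Readable, dst: Writable, chunk: int = 4096) -> int:
    total = 0
    while True:
        data = src.read(chunk)
        if not data:  # empty bytes means the source is exhausted
            return total
        total += dst.write(data)


class ChunkSource:
    """Hypothetical stand-in for a network stream: read() only, no seek()."""

    def __init__(self, chunks: Iterable[bytes]) -> None:
        self._chunks = iter(chunks)

    def read(self, n: int = -1) -> bytes:
        # Ignores n for brevity and hands back one queued chunk per call.
        return next(self._chunks, b"")


sink = io.BytesIO()
copied = copy_data(ChunkSource([b"abc", b"defg"]), sink)
```

ChunkSource never mentions Readable, and io.BytesIO never mentions Writable. Both are accepted because their shapes match.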

flowchart LR
    R["Readable\nread()"] --> RW["ReadWrite\nread() + write()"]
    W["Writable\nwrite()"] --> RW
    RW --> RWS["ReadWriteSeek\nread() + write() + seek()"]
    S["Seekable\nseek() + tell()"] --> RWS

    RWS --> IMPL1["io.BytesIO\nsatisfies all three"]
    R --> IMPL2["network socket\nReadable only"]
    RW --> FN1["copy_data(src, dst)\nneeds Readable + Writable"]
    RWS --> FN2["random_access_read()\nneeds all three"]

    style R fill:#e1f5ff,stroke:#0066cc,color:#000
    style W fill:#e1f5ff,stroke:#0066cc,color:#000
    style S fill:#e1f5ff,stroke:#0066cc,color:#000
    style RW fill:#f0e1ff,stroke:#8800cc,color:#000
    style RWS fill:#f0e1ff,stroke:#8800cc,color:#000
    style IMPL1 fill:#e1ffe1,stroke:#2d7a2d,color:#000
    style IMPL2 fill:#e1ffe1,stroke:#2d7a2d,color:#000
    style FN1 fill:#fff4e1,stroke:#cc8800,color:#000
    style FN2 fill:#fff4e1,stroke:#cc8800,color:#000

Part 16 - Callable type aliases: function contracts as types

Why. Functions are values in Python. When you accept a callback, a transform, or a strategy, the type is Callable[[ArgTypes], ReturnType]. Writing these inline everywhere is noisy, and there’s no name to communicate the domain meaning.

Scala handles this with type aliases and function types:

type Transformer[A, B] = A => B
type Validator[A]      = A => Either[String, A]
type EventHandler      = Event => IO[Unit]

Python’s version:

# Python 3.12+: new generic type alias syntax (PEP 695)
from typing import Callable

type Transformer[A, B] = Callable[[A], B]
type Validator[A]      = Callable[[A], "Result[A, str]"]
type Predicate[A]      = Callable[[A], bool]
type EventHandler      = Callable[[dict], None]

# Python 3.10-3.11: TypeAlias
from typing import TypeAlias
Transformer_: TypeAlias = Callable[[dict], dict]

# Python 3.5+: plain assignment (no TypeAlias annotation)
Predicate_ = Callable[[str], bool]

Now function signatures read as domain language:

def pipeline(
    source:    Callable[[], list[dict]],
    transform: Transformer[dict, dict],
    validate:  Validator[dict],
    sink:      Callable[[dict], None],
) -> Result[int, str]:
    count = 0
    for record in source():
        transformed = transform(record)
        match validate(transformed):
            case Ok(value=valid):
                sink(valid)
                count += 1
            case Err(error=msg):
                return Err(f"validation failed on record {count}: {msg}")
    return Ok(count)

The names Transformer[dict, dict] and Validator[dict] communicate intent. A future reader knows at a glance that transform converts one record into another and that validate can fail. Without the aliases, both parameters would be raw Callable[[dict], ...] signatures and you’d need to read the nested types, or the implementation, to tell them apart.
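
If you are not on 3.12 yet, you can still get generic aliases: assign a Callable that contains free TypeVars, and those TypeVars become the alias’s parameters. Here is a small sketch of that form; compose, keep_if and the Record helpers are hypothetical names for illustration.

```python
from typing import Callable, Dict, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")

# Generic aliases: the free TypeVars become the alias's parameters.
Transformer = Callable[[A], B]
Predicate = Callable[[A], bool]


def compose(f: Transformer[A, B], g: Transformer[B, C]) -> Transformer[A, C]:
    """Return a transformer that applies f, then g."""
    return lambda x: g(f(x))


def keep_if(pred: Predicate[A], f: Transformer[A, A]) -> Transformer[A, A]:
    """Apply f only to values that satisfy pred; pass the rest through."""
    return lambda x: f(x) if pred(x) else x


Record = Dict[str, int]


def has_total(r: Record) -> bool:
    return "total" in r


def double_total(r: Record) -> Record:
    return {**r, "total": r["total"] * 2}


def total_as_text(r: Record) -> str:
    return f"total={r.get('total', 0)}"


# Transformer[Record, str]: double the total when present, then render it.
render = compose(keep_if(has_total, double_total), total_as_text)
```

The signatures of compose and keep_if read like sentences, and a checker can still follow A, B and C through the composition.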

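Combine with @overload for multiple contract flavours. The stubs need one real implementation after them; the body here is only illustrative:

@overload
def make_validator(strict: Literal[True]) -> Validator[str]: ...
@overload
def make_validator(strict: Literal[False]) -> Predicate[str]: ...
def make_validator(strict: bool):
    if strict:
        return lambda s: Ok(s) if s else Err("empty string")
    return lambda s: bool(s)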

Part 17 - Extension methods: enriching types you don’t own

Why. In Scala, implicit class lets you add methods to any existing type without modifying it:

implicit class RichString(s: String) {
  def toSnakeCase: String = s.replaceAll("(?<=[a-z])([A-Z])", "_$1").toLowerCase
  def isValidEmail: Boolean = s.contains("@")
}

"MyFieldName".toSnakeCase  // "my_field_name"

Python doesn’t have implicit conversions, but there are two clean patterns that give you most of this with full type safety.

Option A: typed wrapper class. Wraps the original type, exposes only the methods you care about:

import re
from dataclasses import dataclass

@dataclass(frozen=True)
class RichStr:
    value: str

    def to_snake_case(self) -> str:
        return re.sub(r"(?<!^)(?=[A-Z])", "_", self.value).lower()

    def is_valid_email(self) -> bool:
        return "@" in self.value and self.value.count("@") == 1

    def truncate(self, max_len: int, suffix: str = "...") -> str:
        if len(self.value) <= max_len:
            return self.value
        return self.value[: max_len - len(suffix)] + suffix

# Usage
rich = RichStr("MyFieldName")
rich.to_snake_case()   # "my_field_name"
rich.is_valid_email()  # False

Option B: module-level typed functions. No wrapper, just well-typed utility functions. This is the more Pythonic form and equally type-safe:

import re

def to_snake_case(s: str) -> str:
    return re.sub(r"(?<!^)(?=[A-Z])", "_", s).lower()

def truncate(s: str, max_len: int, suffix: str = "...") -> str:
    if len(s) <= max_len:
        return s
    return s[: max_len - len(suffix)] + suffix

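When the wrapper pays off. Use RichStr when you’re chaining multiple operations and want a fluent style - RichStr(value).to_snake_case().upper() reads better than nested function calls. One catch: as written, each method returns a plain str, so only built-in str methods can follow. To chain several of your own enrichments, have the methods return RichStr. Use plain functions when you’re not chaining.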
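
For chaining your own enrichments, a variant where every method returns RichStr might look like this. It is a sketch of the same wrapper with only the return types changed; you unwrap with .value at the end.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class RichStr:
    value: str

    def to_snake_case(self) -> RichStr:
        # Insert "_" before every capital except a leading one, then lowercase.
        return RichStr(re.sub(r"(?<!^)(?=[A-Z])", "_", self.value).lower())

    def truncate(self, max_len: int, suffix: str = "...") -> RichStr:
        if len(self.value) <= max_len:
            return self
        return RichStr(self.value[: max_len - len(suffix)] + suffix)


# Each step returns RichStr, so the custom methods chain.
short = RichStr("MyFieldName").to_snake_case().truncate(8).value
```

The trade-off is the explicit .value at the boundary, which is the price of keeping the wrapper closed over its own methods.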

Extending with Protocol. If you want multiple types to support the same enrichment, define the Protocol for the base capability and write functions against it:

from typing import Literal, Protocol

class HasLength(Protocol):
    def __len__(self) -> int: ...

def is_empty(x: HasLength) -> bool:
    return len(x) == 0

def size_bucket(x: HasLength) -> Literal["small", "medium", "large"]:
    n = len(x)
    if n < 10:   return "small"
    if n < 100:  return "medium"
    return "large"

# Works for str, list, dict, bytes, any custom type with __len__
is_empty("")      # True
size_bucket([1,2,3])  # "small"

Part 18 - Recursive generics: type-safe trees and collections

Why. A tree node that holds Any children is not a tree - it’s a risk. Scala’s recursive ADTs are precise about what a node contains:

sealed trait Tree[+A]
case class Leaf[A](value: A) extends Tree[A]
case class Branch[A](left: Tree[A], right: Tree[A]) extends Tree[A]

Python handles this with recursive Generic types. The key is using from __future__ import annotations (or string literals) to break the forward reference cycle:

# Python 3.10+ (cleaner with match); 3.7+ with isinstance chains
from __future__ import annotations
from dataclasses import dataclass
from typing import Generic, TypeVar, final

A = TypeVar("A")

class Tree(Generic[A]):
    __slots__ = ()

@final
@dataclass(frozen=True)
class Leaf(Tree[A]):
    value: A

@final
@dataclass(frozen=True)
class Branch(Tree[A]):
    left:  Tree[A]
    right: Tree[A]

Recursive functions on the tree are type-safe:

from typing import Callable, TypeVar

B = TypeVar("B")

def depth(tree: Tree[A]) -> int:
    match tree:
        case Leaf():
            return 0
        case Branch(left=l, right=r):
            return 1 + max(depth(l), depth(r))
        case _:
            # Tree is an open base class, so keep a fallback for unknown subclasses
            raise TypeError(f"unexpected node: {tree!r}")

def map_tree(tree: Tree[A], f: Callable[[A], B]) -> Tree[B]:
    match tree:
        case Leaf(value=v):
            return Leaf(f(v))
        case Branch(left=l, right=r):
            return Branch(map_tree(l, f), map_tree(r, f))
        case _:
            raise TypeError(f"unexpected node: {tree!r}")

# mypy knows the result is Tree[str] when f is Callable[[int], str]
string_tree = map_tree(Branch(Leaf(1), Leaf(2)), str)

Same pattern for linked lists, expression trees (ASTs), JSON values - anything with recursive structure. mypy follows the generic parameter through all levels.
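
As one instance of "same pattern", here is a generic singly linked list. It is a sketch with hypothetical names (Cons, from_items, map_list, to_list), written with plain None checks instead of match so it also runs on 3.7-3.9.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Generic, List, Optional, TypeVar

A = TypeVar("A")
B = TypeVar("B")


@dataclass(frozen=True)
class Cons(Generic[A]):
    head: A
    tail: Optional[Cons[A]] = None  # None marks the end of the list


def from_items(*items: A) -> Optional[Cons[A]]:
    """Build a list back to front so the first item ends up at the head."""
    node: Optional[Cons[A]] = None
    for item in reversed(items):
        node = Cons(item, node)
    return node


def map_list(node: Optional[Cons[A]], f: Callable[[A], B]) -> Optional[Cons[B]]:
    if node is None:
        return None
    return Cons(f(node.head), map_list(node.tail, f))


def to_list(node: Optional[Cons[A]]) -> List[A]:
    out: List[A] = []
    while node is not None:
        out.append(node.head)
        node = node.tail
    return out


numbers = from_items(1, 2, 3)
labels = map_list(numbers, str)  # the element type changes from int to str
```

The element type A travels through tail at every level, which is exactly what the Tree example does with left and right.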


Part 19 - Type class derivation: Comparable, Serializable, Hashable by contract

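Why. In Scala, type classes like Ordering[T], Eq[T], and Show[T] are defined independently of the types they operate on. You provide an instance, the system uses it. Python’s Protocol gives you most of that separation: the contract is declared on its own and matched structurally, although the methods still live on the type rather than in a separate instance.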

The goal is a Sorter or JsonWriter that works for any type that provides the right capability, without those types knowing about the sorter:

# Python 3.8+
from dataclasses import dataclass
from typing import Protocol, runtime_checkable

# The "type class" - a capability contract
@runtime_checkable
class JsonWritable(Protocol):
    def to_json_dict(self) -> dict: ...

class XmlWritable(Protocol):
    def to_xml_str(self) -> str: ...

# The "summoner" - works for any type satisfying the protocol
def serialize_json(obj: JsonWritable) -> str:
    import json
    return json.dumps(obj.to_json_dict())

# Domain types implement the contract independently
@dataclass(frozen=True)
class User:
    name: str
    email: str

    def to_json_dict(self) -> dict:
        return {"name": self.name, "email": self.email}

@dataclass(frozen=True)
class Product:
    sku: str
    price: float

    def to_json_dict(self) -> dict:
        return {"sku": self.sku, "price": self.price}

# Both work - no shared base class, no import of serialize_json in domain types
serialize_json(User("Alice", "alice@example.com"))
serialize_json(Product("SKU-001", 9.99))

To restrict a generic function to types that provide a capability, bind the TypeVar to a Protocol. (True parametric instances, such as an instance for list[T] derived from one for T, have no direct equivalent; you would pass the instance explicitly.)

from typing import Any, Protocol, TypeVar

class Sortable(Protocol):
    def __lt__(self, other: Any) -> bool: ...

S = TypeVar("S", bound=Sortable)

def sorted_typed(items: list[S]) -> list[S]:
    return sorted(items)

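Scala 3’s given/using vs Python. Scala 3 can inject type class instances automatically. Python can’t - you pass them explicitly. But because Protocol is structural, you rarely need to pass anything at all: if the type already has the method, it qualifies. One caveat: a runtime isinstance(obj, JsonWritable) check only confirms that the method names exist, not their signatures, and it is slower than an isinstance check against an ordinary class. Keep it at the boundaries and let mypy do the real checking.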
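
It is worth seeing how shallow the runtime check is. In this sketch, WrongShape and Unrelated are hypothetical classes made up for the demonstration.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class JsonWritable(Protocol):
    def to_json_dict(self) -> dict: ...


class Good:
    def to_json_dict(self) -> dict:
        return {"ok": True}


class WrongShape:
    # Same method name, incompatible signature and return type.
    def to_json_dict(self, indent: int) -> str:
        return "not a dict"


class Unrelated:
    pass


checks = [isinstance(obj, JsonWritable) for obj in (Good(), WrongShape(), Unrelated())]
```

WrongShape passes the isinstance check even though calling to_json_dict() on it with no arguments would fail. A static checker would reject it where a JsonWritable is expected, which is why the runtime check is a convenience and not the contract.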


Part 20 - Reader monad: dependency injection without globals

Why. Real systems have configuration, loggers, database handles, clocks. Yeah, you know the pattern - config.py imported everywhere, tests that monkey-patch module-level state, surprise failures when the import order is wrong. The Scala approach threads a reader environment R through every function that needs it and keeps the functions pure:

case class Config(dbUrl: String, timeout: Int)
type Reader[A] = Config => A

def fetchUser(id: Int): Reader[User] =
  config => db.query(config.dbUrl, s"SELECT * FROM users WHERE id=$id")

Python has no native Reader type, but a frozen dataclass with a callable field - or simply functions that take config explicitly - achieves the same effect. The key is that the config is never mutated and never stored in a global:

# Python 3.7+
from __future__ import annotations
from dataclasses import dataclass
from typing import TypeVar, Generic, Callable

R = TypeVar("R")
A = TypeVar("A")
B = TypeVar("B")

@dataclass(frozen=True)
class Reader(Generic[R, A]):
    run: Callable[[R], A]

    def map(self, f: Callable[[A], B]) -> Reader[R, B]:
        return Reader(lambda r: f(self.run(r)))

    def flat_map(self, f: Callable[[A], Reader[R, B]]) -> Reader[R, B]:
        return Reader(lambda r: f(self.run(r)).run(r))

    @staticmethod
    def ask() -> Reader[R, R]:
        return Reader(lambda r: r)

    @staticmethod
    def pure(value: A) -> Reader[R, A]:
        return Reader(lambda _: value)

Usage - the environment flows down without any global access:

@dataclass(frozen=True)
class AppConfig:
    db_url: str
    timeout_s: float
    log_level: str

def fetch_user(user_id: int) -> Reader[AppConfig, str]:
    def _run(cfg: AppConfig) -> str:
        return f"SELECT * FROM users WHERE id={user_id} via {cfg.db_url}"
    return Reader(_run)

def fetch_account(account_id: int) -> Reader[AppConfig, str]:
    return Reader(lambda cfg: f"account {account_id} at {cfg.db_url}")

# Compose two readers - both share the same config
combined: Reader[AppConfig, tuple[str, str]] = (
    fetch_user(42)
    .flat_map(lambda u: fetch_account(99).map(lambda a: (u, a)))
)

cfg = AppConfig(db_url="postgres://localhost/app", timeout_s=5.0, log_level="INFO")
user_result, account_result = combined.run(cfg)

The DI payoff. fetch_user and fetch_account are pure functions. No import config at the top. No thread-local. No monkey-patching in tests. In tests, pass a different AppConfig with a test database URL. Done.
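
Here is roughly what that looks like in a test: the same program value run against two configs. This is a trimmed sketch; user_query and the two config values are hypothetical, and Reader carries only map to keep it short.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

R = TypeVar("R")
A = TypeVar("A")
B = TypeVar("B")


@dataclass(frozen=True)
class Reader(Generic[R, A]):
    run: Callable[[R], A]

    def map(self, f: Callable[[A], B]) -> Reader[R, B]:
        return Reader(lambda r: f(self.run(r)))


@dataclass(frozen=True)
class AppConfig:
    db_url: str
    timeout_s: float


def user_query(user_id: int) -> Reader[AppConfig, str]:
    # Nothing is read from a module-level config; the config arrives at run().
    return Reader(lambda cfg: f"SELECT * FROM users WHERE id={user_id} -- via {cfg.db_url}")


prod_cfg = AppConfig(db_url="postgres://prod-host/app", timeout_s=5.0)
test_cfg = AppConfig(db_url="sqlite://:memory:", timeout_s=0.1)

# Built once, with no config in sight.
program = user_query(42).map(lambda q: q + " LIMIT 1")

in_prod = program.run(prod_cfg)
in_test = program.run(test_cfg)
```

The program is assembled before any config exists. Swapping environments is a matter of which value you hand to run, with no patching involved.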


Part 21 - Writer monad: pure logging alongside computation

Why. Logging in Python typically means side effects - print() or logging.getLogger() inside business logic. That contaminates pure functions and makes testing harder.

In Scala, Cats’ Writer[W, A] carries a log alongside the value, purely:

type Writer[A] = (List[String], A)

def multiply(x: Int, y: Int): Writer[Int] =
  (List(s"multiplying $x * $y"), x * y)

Python translation:

# Python 3.7+
from __future__ import annotations
from dataclasses import dataclass
from typing import TypeVar, Generic, Callable

A = TypeVar("A")
B = TypeVar("B")

@dataclass(frozen=True)
class Writer(Generic[A]):
    value: A
    log: tuple[str, ...] = ()

    def map(self, f: Callable[[A], B]) -> Writer[B]:
        return Writer(value=f(self.value), log=self.log)

    def flat_map(self, f: Callable[[A], Writer[B]]) -> Writer[B]:
        result = f(self.value)
        return Writer(value=result.value, log=self.log + result.log)

    @staticmethod
    def pure(value: A) -> Writer[A]:
        return Writer(value=value)

    @staticmethod
    def tell(*messages: str) -> Writer[None]:
        return Writer(value=None, log=messages)

Pure business logic that produces a structured audit trail:

def parse_config(raw: dict) -> Writer[dict]:
    if "timeout" not in raw:
        raw = {**raw, "timeout": 30}
        return Writer(value=raw, log=("timeout defaulted to 30s",))
    return Writer(value=raw, log=("timeout present, no default applied",))

def validate_config(cfg: dict) -> Writer[dict]:
    if cfg.get("timeout", 0) <= 0:
        return Writer(value={**cfg, "timeout": 1}, log=("timeout was <= 0, corrected to 1s",))
    return Writer(value=cfg, log=(f"timeout validated: {cfg['timeout']}s",))

result = (
    Writer.pure({"host": "localhost"})
    .flat_map(parse_config)
    .flat_map(validate_config)
)

print(result.value)  # {"host": "localhost", "timeout": 30}
print(result.log)    # ("timeout defaulted to 30s", "timeout validated: 30s")

The entire computation is pure. The log is a value. You test it by asserting on result.log - no log capturing infrastructure needed.
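
A test for Writer-based logic can be as plain as the sketch below. The pricing functions are hypothetical, prices are integer cents to keep the arithmetic exact, and Writer is trimmed to flat_map.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Generic, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")


@dataclass(frozen=True)
class Writer(Generic[A]):
    value: A
    log: Tuple[str, ...] = ()

    def flat_map(self, f: Callable[[A], Writer[B]]) -> Writer[B]:
        result = f(self.value)
        return Writer(result.value, self.log + result.log)


def apply_discount(cents: int) -> Writer[int]:
    if cents >= 10_000:
        return Writer(cents * 9 // 10, ("10% discount applied",))
    return Writer(cents, ("no discount",))


def add_shipping(cents: int) -> Writer[int]:
    return Writer(cents + 500, ("shipping added: 500",))


def test_discount_is_logged() -> None:
    result = Writer(20_000).flat_map(apply_discount).flat_map(add_shipping)
    assert result.value == 18_500
    assert result.log == ("10% discount applied", "shipping added: 500")
```

No caplog fixture, no handler setup. The audit trail is part of the return value, so the assertion on it is ordinary equality.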


Part 22 - State monad: pure stateful computation

Why. Counters, accumulators, IDs - state that threads through a computation. The mutable approach scatters state updates throughout the call stack. In Scala, Cats’ State[S, A] makes it explicit:

type State[S, A] = S => (S, A)

def nextId: State[Int, Int] = s => (s + 1, s)

Python:

# Python 3.7+
from __future__ import annotations
from dataclasses import dataclass
from typing import TypeVar, Generic, Callable

S = TypeVar("S")
A = TypeVar("A")
B = TypeVar("B")

@dataclass(frozen=True)
class State(Generic[S, A]):
    run: Callable[[S], tuple[S, A]]

    def map(self, f: Callable[[A], B]) -> State[S, B]:
        def _run(s: S) -> tuple[S, B]:
            new_s, a = self.run(s)
            return new_s, f(a)
        return State(_run)

    def flat_map(self, f: Callable[[A], State[S, B]]) -> State[S, B]:
        def _run(s: S) -> tuple[S, B]:
            s1, a = self.run(s)
            return f(a).run(s1)
        return State(_run)

    @staticmethod
    def pure(value: A) -> State[S, A]:
        return State(lambda s: (s, value))

    @staticmethod
    def get() -> State[S, S]:
        return State(lambda s: (s, s))

    @staticmethod
    def put(new_state: S) -> State[S, None]:
        return State(lambda _: (new_state, None))

    @staticmethod
    def modify(f: Callable[[S], S]) -> State[S, None]:
        return State(lambda s: (f(s), None))

Assign unique IDs to nodes without mutating any shared counter:

@dataclass(frozen=True)
class Node:
    node_id: int
    name: str

def label_node(name: str) -> State[int, Node]:
    return State.get().flat_map(
        lambda counter: State.modify(lambda c: c + 1).map(
            lambda _: Node(node_id=counter, name=name)
        )
    )

label_a = label_node("A")
label_b = label_node("B")
label_c = label_node("C")

program: State[int, list[Node]] = (
    label_a.flat_map(lambda a:
    label_b.flat_map(lambda b:
    label_c.map(lambda c: [a, b, c])))
)

initial_counter = 0
final_counter, nodes = program.run(initial_counter)
# final_counter = 3
# nodes = [Node(0, "A"), Node(1, "B"), Node(2, "C")]

When to prefer State over plain mutation. When the state needs to be rolled back, replayed, or tested in isolation, State makes that trivially easy - pass a different initial state. With shared mutable state, rollback means saving and restoring snapshots.
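
To make "pass a different initial state" concrete, here is a trimmed sketch. The names next_id and label are hypothetical, and State carries only flat_map and map.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Generic, List, Tuple, TypeVar

S = TypeVar("S")
A = TypeVar("A")
B = TypeVar("B")


@dataclass(frozen=True)
class State(Generic[S, A]):
    run: Callable[[S], Tuple[S, A]]

    def flat_map(self, f: Callable[[A], State[S, B]]) -> State[S, B]:
        def _run(s: S) -> Tuple[S, B]:
            s1, a = self.run(s)
            return f(a).run(s1)
        return State(_run)

    def map(self, f: Callable[[A], B]) -> State[S, B]:
        return self.flat_map(lambda a: State(lambda s: (s, f(a))))


def next_id() -> State[int, int]:
    # Return the current counter as the value and bump the state by one.
    return State(lambda n: (n + 1, n))


def label(name: str) -> State[int, str]:
    return next_id().map(lambda i: f"{name}-{i}")


program: State[int, List[str]] = label("a").flat_map(
    lambda x: label("b").map(lambda y: [x, y])
)

from_zero = program.run(0)
from_hundred = program.run(100)
replayed = program.run(0)  # same input, same output, nothing to reset
```

Replay is just calling run again. There is no counter to reset between runs, because no counter exists outside the value you pass in.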


Part 23 - Semigroup and Monoid: algebraic combining

Why. Combining things is everywhere: merging configs, accumulating errors, concatenating logs, summing metrics. Most Python code handles this with ad-hoc if x is None: ... else: ... logic scattered in every loop. Look, it works - until you have twelve things to combine and six different combining strategies.

Scala’s Cats library encodes this as a law-governed abstraction:

trait Semigroup[A] { def combine(x: A, y: A): A }
trait Monoid[A] extends Semigroup[A] { def empty: A }

Python with Protocol:

# Python 3.8+
from typing import Protocol, TypeVar, runtime_checkable

A = TypeVar("A")

@runtime_checkable
class Semigroup(Protocol[A]):
    def combine(self, other: A) -> A: ...

@runtime_checkable
class Monoid(Semigroup[A], Protocol[A]):
    @staticmethod
    def empty() -> A: ...

Concrete instances:

from __future__ import annotations  # Metrics names itself in its own method signatures
from dataclasses import dataclass
from functools import reduce

@dataclass(frozen=True)
class Metrics:
    requests: int
    errors: int
    latency_ms: float

    def combine(self, other: Metrics) -> Metrics:
        return Metrics(
            requests=self.requests + other.requests,
            errors=self.errors + other.errors,
            latency_ms=max(self.latency_ms, other.latency_ms),
        )

    @staticmethod
    def empty() -> Metrics:
        return Metrics(requests=0, errors=0, latency_ms=0.0)

@dataclass(frozen=True)
class ConfigFragment:
    settings: dict

    def combine(self, other: ConfigFragment) -> ConfigFragment:
        return ConfigFragment(settings={**self.settings, **other.settings})

    @staticmethod
    def empty() -> ConfigFragment:
        return ConfigFragment(settings={})

Now write one generic fold that works for any Monoid:

from typing import Iterable

def fold(items: Iterable[A], m: type) -> A:
    return reduce(lambda acc, x: acc.combine(x), items, m.empty())

fragments = [
    ConfigFragment({"host": "localhost"}),
    ConfigFragment({"port": 5432}),
    ConfigFragment({"timeout": 30}),
]

merged_config = fold(fragments, ConfigFragment)
# ConfigFragment(settings={"host": "localhost", "port": 5432, "timeout": 30})

all_metrics = fold([
    Metrics(100, 2, 50.0),
    Metrics(200, 5, 120.0),
    Metrics(50, 0, 30.0),
], Metrics)
# Metrics(requests=350, errors=7, latency_ms=120.0)

One function. Works for Metrics, ConfigFragment, error lists, anything that implements combine + empty. The abstraction pays off when you have many things to fold - you write the combining logic once, in the type, not in every loop.
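
"Law-governed" is the part that is easy to skip in Python, because nothing checks it for you. A cheap habit is to assert the laws on a few samples. The sketch below repeats Metrics so it runs on its own; obeys_monoid_laws and average are hypothetical helpers, and a sample-based check is evidence, not proof.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    requests: int
    errors: int
    latency_ms: float  # worst latency seen, so max() is the combiner

    def combine(self, other: Metrics) -> Metrics:
        return Metrics(
            requests=self.requests + other.requests,
            errors=self.errors + other.errors,
            latency_ms=max(self.latency_ms, other.latency_ms),
        )

    @staticmethod
    def empty() -> Metrics:
        return Metrics(requests=0, errors=0, latency_ms=0.0)


def obeys_monoid_laws(a: Metrics, b: Metrics, c: Metrics) -> bool:
    """Check associativity and both identity laws for one sample triple."""
    associative = a.combine(b).combine(c) == a.combine(b.combine(c))
    left_identity = Metrics.empty().combine(a) == a
    right_identity = a.combine(Metrics.empty()) == a
    return associative and left_identity and right_identity


def average(x: float, y: float) -> float:
    """A tempting combiner for latency that is NOT associative."""
    return (x + y) / 2


samples = (Metrics(100, 2, 50.0), Metrics(200, 5, 120.0), Metrics(50, 0, 30.0))
laws_hold = obeys_monoid_laws(*samples)

left_first = average(average(50.0, 120.0), 30.0)   # 57.5
right_first = average(50.0, average(120.0, 30.0))  # 62.5
```

The averaging combiner gives different answers depending on grouping, so a fold over it would depend on how the work was chunked. That is the kind of bug the laws rule out. Note also that 0.0 is only an identity for max because latencies are never negative.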


Part 24 - F-bounded polymorphism: self-referential types

Why. Sometimes a method should return the type of the concrete subclass, not the base class. In Scala this is natural with F <: Sortable[F]:

trait Sortable[F <: Sortable[F]] { def compareTo(other: F): Int }
case class Score(value: Int) extends Sortable[Score] {
  def compareTo(other: Score) = value - other.value
}

Python translation uses a bound TypeVar pointing to the implementing class:

# Python 3.8+
from __future__ import annotations
from typing import Any, Protocol, TypeVar

T_contra = TypeVar("T_contra", contravariant=True)

class Comparable(Protocol[T_contra]):
    # Only the method implementers must supply. Default methods such as
    # __lt__ would reach explicit subclasses only, not structural matches,
    # so they are left out of the protocol.
    def compare_to(self, other: T_contra) -> int: ...

# A TypeVar bound cannot mention the TypeVar itself, so this is the closest
# Python spelling of F <: Comparable[F].
F = TypeVar("F", bound="Comparable[Any]")

Now min_of preserves the concrete type through generics:

from dataclasses import dataclass

def min_of(a: F, b: F) -> F:
    return a if a.compare_to(b) <= 0 else b

@dataclass(frozen=True)
class Score:
    value: int

    def compare_to(self, other: Score) -> int:
        return self.value - other.value

@dataclass(frozen=True)
class Version:
    major: int
    minor: int

    def compare_to(self, other: Version) -> int:
        if self.major != other.major:
            return self.major - other.major
        return self.minor - other.minor


min_of(Score(3), Score(7))               # Score(3) - return type is Score, not Comparable
min_of(Version(1, 2), Version(1, 3))     # Version(1, 2)

Practical use cases. Domain objects that need ordering (Score, Timestamp, Version, Money), builder types that return Self from mutator methods, and any recursive generic that needs to preserve the subtype in its return position.

Python 3.11+ shortcut. The stdlib added Self in typing:

from typing import Self

class Builder:
    def set_name(self, name: str) -> Self:
        self._name = name
        return self

    def set_timeout(self, seconds: float) -> Self:
        self._timeout = seconds
        return self

Self is the simpler form of F-bounded polymorphism for fluent builder chains.
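
Before 3.11, the same effect comes from annotating self with a bound TypeVar. A minimal sketch, with RetryingBuilder as a hypothetical subclass added only to show that the subtype survives the chain:

```python
from typing import Optional, TypeVar

B = TypeVar("B", bound="Builder")


class Builder:
    def __init__(self) -> None:
        self._name: Optional[str] = None
        self._timeout: float = 30.0

    def set_name(self: B, name: str) -> B:
        self._name = name
        return self

    def set_timeout(self: B, seconds: float) -> B:
        self._timeout = seconds
        return self


class RetryingBuilder(Builder):
    """Hypothetical subclass, here only to show the subtype is preserved."""

    def __init__(self) -> None:
        super().__init__()
        self._retries = 0

    def set_retries(self, n: int) -> "RetryingBuilder":
        self._retries = n
        return self


# set_name and set_timeout are typed to return RetryingBuilder here,
# so set_retries is still visible to the checker at the end of the chain.
builder = RetryingBuilder().set_name("nightly").set_timeout(5.0).set_retries(3)
```

Had set_name been annotated as returning Builder, the final set_retries call would be a type error even though it works at runtime.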


Python version compatibility: what works where

This table summarises everything above. Run python --version and pick your row:

Feature                          Module                Min version  Notes
-------------------------------  --------------------  -----------  -----------------------------------------
Type hints on functions          typing                3.5          PEP 484
TypeVar, Generic                 typing                3.5
NewType                          typing                3.5.2        Became a class in 3.10
@dataclass                       dataclasses           3.7
Protocol                         typing                3.8          Use typing_extensions for 3.7
Literal                          typing                3.8          Use typing_extensions for 3.7
TypedDict                        typing                3.8          Use typing_extensions for 3.7
@final decorator                 typing                3.8
Final, ClassVar                  typing                3.8          ClassVar alone is available from 3.5.3
@overload                        typing                3.5          Works everywhere
X | Y union syntax               built-in              3.10         Use Union[X, Y] before 3.10
match statement                  built-in              3.10         PEP 634. No stdlib backport.
TypeGuard                        typing                3.10         Use typing_extensions before 3.10
TypeAlias                        typing                3.10
slots=True in @dataclass         dataclasses           3.10         Manual __slots__ before
Self type (F-bounded shortcut)   typing                3.11         Use TypeVar("T", bound="MyClass") before
Required/NotRequired             typing                3.11         typing_extensions for 3.9-3.10
type X = ... soft alias          built-in              3.12         PEP 695. Use TypeAlias before
class Stack[T]: syntax           built-in              3.12         PEP 695. Use Generic[T] before
Callable[[A], B] type alias      typing                3.5          type X = ... syntax in 3.12
Generic[S] phantom type builder  typing                3.5          No backport needed
@classmethod smart constructor   built-in              3.0          Result[T, str] return type from 3.9+
Recursive Generic[A] (trees)     typing                3.7          from __future__ import annotations needed
Protocol composition             typing                3.8          Multiple Protocol inheritance works on 3.8+
Reader/Writer/State monads       dataclasses + typing  3.7          Pure Python, no library needed
Semigroup/Monoid via Protocol    typing                3.8          functools.reduce for fold

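The minimum viable setup for nearly all of these patterns is Python 3.10: it gets you match, TypeGuard, X | Y union syntax, Literal, Protocol, frozen dataclasses, and @final. Only Self (3.11) and the PEP 695 syntax (3.12) need something newer, and both have the fallbacks listed in the table. That’s the version where this whole system clicks.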

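For teams stuck on 3.8 or 3.9: install typing_extensions. It backports the typing names in the table (TypeGuard, TypeAlias, Self, Required/NotRequired). It cannot backport syntax, so match, the X | Y operator and the PEP 695 forms stay out of reach.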


Putting it all together: a design session

Here’s a concrete example combining the core patterns on a single domain problem - a task scheduler - to show how they interact:

from __future__ import annotations
from dataclasses import dataclass, field
from typing import (
    Callable, ClassVar, Final, Generic, Literal,
    NewType, Protocol, TypedDict, TypeVar, final, overload
)

# --- Pattern 1: domain NewTypes ---
TaskId    = NewType("TaskId", str)
Priority  = NewType("Priority", int)
Seconds   = NewType("Seconds", float)

# --- Pattern 6: Literal for controlled values ---
RetryPolicy = Literal["none", "linear", "exponential"]
RunState    = Literal["pending", "running", "done", "failed"]

# --- Pattern 7: TypedDict for external event shape ---
class TaskEvent(TypedDict):
    task_id: str
    state: str
    elapsed_s: float

# --- Pattern 2: frozen dataclass as ADT ---
@dataclass(frozen=True)
class Task:
    DEFAULT_PRIORITY: ClassVar[Priority] = Priority(5)   # Pattern 11: ClassVar
    # Pattern 11: Final. Nesting Final inside ClassVar is accepted from Python 3.13;
    # older type checkers may reject it.
    MAX_RETRIES:      ClassVar[Final[int]] = 3

    task_id:      TaskId
    name:         str
    priority:     Priority  = field(default_factory=lambda: Priority(5))
    timeout_s:    Seconds   = Seconds(30.0)
    retry_policy: RetryPolicy = "none"

    def __post_init__(self) -> None:
        if self.priority < 0 or self.priority > 10:
            raise ValueError(f"priority must be in [0,10], got {self.priority}")
        if self.timeout_s <= 0:
            raise ValueError(f"timeout_s must be positive, got {self.timeout_s}")

# --- Pattern 4: Protocol as structural type class ---
T = TypeVar("T")
E = TypeVar("E")

class TaskRunner(Protocol):
    def __call__(self, task: Task) -> Result[RunState, str]: ...

# --- Pattern 3: Result type ---
class Result(Generic[T, E]):
    __slots__ = ()

@final
@dataclass(frozen=True)
class Ok(Result[T, E]):
    value: T

@final
@dataclass(frozen=True)
class Err(Result[T, E]):
    error: E

# --- Pattern 9: overloaded signatures ---
@overload
def make_task(spec: str) -> Task: ...
@overload
def make_task(spec: dict[str, str]) -> Task: ...
def make_task(spec: str | dict[str, str]) -> Task:
    # A bare string doubles as both id and name; a dict must carry "id" and "name".
    if isinstance(spec, str):
        return Task(task_id=TaskId(spec), name=spec)
    return Task(task_id=TaskId(spec["id"]), name=spec["name"])

# --- Using it all together with Pattern 5: exhaustive match ---
def schedule(task: Task, runner: TaskRunner) -> None:
    result = runner(task)
    match result:
        case Ok(value=state):
            print(f"{task.name} finished: {state}")
        case Err(error=reason):
            print(f"{task.name} failed: {reason}")

Notice what this design prevents:

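  • Task(..., priority=Priority(11)) raises ValueError in __post_init__ - an out-of-range task never gets scheduled. Priority(11) on its own does not raise, because NewType adds no runtime check
  • TaskRunner protocol is satisfied by any callable with the right signature - no base class
  • runner(task) returns Result, so a caller that wants the RunState has to unwrap it and face the Err case
  • match on Ok/Err covers both variants today. For mypy to flag a missing case after a third variant is added, Result has to be a closed union (Ok | Err) and the match has to end in a case _: assert_never(result) branch; with a plain base class the checker cannot know the full set of subclasses
  • Passing a plain str where a TaskId is expected is a mypy error - "user-input-here" has to go through TaskId(...) first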

What you actually gain

These patterns don’t remove all bugs. Python’s runtime doesn’t enforce NewType. A sufficiently determined programmer can bypass any of this.

What they do:

Self-documenting signatures. def build(rate: Hertz, duration: Seconds) -> Result[Table, str] tells you units, that it can fail, and what success looks like. You read the signature once. You don’t read the implementation to understand what you’re calling.

Cheap refactoring. Change timeout: float to timeout: Seconds and mypy finds every call site that still passes a bare float and needs wrapping in Seconds(...). Not grep-and-hope - structured type-checking across the whole codebase.
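
A small sketch of that refactor. wait_budget is a hypothetical helper, not part of the scheduler above, and the mypy message in the comment is paraphrased:

```python
from typing import NewType

Seconds = NewType("Seconds", float)


def wait_budget(timeout: Seconds, retries: int) -> Seconds:
    """Total time budget across the first attempt plus retries."""
    # Arithmetic on a NewType yields a plain float, so the result is re-wrapped.
    return Seconds(timeout * (retries + 1))


# wait_budget(30.0, 2)   # mypy rejects this: a float is not a Seconds
budget = wait_budget(Seconds(30.0), 2)
```

At runtime Seconds(...) returns its argument unchanged, so budget is an ordinary float; the distinction exists only for the type checker.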

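Forced error awareness. Result is contagious. Once a function returns Result, a caller that wants the value can't reach it without unwrapping, so the Err branch is in plain sight. The lazy path is the correct path - you have to do something with both Ok and Err. The one escape hatch is discarding the return value entirely, which mypy does not warn about.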
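
A sketch of what that contagion looks like at a call site, assuming a Union-based Result; parse_timeout and double_timeout are hypothetical functions used only for illustration:

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    error: E


Result = Union[Ok[T], Err[E]]


def parse_timeout(raw: str) -> Result[float, str]:
    try:
        seconds = float(raw)
    except ValueError:
        return Err(f"not a number: {raw!r}")
    if seconds <= 0:
        return Err(f"timeout must be positive, got {seconds}")
    return Ok(seconds)


def double_timeout(raw: str) -> Result[float, str]:
    parsed = parse_timeout(raw)
    # parsed * 2 would be a type error: the float is only reachable via Ok.value.
    if isinstance(parsed, Err):
        return parsed  # the error propagates unchanged
    return Ok(parsed.value * 2)
```

The caller cannot do arithmetic on the result until it has ruled out Err, which is the point: the failure path shows up in the code that uses the value.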

Consistent team patterns. When the team agrees on frozen dataclasses for domain objects and Result for fallible logic, code review conversations shift from “did you handle the error?” to what actually matters. The patterns enforce the basic contract. You focus on the business logic.


TL;DR

  • NewType makes domain unit confusion a mypy error - zero runtime cost
  • @dataclass(frozen=True) with __post_init__ gives you validated immutable value types
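  • @final on Ok/Err closes each variant; declaring Result as the union Ok | Err is what makes it a sealed sum type for mypy
  • Protocol is structural subtyping - no inheritance needed; @runtime_checkable only adds isinstance support, which checks member names, not signatures
  • match on sealed types gives exhaustive dispatch; mypy flags missing cases when the match ends in an assert_never branch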
  • Literal restricts string/int parameters to a fixed set at type-check time
  • TypedDict gives type-checked field access on external dict-shaped data
  • TypeVar with covariant=True / contravariant=True mirrors Scala’s +T / -T
  • @overload lets you express different return types per input type
  • TypeGuard propagates type narrowing through custom validation functions
  • Final marks constants as non-reassignable at the type-checker level; ClassVar marks class-level attributes that are not dataclass fields
  • Generic[S] with Literal phantom states enforces builder step ordering at check time
  • Smart constructors return Result[T, str] - invalid objects become unrepresentable
  • Valid[T] accumulates all validation errors; Result fails fast - each has its place
  • Protocol composition enforces Interface Segregation without inheritance hierarchies
  • Named Callable type aliases make function contract intent visible in signatures
  • Reader monad threads environment/config purely; no globals, no monkey-patching
  • Writer monad carries a pure audit log alongside computation results
  • State monad makes stateful logic rollback-safe and trivially testable
  • Semigroup/Monoid via Protocol gives a single fold that works for any combinable type
  • F-bounded TypeVar / Self (3.11+) preserves concrete type through builder chains
  • Minimum Python version for all patterns combined: 3.10 (3.11 for Self)
  • No third-party dependencies - all of this is in typing, dataclasses, and functools

Further reading

  • typing_extensions - backports for Python 3.7-3.9
  • pyright - Microsoft’s type checker, stricter than mypy on some generics

Vitthal Mirji

Staff Data Engineer @ Walmart
