Recent Posts

01 Aug 2025

Production error handling: When our LLM pipeline threw "Unknown error" in logs

4,362 words, ~17 min read
Stop runtime exceptions from crashing LLM pipelines. Build type-safe error hierarchies with Scala Either, ADTs, and smart constructors. A real refactor from llm4s turned generic exceptions into 12+ structured error types for easier debugging.
18 Jul 2025

Slicing time at scale: Building a Scala SDK for petabyte CDC on GCP

2,881 words, ~11 min read
Deep dive into building a Scala SDK for CDC at petabyte scale on GCP. Learn partition pruning, getAffectedPartitions, cloud-native backups, and CDC without ACID. Includes metrics, scan reduction stats, and code from Delta/Hive lakes.
17 Jul 2025

Developer experience: How we turned 20-minute llm4s setup into 60 seconds today

3,540 words, ~14 min read
Stop making contributors guess their way through setup. Build Giter8 templates with working code, CI setup, and zero TODO comments. A real template from llm4s cut onboarding from 20 minutes to 60 seconds, a 95% time saving.
17 Jul 2025

llm4s.g8: a developer experience boost for LLM4S SDK (issue #94) for Scala devs

3,154 words, ~12 min read
How we turned Issue #94 into a smoother onboarding flow for llm4s. Covers the DX problem, Giter8 template design, repo structure, and the decisions that reduce setup friction. Includes lessons on developer empathy and Scala SDK adoption.
16 Jul 2025

My road to Google Developer Expert (GDE): DevFest samosas, OSS PRs, acceptance

1,581 words, ~6 min read
Breakdown of my GDE acceptance journey. Learn contribution tracking, referral tactics, interview focus areas, OSS PR examples, blogging setup, CFP submissions, and mentorship. Includes the spreadsheet system and interview questions.
15 Jul 2025

DQ is not Dairy Queen: Building a data quality framework (DQF + DPAT) in prod

670 words, ~2 min read
Complete guide to building a data quality framework. Learn constraint validation in Scala, YAML checks (DPAT), Spark/Airflow/CI/CD integration, and automated failure reporting. Includes real code and architecture to catch issues early.
15 Jun 2025

How I cut delivery errors by 82%: DPAT, gated CRQ, and data contracts in prod

651 words, ~2 min read
Production-tested strategy to cut pipeline failures from 12% to 2% using YAML-based data quality checks (DPAT), CI/CD gates, and strict data contracts. Includes before/after metrics and architecture for self-healing pipelines in prod.
21 May 2025

Kotlin for data pipelines: Why I ditched Scala for backend data architecture

1,438 words, ~5 min read
Deep-dive guide to building data-centric backends with Kotlin. Learn pipeline patterns with coroutines, Spark/Flink integration, LLM enrichment, lakehouse patterns (Iceberg/Delta), and observability with OpenTelemetry. Includes real code.
17 May 2025

Cloud-native AI workflows: BigQuery ML + Vertex AI (skip the CSV exports) now

1,102 words, ~4 min read
Guide to building AI pipelines with BigQuery ML and Vertex AI. Learn SQL-first modeling, real-time endpoints, automated retraining, and drift monitoring. Includes diagrams and code that cut time-to-model from 6 weeks to 2 hours.
15 Apr 2025

Behind the scenes: How I crafted my Scala Days CFP and got accepted in 2025

1,156 words, ~4 min read
Step-by-step breakdown of crafting a winning Scala Days CFP. Balance technical depth with accessibility, use storytelling hooks, and iterate through drafts. Includes real examples, rejected titles, and the final accepted abstract.