
Blog

I mainly write about learnings, challenges, and projects that I encounter during my transition from software engineering to AI. Because my background is in software engineering, you will also find articles that focus on system design and technical leadership.

Newsletter

To stay in the loop, subscribe to my newsletter, which I use to send summaries of new posts no more than once per month.

Subscribe to my Newsletter

Technical Leadership

Large Language Models (LLMs)

Interviews

System Design

Developer Tools


Moore's Law for AI?

From Single-Core Limits to Multi-Agent Systems

Moore’s Law [1] fueled decades of computing progress by predicting the doubling of transistor density approximately every two years. For a long time, this translated into rapid gains in CPU speed, until thermal limits and the end of Dennard scaling hit. At that point, the industry pivoted: instead of faster single cores, we got multicore processors.

AI seems to be approaching a similar transition point. While model sizes have grown exponentially, the returns on intelligence are proving sub-linear. We are now in a resource-constrained environment where the brute-force scaling of the past few years is no longer economically or practically viable. This pressure is forcing an architectural pivot. What comes next might not be another order-of-magnitude increase in model parameters but a shift in architecture. Just like CPUs went multicore, AI is going to find a way to continue to scale intelligence.

Podcast

If you prefer listening over reading, check out this podcast episode where the topic is explored in more detail.


The Four Paradigms of Prompting

Introduction

Large language models (LLMs) map sequences of text to other sequences of text. Given an input string, an LLM predicts the most probable continuation based on the statistical patterns it has learned during its training process. What makes LLMs remarkable is not merely their predictive capability, but the emergent behaviors they display across a wide range of tasks, including:

  • Correct spelling and punctuation
  • Syntactic and grammatical structure
  • Semantic understanding and paraphrasing
  • Conversational coherence and dialogue flow
  • Multilingual expression and translation
  • Code generation and formal logic
  • Question answering and knowledge retrieval

We can use these capabilities to solve complex problems, but to do so effectively, we need to understand how to communicate with these models.
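
To make "predicting the most probable continuation" concrete, here is a minimal sketch using the Hugging Face transformers library; the model choice (gpt2) and the prompt are illustrative assumptions, not taken from the post.

    # Minimal sketch: greedy next-token continuation with transformers.
    # The model "gpt2" is only an illustrative choice.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    prompt = "The capital of France is"
    result = generator(prompt, max_new_tokens=5, do_sample=False)

    # The model appends the statistically most likely continuation to the input.
    print(result[0]["generated_text"])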

Podcast

If you prefer listening over reading, check out this podcast episode where the four prompting paradigms are explored in more detail.


The Fragility of Model Behavior

At the heart of the prompt engineering challenge is the sensitivity of LLMs to input phrasing. Unlike traditional deterministic systems, LLMs do not "understand" tasks in the human sense; they generate the most statistically probable continuation of a given sequence. This makes their output highly sensitive to subtle changes in prompt design.

For instance, in instruction-based prompting, the difference between "Summarize this article" and "Write a short abstract" can cause the model to shift tone, structure, or depth of analysis. This brittleness necessitates a thoughtful and experimental approach to prompt construction.
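
As a rough illustration of this sensitivity, the sketch below sends both phrasings to the same instruction-tuned model and compares the results; the model name and the article text are placeholder assumptions.

    # Sketch: two similar instructions can shift tone, structure, or depth.
    # The SmolLM2 instruct model is an assumed, illustrative choice.
    from transformers import pipeline

    generator = pipeline("text-generation", model="HuggingFaceTB/SmolLM2-135M-Instruct")

    article = "..."  # placeholder for the article text
    for instruction in ("Summarize this article.", "Write a short abstract."):
        prompt = f"{instruction}\n\n{article}"
        output = generator(prompt, max_new_tokens=80, do_sample=False)
        print(instruction, "->", output[0]["generated_text"])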

Prompt engineering is therefore essential not only for eliciting correct outputs, but also for:

  • Preventing hallucinations
  • Enforcing output structure
  • Ensuring task fidelity
  • Aligning with domain-specific requirements

Each of the four paradigms offers unique affordances and constraints for prompt construction.

uv - An extremely fast Python package and project manager, written in Rust

There exists a broad selection of package managers that manage virtual environments and packages for us. Some of the most popular are conda, pip, and virtualenv (or virtualenvwrapper). They can almost all do the same things:

  • Install packages
  • Resolve dependencies
  • Manage package versions
  • Manage virtual environments

Why uv?

You might ask yourself why we need yet another tool for this. Because it is faster, and faster is better. How fast? Between 10 and 100 times faster, depending on the task. Besides that, it is also fun to use and to support something new. But before we get started with some examples, let's go over the key design decisions that make uv so fast.
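
To give a feel for the workflow, here are a few typical uv commands covering both the environment-centric and the project-centric side; the package and project names are placeholders.

    # Environment-centric workflow: a drop-in replacement for venv/pip
    uv venv                      # create a virtual environment in .venv
    uv pip install requests      # install a package into it

    # Project-centric workflow: managed via pyproject.toml
    uv init my-project           # scaffold a new project
    uv add requests              # add and install a dependency
    uv run python main.py        # run a command inside the project environment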

The Difference Between a Base LLM and an Instruction-Tuned LLM

Introduction

Large language models (LLMs) can behave very differently depending on how they were trained.

Base LLMs are trained purely on next-token prediction over a large corpus of text. Instruction-tuned LLMs, by contrast, are further trained to follow prompts in a more helpful and structured way.

To explore how these two types of language models behave, we will take a look at two models from Hugging Face's SmolLM family:

How to Choose Tech for Production: The Default Heuristic

Introduction

During my time as a lead data engineer building a predictive maintenance platform, I came across the concept of the default heuristic. One of my tasks was to identify platform components that would allow us to scale storage and retrieval of time-series sensor data to petabytes while still supporting the critical use cases of data scientists as well as analytical reporting.

Interview Series: Working with an SRE

Preamble

In this insightful interview, Paul Bütow, a Principal Site Reliability Engineer at Mimecast, shares over a decade of experience in the field. Paul highlights the role of an Embedded SRE, emphasizing the importance of automation, observability, and effective incident management. We also focus on the key question of how you can work effectively with an SRE, whether you are an individual contributor or a manager, a software engineer or a data scientist, and how you can learn more about site reliability engineering.