#ai
33 posts
AI in Everyday Life: LLMs as Board Game Assistants
An LLM can be a very practical helper during board games: explaining rules, resolving ambiguities, searching rulebooks, creating turn summaries, and keeping house rules consistent. The condition is simple: treat it as an assistant, not an infallible judge.
The /goal command in Claude Code - a session contract that won't let the agent give up
How /goal in Claude Code enforces completion through a session-scoped Stop hook. Pros, cons, when to use it, how to phrase the condition, and why it matters for QA.
AI in Smart Home, Part 1: A Home That Can See - LLM Vision in Practice
The first part of a mini-series about connecting AI with smart home systems. We move from simple motion detection to visual interpretation: who is at the gate, what happened on the driveway, and when a notification is actually worth sending.
Where the Holak Scale Fails - Self-Criticism
Every model is a tool, not a truth. Six places where the Holak Scale actually fails when working with teams, plus reader feedback to address in version 3.
Multi-Agent Orchestration - When One Agent Is Enough
Level 10 on the Holak Scale looks attractive on a CV but is expensive to maintain. A skeptical take on agent teams: when you actually need many, when one well-configured agent wins, and how to spot orchestrated mediocrity.
MCP, CLI or hook - which tool when, and when MCP is overengineering
MCP is a tool, not a status. Each connector is maintenance, risk and attack surface. Criteria for when MCP pays off, when CLI / slash command / hook is better, a decision tree, and the CLI → MCP migration path.
AGENTS.md vs CLAUDE.md vs .cursorrules - Which Context File for What
Three context-file formats for AI agents aren't chaos - each has a niche. What each agent reads, what to put in each, how to keep them in sync, and what not to copy 1:1.
A Prompt Library Is an Anti-Pattern. Yes, the 200-Line One
If you copy prompts from a document, you have a phase-4 problem, not a phase-3 one. Why prompt fetishism is a symptom of stalled evolution, and how to break a 200-line prompt down into custom instructions and project context.
Individual 9, Company 2 - What to Do About AI Maturity Gaps
Most companies measure the maximum instead of the median. How to detect a gap between individual and organisational maturity, how to compute the median, and what to actually do if you're a level-9 engineer in a level-2 company.
AI Adoption Anti-Patterns - 7 Ways to Get Stuck
Most teams don't get stuck because of missing tools - they get stuck on specific patterns of thinking. The seven most common anti-patterns from the Holak Scale, a recognition test, who gets caught, and how to get out.
The Trust Boundary - How to Actually Cross from 8 to 9 on the Holak Scale
The hardest jump on the Holak Scale isn't technical. Verifiability and reversibility as a framework, a readiness checklist, and two case studies - one team that made it, one that dropped back to 2.
How to Diagnose a Team's AI Maturity in 30 Minutes
A concrete protocol for scoring a team on the Holak Scale - 8 calibration questions, 5 minutes of live observation and a one-page report. No surveys, no workshops, no slides.
Claude and Codex: CLI vs Desktop vs Web - Where You Actually Ship Work
Three access channels to the same models differ in what they can touch. A practical comparison of Claude Code and Codex CLI against desktop apps and web interfaces.
Holak Scale v2.1e - enterprise. An AI adoption maturity model for organizations
12 maturity levels of AI in the workplace - from resistance, through agent workflows, to a purpose-built agentic OS for concrete business outcomes. Version v2.1e splits the enterprise path from the private one.
Holak Scale v2.1p - private. Maturity of everyday AI use
12 maturity levels of everyday AI use - home, learning, finance, smart home, life organization. The private version of the Holak Scale, parallel to the enterprise track.
Writing your own Claude Code subagent: frontmatter, prompt, deployment
Part two of the subagent series. We build a blog-post-writer agent from scratch - what to put in the frontmatter, how to write a description that actually triggers, and how to dodge two classic traps.
Advisor in Claude Code: a second opinion from a stronger model before you commit to a path
The advisor() tool forwards the full transcript to a stronger model and returns an opinion. When to call it before starting work, why not to call it for every step, and how to keep it from turning into endless consultation.
The prompt-master skill in Claude Code: a prompt generator for other AI tools
Instead of writing prompts for Midjourney, Sora, Suno or Cursor by hand, you have a Claude Code skill that does it. How it works, and where it won't replace knowing the target tool.
Subagents in Claude Code: why orchestrate instead of writing in one window
One Claude vs many specialists invoked through Agent. Why subagents, how Claude picks which one to use, when NOT to use them, and why this isn't 'one window with more context'.
Open WebUI - a ChatGPT-like frontend for your local LLM
Local Ollama only gives you a CLI. LM Studio is single-user. Open WebUI is the missing piece: a ChatGPT-like UI with RAG, web search and tools - running in one Docker command.
Local LLM models in 2026 - what actually runs on a Mac mini M4 16 GB
A review of the current models worth pulling onto 16 GB unified memory: gpt-oss-20b, Gemma 4 e4b, Qwen3-Coder, Phi-4. What works, what doesn't, and why.
Mac mini M4 16 GB as a local LLM workstation - LM Studio vs Ollama
Does the cheapest M4 Mac make sense as a machine for local language models? Hardware reality check, LM Studio vs Ollama comparison, and where the sanity line is.
Context7 - the MCP that gives your model up-to-date docs instead of six-month-old knowledge
How one small MCP solves the biggest pain of working with LLMs on code: stale library knowledge. Setup, real-world cases, limitations.
Caveman - the plugin that cuts output tokens by 75% (and where it breaks)
How a Claude Code plugin that makes the model talk like a caveman changed my token bill - and which tasks it actually fits, which ones it quietly ruins.
A skill for filing bugs in Jira through Atlassian MCP
How to build a skill that turns your "hey, something's broken on checkout" into a complete Jira ticket - with evidence, dedupe, and without polluting the project. No autopilot agent, a deliberate human confirm in the middle.
From article to video: HyperFrames for quality-blog.eu
When a post deserves a thirty-second video explainer, how to build the script, and how to reuse the same material across five channels at once.
How testers should evaluate agent output
Five dimensions for evaluating agent output, a checklist you can walk in fifteen minutes, and situations where you just send the result back without a lengthy debate.
10 AI workflows that actually help a Test Architect
Ten concrete workflows - from test strategy review to the decision not to automate - each deployable in a week and usable every day.
First MCP for QA: search and fetch over evidence
How to start with MCP in QA from the simplest possible pair of tools - search and fetch over evidence. No autonomy, no loops, just a structure you can maintain.
How to write AGENTS.md for a test automation repo
What to put in AGENTS.md for a test automation repo, common pitfalls, templates for Playwright, Cypress, and API tests, and how to tell if the file actually helps.
Skills, tools, agents, MCP, AGENTS.md - a concept map for testers
Five concepts that get confused most often in QA conversations about AI - plus a simple decision model for what to use when.
AI Adoption Maturity Model
From resistance to orchestration - 11 levels of AI utilisation in an organisation. Find out where you and your team stand.
5 Prompts That Will Change How You Work with AI
Proven prompt engineering techniques you can apply right away.