Building Your Own AI Agents: From Personal Assistant to Specialist Team

Not long ago, using AI at work usually looked like this: we opened ChatGPT, Claude, or another tool, typed a prompt, corrected the output, copied a fragment into a document or codebase, and then started over again in the next conversation. It was useful, but it had one big weakness: every time, we had to explain the context, expectations, style, output format, and task boundaries again.

Today, a more natural direction is to build our own specialized AI agents. This does not mean creating magical digital employees that run our lives or write entire production systems without supervision. It means something much more practical: repeatable, well-described AI roles that know what they are for, what their scope is, what sources and tools they can use, what output they should produce, and when they should say: “I do not know” or “this requires a human decision”.

Such an agent can be a simple assistant for weekly planning. It can be a private technical editor. It can review code, generate test ideas, analyze logs, or draft architectural documentation. It can also act as a “leader” - an agent that coordinates a small team of specialists: a researcher, architect, implementer, test architect, security reviewer, and documentation writer.

This article is intentionally practical. No exaggerated hype, no promise that an agent will replace a team, and no assumption that one clever prompt is enough. The goal is to show how to start practicing agent design yourself: first in a very simple way, then with more structure, and eventually with small agent teams for professional and everyday tasks.

1. What do we actually mean by an agent?

The word “agent” is used very loosely. Sometimes it means a normal chatbot with a nicely written prompt. Sometimes it means a script that uses an API. Sometimes it means a developer tool that can read files, modify code, run tests, and iterate until the result passes verification.

For practical purposes, let us use a simple definition:

An AI agent is a specialized assistant with a defined role, goal, context, tools, constraints, operating procedure, and result verification method.

This matters because an agent does not become an agent just because we call it one. It becomes an agent when it has a repeatable way of working. A good agent does not merely answer. A good agent performs a task within a defined process.

Think about the difference between casually asking a colleague in the hallway and assigning a task to a specialist. If we say, “take a quick look at this code”, we will get a general opinion. If we say, “you are a backend code quality reviewer; check this class for readability, testability, error handling, and regression risk; do not change the code, only return a prioritized list of issues”, we will get a completely different result.

An agent should have several elements:

Role - who it is in this task, such as test architect, editor, planner, log analyst.
Mission - why it exists and what problem it solves.
Scope - what it does and what it does not do.
Context - information about the user, project, repository, workflow, and constraints.
Tools - files, search, calendar, repository, terminal, API, smart home system, knowledge base.
Procedure - the steps it follows before answering.
Output format - what the result should look like.
Quality gates - what must be checked before returning the result.
Safety boundaries - what it must not do without approval.
Escalation rules - when it must stop and ask a human.

Only when these elements are present does an agent become a practical time-saving tool rather than a source of impressive but unreliable text.
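One way to make this list concrete is to treat it as a data structure. The Python sketch below is only an illustration: the `AgentSpec` class and its `missing_elements` method are hypothetical names invented for this article, not part of any agent framework.

```python
from dataclasses import dataclass, field, fields


@dataclass
class AgentSpec:
    """Hypothetical container for the ten elements listed above."""
    role: str = ""
    mission: str = ""
    scope: str = ""
    context: str = ""
    tools: list[str] = field(default_factory=list)
    procedure: list[str] = field(default_factory=list)
    output_format: str = ""
    quality_gates: list[str] = field(default_factory=list)
    safety_boundaries: list[str] = field(default_factory=list)
    escalation_rules: list[str] = field(default_factory=list)

    def missing_elements(self) -> list[str]:
        """Return the names of elements that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


reviewer = AgentSpec(
    role="backend code quality reviewer",
    mission="find readability, testability, and regression problems",
    scope="review only; never change the code",
    output_format="prioritized list of issues",
)
print(reviewer.missing_elements())
```

Running this lists the six elements that the hallway-style request never defined. A text-only agent may legitimately have no tools, so treat the result as a reminder rather than a hard failure.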

2. Why build your own agents instead of using one general chat?

A general chat is excellent for exploration. You can ask for an idea, compare technologies, explain an error, draft a paragraph, or translate a message. Problems appear when the task is repetitive.

If you prepare a weekly report every week, a reporting agent should know the report format. If you regularly analyze pull requests, a code review agent should know your team’s standards. If you create automated tests, a testing agent should understand that you do not want random tests just for coverage; you want tests derived from business risk, API contracts, boundary data, and possible regressions. If you are building a house, a construction planning agent should know that you care about specific technologies, style, running costs, automation, and plot constraints.

The biggest value of agents appears in three areas.

First, repeatability. Instead of writing a long prompt every time, you define the working method once and reuse it. That reduces friction. The agent becomes a tool, not an experiment.

Second, consistency. If an agent has clear rules, it will return results in a similar style and structure. This is especially important for code, documentation, reports, and quality analysis.

Third, domain context. A general model knows many things, but it does not know your architectural decisions, team standards, preferred libraries, project history, abbreviations, or personal preferences. An agent can receive that context through instructions, files, notes, examples, and checklists.

In practice, a good agent works a bit like a junior teammate with a well-described task, but with one useful difference: the agent will not complain if you ask it to correct the same output format ten times. Defining an agent is also a great way to practice your own thinking process. If you cannot define a good agent for reviewing tests, perhaps the test review process in your team is not clearly defined either.

3. Six maturity levels: from prompt to agent team

Building agents does not have to start with APIs, containers, webhooks, and integrations with internal systems. It is better to begin at a simple level and only then increase automation.

Level 0: saved prompt

The simplest agent is a well-written prompt saved in your notes. Example: “You are the technical editor of my blog. Improve the text while preserving my style, do not shorten it too aggressively, list the changes, and mark places that require fact-checking.”

This is not yet a full agentic system, but it is a good start. It makes you think in roles and procedures.

Level 1: project with instructions and files

The next step is moving the prompt into a workspace that preserves context. Tools such as ChatGPT Projects allow users to keep conversations, reference files, and project instructions together. In practice, this means your editorial agent can have access to previous articles, blog style, terminology, and publication rules.

For everyday needs, you can create separate projects: “House and Construction”, “Smart Home”, “Diet and Shopping”, “Blog”, “English Learning”, “Board Games”. For work: “API Project”, “Test Framework”, “ABAP/SAP”, “QA Mentoring”, “Presentations and Training”.

Level 2: custom GPT, skill, project, or configured tool agent

At this level, the agent receives a more persistent configuration: instructions, knowledge, examples, and sometimes actions or tools. In ChatGPT, you can create custom GPTs configured for a specific purpose. In Claude Code, you can define skills and subagents. Other tools have similar mechanisms: projects, repository rules, instruction files, agent profiles, workflows.

The product name is not the most important thing. Products will change. The pattern matters: we extract repeatable knowledge and procedures from our heads into the agent’s configuration.

Level 3: agent working with a repository or a specific dataset

This stage is particularly interesting for people who work with code. The agent does not only answer questions. It can read files, search dependencies, propose changes, run tests, analyze errors, and verify the result. Coding agents often work in a loop: gather context, act, check the result, correct.

At this level, boundaries become critical. An agent may propose code changes, but it should not perform destructive operations, deploy to production, delete data, or modify security configuration without supervision.
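The loop described above (gather context, act, check the result, correct) can be sketched in a few lines. This is a minimal illustration under stated assumptions: `propose_change` and `run_checks` are hypothetical stand-ins for a model call and a test runner, and the attempt limit of three is an arbitrary choice.

```python
from typing import Callable, Optional


def agent_loop(
    task: str,
    propose_change: Callable[[str, Optional[str]], str],
    run_checks: Callable[[str], Optional[str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Propose, verify, and correct until checks pass or attempts run out.

    propose_change(task, feedback) returns a candidate result.
    run_checks(candidate) returns None on success or an error description.
    """
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        candidate = propose_change(task, feedback)  # act
        feedback = run_checks(candidate)            # check
        if feedback is None:
            return candidate                        # verified result
    return None  # attempts exhausted: stop and escalate to a human


# Toy stand-ins so the sketch runs without a model or a repository.
def fake_propose(task: str, feedback: Optional[str]) -> str:
    return "return a + b" if feedback else "return a - b"


def fake_checks(candidate: str) -> Optional[str]:
    return None if "+" in candidate else "test_add failed: expected 5, got 1"


result = agent_loop("implement add(a, b)", fake_propose, fake_checks)
print(result)
```

The bounded number of attempts is the important design choice here: an agent that cannot pass verification should hand the problem back instead of retrying forever.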

Level 4: agent team with a leader

At this level, one agent becomes a small team. The leader analyzes the task, splits it into parts, selects specialists, collects their results, detects contradictions, and prepares a decision for the human.

Example for a technical task:

Leader Agent - understands the goal, splits the work, controls scope.
Architecture Agent - analyzes the impact on system structure.
Implementation Agent - proposes concrete code changes.
Test Architect Agent - designs tests and regression risks.
Security Reviewer - looks for security issues.
Documentation Writer - updates README, ADR, or changelog.

For an everyday task such as renovation or house construction:

Project Leader - creates the plan and decision sequence.
Cost Agent - tracks budget and variants.
Technical Agent - analyzes installation requirements.
Smart Home Agent - proposes automation and sensors.
Risk Agent - identifies places where a costly mistake can happen.

This does not necessarily mean you need five separate tools. Sometimes one model can simulate multiple roles. However, for larger tasks, separate agents with separate context are often cleaner because they do not pollute the main conversation with detailed side investigations.

Level 5: programmable workflow with APIs, tools, and control

The most advanced level is programmatic agent development. You define agents in code, give them tools, memory, handoffs, validation, logging, and approval gates. You can integrate them with repositories, ticketing systems, documentation, calendars, knowledge bases, Home Assistant, or internal APIs.

This level is for advanced users and organizations. But the principle remains the same: a small, well-defined agent is better than a large, vague “superagent” that is supposed to do everything.

4. The most important rule: keep the agent’s scope small

The most common beginner mistake is: “I will build an agent that helps me with everything.” That is the fastest route to chaos. An agent for everything usually becomes an agent for nothing.

A better agent has a narrow scope and clear responsibility.

Instead of:

“You are my work assistant.”

Use:

“You are an agent for preparing technical test plans for backend changes in Java/Spring applications. Based on the change description, endpoints, DTOs, risks, and existing tests, you prepare a list of test cases, coverage gaps, and automation recommendations. You do not write production code. If requirements or test data are missing, you mark them as open questions.”

The same applies to everyday life.

Instead of:

“Help me with my house.”

Use:

“You are an agent for analyzing home and construction-site sensor data. You care about temperature, humidity, energy consumption, and camera alerts. Your job is to prepare a short daily report, identify anomalies, and propose actions, but you must not turn off devices or change automations without my approval.”

A small scope gives three benefits: the output is easier to judge, the instruction is easier to improve, and the process is easier to trust. Trust should not come from the model sounding confident. Trust should come from the agent operating within a known process and returning a verifiable result.

5. Agent design template

Below is a simple template you can paste into ChatGPT, Claude, a custom GPT, a project, an instruction file, or your own workflow documentation. This is one of the most important parts of the article because this template allows you to start practicing immediately.

# Agent: [name]

## 1. Mission
Your task is [specific goal]. You help the user achieve [result] while following [key principles].

## 2. Role
You act as [role/specialization], for example test architect, code reviewer, technical editor, house planner, board game referee.

## 3. User and context
The user is working on [project/area]. They prefer [answer style, language, level of detail]. Important constraints: [technologies, budget, standards, risks].

## 4. Responsibilities
You handle:
- [area 1]
- [area 2]
- [area 3]

You do not handle:
- [excluded area]
- [area requiring an expert]
- [forbidden action]

## 5. Inputs
You work based on:
- task description,
- attached files,
- code snippets,
- user notes,
- tool results,
- clarifying questions, if necessary.

## 6. Operating procedure
Always:
1. Identify the task goal.
2. List assumptions.
3. Check whether critical information is missing.
4. Perform the analysis or plan.
5. Verify the result against the quality checklist.
6. Return the answer in the agreed format.

## 7. Output format
Answer using this structure:
- Summary
- Key findings
- Recommendations
- Risks / open questions
- Next steps

## 8. Quality rules
Do not guess facts. Mark uncertainty. Separate facts from recommendations. For code: consider testability, maintainability, readability, and regression. For everyday decisions: show options and consequences.

## 9. Safety boundaries
Do not perform irreversible actions without approval. Do not delete data. Do not send messages. Do not buy products. Do not change production configuration. Do not treat medical, legal, or financial advice as final decisions.

## 10. Escalation
Stop and ask the user if:
- the decision is expensive,
- the action is irreversible,
- data is missing,
- there is a security risk,
- the result may affect health, finances, law, or production.

This template looks simple, but it changes the quality of interaction dramatically. It teaches you to think of an agent as a small process rather than a random conversation.
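If you maintain several agents, filling in the template by hand becomes repetitive. As a sketch, the card can be stored as plain data and rendered to Markdown. The function name `render_agent_card` and the dictionary keys are assumptions made for this example, and only a few of the template sections are shown.

```python
def render_agent_card(card: dict) -> str:
    """Render a subset of the agent template as Markdown text."""
    lines = [f"# Agent: {card['name']}", ""]
    lines += ["## Mission", card["mission"], ""]
    lines += ["## Role", card["role"], ""]
    lines += ["## Responsibilities", "You handle:"]
    lines += [f"- {item}" for item in card["handles"]]
    lines += ["", "You do not handle:"]
    lines += [f"- {item}" for item in card["does_not_handle"]]
    lines += ["", "## Escalation", "Stop and ask the user if:"]
    lines += [f"- {item}" for item in card["escalate_if"]]
    return "\n".join(lines)


card = {
    "name": "Technical Editor",
    "mission": "Improve blog drafts while preserving the author's style.",
    "role": "You act as a technical editor.",
    "handles": ["clarity", "structure", "terminology"],
    "does_not_handle": ["fact invention", "publishing"],
    "escalate_if": ["a claim needs fact-checking", "the thesis would change"],
}
print(render_agent_card(card))
```

Keeping the card as data also makes it easy to store in a repository and compare versions over time.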

6. Leader agent: how to build a team for a task

One of the most useful patterns is the leader agent. Its job is not to do everything alone. Its job is to design the execution approach and select specialists.

A leader agent should answer several questions:

What is the goal of the task?
What decisions must be made?
Which areas require separate analysis?
Which agents should be created?
What context should each agent receive?
What output format should each agent return?
How should the results be combined into one recommendation?
Where are the contradictions and risks?
What must be approved by a human?

Example prompt for a leader agent:

You are a Leader Agent. Your task is to organize a team of agents for the task below.

Task: [describe the task]

Do not solve the entire problem immediately. First:
1. Define the goal and expected result.
2. Propose a team of specialist agents.
3. For each agent, describe its role, input data, scope, and output format.
4. Define the work sequence: who works in parallel and who works after whom.
5. List risks and decisions requiring human approval.
6. Finally, prepare an integrated action plan.

Do not create an unnecessarily large team. If two roles are enough, use two roles.

For code work, such a leader can create a team like this:

Repo Explorer - reads the project structure and finds relevant files.
Requirement Analyst - translates the requirement into technical consequences.
Implementation Agent - proposes the change.
Test Architect - designs tests.
Regression Hunter - looks for areas that may break.
Documentation Agent - updates README, ADR, or changelog.

For everyday needs, a leader may work differently:

Planner - defines the sequence of actions.
Researcher - gathers options.
Skeptic - looks for weak points.
Budget Agent - tracks costs.
Editor - turns the result into a readable plan.

Remember: an agent team is not always better. If the task is simple, a team only adds overhead. Use leader agents when the task has several dimensions: technical, financial, organizational, quality-related, security-related, or communication-related.
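Step 4 of the leader prompt, deciding who works in parallel and who works after whom, is a dependency-ordering problem. A minimal sketch using Python's standard `graphlib` module (Python 3.9 or newer) follows. The dependency map is an assumption invented for this example; the roles themselves do not imply it.

```python
from graphlib import TopologicalSorter

# Each role maps to the roles whose output it needs first (assumed for illustration).
depends_on = {
    "Repo Explorer": set(),
    "Requirement Analyst": set(),
    "Implementation Agent": {"Repo Explorer", "Requirement Analyst"},
    "Test Architect": {"Requirement Analyst"},
    "Regression Hunter": {"Implementation Agent"},
    "Documentation Agent": {"Implementation Agent", "Test Architect"},
}


def plan_waves(graph: dict) -> list[list[str]]:
    """Group roles into waves; roles in the same wave can run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises CycleError if roles depend on each other in a loop
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


for number, wave in enumerate(plan_waves(depends_on), start=1):
    print(f"Wave {number}: {', '.join(wave)}")
```

With this map, the explorer and the analyst start together, implementation and test design follow, and regression hunting and documentation come last.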

7. Agents for working with code

Coding agents are among the most practical, but also among the riskiest. A model can very confidently propose a change that compiles only in its imagination. That is why a coding agent must have a strong verification procedure.

A good coding agent should not start by writing. It should start by understanding. First it reads the project structure, checks existing patterns, identifies tests, looks at conventions, and only then proposes a change.

Example 1: Test Architect Agent

# Agent: Test Architect

Mission:
You design test strategies for backend and integration changes.

Scope:
- regression risk analysis,
- unit, integration, and contract test design,
- identification of coverage gaps,
- proposal of test data,
- analysis of test readability and maintainability.

You do not:
- change production code unless explicitly asked,
- create tests only to increase coverage percentage,
- ignore boundary cases.

Procedure:
1. Describe the change from a business perspective.
2. Identify components affected by the change.
3. List risks.
4. Propose test cases.
5. Split them into must-have, should-have, and nice-to-have.
6. Indicate which tests should be automated.
7. List open questions.

Format:
- Change context
- Risks
- Proposed tests
- Automation
- Gaps / questions

This agent can be used for pull requests, story analysis, API changes, test framework refactoring, or regression planning.

Example 2: Code Review Adversary

This agent is not supposed to be nice. It is supposed to find problems.

# Agent: Code Review Adversary

You act as a demanding code reviewer. Your goal is to find issues that may go unnoticed: hidden dependencies, unreadability, missing tests, incorrect exception handling, regression risk, performance concerns, and violations of project conventions.

Rules:
- Do not rewrite the code immediately.
- First describe the problem and its impact.
- Separate blockers from suggestions.
- Do not nitpick style unless it affects maintainability.
- If you cannot confirm something, mark it as a hypothesis.

Output format:
1. Blockers
2. Important issues
3. Suggestions
4. Missing tests
5. Questions for the author

This is a very useful quality exercise. If you use it regularly, you will quickly notice recurring error patterns in your project.
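Because the reviewer's output format is fixed, its findings are easy to post-process. The sketch below groups findings into the first three sections of that format and labels unconfirmed ones as hypotheses. The `Finding` class, the severity keys, and the file names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass

# Severity key -> section title, in the order of the output format above.
SECTIONS = {
    "blocker": "Blockers",
    "important": "Important issues",
    "suggestion": "Suggestions",
}


@dataclass
class Finding:
    severity: str           # one of the SECTIONS keys
    location: str
    problem: str
    confirmed: bool = True  # False means the reviewer could not confirm it


def format_review(findings: list[Finding]) -> str:
    """Group findings by severity and label unconfirmed ones as hypotheses."""
    lines = []
    for number, (severity, title) in enumerate(SECTIONS.items(), start=1):
        lines.append(f"{number}. {title}")
        matching = [f for f in findings if f.severity == severity]
        for f in matching:
            prefix = "" if f.confirmed else "[hypothesis] "
            lines.append(f"   - {prefix}{f.location}: {f.problem}")
        if not matching:
            lines.append("   - none")
    return "\n".join(lines)


review = format_review([
    Finding("suggestion", "UserMapper.java", "method name hides a side effect"),
    Finding("blocker", "OrderService.java", "exception is swallowed in catch block"),
    Finding("important", "OrderService.java", "possible N+1 query", confirmed=False),
])
print(review)
```

Collected over several pull requests, structured findings like these are what make the recurring error patterns visible.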

Example 3: Refactoring Scout

This agent should not refactor immediately. It should identify where refactoring makes sense.

# Agent: Refactoring Scout

Your task is to find refactoring candidates, but not to change code without user approval.

Look for:
- duplicated logic,
- oversized classes,
- methods with too many responsibilities,
- dependencies that are hard to test,
- names that do not express intent,
- places where exceptions are swallowed or only logged,
- brittle tests too tightly coupled to implementation.

For each issue, provide:
- location,
- symptom,
- impact,
- improvement proposal,
- change risk,
- minimal safe step.

This pattern is useful because it reduces the temptation to perform a huge “refactor everything” operation. The agent must look for small, safe steps.

Example 4: Technical Documentation Agent

Code without documentation can be a problem, but documentation disconnected from code is even worse. A documentation agent can help write ADRs, README files, run instructions, endpoint descriptions, changelogs, and migration notes.

Important rule: the documentation agent should separate what is directly supported by the code from what is its own interpretation. If it is not sure, it should mark the fragment as “to be confirmed”.

8. Agents for everyday needs

Agents are not only for developers. In everyday life, we often have repetitive decision processes that are tiring not because they are difficult, but because they return every week.

Weekly planning agent

Such an agent can combine private tasks, work, training, shopping, home duties, and free time. The point is not to let AI manage your life. The point is to help you see resource conflicts: too many tasks in one day, no recovery time, underestimated travel, or overload after work.

Good prompt:

You are a weekly planning agent. You help me create a realistic plan without overloading the day.

Rules:
- Assume buffers between tasks.
- Separate focus work from simple errands.
- Point out overloaded days.
- Propose a maximum of 3 priorities per day.
- Do not plan every minute.

At the end, provide:
- weekly plan,
- biggest risks,
- things to remove or move,
- one recommendation to simplify the week.
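Two of the rules in this prompt, a maximum of three priorities per day and pointing out overloaded days, are simple enough to check mechanically. A rough sketch follows, assuming tasks are recorded as (name, hours, is_priority) tuples and that six planned hours per day is the limit. Both the data shape and the limit are assumptions for illustration.

```python
MAX_PRIORITIES = 3       # from the prompt above
MAX_PLANNED_HOURS = 6.0  # assumed daily limit that leaves room for buffers


def find_overloaded_days(week: dict) -> dict:
    """Return day -> list of warnings for days that break a rule."""
    warnings = {}
    for day, tasks in week.items():
        problems = []
        priorities = sum(1 for _, _, is_priority in tasks if is_priority)
        hours = sum(h for _, h, _ in tasks)
        if priorities > MAX_PRIORITIES:
            problems.append(f"{priorities} priorities (max {MAX_PRIORITIES})")
        if hours > MAX_PLANNED_HOURS:
            problems.append(f"{hours:g} planned hours (max {MAX_PLANNED_HOURS:g})")
        if problems:
            warnings[day] = problems
    return warnings


week = {
    "Monday": [("release review", 3, True), ("gym", 1.5, False)],
    "Tuesday": [("workshop", 4, True), ("report", 2, True),
                ("contractor call", 1, True), ("shopping", 1, True)],
}
print(find_overloaded_days(week))
```

In this sample, Monday passes and Tuesday is flagged twice: four priorities and eight planned hours.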

Home or construction agent

If you are building a house, renovating a flat, or developing a smart home, an agent can help organize decisions. It can analyze offers, compare variants, plan work sequence, prepare questions for contractors, interpret sensor data, or propose automations.

Example:

You are a construction and smart home support agent. You help analyze technical decisions, but you do not pretend to be a licensed construction manager or electrician.

Always separate:
- facts,
- assumptions,
- risks,
- questions for a professional,
- automation ideas.

If the topic concerns electrical safety, structure, gas, ventilation, or regulations, mark it as requiring confirmation by a specialist.

Shopping agent

A shopping agent should not simply choose the “best product”. It should first help define criteria: budget, constraints, requirements, unnecessary features, and marketing risks. For larger purchases, its value is not the answer “buy this”, but a decision matrix.

Board game agent

A very practical everyday example. An agent can act as a rules judge, instruction translator, rulebook search assistant, dispute moderator, or generator of a short rules summary for new players.

A good instruction for such an agent should include one important rule:

If the rulebook is attached, rely on it. If you do not have the rulebook, clearly state that you are using general knowledge and the result should be confirmed in the official rules.

This reduces hallucinations. The agent should not invent rules just because they sound logical.

Healthy habits agent

Here we must be especially careful. An agent can help plan meals, shopping lists, activity, or reminders, but it should not replace a doctor or dietitian in medical matters. A good healthy habits agent should focus on organization, consistency, and tracking, not diagnosis.

9. Tools, memory, and context: what should the agent know?

An agent without context is like a new team member without onboarding. It may be intelligent, but it will guess. That is why much of agent building is not prompt writing, but context preparation.

For a coding agent, context may include:

project README,
directory structure,
build/test commands,
coding standards,
architectural decisions,
examples of good tests,
anti-pattern list,
local environment instructions,
definitions of review roles and rules.

For an everyday agent:

user preferences,
budget constraints,
schedule,
list of disliked items,
reference documents,
previous decisions,
report templates,
safety rules.

However, do not put everything into one giant file. Too much context can reduce quality because important rules get lost in the noise. It is better to have a short main instruction and separate reference files used only when needed.

Practical rule:

What the agent must always know goes into the main instruction. What is needed only sometimes goes into files, skills, checklists, or reference documents.

For code work, the following separation works well:

stable instructions: style, commands, architecture,
task checklists: code review, testing, release,
domain knowledge: dictionaries, protocols, business processes,
examples: good PRs, good tests, good ADRs.
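A minimal sketch of the practical rule above: the main instruction is always included, and reference material is added only when the task needs it. The `build_context` helper, the topic names, and the reference texts are hypothetical; a real setup would read them from files or from the tool's project storage.

```python
MAIN_INSTRUCTION = "You are a code review agent. Follow the team style guide."

# Hypothetical reference documents, keyed by topic.
REFERENCES = {
    "testing": "Checklist: edge cases, regression, data, assertions, readability.",
    "release": "Checklist: changelog updated, migration notes written.",
    "domain": "Glossary: SKU = stock keeping unit.",
}


def build_context(task_topics: list[str]) -> str:
    """Combine the always-on instruction with only the references the task needs."""
    parts = [MAIN_INSTRUCTION]
    for topic in task_topics:
        if topic in REFERENCES:
            parts.append(f"[{topic}]\n{REFERENCES[topic]}")
    return "\n\n".join(parts)


context = build_context(["testing"])
print(context)
```

A testing task receives the testing checklist and nothing else, so the release checklist and the glossary do not compete for the model's attention.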

10. External tools and MCP: when the agent should have “hands”

At the beginning, an agent can operate only on text. This is safe and sufficient for learning. Over time, the question appears: should the agent have tools?

A tool can be:

file access,
search,
terminal,
Git repository,
ticketing system,
calendar,
email,
Home Assistant,
database,
internal API,
browser,
test runner.

More tools bring more value, but also more risk. An agent that only writes a plan can be wrong. An agent with tools can act on that mistake. That is why least privilege is essential.

An agent that analyzes a repository does not need permission to push changes. A home reporting agent does not need permission to turn off heating. A shopping agent does not need permission to complete payment. A mail agent can draft a message, but sending should require human approval.

In the agent ecosystem, the Model Context Protocol is becoming increasingly important as a standardized way to connect models with tools and data. In practice, MCP can be treated as an integration layer: the agent receives descriptions of available tools and can use them according to defined rules. This is a powerful direction, but common sense still applies: not every tool should be connected immediately.

A good question before adding a tool is:

Does this tool actually shorten the agent’s work, or does it only increase the risk surface?
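Least privilege can be expressed as an explicit policy that is checked before every tool call. The sketch below is an illustration only: the tool names, the three permission levels, and the `authorize` function are assumptions, not the API of MCP or of any specific agent framework.

```python
from enum import Enum


class Permission(Enum):
    ALLOW = "allow"              # safe operations the agent may run freely
    NEEDS_APPROVAL = "approval"  # a human must confirm first
    DENY = "deny"                # never available to this agent


# Hypothetical policy for a mail-drafting agent.
POLICY = {
    "read_inbox": Permission.ALLOW,
    "draft_message": Permission.ALLOW,
    "send_message": Permission.NEEDS_APPROVAL,
    "delete_message": Permission.DENY,
}


def authorize(tool: str, approved_by_human: bool = False) -> bool:
    """Return True only if the policy permits this call; unknown tools are denied."""
    permission = POLICY.get(tool, Permission.DENY)
    if permission is Permission.ALLOW:
        return True
    if permission is Permission.NEEDS_APPROVAL:
        return approved_by_human
    return False


print(authorize("draft_message"))                         # True
print(authorize("send_message"))                          # False
print(authorize("send_message", approved_by_human=True))  # True
print(authorize("make_payment"))                          # False
```

Denying unknown tools by default means that connecting a new tool is always a deliberate decision rather than an accident.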

11. How to test an agent

If an agent is supposed to be useful, it must be tested. That may sound funny, but it is very similar to testing an application. The agent has requirements, inputs, an expected output format, and edge cases.

The simplest method is a set of three to five test tasks.

For a code review agent:

A simple PR without major issues.
A PR with a hidden logical bug.
A PR with missing tests.
A PR with a security problem.
A PR with good code that the agent should not criticize unnecessarily.

For a board game agent:

A simple question about action order.
A dispute between two rule interpretations.
A missing rulebook fragment.
A question dependent on an expansion or game edition.
A situation where the agent should say “I do not know”.

For a weekly planning agent:

A week with a small number of tasks.
An overloaded week.
A schedule conflict.
Tasks without priorities.
A request for an unrealistic plan.

After each test, ask:

Did the agent stay in role?
Did the answer follow the correct format?
Did it mark uncertainty?
Did it ask when data was missing?
Did it avoid going out of scope?
Was the result practical?
Does the instruction need clarification?

The best way to develop an agent is iteration. Do not try to write the perfect instruction immediately. Write version 1.0, test it on real cases, improve it, and save version 1.1.
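Some of the review questions above can be checked automatically. The sketch below covers two of them: whether the answer follows the agreed format and whether it marks uncertainty when data is missing. `REQUIRED_SECTIONS` comes from the template in section 5. The uncertainty markers and the sample answer are assumptions, and simple string matching like this is only a first filter, not a substitute for reading the output.

```python
REQUIRED_SECTIONS = ["Summary", "Key findings", "Recommendations",
                     "Risks / open questions", "Next steps"]
# Assumed phrases that signal marked uncertainty.
UNCERTAINTY_MARKERS = ["i do not know", "to be confirmed", "not sure", "hypothesis"]


def evaluate_answer(answer: str, data_was_missing: bool) -> dict:
    """Return a small report: missing sections and whether uncertainty was marked."""
    lowered = answer.lower()
    missing = [s for s in REQUIRED_SECTIONS if s.lower() not in lowered]
    marked = any(marker in lowered for marker in UNCERTAINTY_MARKERS)
    return {
        "missing_sections": missing,
        # Uncertainty only has to be marked when the test case withheld data.
        "uncertainty_ok": marked or not data_was_missing,
    }


sample = """Summary: the PR adds a retry wrapper.
Key findings: no test covers the timeout path.
Recommendations: add a timeout test.
Risks / open questions: retry limit is to be confirmed with the author."""

report = evaluate_answer(sample, data_was_missing=True)
print(report)
```

For this sample, the report shows one missing section ("Next steps") and confirms that uncertainty was marked. Rerunning the same test tasks after each instruction change shows whether version 1.1 is actually better than version 1.0.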

12. Anti-patterns: what to avoid

Agent building is easy to spoil. Here are the most common anti-patterns.

Anti-pattern 1: agent for everything

If one agent is supposed to write code, plan meals, review articles, analyze finances, and control your house, its instructions will be too generic. It is better to have several small agents.

Anti-pattern 2: no output format

Without a format, the agent will answer once with a table, once with an essay, and once with a loose list of comments. Format is a contract. It makes outputs comparable and automatable.

Anti-pattern 3: no safety boundaries

An agent must know what it is not allowed to do. This is especially important with code, data, money, health, law, home devices, and communication with other people.

Anti-pattern 4: too much context

Dropping hundreds of unstructured pages into the context does not create a smart agent. It creates noise. Better to provide short instructions and well-described reference files.

Anti-pattern 5: no verification

An agent can sound confident even when it is wrong. That is why it needs checklists, tests, sources, and clear uncertainty marking.

Anti-pattern 6: automation too early

Do not automate a process you do not yet understand. First run it manually with an agent several times. Only then connect tools and workflows.

13. Practice plan: how to start

Below is a simple six-week plan. It does not require APIs or programming. You only need an LLM tool, a note file, and a few real tasks.

Week 1: agent as a saved prompt

Choose one repeated task. It can be code review, meal planning, article analysis, training preparation, weekly planning, board game rules judging, or log analysis.

Write a short agent card:

name,
mission,
scope,
output format,
what it does not do,
when it asks the user.

Use it at least three times and write down what went wrong.

Week 2: add examples

Agents work much better when they see examples of good outputs. Add one good answer and one bad answer. Explain why one is good and the other is bad.

Example for a code review agent:

good comment: “There is no test for an empty list. The mapping method returns null instead of an empty collection, which may cause an NPE in the controller.”
bad comment: “The code looks ugly.”

Week 3: add a quality checklist

Every agent should have a checklist. For tests: edge cases, regression, data, assertions, readability. For an article: thesis, structure, examples, sources, uncertainty. For a weekly plan: priorities, buffers, conflicts, overload.

Week 4: build an agent pair

Combine two agents. For example:

author + reviewer,
implementer + tester,
planner + skeptic,
researcher + editor,
leader + specialist.

This is a great exercise because it shows that an agent does not have to be perfect. One agent can create, the other can criticize.

Week 5: add a leader

Create a leader agent that selects roles before starting the task. It should not always create a large team. Its job is to choose the smallest sensible team.

Week 6: only now think about automation

If the agent works well manually, you can move it into a project, custom GPT, skill, subagent in a coding tool, or API workflow. Automation makes sense only when the process is repeatable.

14. Mini-example: your own team for a coding task

Assume you have a task: “Add a password change endpoint in a Spring Boot application, with tests and documentation update.”

A leader agent could propose this team:

1. Requirement Agent

It clarifies the requirement:

who can change the password,
whether the old password is required,
what happens to sessions or tokens after the change,
what validations apply,
what error messages are needed.

2. Architecture Agent

It checks where the change fits:

controller,
service,
validator,
DTO,
security config,
exception handling,
Swagger/OpenAPI documentation.

3. Implementation Agent

It prepares a minimal implementation plan consistent with existing patterns.

4. Test Architect Agent

It proposes tests:

success path,
invalid old password,
weak new password,
new password same as old password,
unauthorized request,
missing user,
impact on tokens/sessions.

5. Security Reviewer

It checks:

whether the password appears in logs,
whether responses reveal too much,
whether validation is enforced on the server, not only in the frontend,
whether the operation requires authentication,
whether tokens should be invalidated after password change.

6. Documentation Agent

It prepares endpoint description, request/response examples, and changelog note.

At the end, the leader should merge everything into one plan:

decisions requiring confirmation,
implementation scope,
list of files to change,
test plan,
risks,
work sequence.

This is the real value of agents: not that one model “writes everything”. The value is that the task is split into meaningful perspectives and the human receives better material for decision-making.
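To show how part of the Test Architect's list turns into executable checks, here is a small table-driven sketch. It is written in Python for brevity, although the scenario is a Spring Boot application. The `validate_password_change` function and its rules (minimum length 12, at least one digit) are assumptions standing in for the real service.

```python
from typing import Optional


def validate_password_change(stored: str, old: str, new: str) -> Optional[str]:
    """Return an error code, or None if the change is acceptable.

    Plain-text comparison is used only to keep the sketch short;
    a real service would compare password hashes.
    """
    if old != stored:
        return "INVALID_OLD_PASSWORD"
    if new == old:
        return "SAME_AS_OLD"
    if len(new) < 12 or not any(ch.isdigit() for ch in new):
        return "WEAK_PASSWORD"
    return None


# (case name, stored, old, new, expected error)
CASES = [
    ("success path", "Old-secret-2023", "Old-secret-2023", "New-secret-2024", None),
    ("invalid old password", "Old-secret-2023", "wrong", "New-secret-2024",
     "INVALID_OLD_PASSWORD"),
    ("weak new password", "Old-secret-2023", "Old-secret-2023", "short1",
     "WEAK_PASSWORD"),
    ("new same as old", "Old-secret-2023", "Old-secret-2023", "Old-secret-2023",
     "SAME_AS_OLD"),
]

for name, stored, old, new, expected in CASES:
    actual = validate_password_change(stored, old, new)
    status = "ok" if actual == expected else "FAIL"
    print(f"{status}: {name}")
```

The remaining cases from the list (unauthorized request, missing user, impact on tokens and sessions) depend on the HTTP and security layers, so they belong in integration tests rather than in a pure function like this one.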

15. Mini-example: your own team for an everyday task

Assume you want to plan a small construction-site monitoring system: temperature, humidity, camera, notifications, daily report.

A leader agent can select this team:

1. Context Agent

It gathers assumptions:

whether there is power on site,
whether there is internet,
which rooms are monitored,
where moisture risks exist,
whether cameras are present,
whether Home Assistant is used,
which notifications make sense.

2. Sensor Agent

It proposes sensors and placement:

temperature,
humidity,
leak detection,
door opening,
motion,
energy,
air quality, if relevant.

3. Automation Agent

It designs automations:

daily report,
warning alert,
critical alert,
ventilation reminder,
power outage information,
unusual motion detection.

4. Camera Event Agent

It describes how to interpret camera events:

human motion,
vehicle,
animal,
zone entry,
after-hours event.

5. Risk Agent

It looks for problems:

false alarms,
internet outage,
dead batteries,
poor sensor placement,
no response procedure,
too many notifications.

At the end, the leader creates a minimum viable monitoring plan for the construction site. This is a good example of everyday agent use: agents organize decisions without replacing professionals.
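The warning and critical alerts from the Automation Agent's list reduce to threshold checks. A minimal sketch follows. The humidity thresholds are placeholder values chosen for illustration and should be replaced with limits confirmed for the actual building and materials.

```python
# Placeholder thresholds in percent relative humidity (assumptions, not recommendations).
WARNING_HUMIDITY = 70.0
CRITICAL_HUMIDITY = 85.0


def classify_humidity(value: float) -> str:
    """Map a humidity reading to 'ok', 'warning', or 'critical'."""
    if value >= CRITICAL_HUMIDITY:
        return "critical"
    if value >= WARNING_HUMIDITY:
        return "warning"
    return "ok"


def daily_report(readings: dict) -> list[str]:
    """Build report lines from room -> list of readings, worst rooms first."""
    order = {"critical": 0, "warning": 1, "ok": 2}
    rows = []
    for room, values in readings.items():
        if not values:
            # Missing data is itself a warning: dead battery or lost connection.
            rows.append((order["warning"], f"{room}: no data (check sensor or battery)"))
            continue
        peak = max(values)
        level = classify_humidity(peak)
        rows.append((order[level], f"{room}: {level}, peak humidity {peak:.1f}%"))
    return [line for _, line in sorted(rows)]


readings = {"basement": [72.0, 88.5], "garage": [55.0, 61.0], "attic": []}
for line in daily_report(readings):
    print(line)
```

Treating a silent sensor as a warning addresses two items from the Risk Agent's list, dead batteries and internet outages, without adding another notification type.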

16. The key skill: writing instructions that can be improved

Building agents is not about writing one brilliant prompt. It is closer to developing automated tests or process documentation. The first version will be imperfect. That is normal.

A good agent instruction should be versioned. You can keep it in a Markdown file, repository, project, or notes app. After every use, add one observation:

what the agent did well,
what it did poorly,
which rule needs clarification,
which example should be added,
which question it should ask next time.

After a few iterations, you begin to create a valuable asset: not only an agent, but a description of your way of working. That is often more important than the model itself.

17. How to distinguish a good agent from an impressive one

An impressive agent gives long, confident, polished answers. A good agent gives useful, verifiable answers aligned with the process.

A good agent:

asks when data is missing,
says what it does not know,
stays within scope,
returns output in the agreed format,
separates facts, assumptions, and recommendations,
identifies risks,
does not automatically perform costly or irreversible actions,
can work from examples,
improves after feedback.

An impressive but weak agent:

pretends to be certain,
generates a lot of text without decisions,
ignores the expected format,
ignores constraints,
mixes facts with opinions,
invents sources,
proposes overly complex solutions,
acts outside its scope,
cannot say “I do not know”.

Keep this list nearby because AI can easily impress with style. In professional work, style is not enough.

18. Final exercise for today

If you want to start today, do a very simple exercise.

Choose one task that repeats in your work or life. Make it concrete. For example:

pull request review,
test plan for a user story,
log error analysis,
blog post preparation,
weekly shopping plan,
board game rules judging,
smart home sensor report,
contractor offer comparison.

Then write an agent card in five points:

Agent name:
Mission:
Scope:
Output format:
What the agent must not do:

Use this card on three real examples. After each example, add one improvement to the instruction. After three uses, you will have an agent that works better than the ad hoc prompts most people rely on.

The most important mindset is this: you are not building “AI that does everything”. You are building small, specialized tools that strengthen your way of working. The leader agent helps split large tasks. Specialist agents help look from different perspectives. The human still makes decisions, sets boundaries, and remains responsible for the result.

That is the most sensible way to use agents: not as a replacement for thinking, but as a system for organizing work, speeding up analysis, and making better decisions.

Sources and further reading

OpenAI Help Center: Projects in ChatGPT
OpenAI Help Center: Creating and editing GPTs
OpenAI Help Center: GPTs in ChatGPT
OpenAI Developers: Agents SDK
OpenAI Developers: Define agents
OpenAI Developers: Agent Builder
Claude Code Docs: Create custom subagents
Claude Code Docs: Extend Claude with skills
Claude Code Docs: How Claude remembers your project
Claude Code Docs: How Claude Code works
Model Context Protocol: Introduction
Model Context Protocol: Tools