Agentic Frameworks

Microsoft AutoGen

AutoGen is Microsoft Research's framework for building multi-agent AI systems through conversation. Agents talk to each other, pass code back and forth, execute it, and iterate until the goal is achieved. It pioneered many patterns now standard across the field — and it remains widely used, well-documented, and relevant for specific use cases today.

Maintenance mode · Agentic framework · Free · MIT (mostly) · Python
⚠ Important — AutoGen is in maintenance mode

As of 2025, AutoGen is in maintenance mode: no new features are being added, bug fixes only. Microsoft is consolidating its agent tooling into the Microsoft Agent Framework, while AG2 — a community fork maintained by AutoGen's original creators — continues active development of the classic API. If you are starting a new project, consider one of those successors or another actively developed framework. If you have existing AutoGen code, it continues to work, and migration guidance is linked from microsoft.github.io/autogen. This guide covers AutoGen as it stands — it remains useful for understanding the conversational multi-agent pattern it pioneered.

The core idea: agents that talk to each other

Most major frameworks — LangChain, CrewAI, LlamaIndex — structure agents around tool calls and task assignments. AutoGen takes a different approach: agents communicate by sending messages to each other in a conversation, just as humans on a team would.

An AutoGen system is a group chat. Each agent is a participant. The conversation has a topic (the goal). Agents take turns speaking — proposing solutions, writing code, reviewing each other's work, requesting clarification, running tests. The conversation continues until the task is complete or someone signals it is done.

This conversational model turns out to be surprisingly powerful for tasks that involve iterative refinement: a coder agent writes code, an executor agent runs it, the coder reads the output and fixes errors, the executor runs it again. The back-and-forth between agents mirrors how human developers actually work.

The two core agent types

AssistantAgent — powered by an LLM. Receives messages, reasons about them, writes responses. Can write code, propose solutions, answer questions, or direct the conversation. Does not execute code itself.

UserProxyAgent — acts on behalf of a human (or autonomously). Executes code that the AssistantAgent writes. Can ask the human to intervene at any point. Can be set to fully autonomous (no human intervention) or to pause before executing each piece of code.

The most common pattern: one AssistantAgent that writes code, one UserProxyAgent that executes it and returns the output. The two agents exchange messages until the task is complete.
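This write/execute/iterate loop can be sketched in plain Python. The snippet below is a toy illustration of the pattern, not the AutoGen API: the real framework backs the coder role with an LLM and sandboxes execution.

```python
def coder(history):
    """Stand-in for an LLM-backed AssistantAgent: proposes code, then terminates."""
    if any("Result: 120" in msg for msg in history):
        return "TERMINATE"
    return "def factorial(n):\n    return 1 if n <= 1 else n * factorial(n - 1)"

def proxy(code):
    """Stand-in for a UserProxyAgent: executes proposed code and reports the output."""
    namespace = {}
    exec(code, namespace)  # real AutoGen would run this in a Docker sandbox
    return f"Result: {namespace['factorial'](5)}"

history = ["Task: compute factorial(5)"]
while True:
    reply = coder(history)
    if "TERMINATE" in reply:  # the termination keyword ends the conversation
        break
    history.append(proxy(reply))

print(history[-1])  # → Result: 120
```

In real AutoGen the same loop is driven by user_proxy.initiate_chat(assistant, message=...), with the model deciding what code to write and when to emit TERMINATE.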

What AutoGen is best for

AutoGen's conversational pattern shines in three scenarios:

  • Code generation and debugging — write, run, observe output, fix, repeat. The natural back-and-forth of software development maps directly onto the agent conversation model.
  • Research tasks requiring debate — multiple agents with different perspectives argue, challenge, and refine ideas. The diversity of views produces better outputs than a single agent.
  • Learning and teaching applications — a student agent asks questions, a teacher agent answers, the student summarises what it learned, the teacher corrects gaps. The dialogue format is natural for education.

New project decision: AutoGen is in maintenance mode. For new projects, evaluate CrewAI (easier, actively developed) or LangGraph (more powerful, actively developed) first. AutoGen's value today is primarily for existing codebases and for understanding the conversational multi-agent pattern it established.

How a conversation-based workflow operates

A basic AutoGen workflow has three components: two or more agents, a task description, and a trigger that starts the conversation. The UserProxyAgent initiates by sending the task to the AssistantAgent. The AssistantAgent responds — typically with a plan and some code. The UserProxyAgent executes the code and returns the result. The AssistantAgent reads the result, identifies any issues, and responds with corrected code or a follow-up step. This continues until one agent signals completion using a termination keyword (commonly "TERMINATE").

By default, the UserProxyAgent executes code in a local Docker container, which provides sandboxing. It can be configured to run code directly in the local environment (faster but less safe) or not at all (for workflows that do not involve code).

Prompts for building and working with AutoGen

Understanding and getting started

Explain AutoGen's conversational model
I understand how LangChain agents work (tool calls, ReAct loop). Explain how AutoGen's conversational model is different. What are the concrete advantages and disadvantages of having agents communicate via messages rather than via tool calls? When would I choose AutoGen over LangChain for a real task?
Should I use AutoGen for my project?
I want to build [describe your project]. AutoGen is in maintenance mode; its successors are the Microsoft Agent Framework and the community fork AG2. Given this, should I use AutoGen, migrate to one of the successors, or use a different framework like CrewAI or LangGraph? Give me a specific recommendation with reasoning, considering both the technical fit and the long-term maintenance risk.

Building with AutoGen

Basic two-agent coding workflow
Write a basic AutoGen setup in Python with an AssistantAgent and a UserProxyAgent. The task: write a Python function that takes a list of URLs, fetches each one, and returns a dictionary of {url: word_count}. The assistant writes the code, the proxy executes it, and they iterate until a working solution is produced. Use gpt-4o as the model. Configure the UserProxyAgent to not require human input.
Multi-agent group chat
Build an AutoGen GroupChat with three agents: a product manager (defines requirements), a software engineer (writes code), and a code reviewer (reviews the code and requests changes). The goal is to produce a working Python script for [describe task]. Show how to set up the GroupChat and GroupChatManager, how to define each agent's system message, and how to trigger the conversation.
Data analysis workflow
Build an AutoGen workflow for data analysis. The AssistantAgent receives a task: "Analyse this CSV file and produce a summary of key statistics and any notable patterns." The UserProxyAgent has access to a local Python environment with pandas and matplotlib. The workflow should: load the CSV, compute summary stats, identify outliers, produce a chart, and write a plain English summary. Show the full code including how to pass the file path.
Debate and critique workflow
Build an AutoGen system where two AssistantAgents debate a position. One agent argues FOR [a position], one argues AGAINST. A third agent (the moderator) summarises the strongest points from both sides after three rounds of debate and produces a balanced assessment. Show the GroupChat setup with all three agents and the termination condition.
Configure safe code execution
My AutoGen UserProxyAgent is executing code directly in my local environment, which is risky. Show me how to configure it to execute in a Docker sandbox instead. What Docker image should I use? What are the resource limits I should set? How do I handle the case where Docker is not available on the target machine?

Migrating and transitioning

Migrate AutoGen code to AG2
I have existing AutoGen code that I want to migrate to AG2 (the community fork of AutoGen maintained by its original creators). Here is my current code: [paste]. Identify what needs to change, what APIs have moved, and what the AG2 equivalents are. Show the migrated code alongside the original so I can see what changed.
Rewrite AutoGen workflow in CrewAI
I have this AutoGen multi-agent workflow: [paste code]. Rewrite it using CrewAI. Map each AutoGen agent to a CrewAI Agent with appropriate role, goal, and backstory. Map the conversation flow to CrewAI tasks. Note any functionality that works differently or that CrewAI handles in a fundamentally different way.

Debugging

Debug an AutoGen conversation that won't terminate
My AutoGen agents keep talking to each other and never terminate. The conversation has exceeded 30 messages. Here is my setup: [paste code]. What termination conditions should I add? Show me how to set a maximum message count, how to configure the TERMINATE keyword correctly, and how to add a fallback timeout condition.
Handle code execution errors gracefully
My AutoGen UserProxyAgent sometimes gets stuck when the AssistantAgent writes code that produces an unhandled exception. The agents loop trying to fix it for too many turns. Show me how to configure the UserProxyAgent to limit retries, pass the full error trace back to the assistant, and gracefully abort with a useful error message after 3 failed attempts.
Add human-in-the-loop checkpoints
I want my AutoGen workflow to pause and ask a human for approval before executing certain types of code — specifically anything that writes to disk or makes external API calls. Show me how to configure the UserProxyAgent's human_input_mode and is_termination_msg to implement this selective pause pattern.

AutoGen's architecture — the ConversableAgent model

AutoGen's core abstraction is the ConversableAgent base class. Every agent in AutoGen is a subclass of ConversableAgent, which provides: a send method (to send messages to another agent), a receive method (to process incoming messages and generate a reply), a message history (stored as a list of role/content pairs), and configurable reply functions that determine how the agent responds to any given message.

The reply function chain is the key architectural mechanism. When an agent receives a message, it passes it through a registered list of reply functions in order, stopping at the first one that produces a non-None response. The default chain includes: a termination check (does the message contain the TERMINATE keyword?), a tool execution check (does the message contain a tool call result?), and an LLM response generation. Developers can insert custom reply functions anywhere in this chain — enabling fine-grained control over agent behaviour without subclassing.
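A minimal sketch of that chain in plain Python (illustrative only, not AutoGen's actual signatures): each registered function either handles the message or returns None to pass it along, and a custom function can be inserted anywhere in the list.

```python
def check_termination(message):
    """First in the chain: an empty reply ends the conversation."""
    return "" if "TERMINATE" in message else None

def run_tool_result(message):
    """Second: handle tool call results if present, otherwise pass."""
    if message.startswith("TOOL_RESULT:"):
        return f"Observed: {message.removeprefix('TOOL_RESULT:').strip()}"
    return None

def generate_llm_reply(message):
    """Last resort: stand-in for an actual model call."""
    return f"LLM reply to: {message!r}"

# Custom reply functions can be inserted at any position in this list.
REPLY_CHAIN = [check_termination, run_tool_result, generate_llm_reply]

def reply(message):
    for fn in REPLY_CHAIN:  # first non-None response wins
        result = fn(message)
        if result is not None:
            return result
    return None

print(reply("TOOL_RESULT: 42"))  # → Observed: 42
print(reply("hello"))            # → LLM reply to: 'hello'
```

In AutoGen itself, the equivalent hook is registering a reply function on a ConversableAgent rather than building the list by hand.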

GroupChat — the multi-agent coordination mechanism

For systems with more than two agents, AutoGen provides the GroupChat class and its manager, GroupChatManager. The GroupChatManager is itself an LLM-backed agent whose job is to select which agent speaks next. It receives the conversation history and generates the name of the next speaker based on the context — effectively implementing dynamic routing without explicit conditional edges.

Speaker selection can also be configured as: round-robin (each agent speaks in turn), random, or custom (a developer-provided function determines the next speaker). The LLM-based selection is the most flexible but most expensive option; round-robin is the most predictable for structured workflows.
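The non-LLM strategies are simple enough to sketch in plain Python. This is illustrative, not AutoGen's API (there, selection is configured via a parameter on GroupChat):

```python
import itertools
import random

agents = ["product_manager", "engineer", "reviewer"]

# Round-robin: each agent speaks in a fixed rotation. Predictable and cheap.
rotation = itertools.cycle(agents)
order = [next(rotation) for _ in range(4)]

# Random: any agent may speak next.
next_random = random.choice(agents)

# Custom: a developer-provided function picks the next speaker from context.
def custom_selector(last_speaker, history):
    """Hypothetical rule: the reviewer always follows the engineer."""
    return "reviewer" if last_speaker == "engineer" else "engineer"

print(order)                               # → ['product_manager', 'engineer', 'reviewer', 'product_manager']
print(custom_selector("engineer", order))  # → reviewer
```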

Code execution sandboxing

AutoGen's UserProxyAgent can execute code in three modes, configured via the code_execution_config parameter:

  • Local execution — code runs directly in the Python process. Fast, no setup, but no isolation. Appropriate only for trusted code in development environments.
  • Docker execution — code runs in a Docker container with configurable resource limits. Provides strong isolation. Requires Docker. The recommended mode for any production or semi-production use.
  • No execution — code blocks are returned as text, not executed. Useful for code review or documentation workflows where execution is not the goal.
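In the classic AutoGen 0.2 API these modes are selected via the code_execution_config argument to UserProxyAgent. The shapes below follow that API, but check the documentation for the version you are running; the Docker image name is just an example:

```python
# 1. Local execution: fast, no isolation. Trusted development code only.
local_config = {"work_dir": "coding", "use_docker": False}

# 2. Docker execution: each block runs in a container (recommended default).
#    Passing a string selects the image; True uses the framework's default.
docker_config = {"work_dir": "coding", "use_docker": "python:3.11-slim"}

# 3. No execution: code blocks are returned as text, never run.
no_execution = False

# Typical usage (sketch):
# user_proxy = autogen.UserProxyAgent(
#     name="user_proxy",
#     human_input_mode="NEVER",
#     code_execution_config=docker_config,
# )
```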

AutoGen's contribution to the field

AutoGen's influence extends well beyond its current usage. Three patterns it established are now standard across agentic AI:

  • Conversational coordination — using natural language messages as the coordination mechanism between agents, rather than structured API calls. Now present in every major framework.
  • Code-writing and code-executing agent pair — separating the agent that writes code from the agent that executes it, so that code review and approval can be inserted between the two. Standard safety pattern in production agent systems.
  • Human-in-the-loop as a first-class citizen — AutoGen made it easy to pause agent execution at any point and request human input. Most other frameworks have since added similar capabilities.

The transition to the Microsoft Agent Framework and AG2

In 2025, Microsoft announced that AutoGen would be consolidated into the Microsoft Agent Framework; separately, AutoGen's original creators maintain AG2, a community fork that continues the classic API. The key changes in the Microsoft Agent Framework: a unified agent base class compatible with MCP, improved async support, native integration with Azure AI services, and a cleaner separation between the agent runtime and the agent logic. Migration guidance is maintained at microsoft.github.io/autogen.

For teams with significant investment in AutoGen, the migration paths are well-documented, and AG2's API in particular is deliberately close to AutoGen's (it began as a fork). For teams starting fresh, a successor or an alternative framework is the recommended starting point.

Official documentation: microsoft.github.io/autogen

Source note: All technical specifications are drawn from the official AutoGen documentation and the original AutoGen research paper (arXiv:2308.08155). Maintenance mode status is documented at the official GitHub repository. Verified April 2026.