Architecting Resilient LLM Agents: A Guide to Secure Plan-then-Execute Implementations


TL;DR

This paper proposes a resilient LLM agent architecture guide called “Plan-then-Execute” (P-t-E). By decoupling strategic planning from tactical execution to establish control-flow integrity, the architecture is inherently resistant to indirect prompt injection attacks, and it provides a detailed security blueprint for implementing this architecture in mainstream frameworks such as LangChain, CrewAI, and AutoGen.

Key Definitions

The core of this paper revolves around a series of architectural concepts centered on the “Plan-then-Execute” pattern, with the key definitions as follows:

At present, one of the most common design patterns in the LLM agent field is ReAct (Reason-Act). ReAct agents operate in a tight iterative loop: they generate a Thought, perform an Action (usually a tool call), and observe the Observation, then feed that result back into the next loop to generate a new thought.

The main bottlenecks and issues with this pattern include:

This paper aims to address the above issues by proposing a more robust, predictable, and secure agent architecture pattern, namely P-t-E, with particular emphasis on its value in building production-grade, trustworthy LLM agent applications.

Method

The core contribution of this paper is a systematic exposition of the “Plan-then-Execute” (P-t-E) architecture, along with a set of security-centric design principles and implementation guidelines. This is not just an algorithm, but an architectural blueprint for building resilient LLM agents.

Core Ideas and Advantages of the P-t-E Architecture

The P-t-E pattern works by breaking the agent workflow into two core components:

  1. Planner: A powerful LLM responsible for decomposing the user’s high-level goal into a complete, structured list of steps (or a DAG) before the task begins. This plan serves as a formal, machine-readable artifact that guides all subsequent actions.
  2. Executor: A lighter-weight component (which can be a small model or deterministic code) responsible for strictly following the plan, calling tools step by step, and completing subtasks.

The essential innovation of this design lies in fully separating strategic thinking from tactical execution, thereby bringing three architectural advantages:

Security-First Design Principles

The P-t-E pattern itself provides a strong security foundation, but it must be combined with a set of defense-in-depth strategies.

Control-Flow Integrity and Prompt Injection Defense

This is the most central security advantage of P-t-E. By locking in the entire action plan before interacting with external untrusted data (from tool calls), the P-t-E architecture establishes control-flow integrity. Even if a tool’s output contains an indirect prompt injection attack, it cannot alter the pre-approved sequence of actions or spawn new, unplanned actions. It may contaminate the data flow (for example, by including malicious text in an email body), but it cannot hijack the agent’s control flow. This represents a paradigm shift from “behavioral containment” (hoping the LLM itself can resist attacks) to “architectural containment” (relying on hard architectural constraints to ensure safety).

Defense in Depth: Auxiliary Security Controls

To address other risks such as data-flow contamination, the paper emphasizes that the following controls must be combined:

A Safer Variant: Plan-Validate-Execute

For high-risk applications, the paper proposes an enhanced variant of P-t-E. Given that LLMs may produce plans that are “plausible but actually wrong,” this pattern introduces a mandatory human validation step before execution. After the agent generates a plan, a human expert must review and confirm its logic, safety, and correctness before authorizing the Executor to begin work.

Experimental Conclusions

Rather than conducting traditional quantitative experiments, the paper validates the effectiveness and practicality of its design principles by analyzing how a secure P-t-E architecture can be implemented in three mainstream agent frameworks.

LangChain & LangGraph Implementation

CrewAI Implementation

AutoGen Implementation

Summary

The analysis in this article shows that the “plan-execute” architecture is a solid foundation for building safe, predictable, and efficient LLM agents. It ensures the integrity of control flow through architectural design rather than relying on the model’s inherently unreliable behavior, effectively defending against critical threats such as indirect prompt injection.

The final conclusion is that there is no single “silver bullet.” A production-grade, trustworthy LLM agent must adopt a Defense-in-depth strategy, combining the P-t-E architectural pattern with a series of security controls such as least privilege, sandboxed execution, and human verification.