Loki Mode Goes Provider Agnostic: v5.0 and Multi-Provider Support
Loki Mode v5.0 introduces provider-agnostic orchestration across Claude, Codex, and Gemini CLI with zero architecture changes
Loki Mode v5.0 shipped this week with a feature I have been working toward since the project's earliest design decisions: full provider-agnostic orchestration. The system now runs across Claude Code, OpenAI Codex CLI, and Google Gemini CLI with a single configuration change. No code modifications. No architecture adjustments. Just swap the provider and the entire multi-agent pipeline runs on a different foundation model.
This was not accidental. It was the result of a deliberate architectural bet that is now paying off.
Why Provider Agnosticism Matters
When I started building Loki Mode, Claude Code was the only serious CLI-based AI coding tool. But I could see the trajectory. OpenAI was going to ship a CLI tool. Google was going to ship a CLI tool. The major AI labs are converging on the same interface pattern: a terminal-based agent that can read files, write code, and execute commands.
Building an orchestration layer locked to a single provider would have been simpler in the short term but strategically limiting. If your agent system only works with one foundation model, you inherit all of that model's strengths, weaknesses, pricing changes, and availability constraints. You also lose the ability to evaluate alternatives or mix providers for different tasks.
Provider agnosticism gives you options. When Claude excels at a particular type of reasoning task, use Claude. When Codex handles code generation in a specific language better, use Codex. When Gemini's context window gives you an advantage for large codebase analysis, use Gemini. The orchestration layer should not care which model does the work; it should care that the work meets quality standards.
The Architecture
The implementation is straightforward because the architecture was designed for it from the beginning. Each provider is defined as a shell-sourceable configuration file:
# providers/claude.sh
export LOKI_PROVIDER="claude"
export LOKI_CLI="claude"
export LOKI_AUTO_FLAG="--dangerously-skip-permissions"

invoke_provider() {
    local prompt="$1"
    $LOKI_CLI $LOKI_AUTO_FLAG -p "$prompt"
}

# providers/codex.sh
export LOKI_PROVIDER="codex"
export LOKI_CLI="codex"
export LOKI_AUTO_FLAG="--full-auto"

invoke_provider() {
    local prompt="$1"
    $LOKI_CLI $LOKI_AUTO_FLAG "$prompt"
}

# providers/gemini.sh
export LOKI_PROVIDER="gemini"
export LOKI_CLI="gemini"
export LOKI_AUTO_FLAG="--approval-mode=yolo"

invoke_provider() {
    local prompt="$1"
    $LOKI_CLI $LOKI_AUTO_FLAG "$prompt"
}
The loader sources the appropriate provider configuration at startup, and the rest of the system interacts through the invoke_provider function. The orchestration logic, the RARV cycle, the quality gates, the swarm coordination: none of it knows or cares which model is underneath.
# Switch providers at runtime
source providers/loader.sh
load_provider "claude" # or "codex" or "gemini"
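For illustration, a loader like this can be sketched in a few lines. This is a minimal hypothetical version, not the shipped loader.sh; the `PROVIDERS_DIR` variable and the exact error message are assumptions.

```shell
# Hypothetical sketch of a provider loader. The real loader.sh may differ;
# PROVIDERS_DIR and the error handling here are illustrative.
PROVIDERS_DIR="${PROVIDERS_DIR:-providers}"

load_provider() {
    local name="$1"
    local config="$PROVIDERS_DIR/$name.sh"
    if [ ! -f "$config" ]; then
        # Unknown providers fail loudly instead of silently falling through.
        echo "unknown provider: $name" >&2
        return 1
    fi
    # Sourcing the config overwrites LOKI_* and redefines invoke_provider.
    source "$config"
}
```

Because sourcing simply redefines `invoke_provider`, switching providers at runtime is just another `load_provider` call.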
Provider Feature Matrix
Not all providers are equal, and pretending otherwise would be dishonest. Here is the current state:
Claude Code runs at full capability. All 41 agent types, all 8 swarms, full RARV cycle, quality gates, parallel review loops. Claude's tool use capabilities and instruction following make it the most reliable provider for complex orchestration tasks.
Codex CLI runs in degraded mode. Core functionality works: code generation, file modification, test execution. But Codex's approach to autonomy is different from Claude's, and some of the more sophisticated agent interactions need adaptation. I classify this as "functional but not optimal."
Gemini CLI also runs in degraded mode. As with Codex, the core pipeline works, but Gemini's strengths lie elsewhere, particularly in large-context analysis and research-oriented tasks. The swarms that benefit from large context windows perform better with Gemini than with other providers.
The honest assessment is that Claude Code remains the primary provider for production use, and the other providers serve as alternatives for specific use cases, cost optimization, or resilience when a provider has an outage.
What I Learned Implementing This
The implementation revealed some interesting differences between providers that are not obvious from their documentation.
Autonomy flags vary significantly. Claude uses --dangerously-skip-permissions, which is explicit and self-documenting. Codex uses --full-auto, which is more concise. Gemini uses --approval-mode=yolo, which is memorable but tells you less about what it actually does. The naming differences reflect different philosophies about how much the tool should warn you before running autonomously.
Output formats are inconsistent. Each provider structures its output differently. Claude provides structured tool use results. Codex returns a different format. Gemini has its own conventions. The orchestration layer needs to normalize these outputs before passing them to the next phase of the pipeline. This normalization layer was the most tedious part of the implementation, but it is essential for provider interchangeability.
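As one illustrative piece of that normalization, terminal-oriented CLIs often emit ANSI color codes and trailing whitespace that downstream phases should never see. This sketch shows only that generic cleanup step; the real per-provider format rules are more involved.

```shell
# Illustrative normalization filter: strips ANSI color escapes and trailing
# whitespace so downstream pipeline phases see uniform plain text. This is
# one generic step, not the full per-provider normalization layer.
normalize_output() {
    local esc
    esc=$(printf '\033')  # literal ESC character, portable across sed variants
    sed -e "s/${esc}\[[0-9;]*m//g" -e 's/[[:space:]]*$//'
}
```

Used as a pipe stage: `invoke_provider "$prompt" | normalize_output`.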
Error handling differs. When a task fails, each provider communicates the failure differently. Some provide structured error codes. Others return natural language descriptions of what went wrong. The orchestration layer needs to handle all of these patterns and translate them into a consistent internal representation.
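A small dispatch table is one way to fold these patterns into a consistent internal vocabulary. The specific exit codes and status names below are assumptions for illustration, not the actual mapping.

```shell
# Hypothetical translation of provider failures into one internal status
# vocabulary. The exit codes per provider are illustrative assumptions.
classify_failure() {
    local provider="$1" exit_code="$2"
    case "$provider:$exit_code" in
        *:0)      echo "ok" ;;
        *:124)    echo "timeout" ;;       # timeout(1) convention
        claude:1) echo "task_error" ;;    # assumed code for a failed task
        codex:2)  echo "task_error" ;;    # assumed code for a failed task
        *)        echo "unknown_error" ;;
    esac
}
```

The orchestration layer then branches on the internal status, never on raw provider output.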
Context management varies. Each provider has different context window sizes, different approaches to file reading, and different strategies for managing conversation history. The orchestration layer cannot assume a fixed context budget; it needs to adapt based on the active provider's capabilities.
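One way to keep the orchestrator adaptive is a per-provider budget lookup consulted before assembling a prompt. The token counts below are placeholders for illustration only, not published context limits.

```shell
# Illustrative per-provider context budget table. The numbers here are
# placeholders, NOT the providers' published context window sizes.
context_budget() {
    case "$1" in
        claude) echo 150000 ;;
        codex)  echo 120000 ;;
        gemini) echo 800000 ;;
        *)      echo 100000 ;;  # conservative default for unknown providers
    esac
}
```

Prompt assembly can then trim file contents or history to fit `$(context_budget "$LOKI_PROVIDER")` rather than assuming one fixed budget.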
The Testing Challenge
Testing a provider-agnostic system requires running the same test suite against multiple providers. This is expensive in both time and API costs. I built a provider test harness that validates core functionality across all three providers:
# tests/test-provider-loader.sh
# 12 tests covering:
# - Provider loading and configuration
# - Environment variable setup
# - invoke_provider function availability
# - Provider switching at runtime
# - Error handling for unknown providers
# - Default provider fallback
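A single harness test from that list might look like the sketch below: load a throwaway provider file and assert that `invoke_provider` exists and behaves. The stub file and assertions are illustrative, not copied from the actual test suite.

```shell
# Sketch of one harness test: a provider config must define a working
# invoke_provider function. Uses a throwaway stub file, not a real CLI.
test_provider_exports_invoke() {
    local dir
    dir=$(mktemp -d)
    cat > "$dir/stub.sh" <<'EOF'
export LOKI_PROVIDER="stub"
invoke_provider() { echo "stub: $1"; }
EOF
    source "$dir/stub.sh"
    # The function must exist and echo a deterministic response.
    [ "$(type -t invoke_provider)" = "function" ] || return 1
    [ "$(invoke_provider hello)" = "stub: hello" ]
}
```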
The test suite runs in CI against Claude (primary) and uses mocked providers for Codex and Gemini to keep costs manageable. Live provider testing happens manually before releases.
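A mocked provider can be just another config file whose `invoke_provider` returns a canned response instead of calling the CLI; the file name and response format here are illustrative.

```shell
# Illustrative mocked provider for CI, e.g. providers/mock-codex.sh:
# same interface as a real provider, but no CLI call and no API cost.
export LOKI_PROVIDER="codex"
export LOKI_MOCKED="1"

invoke_provider() {
    local prompt="$1"
    # Deterministic canned output keeps CI assertions stable.
    echo "MOCK[codex]: received ${#prompt} chars"
}
```

Because the mock satisfies the same `invoke_provider` contract, the rest of the pipeline cannot tell it apart from the real CLI.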
One important lesson: provider-agnostic does not mean provider-unaware. The test suite validates that provider-specific quirks are handled correctly, not that all providers behave identically. Pretending they behave identically would produce false confidence.
Design Philosophy
The v5.0 release reinforces a design philosophy I hold strongly: orchestration layers should be thin, opinionated about process, and agnostic about implementation.
Loki Mode is opinionated about the RARV cycle. Every task goes through Reason, Act, Reflect, Verify. That is not negotiable regardless of which provider is active.
Loki Mode is opinionated about quality gates. Code must pass review, tests must pass, coverage thresholds must be met. The provider does not get to skip these steps.
But Loki Mode is deliberately agnostic about which model generates the code, which model reviews it, and which model writes the tests. These are interchangeable components, and the system is designed to treat them that way.
This separation between process opinions and implementation agnosticism is what makes the architecture flexible without being unprincipled. You get consistency in how work flows through the system and flexibility in what does the work at each step.
What Comes Next
Provider agnosticism opens up possibilities that were not practical with a single-provider design.
Mixed-provider workflows are the next frontier. Imagine using Claude for planning (strong at structured reasoning), Codex for implementation (strong at code generation), and Gemini for review (strong at large-context analysis). Each phase of the RARV cycle could use the provider that is best suited for that type of work.
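The routing logic for such a workflow could start as a simple phase-to-provider table. This mapping is a hypothetical example of the idea, not a shipped feature.

```shell
# Hypothetical phase-to-provider routing for a mixed workflow. The mapping
# below is an example, not part of the current release.
provider_for_phase() {
    case "$1" in
        reason)  echo "claude" ;;  # structured planning
        act)     echo "codex"  ;;  # code generation
        reflect) echo "gemini" ;;  # large-context review
        verify)  echo "claude" ;;  # quality gates
        *)       echo "claude" ;;  # default provider
    esac
}
```

Each RARV phase would then call `load_provider "$(provider_for_phase "$phase")"` before invoking.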
Cost optimization becomes possible. Different providers have different pricing models. An orchestration layer that can route tasks to the most cost-effective provider for each task type could significantly reduce the cost of running multi-agent systems at scale.
Resilience improves when you are not dependent on a single provider. If Claude Code has an outage, the system can fall back to Codex or Gemini. Availability concerns are real for anyone running agent systems in production.
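A fallback chain is straightforward once every provider honors the same invoke contract. In this sketch each argument is a function implementing that contract; wiring it to `load_provider` instead is an equivalent variant.

```shell
# Sketch of a provider fallback chain: try each invoke function in order
# until one succeeds. Each argument after the prompt is a function that
# takes a prompt and returns nonzero on failure.
invoke_with_fallback() {
    local prompt="$1"
    shift
    local invoke
    for invoke in "$@"; do
        if "$invoke" "$prompt"; then
            return 0
        fi
    done
    echo "all providers failed" >&2
    return 1
}
```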
v5.0 is the foundation for all of these capabilities. The architecture is ready. Now it is about building the routing logic on top.