From LLMs to Multi-Agent Systems: How Semiconductor Architecture Is Moving from Compute to Decision

■ Introduction

In recent years, the evolution of large language models (LLMs) has significantly changed the semiconductor industry.

The question was simple:

How can we compute faster?

The answer was a GPU-centric world.
The goal was to accelerate matrix operations to the extreme and maximize FLOPS.

However, AI is now entering the next phase.

AI is no longer just a single model.
It is beginning to evolve into a system where multiple agents collaborate.

This shift changes the design philosophy of semiconductors themselves.


■ 1. Semiconductors in the LLM Era: Compute-Centric

The computational characteristics of LLMs are clear.

  • Large-scale matrix operations
  • Parallel execution of uniform processing
  • Batch processing
  • Minimal branching

The architecture optimized for these characteristics was the GPU, with NVIDIA at its center.

The structure was simple:

Input → Model → Output

What mattered most was:

How to maximize FLOPS.


■ 2. The Rise of the Multi-Agent Era

Agentic AI, including multi-agent systems, has a completely different computational structure.

  • Multi-step processing
  • Tool integration
  • Interaction with the environment
  • State retention
  • Communication between agents

The structure changes to:

Perception → Action → Feedback → Loop

This is no longer just inference.

It is a continuously running system.
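
Expressed in code, the shape of this loop is simple, even if its consequences are not. Below is a minimal Python sketch; the environment and policy objects are hypothetical placeholders, not the API of any particular framework.

```python
# A minimal sketch of the Perception → Action → Feedback → Loop cycle.
# The environment and policy interfaces are illustrative placeholders.

def run_agent(environment, policy, max_steps=100):
    state = {"history": []}                    # the agent retains state across steps
    observation = environment.observe()        # Perception
    for _ in range(max_steps):
        action = policy(observation, state)    # decide (often an LLM call)
        feedback = environment.apply(action)   # Action, then Feedback
        state["history"].append((observation, action, feedback))
        observation = feedback                 # Loop: feedback becomes the next input
    return state
```

Note that even this toy version needs an explicit max_steps cap. Real agent loops have no natural stopping point of their own, a problem we will return to later.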


■ 3. The Essential Change in Workloads

This shift changes the bottleneck.

■ LLM Era

  • Compute-dominant
  • GPU-optimized

■ Multi-Agent Era

  • Latency
  • Memory
  • Control flow
  • Communication

In other words:

The bottleneck moves from “compute” to “system.”


■ 4. Changes in Semiconductor Architecture

This shift greatly changes the structure of semiconductor architecture.

■ ① From GPU-Centric to Heterogeneous Computing

Before:

GPU-centric

Future:

CPU + GPU + Memory + Network

■ ② Re-evaluation of CPUs, Especially Arm

In multi-agent systems, the following become important:

  • Branching
  • State management
  • Asynchronous processing

This is why Arm is being re-evaluated.

Reasons include:

  • Low power consumption
  • Strength in control processing
  • Scalability

■ ③ Shift Toward Memory-Centric Design

In Agentic AI, the following are critical:

  • Context
  • History
  • State

Memory access becomes more dominant than computation.

Therefore, the following will become increasingly important:

  • HBM (High Bandwidth Memory)
  • Near-memory computing

■ ④ Network Becomes a Bottleneck

In multi-agent systems, the following occur frequently:

  • Communication between agents
  • External tool calls

Network performance will increasingly determine overall system performance.

■ ⑤ Asynchronous and Event-Driven Processing

Before:

  • Synchronous processing
  • Batch processing

Future:

  • Event-driven processing
  • Asynchronous processing

Hardware will also need to be optimized for this model.
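
To make the contrast concrete, here is a minimal event-driven sketch using Python's standard asyncio library. The event names and the single-worker setup are illustrative assumptions, not a reference design.

```python
# A minimal event-driven sketch: work happens only when an event arrives.

import asyncio

async def agent_worker(name: str, inbox: asyncio.Queue) -> None:
    # The worker sleeps until an event arrives, instead of polling in a batch loop.
    while True:
        event = await inbox.get()
        if event is None:              # sentinel value: shut down cleanly
            break
        print(f"{name} handling event: {event}")
        inbox.task_done()

async def main() -> None:
    inbox: asyncio.Queue = asyncio.Queue()
    worker = asyncio.create_task(agent_worker("agent-1", inbox))
    for event in ("tool_result", "user_message", "timer_tick"):
        await inbox.put(event)         # events arrive one by one, not as a batch
    await inbox.join()                 # wait until every queued event is handled
    await inbox.put(None)
    await worker

asyncio.run(main())
```

The structural point: nothing in this program runs on a fixed schedule. Work is triggered by events, and that is the model hardware will increasingly need to serve.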


■ 5. What Happens to NVIDIA?

The important point is this:

GPUs will not become unnecessary.

NVIDIA will continue to play a central role in:

  • LLM inference
  • Training

However, its role will change.

Before:

The main actor responsible for everything

Future:

One component within a broader system


■ 6. Apple as a Sign of the Future

One company that already embodies this direction is Apple, through its SoC architecture.

It integrates:

  • CPU for control
  • GPU for computation
  • NPU for inference
  • Unified memory

This can be seen as a prototype of AI as a system.


■ 7. The Essential Shift

In one sentence, the shift can be described as follows:

■ Before: The LLM Era

AI = A compute problem

■ After: The Multi-Agent Era

AI = A system problem


■ 8. The Challenge of Agentic AI: It Runs, but It Is Difficult to Control

As we have seen, Agentic AI begins to operate as a system.

However, this creates a fundamental problem.

That problem is:

A system that keeps running is difficult to control.

■ Why Does Control Become Difficult?

Agentic AI has the following characteristics:

  • It has time, because it runs continuously.
  • It has state, such as context and history.
  • It interacts with the outside world through I/O.

As a result, the system takes on the following properties.

■ ① State Continues to Change

  • Context is continuously updated.
  • Past decisions influence the present.

Small deviations accumulate and change the behavior of the system.

■ ② Branching Becomes Invisible

  • Decisions are buried inside the model.
  • It becomes difficult to explain why a certain action was taken.

Decision-making becomes a black box.

■ ③ There Is No Clear Stopping Point

  • Loops
  • Retries
  • Exploration

These can lead to infinite execution and increased cost.

■ ④ The System Depends on the External Environment

  • APIs
  • Data
  • Context

Even the same process can produce different results.

■ ⑤ The System Cannot Be Reproduced

  • State is implicit.
  • History is incomplete.

Verification and improvement become difficult.

■ In One Sentence

Agentic AI becomes a system by acquiring time and state.

But at the same time, it introduces uncontrollability caused by history-dependence.


■ 9. Approaches to This Challenge

How to handle this uncontrollability is now a major turning point.

Several approaches are possible.

■ ① Optimization Through Reinforcement Learning

  • Learn behavior through trial and error
  • Maximize long-term reward

However:

  • Learning cost is high.
  • Constraints are difficult to make explicit.
  • Safety guarantees are weak.

■ ② Rule-Based Control

  • Make conditional branches explicit
  • Control behavior procedurally

However:

  • It does not scale well.
  • It lacks flexibility.

■ ③ Workflow / Orchestration

  • Control the system as a flow
  • Use DAGs or pipelines (see the sketch after this list)

However:

  • It is weak against dynamic situations.
  • Its handling of state is limited.
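
As a concrete illustration of the DAG style, here is a minimal sketch using Python's standard graphlib module. The step names and dependencies are invented for illustration.

```python
# A minimal DAG-style workflow sketch using only the standard library.

from graphlib import TopologicalSorter

# Each step lists the steps it depends on.
dag = {
    "fetch":   set(),
    "analyze": {"fetch"},
    "report":  {"analyze"},
    "notify":  {"analyze"},
}

steps = {name: (lambda n=name: print(f"running {n}")) for name in dag}

# static_order() fixes the execution order before anything runs,
# which is exactly why this approach is weak against dynamic situations.
for name in TopologicalSorter(dag).static_order():
    steps[name]()
```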

■ ④ Monitoring / Guardrails

  • Anomaly detection
  • Filtering

However:

  • It is reactive.
  • It is not fundamental control.

■ 10. DTM: Decision Trace Model as an Approach

There is another direction.

That direction is DTM.

■ What DTM Does

DTM is an approach that:

introduces a decision-making structure into a running system.

■ More Specifically

■ Decision Contract

  • Under what conditions
  • What should be selected

This makes branching explicit.

■ Boundary

  • What is acceptable
  • Where the system should stop

This prevents runaway behavior.

■ Human Gate

  • Return uncertain areas to humans

This prevents full automation from becoming uncontrolled automation.

■ Trace

  • When
  • What
  • Why

This makes reproduction, verification, and learning possible.
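
To make these four primitives concrete, the sketch below expresses them as plain Python objects. DTM here is a conceptual model, so every class and field name in this sketch is an assumption invented for illustration, not a published API.

```python
# A minimal sketch of DTM's four primitives. All names are illustrative.

from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable

@dataclass
class DecisionContract:
    condition: Callable[[dict], bool]     # under what conditions
    action: str                           # what should be selected

@dataclass
class Boundary:
    is_acceptable: Callable[[str], bool]  # what is acceptable / where to stop

@dataclass
class TraceEntry:
    when: str                             # when
    what: str                             # what
    why: str                              # why

trace_log: list[TraceEntry] = []

def decide(state: dict, contracts: list[DecisionContract], boundary: Boundary) -> str:
    for contract in contracts:            # Decision Contract: explicit branching
        if contract.condition(state):
            action, why = contract.action, "contract matched"
            break
    else:                                 # Human Gate: uncertain cases go to people
        action, why = "escalate_to_human", "no contract matched"
    if not boundary.is_acceptable(action):
        action, why = "stop", "boundary violated"   # Boundary: prevent runaway
    trace_log.append(TraceEntry(          # Trace: when / what / why
        when=datetime.now(timezone.utc).isoformat(), what=action, why=why))
    return action

# Usage: a low-risk state matches the contract and is allowed to proceed.
contracts = [DecisionContract(condition=lambda s: s["risk"] < 0.2, action="proceed")]
boundary = Boundary(is_acceptable=lambda a: a != "delete_production_data")
print(decide({"risk": 0.1}, contracts, boundary))   # -> proceed
```

Even in this toy form, the division of labor is visible: contracts make branching explicit, the boundary is checked before anything executes, and every decision leaves a trace.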


■ 11. Semiconductor Architecture in the DTM Era

A computational foundation for executing, controlling, and recording decisions

As we have seen, DTM has the following structure:

  • Define decisions through Decision Contracts
  • Apply constraints through Boundaries
  • Involve humans through Human Gates
  • Record history through Trace

This raises an important question:

What kind of semiconductor architecture is needed to support this structure?


■ 11.1 The Limits of Conventional Architecture

A conventional AI semiconductor architecture looks like this:

  • CPU
  • GPU
  • Memory
  • Storage
  • Network

This is suitable for:

  • Accelerating inference
  • Processing data

However, from a DTM perspective, it is insufficient.

The reason is clear:

There is no processing unit for “decision-making.”


■ 11.2 Structural Decomposition Through DTM

When DTM is mapped onto hardware, the process can be decomposed as follows:

Signal → Decision → Boundary → Execution → Trace

This can then be mapped directly onto semiconductor architecture.
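
Before looking at the individual engines, the decomposition itself can be sketched as five plain functions. Only the five-stage shape comes from DTM; the stage contents below are illustrative assumptions.

```python
# A minimal sketch of the Signal → Decision → Boundary → Execution → Trace flow.

def signal(raw_input: str) -> dict:       # Signal: interpret the situation
    return {"intent": raw_input.strip().lower()}

def decision(sig: dict) -> str:           # Decision: choose an action
    return "reply" if sig["intent"] else "wait"

def boundary(action: str) -> str:         # Boundary: allow, or stop
    return action if action in {"reply", "wait"} else "stop"

def execute(action: str) -> str:          # Execution: perform the action
    return f"executed:{action}"

def trace(result: str, log: list) -> str: # Trace: record what happened
    log.append(result)
    return result

log: list[str] = []
outcome = trace(execute(boundary(decision(signal("Hello")))), log)
print(outcome, log)    # -> executed:reply ['executed:reply']
```

Each of these functions corresponds to one of the engines described next.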


■ 11.3 A New Structure: Decision-Centric Architecture

[DTM-aware System Architecture]

1. Signal Engine
2. Decision Engine
3. Boundary Engine
4. Execution Engine
5. Trace Engine

Let us look at each component.


■ ① Signal Engine

Role:

  • LLM / ML inference
  • Interpretation of the situation

Implementation:

  • GPU / NPU
  • Existing AI accelerators

This is an extension of the LLM era.


■ ② Decision Engine

Role:

  • Evaluation of Decision Contracts
  • Conditional branching
  • Priority control

Required characteristics:

  • Fast branching
  • Low-latency rule evaluation
  • Ability to reference state

Possible implementations:

  • Enhanced CPUs, especially Arm-based architectures
  • Dedicated DSL execution units
  • Hardware-accelerated rule engines

This becomes a new core area.
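
What such a Decision Engine evaluates can be pictured as a flat, precompiled rule table: no dynamic dispatch, just a priority-ordered scan. The sketch below is a software stand-in for that idea; the rules and state fields are illustrative assumptions.

```python
# A minimal sketch of low-latency, priority-ordered rule evaluation.
# Rules are precompiled into a flat (predicate, action, priority) table,
# the kind of regular structure that maps well onto dedicated hardware.

RULES = [
    (lambda s: s["temperature"] > 90, "throttle", 0),    # highest priority
    (lambda s: s["queue_depth"] > 100, "shed_load", 1),
    (lambda s: True, "proceed", 9),                      # default rule
]

def evaluate(state: dict) -> str:
    # Priority control: scan in priority order, take the first match.
    for predicate, action, _priority in sorted(RULES, key=lambda r: r[2]):
        if predicate(state):
            return action
    return "noop"

print(evaluate({"temperature": 95, "queue_depth": 10}))  # -> throttle
```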


■ ③ Boundary Engine

Role:

  • Checking safety constraints
  • Determining whether execution is allowed
  • Stopping or escalating in abnormal situations

Required characteristics:

  • Real-time judgment
  • Fail-safe behavior
  • Priority evaluation

Possible implementations:

  • Hardware-level constraint checking
  • Safety controllers similar to automotive SoCs

This is the layer that stops what must not be done.
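
In software terms, the behavior this layer must guarantee looks like the sketch below; the constraint names and limits are illustrative assumptions. The key property is fail-safe behavior: when the check itself cannot be completed, the answer is no.

```python
# A minimal fail-safe boundary check sketch.

LIMITS = {"max_actions_per_minute": 60, "max_spend_usd": 10.0}

def check_boundary(action: dict) -> bool:
    try:
        if action["rate_per_minute"] > LIMITS["max_actions_per_minute"]:
            return False             # stop: rate constraint violated
        if action["cost_usd"] > LIMITS["max_spend_usd"]:
            return False             # stop: budget constraint violated
        return True
    except (KeyError, TypeError):
        return False                 # fail-safe: if in doubt, do not execute

allowed = check_boundary({"rate_per_minute": 5, "cost_usd": 0.02})
print("execute" if allowed else "escalate")   # -> execute
```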


■ ④ Execution Engine

Role:

  • Connection with external systems
  • Issuing control signals
  • Executing tasks

Required characteristics:

  • High-speed I/O
  • Asynchronous processing
  • Event-driven execution

Possible implementations:

  • CPU + DPU
  • SmartNIC

This is the layer that moves the system.


■ ⑤ Trace Engine

Role:

  • Recording decisions
  • Saving state
  • Generating data for reproduction and learning

Required characteristics:

  • Low-latency writing
  • Time-series consistency
  • High-frequency logging

Possible implementations:

  • High-speed log buffers using SRAM
  • Streaming writes
  • Ledger-specific storage

This is the newest element introduced by DTM.
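
The required characteristics translate into a very specific software shape: an append-only, sequence-numbered log. The sketch below is a Python stand-in; a hardware Trace Engine would provide the same guarantees with dedicated buffers. The record layout is an illustrative assumption.

```python
# A minimal append-only trace buffer sketch.

import itertools
import json
import sys
import time

_seq = itertools.count()     # monotonic sequence number: time-series consistency
_buffer: list[str] = []      # stand-in for a high-speed (e.g., SRAM) log buffer

def trace(what: str, why: str) -> None:
    # Append-only: records are added, never modified.
    record = {"seq": next(_seq), "t_ns": time.monotonic_ns(),
              "what": what, "why": why}
    _buffer.append(json.dumps(record))

def flush(sink) -> None:
    # Streaming write: drain the fast buffer to slower storage in one pass.
    for line in _buffer:
        sink.write(line + "\n")
    _buffer.clear()

trace("proceed", "risk below threshold")
flush(sys.stdout)
```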


■ 11.4 Why This Architecture Is Needed

In conventional architecture:

  • Inference is possible.
  • Execution is possible.

However:

  • It is unclear where a decision was made.
  • Constraints are added afterward.
  • History remains fragmented.

In a DTM-based architecture:

  • Decisions are made explicit.
  • Constraints are applied before execution.
  • History is recorded consistently.

This shifts the system:

from a system that merely runs
to a system that can be controlled.


■ 11.5 A New Semiconductor Concept

This structure suggests a new category beyond conventional CPUs and GPUs.

■ Decision Processing Unit

Its role would be:

  • Executing decisions
  • Applying constraints
  • Recording history

Before:

  • CPU: control
  • GPU: computation

Future:

  • Decision Unit: decision-making

The role of semiconductors expands.


■ 11.6 Overall Architecture

[DTM × Multi-Agent Semiconductor Stack]

  • Signal: GPU / NPU
  • Decision: CPU / Decision Unit
  • Boundary: Safety Controller
  • Execution: CPU / DPU
  • Trace: Ledger Memory / Storage


■ 11.7 Conclusion

In the LLM era, semiconductors accelerated computation.

In the Agentic AI era, semiconductors began to move systems.

And in the DTM era, semiconductors will evolve into a foundation that:

executes, controls, and records decisions.
