AI Quality Engineering: Why Quality Engineering Becomes Important Again in the Age of AI

When people talk about AI, most discussions tend to focus on topics such as:

  • Model accuracy

  • Data volume

  • Number of parameters

However, once AI systems begin to operate in real society, the important question starts to change.

The question becomes:

Can this AI make stable decisions in the real world?

It is not enough for an AI system to be correct on average. We must ask whether its decisions remain stable when:

  • The environment changes

  • The data distribution shifts slightly

  • Unexpected inputs appear

This question is actually closer to a problem in quality engineering than in traditional AI research.


The Taguchi Method

When I worked at Xerox, I had the opportunity to use methods from quality engineering.

One of the key approaches used there was the Taguchi Method.

Devices such as photocopiers have a very large number of parameters.

For example:

  • Temperature

  • Voltage

  • Drum rotation speed

  • Toner density

  • Paper conditions

  • Humidity

There can easily be dozens, sometimes hundreds, of such parameters.

Testing every possible combination of these parameters is not realistic.

This is where Orthogonal Arrays come in.

By using orthogonal arrays, engineers can determine, with a relatively small number of experiments:

  • Which parameters actually affect quality

  • Which parameters have little or no impact

In other words, the Taguchi Method provides a way to identify which parameters truly matter for system quality among a large number of candidates.
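To make the idea concrete, here is a minimal Python sketch. It builds the standard L8 orthogonal array (8 runs covering up to 7 two-level factors, versus 128 runs for the full factorial), checks that the array is balanced, and estimates each factor's main effect. The response function `toy_quality` and its coefficients are invented for illustration. A real study would measure the response on the actual device.

```python
from itertools import combinations, product


def l8_array():
    """Build an L8 orthogonal array: 8 runs, up to 7 two-level factors."""
    rows = []
    for a, b, c in product((0, 1), repeat=3):
        # Three basis columns plus all of their XOR combinations.
        rows.append((a, b, a ^ b, c, a ^ c, b ^ c, a ^ b ^ c))
    return rows


def is_orthogonal(rows):
    """Check that every pair of columns contains each level pair equally often."""
    n_cols = len(rows[0])
    for i, j in combinations(range(n_cols), 2):
        counts = {}
        for row in rows:
            key = (row[i], row[j])
            counts[key] = counts.get(key, 0) + 1
        if len(counts) != 4 or len(set(counts.values())) != 1:
            return False
    return True


def main_effects(rows, responses):
    """Mean response at level 1 minus mean response at level 0, per factor."""
    effects = []
    for col in range(len(rows[0])):
        high = [y for row, y in zip(rows, responses) if row[col] == 1]
        low = [y for row, y in zip(rows, responses) if row[col] == 0]
        effects.append(sum(high) / len(high) - sum(low) / len(low))
    return effects


def toy_quality(row):
    """Hypothetical response: only factors 0 and 3 influence quality."""
    return 10.0 + 3.0 * row[0] - 2.0 * row[3]


ARRAY = l8_array()
RESPONSES = [toy_quality(row) for row in ARRAY]
EFFECTS = main_effects(ARRAY, RESPONSES)

for index, effect in enumerate(EFFECTS):
    print(f"factor {index}: main effect = {effect:+.2f}")
```

In this toy setting the two factors that matter stand out after only 8 runs, while the remaining five show no effect. Real measurements are noisy, and main effects can be confounded with interactions, so results like these are a starting point rather than a final answer.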

This approach to quality engineering significantly improved the stability and performance of photocopiers, and it was one of the factors that allowed Xerox to maintain strong competitiveness in the copier market for many years.


The Concept of Signal and Noise

In the Taguchi Method, an important concept is the Signal-to-Noise ratio (S/N ratio).

Here, the Signal represents the output that the system is supposed to produce.

In the case of a photocopier, the signal is:

the quality of the printed image.

Meanwhile, Noise refers to variations such as:

  • Temperature fluctuations

  • Humidity changes

  • Material variation

  • Environmental disturbances

Quality engineering evaluates how robust the signal is against noise.
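The S/N ratio is usually expressed in decibels, and which formula applies depends on the goal. The sketch below implements three commonly cited forms: nominal-the-best, smaller-the-better, and larger-the-better. The sample measurements are made up for illustration, and the function names are my own.

```python
import math
import statistics


def sn_nominal_the_best(values):
    """10 * log10(mean^2 / variance). Higher means less spread around the mean.

    Undefined when all values are identical (zero variance).
    """
    mean = statistics.fmean(values)
    variance = statistics.variance(values)  # sample variance
    return 10.0 * math.log10(mean ** 2 / variance)


def sn_smaller_the_better(values):
    """-10 * log10(mean of y^2). Used when the ideal output is zero."""
    return -10.0 * math.log10(statistics.fmean([v ** 2 for v in values]))


def sn_larger_the_better(values):
    """-10 * log10(mean of 1/y^2). Used when a larger output is better."""
    return -10.0 * math.log10(statistics.fmean([1.0 / v ** 2 for v in values]))


# Hypothetical measurements with the same mean but different spread.
STABLE = [10.0, 10.1, 9.9, 10.0]
UNSTABLE = [10.0, 12.0, 8.0, 10.0]

SN_STABLE = sn_nominal_the_best(STABLE)
SN_UNSTABLE = sn_nominal_the_best(UNSTABLE)

print(f"stable:   {SN_STABLE:.1f} dB")
print(f"unstable: {SN_UNSTABLE:.1f} dB")
```

Both samples have the same average, yet the stable one scores a much higher S/N ratio. That difference is exactly what an average-only metric hides.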


Applying the Taguchi Perspective to AI Systems

This structure is surprisingly similar to modern AI systems.

AI systems also contain many parameters.

For example:

  • Model architecture

  • Training data

  • Prompts

  • Preprocessing

  • Inference settings

And AI systems also face many types of noise, such as:

  • Distribution shifts in data

  • Unknown inputs

  • Noisy data

  • Changes in user behavior

So what corresponds to Signal in an AI system?

It can be understood as:

the decisions produced by the AI.

For example:

  • Fraud detection AI → decision about whether a transaction is fraudulent

  • Recommendation AI → recommended items

  • Customer support AI → generated responses

  • Credit evaluation AI → approval or rejection decisions

In other words, in AI systems:

the decision itself is the Signal in the Taguchi sense.


The Problem with Current AI Evaluation

Most current AI evaluation methods focus on metrics such as:

  • Accuracy

  • F1 score

  • AUC

These metrics measure average performance.

However, from the perspective of quality engineering, this is not sufficient for evaluating system quality.

Even if average performance is high, problems can still occur when:

  • Decisions break down under certain conditions

  • Specific inputs lead to misclassification

  • Performance drops due to distribution shifts

In real systems, these exceptional situations can lead to serious problems.

For example:

  • Fraud detection systems blocking large numbers of legitimate users

  • Recommendation systems becoming extremely biased

  • Customer support AIs repeatedly generating incorrect answers

In such cases, even if the average accuracy is high, the overall quality of the system is severely damaged.

In other words, the true essence of AI quality should not be:

average accuracy

but rather:

the stability of the system’s decisions.
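A small sketch shows how an average can hide instability. It scores two hypothetical models under three conditions and reports the mean, the worst case, and the spread of accuracy. The condition names, labels, and error counts are all invented, and `stability_report` is just one possible way to summarize the results.

```python
from statistics import fmean, pstdev


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def stability_report(results_by_condition):
    """Summarize accuracy per condition: mean, worst case, and spread."""
    per_condition = {
        name: accuracy(predictions, labels)
        for name, (predictions, labels) in results_by_condition.items()
    }
    values = list(per_condition.values())
    return {
        "per_condition": per_condition,
        "mean": fmean(values),
        "worst": min(values),
        "spread": pstdev(values),
    }


def with_errors(labels, n_wrong):
    """Hypothetical predictions: flip the first n_wrong binary labels."""
    return [1 - y if i < n_wrong else y for i, y in enumerate(labels)]


LABELS = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

# Model A makes one error in every condition.
MODEL_A = stability_report({
    "normal": (with_errors(LABELS, 1), LABELS),
    "shifted": (with_errors(LABELS, 1), LABELS),
    "unexpected": (with_errors(LABELS, 1), LABELS),
})

# Model B is perfect in two conditions and breaks down in the third.
MODEL_B = stability_report({
    "normal": (with_errors(LABELS, 0), LABELS),
    "shifted": (with_errors(LABELS, 0), LABELS),
    "unexpected": (with_errors(LABELS, 3), LABELS),
})

for name, report in (("A", MODEL_A), ("B", MODEL_B)):
    print(name, f"mean={report['mean']:.2f}",
          f"worst={report['worst']:.2f}", f"spread={report['spread']:.2f}")
```

Both models have the same mean accuracy, but only Model B breaks down under one condition. Judged by the average alone, the two look identical.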


AI Quality Engineering

This leads to an important concept:

AI Quality Engineering

AI Quality Engineering is an approach that evaluates AI systems not by their average performance, but by the stability of their decision signals.

In this framework, evaluating AI means asking questions such as:

  • Which parameters affect the quality of the decision signal?

  • Under what conditions do decisions break down?

  • Which types of noise destabilize the system’s decisions?

These are exactly the kinds of questions asked in the Taguchi Method.
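One way to ask these questions in practice is a Taguchi-style crossed design: run each combination of control factors from a small orthogonal array against the same set of noise conditions, then compare S/N ratios. The sketch below is a toy version. The factor names, the noise levels, and the `toy_accuracy` function are all hypothetical stand-ins for running a real AI system.

```python
import math

# L4 orthogonal array: 4 runs for up to 3 two-level control factors.
L4 = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

# Hypothetical control factors (level 0 = off, level 1 = on).
FACTOR_NAMES = ["larger_model", "input_normalization", "long_prompt"]

# Hypothetical noise conditions: severity of distribution shift.
NOISE_SHIFTS = [0.0, 0.5, 1.0]


def toy_accuracy(levels, shift):
    """Invented stand-in for measuring the system under one noise condition."""
    base = 0.90 + 0.02 * levels[0]
    sensitivity = 0.4 - 0.3 * levels[1]  # normalization reduces sensitivity
    return base - sensitivity * shift    # levels[2] has no effect here


def sn_larger_the_better(values):
    """Larger-the-better S/N ratio in decibels."""
    return -10.0 * math.log10(sum(1.0 / v ** 2 for v in values) / len(values))


# One S/N ratio per control-factor combination, computed across all noise levels.
SN_BY_RUN = [
    sn_larger_the_better([toy_accuracy(run, shift) for shift in NOISE_SHIFTS])
    for run in L4
]


def factor_effect(col):
    """Mean S/N at level 1 minus mean S/N at level 0 for one factor."""
    high = [sn for run, sn in zip(L4, SN_BY_RUN) if run[col] == 1]
    low = [sn for run, sn in zip(L4, SN_BY_RUN) if run[col] == 0]
    return sum(high) / len(high) - sum(low) / len(low)


EFFECTS = {name: factor_effect(i) for i, name in enumerate(FACTOR_NAMES)}
BEST_RUN = L4[max(range(len(L4)), key=lambda i: SN_BY_RUN[i])]
MOST_INFLUENTIAL = max(EFFECTS, key=lambda name: abs(EFFECTS[name]))

for name, effect in EFFECTS.items():
    print(f"{name}: effect on S/N = {effect:+.2f} dB")
print("most robust run:", BEST_RUN)
```

In this toy setup the factor that most improves robustness is the one that reduces sensitivity to shift, not the one that raises baseline accuracy. One caveat: with three factors in an L4 array, each main effect is confounded with the interaction of the other two, so small effects should be read with care.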


Example: Recommendation Systems

Consider a recommendation system.

Such systems have many parameters:

  • Model architecture

  • Hyperparameters

  • Training data

  • User features

  • Prompts

  • Ranking algorithms

They also face many sources of noise:

  • Changes in user behavior

  • Newly introduced products

  • Seasonal trends

  • Limited data situations

What matters is not simply high average accuracy.

The important questions are:

  • Does the system remain stable when many new products appear?

  • Does it remain robust when user behavior changes?

  • Does it avoid extreme failures when data is sparse?

What we need to evaluate is not the average performance, but the stability of decisions.
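As a rough illustration, decision stability for a recommender can be measured by how much the top-k list changes under each noise scenario. The sketch below uses Jaccard overlap between the baseline list and each perturbed list. The item names, scores, and scenarios are invented, and Jaccard overlap is only one of several possible stability measures.

```python
def top_k(scores, k):
    """Return the k highest-scoring items (ties broken by item name)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


def jaccard(a, b):
    """Overlap between two item lists: |intersection| / |union|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def recommendation_stability(baseline_scores, scenarios, k):
    """Top-k overlap with the baseline for each noise scenario."""
    base = top_k(baseline_scores, k)
    return {
        name: jaccard(base, top_k(scores, k))
        for name, scores in scenarios.items()
    }


# Hypothetical item scores for one user.
BASELINE = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.2}

SCENARIOS = {
    # Mild seasonal drift: scores move but the ranking holds.
    "seasonal_trend": {"a": 0.85, "b": 0.82, "c": 0.6, "d": 0.3},
    # Many new products arrive with high scores.
    "new_products": {**BASELINE, "n1": 0.95, "n2": 0.85},
}

OVERLAPS = recommendation_stability(BASELINE, SCENARIOS, k=3)
WORST = min(OVERLAPS.values())
MEAN = sum(OVERLAPS.values()) / len(OVERLAPS)

for name, overlap in OVERLAPS.items():
    print(f"{name}: top-3 overlap = {overlap:.2f}")
```

A low overlap is not automatically bad, since new products should sometimes displace old ones. The point is to see which noise sources move the decisions most, and then to judge whether that movement is intended.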


Example: Fraud Detection Systems

Another clear example is fraud detection AI.

In this case, the signal is the decision:

whether a transaction is fraudulent or not.

However, this decision can be affected by noise such as:

  • New fraud patterns

  • Changes in transaction amounts

  • Changes in user behavior

  • Differences across regions or countries

Even if the model shows high average accuracy, it may still fail when:

  • New fraud strategies appear

  • Certain conditions cause false positives to increase dramatically

The key question becomes:

Under what conditions do the decisions break down?

Again, this is exactly the question posed in the Taguchi Method.
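A simple way to start answering that question is to break an error metric down by condition instead of reporting one overall number. The sketch below computes the false positive rate per segment and lists the segments that exceed a limit. The regions, counts, and the 10% limit are hypothetical.

```python
from collections import defaultdict


def false_positive_rate_by_segment(records):
    """records: iterable of (segment, predicted_fraud, actually_fraud)."""
    legit = defaultdict(int)    # legitimate transactions per segment
    flagged = defaultdict(int)  # legitimate transactions flagged as fraud
    for segment, predicted, actual in records:
        if not actual:
            legit[segment] += 1
            if predicted:
                flagged[segment] += 1
    return {segment: flagged[segment] / count for segment, count in legit.items()}


def breakdown_segments(fpr_by_segment, limit):
    """Segments whose false positive rate exceeds the limit."""
    return sorted(s for s, fpr in fpr_by_segment.items() if fpr > limit)


def legit_records(segment, total, n_flagged):
    """Hypothetical legitimate transactions, n_flagged of them wrongly flagged."""
    return ([(segment, True, False)] * n_flagged
            + [(segment, False, False)] * (total - n_flagged))


RECORDS = (
    legit_records("region_A", 100, 2)
    + legit_records("region_B", 100, 1)
    + legit_records("region_C", 20, 8)
)

FPR = false_positive_rate_by_segment(RECORDS)
OVERALL_FPR = sum(1 for _, predicted, actual in RECORDS if predicted and not actual) / len(RECORDS)
BROKEN = breakdown_segments(FPR, limit=0.10)

print(f"overall FPR = {OVERALL_FPR:.2f}")
for segment, fpr in sorted(FPR.items()):
    print(f"{segment}: FPR = {fpr:.2f}")
print("segments over the limit:", BROKEN)
```

In this example the overall false positive rate looks acceptable, while one small region blocks a large share of its legitimate users. The same breakdown can be repeated for transaction amount, time period, or any other suspected noise source.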


The Future of AI Is Moving Toward Quality Engineering

The first generation of AI was largely about model competition.

The goal was simple:

Who can build the smartest model?

But as AI systems move deeper into real-world applications, the focus is shifting toward something else:

stability.

AI is gradually moving from being purely a research topic to becoming an object of quality engineering.

The Taguchi Method I learned during my time at Xerox may therefore have surprising relevance in the age of AI.

Because just as the signal in a photocopier was the quality of the printed output, in AI systems:

the signal is the decision itself.

And the quality of AI ultimately depends on how stable those decisions remain in the presence of noise.
