Overview of Gene Expression Programming (GEP) and examples of algorithms and implementations

Overview of Gene Expression Programming (GEP)

Gene Expression Programming (GEP) is an evolutionary algorithm designed to evolve mathematical expressions and programs. It searches over the form of an expression or program to find a good solution for a given task or problem. The main features of GEP are outlined below.

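1. individual representation:

In GEP, each individual is a fixed-length linear chromosome composed of one or more genes. A gene is a string of symbols drawn from a function set (function symbols) and a terminal set (variables, constants, etc.), and it is decoded ("expressed") into an expression tree for evaluation. Each gene has a head, which may contain both functions and terminals, and a tail, which contains only terminals, so that every gene decodes to a valid expression.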
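
The head/tail layout and its decoding can be sketched as follows. This is a minimal illustration, not a reference implementation: the names `ARITY`, `tail_length`, `decode_gene`, and `evaluate` are chosen here for the example, the symbol set is assumed to match the one used later in this article, and division by zero is assumed to return 1e10 as in the listing further down. The gene is read breadth-first (Karva notation), and the tail length t = h * (n_max - 1) + 1 ensures there are always enough terminals to complete the tree.

```python
# Arity of each function symbol; anything not listed is a terminal (arity 0).
ARITY = {'+': 2, '-': 2, '*': 2, '/': 2}

def tail_length(head_length, max_arity=2):
    """Tail length t = h * (n_max - 1) + 1, enough terminals to close any head."""
    return head_length * (max_arity - 1) + 1

def decode_gene(gene):
    """Decode a breadth-first symbol list into a tree of (symbol, children) pairs."""
    nodes = [(symbol, []) for symbol in gene]
    next_free = 1  # index of the next symbol not yet attached to a parent
    i = 0
    while i < next_free:  # stop once every attached node has been processed
        symbol, children = nodes[i]
        for _ in range(ARITY.get(symbol, 0)):
            children.append(nodes[next_free])
            next_free += 1
        i += 1
    return nodes[0]  # symbols after the expressed region are simply unused

def evaluate(node, x):
    """Evaluate a decoded tree for a given value of x."""
    symbol, children = node
    if symbol == 'x':
        return x
    if symbol not in ARITY:
        return float(symbol)  # numeric constant
    a, b = (evaluate(child, x) for child in children)
    if symbol == '+':
        return a + b
    if symbol == '-':
        return a - b
    if symbol == '*':
        return a * b
    return a / b if b != 0 else 1e10  # assumed convention for division by zero

# Head of length 6 followed by a tail of length 7 (terminals only).
gene = list("+*+xx*23x11xx")
tree = decode_gene(gene)
print(evaluate(tree, 2))  # (x*x) + ((3*x) + 2) evaluated at x = 2
```

Only the first nine symbols of this gene are expressed; the remaining tail symbols are unused, which is what lets genetic operators change the head freely without producing an invalid expression.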

2. genetic operators:

During evolution in GEP, genes are modified by crossover and mutation. In crossover, parts of the genes of two parent individuals are exchanged to produce new offspring; in mutation, symbols within a gene are randomly changed.

3. fitness evaluation:

Each individual is evaluated for its fitness on a particular problem or task. The fitness function depends on the specific problem GEP is being used to solve, and evolution generally proceeds so as to minimize or maximize it.

4. evolutionary process:

In GEP, the evolutionary process is repeated over multiple generations. Individuals are selected based on fitness, and crossover and mutation produce new individuals. This gradually increases the share of high-fitness individuals, in the hope that a good (ideally optimal) solution eventually emerges.

5. areas of application:

GEP has been applied to a variety of tasks, including symbolic regression, function optimization, and program synthesis. In particular, it has been used in mathematical modeling and optimization problems, where evolved formulas and programs can sometimes outperform those written by hand.

In short, GEP evolves mathematical formulas and programs toward higher fitness over successive generations, with the aim of arriving at a mathematical model or program that fits the problem.

Algorithms for Gene Expression Programming (GEP)

The Gene Expression Programming (GEP) algorithm is a type of evolutionary algorithm in which formulas or programs are evolved to find one that suits the problem. The basic procedure is as follows.

1. initialization:

Generate an initial population of individuals. Each individual consists of one or more genes, and each gene is a randomly generated string of symbols drawn from the function set (function symbols) and the terminal set (variables, constants, etc.).
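
Random initialization can be sketched as below, assuming single-gene individuals with a head/tail layout. The names `FUNCTIONS`, `TERMINALS`, `random_gene`, and `random_population` are hypothetical and used only for this example. The head may draw from both sets, while the tail draws from terminals only.

```python
import random

FUNCTIONS = ['+', '-', '*', '/']   # all binary, so the maximum arity is 2
TERMINALS = ['x', '1', '2', '3']
MAX_ARITY = 2

def random_gene(head_length, rng=random):
    """Build one gene: a mixed head followed by a terminal-only tail."""
    tail_len = head_length * (MAX_ARITY - 1) + 1
    head = [rng.choice(FUNCTIONS + TERMINALS) for _ in range(head_length)]
    tail = [rng.choice(TERMINALS) for _ in range(tail_len)]
    return head + tail

def random_population(size, head_length, seed=None):
    """Create `size` random genes; pass a seed for reproducible runs."""
    rng = random.Random(seed)
    return [random_gene(head_length, rng) for _ in range(size)]

population = random_population(size=20, head_length=5, seed=0)
print(len(population), len(population[0]))
```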

2. fitness calculation:

The fitness of each individual is evaluated on the problem. How fitness is calculated depends on the specific problem being solved.
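
For a regression-style problem, one possible fitness calculation is sketched below. Both helper names are hypothetical, and the mapping `scale / (1 + error)` is only one convention assumed here for turning an error to be minimized into a fitness to be maximized; other problems will need other measures.

```python
def mean_squared_error(predict, samples):
    """Average squared difference between predict(x) and y over (x, y) samples."""
    return sum((predict(x) - y) ** 2 for x, y in samples) / len(samples)

def fitness_from_error(error, scale=1000.0):
    """Map an error in [0, inf) to a fitness in (0, scale]; a perfect fit gets `scale`."""
    return scale / (1.0 + error)

samples = [(x, x**2 + 3*x + 2) for x in range(-10, 11)]
exact = fitness_from_error(mean_squared_error(lambda x: x**2 + 3*x + 2, samples))
rough = fitness_from_error(mean_squared_error(lambda x: x**2, samples))
print(exact, rough)
```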

3. evolution:

Evolutionary operations are performed to generate the next generation. The main evolutionary operations include crossover and mutation.
Crossover: a partial exchange of genes between two parental individuals results in the creation of a new offspring.
Mutation: a part of a gene is randomly changed.
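
The two operators can be sketched for head/tail genes as follows. The function names are hypothetical, and the rule that tail positions only ever receive terminals is what keeps every offspring decodable. Because both parents share the same head/tail boundary, cutting them at the same point preserves that layout.

```python
import random

FUNCTIONS = ['+', '-', '*', '/']
TERMINALS = ['x', '1', '2', '3']

def one_point_recombination(parent1, parent2, rng=random):
    """Swap everything after a random cut point between two equal-length genes."""
    point = rng.randint(1, len(parent1) - 1)
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

def point_mutation(gene, head_length, rate=0.05, rng=random):
    """Return a mutated copy; head symbols may become anything, tail symbols stay terminals."""
    mutated = list(gene)
    for i in range(len(mutated)):
        if rng.random() < rate:
            if i < head_length:
                mutated[i] = rng.choice(FUNCTIONS + TERMINALS)
            else:
                mutated[i] = rng.choice(TERMINALS)
    return mutated

# Two genes with head length 3 and tail length 4.
p1 = list("+*xx123")
p2 = list("-/1x3x2")
c1, c2 = one_point_recombination(p1, p2)
m1 = point_mutation(c1, head_length=3, rate=0.5)
print(c1, c2, m1)
```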

4. selection:

Select the individuals that will make up the next generation. Individuals with higher fitness have a higher probability of being passed on to the next generation.
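
Fitness-proportional (roulette-wheel) selection with simple elitism is one scheme often described for GEP. The sketch below assumes non-negative fitness values where larger is better; `select_with_elitism` is a name made up for this example.

```python
import random

def select_with_elitism(population, fitnesses, rng=random):
    """Keep the best individual, then fill the rest by fitness-proportional sampling."""
    best_index = max(range(len(population)), key=lambda i: fitnesses[i])
    elite = population[best_index]
    remaining = len(population) - 1
    if sum(fitnesses) > 0:
        rest = rng.choices(population, weights=fitnesses, k=remaining)
    else:
        # All fitnesses are zero: fall back to uniform sampling.
        rest = rng.choices(population, k=remaining)
    return [elite] + rest

population = ['a', 'b', 'c', 'd']
fitnesses = [1.0, 5.0, 0.0, 2.0]
selected = select_with_elitism(population, fitnesses)
print(selected)
```

Sampling is done with replacement, so a strong individual can appear several times, while an individual with zero fitness is never drawn.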

5. check for termination conditions:

Check whether a termination condition is met, such as the fitness becoming sufficiently high or a set number of generations having passed. If not, return to step 2.
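
The two termination conditions mentioned here can be combined in a small helper. This is a sketch assuming the quality measure is an error to be minimized; the function name and default thresholds are arbitrary choices for the example.

```python
def should_stop(generation, best_error, max_generations=50, target_error=1e-6):
    """Stop when the generation budget is used up or the best error is small enough."""
    return generation >= max_generations or best_error <= target_error

print(should_stop(generation=10, best_error=3.2))   # still searching
print(should_stop(generation=10, best_error=0.0))   # target reached
print(should_stop(generation=50, best_error=3.2))   # budget exhausted
```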

6. extraction of final results:

The best individual found during the run is taken as the solution to the problem.

Application of Gene Expression Programming (GEP)

Gene Expression Programming (GEP) has been applied to problems in a range of fields. Some examples follow.

1. mathematical modeling:

GEP has been successfully applied to the evolutionary generation of mathematical functions and models, and is particularly promising for symbolic regression problems. It has been used for function fitting and modeling of nonlinear functions.

2. time series data analysis:

GEP has also been applied to the analysis of time-series data, for example, for modeling time-dependent data such as stock price forecasts and weather forecasts.

3. machine learning feature generation:

GEP is also used as a feature generation method. By evolving mathematical expressions that derive new features from the data, it can help improve the performance of machine learning models.

4. control system design:

GEP is also used in control system design and optimization. Evolving mathematical models of a control system can help find effective control methods for a specific control task.

5. life sciences and medicine:

GEP is also used in the analysis of gene expression data and in bioinformatics. In particular, GEP may be applied to investigate the relationship between gene expression patterns and disease.

6. power system optimization:

GEP is also used in the generation of optimal operation plans for power systems and in power forecasting. It can be useful in finding optimal control strategies and investment plans in complex power networks in an evolutionary manner.

These examples show that GEP can be applied to a variety of problems that admit symbolic representations. Its flexibility and expressive power have led to its use in a growing number of fields.

Examples of Gene Expression Programming (GEP) implementations

A full GEP implementation is fairly complex and needs various additional functions before it can be applied to real problems. Here is a simplified example implementation for a small mathematical problem, written in Python.

import random
import numpy as np

# Basic GEP settings
function_set = ['+', '-', '*', '/']
terminal_set = ['x', '1', '2', '3', '4', '5']

population_size = 100
gene_length = 10
generations = 50

# Target function (e.g., x^2 + 3x + 2)
def target_function(x):
    return x**2 + 3*x + 2

# Evaluation function: sum of squared errors over x = -10..10 (lower is better)
def fitness(individual):
    error = 0.0
    for x in range(-10, 11):
        try:
            expr_result = eval_individual(individual, x)
            target_result = target_function(x)
            error += (expr_result - target_result)**2
        except (ZeroDivisionError, IndexError):
            # Penalize individuals that cannot be evaluated, e.g. an
            # operator-only individual that leaves the stack empty
            error += 1e10
    return error

# Evaluate an individual as a postfix (stack-based) expression
def eval_individual(individual, x):
    stack = []
    for gene in individual:
        if gene in function_set:
            if len(stack) < 2:
                continue
            operand2 = stack.pop()
            operand1 = stack.pop()
            if gene == '+':
                stack.append(operand1 + operand2)
            elif gene == '-':
                stack.append(operand1 - operand2)
            elif gene == '*':
                stack.append(operand1 * operand2)
            elif gene == '/':
                # Guard against division by zero
                if operand2 != 0:
                    stack.append(operand1 / operand2)
                else:
                    stack.append(1e10)  # Substitute a suitably large number
        else:
            # Terminals (variables and constants)
            if gene == 'x':
                stack.append(x)
            else:
                stack.append(float(gene))
    return stack[0]

# Generation of initial individuals
def generate_individual():
    return [random.choice(function_set + terminal_set) for _ in range(gene_length)]

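# Evolution loop; returns the best individual seen in any generation
def evolve(population):
    best_overall, best_overall_score = None, float('inf')
    for generation in range(generations):
        # Evaluate every individual (lower score = smaller error)
        scores = [fitness(ind) for ind in population]
        best_index = int(np.argmin(scores))
        best_ind, best_score = population[best_index], scores[best_index]
        if best_score < best_overall_score:
            best_overall, best_overall_score = best_ind, best_score

        # Display results for this generation
        print(f"Generation {generation+1}, Best Score: {best_score}, Best Individual: {best_ind}")

        # Build the next generation by selection, crossover, and mutation
        new_population = []
        for _ in range(population_size):
            parent1 = tournament_selection(population, scores)
            parent2 = tournament_selection(population, scores)
            child = crossover(parent1, parent2)
            child = mutate(child)
            new_population.append(child)

        population = new_population

    return best_overall, best_overall_score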

# Tournament selection: sample a few individuals and return the one with the lowest error
def tournament_selection(population, scores, tournament_size=5):
    tournament_indices = random.sample(range(len(population)), tournament_size)
    tournament_scores = [scores[i] for i in tournament_indices]
    winner_index = tournament_indices[np.argmin(tournament_scores)]
    return population[winner_index]

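# Uniform crossover
def crossover(parent1, parent2, crossover_rate=0.9):
    # With probability (1 - crossover_rate), pass parent1 on unchanged (as a copy)
    if random.random() >= crossover_rate:
        return list(parent1)
    # Otherwise take each position from either parent with equal probability
    return [random.choice(pair) for pair in zip(parent1, parent2)]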

# Mutation (a function is replaced by a function, a terminal by a terminal)
def mutate(individual, mutation_rate=0.1):
    mutated = list(individual)  # work on a copy so the caller's list is untouched
    for i in range(len(mutated)):
        if random.random() < mutation_rate:
            if mutated[i] in function_set:
                mutated[i] = random.choice(function_set)
            else:
                mutated[i] = random.choice(terminal_set)
    return mutated

if __name__ == "__main__":
    # Generation of initial individuals
    population = [generate_individual() for _ in range(population_size)]

    # Performing Evolution
    evolve(population)

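In this example, the target function is x^2 + 3x + 2, and the program tries to approximate it by evolving expressions. Note that, for simplicity, each individual here is a flat list of symbols evaluated as a postfix (stack-based) expression. It does not use the head/tail gene structure or Karva notation of standard GEP, so it is best read as a simplified illustration of the evolutionary loop.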
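
The individuals printed by the listing above are flat symbol lists, which are hard to read. The helper below is a hypothetical addition, not part of the original listing, that renders such a list as an infix string. It follows the same stack rules as `eval_individual`: operators without two operands are skipped, and the bottom of the stack is taken as the result.

```python
FUNCTION_SET = ['+', '-', '*', '/']

def to_infix(individual):
    """Render a postfix-style symbol list as a parenthesized infix string."""
    stack = []
    for symbol in individual:
        if symbol in FUNCTION_SET:
            if len(stack) < 2:
                continue  # same rule as the evaluator: skip operators lacking operands
            right = stack.pop()
            left = stack.pop()
            stack.append(f"({left} {symbol} {right})")
        else:
            stack.append(symbol)
    # The evaluator returns stack[0], so report the same element here.
    return stack[0] if stack else ""

example = ['x', 'x', '*', '3', 'x', '*', '+', '2', '+', '1']
print(to_infix(example))  # (((x * x) + (3 * x)) + 2)
```

Here the trailing '1' is left on the stack and ignored, just as it would be when the individual is evaluated.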

Gene Expression Programming (GEP) Challenges and Measures to Address Them

Gene Expression Programming (GEP), like the other evolutionary algorithms described in “Overview of evolutionary algorithms and examples of algorithms and implementations”, has its challenges. Below we discuss some common ones and how they can be addressed.

1. representation constraints:

Challenge: GEP requires the representation of individuals to be designed appropriately, including the choice of function set and terminal set and the setting of gene length.
Solution: Choose function and terminal sets that suit the problem, and set the gene length and population size accordingly.

2. slow convergence:

Challenge: Convergence can be slow, so reaching a good solution may take a long time.
Solution: As with genetic algorithms in general, described in “Overview of genetic algorithms, application examples, and implementation examples”, performance can be improved by tuning the parameters of the genetic operators, improving initialization, and introducing evolution strategies.

3. tendency to get stuck in local optima:

Challenge: GEP, like other evolutionary algorithms, may converge to a locally optimal solution.
Solution: To maintain diversity, it is important to set appropriate mutation and crossover rates. Generating diverse initial individuals and performing multiple independent runs, keeping the best result, can also be considered.
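
The multiple-run idea can be sketched as below. `best_of_runs` is a hypothetical helper that expects a function `run_once(seed)` returning an `(error, solution)` pair. `toy_run` is only a stand-in for a real evolutionary run (it does a plain random search) so that the example is self-contained.

```python
import random

def best_of_runs(run_once, seeds):
    """Call run_once(seed) for each seed and return the pair with the lowest error."""
    results = [run_once(seed) for seed in seeds]
    return min(results, key=lambda result: result[0])

def toy_run(seed, trials=200):
    """Stand-in for one run: random search for v minimizing (v - 2)^2."""
    rng = random.Random(seed)
    candidates = [rng.uniform(-10, 10) for _ in range(trials)]
    best = min(candidates, key=lambda v: (v - 2) ** 2)
    return (best - 2) ** 2, best

error, solution = best_of_runs(toy_run, seeds=range(5))
print(error, solution)
```

Using a separate seed per run makes each run start from a different initial population, which is one simple way to reduce the chance that every run ends in the same local optimum.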

4. designing an appropriate evaluation function:

Challenge: An evaluation function must be designed to suit the specific problem GEP is being used to solve.
Solution: Design the evaluation function carefully according to the characteristics of the problem, so that the fitness value accurately reflects how well an individual performs.

5. data dependence:

Challenge: If an evolved expression depends too heavily on the data it was evolved on, it will adapt poorly to new data.
Solution: To improve generality, design representations and genetic operators that favor general patterns over data-specific detail, and check candidates on data not used during evolution.
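
One simple way to check data dependence is to hold back part of the data and compare errors, as sketched below. The function names and the 30% validation fraction are assumptions made for this example. A model whose training error is low but whose validation error is much higher is likely fitted too closely to the training samples.

```python
import random

def split_samples(samples, validation_fraction=0.3, seed=0):
    """Shuffle (x, y) samples and split them into training and validation lists."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]

def mse(model, samples):
    """Mean squared error of model(x) against y."""
    return sum((model(x) - y) ** 2 for x, y in samples) / len(samples)

samples = [(x, x**2 + 3*x + 2) for x in range(-10, 11)]
train, validation = split_samples(samples)

candidate = lambda x: x**2 + 3*x  # a hypothetical evolved expression
print(mse(candidate, train), mse(candidate, validation))
```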

Reference information and reference books

For further information, see also “Overview and reference books of metaheuristics” and “Overview and implementation of particle swarm optimization (PSO)”.

Reference books

Hands-On Genetic Algorithms with Python

Genetic Algorithms with Python

Genetic Algorithms and Genetic Programming

Gene Expression Programming

GEP (Gene Expression Programming) References

1. Gene Expression Programming: Mathematical Modeling by an Evolutionary Algorithm

  • Author: Cândida Ferreira

  • Publisher: Springer

  • Edition: 1st (2002) / 2nd Revised Edition (2006)

  • Overview: The foundational and most authoritative book on GEP by its inventor. Covers the theoretical underpinnings of GEP, algorithm structure, chromosome representation, and diverse applications in modeling, function discovery, and symbolic regression.

  • Recommended for: Anyone wanting a deep, formal understanding of GEP’s theory and practical implementation.

  • Link: Springer – GEP Book

2. Introduction to Evolutionary Computing

  • Authors: A. E. Eiben, J. E. Smith

  • Publisher: Springer, 2015 (2nd Edition)

  • Overview: Broad textbook on evolutionary computation, including chapters that explain Genetic Algorithms, Genetic Programming, and a section introducing Gene Expression Programming.

  • Recommended for: Those who want to understand GEP in the context of other evolutionary techniques.

3. Handbook of Genetic Programming Applications

  • Editors: Amir H. Gandomi, Amir H. Alavi, Conor Ryan

  • Publisher: Springer, 2015

  • Overview: A practical guide to genetic programming and related techniques, including GEP. Focuses on real-world problem-solving across engineering, economics, and bioinformatics.

  • Recommended for: Practitioners applying GEP to practical, complex problems.

4. Recent Advances in Evolutionary Computation for Combinatorial Optimization Problems

  • Editor: Carlos Cotta, et al.

  • Publisher: Springer

  • Overview: Discusses evolutionary approaches like GEP for solving combinatorial and optimization problems.

  • Recommended for: Researchers focusing on optimization problems beyond standard function approximation.

Online Resources

  • Cândida Ferreira’s Official GEP Website
    http://www.gene-expression-programming.com/
    Includes tutorials, publications, and software tools related to GEP.

  • Google Scholar
    ➔ Search "Gene Expression Programming" AND "applications" to find academic papers applying GEP in different fields like symbolic regression, data mining, bioinformatics, etc.

  • GitHub Repositories
    ➔ Many open-source projects for GEP exist in Python, Java, and C#. Example search: "Gene Expression Programming site:github.com"
