Considerations for a program for solving algebraic sentences


Introduction

With ChatGPT, which uses the GPT model described in “Overview of GPT and examples of algorithms and implementations”, you can have a variety of algebraic problems solved, as shown below. Because ChatGPT does not just give an answer but also shows a worked solution that applies various formulas, it gives the illusion of being a universal AI.

However, when we actually carry out the calculations, we find that the answers given by ChatGPT are sometimes incorrect. This is because ChatGPT only predicts which token is likely to come next based on a vast amount of training data; it does not actually perform the calculation.

Solving Algebra Problems with ChatGPT (1)

<Problem> Calculate 2023×1225

Calculate 2023 × 1225.
2023 × 1225 = 2,480,375
Therefore, 2023 × 1225 = 2,480,375.

The actual answer is 2,478,175, so ChatGPT's result is incorrect.
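The product can be checked with exact integer arithmetic. A minimal check in Python, whose integers have arbitrary precision (the variable names are chosen here for illustration):

```python
# Exact integer multiplication; Python integers do not overflow or round
product = 2023 * 1225
chatgpt_answer = 2_480_375  # value copied from the ChatGPT output above

print(product)                    # 2478175
print(product == chatgpt_answer)  # False
```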

Solving Algebra Problems with ChatGPT (2) Factorization

<Problem> Factorize \(3x^2-17x-6\)

Here is how to factor 3x^2 - 17x - 6

First, to factor this quadratic equation, we attempt to factorize it in the following form

3x^2 - 17x - 6 = (ax + b)(cx + d)

To factor it in this way, we need to find the values of a, b, c, and d. 

To expand this to match the original equation, we have

(3x + e)(x + f) = 3x^2 + 3xf + ex + ef

where the conditions 3xf + ex = -17x and ef = -6 must be satisfied.

Next, we find the pair of factors of ef = -6. The factors of -6 are as follows

-6 = -1 * 6
-6 = 1 * -6
-6 = -2 * 3
-6 = 2 * -3

Next, consider the condition 3xf + ex = -17x.

3xf + ex = x(3f + e)

To compare this to -17x, the coefficient of x must be 1.
Thus, 3f + e = -17.
Find a combination that meets these criteria.

ef = -6 case:

-1 * 6 = -6
3f + e = -17

Solving for this, e = -1 and f = -6.

Thus, factoring 3x^2 - 17x - 6 yields
3x^2 - 17x - 6 = (3x - 1)(x - 6)

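The solution is presented with logical-looking steps, but the final answer is wrong: (3x-1)(x-6) expands to 3x^2-19x+6, not 3x^2-17x-6. The pair that satisfies both conditions is e = 1 and f = -6, which gives the correct factorization (3x+1)(x-6).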
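A factorization like this can be checked mechanically by expanding the proposed factors and comparing the result with the original polynomial. A minimal sketch using SymPy (the variable names are illustrative):

```python
from sympy import symbols, factor, expand

x = symbols("x")
polynomial = 3*x**2 - 17*x - 6

# Expand the factorization proposed by ChatGPT and compare with the original
chatgpt_expanded = expand((3*x - 1)*(x - 6))
print(chatgpt_expanded)                # 3*x**2 - 19*x + 6
print(chatgpt_expanded == polynomial)  # False

# Let SymPy factor the polynomial, then confirm by expanding again
factored = factor(polynomial)
print(expand(factored) == polynomial)  # True
```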

Solving Algebra Problems with ChatGPT (3) Simultaneous equations in three variables

<Problem> Solve the following simultaneous equations

\begin{eqnarray}a+b+c&=&0\\2a+2b+c&=&3\\2a+3b+2c&=&1\end{eqnarray}

To solve these three simultaneous equations, we will use a matrix-based method. The given equations are as follows

a + b + c = 0
2a + 2b + c = 3
2a + 3b + 2c = 1

This can be expressed in matrix form as follows

[ 1 1 1 ] [ a ]   [ 0 ]
[ 2 2 1 ] [ b ] = [ 3 ]
[ 2 3 2 ] [ c ]   [ 1 ]

Let this matrix be the coefficient matrix A and define the vector b as

A =
[ 1 1 1 ]
[ 2 2 1 ]
[ 2 3 2 ]

b =
[ 0 ]
[ 3 ]
[ 1 ]

We then define the unknown vector x as

x =
[ a ]
[ b ]
[ c ]

The equation then becomes Ax = b. To solve this system of equations, we need to find x = A^(-1) * b. Note that A^(-1) is the inverse of matrix A.

Here, the inverse matrix is computed.

A^(-1) = (1/(ad - bc)) * adj(A)

where adj(A) is the cofactor matrix of matrix A. Calculating the cofactor matrix yields

adj(A) =
[  4 -3  1 ]
[ -1  2 -1 ]
[ -2  3 -1 ]

Then calculate det(A) = ad - bc.

det(A) = (1 * 2 * 2) - (1 * 2 * 2) = 4 - 4 = 0

Since det(A) is 0, there is no inverse of A. What this means is that this system of equations may have no solution or countless solutions.
In fact, solving the equation yields

a + b + c = 0

2a + 2b + c = 3

2a + 3b + 2c = 1

From these equations we obtain a + b = -c, 2a + 2b = 3 - c, and 2a + 3b = 1 - 2c.

From equations 1) and 2),

2a + 2b - (a + b) = 3 - 0
a + b = 3

From equations 1) and 3),

2a + 3b - (a + b) = 1 - 0
a + 2b = 1

Solving these equations, we get,

b = 3 - a
a + 2(3 - a) = 1
a + 6 - 2a = 1
6 - a = 1
a = 6 - 1
a = 5

Then, find b.

b = 3 - a
b = 3 - 5
b = -2

Finally, find c.

c = -a - b
c = -5 - (-2)
c = -5 + 2
c = -3

Thus, the solution to the simultaneous equations is a = 5, b = -2, and c = -3.

ChatGPT attempts to solve the system using the cofactor matrix, but its determinant calculation is incorrect (the determinant of A is actually 1, not 0), and the final solution is also incorrect as a result. The correct solution is a = 2, b = 1, c = -3.
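The system can be checked with exact arithmetic. A minimal sketch using SymPy's Matrix class (variable names chosen for illustration) computes the determinant, solves Ax = b through the inverse, and substitutes ChatGPT's answer back into the equations:

```python
from sympy import Matrix

# Coefficient matrix and right-hand side of the system above
A = Matrix([[1, 1, 1],
            [2, 2, 1],
            [2, 3, 2]])
rhs = Matrix([0, 3, 1])

print(A.det())             # 1, so A is invertible

solution = A.inv() * rhs   # x = A^(-1) b in exact rational arithmetic
print(solution.T)          # Matrix([[2, 1, -3]])

# Substitute ChatGPT's answer (a = 5, b = -2, c = -3) back into the system
chatgpt = Matrix([5, -2, -3])
print(A * chatgpt == rhs)  # False
```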

STUDENT solving algebraic sentences

Tasks of this kind have been studied for a long time. For example, STUDENT, described in “Practical Common LISP Reading Notes”, is an early language understanding program written in 1964 that was designed to read and solve word problems like those in a junior high school algebra textbook, such as the following.

“If the number of customers Tom gets is twice the square of 20% of the number of advertisements he runs, and the number of advertisements is 45, then what is the number of customers Tom gets?”

The answer is that the number of customers is 162. Incidentally, when the same input is given to ChatGPT, it returns the following answer (correct this time).

Let's break down the problem step by step:

Given information:

The number of advertisements Tom runs is 45.

We need to find:

The number of customers Tom gets.

Let's denote the number of customers as "C" and the number of advertisements as "A."

From the problem statement, we have the following information:

The number of customers Tom gets is twice the square of 20% of the number of advertisements he runs.
Number of advertisements, A = 45.

First, let's calculate 20% of the number of advertisements: 20% of A = 0.20 * 45 = 9

Next, let's find the square of 20% of the number of advertisements: Square of 20% of A = 9^2 = 81

Now, according to the problem statement, the number of customers Tom gets is twice the square of 20% of the number of advertisements he runs:
C = 2 * (Square of 20% of A)
C = 2 * 81
C = 162

So, the number of customers Tom gets is 162.
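The word problem reduces to two equations: customers = 2 * (20% of advertisements)^2 and advertisements = 45. The sketch below shows how those equations might be handed to a symbolic solver. It uses SymPy and is not STUDENT's own code; the symbol names are chosen here for illustration.

```python
from sympy import symbols, Eq, Rational, solve

customers, advertisements = symbols("customers advertisements")

equations = [
    # "the number of customers is twice the square of 20% of the number of advertisements"
    Eq(customers, 2 * (Rational(20, 100) * advertisements) ** 2),
    # "the number of advertisements is 45"
    Eq(advertisements, 45),
]

# Solve both equations together; each solution is a dict {symbol: value}
solutions = solve(equations, [customers, advertisements], dict=True)
print(solutions[0][customers])  # 162
```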

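The code for STUDENT is written in LISP and can be obtained via git. A simplified sketch of the idea is shown below; it relies on helper functions (for regular-expression matching, string splitting, and equation solving) that are assumed rather than defined here.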

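;; Conceptual sketch only (not STUDENT's original source). The helpers
;; ALL-MATCHES, SPLIT-STRING, INFIX->PREFIX and SOLVE are assumed to be
;; provided elsewhere; they are not part of standard Common Lisp.

(defun solve-algebraic-word-problem (problem)
  "Entry point: extract variables and equations from PROBLEM, then solve."
  (let* ((variables (find-variables problem))
         (equations (find-equations problem variables)))
    (solve-equations equations variables)))

(defun find-variables (problem)
  "Return the distinct variable names (runs of letters) in PROBLEM."
  (remove-duplicates (all-matches "[a-zA-Z]+" problem) :test #'string=))

(defun find-equations (problem variables)
  "Return each `lhs = rhs' substring of PROBLEM, one equation per line."
  (declare (ignore variables))
  (all-matches "[^=\\n]+=[^=\\n]+" problem))

(defun solve-equations (equations variables)
  "Parse every equation string and solve them as one simultaneous system."
  (let ((parsed '()))
    (dolist (equation equations)
      (let* ((parts (split-string equation "="))
             (left-side (parse-expression (first parts) variables))
             (right-side (parse-expression (second parts) variables)))
        (push (list '= left-side right-side) parsed)))
    ;; SOLVE is assumed to return a list of (variable value) pairs
    (solve (nreverse parsed) variables)))

(defun parse-expression (expression variables)
  "Convert the infix string EXPRESSION to a prefix form, then replace each
variable name by the symbol of the same name."
  (let ((form (infix->prefix expression)))
    (dolist (var variables form)
      (setf form (subst (intern (string-upcase var)) var form :test #'equal)))))

;; Enter the problem: two equations separated by a newline
(defparameter *problem* (format nil "3x + 2y = 10~%x - y = 2"))

;; Solve the problem
(defparameter *solution* (solve-algebraic-word-problem *problem*))

;; Display the results
(dolist (binding *solution*)
  (format t "~a = ~a~%" (first binding) (second binding)))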

In this LISP program, the solve-algebraic-word-problem function is the entry point for solving algebraic sentence problems. The find-variables and find-equations functions extract the variables and equations by pattern matching, the parse-expression function converts each side of an equation into an internal expression in which variable names are replaced by symbols, and the solve-equations function solves the equations and returns the final solution.

About a Python Program for Solving Algebra Sentence Problems

A similar approach can be implemented in a modern programming language such as Python. A program that solves algebraic sentence problems needs to combine natural language processing (NLP) with an algebraic solving algorithm, which can be done with the following steps.

  1. Obtain the algebra sentence problem in text format as input to the program.
  2. Use an NLP model to extract important information from the text. For example, extract variables and conditions from the problem text.
  3. Construct an algebra equation based on the extracted information. For example, from a condition such as “x times 2 is 10,” generate an expression such as x * 2 = 10.
  4. Apply a mathematical algorithm to solve the constructed algebraic expression. For example, use linear equation solving to obtain the answer.
  5. Output the obtained answer.

A concrete implementation example is as follows.

import re
from sympy import Eq, solve
from sympy.parsing.sympy_parser import (
    parse_expr, standard_transformations, implicit_multiplication_application)

# Allow implicit multiplication so that "3x" is read as 3*x
TRANSFORMATIONS = standard_transformations + (implicit_multiplication_application,)

def solve_algebraic_word_problem(problem):
    # Extract equations of the form "lhs = rhs", one per line, using a regular expression
    equation_texts = re.findall(r'[^=\n]+=[^=\n]+', problem)

    # Convert each equation string into a SymPy equation
    equations = []
    for text in equation_texts:
        # Split the left and right sides of the equation
        left_side, right_side = (side.strip() for side in text.split('='))
        equations.append(Eq(parse_expr(left_side, transformations=TRANSFORMATIONS),
                            parse_expr(right_side, transformations=TRANSFORMATIONS)))

    # Collect the unknowns that appear in the equations
    unknowns = sorted(set().union(*(eq.free_symbols for eq in equations)), key=str)

    # Solve all equations simultaneously; each solution is a dict {symbol: value}
    return solve(equations, unknowns, dict=True)

# Entering Algebra Sentence Problems
problem = "3x + 2y = 10\nx - y = 2"

# Solving Algebra Sentence Problems
solutions = solve_algebraic_word_problem(problem)

# Display Results
for solution in solutions:
    for variable, value in solution.items():
        print(f'{variable} = {value}')

The program uses a regular expression to extract the equations from the text and the SymPy library to parse and solve them. When the program is run, each variable and its value are displayed; for the example above this gives x = 14/5 and y = 4/5.

What these LISP and Python programs do is “convert English sentences into equations” using string pattern matching and then solve the generated equations. Unlike ChatGPT, this approach cannot handle problems phrased in arbitrary ways; it only accepts input that fits a fixed set of patterns. However, once the input is accepted, the answer is almost always correct.
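The idea of converting English phrases into equations by pattern matching can be illustrated with a few rewrite rules. The sketch below is a hypothetical toy, not STUDENT's actual rule set: the RULES table and the translate function are invented here for illustration and cover only the phrases in the advertisement problem.

```python
import re

# Ordered rewrite rules (hypothetical): a regex and a builder for the matched parts
RULES = [
    (r"^(.+) is (.+)$",        lambda a, b: f"{a} = {b}"),
    (r"^twice (.+)$",          lambda a: f"2 * ({a})"),
    (r"^the square of (.+)$",  lambda a: f"({a}) ** 2"),
    (r"^(\d+)% of (.+)$",      lambda p, a: f"({p} / 100) * ({a})"),
    (r"^the number of (\w+)$", lambda name: name),
]

def translate(phrase):
    """Recursively rewrite an English phrase into an arithmetic expression."""
    phrase = phrase.strip()
    for pattern, build in RULES:
        match = re.match(pattern, phrase)
        if match:
            # Translate each captured sub-phrase, then combine the results
            return build(*(translate(group) for group in match.groups()))
    return phrase  # numbers and unrecognized words pass through unchanged

sentence = ("the number of customers is twice the square of "
            "20% of the number of advertisements")
equation = translate(sentence)
print(equation)  # customers = 2 * (((20 / 100) * (advertisements)) ** 2)
```

A phrase that matches none of the rules is returned unchanged, which mirrors the limitation described above: input outside the fixed patterns is simply not understood.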

Thus, ChatGPT and STUDENT each have strengths and weaknesses, and neither alone is adequate for the task of “solving math problems.” A solution is to combine the flexibility of ChatGPT with the certainty of STUDENT. This can also be seen as one of the challenges in creating general AI.

Approaches to Problem Solving

So, how can we combine the flexibility of chatGPT with the certainty of STUDENT? First, let’s analyze the process flow of each of them as follows.

<STUDENT process flow>

  1. Set the context of the problem: First, provide the context of the problem to STUDENT. For example, understand the assumptions of the problem statement, the type of problem, the role of variables, etc., and use them as data.
  2. Enter the problem statement: Enter an algebraic sentence problem into STUDENT. The problem statement is expressed in natural language and contains the conditions and requirements of the problem.
  3. Read the problem statement: Understand the given algebra text problem and extract mathematical information from the problem statement. Identify key words and phrases in the text and identify mathematical elements such as variables, operators, equals signs, and numbers.
  4. Variable and equation extraction: From the extracted mathematical elements, identify the variables and equations relevant to the problem. Variables are the unknowns in the problem statement, and equations express the mathematical relationships among them. Equations are constructed based on the conditions of the problem and the way the problem statement is expressed.
  5. Solving the equations: Algebraic methods are applied to solve the extracted equations. This includes algebraic operations and calculations to normalize the equation and to find the unknowns. Algebraic methods include matrix operations for solving simultaneous equations and transformations of algebraic identities.
  6. Solution Evaluation: After obtaining an analytical solution, the solution is evaluated based on the context of the problem. Check to see if the solution satisfies the conditions and requirements of the given problem. The solution may also be interpreted in natural language to provide an interpretation to aid in explanation and understanding.

<Process flow of ChatGPT>

  1. Entering a problem statement: An algebraic text problem is entered into ChatGPT. The problem statement is expressed in natural language and contains the conditions and requirements of the problem.
  2. Read the problem text: The given text is converted into a feature vector from which an output vector is generated.
  3. Evaluate the solution: check the output results and give feedback if it is inappropriate.

One direction for fusing the two is as follows.

<Process flow of the fusion module of STUDENT and ChatGPT>

  1. Input Problem Sentence: Input an algebraic sentence problem. The problem statement is expressed in natural language and contains the conditions and requirements of the problem.
  2. Set the context of the problem: Extract the context of the problem from the input sentences (applying deep learning techniques). For example, the assumptions of the problem statement, the type of problem, the role of variables, etc. are extracted and used as data.
  3. Read the problem text: Extract mathematical information from a given algebra text problem text (applying deep learning techniques). Identify key words and phrases in the sentence and identify mathematical elements such as variables, operators, equals signs, numbers, etc.
  4. Variable and equation extraction: From the extracted mathematical elements, identify the variables and equations relevant to the problem (applying deep learning techniques). Variables are the unknowns in the problem statement, and equations express the mathematical relationships among them. Equations are constructed based on the conditions of the problem and the way the problem statement is expressed.
  5. Estimate the method for solving the equation: Estimate the appropriate method from the structured problem (application of deep learning techniques).
  6. Equation solving: Algebraic methods are applied to solve the extracted equations. This includes algebraic operations and computations to normalize the equations and find the unknowns. Algebraic methods include matrix operations for solving simultaneous equations and transformations of algebraic identities.
  7. Solution Evaluation: After obtaining an analytical solution, the solution is evaluated based on the context of the problem. Check to see if the solution satisfies the conditions and requirements of the given problem. It may also interpret the solution in natural language and provide interpretations to aid in explanation and understanding.

Rather than doing everything in a single model as in the ChatGPT flow, the process is divided into functional modules that follow the flow above, and machine learning modules such as deep learning are built for steps 2, 3, 4, and 5. The equation solving part (6) and the solution evaluation part (7) are built on logic. This is expected to yield a solution that combines the flexibility of ChatGPT with the certainty of STUDENT.
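The division of labor can be sketched as follows. This is only an illustration under stated assumptions: extract_equations is a hypothetical stub standing in for the learned modules (steps 2 to 5) and returns a fixed extraction for the three-equation example, while the solving and evaluation steps (6 and 7) use SymPy as one possible logic-based back end. All function names are invented here.

```python
from sympy import Eq, solve, sympify

def extract_equations(problem_text):
    """Stand-in for steps 2 to 5 (hypothetical): a real system would call a
    trained model here. This stub ignores its input and returns a fixed,
    hand-written extraction for the three-equation example."""
    return ["a + b + c = 0", "2*a + 2*b + c = 3", "2*a + 3*b + 2*c = 1"]

def solve_equations(equation_texts):
    """Step 6: exact symbolic solving with no learned component."""
    equations = []
    for text in equation_texts:
        left, right = (side.strip() for side in text.split("="))
        equations.append(Eq(sympify(left), sympify(right)))
    unknowns = sorted(set().union(*(eq.free_symbols for eq in equations)), key=str)
    return equations, solve(equations, unknowns, dict=True)

def evaluate_solution(equations, candidate):
    """Step 7: accept a candidate only if it satisfies every equation exactly."""
    return all((eq.lhs - eq.rhs).subs(candidate) == 0 for eq in equations)

equations, solutions = solve_equations(extract_equations("(problem text)"))
a, b, c = sorted(solutions[0], key=str)

print(solutions[0][a], solutions[0][b], solutions[0][c])    # 2 1 -3
print(evaluate_solution(equations, solutions[0]))           # True
print(evaluate_solution(equations, {a: 5, b: -2, c: -3}))   # False
```

The same evaluate_solution check could be applied to an answer produced by a language model, which is how the incorrect a = 5, b = -2, c = -3 is rejected in the last line.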

A recent development is WizardMath, a large language model “specialized for mathematics” developed by Microsoft and others, which enhances the Llama model. It takes “Llama-2”, an open-source pre-trained large language model (LLM), and fine-tunes it with a new method called “Reinforcement Learning from Evol-Instruct Feedback” (RLEIF) to improve its mathematical reasoning ability. Although it is not clear whether this resolves the problems described above, it shows that improvements are also being made on the LLM side.

Algorithms used in each module and specific implementations will be described separately.
