Approach to automatic generation of prompts for generative AI

Generative AI and prompts

Generative AI refers to artificial intelligence technologies that generate new content such as text, images, audio and video. Typical examples include:

– Text generation: models that understand natural language and generate dialogue and text, such as ChatGPT.
– Image generation: models that generate images based on given prompts, such as DALL-E.
– Speech generation: text-to-speech (TTS) models that synthesise voices and music generation AI.
– Video generation: AI models that generate video scenes based on prompts.

Generative AIs (e.g. image generation AIs and text generation AIs) generate new content based on given instructions (prompts), so the quality and appropriateness of the prompts is key to maximising AI performance.

Examples of prompts can be as simple as ‘A cat sitting on a chair’ for image generation AI, or as detailed as ‘A fluffy white cat with blue eyes sitting on a wooden chair, next to a window with sunlight streaming in’.

In the aforementioned examples, simple prompts allow for a wider range of interpretation of what the AI generates, making the results less predictable, whereas detailed prompts allow for more specific, intentional results.

Designing effective prompts for generative AI requires attention to the following points:

  • Clarity and specificity: the AI interprets a prompt literally, so clear and specific instructions are important. If the prompt contains ambiguous words or expressions, the generated content is likely to miss the intent.

– Good example: ‘A futuristic city with flying cars and tall skyscrapers under a purple sky’.
– Bad example: ‘A city’.

  • Setting context: it is also important to include contextual information in the prompt. Clearly indicating elements such as a particular style, era or sentiment makes it easier for the AI to generate results in line with your intentions.

Example: ‘A Victorian-era library with antique furniture, lit by candlelight.’
→ The period or atmosphere is clear and the AI will generate results based on these elements.

  • Adding constraints or conditions: if the output needs to meet certain conditions, add them to the prompt. For example, clearly describe constraints such as colour, composition, placement and emotional tone.

– Example: ‘Generate an image of a red sports car, parked under a tree with autumn leaves, during sunset’.

  • Splitting up the generation process: setting up prompts that divide a complex generation process into phases can produce highly accurate results. The method would be to generate the basic parts first and then fill in the details with additional prompts.

– Step 1: ‘Generate an image of a medieval castle on a hill’
– Step 2: ‘Add a dragon flying in the sky above the castle’
– Step 3: ‘Make the sky stormy with dark clouds and lightning’

Prompt engineering has been proposed to generate these prompts, as described in ‘Overview of prompt engineering and its use’.

Prompt engineering is a set of techniques for optimising the prompts given to generative AI. One example is iterative prompting, in which a prompt is adjusted and fine-tuned over repeated rounds to get closer to the ideal result: if the image generated by an initial prompt differs from the intention, that feedback is used to revise the prompt. Another is combining multiple elements appropriately in a single prompt, for example balancing different conditions such as ‘colour’, ‘composition’, ‘style’ and ‘emotion’ to convey intent to the AI.

Challenges with prompts include the following:

  • Overly detailed prompts: prompts that specify too much can over-constrain the AI and impair its creative freedom.
  • Ambiguous instructions: prompts that are too vague can be interpreted in many ways, making the outcome less predictable.

More sophisticated methods that generate these prompts automatically are evolving; they can be used to optimise generated results and make more effective use of generative AI.

On the approach to automatic generation of prompts for generative AI

Because the prompt itself has a significant impact on the AI’s output, generating prompts dynamically makes it possible to obtain good results across different contexts and applications. The main approaches are described below.

1. automatic template-based generation: template-based prompt generation is an approach where placeholders are set and dynamically replaced depending on the context. Flexible prompts can be generated by modifying parts of the template according to specific parameters.

A specific approach would be to have a fixed template for the prompt and dynamically replace placeholders based on input content and conditions. This approach is useful when a specific domain or pattern is fixed.

function generatePrompt(object, style) {
    return `Generate an image of a ${object} in a ${style} style.`;
}

const object = "mountain";
const style = "watercolor";
const prompt = generatePrompt(object, style);
console.log(prompt);
// output: "Generate an image of a mountain in a watercolor style."

Advantages of this approach include simplicity, ease of management and the ability to generate prompts suitable for specific formats.

2. rule-based prompt generation: the rule-based approach incorporates conditions and rules into the prompt generation logic, allowing for different scenarios and conditions. This allows for more sophisticated automatic generation of prompts.

A specific approach could be to generate different prompts based on parameters and dynamically change them according to the conditions entered (e.g. user preferences, specific situations).

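// Choose "a" or "an" from the first letter of the next word (a simple heuristic).
function article(word) {
    return /^[aeiou]/i.test(word) ? "an" : "a";
}

function generatePrompt(object, style, mood) {
    let prompt = `Generate an image of ${article(object)} ${object} in ${article(style)} ${style} style.`;
    // Append a mood-specific instruction when one of the known moods is given.
    if (mood === 'dark') {
        prompt += " Make it look dark and eerie.";
    } else if (mood === 'bright') {
        prompt += " Use bright and vibrant colors.";
    }
    return prompt;
}

const object = "forest";
const style = "oil painting";
const mood = "dark";
const prompt = generatePrompt(object, style, mood);
console.log(prompt);
// output: "Generate an image of a forest in an oil painting style. Make it look dark and eerie."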

Advantages of this approach include the ability to generate detailed context-specific prompts and its suitability for generating complex prompts based on conditions.

3. learning-based prompt generation: an approach that uses machine learning to automatically generate new prompts from a dataset of previous prompts. To generate prompts suitable for generative AI, the model needs to learn the relationship between previous prompts and the outputs they produced.

A specific approach would be to prepare a large amount of prompt and output data, train an AI model on it, and use the learned model to automatically generate the best prompt for the context.

A specific example would be where a text-generating AI automatically learns prompts, deriving the most appropriate prompts based on past interactions and generated text.

Advantages of this approach include the ability to generate optimised prompts based on past data and the flexibility to respond to diverse contexts.
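As a rough illustration of the idea (not a trained model), the sketch below ‘learns’ from a small log of past prompts and quality scores by averaging the score of each prompt template, then picks the best-scoring template for a new request. The log contents, the score values and the function names are all hypothetical; a real system would train a model on far more data.

```javascript
// Hypothetical log of past prompts: the template used and how the output was rated (0 to 1).
const history = [
    { template: "Generate an image of a {object}.", score: 0.4 },
    { template: "Generate a detailed image of a {object} with natural lighting.", score: 0.8 },
    { template: "Generate an image of a {object}.", score: 0.5 },
    { template: "Generate a detailed image of a {object} with natural lighting.", score: 0.9 },
];

// "Learn" the average score of each template from the log.
function learnTemplateScores(records) {
    const totals = new Map();
    for (const { template, score } of records) {
        const entry = totals.get(template) || { sum: 0, count: 0 };
        entry.sum += score;
        entry.count += 1;
        totals.set(template, entry);
    }
    const averages = new Map();
    for (const [template, { sum, count }] of totals) {
        averages.set(template, sum / count);
    }
    return averages;
}

// Pick the template with the highest learned average and fill in the object.
function generateLearnedPrompt(averages, object) {
    let best = null;
    let bestScore = -Infinity;
    for (const [template, average] of averages) {
        if (average > bestScore) {
            best = template;
            bestScore = average;
        }
    }
    return best.replace("{object}", object);
}

const averages = learnTemplateScores(history);
console.log(generateLearnedPrompt(averages, "lighthouse"));
// output: "Generate a detailed image of a lighthouse with natural lighting."
```

Averaging scores per template is the simplest possible stand-in for learning; it only shows where past prompt and output data enter the process.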

4. continuous prompt generation (multi-stage generation): an approach that does not stop at a single prompt but uses the generated output as input for the next prompt. Because the AI’s output influences the next prompt, the prompt can evolve continuously.

A specific approach could be to have the AI generate output from the initial prompt, analyse that output, and update the next prompt to reflect the analysis, including feedback on the generated result.

An example would be where an image is generated from the initial prompt ‘Generate an image of a city skyline at sunset’, and the next prompt evolves into ‘Enhance the lighting effects in the skyline’.

Advantages of this approach include the ability to continuously improve based on what the AI generates and to utilise feedback loops to improve output.
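The loop described above can be sketched as follows. The generator and the analysis step are stand-ins: `fakeGenerate` and `analyseOutput` are hypothetical functions that only mimic a model call and an output check, and the `brightness` value is invented for the illustration.

```javascript
// Stand-in for a generative AI call; a real system would call an image or text model here.
function fakeGenerate(prompt) {
    return { prompt, brightness: prompt.includes("lighting") ? 0.8 : 0.3 };
}

// Analyse the output and return a follow-up instruction, or null when no change is needed.
function analyseOutput(output) {
    if (output.brightness < 0.5) {
        return "Enhance the lighting effects in the skyline.";
    }
    return null;
}

// Generate, analyse, and fold the feedback into the next prompt until nothing is left to fix.
function refinePrompt(initialPrompt, maxRounds) {
    let prompt = initialPrompt;
    const promptHistory = [prompt];
    for (let round = 0; round < maxRounds; round++) {
        const output = fakeGenerate(prompt);
        const feedback = analyseOutput(output);
        if (feedback === null) {
            break;
        }
        prompt = `${prompt} ${feedback}`;
        promptHistory.push(prompt);
    }
    return promptHistory;
}

const steps = refinePrompt("Generate an image of a city skyline at sunset.", 3);
console.log(steps);
```

Keeping the full prompt history makes it easy to see how the prompt evolved and to stop after a fixed number of rounds.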

5. prompt generation using natural language processing: this approach uses natural language processing (NLP) technology to understand the user’s intentions from free-form input and to generate effective prompts based on them.

A specific approach could be one that analyses the free text input from the user, extracts important elements and generates optimal prompts based on the extracted elements.

A concrete example would be if the user inputs ‘Draw a bright, futuristic cityscape’, which would be analysed and converted into the prompt ‘Generate a futuristic cityscape with bright, vibrant colours’.

Advantages of this approach include the ability to dynamically generate prompts from natural language input and the ability to generate prompts based on an understanding of the user’s intentions.
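A very small sketch of this pipeline is shown below, assuming a simple keyword lookup in place of real NLP. The vocabularies (`SUBJECTS`, `STYLES`, `MODIFIERS`) and all function names are hypothetical; a production system would more likely use an NLP library or a language model for the extraction step.

```javascript
// Tiny hand-made vocabularies; these are illustrative assumptions, not a real lexicon.
const SUBJECTS = ["cityscape", "forest", "castle", "mountain"];
const STYLES = ["futuristic", "medieval", "watercolor"];
const MODIFIERS = new Map([
    ["bright", "bright, vibrant colours"],
    ["dark", "dark, moody tones"],
]);

// Split free text into lowercase word tokens.
function tokenize(text) {
    return text.toLowerCase().match(/[a-z]+/g) || [];
}

// Extract the subject, style and colour mood mentioned in the user's input.
function extractElements(text) {
    const tokens = tokenize(text);
    return {
        subject: tokens.find((t) => SUBJECTS.includes(t)) || null,
        style: tokens.find((t) => STYLES.includes(t)) || null,
        mood: tokens.find((t) => MODIFIERS.has(t)) || null,
    };
}

// Build a prompt from whichever elements were found.
function buildPrompt({ subject, style, mood }) {
    if (subject === null) {
        return null; // nothing recognisable to generate
    }
    let prompt = `Generate a ${style ? style + " " : ""}${subject}`;
    if (mood !== null) {
        prompt += ` with ${MODIFIERS.get(mood)}`;
    }
    return prompt + ".";
}

const userInput = "Draw a bright, futuristic cityscape";
console.log(buildPrompt(extractElements(userInput)));
// output: "Generate a futuristic cityscape with bright, vibrant colours."
```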

By combining these approaches, or selecting one or more of them depending on the purpose, automatic prompt generation for generative AI becomes more effective, improving both the quality of generated results and the efficiency of prompt management.

Automatic generation of prompts using multi-agent systems

Automatic generation of prompts using multi-agent systems is an advanced approach in which multiple agents work together to dynamically generate and optimise prompts. Such a system can provide more diverse and sophisticated prompts for generative AI, increasing the accuracy and creativity of the generated output.

1. what is a multi-agent system (MAS): a multi-agent system is one in which multiple independent agents work together to solve a problem or perform a task, each agent having its own knowledge and role and communicating with other agents to find the best solution. In prompt generation for generative AI, each agent has a different perspective and function, and the agents cooperate to create the best prompt.

2. multi-agent prompt generation flow: in prompt generation using a multi-agent system, each agent is responsible for a specific task and the results are combined to generate the final prompt. The following is an example of this flow.

Step 1: Define agent roles
Each agent is responsible for a specific element. For example, the following roles are assigned:
– Context agents: generate prompts for context and scene settings.
– Style agents: determine the style and artistic elements of the production.
– Feedback Agents: evaluate previous productions and suggest improvements.
– Constraint agents: consider constraints such as colour, composition and time.

Step 2: Inter-agent co-ordination
Each agent generates prompts based on its own role and shares information with other agents. For example, if the context agent suggests ‘desert landscape’, the style agent will suggest ‘oil painting style’ based on this.

Step 3: Integration and optimisation of prompts
The suggestions of the individual parts obtained by the collaboration between the agents are integrated to generate the final prompt. At this stage, the constraint agents may reflect specific conditions (e.g. ‘use brighter tones’).

Step 4: Dynamic feedback loop
After generating the prompts, the feedback agent evaluates the results generated by the AI, and all the agents improve the prompts again based on that evaluation. This allows more refined prompts to be generated in the next cycle.
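Steps 1 to 3 of this flow can be sketched in a few lines. Here each agent is modelled, as an assumption, as a plain function that reads a shared draft object and adds its own part; the fixed suggestions returned by each agent and the `coordinate` and `integrate` helpers are hypothetical.

```javascript
// Step 1: each agent is a function that reads the shared draft and adds its own part.
const contextAgent = (draft) => ({ ...draft, context: "desert landscape" });

// The style agent looks at what the context agent proposed before choosing a style.
const styleAgent = (draft) => ({
    ...draft,
    style: draft.context.includes("desert") ? "oil painting style" : "watercolor style",
});

const constraintAgent = (draft) => ({ ...draft, constraints: ["use brighter tones"] });

// Step 2: run the agents in order so each one sees the earlier contributions.
function coordinate(agents) {
    return agents.reduce((draft, agent) => agent(draft), {});
}

// Step 3: integrate the parts into one final prompt.
function integrate(draft) {
    const constraints = draft.constraints || [];
    const suffix = constraints.length > 0 ? `; ${constraints.join(", ")}` : "";
    return `Generate a ${draft.context} in ${draft.style}${suffix}.`;
}

const draft = coordinate([contextAgent, styleAgent, constraintAgent]);
console.log(integrate(draft));
// output: "Generate a desert landscape in oil painting style; use brighter tones."
```

Running the agents sequentially over one shared draft is only one possible coordination scheme; it is chosen here because it makes the dependency of the style agent on the context agent explicit.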

3. example agents: the following are examples of agents available for generative AI

a. Context agent: this agent suggests the overall theme or situation of the prompt. For example, it determines contextual elements such as scenery, character placement, time and place of the scene, etc.

– Example: ‘Generate a scene of a robot standing in the wilderness’.

b. Style agents: style agents determine the artistic style and visual atmosphere of the content to be generated. For example, they suggest painting styles, photo filters, texture features, etc.

– Example: ‘oil painting style’, ‘monochrome’, ‘futuristic’.

c. Constraint agents: constraint agents set specific conditions for the product. For example, they are responsible for colour, time of day, character clothing, physical constraints, etc.

– Example: ‘red as the predominant colour’, ‘character size within 2 metres’.

d. Feedback agent: this agent evaluates the generated content and suggests improvements. It learns from previous prompts and results and provides advice to other agents to improve the AI’s generation accuracy.

– Example: ‘The previous generated results showed that the colours were too dark, so they should be brighter this time’.

4. applications of multi-agent systems:

a. Optimising prompts in image generation: several agents cooperate to generate prompts for the image generation AI. The context agent determines the scene, the style agent chooses the art style, the constraint agent adds colour and composition constraints, and so on, to generate images more in line with the intent.

b. Automatic generation of prompts in text generation: when generating stories and dialogue, each agent is responsible for the story plot, characterisation and emotional tone, and works together to generate the final prompts. This generates a richer and more coherent narrative.

c. Improving prompts by utilising feedback loops: the AI evaluates the generated content and sends the evaluation results to the agent as feedback to be reflected in the next prompt, thereby gradually improving the quality of the generated material. For example, if the previous image generation result did not meet expectations, the feedback agent will point out the problem and generate a prompt next time that corrects that element.
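The feedback loop in (c) can be sketched as below. The evaluation object is assumed to come from some separate scoring step that is not shown; its fields (`tooDark`, `tooBright`) and both function names are hypothetical.

```javascript
// Hypothetical evaluation of a generated image, e.g. produced by a separate scoring step.
const lastEvaluation = { tooDark: true, tooBright: false };

// The feedback agent turns an evaluation into corrections for the next prompt.
function feedbackAgent(evaluation) {
    const corrections = [];
    if (evaluation.tooDark) {
        corrections.push("use brighter colours");
    }
    if (evaluation.tooBright) {
        corrections.push("use darker colours");
    }
    return corrections;
}

// Apply the corrections to the previous prompt to produce the next one.
function nextPrompt(previousPrompt, corrections) {
    if (corrections.length === 0) {
        return previousPrompt;
    }
    const base = previousPrompt.replace(/\.$/, "");
    return `${base}; ${corrections.join("; ")}.`;
}

const improved = nextPrompt("Generate a robot standing in the wilderness.", feedbackAgent(lastEvaluation));
console.log(improved);
// output: "Generate a robot standing in the wilderness; use brighter colours."
```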

5. advantages of multi-agent systems: prompt generation using multi-agent systems has the following advantages

  • Diverse perspectives: more complex and sophisticated prompts can be generated as each agent is responsible for a different perspective or task.
  • Flexibility: the division of roles between agents allows for flexible adaptation to different scenarios and applications.
  • Dynamic feedback: the quality of the prompts can be progressively improved by incorporating feedback on the results generated.
  • Distributed processing: each agent can handle tasks in parallel, making the prompt generation process more efficient.

6. challenges and solutions: although a multi-agent system enables advanced prompt generation, it also presents some challenges:

  • Inter-agent co-ordination: the agents must work together to produce a unified prompt rather than each acting separately, so communication protocols and co-ordination mechanisms between agents are important.
  • Agent design: agents require careful planning, because if the role and knowledge base of each agent are poorly designed, the generated prompts may not be what is expected.

Automatic generation of prompts using multi-agent systems is a powerful approach to providing diverse, high-quality prompts for generative AI. Each agent plays a specialised role and works with the others to generate and improve prompts, allowing the AI to generate more precise and creative content. This approach is particularly effective in complex tasks and creative processes.

Automatic generation of prompts using the knowledge graph

The automatic generation of prompts using knowledge graphs is a method of leveraging the knowledge base to create appropriate prompts for generative AI. By using knowledge graphs, which visualise the relationships between concepts and entities and systematically structure data, the generated prompts can be more contextualised, improving the accuracy and semantic coherence of AI generation.

1. what is a knowledge graph: a knowledge graph is a network structure representing entities (things or concepts) and their relationships. For example, the entity ‘cat’ is associated with the concepts ‘animal’ and ‘pet’, and these relationships are represented as nodes and edges in the graph. Such structures aid systematic understanding and reasoning about information and provide a basis for exploring relationships between data.

2. how knowledge graphs are used to generate prompts:

a. Identifying relevant entities: as a starting point for prompt generation, the main entities and their related entities are extracted from the knowledge graph. For example, if the entity ‘sky’ is selected, related concepts such as ‘cloud’, ‘blue sky’ and ‘bird’ are automatically extracted based on the knowledge graph.

b. Prompt generation reflecting relationships between entities: based on the relationships between entities, highly relevant content is included in the prompt. If the relationship between ‘sky’ and ‘bird’ is recognised by the knowledge graph, the generated prompts can automatically incorporate expressions reflecting the relationship, such as ‘bird flying under a blue sky’.

c. Hierarchical use of information: as entities are structured hierarchically in a knowledge graph, it is possible to generate prompts that take into account higher and lower level concepts. For example, based on the higher-level concept of ‘animal’, it is possible to select sub-concepts such as ‘cat’ and ‘dog’ and incorporate a variety of elements in the generated prompts.

d. Understanding the context of prompts: the knowledge graph facilitates the generation of contextualised prompts. For example, if a user is looking for a generation on ‘history’ or ‘science’, prompts can be dynamically generated around a specific era or concept by utilising nodes in the knowledge graph that are relevant to that domain.
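Points (a) and (b) can be illustrated with a toy graph. The sketch below assumes the knowledge graph is stored as a list of subject, relation, object triples and that each relation type has a hand-written phrasing rule; the triples, the relation names and the `phrasing` table are all made up for the example.

```javascript
// A toy knowledge graph stored as triples; the contents are illustrative only.
const triples = [
    { subject: "bird", relation: "flies_under", object: "sky" },
    { subject: "cloud", relation: "drifts_across", object: "sky" },
    { subject: "cat", relation: "sits_on", object: "chair" },
];

// How each relation type is phrased inside a prompt (a hand-written assumption).
const phrasing = {
    flies_under: (s, o) => `a ${s} flying under the ${o}`,
    drifts_across: (s, o) => `a ${s} drifting across the ${o}`,
    sits_on: (s, o) => `a ${s} sitting on the ${o}`,
};

// (a) Find every triple in which the entity appears as subject or object.
function relatedTriples(entity) {
    return triples.filter((t) => t.subject === entity || t.object === entity);
}

// (b) Build a prompt whose wording reflects the relationships found.
function promptFromGraph(entity) {
    const phrases = relatedTriples(entity).map((t) => phrasing[t.relation](t.subject, t.object));
    if (phrases.length === 0) {
        return `Generate an image of the ${entity}.`;
    }
    return `Generate an image with ${phrases.join(" and ")}.`;
}

console.log(promptFromGraph("sky"));
// output: "Generate an image with a bird flying under the sky and a cloud drifting across the sky."
```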

3. example of prompt generation:

a. Image generation prompt: e.g. if the user wants to generate images on the theme of ‘landscape’, extract entities related to ‘landscape’ from the knowledge graph and combine them to generate a prompt.
– Extract from knowledge graph: landscape -> mountain, lake, sky, sun
– Generated prompt: ‘Mountains rising into the blue sky and a tranquil lake at their foot’

b. Text generation prompts: if the user wants text generation based on historical themes, entities related to a specific period or person are extracted from the Knowledge Graph and contextualised prompts are automatically generated.
– Extract from knowledge graph: history -> medieval -> knights, castles, wars
– Generated prompt: ‘Describe a medieval knight preparing for war at the gates of his castle’
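The hierarchical extraction used in these examples can be sketched with a breadth-first walk over ‘narrower concept’ edges. The `narrower` map below is a hypothetical fragment of a knowledge graph, and the prompt wording is only one possible template.

```javascript
// Hypothetical "narrower concept" edges: each key lists its direct sub-concepts.
const narrower = new Map([
    ["history", ["medieval", "ancient"]],
    ["medieval", ["knights", "castles", "wars"]],
    ["ancient", ["pyramids"]],
]);

// Collect all concepts below a starting concept with a breadth-first walk.
function descendants(start) {
    const found = [];
    const queue = [start];
    const seen = new Set([start]);
    while (queue.length > 0) {
        const current = queue.shift();
        for (const child of narrower.get(current) || []) {
            if (!seen.has(child)) {
                seen.add(child);
                found.push(child);
                queue.push(child);
            }
        }
    }
    return found;
}

// Build a text-generation prompt from the sub-concepts of one period.
function historyPrompt(period) {
    const concepts = descendants(period);
    return `Describe a ${period} scene involving ${concepts.join(", ")}.`;
}

console.log(descendants("history"));
console.log(historyPrompt("medieval"));
// output: "Describe a medieval scene involving knights, castles, wars."
```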

4. advantages of generating prompts using the knowledge graph: 

a. Contextual consistency: by using the knowledge graph, the generated prompts will automatically include relevant elements, thus maintaining contextual consistency. Rather than randomly combining different entities, relevant entities are combined to generate natural and persuasive prompts.

b. Dynamic prompt generation: as the knowledge graph can dynamically explore and add information, the scope and variation of prompt generation increases accordingly as new entities and relationships are added. This allows a variety of prompts to be generated automatically according to user needs.

c. Highly accurate prompts based on knowledge base: knowledge graphs provide highly accurate prompts based on knowledge, as they systematically organise large amounts of data. This increases the probability of a product that is in line with the user’s wishes.

5. applications of prompt generation using the knowledge graph:

a. Creative generation: use the knowledge graph to support the creative generation process. For example, if a user wants to create a new story or artwork based on a specific genre or theme, related entities drawn from the knowledge graph can be used to generate prompts that enhance creativity.

b. Educational content generation: in education, prompts related to specific knowledge domains can be automatically generated to present learners with appropriate tasks and questions. For example, knowledge graphs related to science or history can be used to generate prompts for each topic and learning materials based on them.

c. Personalised content generation: individual prompts are automatically generated from the knowledge graph based on the user’s interests and concerns to provide personalised content. For example, based on a user’s preferred theme or genre, relevant information can be extracted from the knowledge graph and prompts can be generated to meet specific requests.

6. challenges and solutions:

a. Creating and maintaining a knowledge graph: the creation of a knowledge graph requires sufficient data collection and accurate relationship building between entities. In addition, the knowledge graph needs to be continuously updated to incorporate new knowledge and trends. Using AI and automated data analysis to update the graph automatically can help address this.

b. Information bias: If the information contained in the knowledge graph is biased towards certain areas, the prompts generated may also be biased. To avoid this problem, it is important to collect a wide range of data from different sources and ensure that the graphs are balanced.

Automatic generation of prompts using knowledge graphs is a powerful approach to generate highly accurate and contextually relevant prompts based on knowledge, and by utilising relationships and hierarchical structures between entities, it can be applied to a wide range of applications, from creative productions to educational content. Building and managing the knowledge graph is a significant challenge, but when properly managed, it is expected to significantly improve the performance of generative AI.

Automatic generation of prompts using GNNs

Automatic generation of prompts using a Graph Neural Network (GNN) is an approach that exploits the structural information in a knowledge graph, extracting relevant context from the graph data to generate prompts. GNNs are machine learning models suited to graph-structured data and are adept at learning patterns of nodes (entities) and edges (relationships) to understand the relationships between nodes.

In this approach, knowledge graphs and other graph-based datasets are used as input, and GNNs predict and extract the contextual and relevant information needed to generate prompts. The following section describes the flow of automatic prompt generation using GNNs and specific applications.

1. what is a GNN: a GNN is a neural network that learns from graph-structured data by propagating information along nodes (data points) and edges (relationships between nodes). Each node updates its features under the influence of its surrounding nodes and edges, and the final result is an embedded representation of each node.

The features of GNNs enable deep semantic understanding with structured data, such as knowledge graphs, and automatic generation of contextual prompts.

2. prompt generation mechanisms using GNNs:

a. Using knowledge graphs: a knowledge graph contains entities (nodes) and their relationships (edges). For example, if the entity ‘sky’ is associated with related concepts such as ‘weather’ or ‘clouds’, the GNN can be used to learn these relationships.

GNNs understand the relationships between entities by having each node receive features from its neighbours. Based on this information, entities and relationships suitable for prompt generation can be extracted.

b. Node features and prompt generation: the features of a node generated by a GNN represent the semantic information that the node has. For example, contextual prompts are automatically generated based on entities such as ‘blue sky’, ‘cloud’ and ‘sun’ related to ‘sky’.

– Input knowledge graph: sky -> weather -> clouds -> sun
– Output prompt: ‘Describe a landscape with clear skies and drifting clouds’

c. Message passing and contextual understanding: in GNNs, node information is propagated between neighbouring nodes through a process called message passing. This technique allows distant nodes to influence each other indirectly and to better understand the context that should be included in the prompt.

For example, a ‘sky’ node propagates information about ‘weather’ and ‘clouds’, ultimately generating prompts containing contexts such as ‘sunny’ and ‘rainy’.
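To make message passing concrete, the sketch below runs two rounds of a deliberately simplified update on the chain from the example above (sky, weather, clouds, sun): each node’s new feature vector is the mean of its own and its neighbours’ vectors. This is an assumption made for illustration; a real GNN layer would also apply learned weights and a non-linearity, and the one-hot starting features are chosen only so that the flow of information is visible.

```javascript
// Toy graph following the chain in the example above.
const nodes = ["sky", "weather", "clouds", "sun"];
const edges = [["sky", "weather"], ["weather", "clouds"], ["clouds", "sun"]];

// Start from one-hot features so we can see whose information reaches which node.
function initialFeatures() {
    const features = {};
    nodes.forEach((node, i) => {
        features[node] = nodes.map((_, j) => (i === j ? 1 : 0));
    });
    return features;
}

// Build an undirected neighbour list from the edge list.
function buildNeighbours() {
    const neighbours = Object.fromEntries(nodes.map((n) => [n, []]));
    for (const [a, b] of edges) {
        neighbours[a].push(b);
        neighbours[b].push(a);
    }
    return neighbours;
}

// One round: each node's new feature is the mean of its own and its neighbours' features.
function messagePassingRound(features, neighbours) {
    const updated = {};
    for (const node of nodes) {
        const group = [node, ...neighbours[node]];
        updated[node] = features[node].map((_, k) =>
            group.reduce((sum, member) => sum + features[member][k], 0) / group.length
        );
    }
    return updated;
}

const neighbours = buildNeighbours();
let features = initialFeatures();
features = messagePassingRound(features, neighbours);
console.log(features.sky); // after one round, "sky" has mixed in only "weather"
features = messagePassingRound(features, neighbours);
console.log(features.sky); // after two rounds, "clouds" (two hops away) also contributes
```

After two rounds the ‘sky’ vector has a non-zero entry in the ‘clouds’ position but still zero in the ‘sun’ position, which shows how each additional round extends a node’s view by one hop.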

3. example of prompt generation:

a. Contextual image generation prompts using a GNN: for example, nodes related to ‘landscape’ are extracted from the knowledge graph and processed by the GNN, and contextualised prompts are generated automatically.

– Input knowledge graph: landscape -> mountains -> lake -> sky -> clouds
– GNN contextual understanding: landscape with mountains, a lake, and a sky with drifting clouds
– Generated prompt: ‘Landscape with quiet lake and mountains, blue sky with clouds’

b. Text generation prompts: knowledge graphs related to topics such as ‘medieval history’ and ‘science’ are used, and the GNN learns the context to generate prompts according to the user’s wishes.

– Input knowledge graph: medieval -> knight -> castle -> war
– Contextual understanding with GNN: historical battles related to knights and castles
– Generated prompt: ‘Scene of medieval knights gathering in front of a castle to prepare for battle’

4. advantages of generating prompts using GNNs:

a. Deep contextual understanding: the GNN learns the relationships and context between nodes in the knowledge graph and generates prompts based on this. This enables the creation of consistent prompts based on relevant entities, rather than just a list of keywords.

b. Hierarchical information integration: a GNN can generate richer and more complex prompts because it can learn complex contexts while integrating hierarchical information. This is particularly useful for prompts that require extensive knowledge and relationships.

c. Flexible prompt generation: a GNN can generate flexible prompts based on the structure of the knowledge graph and can dynamically generate prompts for different topics and themes by learning data from different domains and understanding their relationships.

5. applications of prompt generation using GNNs:

a. Creative productions: GNNs are used to automatically produce contextually appropriate prompts that support creative outputs (e.g. art, stories).

b. Science and technology-based knowledge generation: using knowledge graphs related to technology and scientific disciplines, GNNs can learn specialist contexts and generate prompts for technical suggestions and solutions. This can be particularly useful in areas that require an understanding of complex concepts and how they relate.

c. Supporting education and training: in the generation of educational content, GNNs can be used to dynamically generate prompts tailored to the learner and provide more personalised tasks and questions. For example, questions and tasks can be generated based on specific historical topics.

6. challenges and solutions:

a. Quality of graph data: the performance of GNNs is highly dependent on the quality of the knowledge graph, so knowledge graphs need to be created and updated properly. This can be addressed by collecting high-quality data and accurately defining entities and relationships.

b. Scalability: the training cost of GNNs for large knowledge graphs can be high, and to overcome this, efficient GNN algorithms and distributed processing techniques can be utilised to improve computational efficiency.
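The text does not name a specific algorithm. One commonly used way to bound the cost, shown here purely as an example, is neighbour sampling: instead of aggregating over all neighbours of a node, each step uses a fixed-size random subset. The function name, the `limit` parameter and the sample data below are hypothetical.

```javascript
// Sample at most `limit` neighbours so each aggregation step touches a bounded number of nodes.
// `random` is injectable so the behaviour can be made reproducible.
function sampleNeighbours(neighbourList, limit, random = Math.random) {
    if (neighbourList.length <= limit) {
        return [...neighbourList];
    }
    // Partial Fisher-Yates shuffle: only the first `limit` positions are randomised.
    const pool = [...neighbourList];
    for (let i = 0; i < limit; i++) {
        const j = i + Math.floor(random() * (pool.length - i));
        [pool[i], pool[j]] = [pool[j], pool[i]];
    }
    return pool.slice(0, limit);
}

const hubNeighbours = ["mountains", "lake", "sky", "clouds", "river", "forest"];
const sampled = sampleNeighbours(hubNeighbours, 3);
console.log(sampled.length); // always 3 for this input; which neighbours are chosen varies
```

With sampling, the work per node no longer grows with the number of neighbours, at the price of some variance in the aggregated features.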

Automatic prompt generation using GNNs is an innovative approach to generating high-quality prompts based on contextual understanding of graph structures. By combining knowledge graphs and GNNs, it is possible to create contextually consistent prompts automatically and improve the performance of generative AI. It is particularly useful in cases requiring complex knowledge structures.

Reference books

The following reference books on the application of automatic prompt generation and knowledge graphs using GNNs (Graph Neural Networks) are available.

1. foundations of GNNs and graph theory:

– “Graph Neural Networks: Foundations, Frontiers, and Applications” by Lingfei Wu, Peng Cui, Jian Pei, and Liang Zhao

– “Deep Learning on Graphs” by Yao Ma and Jiliang Tang

2. books related to knowledge graphs:

– “Knowledge Graphs: Fundamentals, Techniques, and Applications” by Mayank Kejriwal, Craig A. Knoblock, and Pedro Szekely

– “Graph-Powered Machine Learning: Learn how to perform machine learning on graph data and leverage its predictive power” by Alessandro Negro

3. relevant books on natural language processing and prompt generation:

– “Natural Language Processing with PyTorch: Build Intelligent Language Applications Using Deep Learning” by Delip Rao and Brian McMahan

– “Deep Learning for Natural Language Processing: Creating Neural Networks with Python” by Palash Goyal, Sumit Pandey, and Karan Jain

4. reinforcement learning and generative models:
– “Reinforcement Learning: An Introduction” by Richard S. Sutton and Andrew G. Barto
