Generative machine learning tools: text-generation-webui and AUTOMATIC1111
There are open-source tools such as text-generation-webui and AUTOMATIC1111 that allow code-free use of generative models such as ChatGPT, described in “Overview of ChatGPT and LangChain and their use”, and Stable Diffusion, described in “Stable Diffusion and LoRA Applications“. This article describes how to use these text-generation and image-generation tools.
First, let’s look at text-generation-webui.
text-generation-webui
The “Text generation web UI” is a tool that makes it easy to use language models such as GPT and LLaMA through a web-app-style UI. With this tool you can easily download new models and switch between multiple models.
First, set up the Homebrew environment; see “Getting started with Clojure (1) Setting up the environment (spacemacs and leiningen)” for details. Python 3.10 is required, and pyenv is used to manage Python versions. After installing pyenv with Homebrew (brew install pyenv), make sure it is reachable through your PATH via your shell init file (.zshrc, .bash_profile, etc.) and that the version is displayed (pyenv --version). Then install Python with pyenv install 3.10.xx and switch to it with pyenv global 3.10.xx to finish setting up the environment.
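The pyenv steps above can be sketched as follows (3.10.13 is one example patch release — use the latest 3.10.x; the zsh init line is an assumption for a default macOS shell):

```shell
# install pyenv via Homebrew
brew install pyenv

# enable pyenv shims in the shell init file (here: zsh)
echo 'eval "$(pyenv init -)"' >> ~/.zshrc

# confirm pyenv is found through the PATH
pyenv --version

# install a Python 3.10 release and make it the default
pyenv install 3.10.13
pyenv global 3.10.13
python --version   # should report 3.10.x
```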
Then clone the repository from git and move to the top of the cloned folder.
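On macOS, the clone-and-launch steps might look like the following sketch (the repository URL is the oobabooga project's; the file names `requirements.txt` and `server.py` reflect the repository layout at the time of writing):

```shell
# clone text-generation-webui and move to the top of the folder
git clone https://github.com/oobabooga/text-generation-webui.git
cd text-generation-webui

# install the Python dependencies (assumes Python 3.10 is active via pyenv)
pip install -r requirements.txt

# start the web UI; by default it serves on http://127.0.0.1:7860/
python server.py
```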
On Windows, download and unzip the zip archive and double-click “start”; this installs the web UI and all dependencies into the same folder. After that, it can be used just as on the Mac by running “start_windows.bat” in the downloaded oobabooga-windows folder.
GPT: GPT (Generative Pre-trained Transformer) is a language model based on the Transformer architecture. GPT is pre-trained on large datasets and predicts the next word or sentence. GPT-4 is the best-known version and has demonstrated high performance on many tasks. See “Overview of GPT and examples of algorithms and implementations“ for details.
DialoGPT: DialoGPT is a dialogue-oriented language generation model developed by Microsoft, based on the GPT (Generative Pre-trained Transformer) architecture. Trained on a large conversational dataset, it understands dialogue context and generates responses: the model takes previous utterances and context into account and produces a sequence of tokens as a reply. It can also output multiple candidate responses for a single dialogue turn.
BERT: BERT (Bidirectional Encoder Representations from Transformers) is a language model that uses bidirectional transformer encoders. BERT is widely used, especially for natural language processing tasks. For more information on BERT, see “BERT Overview, Algorithms, and Example Implementations“.
XLNet: XLNet is based on the Transformer architecture and can be trained using both forward and bidirectional context. This allows the language model to handle context more flexibly and make more accurate predictions.
T5: T5 (Text-to-Text Transfer Transformer) is a model that can be applied to a variety of natural language processing tasks, such as machine translation, summarization, question answering, and document classification.
PaLM: PaLM (Pathways Language Model) is a model released by Google in 2022 with 540 billion parameters. (OpenAI’s GPT-3, released before PaLM, has 175 billion parameters.)
LLaMA: LLaMA (Large Language Model Meta AI) is a large-scale language model released by Meta in February 2023. Because LLaMA achieves high accuracy while keeping the number of parameters low, researchers around the world can explore the possibilities of various large-scale language models based on LLaMA.
OpenFlamingo: An open-source reimplementation of “Flamingo”, the model developed by DeepMind, released by the German non-profit organization LAION.
Vicuna 13B: An open-source chatbot based on LLaMA and fine-tuned on user-shared ChatGPT conversations, with performance close to ChatGPT (about 90%) despite a training cost of around $300.
Alpaca 7B: A model fine-tuned from LLaMA on instruction-following data that was generated automatically (self-instruct).
NeMo LLM: A large-scale language model service developed by NVIDIA that, like GPT-4, supports document generation, image generation, translation, coding, and so on.
Claude: A model developed by Anthropic, a company founded by engineers who were involved in the development of GPT-2/3 at OpenAI.
AUTOMATIC1111
The AUTOMATIC1111 version of the Stable Diffusion Web UI is the most feature-rich of the open-source “Stable Diffusion” front ends. In addition to easy operation through the Web UI, it includes almost all features, such as loading additional trained models, additional training methods such as LoRA, face restoration using GFPGAN, and high-quality image upscaling.
The installation procedure is described below.
<Starting up on a Mac>
First, set up the Homebrew environment as for the Text generation web UI. Next, install the necessary tools and clone AUTOMATIC1111 from git.
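A sketch of these steps on macOS (the package list follows the project's Apple Silicon setup notes; treat the exact packages as an assumption for your environment):

```shell
# tools commonly required for building the dependencies on macOS
brew install cmake protobuf rust python@3.10 git wget

# clone AUTOMATIC1111's web UI into the home folder
cd ~
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
```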
This will create a stable-diffusion-webui folder in your home folder. Next, download the training model (e.g. stable-diffusion-v-1-4-original) and move it to the stable-diffusion-webui/models/Stable-diffusion folder.
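Moving the checkpoint into place might look like this (the stable-diffusion-v-1-4-original distribution contains `sd-v1-4.ckpt`; the `~/Downloads` source path is an assumption — adjust it to wherever the file was saved):

```shell
# place the downloaded checkpoint where the web UI looks for models
mv ~/Downloads/sd-v1-4.ckpt ~/stable-diffusion-webui/models/Stable-diffusion/
```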
Now move to the stable-diffusion-webui folder and run the script “webui.sh”. Once it has started, open http://127.0.0.1:7860/ in a browser.
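The launch step, sketched as commands:

```shell
# launch the web UI; on first run the script sets up a virtual
# environment and installs dependencies, which takes a while
cd ~/stable-diffusion-webui
./webui.sh
# then open http://127.0.0.1:7860/ in a browser
```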
All that remains is to enter a prompt, set the options, and press “Generate”; after a few moments the generated image will be output.
<Starting up Windows>
Windows can be set up in much the same way as the procedure for Macs.