Clojureでのビタビアルゴリズムと確率的生成モデルでのHMM

Clojureを用いたビタビアルゴリズムと確率的生成モデルによる隠れマルコフモデルの実装

Clojure

2023.10.17 2022.10.06

機械学習技術確率的生成モデルサポートベクトルマシンスパースモデリング人工知能技術デジタルトランスフォーメーション技術一般的な機械学習とデータ分析 Clojure 異常検知・変化検知技術関係データ学習時系列データ解析本ブログのナビ

Clojureを用いたビタビアルゴリズムと確率的生成モデルによる隠れマルコフモデルの実装

隠れマルコフモデルとは、確率モデルの一つであり、観測されない(隠れた)状態を持つマルコフ過程となる。状態が直接観測可能なマルコフ過程と異なり、観測されたデータの情報を使って、その裏側にある「隠れた」状態を推測するものとなる。下図はwikiページから参照した隠れマルコフモデルの例となる。図中上部の丸(1〜5)は状態遷移、下に並ぶシンボルが観測される情報となる。実際に観測された情報が波線下部のようなシンボルの場合に、それらを出力する状態系列の候補は「532532」「432532」「312532」の3つとなる。このような最尤観測系列の問題は後述するビタビアルゴリズムで効率的に解くことができる。

具体的な応用例としては、音声認識や自然言語あるいは、画像やIOT等のセンサー情報の処理などのあらかじめメタルール(状態の遷移)が決まっているものに対して、隠れ状態の遷移をあらかじめ設定して、入力の情報系列に対して隠れ状態を推定するものや、さらにそれらを”BERTの概要とアルゴリズム及び実装例について“で述べているBERT等と組み合わせたもの、あるいはノンパラメトリックベイズ等の確率的生成モデルと組み合わせて状態遷移を推論する手法等さまざまなものがある。

ここではまずClojureを用いたビタビアルゴリズムについて述べる。利用するモジュールは”clj-viterbi“となる。以下に実装について述べる。Clojure環境の立ち上げについては”Clojureを始めよう“を参照のこと。

まずは”lein new viterbi-test(任意のプロジェクト名)”で新規プロジェクトを作成する。作成されたプロジェクトフォルダー中のproject.cljファイルに以下のように[com.lemonodor.viterbi “0.1.0”]をライブラリー追加を行う。

(defproject viterbi-test "0.1.0-SNAPSHOT"
  :description "FIXME: write description"
  :url "http://example.com/FIXME"
  :license {:name "EPL-2.0 OR GPL-2.0-or-later WITH Classpath-exception-2.0"
            :url "https://www.eclipse.org/legal/epl-2.0/"}
  :dependencies [[org.clojure/clojure "1.11.1"]
                 [com.lemonodor.viterbi "0.1.0"]]
  :repl-options {:init-ns viterbi-test.core})

次にcore.cljファイルでHMMの定義と評価を行う。今回定義するのは以下のようなHMMとなる。

観測される状態が”Dizzy(めまい)”、”Cold(寒気)”、”Normal(通常)”で隠れ状態が”Healthy(健康)”、”Fever(熱)”となっている。これを用いたHMMのコードは以下のようになる。

(ns viterbi-test.core
  (:require [com.lemonodor.viterbi :as viterbi]))

(def hmm
  (viterbi/make-hmm
   :states [:healthy :fever]
   :start-p {:healthy 0.6, :fever 0.4}
   :trans-p (let [t {:healthy {:healthy 0.7, :fever 0.3},
                     :fever {:healthy 0.4, :fever 0.6}}]
              #((t %1) %2))
   :emit-p (let [t {:healthy {:normal 0.5, :cold 0.4, :dizzy 0.1},
                    :fever {:normal 0.1, :cold 0.3, :dizzy 0.6}}]
             #((t %1) %2))))

:state変数で隠れ状態を設定、:start-p変数でそれぞれの初期確率(上図ではstartからの確率として表示)を設定、:trans-p変数で隠れ状態の状態遷移確率を設定し、:emit-p変数で隠れ状態から観測状態への遷移確率を定義する。

これらを評価すると以下のようになる。

(viterbi/viterbi hmm [:normal :cold :dizzy])
;; -> [-1.8204482088348124 [:healthy :healthy :fever]]

出力されたデータは最尤推定された状態遷移とその確率(log10-probability)となる。

次に、以前述べたanglicanを使ったHMMについて述べる、環境設定等は”Clojureを用いた確率的プログラミング“を参照のこと。前述と同様にprojectファィルを作り、project.cljファイルにライブラリを追加する。

(defproject hmm-test "0.1.0-SNAPSHOT"
  :description "FIXME: write description"
  :url "http://example.com/FIXME"
  :license {:name "EPL-2.0 OR GPL-2.0-or-later WITH Classpath-exception-2.0"
            :url "https://www.eclipse.org/legal/epl-2.0/"}
  :dependencies [[org.clojure/clojure "1.11.1"]
                 [anglican "1.1.0"]
                 [org.clojure/data.csv "1.0.1"]
                 [net.mikera/core.matrix "0.62.0"]
                 ;[plotly-clj "0.1.1"]
                 ;[org.nfrac/cljbox2d "0.5.0"]
                 ;[quil "3.1.0"]
                 ]
  :main ^:skip-aot gr-anglican.core
  :target-path "target/%s"
  :plugins [[org.clojars.benfb/lein-gorilla "0.6.0"]]
  :profiles {:uberjar {:aot :all
                       :jvm-opts ["-Dclojure.compiler.direct-linking=true"]}})

次にHMMのモデルを記述する。

(ns hmm-test
  (:require [gorilla-plot.core :as plot]
            [clojure.core.matrix :as m]
            [anglican.stat :as s]
            :reload)
  (:use clojure.repl
        [anglican 
          core runtime emit 
          [inference :only [collect-by]]]))

(defn index->ind 
  "converts a collection of indices to a matrix of indicator vectors"
  [values]
  (let [max-v (reduce max values)
        zero-vec (into [] (repeat (inc max-v) 0))]
    (m/matrix (map #(assoc zero-vec % 1) values))))

(defn square 
  [x]
  (m/mul x x))
(defquery hmm
  "takes observation sequence and
  known initial state, transition,
  and observation distributions and
  returns posterior samples of the 
  joint latent state sequence"
  [observations init-dist trans-dists obs-dists]
    (reduce 
      (fn [states obs]
        (let [state (sample (get trans-dists
                                 (peek states)))]
          (observe (get obs-dists state) obs)
          (conj states state)))
      [(sample init-dist)]
      observations))

パラメータを記述する。今回は16観察データで、3状態のパラメータとする。

(def data 
  "observation sequence (16 points)"
  [0.9 0.8 0.7 0.0 
   -0.025 -5.0 -2.0 -0.1 
   0.0 0.13 0.45 6 
   0.2 0.3 -1 -1])

(def init-dist
  "distribution on 0-th state in sequence 
  (before first observation)"
  (discrete [1.0 1.0 1.0]))

(def trans-dists
  "transition distribution for each state"
  {0 (discrete [0.1 0.5 0.4])
   1 (discrete [0.2 0.2 0.6])
   2 (discrete [0.15 0.15 0.7])})

(def obs-dists
  "observation distribution for each state"
  {0 (normal -1 1)
   1 (normal 1 1)
   2 (normal 0 1)})

Sequential Monte Carlo methodを用いた推論を実行する

(def number-of-particles 1000)
(def number-of-samples 10000)

(def samples
  (->> (doquery :smc hmm
                [data 
                 init-dist
                 trans-dists
                 obs-dists]
                :number-of-particles number-of-particles)
       (take number-of-samples)
       doall
       time))

Forward-backward方で計算した分布と比較する。

(def posterior 
  "ground truth postior on states, 
  calculated using forward-backwardi'm "
  [[ 0.3775 0.3092 0.3133]
   [ 0.0416 0.4045 0.5539]
   [ 0.0541 0.2552 0.6907]
   [ 0.0455 0.2301 0.7244]
   [ 0.1062 0.1217 0.7721]
   [ 0.0714 0.1732 0.7554]
   [ 0.9300 0.0001 0.0699]
   [ 0.4577 0.0452 0.4971]
   [ 0.0926 0.2169 0.6905]
   [ 0.1014 0.1359 0.7626]
   [ 0.0985 0.1575 0.744 ]
   [ 0.1781 0.2198 0.6022]
   [ 0.0000 0.9848 0.0152]
   [ 0.1130 0.1674 0.7195]
   [ 0.0557 0.1848 0.7595]
   [ 0.2017 0.0472 0.7511]
   [ 0.2545 0.0611 0.6844]])

(def num-sample-range (mapv (partial * number-of-samples)
                            [1e-2 2e-2 5e-2 1e-1 2e-1 5e-1 1]))  
(def L2-errors
   (map (fn [n]
          (->> ;; use first n samples
               (take n samples)
               ;; calculate L2 error of empirical posterior relative to true posterior
               (collect-by (comp index->ind :result))
               s/empirical-mean
               (l2-norm posterior)))
        num-sample-range))

(plot/list-plot (map vector 
                     (map #(/ (log %) (log 10)) num-sample-range)
                     (map #(/ (log %) (log 10)) L2-errors))
                :joined true
                :color "#05A"
                :x-title "log number of samples"
                :y-title "log L2 error")

結果は以下のようになる。

以前”隠れマルコフモデルと状態空間モデルの違いと状態空間モデルのパラメータ推定“で述べたように隠れマルコフモデルの状態を連続的にすると状態空間モデルとなる。カルマンフィルタへの確率的アプローチに関しては別途述べる。