Reinforcement Learning | Page 2 | Deus Ex Machina

Overview of ACKTR, Its Algorithm, and Implementation Examples

Overview of ACKTR. ACKTR (Actor-Critic using Kronecker-factored Trust Region) is one of the reinforcement learning algorithms, and, as described in "About Trust Region Methods", ...

2024.09.06

python, Algorithms, Reinforcement Learning, Machine Learning, Deep Learning

Overview of Optimal Control-based Inverse Reinforcement Learning. Optimal control-based inverse reinforcement learning (Optimal Control-based ...

2024.08.30

python, Algorithms, Reinforcement Learning, Machine Learning, Deep Learning

Overview of Maximum Entropy Inverse Reinforcement Learning (MaxEnt IRL). Maximum entropy inverse reinforcement learning (Maximum Entropy ...

2024.08.23

python, Algorithms, Reinforcement Learning, Machine Learning, Deep Learning

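Overview of Inverse Reinforcement Learning. Inverse reinforcement learning (IRL) is a type of reinforcement learning in which the reward function behind an expert's decision-making is learned from the expert's behavior data ...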

2024.08.16

python, Algorithms, Bandit Problems, Reinforcement Learning, Machine Learning, Deep Learning

Overview of TD3 (Twin Delayed Deep Deterministic Policy Gradient). TD3 (Twin Delayed Deep Deterministic Policy Gradien...

2024.08.09

python, Algorithms, Reinforcement Learning, Machine Learning, Deep Learning

Overview of Double Q-Learning. Double Q-Learning is a variant of the Q-Learning described in "Overview of Q-Learning, Its Algorithm, and Implementation Examples", and is a reinforcement learning ...

2024.08.02

python, Algorithms, Reinforcement Learning, Machine Learning, Deep Learning

Overview of Trust Region Policy Optimization (TRPO). Trust Region Policy Optimization (TRPO) is a reinforcement learning algorithm, and "Overview of Policy Gradient Methods...

2024.07.26

python, Algorithms, Reinforcement Learning, Machine Learning, Deep Learning

Overview of Drift-based Inverse Reinforcement Learning. Drift detection-based inverse reinforcement learning (Drift-based Inverse Reinforc...

2024.07.19

python, Algorithms, Reinforcement Learning, Machine Learning, Deep Learning

Overview of Feature-based Inverse Reinforcement Learning. Feature-based inverse reinforcement learning (Feature-based Inverse Reinforcement Lear...

2024.07.12

python, Algorithms, Reinforcement Learning, Machine Learning, Deep Learning

Artificial General Intelligence. AGI, one of the main themes of this blog, is an abbreviation of Artificial General Intelligence, and human...

2024.07.06

Algorithms, Artificial Intelligence, Reinforcement Learning