sarsa

Web: 撒尔沙 (transliteration of "sarsa"); 沙士 (sarsaparilla soft drink); 洋菝葜 (sarsaparilla, a Smilax plant)

Medicine

Bilingual example sentences

A SARSA(λ) Reinforcement Learning Algorithm Based on State Clustering
一种基于状态聚类的SARSA（λ）强化学习算法
Learning in this method is divided into two processes: state-space learning, which uses the K-means clustering algorithm to adaptively discretize the continuous state space, and policy learning, which uses the Sarsa algorithm with replacing eligibility traces to find the optimal policy.
该方法的学习过程分为两部分：对连续状态空间进行自适应离散化的状态空间学习，使用K-均值聚类算法；寻找最优策略的策略学习，使用替代合适迹Sarsa学习算法。
Reinforcement learning and two classes of learning algorithms, Q-learning and SARSA learning, are introduced. A state discretization method based on RBF functions is proposed for reinforcement learning, and preliminary empirical results comparing the performance of the new method are presented.
介绍了激励学习和两类学习算法：Q学习和SARSA学习，提出一类基于RBF函数的特征状态离散化方法，并对该方法进行了初步的实验比较。
Sarsa Reinforcement Learning Algorithm Based on Neural Networks
基于神经网络的Sarsa强化学习算法
Based on eligibility trace theory, a delayed fast reinforcement learning algorithm DFSARSA(λ) is proposed in this paper.
在对资格迹理论研究的基础上，提出了一种延迟快速强化学习算法DFSARSA（λ）（延迟快速SARSA（λ）算法）。
Based on the factored representation of a state, a new SARSA(λ) algorithm is proposed.
基于状态的因素化表达，提出了一个新的SARSA（λ）激励学习算法。
Also, the policy learned by Actor-Critic is better than that learned by Sarsa(λ), a value-based reinforcement learning method, provided that the players have a 360-degree view and the problem itself is not too large.
对于小的问题，球员在360度视角下，通过Actor-Critic强化学习方法得到的策略比基于值函数强化学习方法Sarsa（λ）得到的策略要好。
Conventional reinforcement learning methods such as Q-learning, TD learning, and Sarsa learning share a common characteristic: they estimate only the value function, and action selection is determined entirely by that estimate.
强化学习中常用算法如Q-学习、TD学习、Sarsa学习的一个共同特点是仅对值函数进行估计，动作选择策略则由值函数的估计完全确定。
