Contents

Preface

Chapter 1: What is Reinforcement Learning?
    Learning - supervised, unsupervised, and reinforcement
    RL formalisms and relations
    Reward
    The agent
    The environment
    Actions
    Observations
    Markov decision processes
    Markov process
    Markov reward process
    Markov decision process
    Summary

Chapter 2: OpenAI Gym
    The anatomy of the agent
    Hardware and software requirements
    OpenAI Gym API
    Action space
    Observation space
    The environment
    Creation of the environment
    The CartPole session
    The random CartPole agent
    The extra Gym functionality - wrappers and monitors
    Wrappers
    Monitor
    Summary

Chapter 3: Deep Learning with PyTorch
    Tensors
    Creation of tensors
    Scalar tensors
    Tensor operations
    GPU tensors
    Gradients
    Tensors and gradients
    NN building blocks
    Custom layers
    Final glue - loss functions and optimizers
    Loss functions
    Optimizers
    Monitoring with TensorBoard
    TensorBoard 101
    Plotting stuff
    Example - GAN on Atari images
    Summary

Chapter 4: The Cross-Entropy Method
    Taxonomy of RL methods
    Practical cross-entropy
    Cross-entropy on CartPole