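Quadruped Robot Research Project Work Summary: Month 7, Week 2

Weekly Summary

Main work completed this week:

Set up the rlgpu virtual environment and installed Isaac Gym and IsaacGymEnvs
Read the MICRO_Quadruped_ARCHIVE source code and added comments
Studied trust region methods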

2023.07.10

Unitree SDK

Unitree Robotics Developer Knowledge Base (yuque.com)

Getting Started with Unitree Quadruped Robot Development

Zotero + WebDAV + Synology NAS

By default, Zotero will sync your local data with the Zotero servers whenever changes are made.

Item data is synced between the cloud database and the local data directory; attachments are kept in its storage folder

Data syncing syncs library items, but doesn’t sync attached files (PDFs, audio and video files, images, etc.). To sync these files, you can set up file syncing to accompany data syncing, using either Zotero Storage or WebDAV.

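Item data is synced through Zotero; attachments are synced through either Zotero Storage or WebDAV

A Detailed Explanation of Zotero's Mechanisms - Zhihu

An Incomplete Tutorial on Zotero Reference Management and Research Notes | loturest

Solution:

Preferences $\rightarrow$ Sync $\rightarrow$ File Syncing
First select Zotero
Then switch to WebDAV
At this point the files in the NAS cloud storage matched the files in Zotero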

NAS

2023.07.11

Breakout / PPO / Hyperparameter Tuning

Choosing learning_rate

lr_set = {1e-2, 2e-2, 5e-2, 1e-3, 2e-3, 5e-3, 1e-4, 2e-4, 5e-4, 1e-5, 2e-5, 5e-5}
# 1e-6, 2e-6, 5e-6, 1e-7, 2e-7

Setting a learning rate schedule $*$

from typing import Callable
from stable_baselines3 import PPO


def linear_schedule(initial_value: float) -> Callable[[float], float]:
    """
    Linear learning rate schedule.

    :param initial_value: Initial learning rate.
    :return: schedule that computes
      current learning rate depending on remaining progress
    """
    def func(progress_remaining: float) -> float:
        """
        Progress will decrease from 1 (beginning) to 0.

        :param progress_remaining:
        :return: current learning rate
        """
        return progress_remaining * initial_value

    return func

# Initial learning rate of 0.001
model = PPO("MlpPolicy", "CartPole-v1", learning_rate=linear_schedule(0.001), verbose=1)
model.learn(total_timesteps=20_000)
# By default, `reset_num_timesteps` is True, in which case the learning rate schedule resets.
# progress_remaining = 1.0 - (num_timesteps / total_timesteps)
model.learn(total_timesteps=10_000, reset_num_timesteps=True)
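To see what the schedule returns over a run, the following sketch evaluates it at a few points without Stable-Baselines3. The helper `progress_remaining` is a hypothetical name that only applies the formula quoted in the comment above; it is not part of the library.

```python
from typing import Callable


def linear_schedule(initial_value: float) -> Callable[[float], float]:
    """Return a schedule mapping remaining progress (1 -> 0) to a learning rate."""
    def func(progress_remaining: float) -> float:
        return progress_remaining * initial_value
    return func


def progress_remaining(num_timesteps: int, total_timesteps: int) -> float:
    # Hypothetical helper: the formula quoted in the comment above.
    return 1.0 - (num_timesteps / total_timesteps)


schedule = linear_schedule(0.001)
for step in (0, 5_000, 10_000, 20_000):
    lr = schedule(progress_remaining(step, total_timesteps=20_000))
    print(f"step={step:>6}  lr={lr:.6f}")
```

The learning rate starts at the initial value and reaches zero at the last timestep.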

The role of batch_size

batch_size=128

Does an overly large batch_size reduce the model's ability to generalize?

Rereading the official Stable-Baselines3 documentation

References

Paper: Stable-Baselines3: Reliable Reinforcement Learning Implementations

Reinforcement Learning Tips and Tricks Web$*$ Slides$*$

Custom Task Definition

observation space
action space
reward function
termination conditions

Choosing the observation space

enough information to solve the task
do not break Markov assumption
normalization

Algorithm Selection

identify the action space: discrete / continuous / …
prefer sample efficiency or wall-clock speed?
can the training be parallelized? See Vectorized Environments to learn more about training with multiple workers

Creating a custom environment

normalize the observation space and action space (when continuous)
If there is some time delay between action and observation (e.g. due to Wi-Fi communication), you should give a history of observations as input.

Vectorized Environments $*$

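I cannot work out the difference between DummyVecEnv and SubprocVecEnv. In the official table the former does not support multiprocessing while the latter does, yet in actual training the former was faster?! A likely explanation, noted in the Stable-Baselines3 documentation: for lightweight environments, the inter-process communication overhead of SubprocVecEnv can outweigh the gain from stepping environments in parallel.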
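One way to picture DummyVecEnv is as a loop that steps each environment in turn inside the calling process, so no data crosses process boundaries. The sketch below is a toy illustration under that assumption, not the Stable-Baselines3 implementation; `CounterEnv` and `SequentialVecEnv` are hypothetical names.

```python
from typing import Callable, List, Tuple


class CounterEnv:
    """Toy environment: the state is a counter, the episode ends at `limit`."""

    def __init__(self, limit: int = 3) -> None:
        self.limit = limit
        self.state = 0

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool]:
        self.state += action
        done = self.state >= self.limit
        return self.state, float(action), done


class SequentialVecEnv:
    """Step several environments one after another in the calling process."""

    def __init__(self, env_fns: List[Callable[[], CounterEnv]]) -> None:
        self.envs = [fn() for fn in env_fns]

    def reset(self) -> List[int]:
        return [env.reset() for env in self.envs]

    def step(self, actions: List[int]) -> Tuple[List[int], List[float], List[bool]]:
        obs, rewards, dones = [], [], []
        for env, action in zip(self.envs, actions):
            o, r, d = env.step(action)
            if d:
                o = env.reset()  # reset finished environments automatically
            obs.append(o)
            rewards.append(r)
            dones.append(d)
        return obs, rewards, dones


vec_env = SequentialVecEnv([lambda: CounterEnv(limit=2) for _ in range(4)])
vec_env.reset()
result = vec_env.step([1, 1, 2, 0])
print(result)
```

A subprocess-based variant would have to serialize every action and observation between processes, which is the overhead that can dominate when a single step is cheap.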

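Summary and Analysis of Low GPU Utilization, Very Low CPU Utilization, and Slow Model Training in Deep Learning with PyTorch and TensorFlow_Low GPU Utilization in Deep Learning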

2023.07.12

NVIDIA / Isaac Omniverse / Isaac Sim

Isaac Gym Preview 4 & IsaacGymEnvs

Paper: Isaac Gym: High Performance GPU-Based Physics Simulation For Robot Learning

Forum: Latest Robotics - Isaac/Isaac Gym topics - NVIDIA Developer Forums

Projects:

GitHub: GitHub | NVIDIA-Omniverse/IsaacGymEnvs: Isaac Gym Reinforcement Learning Environments

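Mirror: GitCode | mirrors / NVIDIA-Omniverse / IsaacGymEnvs · GitCode

Installation and environment setup references

Blog:

Documentation:

Open index.html in the isaacgym/docs directory

Key facts

Isaac Gym is a physics simulation environment developed by NVIDIA for reinforcement learning. It is based on the OpenAI Gym library; physics is computed on the GPU and the results can be received as PyTorch GPU tensors, which enables fast simulation and learning. Physics simulation uses PhysX, and soft-body simulation with FleX is also supported.
The main Isaac Gym package can be downloaded for free from NVIDIA's developer page. The documentation is stored in HTML format in the package's "docs" directory (note that it is not available on the website).
Recommended environment:
- Ubuntu 18.04, 20.04
- Python 3.6~3.8
- NVIDIA Driver >= 470
The file rlgpu_conda_envs.yml lists Isaac Gym's dependencies:
- python=3.7
- pytorch=1.8.1
- torchvision=0.9.1
- cudatoolkit=11.1
- pyyaml>=5.3.1
- scipy>=1.5.0
- tensorboard>=2.2.1
IsaacGymEnvs is a Python package for testing reinforcement learning environments in Isaac Gym. By referring to the tasks it implements, you can easily build reinforcement learning environments that use the algorithms implemented in rl-games. Even those who plan to write their own reinforcement learning algorithms are advised to try this package together with Isaac Gym. It was originally bundled with Isaac Gym, was split out in Preview 3, and is now publicly available on GitHub.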

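Installation and troubleshooting

Read step 5 (installing Isaac Gym) before carrying out steps 2 and 3 (creating the environment and installing the dependencies)!

Environment
- Ubuntu 22.04 LTS (may cause problems? If something breaks later, check this first when tracing the cause)
- Python 3.7
- NVIDIA Driver 525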
Create a virtual environment named rlgpu and activate it

(base) ... ~$ conda create -n rlgpu python=3.7
(base) ... ~$ source activate rlgpu

Install the dependencies

Isaac Gym mainly depends on pytorch, torchvision, cudatoolkit, and tensorboard, while IsaacGymEnvs depends on Isaac Gym

(rlgpu) ... ~$ conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
(rlgpu) ... ~$ pip install tensorboard

Download Isaac Gym - Preview Release

The first login requires registering as a member (free)
Install Isaac Gym

Extract IsaacGym_Preview_4_Package.tar.gz into the home directory

Go to IsaacGym_Preview_4_Package/isaacgym/docs, open index.html to read the official documentation, and follow the installation steps
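- Method 1 (failed): create a new virtual environment directly with the provided script
  (base) ... ~$ cd isaacgym/python/

  (base) ... ~$ sh ../create_conda_env_rlgpu.sh
  # or
  (base) ... ~$ bash ../create_conda_env_rlgpu.sh
  Creating the new environment stayed stuck at Solving Environment
- Method 2 (succeeded): after creating the environment as in steps 2 and 3
  (base) ... ~$ source activate rlgpu
After choosing one of the two methods above, install Isaac Gym as follows: return to the python subdirectory, activate the rlgpu environment, and run
(rlgpu) ... ~/isaacgym/python$ pip install -e .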

Install IsaacGymEnvs

Install directly via git clone

git clone https://github.com/NVIDIA-Omniverse/IsaacGymEnvs.git
cd IsaacGymEnvs

This failed, so clone from the mirror instead

git clone https://gitcode.net/mirrors/NVIDIA-Omniverse/IsaacGymEnvs.git

Then, as before

(rlgpu) ... ~/IsaacGymEnvs$ pip install -e .

Test

Open the IsaacGymEnvs project in PyCharm, choose to add a local Python interpreter in the bottom-right corner, select Conda, then select rlgpu

Open train.py in the isaacgymenvs directory and click Run; this raised ImportError: libpython3.7m.so.1.0: cannot open shared object file:...

Fixing the error

First locate libpython3.7m.so.1.0 on the system

(base) ... ~$ find / -name "libpython*"

The output may look like this

...
/home/cc/anaconda3/envs/rlgpu/lib/libpython3.so
/home/cc/anaconda3/envs/rlgpu/lib/libpython3.7m.so
/home/cc/anaconda3/envs/rlgpu/lib/libpython3.7m.so.1.0
...

Take any copy of libpython3.7m.so.1.0 that was found and copy it into /usr/lib/x86_64-linux-gnu

(base) ... ~$ sudo cp /home/cc/anaconda3/envs/rlgpu/lib/libpython3.7m.so.1.0 /usr/lib/x86_64-linux-gnu

Run train.py again; this time it succeeds
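Instead of searching the whole filesystem, the active interpreter can report where its shared library should be. The sketch below uses only the standard `sysconfig` module; the function name is hypothetical, and the config variables it reads may be unset on some builds, which is why it falls back and may return an empty list. A commonly used alternative to copying the file into a system directory is to add the environment's lib directory to LD_LIBRARY_PATH, which is why that directory is printed at the end.

```python
import os
import sys
import sysconfig
from typing import List


def libpython_candidates() -> List[str]:
    """Return the path where this interpreter's shared library would normally live."""
    libdir = sysconfig.get_config_var("LIBDIR") or os.path.join(sys.prefix, "lib")
    # INSTSONAME is e.g. "libpython3.7m.so.1.0" on Linux builds; it may be unset elsewhere.
    name = sysconfig.get_config_var("INSTSONAME") or sysconfig.get_config_var("LDLIBRARY")
    return [os.path.join(libdir, name)] if name else []


for path in libpython_candidates():
    print(path, "exists" if os.path.exists(path) else "missing")
print("Directory to consider for LD_LIBRARY_PATH:", os.path.join(sys.prefix, "lib"))
```

Run it with the rlgpu environment activated so that `sys.prefix` points at that environment.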

Training Anymal

Open a terminal in PyCharm, go to the directory containing train.py, and set task to Anymal
(rlgpu) ... ~/IsaacGymEnvs/isaacgymenvs$ python train.py task=Anymal

IsaacGymEnvs file structure

IsaacGymEnvs

> .gitlab
> assets
> docs
> isaacgymenvs
> isaacgymenvs.egg-info		# created by pip install -e .
- .gitattributes
- .gitignore
- .pre-commit-config.yaml
- LICENSE.txt
- README.md
- setup.py

docs

> images
- dextreme.md
- domain_randomization.md
- factory.md
- framework.md
- pbt.md
- release_notes.md
- reproducibility.md
- rl_examples.md

isaacgymenvs

> cfg			# configuration
> learning
> pbt
> runs			# training data / checkpoints
> tasks			# different tasks
> utils
- __init__.py
- train.py		# main

isaacgymenvs/Readme.md

Running the ANYmal

(rlgpu) ... ~/IsaacGymEnvs$ python isaacgymenvs/train.py task=Anymal

Results are saved in the runs folder

TensorBoard

Callback data from training is saved in the runs/EXPERIMENT_NAME/summaries folder. EXPERIMENT_NAME usually follows the task name and can also be set with the experiment argument

Run the following on the command line

(rlgpu) ... ~/IsaacGymEnvs$ tensorboard --logdir ./isaacgymenvs/runs/EXPERIMENT_NAME/summaries

Open http://localhost:6006/ to view the results

Loading trained models / Checkpoints $*$

Checkpoints are saved in the runs/EXPERIMENT_NAME/nn folder

To continue training

python train.py task=Ant checkpoint=runs/Ant/nn/Ant.pth

To run a test

python train.py task=Ant checkpoint=runs/Ant/nn/Ant.pth test=True num_envs=64
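The folder layout described above can be captured in a small helper. This is only an illustrative sketch: `run_paths` is a hypothetical function, and the `default_checkpoint` entry assumes the checkpoint file is named after the experiment, as in the runs/Ant/nn/Ant.pth example.

```python
from pathlib import Path
from typing import Dict


def run_paths(experiment_name: str, runs_dir: str = "runs") -> Dict[str, Path]:
    """Build the locations described above for one experiment."""
    root = Path(runs_dir) / experiment_name
    return {
        "summaries": root / "summaries",  # TensorBoard logs
        "checkpoints": root / "nn",       # saved .pth files
        # Assumption: the checkpoint is named after the experiment.
        "default_checkpoint": root / "nn" / f"{experiment_name}.pth",
    }


paths = run_paths("Ant")
print(paths["summaries"].as_posix())
print(paths["default_checkpoint"].as_posix())
```

The first path is what `tensorboard --logdir` expects, and the second is what the `checkpoint` argument expects.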

Configuration and command line arguments

Configuration files are managed with Hydra; train.py contains import hydra

The available arguments are: task, train (selects the training config), num_envs, seed, sim_device (device used for physics simulation), rl_device (device used for reinforcement learning), graphics_device_id, pipeline (GPU or CPU; see the documentation), test (run without training), checkpoint (path of the checkpoint to load), headless, experiment (sets the experiment name), max_iterations

Default configuration values can be found in isaacgymenvs/cfg/config.yaml

The task configuration is in isaacgymenvs/cfg/task/<TASK>.yaml, and the training configuration is in isaacgymenvs/cfg/train/<TASK>PPO.yaml.
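The `key=value` arguments used on the command line override defaults loaded from the YAML files. The sketch below imitates that idea in plain Python for flat keys only. It is not Hydra's parser, the function names are hypothetical, and the default values are made up for illustration.

```python
from typing import Any, Dict, List


def parse_value(text: str) -> Any:
    """Convert a command-line string to bool, int, or float, else keep it as str."""
    if text in ("True", "true"):
        return True
    if text in ("False", "false"):
        return False
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def apply_overrides(defaults: Dict[str, Any], argv: List[str]) -> Dict[str, Any]:
    """Return a copy of `defaults` updated with flat key=value overrides."""
    config = dict(defaults)
    for arg in argv:
        key, sep, value = arg.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {arg!r}")
        config[key] = parse_value(value)
    return config


# Illustrative defaults only; the real values live in the YAML files.
defaults = {"task": "Ant", "num_envs": 4096, "test": False, "checkpoint": ""}
config = apply_overrides(
    defaults,
    ["task=Ant", "checkpoint=runs/Ant/nn/Ant.pth", "test=True", "num_envs=64"],
)
print(config)
```

Hydra itself also supports nested keys and config groups, which this sketch leaves out.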

Usage — OmegaConf 2.4.0.dev0 documentation

Tasks

The source code of the tasks is in isaacgymenvs/tasks, and their base class is in isaacgymenvs/tasks/base/vec_task.py

NVIDIA Omniverse Documentation

Omniverse Developer Guide — Omniverse Kit documentation

User Guide — Omniverse Launcher documentation

Isaac Sim - Robotics Simulation and Synthetic Data Generation | NVIDIA Developer

Isaac Gym Part 3E: Academic Labs - Eth Zurich - YouTube

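Exploring Isaac Sim | (1) Installing Omniverse and Isaac Sim - Zhihu

Slow Component Installation in Isaac Omniverse - Zhihu

I have not run into this problem so far

Setting Up the Isaac Simulation Platform and a ROS Trial Tutorial - 稚晖 - Zhihu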

What Is Isaac Sim — Omniverse Robotics documentation

NVIDIA Omniverse™ Isaac Sim is a robotics simulation toolkit for the NVIDIA Omniverse™ platform. Isaac Sim has essential features for building virtual robotic worlds and experiments. It provides researchers and practitioners with the tools and workflows they need to create robust, physically accurate simulations and synthetic datasets. Isaac Sim supports navigation and manipulation applications through ROS/ROS2. It simulates sensor data from sensors such as RGB-D, Lidar, and IMU for various computer vision techniques such as domain randomization, ground-truth labeling, segmentation, and bounding boxes.

Isaac Sim: Extensions API — Isaac Sim 2022.2.1-beta.29 documentation

TAO Toolkit | NVIDIA Developer

Eliminate your need for mountains of data and an army of data scientists as you create AI/machine learning models, and speed up the development process with transfer learning - a powerful technique that instantly transfers learned features from an existing neural network model to a new customized one

The NVIDIA TAO Toolkit, built on TensorFlow and PyTorch, uses the power of transfer learning while simultaneously simplifying the model training process and optimizing the model for inference throughput on the target platform. The result is an ultra-streamlined workflow. Take your own models or pre-trained models, adapt them to your own real or synthetic data, then optimize for inference throughput. All without needing AI expertise or large training datasets.

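A First Introduction to the Isaac Gym Reinforcement Learning Environment - Bilibili

Isaac Gym - Preview Release | NVIDIA Developer

Preview 4 is the release currently available

Installing Isaac Gym and Basic Usage - Zhihu

When running a .py file from the terminal, first change to the folder that contains it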

2023.07.13

IsaacGymEnvs

isaacgymenvs/tasks/base/vec_task.py

Every custom Task class should be based on the VecTask class, and VecTask inherits from the abstract class Env

When initializing Env:

Constructor arguments: config (mainly config["env"]), sim_device (includes device_type and device_id), rl_device, graphics_device_id, headless
The config dictionary must provide numEnvs, numObservations, numActions
Optional entries in the config dictionary: numAgents, numStates, controlFrequencyInv, clipObservations, clipActions, enableCameraSensors

Env contains:

@abc.abstractmethod (a subclass of this abstract base class must implement these methods): allocate_buffers(), step(), reset(), reset_idx()
@property (turns a method into an attribute): observation_space, action_space, num_envs, num_acts, num_obs
Others: set_train_info(), get_env_state(), set_env_state()

When initializing VecTask:

The config argument must provide "physics_engine"

VecTask contains:

set_viewer(), allocate_buffers(), create_sim(), get_state(), reset(), render(), …
@abc.abstractmethod: pre_physics_step, post_physics_step

A closer look at the step() function

In Stable-Baselines3, a custom environment must inherit from the base class gym.Env and override the (@abc.abstractmethod) methods step, reset, and render. The step function takes the action provided by the RL algorithm's policy, advances the dynamics in the simulation, then computes and returns observation, rewards, terminated, and info

In IsaacGymEnvs, however, step splits this process further into

pre_physics_step: applies the actions
self.gym.simulate(self.sim): advances the dynamics
post_physics_step: computes observations, rewards, resets, …
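The three-phase structure of step() can be sketched as a template method. This is a toy stand-in written for illustration, not the real VecTask: `SketchTask`, `ToyTask`, and `simulate` are hypothetical names, and the physics call is replaced by a placeholder.

```python
import abc
from typing import List


class SketchTask(abc.ABC):
    """Toy stand-in for the step() structure described above (not the real VecTask)."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    @abc.abstractmethod
    def pre_physics_step(self, actions: List[float]) -> None:
        """Apply the actions to the simulation."""

    @abc.abstractmethod
    def post_physics_step(self) -> None:
        """Compute observations, rewards, and resets."""

    def simulate(self) -> None:
        # Placeholder for the physics engine advancing one step.
        self.calls.append("simulate")

    def step(self, actions: List[float]) -> List[str]:
        self.pre_physics_step(actions)
        self.simulate()
        self.post_physics_step()
        return self.calls


class ToyTask(SketchTask):
    def pre_physics_step(self, actions: List[float]) -> None:
        self.calls.append(f"apply {len(actions)} actions")

    def post_physics_step(self) -> None:
        self.calls.append("compute observations and rewards")


trace = ToyTask().step([0.1, -0.2])
print(trace)
```

A concrete task only fills in the two abstract hooks; the base class fixes the order in which they run.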

New Python knowledge

List comprehensions, generators, iterators

# list comprehension
>>> a = [x for x in range(5)]
>>> a
[0, 1, 2, 3, 4]

# generator
>>> g = (x * x for x in range(1, 6, 2))
>>> next(g)
1
>>> next(g)
9
>>> next(g)
25
>>> next(g)
Traceback (most recent call last):
  ...
StopIteration
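The notes list iterators without an example. A minimal sketch: an iterator is any object with `__iter__` and `__next__`, and a generator function builds one automatically. `Countdown` and `countdown` are hypothetical names.

```python
from typing import Iterator


class Countdown:
    """Iterator that yields start, start - 1, ..., 1."""

    def __init__(self, start: int) -> None:
        self.current = start

    def __iter__(self) -> "Countdown":
        return self

    def __next__(self) -> int:
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def countdown(start: int) -> Iterator[int]:
    """Generator function expressing the same sequence with yield."""
    while start > 0:
        yield start
        start -= 1


print(list(Countdown(3)))  # [3, 2, 1]
print(list(countdown(3)))  # [3, 2, 1]
```

Both objects are exhausted after one pass, just like the generator expression above.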

Anonymous functions: lambda

>>> list(map(lambda x: x * x, [1, 2, 3, 4, 5, 6, 7, 8, 9]))
[1, 4, 9, 16, 25, 36, 49, 64, 81]

>>> evens = list(filter(lambda x: x % 2 == 0, list(range(1,11))))

__init__.py

- demo.py
> package
  - __init__.py		# importing package runs this file automatically
  - module.py
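The behaviour noted in the tree can be checked with a throwaway package. The sketch below writes a package into a temporary directory and imports it; `demo_package` and its contents are hypothetical names created only for this demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "demo_package"  # hypothetical package name
    pkg.mkdir()
    (pkg / "__init__.py").write_text("LOADED = True\nprint('running __init__.py')\n")
    (pkg / "module.py").write_text("def hello():\n    return 'hello from module'\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Importing the package executes __init__.py once.
        package = importlib.import_module("demo_package")
        module = importlib.import_module("demo_package.module")
        print(package.LOADED, module.hello())
    finally:
        sys.path.remove(tmp)
```

The message from `__init__.py` is printed during the first import, before anything in `module.py` is used.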

abc, ABC, abstractmethod

import abc
from abc import ABC


class Animal(ABC):
    def __init__(self):
        pass

    @abc.abstractmethod
    def move(self):
        pass


class Human(Animal):
    def __init__(self):
        super().__init__()

    def move(self):
        print("Human can move.")
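What the decorator enforces becomes visible when a subclass forgets to override the method. In this small sketch `Fish` is a hypothetical class added to trigger the error, and `move` returns a string instead of printing so the result is easy to inspect.

```python
from abc import ABC, abstractmethod


class Animal(ABC):
    @abstractmethod
    def move(self) -> str:
        """Subclasses must override this method."""


class Human(Animal):
    def move(self) -> str:
        return "Human can move."


class Fish(Animal):
    pass  # move() is not overridden, so Fish stays abstract


print(Human().move())

try:
    Fish()
except TypeError as error:
    print("TypeError:", error)
```

The error is raised at instantiation time, not when the class is defined.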

Type annotations: typing

List[int], Dict[str, int], Tuple[int, str, int], Tuple[int, ...]
Any, Union[int, str], Optional[int] = Union[int, None]
NewType
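A short sketch that uses each of the annotations listed above. The function and variable names are hypothetical, and the annotations are hints for type checkers; Python does not enforce them at runtime.

```python
from typing import Any, Dict, List, NewType, Optional, Tuple, Union

UserId = NewType("UserId", int)  # distinct type for checkers, plain int at runtime


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def find_age(ages: Dict[str, int], name: str) -> Optional[int]:
    return ages.get(name)  # None when the name is missing


def describe(value: Union[int, str]) -> str:
    return f"{type(value).__name__}: {value}"


record: Tuple[int, str, int] = (1, "a", 2)  # fixed length, one type per position
numbers: Tuple[int, ...] = (1, 2, 3)        # any length, all ints
anything: Any = object()                    # opts out of type checking

print(mean([1.0, 2.0, 3.0]), find_age({"Ann": 30}, "Bob"), describe(UserId(7)))
```

`UserId(7)` is reported as an int because NewType adds no runtime wrapper.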

@property

Turns a method into an attribute
The property's method name must not be the same as the instance variable's name

@property
def score(self):
    return self._score

@score.setter
def score(self, value):
    if not isinstance(value, int):
        raise ValueError('score must be an integer!')
    if value < 0 or value > 100:
        raise ValueError('score must be between 0 and 100!')
    self._score = value
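The two methods above need an enclosing class to run. A minimal complete version, with `Student` as a hypothetical class name, might look like this:

```python
class Student:
    def __init__(self, score: int) -> None:
        self.score = score  # goes through the setter below

    @property
    def score(self) -> int:
        return self._score

    @score.setter
    def score(self, value: int) -> None:
        if not isinstance(value, int):
            raise ValueError("score must be an integer!")
        if value < 0 or value > 100:
            raise ValueError("score must be between 0 and 100!")
        self._score = value


student = Student(60)
student.score = 95
print(student.score)

try:
    student.score = 101
except ValueError as error:
    print("rejected:", error)
```

The value is stored in `_score` while the property is named `score`, which avoids the name clash mentioned above.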

What is __init__.py for - Stack Overflow

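What Exactly Is the __init__.py File - Zhihu

Object-Oriented Programming 2: Abstract Base Classes: import abc, metaclass=abc.ABCMeta | Recently_祝祝's blog

What Does from abc import ABC, abstractmethod Mean | 音程's blog

The goal is to define some abstract methods that subclasses must override when they inherit. The abc package serves this purpose. @abstractmethod marks a method as abstract, so subclasses must override it

Type Annotations with typing: from typing import List | zhegecsdn's blog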

2023.07.14

gym / envs / mujoco / halfcheetah v1, v3, v4

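gym/core.py defines the abstract base class Env. Its main API methods are step(), reset(), render(), close(), and seed(), of which step(), reset(), and render() are marked abc.abstractmethod

gym/envs/mujoco_env.py defines the class BaseMujocoEnv, which inherits from the base class gym.Env, and MujocoEnv, which inherits from BaseMujocoEnv

half_cheetah_v4.py defines HalfCheetahEnv, which inherits from MujocoEnv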

New Python knowledge

User-defined generic types $*$

class Env(Generic[ObsType, ActType]):
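To show how such a generic class is declared and specialized, here is a small self-contained sketch. It only mirrors the class header above: the `step` signature and the `CounterEnv` subclass are simplified assumptions, not the gym API.

```python
from typing import Generic, Tuple, TypeVar

ObsType = TypeVar("ObsType")
ActType = TypeVar("ActType")


class Env(Generic[ObsType, ActType]):
    """Minimal generic interface: subclasses fix the observation and action types."""

    def step(self, action: ActType) -> Tuple[ObsType, float, bool]:
        raise NotImplementedError


class CounterEnv(Env[int, bool]):
    """Observation is an int counter; the action says whether to increment it."""

    def __init__(self) -> None:
        self.count = 0

    def step(self, action: bool) -> Tuple[int, float, bool]:
        if action:
            self.count += 1
        return self.count, float(action), self.count >= 2


env = CounterEnv()
results = [env.step(True), env.step(False), env.step(True)]
print(results)
```

A type checker can then flag a call such as `env.step("left")`, because `CounterEnv` binds `ActType` to bool.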

MICRO_Quadruped_ARCHIVE: reading and adding comments

reinforcement_learning.py

2023.07.15

MICRO_Quadruped_ARCHIVE: reading and adding comments

utils.py

Gait Analysis

Here are some video clips.

Analytic

Real

Analysis of 3 Dogs’ Gaits – Walk, Trot, Transverse Gallop - YouTube

Model

Locomotion Skills for Simulated Quadrupeds - YouTube

2023.07.16

Trust Region Method

Videos

CS885 Lecture 15a: Trust Region Policy Optimization (Presenter: Shivam Kalra) - YouTube
Trust Regions - YouTube (very concise, worth watching)

Trust Region Policy Optimization

TRPO (Trust Region Policy Optimization) | bilibili
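For reference, the two problems these resources cover can be written compactly. This is the standard textbook form, not taken from the videos; the symbols $m_k$, $\Delta_k$, $A$, and $\delta$ are the usual notation and do not appear elsewhere in these notes. A classical trust region method minimizes a local model $m_k$ of the objective within a radius $\Delta_k$ around the current iterate:

$$
\min_{p}\; m_k(p) \quad \text{s.t.} \quad \lVert p \rVert \le \Delta_k
$$

TRPO applies the same idea to policies: it maximizes a surrogate objective while keeping the average KL divergence between the old and new policy below $\delta$:

$$
\max_{\theta}\; \mathbb{E}_{s,a \sim \pi_{\theta_{\text{old}}}}\left[\frac{\pi_\theta(a \mid s)}{\pi_{\theta_{\text{old}}}(a \mid s)}\, A^{\pi_{\theta_{\text{old}}}}(s,a)\right]
\quad \text{s.t.} \quad
\mathbb{E}_{s}\left[D_{\mathrm{KL}}\big(\pi_{\theta_{\text{old}}}(\cdot \mid s)\,\Vert\,\pi_\theta(\cdot \mid s)\big)\right] \le \delta
$$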
