Article Summary
RT-2 (robotic-transformer2): Vision-Language-Action Models Transfer Web Knowledge to Robotic Control. Paper: click here; source code: click here. “express the actions as text tokens and incorporate them directly into the training set of the model in the same way as natural lan…
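To make the quoted idea of expressing actions as text tokens more concrete, here is a minimal sketch of one way a continuous robot action could be turned into a token string and back. The bin count (256), the value range [-1, 1], and the function names `action_to_text` and `text_to_action` are illustrative assumptions, not details taken from the text above or from the released code.

```python
from typing import List, Sequence

NUM_BINS = 256  # assumption: uniform bins per action dimension


def action_to_text(action: Sequence[float], low: float = -1.0, high: float = 1.0) -> str:
    """Discretize each action dimension and join the bin indices as text."""
    tokens = []
    for value in action:
        # Clip to the assumed valid range before binning.
        clipped = min(max(value, low), high)
        # Map [low, high] onto integer bin indices 0 .. NUM_BINS - 1.
        bin_index = round((clipped - low) / (high - low) * (NUM_BINS - 1))
        tokens.append(str(bin_index))
    return " ".join(tokens)


def text_to_action(text: str, low: float = -1.0, high: float = 1.0) -> List[float]:
    """Invert action_to_text: parse bin indices back into approximate values."""
    return [low + int(tok) / (NUM_BINS - 1) * (high - low) for tok in text.split()]


if __name__ == "__main__":
    # Hypothetical 3-dimensional action, for illustration only.
    encoded = action_to_text([-1.0, 0.3, 1.0])
    print(encoded)
    print(text_to_action(encoded))
```

Because the encoded action is an ordinary string, it could in principle be appended to a text training example like any other target sequence; the round trip loses at most half a bin width of precision per dimension.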