Abstract

The success of robotic systems, such as unmanned ground vehicles (UGVs), depends largely on the fundamental capability of autonomously finding collision-free paths to carry out mobile tasks in environments that are routinely rough and complicated. Optimizing navigation under such circumstances has long been an open problem: 1) the critical requirements of the task, typically shortest distance and smoothness, must be met, and 2) more challengingly, a general solution is needed to track the optimal path in real-time outdoor applications. To address this problem, this study develops a two-tier approach to navigation optimization that covers path planning and path tracking. First, a "rope" model is designed to mimic the deformation of a path in the axial direction under external force, together with the fixedness of the radial plane, so that the UGV is contained in a collision-free space. Second, a deterministic policy gradient (DPG) algorithm is trained efficiently on abstracted structures of an arbitrarily derived "rope" to model the controller for tracking the optimal path. The learned policy can be generalized to a variety of scenarios. Experiments have been performed over complicated environments of different types. The results indicate that: 1) the rope model helps minimize the distance and enhance the smoothness of the path while guaranteeing clearance; 2) the DPG model can be trained quickly (in a couple of minutes on an office desktop) and applies to environments of increasing complexity under external disturbances without the need for parameter tuning; and 3) the DPG-based controller can autonomously steer the UGV to follow the correct path free of risks.