EvolvingGrasp: Evolutionary Grasp Generation via Efficient Preference Alignment

1ShanghaiTech University, Shanghai, China
2Chinese University of Hong Kong, Hong Kong, China

*Indicates Equal Contribution

✉️Indicates Corresponding Author

Real World Experiments

Real-world deployment on Shadow Hand.

Abstract

Dexterous robotic hands often struggle to generalize in complex environments because their models are trained on low-diversity data. The real world, however, presents an inherently unbounded range of scenarios, making it impractical to account for every possible variation. A natural solution is to enable robots to learn from experience in complex environments, an approach akin to evolution, in which systems improve through continuous feedback, learn from both failures and successes, and iterate toward optimal performance. Motivated by this, we propose EvolvingGrasp, an evolutionary grasp generation method that continuously enhances grasping performance through efficient preference alignment. Specifically, we introduce Handpose-wise Preference Optimization (HPO), which allows the model to continuously align with preferences from both positive and negative feedback while progressively refining its grasping strategies. To further enhance efficiency and reliability during online adjustments, we incorporate a Physics-aware Consistency Model within HPO, which accelerates inference, reduces the number of timesteps needed for preference fine-tuning, and ensures physical plausibility throughout the process. Extensive experiments across four benchmark datasets demonstrate state-of-the-art performance of our method in grasp success rate and sampling efficiency. Our results validate that EvolvingGrasp enables evolutionary grasp generation, ensuring robust, physically feasible, and preference-aligned grasping in both simulated and real scenarios.
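The consistency model mentioned above replaces a long diffusion chain with a handful of denoising calls. As an illustration only (not the paper's implementation, and omitting the physics-aware guidance), multistep consistency sampling can be sketched as:

```python
import random

def few_step_sample(consistency_fn, noise, sigmas):
    """Multistep consistency sampling: each network call maps a noisy
    pose directly to a clean estimate, then re-noises it to the next
    (lower) noise level -- a few steps instead of a long diffusion chain."""
    x = noise
    for i, sigma in enumerate(sigmas):
        x = consistency_fn(x, sigma)                    # one-shot denoise
        if i + 1 < len(sigmas):
            x = x + sigmas[i + 1] * random.gauss(0, 1)  # re-noise for the next step
    return x

# Toy check with an idealized denoiser that always recovers the pose 0.7:
pose = few_step_sample(lambda x, s: 0.7, noise=random.gauss(0, 2), sigmas=[2.0, 0.5])
```

Because the final step performs no re-noising, the idealized denoiser above returns exactly its target pose; a real consistency model would be a trained network conditioned on the object point cloud.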

Visualization and Performance


The left part illustrates EvolvingGrasp, an evolution-like approach that enables the model to learn from experience and iteratively refine its grasping strategy. The right part demonstrates its efficiency and effectiveness.

Method


Overview of EvolvingGrasp. The evolutionary process begins with human preference guidance, where Handpose-wise Preference Optimization (HPO, highlighted in the green rectangle) performs preference alignment. Grasp poses are generated by the Physics-Aware Consistency Model (blue rectangles), whose sampling and distillation mechanisms ensure sampling efficiency and physical plausibility. Together, these components form EvolvingGrasp, an efficient evolutionary grasp generation framework that lets the grasp model iteratively converge toward the preferred distribution.
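The exact HPO objective is given in the paper; as a rough, hypothetical sketch of how handpose-wise preference alignment over positive and negative feedback could be scored, a DPO-style pairwise loss (the function name, signature, and loss form are all assumptions here, not the authors' formulation) looks like:

```python
import math

def hpo_preference_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO-style pairwise loss: reward the model for raising the
    likelihood of the preferred grasp (w) relative to a frozen
    reference model, and lowering that of the rejected grasp (l)."""
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    # Negative log-sigmoid of the scaled margin.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# With no separation the loss is log 2; it shrinks as the model
# separates preferred from rejected grasps relative to the reference.
flat = hpo_preference_loss(-1.0, -1.0, -1.0, -1.0)    # == log 2
better = hpo_preference_loss(-0.5, -2.0, -1.0, -1.0)  # smaller than flat
```

The appeal of this family of losses for online refinement is that it needs only pairwise feedback (preferred vs. rejected poses), not an explicit reward model.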

Method Performance


Evolution of robotic grasp preferences during efficient feedback-driven finetuning across 10 epochs. The top row illustrates the adjustment from hand occlusion to clear nozzle visibility. The middle row demonstrates the transition from lens obstruction to an unobstructed camera view. The bottom row shows the evolution from a top-down grasping approach to a bottom-up one while simultaneously mitigating physical impacts.

Comparison performance


Grasping performance in terms of Suc.6, Suc.1, and Pen. compared across different methods and datasets. "Step" refers to the number of inference timesteps. Bold values highlight the best results, and underlined values indicate the runner-up.

Mean grasping performance


Mean grasping performance in terms of success rate and penetration for 6 randomly selected objects as the number of finetuning epochs increases during inference-time optimization.

Other Experiments


Pseudo Code of EvolvingGrasp

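The pseudocode figure itself is an image on the original page; as a stand-in, here is a hedged, high-level sketch of the evolutionary loop described above (few-step sampling with the consistency model, preference feedback, and an HPO update). Every name and the toy instantiation are illustrative assumptions, not the authors' code:

```python
def evolving_grasp_loop(sample_grasps, get_feedback, hpo_update, model, epochs=10):
    """High-level evolutionary loop: sample grasps with the fast
    consistency model, collect preference feedback, and refine the
    model with Handpose-wise Preference Optimization (HPO)."""
    for _ in range(epochs):
        grasps = sample_grasps(model)                 # few-step consistency sampling
        preferred, rejected = get_feedback(grasps)    # success / penetration / human preference
        model = hpo_update(model, preferred, rejected)  # preference alignment step
    return model

# Toy instantiation: the "model" is a scalar grasp angle, the ideal is 1.0.
sample = lambda m: [m - 0.1, m + 0.1]
feedback = lambda gs: (min(gs, key=lambda g: abs(g - 1.0)),
                       max(gs, key=lambda g: abs(g - 1.0)))
update = lambda m, preferred, rejected: m + 0.5 * (preferred - m)
tuned = evolving_grasp_loop(sample, feedback, update, model=0.0, epochs=20)
```

In the toy run the model drifts toward the preferred grasp at every epoch, mirroring the iterative convergence toward preferred distributions that the method section describes.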

BibTeX

@article{zhu2025evolvinggrasp,
  title={EvolvingGrasp: Evolutionary Grasp Generation via Efficient Preference Alignment},
  author={Zhu, Yufei and Zhong, Yiming and Yang, Zemin and Cong, Peishan and Yu, Jingyi and Zhu, Xinge and Ma, Yuexin},
  journal={arXiv preprint arXiv:2503.14329},
  year={2025}
}