Towards Generalizable Robotic Data Flywheel: High-Dimensional Factorization and Composition

ByteDance Seed
*Equal contribution, Corresponding authors

Abstract

The lack of sufficiently diverse data, coupled with limited data efficiency, remains a major bottleneck for generalist robotic models, yet systematic strategies for collecting and curating such data are not fully explored. One major challenge in achieving generalization lies in the inherently high-dimensional nature of robotic data. Task diversity arises from implicit factors that are sparsely distributed across multiple dimensions and are difficult to define explicitly. To address this challenge, we introduce F-ACIL, a heuristic Factor-Aware Compositional Iterative Learning framework for data factorization and compositional generalization.

Figure: An overview of F-ACIL.

Data Distribution

We show three representative data distributions. The 3D surfaces (blue-purple) represent different training data distributions, while the contour maps (warm amber) show the shared evaluation distribution. The three cases are: (a) a narrow Gaussian-like distribution with limited coverage, (b) a quasi-uniform distribution with full coverage but low efficiency, and (c) F-ACIL with multiple Gaussian modes, which achieves efficient, broad coverage via factor-wise composition.

Figure: Illustration of data distributions with specific properties.
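To make the three distribution shapes concrete, the following is a minimal sketch in a toy 2D factor space. Everything in it is an illustrative assumption rather than the authors' setup: the unit-square factor space, the 10x10 grid used to measure coverage, the standard deviations, and the 3x3 layout of Gaussian modes. It only shows how coverage of a shared evaluation grid can be compared across sampling schemes.

```python
import random


def coverage(points, bins=10):
    """Fraction of cells in a bins x bins grid over [0, 1]^2 hit by at least one point."""
    cells = set()
    for x, y in points:
        # Clamp so that samples falling outside the unit square count toward border cells.
        i = min(bins - 1, max(0, int(x * bins)))
        j = min(bins - 1, max(0, int(y * bins)))
        cells.add((i, j))
    return len(cells) / (bins * bins)


def narrow_gaussian(n, rng, sigma=0.05):
    """(a) One tight Gaussian mode centered in the factor space (assumed sigma)."""
    return [(rng.gauss(0.5, sigma), rng.gauss(0.5, sigma)) for _ in range(n)]


def quasi_uniform(n, rng):
    """(b) Samples spread uniformly over the whole factor space."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def gaussian_modes(n, rng, sigma=0.08):
    """(c) Several Gaussian modes placed on a hypothetical 3x3 layout of centers."""
    centers = [(cx, cy) for cx in (0.2, 0.5, 0.8) for cy in (0.2, 0.5, 0.8)]
    points = []
    for _ in range(n):
        cx, cy = rng.choice(centers)
        points.append((rng.gauss(cx, sigma), rng.gauss(cy, sigma)))
    return points


rng = random.Random(0)  # fixed seed so the comparison is repeatable
n_samples = 200
coverages = {
    "narrow_gaussian": coverage(narrow_gaussian(n_samples, rng)),
    "quasi_uniform": coverage(quasi_uniform(n_samples, rng)),
    "gaussian_modes": coverage(gaussian_modes(n_samples, rng)),
}
for name, value in coverages.items():
    print(f"{name}: {value:.2f}")
```

Grid coverage is only a crude proxy here. It says nothing about how a policy trained on each set would perform on the evaluation distribution.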

Experiment

We show how F-ACIL helps to achieve compositional generalization across high-dimensional spaces on two representative manipulation skill families: Pick-and-Place and Open-and-Close. We consider three groups of strategies to evaluate the improvement in data efficiency and the performance differences:

  1. F-ACIL-Factors-Ratio: Constructs demonstrations by progressively increasing the coverage of task-relevant factors while controlling the ratio among factor spaces.
  2. F-ACIL-Factors-Mixture: Increases the overall number of demonstrations drawn from a quasi-uniform distribution without explicitly considering the factor structure.
  3. Gaussian: Samples demonstrations purely at random from a Gaussian distribution without exploiting factor-aware data composition.
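As a rough illustration of why factor structure matters for data volume, the sketch below contrasts an exhaustive grid over factor levels with a simple one-factor-at-a-time design. This is not the F-ACIL procedure, which is not specified here. The one-factor-at-a-time design is a swapped-in stand-in, and the factor names and their grouping are assumptions, with level names borrowed from the rollout labels on this page.

```python
from itertools import product

# Hypothetical factor spaces; the key names and grouping are illustrative assumptions.
FACTORS = {
    "texture": ["Transparent", "Specular", "Absorptive"],
    "geometry": ["Cylindrical", "Rod-like", "Irregular"],
    "size": ["Large", "Medium", "Small"],
    "x_position": ["Left", "Middle", "Right"],
    "y_position": ["Top", "Bottom"],
    "light": ["Warm Light", "Cool Light"],
}


def full_grid(factors):
    """Every combination of factor levels (grows multiplicatively with each factor)."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


def one_factor_at_a_time(factors):
    """A base configuration plus one variant per non-base level (grows additively)."""
    base = {name: levels[0] for name, levels in factors.items()}
    configs = [dict(base)]
    for name, levels in factors.items():
        for level in levels[1:]:
            variant = dict(base)
            variant[name] = level
            configs.append(variant)
    return configs


def covers_every_level(configs, factors):
    """True if each level of each factor appears in at least one configuration."""
    return all(
        {config[name] for config in configs} == set(levels)
        for name, levels in factors.items()
    )


grid = full_grid(FACTORS)
sparse = one_factor_at_a_time(FACTORS)
print(len(grid), len(sparse), covers_every_level(sparse, FACTORS))
```

Both designs cover every individual factor level, but only the full grid covers every combination. Whether a policy generalizes to unseen combinations from the sparse design is exactly the compositional generalization question the experiments examine.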

1. Compositional Generalization Results

For Pick-and-Place trials in the object factor space, F-ACIL-Factors-Ratio requires 2–3x less data than F-ACIL-Factors-Mixture to achieve an 80–90% success rate. In the object–action setting of Open-and-Close, F-ACIL-Factors-Ratio requires 4x less data than F-ACIL-Factors-Mixture to achieve an 80–85% success rate. In the most complex object–action–environment space, both skills trained with F-ACIL-Factors-Ratio achieve approximately 85–95% success rate with 3–4x less data than F-ACIL-Factors-Mixture. Models trained with either structured factor-wise strategy outperform the Gaussian baseline with 5–10x less data in all cases.
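One way to read an "Nx less data" statement is as the ratio of the smallest dataset sizes at which two strategies first reach a target success rate. The helper below sketches that arithmetic. The function names are hypothetical, and the two curves are made-up placeholder numbers chosen for the example, not measurements from these experiments.

```python
def smallest_size_reaching(curve, target):
    """Smallest dataset size whose success rate meets the target, or None if none does."""
    reaching = [size for size, rate in curve if rate >= target]
    return min(reaching) if reaching else None


def data_efficiency_ratio(curve_a, curve_b, target):
    """How many times more data curve_b needs than curve_a to reach the target."""
    size_a = smallest_size_reaching(curve_a, target)
    size_b = smallest_size_reaching(curve_b, target)
    if size_a is None or size_b is None:
        return None  # at least one strategy never reaches the target
    return size_b / size_a


# Placeholder (dataset size, success rate) pairs, invented purely for illustration.
curve_structured = [(50, 0.40), (100, 0.70), (200, 0.85), (400, 0.92)]
curve_baseline = [(50, 0.20), (100, 0.35), (200, 0.55), (400, 0.70), (800, 0.86)]

ratio = data_efficiency_ratio(curve_structured, curve_baseline, target=0.80)
print(ratio)
```

With sparse measurement points, this ratio is only as fine-grained as the dataset sizes that were actually evaluated.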

2. Scaling Laws with Increasing Dimension

a. Success rate

Even simple tasks require extensive data to reach baseline performance when scaled naively, and the requirement is far greater for complex skills such as Open-and-Close.

Figure: Changes in success rate across dimensions.

b. Power Law

The slope of these power laws can vary dramatically with the dimensionality of the distribution space. Blindly scaling up dataset volume without accounting for dimensionality often runs into the curse of dimensionality.

Although model performance improves with dataset size following a power-law scaling relationship, the scaling exponent can vary dramatically with the dimensionality of the data manifold or task space. Blindly scaling up dataset volume without accounting for dimensionality often yields substantially diminished returns, a manifestation of the curse of dimensionality in high-dimensional regimes.
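The effect of the exponent can be illustrated with a small sketch. It assumes, purely for illustration, that the failure rate follows error = a * N^(-alpha) in the dataset size N. The exact functional form used in the paper is not stated here, and the function names are hypothetical. The exponent is recovered by ordinary least squares in log-log space, and a second helper shows how much more data is needed to halve the error for a given exponent.

```python
import math


def fit_power_law(sizes, errors):
    """Least-squares fit of error = a * N**(-alpha) in log-log space; returns (a, alpha)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def data_multiplier_to_halve_error(alpha):
    """Factor by which N must grow to halve the error under error = a * N**(-alpha)."""
    return 2 ** (1 / alpha)


# Synthetic, noise-free points generated from an assumed power law (not experimental data).
sizes = [100, 200, 400, 800, 1600]
errors = [2.0 * n ** -0.5 for n in sizes]

a_hat, alpha_hat = fit_power_law(sizes, errors)
print(round(a_hat, 3), round(alpha_hat, 3))
print(data_multiplier_to_halve_error(0.5), data_multiplier_to_halve_error(0.25))
```

Under this assumed form, halving the exponent from 0.5 to 0.25 raises the data needed to halve the error from 4x to 16x, which is one way to see why a shallower slope in a higher-dimensional space makes naive scaling expensive.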

Rollout Exhibition (Fully Autonomous 1x Speed)

We conduct extensive experiments below for Pick-and-Place and Open-and-Close. Note that for Open-and-Close, the texture of a hinged object is defined by the texture of its manipulated component.

F-ACIL-Object

Texture: Transparent, Specular, Absorptive
Geometry: Cylindrical, Rod-like, Irregular
Texture: Specular, Diffuse, Transparent
Size: Large, Medium, Small

F-ACIL-Action

Here, the rows {Left, Middle, Right} discretize positions along the horizontal direction of the tabletop (x-axis), and the columns {Top, Bottom} discretize positions along the vertical direction of the tabletop (y-axis).

Columns: Top, Bottom
Rows: Left, Middle, Right

Columns: Top, Bottom
Rows: Left, Middle, Right
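The tabletop discretization described above can be sketched as a simple lookup. The sketch assumes normalized coordinates in [0, 1], equal-width bins, and that larger y corresponds to Top. None of these conventions is stated in the source, and the function name is hypothetical.

```python
X_LABELS = ["Left", "Middle", "Right"]  # bins along the horizontal direction (x-axis)
Y_LABELS = ["Bottom", "Top"]            # bins along the vertical direction (y-axis)


def tabletop_cell(x, y):
    """Map a normalized tabletop position (x, y) in [0, 1]^2 to its (row, column) labels."""
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("x and y must be normalized to the range [0, 1]")
    # min(...) keeps the upper boundary (x == 1.0 or y == 1.0) inside the last bin.
    x_index = min(len(X_LABELS) - 1, int(x * len(X_LABELS)))
    y_index = min(len(Y_LABELS) - 1, int(y * len(Y_LABELS)))
    return X_LABELS[x_index], Y_LABELS[y_index]


print(tabletop_cell(0.1, 0.9))
print(tabletop_cell(0.5, 0.2))
```

This yields the six action cells formed by three horizontal positions and two vertical positions.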

F-ACIL-Environment

Warm Light, Cool Light, Toward Right, Toward Left
Warm Light, Cool Light, Toward Right, Toward Left

Citation

@article{
xiao2026generalizableroboticdataflywheel,
title={Towards Generalizable Robotic Data Flywheel: High-Dimensional Factorization and Composition},
author={Yuyang Xiao and Yifei Zhou and Haoran Wang and Wenxuan Ou and Yuxiao Liu},
year={2026},
eprint={2603.25583},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2603.25583},
}