Towards Generalizable Robotic Data Flywheel: High-Dimensional Factorization and Composition

Yuyang Xiao ^*,†

Yifei Zhou ^*

Haoran Wang

Wenxuan Ou

Yuxiao Liu ^†

ByteDance Seed

^*Equal contribution, ^†Corresponding authors

arXiv

Abstract

The lack of sufficiently diverse data, coupled with limited data efficiency, remains a major bottleneck for generalist robotic models, yet systematic strategies for collecting and curating such data are not fully explored. One major challenge in achieving generalization lies in the inherently high-dimensional nature of robotic data. Task diversity arises from implicit factors that are sparsely distributed across multiple dimensions and are difficult to define explicitly. To address this challenge, we introduce F-ACIL, a heuristic Factor-Aware Compositional Iterative Learning framework for data factorization and compositional generalization.

Data Distribution

The 3D surfaces (blue-purple) represent different training data distributions, while the contour maps (warm amber) show the shared evaluation distribution. We compare three representative data distributions: (a) a narrow Gaussian-like distribution with limited coverage, (b) a quasi-uniform distribution with full coverage but low efficiency, and (c) F-ACIL with multiple Gaussian modes, which achieves efficient, broad coverage via factor-wise composition.
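To make the comparison concrete, the following is a minimal sketch of the three distribution shapes in a toy 2D factor space (the unit square). It is not the sampling procedure used in the paper: the function names, the grid of mode centers, the `sigma` values, and the cell-based `coverage` metric are all assumptions chosen for illustration.

```python
import random

def sample_narrow_gaussian(n, rng, center=(0.5, 0.5), sigma=0.05):
    """(a) A single narrow Gaussian mode: dense near the center, little coverage."""
    return [(rng.gauss(center[0], sigma), rng.gauss(center[1], sigma)) for _ in range(n)]

def sample_quasi_uniform(n, rng):
    """(b) Quasi-uniform samples spread over the whole unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]

def sample_gaussian_modes(n, rng, modes_per_axis=4, sigma=0.05):
    """(c) Several Gaussian modes, one per combination of discretized factor values.

    Placing the modes on a regular grid is an assumption of this sketch.
    """
    centers = [((i + 0.5) / modes_per_axis, (j + 0.5) / modes_per_axis)
               for i in range(modes_per_axis) for j in range(modes_per_axis)]
    points = []
    for k in range(n):
        cx, cy = centers[k % len(centers)]  # round-robin over the modes
        points.append((rng.gauss(cx, sigma), rng.gauss(cy, sigma)))
    return points

def coverage(points, bins=4):
    """Fraction of bins x bins cells of the unit square that hold at least one sample."""
    cells = set()
    for x, y in points:
        if 0.0 <= x < 1.0 and 0.0 <= y < 1.0:
            cells.add((int(x * bins), int(y * bins)))
    return len(cells) / (bins * bins)

rng = random.Random(0)
narrow = sample_narrow_gaussian(64, rng)
uniform = sample_quasi_uniform(64, rng)
modes = sample_gaussian_modes(64, rng)
print(coverage(narrow), coverage(uniform), coverage(modes))
```

With the same sample budget, the single narrow mode can only reach the few cells around its center, whereas the multi-mode sampler places samples in every factor combination by construction.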

Experiment

We show how F-ACIL helps to achieve compositional generalization across high-dimensional spaces on two representative manipulation skill families: Pick-and-Place and Open-and-Close. We consider three groups of strategies to evaluate the improvement in data efficiency and the performance differences:

F-ACIL-Factors-Ratio: Constructs demonstrations by progressively increasing the coverage of task-relevant factors while controlling the ratio among factor spaces.
F-ACIL-Factors-Mixture: Increases the overall number of demonstrations from a quasi-uniform distribution without explicitly considering the factor structure.
Gaussian: Samples demonstrations purely at random from a Gaussian distribution without exploiting factor-aware data composition.
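As a rough illustration of what "controlling the ratio among factor spaces" could look like in practice, the sketch below splits a demonstration budget across factor spaces in a fixed ratio and varies one factor space at a time. This is one possible reading, not the authors' procedure: `FACTOR_SPACES`, `allocate_budget`, and `plan_demonstrations` are hypothetical names, the factor values are borrowed from the rollout grids on this page, and the "vary one space, hold the others at a default" rule is an assumption.

```python
from itertools import cycle

# Illustrative factor values borrowed from the rollout grids on this page;
# grouping them into exactly these three spaces is an assumption of this sketch.
FACTOR_SPACES = {
    "object": ["Transparent", "Specular", "Absorptive"],
    "action": ["Left", "Middle", "Right"],
    "environment": ["Warm Light", "Cool Light"],
}

def allocate_budget(ratio, total):
    """Split `total` demonstrations across factor spaces in proportion to integer `ratio`."""
    weight_sum = sum(ratio.values())
    counts = {name: (total * weight) // weight_sum for name, weight in ratio.items()}
    # Hand out the demonstrations lost to integer rounding, largest weight first.
    leftover = total - sum(counts.values())
    for name in sorted(ratio, key=ratio.get, reverse=True)[:leftover]:
        counts[name] += 1
    return counts

def plan_demonstrations(ratio, total, spaces=FACTOR_SPACES):
    """Vary one factor space at a time, holding the others at their first value."""
    default = {name: values[0] for name, values in spaces.items()}
    plan = []
    for name, count in allocate_budget(ratio, total).items():
        values = cycle(spaces[name])  # round-robin so coverage grows with the budget
        for _ in range(count):
            demo = dict(default)
            demo[name] = next(values)
            plan.append(demo)
    return plan

# Example: twice as many object-varying demonstrations as action or environment ones.
plan = plan_demonstrations({"object": 2, "action": 1, "environment": 1}, total=12)
for demo in plan[:3]:
    print(demo)
```

Increasing `total` while keeping `ratio` fixed widens coverage within each factor space without changing the balance between spaces, which is the property the strategy description emphasizes.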

1. Compositional Generalization Results

For Pick-and-Place trials in the object factor space, F-ACIL-Factors-Ratio requires 2–3x less data than F-ACIL-Factors-Mixture to achieve an 80–90% success rate. In the object–action setting of Open-and-Close, F-ACIL-Factors-Ratio requires 4x less data than F-ACIL-Factors-Mixture to achieve an 80–85% success rate. In the most complex object–action–environment space, both skills trained with F-ACIL-Factors-Ratio achieve approximately 85–95% success rate with 3–4x less data than F-ACIL-Factors-Mixture. In all cases, both models trained with structured factor-wise strategies outperform the Gaussian baseline while using 5–10x less data.

2. Scaling Laws with Increasing Dimension

a. Success rate

Even for simple tasks, scaling requires extensive data to reach baseline performance, and reaching it is far more difficult for complex skills such as Open-and-Close.

b. Power Law

Model performance improves with dataset size following a power law, but the scaling exponent (the slope of the power law on a log-log plot) can vary dramatically depending on the dimensionality of the data manifold or task space. Blindly scaling up dataset volume without accounting for dimensionality often results in substantially diminished returns, a manifestation of the curse of dimensionality in high-dimensional regimes.
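The effect of the exponent can be seen with a small worked example. The sketch below assumes the failure rate (1 minus success rate) follows `error ≈ a * n**(-alpha)` in the number of demonstrations `n`; this functional form, the helper names `fit_power_law` and `data_needed`, and the synthetic numbers are assumptions for illustration and are not results from the paper.

```python
import math

def fit_power_law(sizes, errors):
    """Fit errors ≈ a * sizes**(-alpha) by ordinary least squares in log-log space."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, alpha)

def data_needed(a, alpha, target_error):
    """Invert target_error = a * n**(-alpha) to get the required dataset size n."""
    return (a / target_error) ** (1.0 / alpha)

# Synthetic curve (not measured data) generated with a = 2.0 and alpha = 0.5.
sizes = [100, 200, 400, 800]
errors = [2.0 * n ** -0.5 for n in sizes]
a, alpha = fit_power_law(sizes, errors)
print(a, alpha)

# Same prefactor and target error, but a smaller exponent.
print(data_needed(2.0, 0.5, 0.1), data_needed(2.0, 0.25, 0.1))
```

Under these assumed numbers, halving the exponent from 0.5 to 0.25 raises the data needed for a 10% failure rate from about 400 to about 160,000 demonstrations, which is why the exponent matters more than raw dataset volume.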

Rollout Exhibition (Fully Autonomous 1x Speed)

We conduct extensive experiments below for Pick-and-Place and Open-and-Close. Note that for Open-and-Close, the texture of a hinged object is defined by the texture of its manipulated component.

F-ACIL-Object

Texture	Transparent	Specular	Absorptive
Geometry	Cylindrical	Rod-like	Irregular

Texture	Specular	Diffuse	Transparent
Size	Large	Medium	Small

F-ACIL-Action

Here, the rows {Left, Middle, Right} discretize positions along the horizontal direction of the tabletop (x-axis), and the columns {Top, Bottom} discretize positions along the vertical direction of the tabletop (y-axis).
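The discretization can be sketched as a simple binning function. The version below assumes tabletop coordinates normalized to [0, 1], equal-width bins, and that larger y corresponds to Top; none of these conventions are stated on this page, and `grid_cell` is a hypothetical helper name.

```python
X_LABELS = ("Left", "Middle", "Right")  # bins along the tabletop x-axis
Y_LABELS = ("Bottom", "Top")            # bins along the tabletop y-axis (assumed: larger y is Top)

def grid_cell(x, y):
    """Map normalized tabletop coordinates in [0, 1] to a (horizontal, vertical) label pair."""
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("coordinates must be normalized to [0, 1]")
    # Clamp so that the upper boundary (1.0) falls into the last bin.
    col = min(int(x * len(X_LABELS)), len(X_LABELS) - 1)
    row = min(int(y * len(Y_LABELS)), len(Y_LABELS) - 1)
    return X_LABELS[col], Y_LABELS[row]

print(grid_cell(0.1, 0.9))
print(grid_cell(0.5, 0.2))
```

The three horizontal and two vertical bins give six action cells, matching the six labels used in the grids.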

Top
Bottom	Left	Middle	Right

Top
Bottom	Left	Middle	Right

F-ACIL-Environment

Warm Light
Cool Light	Toward Right	Toward Left

Warm Light
Cool Light	Toward Right	Toward Left

Citation

@article{
  xiao2026generalizableroboticdataflywheel,
  title={Towards Generalizable Robotic Data Flywheel: High-Dimensional Factorization and Composition},
  author={Yuyang Xiao and Yifei Zhou and Haoran Wang and Wenxuan Ou and Yuxiao Liu},
  year={2026},
  eprint={2603.25583},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2603.25583},
}