A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens | alphaXiv
[DL Reading Group] A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tok…
[DL Reading Group] Flow-OPD: On-Policy Distillation for Flow Matching Models by @DeepLearning2023
[DL Reading Group] Causal-JEPA: Learning World Models through Object-Level Latent Interventions b…
[DL Reading Group] MolmoAct2: Action Reasoning Models for Real-World Deployment by @DeepLearning2…
[DL Reading Group] Geometry-aware 4D Video Generation for Robot Manipulation by @DeepLearning2023
[DL Reading Group] WEBROUTER: QUERY-SPECIFIC ROUTER VIA VARIATIONAL INFORMATION BOTTLENECK FOR CO…
[DL Reading Group] WorldMark: A Unified Benchmark Suite for Interactive Video World Models by @De…
[DL Reading Group] LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from P…