Papers
Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

TabTune: A Unified Library for Inference and Fine-Tuning Tabular Foundation Models

Step-Audio-EditX Technical Report
LEGO-Eval: Towards Fine-Grained Evaluation on Synthesizing 3D Embodied Environments with Tool Augmentation
UniAVGen: Unified Audio and Video Generation with Asymmetric Cross-Modal Interactions
Diffusion Language Models are Super Data Learners
UNO-Bench: A Unified Benchmark for Exploring the Compositional Law Between Uni-modal and Omni-modal in Omni Models
Dynamic Population Distribution Aware Human Trajectory Generation with Diffusion Model
Text to Robotic Assembly of Multi Component Objects using 3D Generative AI and Vision Language Models
Kosmos: An AI Scientist for Autonomous Discovery
Shorter but not Worse: Frugal Reasoning via Easy Samples as Length Regularizers in Math RLVR
Brain-IT: Image Reconstruction from fMRI via Brain-Interaction Transformer
When Modalities Conflict: How Unimodal Reasoning Uncertainty Governs Preference Dynamics in MLLMs
Don't Blind Your VLA: Aligning Visual Representations for OOD Generalization
When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought
VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation
The AI Productivity Index (APEX)
Chain-of-Frames: Advancing Video Understanding in Multimodal LLMs via Frame-Aware Reasoning
Towards Robust Mathematical Reasoning
Towards a future space-based, highly scalable AI infrastructure system design
PHUMA: Physically-Grounded Humanoid Locomotion Dataset
UniREditBench: A Unified Reasoning-based Image Editing Benchmark
Generalizing Test-time Compute-optimal Scaling as an Optimizable Graph
UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback
The Underappreciated Power of Vision Models for Graph Structural Understanding
Every Activation Boosted: Scaling General Reasoner to 1 Trillion Open Language Foundation
NOBLE - Neural Operator with Biologically-informed Latent Embeddings to Capture Experimental Variability in Biological Neuron Models
Glia: A Human-Inspired AI for Automated Systems Design and Optimization
Context Engineering 2.0: The Context of Context Engineering
Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning
Continuous Autoregressive Language Models
πRL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models
INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats