Papers
Daily updated cutting-edge AI research papers to help you keep up with the latest AI trends

LoFT: Parameter-Efficient Fine-Tuning for Long-tailed Semi-Supervised Learning in Open-World Scenarios

FLOWER: Democratizing Generalist Robot Policies with Efficient Vision-Language-Action Flow Policies

Inpainting-Guided Policy Optimization for Diffusion Large Language Models
MCP-AgentBench: Evaluating Real-World Language Agent Performance with MCP-Mediated Tools
A Survey on Cache Methods in Diffusion Models: Toward Efficient Multi-Modal Generation
Rethinking Driving World Model as Synthetic Data Generator for Perception Tasks
Spatially-Varying Autofocus
When to Ensemble: Identifying Token-Level Points for Stable and Fast LLM Ensembling
Towards Mixed-Modal Retrieval for Universal Retrieval-Augmented Generation
FineVision: Open Data Is All You Need
Glyph: Scaling Context Windows via Visual-Text Compression
PICABench: How Far Are We from Physically Realistic Image Editing?
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science
Self-Attention to Operator Learning-based 3D-IC Thermal Simulation
Earth AI: Unlocking Geospatial Insights with Foundation Models and Cross-Modal Reasoning
Rethinking Cross-lingual Gaps from a Statistical Viewpoint
Unleashing Scientific Reasoning for Bio-experimental Protocol Generation via Structured Component-based Reward Mechanism
Skyfall-GS: Synthesizing Immersive 3D Urban Scenes from Satellite Imagery
Emergent Misalignment via In-Context Learning: Narrow in-context examples can produce broadly misaligned LLMs
NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks
Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM
A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning
DeepSeek-OCR: Contexts Optical Compression
Direct Preference Optimization with Unobserved Preference Heterogeneity: The Necessity of Ternary Preferences
Elucidated Rolling Diffusion Models for Probabilistic Weather Forecasting
ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints
From Pixels to Words -- Towards Native Vision-Language Primitives at Scale
AI for Service: Proactive Assistance with AI Glasses
WithAnyone: Towards Controllable and ID Consistent Image Generation
Agentic Entropy-Balanced Policy Optimization
When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA