Papers
Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree

Aria: An Open Multimodal Native Mixture-of-Experts Model

Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution
VGGT: Visual Geometry Grounded Transformer
Multi-Turn Code Generation Through Single-Step Rewards
Revisiting Compositional Generalization Capability of Large Language Models Considering Instruction Following Ability
Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence
Semantically-Aware Rewards for Open-Ended R1 Training in Free-Form Generation
BUT System for the MLC-SLM Challenge
GenRecal: Generation after Recalibration from Large to Small Vision-Language Models
ProtoReasoning: Prototypes as the Foundation for Generalizable Reasoning in LLMs
Sekai: A Video Dataset towards World Exploration
Data-driven material screening of secondary and natural cementitious precursors
QFFT, Question-Free Fine-Tuning for Adaptive Reasoning
Can LLMs Generate High-Quality Test Cases for Algorithm Problems? TestCase-Eval: A Systematic Evaluation of Fault Coverage and Exposure
AceReason-Nemotron 1.1: Advancing Math and Code Reasoning through SFT and RL Synergy
Stream-Omni: Simultaneous Multimodal Interactions with Large Language-Vision-Speech Model
Efficient Medical VIE via Reinforcement Learning
Scaling Test-time Compute for LLM Agents
Iterative transcription factor screening enables rapid generation of microglia-like cells from human iPSC
TaskCraft: Automated Generation of Agentic Tasks
Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency
Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
Polystyrene nanoplastics disrupt the intestinal microenvironment by altering bacteria-host interactions through extracellular vesicle-delivered microRNAs
Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache
A High-Quality Dataset and Reliable Evaluation for Interleaved Image-Text Generation
SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning
LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?
The Diffusion Duality