Papers
Daily updated cutting-edge AI research papers to help you keep up with the latest AI trends

MMTok: Multimodal Coverage Maximization for Efficient Inference of VLMs

MV-RAG: Retrieval Augmented Multiview Diffusion
Connecting metal-organic framework synthesis to applications using multimodal machine learning
Model Context Protocols in Adaptive Transport Systems: A Survey
Algorithmic Collective Action with Multiple Collectives
OpenCUA: Open Foundations for Computer-Use Agents
Spatial Policy: Guiding Visuomotor Robotic Manipulation with Spatial-Aware Modeling and Reasoning
Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search
CRISP: Persistent Concept Unlearning via Sparse Autoencoders
Selective Contrastive Learning for Weakly Supervised Affordance Grounding
EgoTwin: Dreaming Body and View in First Person
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR
ODYSSEY: Open-World Quadrupeds Exploration and Manipulation for Long-Horizon Tasks
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
Constraints-Guided Diffusion Reasoner for Neuro-Symbolic Learning
LLM-Based Agents for Competitive Landscape Mapping in Drug Asset Due Diligence
SceneGen: Single-Image 3D Scene Generation in One Feedforward Pass
A Survey on Large Language Model Benchmarks
Waver: Wave Your Way to Lifelike Video Generation
LiveMCP-101: Stress Testing and Diagnosing MCP-enabled Agents on Challenging Queries
Deep Think with Confidence
Mobile-Agent-v3: Foundamental Agents for GUI Automation
Intern-S1: A Scientific Multimodal Foundation Model
Language-Guided Tuning: Enhancing Numeric Optimization with Textual Feedback
NiceWebRL: a Python library for human subject experiments with reinforcement learning environments
From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery
MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds
Tinker: Diffusion's Gift to 3D--Multi-View Consistent Editing From Sparse Inputs without Per-Scene Optimization
FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction
DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization
From Scores to Skills: A Cognitive Diagnosis Framework for Evaluating Financial Large Language Models
Granary: Speech Recognition and Translation Dataset in 25 European Languages