HyperAI

Main

GPU

Console
Studio
Docs
Pricing

Pulse

News

Resources

Papers
Notebooks
Datasets
Wiki

Benchmarks

SOTA
LLM Models
GPU Leaderboard

Community

Events

Utility

About Terms of Service Privacy Policy
English

Papers

Daily updated cutting-edge AI research papers to help you keep up with the latest AI trends

Build the Future of Artificial Intelligence

About

About Us Support Dataset Help

Products

News Papers Notebooks Datasets Wiki

Links

© HyperAI

GitHub Discord X (formerly Twitter)

TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models

Diffusion Model

Yangguang Li, Zi-Xin Zou, Zexiang Liu, et al.

Qwen2.5-Omni Technical Report

Dual-Scale Single Image Dehazing Via Neural Augmentation

Z. G. Li, C. B. Zheng, H. Y. Shu, et al.

SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks

Kai-Wei Chang, Haibin Wu, Yu-Kai Wang, et al.

Music SketchNet: Controllable Music Generation via Factorized Representations of Pitch and Rhythm

Ke Chen, Cheng-i Wang, Taylor Berg-Kirkpatrick, et al.

Adaptive Data Flywheel: Applying MAPE Control Loops to AI Agent Improvement

Aaditya Shukla, Sidney Knowles, Meenakshi Madugula, et al.

Expand VSR Benchmark for VLLM to Expertize in Spatial Rules

Peijin Xie, Lin Sun, Bingquan Liu, et al.

DensityTool: A post-processing tool for space and spin-resolved density of states from VASP

Lucas Lodeiro, Tomáš Rauch

A One-Dimensional Energy Balance Model Parameterization for the Formation of CO2 Ice on the Surfaces of Eccentric Extrasolar Planets

Vidya Venkatesan, Aomawa Shields, Russell Deitrick, et al.

Towards The Ultimate Brain: Exploring Scientific Discovery with ChatGPT AI

Medical Reasoning in LLMs: An In-Depth Analysis of DeepSeek R1

Birger Moëll, Fredrik Sand Aronsson, Sanian Akbar

Speech-FT: Merging Pre-trained And Fine-Tuned Speech Representation Models For Cross-Task Generalization

Tzu-Quan Lin, Wei-Ping Huang, Hao Tang, et al.

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Xiao Bi, Deli Chen, Guanting Chen, et al.

MatterGen: a generative model for inorganic materials design

Claudio Zeni, Robert Pinsler, Daniel Zügner, et al.

MultiActor-Audiobook: Zero-Shot Audiobook Generation with Faces and Voices of Multiple Speakers

Kyeongman Park, Seongho Joo, Kyomin Jung

Phi-4 Technical Report

Marah Abdin, Ronen Eldan, Mojan Javaheripi, et al.

A Set of Tutorials for the LAMMPS Simulation Package

Simon Gravelle, Cecilia M. S. Alvaras, Jacob R. Gissinger, et al.

GLM-4-Voice: Towards Intelligent and Human-Like End-to-End Spoken Chatbot

Aohan Zeng, Zhengxiao Du, Mingdao Liu, et al.

Length Aware Speech Translation for Video Dubbing

Harveen Singh Chadha, Aswin Shanmugam Subramanian, Vikas Joshi, et al.

DrawingSpinUp: 3D Animation from Single Character Drawings

Jie Zhou, Chufeng Xiao, Miu-Ling Lam, et al.

Phonetically-oriented word error alignment for speech recognition error analysis in speech translation

Nicholas Ruiz, Marcello Federico

ReaderLM-v2: Small Language Model for HTML to Markdown and JSON

Feng Wang, Zesheng Shi, Bo Wang, et al.

Deployment Calculation and Analysis for a Fail-Operational Automotive Platform

Klaus Becker, Bernhard Schätz, Christian Buckl, et al.

MegActor: Harness the Power of Raw Video for Vivid Portrait Animation

Shurong Yang, Huadong Li, Juhao Wu, et al.

Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams

Video Understanding

Visual Question Answering

Haoji Zhang, Yiqin Wang, Yansong Tang, et al.

PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding

Diffusion Model

Zhen Li, Mingdeng Cao, Xintao Wang, et al.

StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation

Yupeng Zhou, Daquan Zhou, Ming-Ming Cheng, et al.

Cherokee-English Machine Translation Demo with Quality Estimation and Corrective Feedback

Shiyue Zhang, Benjamin Frey, Mohit Bansal

Online Algorithm for Demand Response with Inelastic Demands and Apparent Power Constraint

Areg Karapetyan, Majid Khonji, Chi-Kin Chau, et al.

ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems

Siddhant Arora, Yifan Peng, Jiatong Shi, et al.

Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models

Neha Sengupta, Sunil Kumar Sahu, Bokang Jia, et al.

Quick Back-Translation for Unsupervised Machine Translation

Benjamin Brimacombe, Jiawei Zhou
