Model Deployment Introduction
Introduction to model deployment in HyperAI
After completing model training, you can deploy the model to a server or store it on a device to provide real-time model inference services. "Model Deployment (Serving)" is the server-side model inference functionality provided by HyperAI.
Deployment Modes
HyperAI model deployment supports two deployment modes:
- Custom Deployment (Recommended): Fully customize how the service starts by writing a `start.sh` startup script
- Traditional `predictor.py` approach: Use the predefined framework provided by HyperAI
API Key Authentication
HyperAI model deployment supports secure authentication using API Keys. Compared to JWT Token authentication, API Keys offer:
- More fine-grained access control
- Support for independent key management and tracking
- Alignment with industry standard practices (such as OpenAI, HuggingFace, etc.)
You can create and manage API Keys in the model deployment settings page. For detailed information, please refer to API Key Management.
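Once a key has been created, requests to the deployed endpoint can carry it in an HTTP header. Below is a minimal client sketch in Python, assuming the `Authorization: Bearer` header convention; the URL, header scheme, and request payload are illustrative, so use the values shown on your deployment's details page:

```python
# Hypothetical client call; replace the URL, header scheme, and payload
# with the values shown on your deployment's details page.
import requests

API_KEY = "your-api-key"  # created in the model deployment settings page
URL = "https://example-deployment.hyperai.example/predict"  # illustrative URL

resp = requests.post(
    URL,
    headers={"Authorization": f"Bearer {API_KEY}"},  # assumed Bearer scheme
    json={"input": "hello"},  # payload shape depends on your service
)
print(resp.status_code, resp.json())
```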
Custom Deployment Method (Recommended)
This is the simplest and most flexible deployment method. You only need to:
- Prepare model files
- Write a `start.sh` script to start your service
Custom Deployment Requirements
- Required files:
  - `start.sh` - Startup script, which must:
    - Listen on port 80
    - Handle HTTP requests
  - Model files and other dependency files
- Optional files:
  - `requirements.txt` - For installing Python dependencies
  - `conda-packages.txt` - For installing Conda dependencies
  - `dependencies.sh` - For installing system dependencies
  - `.env` - For setting environment variables
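Putting these together, an upload for a custom deployment might be laid out as follows; apart from `start.sh` and the optional files named above, the file names here are illustrative:

```
.
├── start.sh           # required: starts the service on port 80
├── app.py             # your service code (name is up to you)
├── requirements.txt   # optional: Python dependencies
├── .env               # optional: environment variables
└── model/             # your model files
```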
Example
You can use any framework (such as FastAPI, Flask, Gradio, etc.) to provide services. Here is a simple example using FastAPI:
```python
# app.py
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def predict():
    return {"message": "Hello World"}
```

```bash
#!/bin/bash
# start.sh
pip install fastapi uvicorn
uvicorn app:app --host 0.0.0.0 --port 80
```

Data Binding
When creating a model deployment, you can bind one or more data directories. The data binding method is basically the same as data binding for model training, and you can choose from the following sources:
- Public datasets or models
- Personal private datasets or models
- Working directories of compute containers
- Data repositories uploaded via file upload
Data Binding Characteristics
Model deployment data binding differs from model training in the following ways:
- Read-only Binding: All data bindings are mounted in read-only mode; no write or modification operations are allowed
- Multiple Directory Binding: Multiple data directories can be bound simultaneously to different mount points (see the sketch after this list):
  - `/openbayes/input/input0`
  - `/openbayes/input/input1`
  - `/openbayes/input/input2`
  - `/openbayes/input/input3`
  - `/openbayes/input/input4`
- Working Directory Characteristics:
  - Contents of the working directory (`/openbayes/home`) are copied from the binding source when the deployment starts
  - Important note: Since the contents of the working directory are lost after a restart, place all necessary model files and dependencies in the bound data directories
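For example, a service can read its weights from one of the bound mount points at startup. Here is a minimal sketch, assuming a model file named `model.pt` was bound to the first mount point; both the file name and the loading step are illustrative:

```python
# app.py (excerpt): load a model from a read-only bound directory.
# /openbayes/input/input0 is the first mount point; model.pt is illustrative.
from pathlib import Path

MODEL_DIR = Path("/openbayes/input/input0")

def load_model():
    model_path = MODEL_DIR / "model.pt"  # hypothetical file name
    if not model_path.exists():
        raise FileNotFoundError(f"expected model file at {model_path}")
    # Load with your framework of choice, e.g. torch.load(model_path).
    return model_path.read_bytes()  # placeholder for a real loader
```

Because bound directories are read-only, any runtime artifacts the service produces (caches, temporary files) must be written to a writable location instead.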
Selecting Binding Directories
The selection method for binding is the same as for compute containers.
Version Management
Model deployment supports version management:
- Versions are independent of each other and can support different runtime environments, resource types, and deployment contents
- When a new version is deployed, the old version will be automatically taken offline
- Version numbers increment as a numeric sequence
Detailed operations are introduced in Managing Model Deployments.
Traditional Deployment Method (predictor.py)
If you wish to use the predefined framework provided by HyperAI, you can choose this method.
- Required files:
  - `predictor.py` - Model deployment script containing the `Predictor` class
  - Model files
- Optional files:
  - `requirements.txt`, `conda-packages.txt` - For installing dependencies
  - `dependencies.sh` - For installing system dependencies
  - `.env` - For setting environment variables
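For orientation, a `predictor.py` centers on a `Predictor` class that loads the model once and then serves each request. The skeleton below is an illustrative sketch only; the method names, signatures, and entry point assumed here are not the authoritative interface, which is defined in Writing Serving Services:

```python
# predictor.py - illustrative skeleton only; the actual interface expected
# by the HyperAI framework is described in Writing Serving Services.

class Predictor:
    def __init__(self):
        # Load the model once at startup. A real service would read weights
        # from a bound directory, e.g. /openbayes/input/input0.
        self.model = lambda text: text.upper()  # stand-in for a real model

    def predict(self, payload):
        # Run inference on one request and return a JSON-serializable result.
        return {"result": self.model(payload.get("input", ""))}

if __name__ == "__main__":
    # Local smoke test of the class itself.
    print(Predictor().predict({"input": "hello"}))
```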
Detailed writing rules are introduced in Writing Serving Services, and writing examples are available for reference in the openbayes-serving-examples model repository.