HyperAI Configuration File

Providing Default Parameters for Execution

The HyperAI configuration file (openbayes.yaml) can be used in combination with the command-line tool to greatly simplify repetitive command input.

When using the command-line tool, you can pass required information such as "environment", "resources", and "datasets" through command parameters, for example:

# --env       runtime environment
# --resource  compute resource; run bayes gear resource and see the USAGE field for valid values
# --data      data to bind, in the form <data>:<path>
# after --    entry command
bayes gear run task \
    --env=pytorch-2.0 \
    --resource=t4 \
    --data openbayes/mnist/1:/input0 \
    -- python main.py

With openbayes.yaml, you can provide default parameters for tasks run from the current directory. For example, suppose the directory contains an openbayes.yaml with the following content:

data_bindings:
  - data: openbayes/mnist/1
    path: /input0
    type: ro
resource: t4
env: pytorch-2.0
command: "python main.py"

Running the following command in that directory then starts the same task as the full command above:

bayes gear run task

openbayes.yaml Field Description and Specifications

openbayes.yaml currently consists of two parts:

  1. Basic parameters, comprising five fields: data_bindings, resource, env, command, and parameters
  2. Hyperparameter tuning parameters, contained in hyper_tuning and described in detail in Hyperparameter Tuning

data_bindings

Specifies the data to bind. Both "container output" and "datasets" are supported, with a maximum of three bindings at a time. Each binding consists of two parts, data and path, plus an optional type (ro or rw).

data

data refers to the bound data source. If the bound data source is a "dataset version", its format is:

<userid>/<dataset-name>/<dataset-version>

For example, to bind the first version of the MNIST dataset under HyperAI, the data field would be:

openbayes/mnist/1

If the bound data source is a "container output", its format is:

<userid>/jobs/<job-id>/output

For example, to bind the output of job jfaqJeLMcPM in the test-project container under HyperAI, the data field would be:

openbayes/jobs/jfaqJeLMcPM/output
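The two data formats above can be told apart by their segment count. The following is a minimal sketch, not part of the bayes tool: the function name parse_data_source and the returned dictionary keys are hypothetical, and it assumes that no segment contains a slash.

```python
def parse_data_source(data: str) -> dict:
    """Classify a data string as a dataset version or a container output.

    Hypothetical helper for illustration; it is not part of the bayes CLI.
    """
    parts = data.split("/")
    # <userid>/jobs/<job-id>/output
    if len(parts) == 4 and parts[1] == "jobs" and parts[3] == "output" and all(parts):
        return {"kind": "container_output", "userid": parts[0], "job_id": parts[2]}
    # <userid>/<dataset-name>/<dataset-version>
    if len(parts) == 3 and all(parts):
        return {
            "kind": "dataset_version",
            "userid": parts[0],
            "dataset": parts[1],
            "version": parts[2],
        }
    raise ValueError(f"unrecognized data source format: {data!r}")
```

Calling parse_data_source("openbayes/mnist/1") yields a dataset_version entry, while "openbayes/jobs/jfaqJeLMcPM/output" yields a container_output entry.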

path

path specifies the directory in the container to which the data source is bound. The currently supported directories are:

  • /input0
  • /input1
  • /input2
  • /input3
  • /input4
  • /output

Two Configuration Methods for Data Binding

In openbayes.yaml, you can use one of the following two methods to configure data bindings:

  1. Using data_bindings (detailed configuration method):
data_bindings:
  - data: openbayes/mnist/1     # Full path of the dataset
    path: /input0               # Path to mount in the container
    type: ro                    # Optional: ro (read-only) or rw (read-write)
  - data: openbayes/jobs/jfaqJeLMcPM/output
    path: /output
    type: rw
  2. Using bindings (shorthand method):
# Without specifying binding permissions, defaults to read-only binding
bindings:
  - openbayes/eBIQp4yPMtU/1:/input0
  - openbayes/jobs/jfaqJeLMcPM/output:/output

# You can also specify read-write binding
bindings:
  - username/data-cli-2/5:/input0:rw
  - username/data-1/1:/output:rw

:::info Please note that in openbayes/eBIQp4yPMtU/1:/input0, openbayes is the name reserved for public datasets. To use your own dataset, replace openbayes with your username. eBIQp4yPMtU is the dataset ID and 1 is the dataset version number; :/input0 binds the dataset to /input0. :::

These two methods are equivalent, and you can choose either one based on your needs.
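To make the equivalence concrete, the sketch below expands one shorthand entry into the detailed form. It is an illustration only, not how the bayes tool parses the file: the function name expand_binding is hypothetical, and it assumes that neither the data string nor the path contains a colon.

```python
def expand_binding(shorthand: str) -> dict:
    """Expand '<data>:<path>[:<type>]' into a data_bindings-style entry.

    Hypothetical helper for illustration; it is not part of the bayes CLI.
    """
    parts = shorthand.split(":")
    if len(parts) == 2:
        data, path = parts
        mode = "ro"  # the shorthand defaults to a read-only binding
    elif len(parts) == 3:
        data, path, mode = parts
    else:
        raise ValueError(f"expected <data>:<path>[:<type>], got {shorthand!r}")
    if mode not in ("ro", "rw"):
        raise ValueError(f"type must be 'ro' or 'rw', got {mode!r}")
    return {"data": data, "path": path, "type": mode}
```

For instance, expand_binding("username/data-1/1:/output:rw") gives an entry with data username/data-1/1, path /output, and type rw.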

Data Binding Permission Control Instructions

  • ro: Read-only permission, suitable for reference datasets
  • rw: Read-write permission, suitable for output directories where results need to be saved

:::caution Note Data bound to a read-only path cannot be modified, but the binding is very fast. /output is the working directory. Although it is read-write, binding data to /output incurs additional data copy time and also consumes additional storage when the container is saved. Choose the binding directory according to your own scenario. :::

:::tip

  • Use the bayes gear bindings command to view the list of bindable data :::

:::caution

  • Ensure the specified dataset version exists
  • Ensure you have permission to access the specified dataset
  • Do not use duplicate values for mount paths :::
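Some of these rules can be checked locally before submitting. The sketch below is a hypothetical pre-flight check, not a feature of the bayes tool: the name validate_bindings is invented here, and the limits are taken from this page (the supported directories and the maximum of three bindings). It cannot verify that a dataset version exists or that you have access to it.

```python
SUPPORTED_PATHS = {"/input0", "/input1", "/input2", "/input3", "/input4", "/output"}
MAX_BINDINGS = 3  # limit stated in the data_bindings description


def validate_bindings(bindings: list[dict]) -> list[str]:
    """Return a list of problems found in data_bindings-style entries.

    Hypothetical helper for illustration; an empty list means no problem
    was detected by these local checks.
    """
    problems = []
    if len(bindings) > MAX_BINDINGS:
        problems.append(f"too many bindings: {len(bindings)} > {MAX_BINDINGS}")
    seen_paths = set()
    for binding in bindings:
        path = binding.get("path")
        if path not in SUPPORTED_PATHS:
            problems.append(f"unsupported mount path: {path}")
        if path in seen_paths:
            problems.append(f"duplicate mount path: {path}")
        seen_paths.add(path)
        # type is optional; a missing value is treated as read-only here
        if binding.get("type", "ro") not in ("ro", "rw"):
            problems.append(f"invalid type for {path}: {binding.get('type')}")
    return problems
```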

resource

Specifies which computing resource container to use. Use the command bayes gear resource to see supported resource types.

env

Specifies which runtime environment to use. Use the command bayes gear env to view supported runtime environments.

command

Required only when creating a "script execution". It specifies the entry command that runs when the task is executed.

parameters

When creating a task or jupyter execution, you can pass a set of key / value parameters through parameters. This parameter mainly serves two purposes:

  1. Conveniently record important parameters for this execution. parameters will be displayed on the execution interface

  2. Pass custom parameters to the execution, supporting two forms:

    • The content of parameters is written to a file, openbayes_params.json, when the execution is initialized, so that programs can read the parameters from this file.

    • The content of parameters is appended to the entry command as command-line arguments. For example, if openbayes.yaml contains the following:

    ...
    command: python main.py
    parameters:
      input: /input0
      epochs: 5
    ...

    Then the actual entry command will be:

    python main.py --input=/input0 --epochs=5
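On the receiving side, main.py has to accept these arguments. The following is a minimal sketch of one way to do that with argparse, falling back to openbayes_params.json for defaults. Several points are assumptions rather than documented behavior: that the file sits in the working directory, that it holds a flat JSON object mirroring parameters, and the fallback values /input0 and 1 used when neither source supplies a value.

```python
import argparse
import json
from pathlib import Path

# Assumption: the file is in the working directory and contains a flat JSON
# object such as {"input": "/input0", "epochs": 5}. Verify this on your own
# execution before relying on it.
PARAMS_FILE = Path("openbayes_params.json")


def load_file_params(path: Path = PARAMS_FILE) -> dict:
    """Return the parameters stored in the JSON file, or {} if it is absent."""
    if not path.exists():
        return {}
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def parse_args(argv=None, file_params=None) -> argparse.Namespace:
    """Parse --input and --epochs, using values from the file as defaults."""
    defaults = load_file_params() if file_params is None else file_params
    parser = argparse.ArgumentParser(description="Example entry script")
    # The final fallbacks ("/input0", 1) are illustrative values only.
    parser.add_argument("--input", default=defaults.get("input", "/input0"))
    parser.add_argument("--epochs", type=int, default=int(defaults.get("epochs", 1)))
    return parser.parse_args(argv)
```

In main.py, calling parse_args() with no arguments reads sys.argv, so the appended --input=/input0 --epochs=5 take precedence over whatever the file contains.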

Overriding Parameters in Configuration File with Command-Line Arguments

You can also override the corresponding parameters from the command line. For example, with the openbayes.yaml shown above, the following command overrides the entry command:

bayes gear run task -- sleep 360

The submitted entry command is then sleep 360 instead of python main.py.
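The precedence rule can be pictured as a simple merge in which command-line values win over values from openbayes.yaml. This sketch only models that rule and is not the bayes tool's implementation; the function name effective_settings and the dictionary representation of both sources are hypothetical.

```python
def effective_settings(config: dict, cli_overrides: dict) -> dict:
    """Merge config-file values with command-line values; the CLI wins.

    Hypothetical model for illustration. A CLI value of None means the
    option was not given, so the config-file value is kept.
    """
    merged = dict(config)
    merged.update({key: value for key, value in cli_overrides.items() if value is not None})
    return merged


config = {"resource": "t4", "env": "pytorch-2.0", "command": "python main.py"}
settings = effective_settings(config, {"command": "sleep 360", "resource": None})
```

Here settings keeps resource and env from the file, while command becomes sleep 360.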