Managing Containers with bayes

The bayes command line tool has the concept of a "working directory": a local directory that maps to the "output" directory of a HyperAI container. Before creating a container from the command line, designate a local directory as the working directory and map it to a HyperAI container as follows:

  1. Switch to the directory that contains the code to be executed: cd ~/openbayes-mnist-example
  2. Initialize a new container: bayes gear init mnist-example. The current directory is now mapped to the mnist-example container, and every "execution" created from it will appear under this container.

:::note Use the bayes gear ls command to view all your containers :::

The bayes gear init command can use an existing container name or container ID to initialize the current directory. If you initialize with a non-existent container name, a new container will be created.
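
The name-or-ID rule can be pictured with a small Python sketch. It only models the behaviour described above and is not the CLI's implementation; the `Container` type, the `resolve_init_target` helper, and the sample values are hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Container:
    """Hypothetical record for a container that already exists."""
    id: str
    name: str


def resolve_init_target(arg: str, existing: list[Container]) -> Optional[Container]:
    """Return the existing container matched by name or ID, or None.

    None models the "create a new container named `arg`" branch.
    """
    for container in existing:
        if arg in (container.id, container.name):
            return container
    return None


# Sample values only; a real list would come from `bayes gear ls`.
existing = [Container(id="6q848lathbdp", name="practicalAI")]
by_name = resolve_init_target("practicalAI", existing)
by_id = resolve_init_target("6q848lathbdp", existing)
missing = resolve_init_target("mnist-example", existing)  # None: would be created
```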

With this setup complete, the following sections describe the different ways to create and manage executions.

Creating "Python Script Execution" via Command Line Parameters

Run bayes gear run task -h to see examples of how to create a "Python Script Execution".

Start with a simple version:

$ bayes gear run task --env=pytorch-2.0 -- python main.py

Currently operating on organization org1...
task_command information: python main.py
Uploading source code...
Preparing to upload source code...
Obtaining upload authorization...
Starting file scan, please wait...
Found 9 files in total, 13.3 kB in total, starting upload...
Upload progress: 100% (9/9): 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 13.3k/13.3k [00:00<00:00, 44.0kB/s]

✅ Source code uploaded successfully! 9 files uploaded
Requesting server to create container...
Container created successfully
Open webpage https://openbayes.com/console/org1/jobs/fpyx2l77wtvh to view detailed container information

The content after -- is the specific command to be executed. If there are symbols like &&, they need to be protected with quotes: bayes gear run task -- 'echo 123 && python main.py'.
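
When the command line is assembled by a script rather than typed by hand, Python's standard `shlex` module can apply the quoting. The sketch below is only an illustration of the quoting rule above; it builds the command line and does not call bayes.

```python
import shlex

# The whole task command is passed to bayes as ONE argument after `--`.
argv = ["bayes", "gear", "run", "task", "--", "echo 123 && python main.py"]

# shlex.join quotes each argument so that a POSIX shell reads it back unchanged.
command_line = shlex.join(argv)
print(command_line)

# Splitting the quoted line recovers the original argument list, so `&&`
# stays inside the task command instead of being interpreted by the local shell.
round_trip = shlex.split(command_line)
```

Without the quotes, the local shell would treat `&&` as its own operator and run python main.py on the local machine after the bayes command returns.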

You can see that bayes uploaded the files in the current directory and created a "Python script" task.
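
Later commands such as bayes gear restart, stop, and download take the execution ID, which appears as the last path segment of the URL printed above. A minimal sketch for pulling it out of captured output follows; the URL pattern is inferred from the sample output only and is an assumption, and `find_execution` is a hypothetical helper name.

```python
import re
from typing import Optional

# Pattern inferred from the sample output: .../console/<organization>/jobs/<execution ID>
JOB_URL = re.compile(r"/console/(?P<org>[^/\s]+)/jobs/(?P<job_id>[A-Za-z0-9]+)")


def find_execution(output: str) -> Optional[tuple[str, str]]:
    """Return (organization, execution ID) found in CLI output, or None if absent."""
    match = JOB_URL.search(output)
    if match is None:
        return None
    return match.group("org"), match.group("job_id")


sample = (
    "Container created successfully\n"
    "Open webpage https://openbayes.com/console/org1/jobs/fpyx2l77wtvh "
    "to view detailed container information\n"
)
result = find_execution(sample)
```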

Next, let's try to create a slightly more complex version using command line parameters:

  bayes gear run task \
      --resource cpu \
      --env pytorch-2.0 \
      --data openbayes/eBIQp4yPMtU/1:/input0 \
      --data openbayes/sTggKplxyT6/1:/input1 \
      --data openbayes/bbNaMvDNqO9/1:/input2 \
      --data username/jobs/3s55ypc33ptl/output:/output \
      --message "task message" \
      --open \
      --follow \
      -- sleep 60

The available parameters are:

  • -e or --env selects the image. Query the available images with bayes gear env
  • -r or --resource selects the compute resource. Query the available resources with bayes gear resource
  • -d or --data binds data. Query the data available for binding with bayes gear bindings
  • -m or --message sets a description for the execution; it can be left empty
  • -o or --open opens the corresponding web page in the browser once the container starts running
  • -f or --follow tracks the status of the running container
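
For scripted use, the same options can be assembled into an argument list. This is a sketch that uses only the long flags listed above; `build_run_task_argv` is a hypothetical helper, and it only builds the list without invoking bayes.

```python
import shlex
from typing import Optional, Sequence


def build_run_task_argv(
    command: Sequence[str],
    env: Optional[str] = None,
    resource: Optional[str] = None,
    data: Sequence[str] = (),
    message: Optional[str] = None,
    open_browser: bool = False,
    follow: bool = False,
) -> list[str]:
    """Assemble the argument list for `bayes gear run task`."""
    argv = ["bayes", "gear", "run", "task"]
    if resource:
        argv += ["--resource", resource]
    if env:
        argv += ["--env", env]
    for binding in data:  # one --data flag per binding
        argv += ["--data", binding]
    if message:
        argv += ["--message", message]
    if open_browser:
        argv.append("--open")
    if follow:
        argv.append("--follow")
    # Everything after `--` is the command executed inside the container.
    return argv + ["--", *command]


argv = build_run_task_argv(
    ["sleep", "60"],
    env="pytorch-2.0",
    resource="cpu",
    data=["openbayes/eBIQp4yPMtU/1:/input0"],
    message="task message",
    follow=True,
)
print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv)` would avoid shell quoting issues altogether, since each element reaches bayes as a separate argument.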

:::info In --data openbayes/eBIQp4yPMtU/1:/input0, openbayes is the dedicated name for public datasets; to use your own dataset, replace openbayes with your username. eBIQp4yPMtU is the dataset ID, 1 is the dataset version number, and :/input0 binds the dataset to /input0. :::
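
The binding string can be taken apart as sketched below. The grammar is inferred from the examples in this document rather than from a published specification, so treat it as an assumption; the optional `:ro`/`:rw` suffix appears in the `bindings` shorthand of openbayes.yaml, and whether `--data` also accepts it is not stated here. `DataBinding` and `parse_binding` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataBinding:
    source: str           # e.g. "openbayes/eBIQp4yPMtU/1"
    owner: str            # "openbayes" for public datasets, otherwise a username
    mount_path: str       # e.g. "/input0"
    mode: Optional[str]   # "ro", "rw", or None when no suffix is given


def parse_binding(spec: str) -> DataBinding:
    """Split <source>:<mount path>[:<mode>] into its parts."""
    parts = spec.split(":")
    if len(parts) not in (2, 3):
        raise ValueError(f"expected <source>:<path>[:<mode>], got {spec!r}")
    source, mount_path = parts[0], parts[1]
    mode = parts[2] if len(parts) == 3 else None
    if not source or not mount_path.startswith("/"):
        raise ValueError(f"invalid source or mount path in {spec!r}")
    if mode not in (None, "ro", "rw"):
        raise ValueError(f"mode must be ro or rw, got {mode!r}")
    owner = source.split("/", 1)[0]
    return DataBinding(source, owner, mount_path, mode)


dataset = parse_binding("openbayes/eBIQp4yPMtU/1:/input0")
output = parse_binding("username/jobs/3s55ypc33ptl/output:/output")
```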

Creating "Python Script Execution" through openbayes.yaml

Additionally, after binding the current directory to the container through bayes gear init, a file openbayes.yaml will appear in the directory with the following initial content:

openbayes.yaml
## For the latest instructions on "Configuration File", please refer to https://openbayes.com/docs/cli/config-file/

## data_bindings
#  Refers to bound data, supports "container output" and "datasets", with a maximum of three bindings simultaneously
#
#  A complete data_bindings example is as follows:
#
#    data_bindings:
#      - data: openbayes/mnist/1
#        path: /input0
#        type: ro
#      - data: openbayes/jobs/jfaqJeLMcPM/output
#        path: /output
#        type: rw
#
#  data_bindings can also be replaced with bindings, abbreviated as the following example:
#
#    bindings:
#      - openbayes/mnist/1:/input0
#      - openbayes/mnist/1:/input1:rw
#      - openbayes/jobs/jfaqJeLMcPM/output:/output
#
data_bindings: []

## resource
# Specifies which compute container to use. Use the command `bayes gear resource` to see supported compute types
#
resource: "rtx-4090"

## env
# Specifies which runtime environment to use. Use the command `bayes gear env` to view supported runtime environments
#
env: "pytorch-2.6-2204"

## command
# Only required when creating a "script execution". Specifies the entry command when the task executes
#
command: ""

## node
# Specifies the number of running nodes
#
node: 1

## parameters
# Supports key/value format parameters. These parameters will generate an openbayes_params.json file during container execution and be appended to the command parameter
# Example as follows:
#
#    parameters:
#      input: /input0
#      epochs: 5
#
#    During execution, an openbayes_params.json file with content {"input": "/input0", "epochs": 5} will be generated,
#    and `--input=/input0 --epochs=5` will be appended to the execution command
#
parameters: {}


## For the latest documentation on "HyperAI Hyperparameter Tuning", please refer to https://openbayes.com/docs/hypertuning/
#
# A complete hyper_tuning example is as follows:
#    hyper_tuning:
#      max_job_count: 3
#      hyperparameter_metric: precision
#      goal: MINIMIZE
#      algorithm: Bayesian
#      parameter_specs:
#      - name: regularization
#        type: DOUBLE
#        min_value: 0.001
#        max_value: 10.0
#        scale_type: UNIT_LOG_SCALE
#      - name: latent_factors
#        type: INTEGER
#        min_value: 5
#        max_value: 50
#        scale_type: UNIT_LINEAR_SCALE
#      - name: unobs_weight
#        type: DOUBLE
#        min_value: 0.001
#        max_value: 5.0
#        scale_type: UNIT_LOG_SCALE
#      - name: feature_wt_factor
#        type: DOUBLE
#        min_value: 1
#        max_value: 200
#        scale_type: UNIT_LOG_SCALE
#      - name: level
#        type: DISCRETE
#        discrete_values: [1, 2, 3, 4]
#      - name: category
#        type: CATEGORICAL
#        categorical_values: ["A", "B", "C"]
#
hyper_tuning:

  ## max_job_count
  #  The number of attempts for one automatic hyperparameter tuning session, with a maximum support of 100 attempts
  #
  max_job_count: 0

  ## parallel_count
  #  The number of parallel attempts is limited by the user's maximum parallel count for a single resource type, typically 1 or 2
  #
  parallel_count: "1"

  ## hyperparameter_metric
  #  Target variable
  #  For reporting target variables, please refer to https://openbayes.com/docs/hypertuning/#2-上报目标变量
  hyperparameter_metric: ""

  ## goal
  #  Direction of the optimal solution (MAXIMIZE or MINIMIZE)
  #
  goal: ""

  ## algorithm
  #  Algorithm to be used, supported algorithms are as follows:
  #  Grid      For scenarios with only DISCRETE and CATEGORICAL type parameters, GridSearch can be used to traverse all parameter combinations
  #  Random    For INTEGER and DOUBLE types, based on their supported distribution types, randomly select values between min_value and max_value; for DISCRETE and CATEGORICAL types, behavior is similar to Grid method
  #  Bayesian  When generating parameters each time, consider previous "parameter"-"target variable" results, and provide parameters through an updated distribution function to expect better results. The algorithm can be referenced in this article
  #
  algorithm: ""

  ## parameter_specs
  #  Input parameter specifications
  #  For parameter specification definitions, please refer to: https://openbayes.com/docs/hypertuning/#参数规约的定义
  #
  parameter_specs: []

  ## side_metrics
  #  Other reference metrics
  #
  side_metrics: []

The hyper_tuning section is not covered here. The other fields match the parameters of bayes gear run task, so setting them in openbayes.yaml saves you from retyping them on every bayes gear run task call. For example, with the following configuration:

data_bindings:
  - data: openbayes/mnist/1     # Full path of the dataset
    path: /input0               # Path to mount in the container
    type: ro                    # Optional: ro (read-only) or rw (read-write)
resource: rtx-4090
env: pytorch-2.0
command: "python train.py -i /input0 -o ./model -e 2 -m model.h5 -l ./tf_dir"

:::info In openbayes/mnist/1, openbayes is the dedicated name for public datasets; to use your own dataset, replace openbayes with your username. mnist is the name of the dataset and 1 is its version number. :::

You can directly execute a task by entering the bayes gear run task command. This task will run in the pytorch-2.0 environment, use rtx-4090 compute resources, bind the dataset openbayes/mnist/1 to /input0, and execute the entry command python train.py -i /input0 -o ./model -e 2 -m model.h5 -l ./tf_dir.

:::note For more information on how to write configuration files, see HyperAI Configuration File :::
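
The `parameters` field of openbayes.yaml is described in the file's comments as producing both an openbayes_params.json file and extra command-line flags. The sketch below reproduces that documented example and shows one way a training script might read the appended flags. It is an illustration under assumptions: `render_parameters` is a hypothetical helper, and the platform's exact serialization may differ.

```python
import argparse
import json


def render_parameters(parameters: dict) -> tuple[str, list[str]]:
    """Return the JSON file content and the flags appended to the command."""
    params_json = json.dumps(parameters)
    flags = [f"--{key}={value}" for key, value in parameters.items()]
    return params_json, flags


params_json, flags = render_parameters({"input": "/input0", "epochs": 5})
print(params_json)      # {"input": "/input0", "epochs": 5}
print(" ".join(flags))  # --input=/input0 --epochs=5

# Script side: accept the appended flags with argparse.
parser = argparse.ArgumentParser()
parser.add_argument("--input")
parser.add_argument("--epochs", type=int)
args = parser.parse_args(flags)
```

Reading the flags through `argparse` keeps the script runnable locally with the same arguments that the container would receive.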

Creating a "Jupyter Workspace"

Similar to creating a "Python Script", creating a Jupyter workspace via command line will by default upload files from the current directory to the container's "output".

  1. Download the practicalAI project: git clone https://github.com/practicalAI/practicalAI
  2. Initialize the container: cd practicalAI && bayes gear init practicalAI
  3. Create the Jupyter workspace: bayes gear run workspace

$ bayes gear run workspace -o -f

Currently operating on organization org1...
Requesting server to create container...
Container created successfully
Open the webpage https://openbayes.com/console/org1/jobs/52yaekv8nf91 to view detailed container information
Browser opened successfully.

Container is running

:::note Creating a "Jupyter Workspace" is similar to creating a "Python Script" - it can be created through command line arguments or through the openbayes.yaml file. :::

Continuing Container Execution

  • Use the bayes gear status command to view all executions under the current container
  • Use the bayes gear restart command with the ID of a completed execution to run that execution again with the same parameters

$ bayes gear restart 52yaekv8nf91 -o -f

Currently operating on organization org1...
Container continuing execution...
Open the webpage https://openbayes.com/console/org1/jobs/52yaekv8nf91 to view detailed information of container practicalAI
Browser opened successfully.
⠸ CREATED

You can also override parameters to modify certain settings and run the execution again.

:::note The options for the restart command are the same as those for the run command :::

$ bayes gear restart 52yaekv8nf91 \
      --resource cpu \
      --env pytorch-2.0 \
      --data openbayes/eBIQp4yPMtU/1:/input0 \
      --data openbayes/sTggKplxyT6/1:/input1 \
      --data openbayes/bbNaMvDNqO9/1:/input2 \
      --data username/jobs/3s55ypc33ptl/output:/output \
      --message "task message" \
      --open \
      --follow

Currently operating on organization org1...
Container continuing execution...
Open webpage https://openbayes.com/console/org1/jobs/52yaekv8nf91 to view detailed information of container practicalAI
Browser opened successfully.

Container running

:::info In --data openbayes/eBIQp4yPMtU/1:/input0, openbayes is the dedicated name for public datasets; to use your own dataset, replace openbayes with your username. eBIQp4yPMtU is the dataset ID, 1 is the dataset version number, and :/input0 binds the dataset to /input0. :::
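
Conceptually, a restart with overrides starts from the parameters of the original execution and replaces only the settings passed on the command line. The sketch below models that idea with a plain dictionary merge. It is a mental model rather than the CLI's implementation, the sample values are illustrative, and how repeated --data bindings combine with the original ones is not covered here.

```python
def merge_overrides(original: dict, overrides: dict) -> dict:
    """Overrides win; anything not overridden keeps its original value."""
    return {**original, **overrides}


# Illustrative values only.
original = {"resource": "rtx-4090", "env": "pytorch-2.0", "message": ""}
restarted = merge_overrides(original, {"resource": "cpu", "message": "task message"})
print(restarted)
```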

Stop Container Execution

Use the bayes gear stop command with the running container execution ID to stop that execution of the container.

$ bayes gear stop 52yaekv8nf91 -o -f

Currently operating on organization org1...
Synchronizing data and closing container
Open webpage https://openbayes.com/console/username/jobs/52yaekv8nf91 to view detailed information of container practicalAI
Browser opened successfully.

Container closed

The available parameters are:

  • -o or --open will open the corresponding web interface in the browser after the container starts closing
  • -f or --follow will keep tracking the container's status until the container is completely closed

Download Container Output with Command Line Tool

1. Download Container Output Directly via Execution ID

Use the bayes gear download command with the container execution ID to download the current output content of that container.

$ bayes gear download 5mx0ki1s5ej8 --target ~/Downloads/data-download-location -u

Currently operating on organization org1...
Downloading, please wait
Download complete, file saved to ~/Downloads/data-download-location/cli-29.output.zip

Extracting, please wait
Extraction successful: Files extracted to ~/Downloads/data-download-location
Source file deleted: ~/Downloads/data-download-location/cli-29.output.zip

The available parameters are:

  • -f or --from specifies the subpath to download; if omitted, the entire output is downloaded
  • -t or --target specifies the local directory to save to; if omitted, the current path is used
  • -u or --unarchive automatically extracts the archive and deletes the source file; if omitted, the archive is kept and not extracted

:::note Using the -u or --unarchive parameter requires that the folder selected by -t or --target be an empty folder :::
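
The extract-then-delete behaviour of -u, together with the empty-folder requirement, can be sketched in Python with the standard `zipfile` module. This mirrors the documented steps and is not the tool's actual code; `unarchive` is a hypothetical helper, and the check ignores the archive itself because the log above shows it being saved inside the target folder.

```python
import tempfile
import zipfile
from pathlib import Path


def unarchive(archive: Path, target: Path) -> None:
    """Extract `archive` into `target`, then delete the archive."""
    target.mkdir(parents=True, exist_ok=True)
    others = [p for p in target.iterdir() if p != archive]
    if others:
        raise FileExistsError(f"{target} is not empty")
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    archive.unlink()  # "Source file deleted" in the log above


# Demo with a throwaway archive in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp)
    archive = target / "example.output.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("model/weights.txt", "demo")
    unarchive(archive, target)
    extracted = sorted(p.relative_to(target).as_posix() for p in target.rglob("*"))
```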

2. Create "Python Script Execution" and Download Output After Container Execution Completes

Chaining bayes gear run task and bayes gear download with && waits for the "Python Script Execution" to complete before downloading its output.

$ bayes gear run task -f && bayes gear download -t /Users/username/test-data-download -u

Currently operating on organization org1...
command information: sleep 1
Uploading source code...
Preparing to upload source code...
Obtaining upload authorization...
Starting to scan files, please wait...
Found 10 files in total, 4.4 MB in total, starting upload...
Upload progress: 100% (10/10): 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4.43M/4.43M [00:05<00:00, 809kB/s]

✅ Source code uploaded successfully! 10 files uploaded
Requesting server to create container...
Container created successfully
Open webpage https://openbayes.com/console/org1/jobs/onl6jcbkgahd to view detailed container information

Container is running
Currently operating on organization org1...
Downloading, please wait
Download complete, file saved at /Users/username/test-data-download/test-cli.output.zip

Extracting, please wait
Extraction successful: Files extracted to /Users/username/test-data-download
Source file deleted: /Users/username/test-data-download/test-cli.output.zip
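
The `&&` operator runs the second command only when the first exits with status 0, which is what makes the download wait for the task. The same chaining can be sketched in Python with `subprocess`. The two stand-in commands below are placeholders so the sketch runs without bayes installed; `run_chain` is a hypothetical helper.

```python
import subprocess
import sys
from typing import Sequence


def run_chain(commands: Sequence[Sequence[str]]) -> int:
    """Run commands in order like `a && b`: stop at the first non-zero exit code."""
    for argv in commands:
        code = subprocess.run(argv).returncode
        if code != 0:
            return code
    return 0


# Stand-ins for the two bayes commands.
ok = [sys.executable, "-c", "raise SystemExit(0)"]
fail = [sys.executable, "-c", "raise SystemExit(3)"]

chain_ok = run_chain([ok, ok])         # both steps run
chain_stopped = run_chain([fail, ok])  # second step is skipped
```

To mirror the example above, the stand-ins would be replaced with `["bayes", "gear", "run", "task", "-f"]` and `["bayes", "gear", "download", "-t", "/Users/username/test-data-download", "-u"]`.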

Open container web interface with command line tool

We can directly open the web interface from the command line with the following command:

$ bayes gear open 6q848lathbdp

Currently operating on organization org1...
Opening task https://beta.openbayes.com/console/org1/jobs/6q848lathbdp
Jumping to browser...
Browser opened successfully.

You can also open it by container name:

$ bayes gear open practicalAI

Currently operating on organization org1...
Opening container https://openbayes.com/console/org1/containers/6q848lathbdp
Jumping to browser...
Browser opened successfully.

Alternatively, add the -o parameter at the end of the container execution command, and the command line tool will immediately open the corresponding web interface after upload or merge is completed:

$ bayes gear run workspace -o -f

Currently operating on organization org1...
Requesting server to create container...
Container created successfully
Open webpage https://openbayes.com/console/org1/jobs/52yaekv8nf91 to view detailed container information
Browser opened successfully.

Container is running

:::note The run, restart, and stop commands of bayes gear can all add the -o option at the end of the command. The command line will open the corresponding web interface in the browser after the container reaches the target state. :::

Track container logs and container status with command line tool

1. Log tracking

You can view running container logs through the command bayes gear logs. Adding the -f or --follow parameter will continuously track the container's log output:

$ bayes gear logs 1ekrvwi6uyac -f

[I 14:41:01.149 LabApp] Writing notebook server cookie secret to /root/.local/share/jupyter/runtime/notebook_cookie_secret
[W 14:41:01.433 LabApp] All authentication is disabled.  Anyone who can connect to this server will be able to run code.
[I 14:41:01.749 LabApp] JupyterLab extension loaded from /usr/local/lib/python3.6/site-packages/jupyterlab
[I 14:41:01.750 LabApp] JupyterLab application directory is /usr/local/share/jupyter/lab
[I 14:41:01.758 LabApp] Serving notebooks from local directory: /openbayes
[I 14:41:01.758 LabApp] Jupyter Notebook 6.1.4 is running at:
[I 14:41:01.758 LabApp] http://username-1ekrvwi6uyac-main:8888/jobs/username/jobs/1ekrvwi6uyac/
[I 14:41:01.758 LabApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
...
...
...
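
Log lines in this format can be filtered after capture, for example to surface warnings. The sketch below assumes the bracketed prefix is a one-letter level (I for info, W for warning), a timestamp, and a source name, as the sample output suggests; `parse_log_line` is a hypothetical helper and lines that do not match are skipped.

```python
import re
from typing import Optional

LOG_LINE = re.compile(
    r"^\[(?P<level>[A-Z]) (?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) (?P<source>\w+)\] (?P<message>.*)$"
)


def parse_log_line(line: str) -> Optional[dict[str, str]]:
    """Return level, time, source, and message, or None for other lines."""
    match = LOG_LINE.match(line.rstrip("\n"))
    return match.groupdict() if match else None


lines = [
    "[I 14:41:01.758 LabApp] Serving notebooks from local directory: /openbayes",
    "[W 14:41:01.433 LabApp] All authentication is disabled.  Anyone who can "
    "connect to this server will be able to run code.",
    "...",
]
entries = [parse_log_line(line) for line in lines]
warnings = [e["message"] for e in entries if e and e["level"] == "W"]
```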

2. Status Tracking

The subcommands run, restart, and stop of bayes gear all support adding the -f or --follow parameter to track container status.

:::caution Status tracking only works for "Python Script" and "Jupyter Workspace" tasks, and is ineffective for "Auto-tuning" tasks. :::

For the run and restart commands:

  • For "Python Script" tasks, it will track until the entire task startup is complete
  • For "Jupyter Workspace" tasks, it will track until the Jupyter workspace startup is complete
  • For "Auto-tuning" tasks, the --follow parameter has no effect

For the stop command:

  • For "Python Script" tasks, it will track the task until the container is completely shut down
  • For "Jupyter Workspace" tasks, it will track the task until the container is completely shut down
  • For "Auto-tuning" tasks, the --follow parameter has no effect
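
The rules above can be condensed into a lookup table, which may be handy when scripting around --follow. The table is transcribed from the lists in this section; `follow_until` is a hypothetical helper that returns None where --follow has no effect.

```python
from typing import Optional

# (command, task type) -> the point at which --follow stops tracking.
# "Auto-tuning" is absent on purpose: --follow has no effect for it.
FOLLOW_UNTIL = {
    ("run", "Python Script"): "task startup is complete",
    ("run", "Jupyter Workspace"): "Jupyter workspace startup is complete",
    ("restart", "Python Script"): "task startup is complete",
    ("restart", "Jupyter Workspace"): "Jupyter workspace startup is complete",
    ("stop", "Python Script"): "container is completely shut down",
    ("stop", "Jupyter Workspace"): "container is completely shut down",
}


def follow_until(command: str, task_type: str) -> Optional[str]:
    """Describe when --follow returns, or None if it has no effect."""
    return FOLLOW_UNTIL.get((command, task_type))


print(follow_until("stop", "Jupyter Workspace"))
print(follow_until("run", "Auto-tuning"))
```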