
Runtime Environment (Image) Overview

Common Dependencies in Runtime Environments

HyperAI pre-installs a large set of common dependencies in each runtime environment, so that containers do not spend time and resources downloading and installing them on every start.

The pre-installed dependencies fall into the following categories by purpose:

1. General Machine Learning Libraries

  • scikit-learn: A general-purpose machine learning library that includes a wide range of models, data analysis and data mining algorithms, and visualization tools
  • XGBoost: A high-performance implementation of gradient-boosted decision trees (GBDT); many winning Kaggle solutions are built on it
  • ONNX: An open format and library for exchanging deep learning models between frameworks
  • spaCy: An industrial-grade natural language processing library
  • LightGBM: A gradient boosting framework developed by Microsoft

2. Image Processing Tools

Commonly used image processing libraries:

  • OpenCV: A powerful image processing and computer vision library
  • Pillow: A commonly used image processing library for Python

3. Data Analysis Libraries

  • pandas
  • SciPy
  • Matplotlib
  • NumPy
  • h5py

How to Add Dependencies Not in the List

The default HyperAI runtime environment already includes a large number of dependencies for machine learning scenarios. If you need additional dependencies, you can install them using the methods below.

:::note The runtime environment is managed with Conda, so you can also use Conda to install additional dependencies. :::

:::danger Each runtime environment has a different CUDA version installed. When installing additional dependencies, make sure they match the CUDA version of the runtime environment. :::
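Before installing a GPU-dependent package, it can help to confirm which CUDA version the current environment provides. The following is a minimal sketch, assuming that `nvcc` is available on the PATH of the image (the documentation above does not confirm this). The helper names `parse_cuda_release` and `detect_cuda_release` are hypothetical and exist only for illustration.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_cuda_release(nvcc_output: str) -> Optional[str]:
    """Extract the 'X.Y' CUDA release from `nvcc --version` text, if present."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def detect_cuda_release() -> Optional[str]:
    """Return the CUDA release reported by nvcc, or None if nvcc is unavailable."""
    nvcc = shutil.which("nvcc")  # assumption: nvcc is installed in the image
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # works on Python 3.6 as well as 3.8
    )
    return parse_cuda_release(result.stdout)


print("CUDA release:", detect_cuda_release())
```

If the function returns `None`, the check was inconclusive rather than proof that CUDA is absent; compare the result against the CUDA version required by the package you plan to install.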

Installing Python Libraries

If your code needs additional dependencies when you upload it and run it in "Python script execution" mode, create a file named openbayes_requirements.txt or requirements.txt in the root directory of the uploaded code, list the required dependencies in it, and upload it along with your other files. The system installs these dependencies before it executes the "Python script".

The file uses the standard Python requirements.txt format. A typical file looks like this:

requirements.txt
jieba
tqdm==4.11.2

Here, jieba and tqdm are two libraries that can be installed via pip. Both are installed before the "Python script" is executed. The == in tqdm==4.11.2 pins the exact version to install.
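To make the file format concrete, here is a small sketch that reads requirement lines and separates the package name from an optional `==` pin. It only handles bare names and `==` pins, not the full requirements grammar that pip accepts, and the function name `parse_requirements` is hypothetical.

```python
from typing import List, Optional, Tuple


def parse_requirements(text: str) -> List[Tuple[str, Optional[str]]]:
    """Return (package, pinned_version) pairs; version is None when unpinned."""
    entries = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, sep, version = line.partition("==")
        entries.append((name.strip(), version.strip() if sep else None))
    return entries


example = """\
jieba
tqdm==4.11.2
"""
print(parse_requirements(example))
# → [('jieba', None), ('tqdm', '4.11.2')]
```

An unpinned entry such as jieba lets pip choose the version, while a pinned entry makes the installed version reproducible across runs.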

:::danger Do not arbitrarily change pre-installed dependencies such as tensorflow and pytorch. Different versions of tensorflow or pytorch have different underlying dependencies, so changing them may break the current environment. :::

Dependency Management via Conda

For "Workspace", see Managing Dependencies with Conda.

For "Python Script Execution", you can provide a file named conda-packages.txt in the root directory of the uploaded code. Each line of the file uses the following format:

[channel::]package[=version[=buildid]]

Here is an example:

conda-packages.txt
conda-forge::rdkit
conda-forge::pygpu
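As an illustration of the `[channel::]package[=version[=buildid]]` format, the sketch below splits one line into its parts. It is not how HyperAI or Conda parses the file; `CondaSpec` and `parse_conda_spec` are hypothetical names, and the version and build id used in the usage example are made up.

```python
from typing import NamedTuple, Optional


class CondaSpec(NamedTuple):
    channel: Optional[str]
    package: str
    version: Optional[str]
    build: Optional[str]


def parse_conda_spec(line: str) -> CondaSpec:
    """Split one conda-packages.txt line into its optional and required parts."""
    spec = line.strip()
    channel, sep, rest = spec.partition("::")
    if not sep:  # no channel prefix on this line
        channel, rest = None, spec
    parts = rest.split("=", 2)  # package, then optional version and build id
    package = parts[0]
    version = parts[1] if len(parts) > 1 else None
    build = parts[2] if len(parts) > 2 else None
    return CondaSpec(channel, package, version, build)


print(parse_conda_spec("conda-forge::rdkit"))
print(parse_conda_spec("somepkg=1.2=example_build"))  # illustrative values only
```

Only the package name is required; the channel, version, and build id narrow down which package Conda should install.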

If requirements.txt, openbayes_requirements.txt, and conda-packages.txt all exist, the dependencies in conda-packages.txt are installed first, followed by those in openbayes_requirements.txt and requirements.txt.

Installing Other Dependencies

If you are using "Jupyter Workspace", see the next section. For "Python Script Execution", you can install additional non-Python dependencies in either of the following ways:

  1. Include dependency installation commands in the "Execution Command"

    For example, to clone a git repository before running the program, you can use the following "Execution Command":

    $ git clone https://github.com/tensorflow/models.git && cd models && python ...
  2. Prepare a dependencies.sh script

    For dependencies that do not come from Conda or PyPI, you can provide a file named dependencies.sh in the root directory. It is executed by bash when "Python Script Execution" starts, before the dependencies from openbayes_requirements.txt, requirements.txt, and conda-packages.txt are installed.

    For example, a dependencies.sh script might contain:

    dependencies.sh
    git clone https://github.com/tensorflow/models.git
    cd models
    pip install -r requirements.txt
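
The processing order described in this section can be summarized in a short sketch. It only models the ordering and is not HyperAI's implementation; `INSTALL_ORDER` and `install_sequence` are hypothetical names, and the relative order of openbayes_requirements.txt and requirements.txt is assumed to follow the order in which the text lists them.

```python
from typing import Iterable, List

# Order described in this section: dependencies.sh runs first, then Conda
# packages, then the pip requirement files (their mutual order is assumed).
INSTALL_ORDER = [
    "dependencies.sh",
    "conda-packages.txt",
    "openbayes_requirements.txt",
    "requirements.txt",
]


def install_sequence(present_files: Iterable[str]) -> List[str]:
    """Return the dependency files that exist, in the order they are processed."""
    present = set(present_files)
    return [name for name in INSTALL_ORDER if name in present]


print(install_sequence(["main.py", "requirements.txt", "dependencies.sh"]))
# → ['dependencies.sh', 'requirements.txt']
```

Because dependencies.sh runs first, anything it installs with pip may later be changed by the requirement files if they list the same package.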

:::info The runtime environment runs Ubuntu Linux. To install additional system packages, you can use the apt-get or apt command. You usually need to run apt-get update or apt update before installing. :::

Installing Dependencies in Jupyter Workspace

In "Jupyter Workspace", you can install any dependencies you need, whether they are Python dependencies or other dependencies installed via apt.

For example, to install an additional Python dependency from a notebook cell, type ! followed by the pip installation command, such as !pip install jieba.

Similarly, you can install apt packages with !apt install xxx.

Persisting pip Dependencies

By default, additional pip dependencies installed under HyperAI are saved to the system disk and must be reinstalled when the container restarts. If you add the --user option when installing with pip, the dependencies are instead saved in the container's workspace (i.e., /openbayes/home), specifically in the /openbayes/home/.pylibs directory. When the "Workspace" is closed and started again, the .pylibs directory is restored to the same location in the container, and the installed dependencies still appear in pip list.
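To check whether a given package will survive a restart, you can look at where Python would import it from. The sketch below assumes only the /openbayes/home/.pylibs location named above; the helper names are hypothetical, and `json` is used in the demo simply because it is always available.

```python
import importlib.util
from pathlib import PurePosixPath
from typing import Optional

PYLIBS_DIR = "/openbayes/home/.pylibs"  # persistent location named in this section


def is_under_pylibs(path: str, pylibs_dir: str = PYLIBS_DIR) -> bool:
    """True if `path` is located inside the persistent .pylibs directory."""
    root = PurePosixPath(pylibs_dir)
    candidate = PurePosixPath(path)
    return root == candidate or root in candidate.parents


def module_location(module_name: str) -> Optional[str]:
    """Return the file a top-level module would be imported from, if any."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec is not None else None


location = module_location("json")  # replace with the package you installed
print(location, bool(location) and is_under_pylibs(location))
```

A package installed with pip install --user should report a location under .pylibs, while one installed without --user should not.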

1. Removing Unnecessary Dependencies

If you no longer want to keep these dependencies, add the command rm -rf /openbayes/home/.pylibs to the "Execution Command".

2. Requiring Different Python Versions

If you need a different Python version (the Python version under HyperAI is currently 3.6 or 3.8), you can create a new environment in the /openbayes/home directory and install the complete set of dependencies in it, as described in "Create a new environment under /openbayes/home".

:::info After creating a new environment with conda, do not install dependencies with pip's --user option. With --user, dependencies are placed in /openbayes/home/.pylibs instead of the newly created conda environment, which can easily cause dependency conflicts. :::
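One way to avoid this mistake in scripts is to decide whether to pass --user based on the active Conda environment. The sketch below relies on the CONDA_DEFAULT_ENV variable that Conda sets on activation, and assumes that any active environment other than "base" is one you created yourself; that assumption and the function name `pip_install_command` are not taken from the HyperAI documentation.

```python
import os
import sys
from typing import List, Mapping


def pip_install_command(
    package: str, environ: Mapping[str, str] = os.environ
) -> List[str]:
    """Build a pip command that adds --user only outside a custom Conda env."""
    command = [sys.executable, "-m", "pip", "install"]
    active_env = environ.get("CONDA_DEFAULT_ENV", "")
    # Assumption: any active environment other than "base" is user-created.
    if active_env in ("", "base"):
        command.append("--user")
    command.append(package)
    return command


# "my-env" is a hypothetical environment name used for illustration.
print(pip_install_command("jieba", {"CONDA_DEFAULT_ENV": "my-env"}))
print(pip_install_command("jieba", {}))
```

The returned list can be passed to `subprocess.check_call`; using `sys.executable -m pip` ensures the package is installed for the interpreter that is actually running.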

Installing Jupyter Workspace Extensions

"Jupyter Workspace" supports many extensions, and you can add the ones you need from a "Terminal". As an example, the following steps install jupyterlab-toc, an extension that automatically generates a table of contents for the file open in the editor.

Open a "Terminal" and enter the following command:

jupyter labextension install @jupyterlab/toc

Reopen a .ipynb file, and a "Table of Contents" tab appears on the left side. Click it to see the table of contents of the current file.