
System Metrics and Custom Metrics

Displaying Key Metrics

HyperAI displays key metrics during execution by default. Currently supported metrics include CPU, memory, gpu-0-memory, gpu-0-util, and storage resources; more metrics will be supported in the future.

Custom Metrics

HyperAI provides a Python library called openbayestool for recording custom key metrics in Python programs and displaying them on the user's container execution page.

openbayestool is used as follows:

from openbayestool import log_param, log_metric, clear_metric

# Log parameter `learning_rate=0.01`
log_param('learning_rate', 0.01)

# When the same parameter is logged multiple times, only the last value is kept, i.e. `foo=3`
log_param('foo', 1)
log_param('foo', 2)
log_param('foo', 3)

# Log model execution result `precision=0.77`
log_metric('precision', 0.77)

# Logging the same metric multiple times appends the values, i.e. precision becomes [0.77, 0.79, 0.82, 0.86]
log_metric('precision', 0.79)
log_metric('precision', 0.82)
log_metric('precision', 0.86)

# Clear a custom metric; note that this can only be done inside a running container
clear_metric('precision')
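
For example, a plain training loop could record its hyperparameters once with log_param and then append one point per epoch with log_metric. The sketch below is illustrative only: train_one_epoch and evaluate are hypothetical placeholders for your own training code, and only the openbayestool calls come from the library.

from openbayestool import log_param, log_metric

def train(model, train_data, val_data, epochs=10, learning_rate=0.01):
    # Record hyperparameters once, before training starts
    log_param('epochs', epochs)
    log_param('learning_rate', learning_rate)

    for epoch in range(epochs):
        # Hypothetical helpers standing in for your own training and evaluation code
        train_loss = train_one_epoch(model, train_data, learning_rate)
        val_acc = evaluate(model, val_data)

        # Each call appends a new point, so the execution page shows a curve over epochs
        log_metric('train_loss', float(train_loss))
        log_metric('val_acc', float(val_acc))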

Integration with Keras

The Keras framework provides a Callback API that can invoke user-defined hooks at the batch or epoch level. Using this approach, we can record custom metrics with openbayestool during the Keras training process.

import tensorflow as tf
import openbayestool

class HyperAIMetricsCallback(tf.keras.callbacks.Callback):
    def on_batch_end(self, batch, logs=None):
        """Log training metrics every 300 batches."""
        logs = logs or {}
        if batch % 300 == 0:
            openbayestool.log_metric('acc', float(logs.get('acc')))
            openbayestool.log_metric('loss', float(logs.get('loss')))

model.fit(x_train, y_train,
          epochs=epochs,
          verbose=1,
          callbacks=[HyperAIMetricsCallback()])
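
If per-batch logging is too frequent, the same idea can be applied once per epoch through the Callback API's on_epoch_end hook. The sketch below is an assumed variant rather than part of the original example; it expects the training logs to contain 'acc' and 'loss' entries, which depends on how the model was compiled.

class HyperAIEpochMetricsCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        """Log training metrics once at the end of each epoch."""
        logs = logs or {}
        if 'acc' in logs:
            openbayestool.log_metric('acc', float(logs['acc']))
        if 'loss' in logs:
            openbayestool.log_metric('loss', float(logs['loss']))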

A complete code example can be found here.