System Metrics and Custom Metrics
Displaying Key Metrics
By default, HyperAI displays key metrics while a task is running. Currently supported metrics include CPU, memory, gpu-0-memory, gpu-0-util, and storage. More metrics will be supported in the future.
Custom Metrics
HyperAI provides a Python library called openbayestool for recording custom metrics from Python programs and displaying them on the user's container execution page.
The usage of openbayestool is as follows:
from openbayestool import log_param, log_metric, clear_metric
# Log parameter `learning_rate=0.01`
log_param('learning_rate', 0.01)
# For the same parameter, only the last value is kept, i.e. `foo=3`
log_param('foo', 1)
log_param('foo', 2)
log_param('foo', 3)
# Log model execution result `precision=0.77`
log_metric('precision', 0.77)
# Repeated calls for the same metric append to its list of values; the three calls below add [0.79, 0.82, 0.86] after the 0.77 logged above
log_metric('precision', 0.79)
log_metric('precision', 0.82)
log_metric('precision', 0.86)
# Clear a custom metric; this only works inside a running container
clear_metric('precision')
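Before wiring these calls into a framework, it can help to see them in a plain training loop. The sketch below is a minimal illustration only; run_one_epoch and evaluate are hypothetical placeholders for your own training and evaluation code.

from openbayestool import log_param, log_metric

def train(model, data, epochs=10, learning_rate=0.01):
    # Record hyperparameters once, before training starts
    log_param('learning_rate', learning_rate)
    log_param('epochs', epochs)

    for epoch in range(epochs):
        run_one_epoch(model, data, learning_rate)  # hypothetical: one pass over the data
        acc = evaluate(model, data)                # hypothetical: compute validation accuracy
        # Each call appends another value to the 'accuracy' list shown on the job page
        log_metric('accuracy', float(acc))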
Integration with Keras
The Keras framework provides a Callback API that can hook into training at the batch or epoch level. Using this approach, we can add custom metrics from openbayestool to the Keras training process.
import tensorflow as tf
import openbayestool

class HyperAIMetricsCallback(tf.keras.callbacks.Callback):
    def on_batch_end(self, batch, logs=None):
        """Log training metrics every 300 batches."""
        if batch % 300 == 0:
            # Metric keys ('acc', 'loss') match the names Keras reports in `logs`
            openbayestool.log_metric('acc', float(logs.get('acc')))
            openbayestool.log_metric('loss', float(logs.get('loss')))
model.fit(x_train, y_train,
          epochs=epochs,
          verbose=1,
          callbacks=[HyperAIMetricsCallback()])
A complete code example can be found here.
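As noted above, the Callback API also supports epoch-level hooks. The following is a minimal sketch rather than part of the official example: it logs once per epoch via on_epoch_end, and the metric keys ('loss', 'accuracy', 'val_accuracy') depend on how the model was compiled.

import tensorflow as tf
import openbayestool

class HyperAIEpochMetricsCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        """Log metrics once per epoch instead of every N batches."""
        logs = logs or {}
        # Only log keys that Keras actually reported for this model
        for key in ('loss', 'accuracy', 'val_accuracy'):
            if key in logs:
                openbayestool.log_metric(key, float(logs[key]))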