
Overview

The training recipe is just as important as the model architecture. Even with a good architecture, performance on the same data and model can vary greatly depending on the training recipe. NetsPresso Trainer not only provides models optimized for edge devices, but also lets you change the training configuration to train these models on various data. The optimal recipe depends on the data you want to train on, so use the options provided by NetsPresso Trainer to find the best recipe for your data.

Users can adjust the number of epochs and choose the optimizer and scheduler, as in the following example.

training:
  epochs: 300
  ema:
    name: constant_decay
    decay: 0.9999
  optimizer:
    name: adamw
    lr: 6e-5
    betas: [0.9, 0.999]
    weight_decay: 0.0005
  scheduler:
    name: cosine_no_sgdr
    warmup_epochs: 5
    warmup_bias_lr: 1e-5
    min_lr: 0.
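
To make the scheduler fields in the example above more concrete, here is a minimal Python sketch of a linear warmup followed by a single cosine decay with no restarts. It is an illustration only, not NetsPresso Trainer's implementation: the function name `lr_at_epoch` is hypothetical, and the sketch assumes that `warmup_bias_lr` is the learning rate at the start of warmup, that warmup is linear, and that the schedule is stepped once per epoch.

```python
import math


def lr_at_epoch(epoch, *, epochs=300, base_lr=6e-5, warmup_epochs=5,
                warmup_start_lr=1e-5, min_lr=0.0):
    """Hypothetical helper: linear warmup, then one cosine decay (no restarts).

    Assumption: `warmup_start_lr` plays the role of `warmup_bias_lr`
    in the example config; the real scheduler may differ.
    """
    if epoch < warmup_epochs:
        # Linear ramp from warmup_start_lr up to base_lr.
        return warmup_start_lr + (base_lr - warmup_start_lr) * epoch / warmup_epochs
    # Cosine decay from base_lr down to min_lr over the remaining epochs.
    progress = (epoch - warmup_epochs) / max(1, epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for e in (0, 5, 150, 300):
        print(e, lr_at_epoch(e))
```

Under these assumptions and with the values from the example, the learning rate starts at `1e-5`, reaches `6e-5` at epoch 5, and decays to `min_lr` by epoch 300.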

Field list

| Field | Description |
|---|---|
| training.epochs | (int) The total number of epochs for training the model. |
| training.ema | (dict, optional) The EMA configuration. Please refer to the EMA page for more details. If None, EMA is not applied. |
| training.optimizer | (dict) The optimizer configuration. Please refer to the list of supported optimizers for more details. |
| training.scheduler | (dict) The learning rate scheduler configuration. Please refer to the list of supported schedulers for more details. |
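
The field list can be read as a small set of checks on the `training` section. The sketch below expresses them in plain Python on an already-loaded dict. The helper `check_training_config` is hypothetical and not part of NetsPresso Trainer; requiring `epochs` to be positive and requiring a `name` entry in `optimizer` and `scheduler` are assumptions drawn from the example config rather than documented rules.

```python
def check_training_config(training):
    """Hypothetical helper: return a list of problems found in a `training` dict."""
    problems = []

    epochs = training.get("epochs")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(epochs, int) or isinstance(epochs, bool) or epochs <= 0:
        problems.append("training.epochs must be a positive int")

    for key in ("optimizer", "scheduler"):
        section = training.get(key)
        if not isinstance(section, dict):
            problems.append(f"training.{key} must be a dict")
        elif "name" not in section:  # assumption based on the example config
            problems.append(f"training.{key} needs a 'name' entry")

    ema = training.get("ema")  # optional: None means EMA is not applied
    if ema is not None and not isinstance(ema, dict):
        problems.append("training.ema must be a dict or None")

    return problems


# The example config from this page, written as a Python dict.
example = {
    "epochs": 300,
    "ema": {"name": "constant_decay", "decay": 0.9999},
    "optimizer": {"name": "adamw", "lr": 6e-5, "betas": [0.9, 0.999],
                  "weight_decay": 0.0005},
    "scheduler": {"name": "cosine_no_sgdr", "warmup_epochs": 5,
                  "warmup_bias_lr": 1e-5, "min_lr": 0.0},
}

if __name__ == "__main__":
    print(check_training_config(example))
```

Setting `ema` to `None` (or leaving it out) passes the same checks, matching the note that EMA is optional.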