
Missing argument error in trainer.py when setting lr_scheduler #3

@bc-bytes

Description

In trainer.py (line 101), lr_scheduler.step() is called without the metric argument that ReduceLROnPlateau requires. Here's how I define the optimizer and lr_scheduler in train.py:

import torch
from torch.optim import lr_scheduler

optimizer = torch.optim.Adam(params, lr=0.001, weight_decay=0.0001)
scheduler = lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.1, patience=6, verbose=1, min_lr=0.000001)
trainer = Trainer(
    model,
    criterion=BCEDiceLoss(),
    optimizer=optimizer,
    lr_scheduler=scheduler,
    epochs=500
)

That throws the following error:

File "/home/seg/backbones_unet/utils/trainer.py", line 127, in _train_one_epoch
    self.lr_scheduler.step()
TypeError: step() missing 1 required positional argument: 'metrics'

If I then change line 101 to self.lr_scheduler.step(loss), the error goes away. However, once training starts I see this:

Training Model on 500 epochs:   0%|                            | 0/500 [00:00<?, ?it/s
Epoch 00032: reducing learning rate of group 0 to 1.0000e-04.  | 31/800 [00:03<00:54, 14.03 training-batch/s, loss=1.1]
Epoch 00054: reducing learning rate of group 0 to 1.0000e-05.  | 53/800 [00:04<00:51, 14.38 training-batch/s, loss=1.09]
Epoch 00062: reducing learning rate of group 0 to 1.0000e-06.  | 61/800 [00:05<00:54, 13.45 training-batch/s, loss=1.06]
^Epoch 1:  13%|██████████████████                              | 104/800 [00:08<00:58, 11.84 training-batch/s, loss=1.05]

I haven't seen this before when training models with code from other repos: the learning rate is reduced all the way to its minimum within the first epoch. If that is normal, then all is OK; I just wanted to report the missing-argument error in trainer.py.
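For reference, here is a minimal sketch (a toy model with a constant loss, not the repo's trainer) of how ReduceLROnPlateau is normally driven: step(metric) takes the monitored loss and is usually called once per epoch. If step(loss) runs once per batch instead, the patience counter ticks per batch, which would explain the reductions appearing within epoch 1 in the log above.

```python
import torch
from torch import nn
from torch.optim.lr_scheduler import ReduceLROnPlateau

# Toy model and the same scheduler settings as in train.py above.
model = nn.Linear(4, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=0.0001)
scheduler = ReduceLROnPlateau(optimizer, factor=0.1, patience=6, min_lr=0.000001)

for epoch in range(10):
    epoch_loss = 1.0  # stand-in for the epoch's averaged training/validation loss
    # step() requires the monitored metric, and belongs here, once per
    # epoch -- not inside the batch loop.
    scheduler.step(epoch_loss)

# With a flat loss and patience=6, the LR is reduced once over these 10 epochs.
print(optimizer.param_groups[0]["lr"])
```

Called per epoch like this, the LR drops only after patience is actually exhausted across epochs, rather than after a handful of batches.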
