Stefano Bonicatti
02/06/2021, 7:59 PM
The normal level allows for 10 restarts if the limits are violated. The restrictive level allows for only 4.
But looking at the code, what is used as the restart count is the RESPAWN_LIMIT limit, which is defined as:
// Number of seconds the worker should run, else consider the exit fatal.
{WatchdogLimitType::RESPAWN_LIMIT, {4, 4, 1000}},
and it's used both as a number of seconds in some places and as a number of restarts in another.
It seems that the restart count limit has been added later and the normal setting (the first value) has been reduced from 10 to 4.
Shouldn't we separate these limits? One for the restart count, one for the seconds?
And if yes, would it make sense then to put back 10 restarts for normal and 4 for restrictive, and keep 4, 4 for the number of seconds?
Finally, the issue highlights the fact that the disabled level is not really disabled; it simply has a high value, which could still realistically be hit.
Shouldn't we actually ignore that limit if the level is set to disabled?
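To make the question concrete, here is a rough sketch of what splitting the two limits could look like. Everything in it is hypothetical and not the existing osquery code: the names `WatchdogLevel`, `LevelLimits`, `kMaxRespawns`, `kMinRuntimeSeconds`, `tooManyRespawns` and `exitedTooQuickly` are made up for illustration, and the values are just the ones proposed above (10 and 4 restarts, 4 and 4 seconds). The disabled level returns no limit at all, so it can never be hit.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical watchdog levels, for illustration only.
enum class WatchdogLevel { Normal, Restrictive, Disabled };

// One value per enforced level; the disabled level has no entry on purpose.
struct LevelLimits {
  std::uint64_t normal;
  std::uint64_t restrictive;
};

// Assumed values, taken from the proposal in this message.
// Maximum number of worker respawns before giving up.
constexpr LevelLimits kMaxRespawns{10, 4};
// Number of seconds the worker should run, else consider the exit fatal.
constexpr LevelLimits kMinRuntimeSeconds{4, 4};

// Returns the limit for a level, or std::nullopt when the level is disabled.
std::optional<std::uint64_t> limitFor(const LevelLimits& limits,
                                      WatchdogLevel level) {
  switch (level) {
    case WatchdogLevel::Normal:
      return limits.normal;
    case WatchdogLevel::Restrictive:
      return limits.restrictive;
    case WatchdogLevel::Disabled:
      return std::nullopt;
  }
  return std::nullopt;
}

// True when the respawn count is above the limit; never true when disabled.
bool tooManyRespawns(WatchdogLevel level, std::uint64_t respawns) {
  const auto limit = limitFor(kMaxRespawns, level);
  return limit.has_value() && respawns > *limit;
}

// True when the worker ran for less than the minimum; never true when disabled.
bool exitedTooQuickly(WatchdogLevel level, std::uint64_t ran_seconds) {
  const auto limit = limitFor(kMinRuntimeSeconds, level);
  return limit.has_value() && ran_seconds < *limit;
}
```

With this shape, the two meanings can no longer be mixed up by accident, and "disabled" is expressed as the absence of a limit rather than as a very large number.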