# opennmt.utils.optim module

Optimization related functions.

opennmt.utils.optim.learning_rate_decay_fn(decay_type, decay_rate, decay_steps, decay_step_duration=1, staircase=True, start_decay_steps=0, minimum_learning_rate=0)[source]

Returns the learning rate decay function.

Parameters:

- decay_type – The type of decay. A function from tf.train or opennmt.utils.decay as a string.
- decay_rate – The decay rate to apply.
- decay_steps – The decay steps as described in the decay type function.
- decay_step_duration – The number of training steps that make 1 decay step.
- staircase – If True, learning rate is decayed in a staircase fashion.
- start_decay_steps – Start decay after this many steps.
- minimum_learning_rate – Do not decay past this learning rate value.

Returns: A function with signature (learning_rate, global_step) -> decayed_learning_rate.

Raises: ValueError – if decay_type can not be resolved.
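To show how the extra options combine with the underlying decay, here is a minimal plain-Python sketch of a closure with the documented (learning_rate, global_step) -> decayed_learning_rate signature. It assumes an exponential decay for illustration; make_decay_fn and its body are illustrative, not the library implementation.

```python
def make_decay_fn(decay_rate, decay_steps, decay_step_duration=1,
                  staircase=True, start_decay_steps=0, minimum_learning_rate=0.0):
    """Build a (learning_rate, global_step) -> decayed_learning_rate closure."""
    def decay_fn(learning_rate, global_step):
        # No decay before start_decay_steps; several training steps can
        # form one decay step via decay_step_duration.
        step = max(global_step - start_decay_steps, 0) // decay_step_duration
        # staircase=True decays at discrete intervals instead of continuously.
        exponent = step // decay_steps if staircase else step / decay_steps
        decayed = learning_rate * decay_rate ** exponent
        # Never decay past the configured floor.
        return max(decayed, minimum_learning_rate)
    return decay_fn

decay_fn = make_decay_fn(decay_rate=0.5, decay_steps=10)
```

With this sketch, `decay_fn(1.0, 0)` is 1.0, `decay_fn(1.0, 10)` is 0.5, and `decay_fn(1.0, 25)` is 0.25 because the staircase floors the exponent to 2.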
opennmt.utils.optim.learning_rate_decay_fn_v2(decay_type, decay_params=None, decay_step_duration=1, start_decay_step=0, minimum_learning_rate=0.0)[source]

Returns the learning rate decay function.

Parameters:

- decay_type – The type of decay. A function from tf.train or opennmt.utils.decay as a string.
- decay_params – Additional parameters for the decay function.
- decay_step_duration – The number of training steps that make 1 decay step.
- start_decay_step – Start decay after this many steps.
- minimum_learning_rate – Do not decay past this learning rate value.

Returns: A function with signature (learning_rate, global_step) -> decayed_learning_rate.

Raises: ValueError – if decay_type can not be resolved.
opennmt.utils.optim.get_optimizer_class(classname)[source]

Returns the optimizer class.

Parameters:

- classname – The name of the optimizer class in tf.train, tf.contrib.opt, or opennmt.optimizers as a string.

Returns: A class inheriting from tf.train.Optimizer.

Raises: ValueError – if classname can not be resolved.
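Resolving a class by name across several candidate modules can be sketched with a getattr search, first match wins. The helper get_class_by_name below is hypothetical (the demo uses a stdlib module so it runs without TensorFlow); it only illustrates the lookup pattern, not the library code.

```python
import collections

def get_class_by_name(classname, modules):
    """Hypothetical helper: return classname from the first module defining it."""
    for module in modules:
        cls = getattr(module, classname, None)
        if cls is not None:
            return cls
    raise ValueError("class %s can not be resolved" % classname)

# In the real function the search list would be tf.train, tf.contrib.opt,
# and opennmt.optimizers; a stdlib module stands in here.
resolved = get_class_by_name("OrderedDict", [collections])
```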
opennmt.utils.optim.optimize(*args, **kwargs)[source]

Wrapper around optimize_loss for backward compatibility.

opennmt.utils.optim.optimize_loss(loss, params, mixed_precision=False, var_list=None, hvd=None)[source]

Minimizes the loss.

Parameters:

- loss – The loss to minimize.
- params – A dictionary of hyperparameters.
- mixed_precision – If True, wraps the optimizer to maintain a float32 copy of the weights.
- var_list – The variables to update.
- hvd – Optional Horovod object.

Returns: The loss minimization op and a list of internal variables to initialize.
opennmt.utils.optim.delayed_update(optimizer, grads_and_vars, global_step, accum_count=1)[source]

Possibly delays the parameters update by first accumulating gradients.

Parameters:

- optimizer – The optimizer.
- grads_and_vars – List of (gradient, variable) pairs.
- global_step – The training step that will be increased when the parameters are updated.
- accum_count – The number of times to accumulate gradients, as a constant or a tf.Tensor.

Returns: An operation that conditionally applies the gradients and a list of internal variables to initialize.
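The accumulate-then-apply idea behind this function can be sketched without TensorFlow: gradients are summed into a buffer and the parameter update fires only every accum_count calls. delayed_update_sketch below is an illustrative plain-Python stand-in, not the graph op the library builds.

```python
def delayed_update_sketch(apply_gradients, accum_count):
    """Return an update function that applies gradients every accum_count calls."""
    state = {"step": 0, "buffer": None}

    def update(gradients):
        if state["buffer"] is None:
            state["buffer"] = [0.0] * len(gradients)
        # Accumulate this step's gradients into the buffer.
        for i, g in enumerate(gradients):
            state["buffer"][i] += g
        state["step"] += 1
        # Conditionally apply: only once per accum_count accumulation steps.
        if state["step"] % accum_count == 0:
            apply_gradients(list(state["buffer"]))
            state["buffer"] = [0.0] * len(state["buffer"])

    return update
```

With accum_count=2, two calls with gradients [1.0] and [2.0] trigger a single apply of the summed gradient [3.0], which is how a larger effective batch size is simulated.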
opennmt.utils.optim.regularization_penalty(regularization_type, scale, weights_list=None)[source]

Computes the weights regularization penalty.

Parameters:

- regularization_type – The regularization type: l1, l2, or l1_l2.
- scale – The regularization multiplier. If regularization_type is l1_l2, this should be a list or tuple containing the L1 regularization scale and the L2 regularization scale.
- weights_list – The list of weights. Defaults to non-bias variables.

Returns: The regularization penalty.

Raises: ValueError – if regularization_type is invalid or is l1_l2 but scale is not a sequence.
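As a rough sketch of the three penalty types, the simplified stand-in below computes them over plain lists of floats. It is illustrative only; the library operates on TensorFlow variables and its exact scaling of the L2 term may differ.

```python
def penalty_sketch(regularization_type, scale, weights_list):
    """Simplified weights regularization penalty over lists of floats."""
    regularization_type = regularization_type.lower()
    if regularization_type == "l1":
        return scale * sum(abs(w) for weights in weights_list for w in weights)
    if regularization_type == "l2":
        return scale * sum(w * w for weights in weights_list for w in weights)
    if regularization_type == "l1_l2":
        # l1_l2 expects a pair of scales: (l1_scale, l2_scale).
        if not isinstance(scale, (list, tuple)) or len(scale) != 2:
            raise ValueError("l1_l2 regularization requires a pair of scale values")
        l1_scale, l2_scale = scale
        return (penalty_sketch("l1", l1_scale, weights_list)
                + penalty_sketch("l2", l2_scale, weights_list))
    raise ValueError("invalid regularization type: %s" % regularization_type)
```

For example, an L1 penalty with scale 0.1 over weights [1.0, -2.0] and [3.0] is 0.1 * (1 + 2 + 3) = 0.6.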