opennmt.utils.optim module

Optimization related functions.

opennmt.utils.optim.learning_rate_decay_fn(decay_type, decay_rate, decay_steps, decay_step_duration=1, staircase=True, start_decay_steps=0, minimum_learning_rate=0)[source]

Returns the learning rate decay function.

Parameters:
  • decay_type – The type of decay. The name of a decay function from tf.train or opennmt.utils.decay, given as a string.
  • decay_rate – The decay rate to apply.
  • decay_steps – The decay steps as described in the decay type function.
  • decay_step_duration – The number of training steps that make up one decay step.
  • staircase – If True, learning rate is decayed in a staircase fashion.
  • start_decay_steps – Start decay after this many steps.
  • minimum_learning_rate – Do not decay past this learning rate value.
Returns:

A function with signature (learning_rate, global_step) -> decayed_learning_rate.

Raises:

ValueError – if decay_type cannot be resolved.
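
A minimal usage sketch, assuming TensorFlow 1.x and the exponential_decay schedule from tf.train; all hyperparameter values are arbitrary:

    import tensorflow as tf

    from opennmt.utils.optim import learning_rate_decay_fn

    # "exponential_decay" resolves to tf.train.exponential_decay.
    decay_fn = learning_rate_decay_fn(
        "exponential_decay",
        decay_rate=0.7,
        decay_steps=10000,
        staircase=True,
        start_decay_steps=50000,
        minimum_learning_rate=1e-5)

    global_step = tf.train.get_or_create_global_step()

    # Returns a scalar tensor holding the decayed learning rate.
    learning_rate = decay_fn(1.0, global_step)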

opennmt.utils.optim.learning_rate_decay_fn_v2(decay_type, decay_params=None, decay_step_duration=1, start_decay_step=0, minimum_learning_rate=0.0)[source]

Returns the learning rate decay function.

Parameters:
  • decay_type – The type of decay. The name of a decay function from tf.train or opennmt.utils.decay, given as a string.
  • decay_params – Additional parameters for the decay function.
  • decay_step_duration – The number of training steps that make up one decay step.
  • start_decay_step – Start decay after this many steps.
  • minimum_learning_rate – Do not decay past this learning rate value.
Returns:

A function with signature (learning_rate, global_step) -> decayed_learning_rate.

Raises:

ValueError – if decay_type cannot be resolved.
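
The equivalent schedule with the v2 interface, sketched under the assumption that decay_params is forwarded as keyword arguments to the underlying decay function:

    import tensorflow as tf

    from opennmt.utils.optim import learning_rate_decay_fn_v2

    decay_fn = learning_rate_decay_fn_v2(
        "exponential_decay",
        decay_params={
            "decay_rate": 0.7,
            "decay_steps": 10000,
            "staircase": True,
        },
        start_decay_step=50000,
        minimum_learning_rate=1e-5)

    global_step = tf.train.get_or_create_global_step()
    learning_rate = decay_fn(1.0, global_step)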

opennmt.utils.optim.get_optimizer_class(classname)[source]

Returns the optimizer class.

Parameters:
  • classname – The name of the optimizer class in tf.train, tf.contrib.opt, or opennmt.optimizers as a string.
Returns:

A class inheriting from tf.train.Optimizer.

Raises:

ValueError – if classname cannot be resolved.
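
A minimal sketch of resolving an optimizer by name, assuming TensorFlow 1.x where tf.train.AdamOptimizer is available:

    from opennmt.utils.optim import get_optimizer_class

    # "AdamOptimizer" resolves to tf.train.AdamOptimizer.
    optimizer_class = get_optimizer_class("AdamOptimizer")
    optimizer = optimizer_class(learning_rate=0.001)
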
opennmt.utils.optim.optimize(*args, **kwargs)[source]

Wrapper around optimize_loss for backward compatibility.

opennmt.utils.optim.optimize_loss(loss, params, mixed_precision=False, var_list=None, hvd=None)[source]

Minimizes the loss.

Parameters:
  • loss – The loss to minimize.
  • params – A dictionary of hyperparameters.
  • mixed_precision – If True, wraps the optimizer to maintain a float32 copy of the weights.
  • var_list – The variables to update.
  • hvd – Optional Horovod object.
Returns:

The loss minimization op and a list of internal variables to initialize.
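
A minimal training sketch with a dummy loss; the params keys shown are assumptions that mirror the OpenNMT-tf training configuration:

    import tensorflow as tf

    from opennmt.utils.optim import optimize_loss

    # Dummy variable and loss for illustration.
    w = tf.get_variable("w", shape=[10])
    loss = tf.reduce_mean(tf.square(w))

    train_op, extra_variables = optimize_loss(
        loss,
        params={  # Hypothetical hyperparameter values.
            "optimizer": "AdamOptimizer",
            "learning_rate": 0.001,
        })

The returned extra_variables must be initialized alongside the model variables before running train_op.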

opennmt.utils.optim.delayed_update(optimizer, grads_and_vars, global_step, accum_count=1)[source]

Possibly delays the parameter updates by first accumulating gradients.

Parameters:
  • optimizer – The optimizer.
  • grads_and_vars – List of (gradient, variable) pairs.
  • global_step – The training step that is incremented when the parameters are updated.
  • accum_count – The number of times to accumulate gradients, as a constant or a tf.Tensor.
Returns:

An operation that conditionally applies the gradients and a list of internal variables to initialize.
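
A sketch of accumulating gradients over 4 training steps before applying them, using a standard TensorFlow 1.x optimizer and a dummy loss:

    import tensorflow as tf

    from opennmt.utils.optim import delayed_update

    # Dummy variable and loss for illustration.
    w = tf.get_variable("w", shape=[10])
    loss = tf.reduce_mean(tf.square(w))

    optimizer = tf.train.AdamOptimizer(learning_rate=0.001)
    global_step = tf.train.get_or_create_global_step()
    grads_and_vars = optimizer.compute_gradients(loss)

    # Applies the gradients every 4 steps; on the other steps the
    # returned op only accumulates them.
    update_op, extra_variables = delayed_update(
        optimizer, grads_and_vars, global_step, accum_count=4)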

opennmt.utils.optim.regularization_penalty(regularization_type, scale, weights_list=None)[source]

Computes the weights regularization penalty.

Parameters:
  • regularization_type – The regularization type: l1, l2, or l1_l2.
  • scale – The regularization multiplier. If regularization_type is l1_l2, this should be a list or tuple containing the L1 regularization scale and the L2 regularization scale.
  • weights_list – The list of weights. Defaults to all non-bias variables.
Returns:

The regularization penalty.

Raises:

ValueError – if regularization_type is invalid or is l1_l2 but scale is not a sequence.
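
For example, adding a weight regularization term to the training loss (scale values are arbitrary):

    from opennmt.utils.optim import regularization_penalty

    # Assume loss is the scalar training loss computed elsewhere.
    penalty = regularization_penalty("l2", 1e-4)
    total_loss = loss + penalty

    # l1_l2 expects a sequence with the L1 scale then the L2 scale.
    penalty = regularization_penalty("l1_l2", (1e-4, 1e-5))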