opennmt.encoders.self_attention_encoder module

Define the self-attention encoder.

class opennmt.encoders.self_attention_encoder.SelfAttentionEncoder(num_layers, num_units=512, num_heads=8, ffn_inner_dim=2048, dropout=0.1, attention_dropout=0.1, relu_dropout=0.1, position_encoder=)[source]

Bases: opennmt.encoders.encoder.Encoder

Encoder using self-attention as described in https://arxiv.org/abs/1706.03762.

__init__(num_layers, num_units=512, num_heads=8, ffn_inner_dim=2048, dropout=0.1, attention_dropout=0.1, relu_dropout=0.1, position_encoder=)[source]

Initializes the parameters of the encoder.

Parameters:
  • num_layers – The number of layers.
  • num_units – The number of hidden units.
  • num_heads – The number of heads in the multi-head attention.
  • ffn_inner_dim – The number of units of the inner linear transformation in the feed-forward layer.
  • dropout – The probability of dropping units from the outputs.
  • attention_dropout – The probability of dropping units from the attention weights.
  • relu_dropout – The probability of dropping units from the ReLU activation in the feed-forward layer.
  • position_encoder – The opennmt.layers.position.PositionEncoder to apply on the inputs, or None.
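
A minimal construction sketch (the Transformer-base values and the SinusoidalPositionEncoder below are illustrative choices, not defaults stated on this page):

    from opennmt.encoders.self_attention_encoder import SelfAttentionEncoder
    from opennmt.layers.position import SinusoidalPositionEncoder

    # Transformer-base style configuration (illustrative values).
    encoder = SelfAttentionEncoder(
        num_layers=6,
        num_units=512,
        num_heads=8,
        ffn_inner_dim=2048,
        dropout=0.1,
        attention_dropout=0.1,
        relu_dropout=0.1,
        position_encoder=SinusoidalPositionEncoder())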
encode(inputs, sequence_length=None, mode='train')[source]

Encodes an input sequence.

Parameters:
  • inputs – The inputs to encode of shape [B, T, ...].
  • sequence_length – The length of each input with shape [B].
  • mode – A tf.estimator.ModeKeys mode.
Returns:

A tuple (outputs, state, sequence_length).
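
A usage sketch for encode(), assuming TF1 graph mode, the encoder built above, and inputs that are already embedded to depth num_units:

    import tensorflow as tf

    # Pre-embedded inputs of shape [B, T, depth] and their lengths of shape [B].
    inputs = tf.placeholder(tf.float32, shape=[None, None, 512])
    lengths = tf.placeholder(tf.int32, shape=[None])

    outputs, state, out_lengths = encoder.encode(
        inputs,
        sequence_length=lengths,
        mode=tf.estimator.ModeKeys.TRAIN)
    # outputs has shape [B, T, num_units].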

class opennmt.encoders.self_attention_encoder.SelfAttentionEncoderV2(num_layers, num_units=512, num_heads=8, ffn_inner_dim=2048, dropout=0.1, attention_dropout=0.1, ffn_dropout=0.1, ffn_activation=, position_encoder=, **kwargs)[source]

Bases: opennmt.encoders.encoder.Encoder

Encoder using self-attention as described in https://arxiv.org/abs/1706.03762.

Note

TensorFlow 2.0 version.

__init__(num_layers, num_units=512, num_heads=8, ffn_inner_dim=2048, dropout=0.1, attention_dropout=0.1, ffn_dropout=0.1, ffn_activation=, position_encoder=, **kwargs)[source]

Initializes the parameters of the encoder.

Parameters:
  • num_layers – The number of layers.
  • num_units – The number of hidden units.
  • num_heads – The number of heads in the multi-head attention.
  • ffn_inner_dim – The number of units of the inner linear transformation in the feed-forward layer.
  • dropout – The probability of dropping units from the outputs.
  • attention_dropout – The probability of dropping units from the attention weights.
  • ffn_dropout – The probability of dropping units from the activation output in the feed-forward layer.
  • ffn_activation – The activation function to apply between the two linear transformations of the feed-forward layer.
  • position_encoder – The opennmt.layers.position.PositionEncoder to apply on the inputs.
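
A minimal construction sketch for the TensorFlow 2.0 encoder (tf.nn.relu and SinusoidalPositionEncoder are assumptions for illustration, not defaults stated on this page):

    import tensorflow as tf
    from opennmt.encoders.self_attention_encoder import SelfAttentionEncoderV2
    from opennmt.layers.position import SinusoidalPositionEncoder

    # Transformer-base style configuration (illustrative values).
    encoder = SelfAttentionEncoderV2(
        num_layers=6,
        num_units=512,
        num_heads=8,
        ffn_inner_dim=2048,
        dropout=0.1,
        attention_dropout=0.1,
        ffn_dropout=0.1,
        ffn_activation=tf.nn.relu,
        position_encoder=SinusoidalPositionEncoder())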
call(inputs, sequence_length=None, training=None)[source]

Encodes an input sequence.
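
A usage sketch; since the encoder is a Keras layer it can be called directly on pre-embedded inputs, reusing the encoder built above, and it is assumed here to return the same (outputs, state, sequence_length) tuple as the V1 encode() (the dummy shapes are illustrative):

    import tensorflow as tf

    # Dummy pre-embedded batch: 4 sequences, up to 20 timesteps, depth 512.
    inputs = tf.random.normal([4, 20, 512])
    lengths = tf.constant([20, 18, 15, 9], dtype=tf.int32)

    # Assumed to return (outputs, state, sequence_length), mirroring encode().
    outputs, state, out_lengths = encoder(
        inputs, sequence_length=lengths, training=True)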