# opennmt.encoders.self_attention_encoder module

Define the self-attention encoder.

class opennmt.encoders.self_attention_encoder.SelfAttentionEncoder(num_layers, num_units=512, num_heads=8, ffn_inner_dim=2048, dropout=0.1, attention_dropout=0.1, relu_dropout=0.1, position_encoder=)[source]

Encoder using self-attention as described in https://arxiv.org/abs/1706.03762.

__init__(num_layers, num_units=512, num_heads=8, ffn_inner_dim=2048, dropout=0.1, attention_dropout=0.1, relu_dropout=0.1, position_encoder=)[source]

Initializes the parameters of the encoder.

Parameters:

- num_layers – The number of layers.
- num_units – The number of hidden units.
- num_heads – The number of heads in the multi-head attention.
- ffn_inner_dim – The number of units of the inner linear transformation in the feed forward layer.
- dropout – The probability to drop units from the outputs.
- attention_dropout – The probability to drop units from the attention.
- relu_dropout – The probability to drop units from the ReLU activation in the feed forward layer.
- position_encoder – The opennmt.layers.position.PositionEncoder to apply on inputs, or None.
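To make the relationship between num_units and num_heads concrete, here is a minimal NumPy sketch of multi-head scaled dot-product self-attention over a single sequence. This is an illustration of the shape contract only, not OpenNMT's implementation (which adds learned projections, masking, dropout, residual connections, and layer normalization):

```python
import numpy as np

def multi_head_self_attention(x, num_heads):
    # x: [T, num_units]; num_units must be divisible by num_heads.
    T, num_units = x.shape
    assert num_units % num_heads == 0
    d = num_units // num_heads
    heads = x.reshape(T, num_heads, d).transpose(1, 0, 2)    # [H, T, d]
    scores = heads @ heads.transpose(0, 2, 1) / np.sqrt(d)   # [H, T, T]
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)                # softmax over keys
    out = weights @ heads                                    # [H, T, d]
    return out.transpose(1, 0, 2).reshape(T, num_units)      # [T, num_units]

x = np.random.rand(5, 512).astype(np.float32)
y = multi_head_self_attention(x, num_heads=8)
print(y.shape)  # (5, 512)
```

The output has the same shape as the input, which is what lets the encoder stack num_layers of these blocks.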
encode(inputs, sequence_length=None, mode='train')[source]

Encodes an input sequence.

Parameters:

- inputs – The inputs to encode of shape $$[B, T, ...]$$.
- sequence_length – The length of each input with shape $$[B]$$.
- mode – A tf.estimator.ModeKeys mode.

Returns: A tuple (outputs, state, sequence_length).
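The mode argument matters mainly because the dropout probabilities above are only active during training. The toggle can be sketched in plain Python with inverted dropout (illustrative only; OpenNMT applies this inside TensorFlow, keyed on tf.estimator.ModeKeys):

```python
import random

def dropout(values, rate, training):
    # Inverted dropout: zero out units with probability `rate` and rescale
    # survivors by 1/(1-rate) during training; identity otherwise.
    if not training or rate == 0.0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if random.random() < keep else 0.0 for v in values]

print(dropout([1.0, 2.0, 3.0], rate=0.1, training=False))  # [1.0, 2.0, 3.0]
```

With mode='train' the encoder behaves like training=True here; for evaluation and inference modes, dropout is a no-op.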
class opennmt.encoders.self_attention_encoder.SelfAttentionEncoderV2(num_layers, num_units=512, num_heads=8, ffn_inner_dim=2048, dropout=0.1, attention_dropout=0.1, ffn_dropout=0.1, ffn_activation=, position_encoder=, **kwargs)[source]

Encoder using self-attention as described in https://arxiv.org/abs/1706.03762.

Note

TensorFlow 2.0 version.

__init__(num_layers, num_units=512, num_heads=8, ffn_inner_dim=2048, dropout=0.1, attention_dropout=0.1, ffn_dropout=0.1, ffn_activation=, position_encoder=, **kwargs)[source]

Initializes the parameters of the encoder.

Parameters:

- num_layers – The number of layers.
- num_units – The number of hidden units.
- num_heads – The number of heads in the multi-head attention.
- ffn_inner_dim – The number of units of the inner linear transformation in the feed forward layer.
- dropout – The probability to drop units from the outputs.
- attention_dropout – The probability to drop units from the attention.
- ffn_dropout – The probability to drop units from the activation output in the feed forward layer.
- ffn_activation – The activation function to apply between the two linear transformations of the feed forward layer.
- position_encoder – The opennmt.layers.position.PositionEncoder to apply on inputs.
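A position_encoder is needed because self-attention is order-invariant on its own. The sinusoidal scheme from the referenced paper (https://arxiv.org/abs/1706.03762) can be sketched in plain Python as follows; OpenNMT's PositionEncoder classes implement the same idea as TensorFlow layers added to the inputs:

```python
import math

def sinusoidal_position_encoding(T, num_units):
    # PE(pos, 2i)   = sin(pos / 10000^(2i/num_units))
    # PE(pos, 2i+1) = cos(pos / 10000^(2i/num_units))
    pe = [[0.0] * num_units for _ in range(T)]
    for pos in range(T):
        for i in range(0, num_units, 2):
            angle = pos / (10000 ** (i / num_units))
            pe[pos][i] = math.sin(angle)
            if i + 1 < num_units:
                pe[pos][i + 1] = math.cos(angle)
    return pe  # [T, num_units], added to the input embeddings

pe = sinusoidal_position_encoding(T=4, num_units=8)
print(len(pe), len(pe[0]))  # 4 8
```

Each position gets a distinct, deterministic pattern, so no positional parameters need to be learned.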
call(inputs, sequence_length=None, training=None)[source]

Encodes inputs.
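The ffn_inner_dim and ffn_activation parameters configure the position-wise feed-forward sublayer that follows attention in each encoder layer. A minimal NumPy sketch of that sublayer (biases, dropout, and the residual connection used by the actual layer are omitted):

```python
import numpy as np

def position_wise_ffn(x, w1, w2, activation=lambda v: np.maximum(v, 0.0)):
    # FFN(x) = activation(x W1) W2, applied identically at every position.
    # `activation` defaults to ReLU here, matching the paper's formulation.
    return activation(x @ w1) @ w2

num_units, ffn_inner_dim = 512, 2048
x = np.random.rand(5, num_units)
w1 = np.random.rand(num_units, ffn_inner_dim)   # expand to inner dim
w2 = np.random.rand(ffn_inner_dim, num_units)   # project back
print(position_wise_ffn(x, w1, w2).shape)  # (5, 512)
```

As with the attention sublayer, the output shape matches the input shape, so the sublayers stack cleanly across num_layers.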