opennmt.models.language_model module

Language model.

class opennmt.models.language_model.LanguageModel(decoder, embedding_size=None, reuse_embedding=True, name='lm')[source]

Bases: opennmt.models.model.Model

An experimental language model.

__init__(decoder, embedding_size=None, reuse_embedding=True, name='lm')[source]

Initializes the language model.

Parameters:
  • decoder – An opennmt.decoders.decoder.DecoderV2 instance.
  • embedding_size – The size of the word embedding. If not set, pretrained embeddings should be defined in the configuration.
  • reuse_embedding – If True, reuse the embedding weights in the output layer.
  • name – The name of this model.
Raises:
  ValueError – If the decoder type is invalid.

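When reuse_embedding is True, the output layer reuses the transposed input embedding matrix instead of learning a separate output projection (weight tying). A minimal NumPy sketch of the idea (names and shapes are illustrative, not the library's API):

```python
import numpy as np

vocab_size, embedding_size = 8, 4
embedding = np.random.rand(vocab_size, embedding_size)

def output_logits(hidden, embedding):
    # With weight tying, logits are computed against the transposed
    # embedding matrix, so no separate [units, vocab] projection is stored.
    return hidden @ embedding.T

hidden = np.random.rand(2, embedding_size)  # a batch of decoder states
logits = output_logits(hidden, embedding)   # shape [2, vocab_size]
```

Tying halves the number of vocabulary-sized parameter matrices, which is why it is the default here.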
auto_config(num_devices=1)[source]

Returns automatic configuration values specific to this model.

Parameters:
  • num_devices – The number of devices used for training.
Returns:
  A partial training configuration.
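The returned values form only a partial configuration: they are merged with, and overridden by, the user-provided configuration. A minimal sketch of such a recursive merge, with illustrative keys (the actual merge logic lives elsewhere in the library):

```python
def merge_config(auto, user):
    """Recursively merge user values over automatic defaults."""
    merged = dict(auto)
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged

auto = {"train": {"batch_size": 4096, "batch_type": "tokens"}}
user = {"train": {"batch_size": 2048}}
print(merge_config(auto, user))
# {'train': {'batch_size': 2048, 'batch_type': 'tokens'}}
```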
compute_loss(outputs, labels, training=True, params=None)[source]

Computes the loss.

Parameters:
  • outputs – The model outputs (usually unscaled probabilities).
  • labels – The dict of labels tf.Tensor.
  • training – If True, compute the training loss.
  • params – A dictionary of hyperparameters.
Returns:
  The loss, or a tuple containing the computed loss and the loss to display.
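For a language model this loss is typically a masked sequence cross-entropy, normalized per sequence for training and per token for display. A NumPy sketch under that assumption (illustrative, not the library's implementation):

```python
import numpy as np

def sequence_loss(logits, labels, lengths):
    # logits: [batch, time, vocab]; labels: [batch, time]; lengths: [batch].
    log_probs = logits - np.log(np.exp(logits).sum(-1, keepdims=True))
    # Negative log-likelihood of the gold label at each time step.
    nll = -np.take_along_axis(log_probs, labels[..., None], axis=-1)[..., 0]
    # Mask out positions beyond each sequence's length.
    mask = np.arange(logits.shape[1])[None, :] < np.array(lengths)[:, None]
    total = (nll * mask).sum()
    loss = total / len(lengths)          # normalized per sequence (training)
    loss_per_token = total / mask.sum()  # normalized per token (displayed)
    return loss, loss_per_token

loss, displayed = sequence_loss(np.zeros((1, 2, 4)), np.array([[0, 1]]), [2])
```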

print_prediction(prediction, params=None, stream=None)[source]

Prints the model prediction.

Parameters:
  • prediction – The evaluated prediction.
  • params – (optional) Dictionary of formatting parameters.
  • stream – (optional) The stream to print to.
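A minimal sketch of what such a printing hook can look like, assuming the evaluated prediction is a dict with tokens and length fields (an illustrative format, not necessarily the library's exact one):

```python
import sys

def print_prediction(prediction, params=None, stream=None):
    # One plausible formatting: join the predicted tokens up to the
    # predicted length and write the result to the output stream.
    length = prediction.get("length")
    tokens = prediction["tokens"][:length]
    print(" ".join(tokens), file=stream or sys.stdout)
```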
class opennmt.models.language_model.LanguageModelInputter(vocabulary_file_key, embedding_size=None, embedding_file_key=None, embedding_file_with_header=True, case_insensitive_embeddings=True, trainable=True, dropout=0.0, tokenizer=None, dtype=tf.float32)[source]

Bases: opennmt.inputters.text_inputter.WordEmbedder

A special inputter for language modeling.

This is a single word embedder that simply produces labels by shifting the input sequence.
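One plausible scheme for this shifting, with an illustrative end-of-sentence id (a sketch of the idea, not the exact library code):

```python
def make_features_and_labels(ids, eos_id=2):
    # The inputter reads a single sequence and derives the labels by
    # shifting it: the target at step t is the input token at step t + 1,
    # with an end-of-sentence id appended as the final target.
    features = ids
    labels = ids[1:] + [eos_id]
    return features, labels

features, labels = make_features_and_labels([5, 9, 7])
print(labels)  # [9, 7, 2]
```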

make_evaluation_dataset(features_file, labels_file, batch_size, num_threads=1, prefetch_buffer_size=None)[source]

See opennmt.inputters.inputter.ExampleInputter.make_evaluation_dataset().

make_training_dataset(features_file, labels_file, batch_size, batch_type='examples', batch_multiplier=1, batch_size_multiple=1, shuffle_buffer_size=None, bucket_width=None, maximum_features_length=None, maximum_labels_length=None, single_pass=False, num_shards=1, shard_index=0, num_threads=4, prefetch_buffer_size=None)[source]

See opennmt.inputters.inputter.ExampleInputter.make_training_dataset().