matchzoo.auto.preparer package

Submodules

matchzoo.auto.preparer.prepare module

matchzoo.auto.preparer.prepare.prepare(task, model_class, data_pack, preprocessor=None, embedding=None, config=None)

A simple shorthand for using matchzoo.Preparer.

config controls specific behaviors. If a config dictionary is passed, its entries override the corresponding defaults. For example, to override the default bin_size, pass config={'bin_size': 15}.

Parameters:
  • task (BaseTask) – Task.
  • model_class (Type[BaseModel]) – Model class.
  • data_pack (DataPack) – DataPack used to fit the preprocessor.
  • preprocessor (Optional[BasePreprocessor]) – Preprocessor used to fit the data_pack. (default: the default preprocessor of model_class)
  • embedding (Optional[Embedding]) – Embedding used to build an embedding matrix. If not set, a correctly shaped random matrix is built instead.
  • config (Optional[dict]) – Configuration of specific behaviors. (default: return value of mz.Preparer.get_default_config())
Returns:

A tuple of (model, preprocessor, data_generator_builder, embedding_matrix).

matchzoo.auto.preparer.preparer module

class matchzoo.auto.preparer.preparer.Preparer(task, config=None)

Bases: object

Unified setup process for all MatchZoo models.

config controls specific behaviors. If a config dictionary is passed, its entries override the corresponding defaults. For example, to override the default bin_size, pass config={'bin_size': 15}.

See tutorials/automation.ipynb for a detailed walkthrough on usage.

Default config:

{
    # pair generator builder kwargs
    'num_dup': 1,

    # histogram unit of DRMM
    'bin_size': 30,
    'hist_mode': 'LCH',

    # dynamic Pooling of MatchPyramid
    'compress_ratio_left': 1.0,
    'compress_ratio_right': 1.0,

    # if no matchzoo.Embedding is passed to tune
    'embedding_output_dim': 50
}
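The override behavior described above can be illustrated with a minimal sketch in plain Python. It assumes the passed dictionary is merged over the defaults key by key, as with dict.update; the helper names get_default_config and resolve_config below are hypothetical stand-ins for illustration, not MatchZoo's implementation.

```python
def get_default_config():
    # Hypothetical stand-in for mz.Preparer.get_default_config();
    # the values are copied from the default config listed above.
    return {
        'num_dup': 1,
        'bin_size': 30,
        'hist_mode': 'LCH',
        'compress_ratio_left': 1.0,
        'compress_ratio_right': 1.0,
        'embedding_output_dim': 50,
    }


def resolve_config(config=None):
    """Return the defaults with any user-supplied keys overriding them.

    Assumption: overrides are applied per key, like dict.update.
    """
    resolved = get_default_config()
    if config:
        resolved.update(config)
    return resolved


# Only bin_size changes; every other key keeps its default value.
merged = resolve_config({'bin_size': 15})
```

Under this assumption, a partial dictionary such as {'bin_size': 15} is enough; keys that are not mentioned keep their defaults.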

Parameters:
  • task (BaseTask) – Task.
  • config (Optional[dict]) – Configuration of specific behaviors.

Example

>>> import matchzoo as mz
>>> task = mz.tasks.Ranking(loss=mz.losses.RankCrossEntropyLoss())
>>> preparer = mz.auto.Preparer(task)
>>> model_class = mz.models.DenseBaseline
>>> train_raw = mz.datasets.toy.load_data('train', 'ranking')
>>> model, prpr, gen_builder, matrix = preparer.prepare(model_class,
...                                                     train_raw)
>>> model.params.completed()
True

classmethod get_default_config()

Default config getter.

Return type: dict

prepare(model_class, data_pack, preprocessor=None, embedding=None)

Prepare a model, preprocessor, data generator builder, and embedding matrix for the given model class.

Parameters:
  • model_class (Type[BaseModel]) – Model class.
  • data_pack (DataPack) – DataPack used to fit the preprocessor.
  • preprocessor (Optional[BasePreprocessor]) – Preprocessor used to fit the data_pack. (default: the default preprocessor of model_class)
  • embedding (Optional[Embedding]) – Embedding used to build an embedding matrix. If not set, a correctly shaped random matrix is built instead.
Return type:

Tuple[BaseModel, BasePreprocessor, DataGeneratorBuilder, ndarray]

Returns:

A tuple of (model, preprocessor, data_generator_builder, embedding_matrix).

Module contents