matchzoo.auto.preparer package
Submodules
matchzoo.auto.preparer.prepare module
matchzoo.auto.preparer.prepare.prepare(task, model_class, data_pack, preprocessor=None, embedding=None, config=None)
A simple shorthand for using matchzoo.Preparer.
config is used to control specific behaviors. If a config dictionary is passed, its entries override the corresponding defaults, e.g. to override the default bin_size, pass config={'bin_size': 15}.
Parameters:
- task (BaseTask) – Task.
- model_class (Type[BaseModel]) – Model class.
- data_pack (DataPack) – DataPack used to fit the preprocessor.
- preprocessor (Optional[BasePreprocessor]) – Preprocessor used to fit the data_pack. (default: the default preprocessor of model_class)
- embedding (Optional[Embedding]) – Embedding used to build an embedding matrix. If not set, a correctly shaped randomized matrix will be built.
- config (Optional[dict]) – Configuration of specific behaviors. (default: return value of mz.Preparer.get_default_config())
Returns: A tuple of (model, preprocessor, data_generator_builder, embedding_matrix).
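Since prepare is described as a shorthand for matchzoo.Preparer, it can be pictured as a thin wrapper that constructs a Preparer and forwards the remaining arguments to its prepare method. The sketch below illustrates that delegation pattern only. StubPreparer is a hypothetical stand-in so the snippet runs without MatchZoo installed, and the wrapper body is an assumption about the general shape, not the library's actual source.

```python
class StubPreparer:
    """Hypothetical stand-in for matchzoo.Preparer (illustration only)."""

    def __init__(self, task, config=None):
        self.task = task
        self.config = config or {}

    def prepare(self, model_class, data_pack, preprocessor=None, embedding=None):
        # The documented method returns (model, preprocessor,
        # data_generator_builder, embedding_matrix). This stub simply echoes
        # its inputs so that the argument forwarding is visible.
        return (model_class, preprocessor, data_pack, embedding)


def prepare(task, model_class, data_pack,
            preprocessor=None, embedding=None, config=None):
    """Assumed shape of the shorthand: build a preparer, then delegate."""
    preparer = StubPreparer(task=task, config=config)
    return preparer.prepare(
        model_class=model_class,
        data_pack=data_pack,
        preprocessor=preprocessor,
        embedding=embedding,
    )


# Placeholder strings stand in for real task, model class, and data objects.
result = prepare("ranking-task", "DenseBaseline", "toy-data",
                 config={"bin_size": 15})
```

Note that task and config go to the constructor, while the other four arguments go to the method call; this mirrors the two signatures documented on this page.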
matchzoo.auto.preparer.preparer module
class matchzoo.auto.preparer.preparer.Preparer(task, config=None)
Bases: object
Unified setup processes of all MatchZoo models.
config is used to control specific behaviors. If a config dictionary is passed, its entries override the corresponding defaults, e.g. to override the default bin_size, pass config={'bin_size': 15}.
See tutorials/automation.ipynb for a detailed walkthrough on usage.
Default config:
{
    # pair generator builder kwargs
    'num_dup': 1,
    # histogram unit of DRMM
    'bin_size': 30,
    'hist_mode': 'LCH',
    # dynamic Pooling of MatchPyramid
    'compress_ratio_left': 1.0,
    'compress_ratio_right': 1.0,
    # if no matchzoo.Embedding is passed to tune
    'embedding_output_dim': 50
}
Parameters:
- task (BaseTask) – Task.
- config (Optional[dict]) – Configuration of specific behaviors.
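The statement that the default config is updated when a config dictionary is passed can be illustrated with a plain dictionary merge. This is a minimal sketch assuming ordinary dict.update semantics. The default values are copied from the documentation above, and merge_config is a hypothetical helper written for this example, not a MatchZoo function.

```python
import copy

# Default values copied from the documented default config.
DEFAULT_CONFIG = {
    'num_dup': 1,
    'bin_size': 30,
    'hist_mode': 'LCH',
    'compress_ratio_left': 1.0,
    'compress_ratio_right': 1.0,
    'embedding_output_dim': 50,
}


def merge_config(user_config=None):
    """Hypothetical helper: overlay user-supplied keys on the defaults."""
    # Copy first so the shared defaults are never mutated.
    config = copy.deepcopy(DEFAULT_CONFIG)
    if user_config:
        config.update(user_config)
    return config


config = merge_config({'bin_size': 15})
```

Under these assumptions only bin_size changes; every key that is not passed keeps its default value.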
Example
>>> import matchzoo as mz
>>> task = mz.tasks.Ranking(loss=mz.losses.RankCrossEntropyLoss())
>>> preparer = mz.auto.Preparer(task)
>>> model_class = mz.models.DenseBaseline
>>> train_raw = mz.datasets.toy.load_data('train', 'ranking')
>>> model, prpr, gen_builder, matrix = preparer.prepare(model_class,
...                                                     train_raw)
>>> model.params.completed()
True
classmethod get_default_config()
Default config getter.
Return type: dict
prepare(model_class, data_pack, preprocessor=None, embedding=None)
Prepare a model, preprocessor, data generator builder, and embedding matrix.
Parameters:
- model_class (Type[BaseModel]) – Model class.
- data_pack (DataPack) – DataPack used to fit the preprocessor.
- preprocessor (Optional[BasePreprocessor]) – Preprocessor used to fit the data_pack. (default: the default preprocessor of model_class)
- embedding (Optional[Embedding]) – Embedding used to build an embedding matrix. If not set, a correctly shaped randomized matrix will be built.
Return type: Tuple[BaseModel, BasePreprocessor, DataGeneratorBuilder, ndarray]
Returns: A tuple of (model, preprocessor, data_generator_builder, embedding_matrix).