matchzoo package¶

Subpackages¶

Submodules¶

matchzoo.embedding module¶

MatchZoo toolkit for token embedding.

class matchzoo.embedding.Embedding(data)¶

Bases: object

Embedding class.

Examples::

>>> import matchzoo as mz
>>> data_pack = mz.datasets.toy.load_data()
>>> pp = mz.preprocessors.NaivePreprocessor()
>>> vocab_unit = mz.build_vocab_unit(pp.fit_transform(data_pack),
...                                  verbose=0)
>>> term_index = vocab_unit.state['term_index']
>>> embed_path = mz.datasets.embeddings.EMBED_RANK

To load from a file:

>>> embedding = mz.embedding.load_from_file(embed_path)
>>> matrix = embedding.build_matrix(term_index)
>>> matrix.shape[0] == len(term_index) + 1
True

To build your own:

>>> import pandas as pd
>>> data = pd.DataFrame(data=[[0, 1], [2, 3]], index=['A', 'B'])
>>> embedding = mz.embedding.Embedding(data)
>>> matrix = embedding.build_matrix({'A': 2, 'B': 1})
>>> matrix.shape == (3, 2)
True

build_matrix(term_index, initializer=<function Embedding.<lambda>>)¶

Build a matrix using term_index.

Parameters:	term_index (`dict`) -- A dict or TermIndex to build with.
	initializer -- A callable that returns a default value for terms missing from data (default: a value drawn from a uniform random distribution over (-0.2, 0.2)).
Return type:	`ndarray`
Returns:	A matrix.
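The behaviour described above can be illustrated with a small standalone sketch. This is not MatchZoo's implementation: `build_matrix_sketch` and `vectors` are hypothetical names, plain lists stand in for the `ndarray`, and the row count of `len(term_index) + 1` is taken from the examples above. What row 0 is used for is not stated here, so the sketch simply leaves it at initializer values.

```python
import random


def build_matrix_sketch(vectors, term_index,
                        initializer=lambda: random.uniform(-0.2, 0.2)):
    """Hypothetical stand-in for Embedding.build_matrix.

    vectors: dict mapping a term to its embedding vector (list of numbers).
    term_index: dict mapping a term to its row index in the matrix.
    """
    output_dim = len(next(iter(vectors.values())))
    # One row per term plus one extra row, matching the documented examples.
    n_rows = len(term_index) + 1
    # Fill every cell with a default value first.
    matrix = [[initializer() for _ in range(output_dim)]
              for _ in range(n_rows)]
    # Overwrite the rows of terms that have a known vector.
    for term, index in term_index.items():
        if term in vectors:
            matrix[index] = list(vectors[term])
    return matrix


vectors = {'A': [0, 1], 'B': [2, 3]}
# 'C' has no vector, so its row keeps the initializer values.
matrix = build_matrix_sketch(vectors, {'A': 2, 'B': 1, 'C': 3})
```

Passing a different callable, for example `initializer=lambda: 0.0`, would give missing terms an all-zero row in this sketch.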

input_dim¶

Return the embedding input dimension.

Return type:	`int`

output_dim¶

Return the embedding output dimension.

Return type:	`int`

matchzoo.embedding.load_from_file(file_path, mode='word2vec')¶

Load embedding from file_path.

Parameters:	file_path (`str`) -- Path to file.
	mode (`str`) -- Embedding file format mode, one of 'word2vec' or 'glove'. (default: 'word2vec')
Return type:	`Embedding`
Returns:	A `matchzoo.embedding.Embedding` instance.
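The two modes correspond to two common text layouts: the word2vec text format starts with a header line giving the vocabulary size and vector dimension, while GloVe files contain only `term value value ...` lines. The sketch below shows that difference using only the standard library. `parse_embedding_text` is a hypothetical helper, not part of MatchZoo, and how `load_from_file` actually parses files is an assumption here.

```python
import io


def parse_embedding_text(stream, mode='word2vec'):
    """Hypothetical parser for space-separated embedding text.

    Returns a dict mapping each term to its vector (list of floats).
    """
    if mode not in ('word2vec', 'glove'):
        raise ValueError("mode must be 'word2vec' or 'glove'")
    if mode == 'word2vec':
        # The word2vec text format begins with a "<vocab_size> <dim>" header.
        next(stream, None)
    vectors = {}
    for line in stream:
        parts = line.rstrip().split(' ')
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


w2v_text = "2 3\nhello 0.1 0.2 0.3\nworld 0.4 0.5 0.6\n"
glove_text = "hello 0.1 0.2 0.3\nworld 0.4 0.5 0.6\n"

w2v = parse_embedding_text(io.StringIO(w2v_text), mode='word2vec')
glove = parse_embedding_text(io.StringIO(glove_text), mode='glove')
```

Choosing the wrong mode matters: reading a word2vec file as 'glove' would treat the header line as a term, and reading a GloVe file as 'word2vec' would silently drop the first term.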

matchzoo.logger module¶

matchzoo.version module¶

Module contents¶