vec2word

Map embedding vector to word

Syntax

words = vec2word(emb,M)

[words,dist] = vec2word(emb,M)

___ = vec2word(emb,M,k)

___ = vec2word(___,'Distance',distance)

Description

words = vec2word(emb,M) returns the closest words to the embedding vectors in the rows of M.

[words,dist] = vec2word(emb,M) also returns the distance dist from each returned word to its source vector in M.

___ = vec2word(emb,M,k) returns the top k closest words.

___ = vec2word(___,'Distance',distance) specifies the distance metric.

Examples

Map Words to Vectors and Back

Load a pretrained word embedding using fastTextWordEmbedding. This function requires Text Analytics Toolbox™ Model for fastText English 16 Billion Token Word Embedding support package. If this support package is not installed, then the function provides a download link.

emb = fastTextWordEmbedding

emb = 
  wordEmbedding with properties:

     Dimension: 300
    Vocabulary: [1×1000000 string]

Map the words "Italy", "Rome", and "Paris" to vectors using word2vec.

italy = word2vec(emb,"Italy");
rome = word2vec(emb,"Rome");
paris = word2vec(emb,"Paris");

Map the vector italy - rome + paris to a word using vec2word.

word = vec2word(emb,italy - rome + paris)

word = 
"France"

Find Closest Words to Vector

Find the top five closest words to a word embedding vector, along with their distances.

Load a pretrained word embedding using fastTextWordEmbedding.

emb = fastTextWordEmbedding;

Map the words "Italy", "Rome", and "Paris" to vectors using word2vec.

italy = word2vec(emb,"Italy");
rome = word2vec(emb,"Rome");
paris = word2vec(emb,"Paris");

Map the vector italy - rome + paris to a word using vec2word. Find the top five closest words using the Euclidean distance metric.

k = 5;
M = italy - rome + paris;
[words,dist] = vec2word(emb,M,k,'Distance','euclidean');

Plot the words and distances in a bar chart.

figure;
bar(dist)
xticklabels(words)
xlabel("Word")
ylabel("Distance")
title("Distances to Vector")

Input Arguments

`emb` — Input word embedding
`wordEmbedding` object

Input word embedding, specified as a wordEmbedding object.

`M` — Word embedding vectors
matrix

Word embedding vectors, specified as a matrix. Each row of M is a word embedding vector. M must have emb.Dimension columns.
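
As a hedged illustration of the row-per-vector layout, the following sketch builds M from several words at once. It assumes the fastText support package used in the examples is installed; the variable names queryWords and closest are chosen here for illustration only.

```matlab
% Assumes the fastText support package is installed.
emb = fastTextWordEmbedding;

% word2vec returns one row per input word, so M is 3-by-emb.Dimension.
queryWords = ["Italy" "Rome" "Paris"];
M = word2vec(emb,queryWords);

% vec2word maps each row of M to its closest word in the vocabulary.
closest = vec2word(emb,M);
```

Because each row of M here is itself the vector of a vocabulary word, one would expect each word to map back to itself, although that is not verified here.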

`k` — Number of closest words
positive integer

Number of closest words to return, specified as a positive integer.
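
The idea of returning the k closest words can be sketched in plain MATLAB on a toy vocabulary. This is an illustration under stated assumptions, not the internal implementation of vec2word: the vocabulary, the 2-D vectors in V, and the query q are all made-up values, and cosine distance is taken to be one minus the cosine similarity.

```matlab
% Toy vocabulary with one 2-D vector per row (hypothetical values).
vocab = ["a" "b" "c" "d"];
V = [1 0; 0 1; 1 1; -1 0];

% Query vector and number of neighbors to keep.
q = [1 0.2];
k = 2;

% Cosine distance from q to every row of V (1 minus cosine similarity).
d = 1 - (V*q') ./ (sqrt(sum(V.^2,2)) * norm(q));

% Sort ascending and keep the k smallest distances.
[dSorted,idx] = sort(d);
topWords = vocab(idx(1:k));
topDist = dSorted(1:k);
```

In this sketch topWords contains the k words with the smallest distances, ordered from closest to farthest, and topDist holds the matching distances.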

`distance` — Distance metric
`'cosine'` (default) | `'euclidean'`

Distance metric, specified as 'cosine' or 'euclidean'.
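
The choice of metric can change which word is closest. The following sketch uses plain MATLAB on made-up 2-D vectors to show one such case. It is not the implementation of vec2word, and it assumes cosine distance means one minus the cosine similarity.

```matlab
% Toy vocabulary with one 2-D vector per row (hypothetical values).
vocab = ["north" "east" "far_east"];
V = [0 1; 1 0; 10 0.5];

% Query vector.
q = [3 0.5];

% Euclidean distance from q to every row of V.
dEuclid = sqrt(sum((V - q).^2,2));

% Cosine distance from q to every row of V (1 minus cosine similarity).
dCosine = 1 - (V*q') ./ (sqrt(sum(V.^2,2)) * norm(q));

% Closest word under each metric.
[~,iE] = min(dEuclid);
[~,iC] = min(dCosine);
wordEuclid = vocab(iE);
wordCosine = vocab(iC);
```

Here the Euclidean metric picks "east", which is nearest in position, while the cosine metric picks "far_east", which points in the most similar direction regardless of length.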

Output Arguments

`words` — Output words
string vector

Output words, returned as a string vector.

`dist` — Distance of words to source vectors
vector

Distance of words to their source vectors, returned as a vector.

Version History

Introduced in R2017b
