Cosine Similarity using BERT

Question

Nicholas Ang on 30 Jun 2021

Direct link to this question:

https://kr.mathworks.com/matlabcentral/answers/868608-cosine-similarity-using-bert

Commented: Nicholas Ang on 30 Jun 2021

Accepted Answer: Divyam Gupta

I am using BERT to calculate similarities for question answering. I encoded my question data using

data.Tokens = encode(mdl.Tokenizer,data.Questions), which returns a cell array.

Next, I encoded new text to test its similarity against the questions already encoded in the database: testTokens = encode(mdl.Tokenizer,text)

However, I am unable to call cosineSimilarity(data.Tokens,testTokens); I receive an error that says:

Input must be a matrix, a tokenizedDocument array, a bagOfWords model, a bagOfNgrams model, a string array of words, or a cell array of character vectors.

Do I need to pad or reshape my cell arrays here?

Answer 1

Divyam Gupta on 30 Jun 2021

Direct link to this answer:

https://kr.mathworks.com/matlabcentral/answers/868608-cosine-similarity-using-bert#answer_736543

Hi Nicholas, I see that you're having trouble computing the cosine similarity of text encoded with the BERT tokenizer. According to the documentation at https://www.mathworks.com/help/textanalytics/ref/cosinesimilarity.html#d123e8335, the cosineSimilarity function can take a matrix as input when computing the similarity between two documents.

Since the encoded vectors for the questions have different lengths, constructing a matrix directly might be difficult. Instead, you can do a pairwise comparison between the entries of data.Tokens and testTokens to compute the similarities. This can be achieved with a nested loop that stores each similarity score as it is computed.
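A minimal sketch of such a nested loop, under some assumptions that are not from the original post: the token sequences below are made-up stand-ins for the output of encode, each sequence is zero-padded to a common length so that any two can be compared, and the cosine is computed with base MATLAB's dot and norm rather than cosineSimilarity. The names dataTokens, padRow and scores are only illustrative.

```matlab
% Hypothetical stand-ins for data.Tokens and testTokens; in the thread these
% would be the cell arrays returned by encode(mdl.Tokenizer, ...).
dataTokens = {[101 2054 2003 102], [101 2129 2079 1045 2224 102]};
testTokens = {[101 2054 2003 2023 102]};

% Assumption: pad every sequence with zeros up to the longest length.
allTokens = [dataTokens(:); testTokens(:)];
maxLen = max(cellfun(@numel, allTokens));
padRow = @(v) [double(v(:).') zeros(1, maxLen - numel(v))];

% Pairwise cosine similarity, one score per (question, test text) pair.
scores = zeros(numel(dataTokens), numel(testTokens));
for i = 1:numel(dataTokens)
    a = padRow(dataTokens{i});
    for j = 1:numel(testTokens)
        b = padRow(testTokens{j});
        scores(i,j) = dot(a, b) / (norm(a) * norm(b));
    end
end
```

Here scores(i,j) holds the similarity between the i-th stored question and the j-th test text. Once padded, the rows all have the same length, so they could presumably also be stacked into a matrix, which is one of the input types the error message asks for.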

Hope this helps.

Comment: Nicholas Ang on 30 Jun 2021

Thank you! This worked!