Find array elements inside another array while incrementing counter

John on 7 Nov 2017
Edited: John on 7 Nov 2017
for kk = 1:n
    str = ['<p id=', num2str(kk), '>'];
    idx_s = find(strcmp(C, str));
    if ~isempty(idx_s)
        idx_e = idx2(find(idx2 > idx_s, 1));
        Doc = C(idx_s:idx_e); % May need to remove tags later
        Doc = regexp(Doc, '[a-z0-9\-]+', 'match');
        Doc = [Doc{:}];
        Unique_Doc_count = arrayfun(@(x) nnz(strcmp(x, Doc)), Unique);
        Unique_Doc_freq = [Unique; Unique_Doc_count];
    end
end
I want to check whether each element of the string array 'Unique' exists in 'Doc'. 'Unique_Doc_count' gives me the number of occurrences of each word, but I only need a 1 or 0 value (exists or does not exist). The aim is to loop 'kk' over multiple documents and find the number of documents that contain each word in 'Unique': not the number of times the word occurs, but the number of documents it appears in.
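The 1-or-0 presence check described above can be sketched with ismember. This is only a minimal illustration: the sample values of Unique and Doc are made up, both are assumed to be cell arrays of character vectors, and the name Unique_Doc_present is hypothetical.

```matlab
% Hypothetical sample data; in the real code Unique and Doc come from the loop.
Unique = {'apple', 'banana', 'cherry'};
Doc    = {'banana', 'apple', 'banana', 'date'};

% ismember returns one logical per element of Unique:
% true if that word occurs anywhere in Doc, false otherwise.
Unique_Doc_present = ismember(Unique, Doc);

% Alternative that stays close to the original line: count, then threshold.
Unique_Doc_count = cellfun(@(w) nnz(strcmp(w, Doc)), Unique);
same = isequal(Unique_Doc_present, Unique_Doc_count > 0);
```

Either form yields a logical vector the same size as Unique. Thresholding the existing counts with > 0 is the smallest change to the original code, while ismember avoids counting occurrences that are not needed.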
  2 Comments
per isakson on 7 Nov 2017
A tiny example would make it easier to help, i.e. values of
n, C, Unique, idx2, ...
btw: the name Unique is dubious; it's too close to the function name unique
John on 7 Nov 2017
Edited: John on 7 Nov 2017
I have a large document that I am iterating through, and it is divided into sub-documents by delimiters. n is the number of sub-documents to go through. C is the large document containing the n sub-documents. Doc is the current sub-document being examined. idx_s and idx_e are the start and end points of the current sub-document within the larger document. idx2 holds the locations of the bottom delimiters. Unique is an array of the unique words in document C. So, as I loop through all the sub-documents, I want to check each element of Unique against the current sub-document and count how many of the n sub-documents each element appears in. I want to add a column alongside Unique containing the number of documents each unique word exists in. Check the image:
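One way the per-word document count described here could be accumulated is sketched below. It is an illustration under assumptions, not a tested answer for the real data: Docs is a hypothetical stand-in for the sub-documents extracted inside the kk loop, doc_freq is a hypothetical name, and the words are assumed to be cell arrays of character vectors.

```matlab
% Hypothetical stand-in data: each cell of Docs plays the role of one Doc
% produced inside the kk loop.
Unique = {'apple', 'banana', 'cherry'};
Docs   = { {'banana', 'apple', 'banana'}, {'apple', 'date'}, {'fig'} };

doc_freq = zeros(size(Unique));   % one counter per word in Unique
for kk = 1:numel(Docs)
    Doc = Docs{kk};
    % Add 1 for every word present in this sub-document,
    % regardless of how many times it occurs there.
    doc_freq = doc_freq + ismember(Unique, Doc);
end

% Pair each word with its document count as a second column.
Unique_Doc_freq = [Unique(:), num2cell(doc_freq(:))];
```

The counter has to be initialised before the loop and updated inside it. In the original code, Unique_Doc_count is overwritten on every pass, so only the last sub-document's result survives.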


Answers (0)
