compare groups of items regarding overlaps

Ulrike Lohner on 23 Jun 2021
Commented: Ulrike Lohner on 24 Jun 2021
Short background: I have a number of texts that are grouped by their value on a number of variables (about 5 different values per variable), meaning that each text appears in one value group of each variable. (Group A might be text1, text7, text23, text38, etc.)
Goal: I want to compare these primary groups regarding any overlap of the items they contain, using one group as a basis; i.e. I take group A and check which texts of this group also appear in any group of another variable (of course, I am not comparing groups that belong to the same variable, since there would obviously be no overlap). In the end, I'd like to be able to say that, e.g., texts 1, 7, 23 and 38 all appear in groups A, F, J, K and so forth.
That means I do not want to compare the means or any other values of the data groups; I want to know which groups share which items.
Since I am not that experienced yet, I can't seem to find the right code to start with. Any ideas about how to tackle this task?
  3 Comments
Image Analyst on 23 Jun 2021
What do you mean by overlapping texts? What kind of data do you have? String arrays? Character arrays? Images? Tables? Cell arrays? Structure arrays? Can you attach your data (group(s)) in a .mat file with the paper clip icon?
save('answers.mat', 'group1', 'group2', 'group3');
Use your actual variable names of course.
In the meantime, see functions like setdiff(), intersect(), contains(), ismember(), strcmpi(), etc.
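As a rough illustration of how three of these functions behave on string arrays, here is a minimal sketch. The group names and text labels are made up for the example, and it assumes the groups are available as string arrays, which may not match the actual data.

```matlab
% Hypothetical groups; the text names are invented for illustration.
groupA = ["text1" "text7" "text23" "text38"];
groupF = ["text7" "text12" "text38" "text40"];

shared  = intersect(groupA, groupF);   % texts present in both groups (sorted)
onlyInA = setdiff(groupA, groupF);     % texts in A that are missing from F (sorted)
inF     = ismember(groupA, groupF);    % logical mask over groupA, in A's order

disp(shared)
disp(onlyInA)
disp(inF)
```

Note that intersect() and setdiff() return sorted results by default, while the mask from ismember() keeps the order of the first input, so groupA(inF) lists the shared texts in their original order.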
Ulrike Lohner on 24 Jun 2021
Unfortunately, I am not allowed to post any original data for data security reasons (and the code I have so far only imports the data, so it wouldn't be of any help). I can try to be more specific about my data, though:
Basically, I have a large number of groups of strings that are organized in a table (each column is one group, each string is in a cell). There are about 150 different strings in total, and each string appears in a number of groups; however, no two groups are composed of the same combination of strings, and the groups do not have the same size.
I will probably need a loop that takes each column (i.e. each group) as a starting point once and checks which strings of this group are also contained in the other groups, giving me as output a new set of string clusters that contain only those strings included in the first group.
Anyway, thank you for the suggestions so far; I will dig deeper into the functions you mentioned and check whether one of them serves my purpose.
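The loop described above could look roughly like the following sketch. It is not the real data layout: the table T, its column names and the text labels are hypothetical, and it assumes that shorter groups are padded with empty character vectors, since all columns of a table must have the same height.

```matlab
% Hypothetical table: one column per group, padded with '' because
% table columns must all have the same height.
T = table( ...
    {'text1'; 'text7'; 'text23'; 'text38'}, ...
    {'text7'; 'text38'; 'text40'; ''}, ...
    {'text1'; 'text5'; ''; ''}, ...
    'VariableNames', {'A', 'F', 'J'});

names   = T.Properties.VariableNames;
nGroups = numel(names);
overlap = cell(nGroups);               % overlap{i,j}: texts of group i found in group j

for i = 1:nGroups
    base = string(T.(names{i}));
    base = base(base ~= "");           % drop the padding entries
    for j = 1:nGroups
        if i == j, continue; end       % do not compare a group with itself
        other = string(T.(names{j}));
        overlap{i,j} = base(ismember(base, other));
    end
end

disp(overlap{1,2})   % texts of group A that also appear in group F
```

This sketch only skips the comparison of a group with itself. Skipping groups that belong to the same variable would need an extra lookup from column name to variable, which is not shown here.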

Accepted Answer

SALAH ALRABEEI on 23 Jun 2021
Use
[val,ndxA,ndxB] = intersect(A,B)
It returns the overlapping values in val, together with their indices in groups A and B in ndxA and ndxB.
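A small worked example of the three-output form, using made-up string arrays for A and B (the real groups may be stored differently):

```matlab
A = ["text1" "text7" "text23" "text38"];   % hypothetical group A
B = ["text38" "text5" "text7"];            % hypothetical group B

[val, ndxA, ndxB] = intersect(A, B);

% val is sorted; the indices point back into the original arrays,
% so A(ndxA) and B(ndxB) both reproduce val.
disp(val)
disp(isequal(A(ndxA), val) && isequal(B(ndxB), val))
```

If the order of A matters more than a sorted result, intersect(A, B, 'stable') returns the common values in the order in which they appear in A.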
  1 Comment
Ulrike Lohner on 24 Jun 2021
Thank you for this suggestion! I will have a closer look at that function and check whether it serves the right purpose.

More Answers (0)

Release

R2020a
