이 질문을 팔로우합니다.
- 팔로우하는 게시물 피드에서 업데이트를 확인할 수 있습니다.
- 정보 수신 기본 설정에 따라 이메일을 받을 수 있습니다.
Creating a new matrix based on matching two columns
조회 수: 2 (최근 30일)
이전 댓글 표시
User
2023년 10월 17일
I have generated two matrices from two different csv files. Matrix 1 contains two columns of information (matrix contain middle names and first name). Matrix 2 only contains one Column of information (middle name). I want to create a loop where I take the middle name from matrix 2 and match it to the middle names in matrix 1. Once the names have been matched I want to read out each matches corresponding first name in a new matrix
댓글 수: 3
the cyclist
2023년 10월 17일
Can you upload the data? You can use the paper clip icon in the INSERT section of the toolbar.
One thing that is confusing is that you refer to the data as "matrices", but in MATLAB matrices are numeric. So, it's not clear if you have cell arrays, or string arrays, or tables. Uploading the data is the easiest way for us to make sure a solution works for you.
Walter Roberson
2023년 10월 26일
답변 (1개)
Voss
2023년 10월 26일
T1 = readtable('example1.xlsx');
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
T2 = readtable('example2.xlsx');
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
disp(T1);
firstName lastName
___________ ____________________
{'ben' } {'smith' }
{'sarah' } {'doe' }
{'rebecca'} {'martinez, garcia'}
{'lucy' } {'griffin' }
{'grace' } {'lockwood' }
{'sam' } {'sanchez' }
{'brian' } {'pina' }
{'stef' } {'ngyugen' }
{'maria' } {'lee' }
{'emily' } {'grady' }
{'giselle'} {'curry' }
{'andrea' } {'green, martinez' }
disp(T2);
lastName
____________
{'smith' }
{'green' }
{'miller' }
{'wilson' }
{'davis' }
{'allen' }
{'moore' }
{'cooper' }
{'adams' }
{'thompson'}
{'lopez' }
{'hill' }
{'adams' }
{'bailey' }
{'thatcher'}
{'raven' }
{'elsher' }
{'levine' }
{'brown' }
{'williams'}
{'anderson'}
{'wilson' }
{'gonzales'}
{'garcia' }
"I want to [...] take the [last] name from [array] 2 and match it to the [last] names in [array] 1. Once the names have been matched I want to read out each [match's] corresponding first name in a new [array]"
If you want exact matches:
result = T1{ismember(T1{:,2},T2{:,1}),1}
result = 1×1 cell array
{'ben'}
Or, if you want to split the cells where there are more than one last name separated by commas into multiple separate entries, then something like this:
% make a new table with only one first and last name per row:
C = regexp(T1{:,2},',\s*','split');
idx = arrayfun(@(x,c)x(ones(1,numel(c{1}))),1:numel(C),C.','un',0);
T1_new = table(T1{[idx{:}],1},[C{:}].')
T1_new = 14×2 table
Var1 Var2
___________ ____________
{'ben' } {'smith' }
{'sarah' } {'doe' }
{'rebecca'} {'martinez'}
{'rebecca'} {'garcia' }
{'lucy' } {'griffin' }
{'grace' } {'lockwood'}
{'sam' } {'sanchez' }
{'brian' } {'pina' }
{'stef' } {'ngyugen' }
{'maria' } {'lee' }
{'emily' } {'grady' }
{'giselle'} {'curry' }
{'andrea' } {'green' }
{'andrea' } {'martinez'}
% this part is the same as before with T1 but now using T1_new:
result = T1_new{ismember(T1_new{:,2},T2{:,1}),1}
result = 3×1 cell array
{'ben' }
{'rebecca'}
{'andrea' }
댓글 수: 5
User
2023년 11월 1일
How do I get the result to display the original last name as well? for instance I want to create a new array that has the matched first name ben with the respective last name smith?
Voss
2023년 11월 1일
편집: Voss
2023년 11월 1일
T1 = readtable('example1.xlsx');
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
T2 = readtable('example2.xlsx');
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
% exact matches only:
T_out = T1(ismember(T1{:,2},T2{:,1}),:)
T_out = 1×2 table
firstName lastName
_________ _________
{'ben'} {'smith'}
result = join(T_out{:,:},' ')
result = 1×1 cell array
{'ben smith'}
% matches any last name in given row of T1:
C = regexp(T1{:,2},',\s*','split');
idx = arrayfun(@(x,c)x(ones(1,numel(c{1}))),1:numel(C),C.','un',0);
T1_new = table(T1{[idx{:}],1},[C{:}].');
T_out = T1_new(ismember(T1_new{:,2},T2{:,1}),:)
T_out = 3×2 table
Var1 Var2
___________ __________
{'ben' } {'smith' }
{'rebecca'} {'garcia'}
{'andrea' } {'green' }
result = join(T_out{:,:},' ')
result = 3×1 cell array
{'ben smith' }
{'rebecca garcia'}
{'andrea green' }
User
2023년 11월 1일
I am trying to repeat this process on this other sample data set but am running into issues. How do I match the Drug in the first file (drugID_names_matched)to the DrugIds in the second file(allpharm1) and if there is a match how do I output the respective DrugIDs name from the second file with its original Drug name from the first file?
Walter Roberson
2023년 11월 1일
With the original last name is something I posted code for several days ago at https://www.mathworks.com/matlabcentral/answers/2034499-loading-in-a-table-that-has-multiple-values-in-a-single-cell-seperated-by-a-comma#comment_2938846
Voss
2023년 11월 1일
@User: Does this produce the expected result? The only change is changing the comma to a semicolon in the regexp() call, since the IDs are separated by semicolons in the real data file (it was commas in the example files).
T1 = readtable('allPharm1.csv')
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
T1 = 5258×2 table
Name DrugIDs
_______________________________________________________________ ___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
{'Peptidoglycan synthase FtsI' } {'DB00303' }
{'Histidine decarboxylase' } {'DB00114; DB00117' }
{'Glutaminase liver isoform, mitochondrial' } {'DB00142' }
{'Coagulation factor XIII A chain' } {'DB02340; DB11300; DB11311; DB11571; DB13151' }
{'Nitric oxide synthase, inducible' } {'DB00125; DB00155; DB01017; DB01110; DB01234; DB01686; DB01835; DB01997; DB02044; DB02207; DB02234; DB02462; DB02644; DB03100; DB03144; DB03366; DB03449; DB03953; DB04400; DB04534; DB05214; DB05252; DB05383; DB06879; DB06916; DB07002; DB07003; DB07007; DB07008; DB07011; DB07029; DB07306; DB07318; DB07388; DB07389; DB07405; DB08214; DB08750; DB08814; DB09237; DB11327; DB14649' }
{'Estradiol 17-beta-dehydrogenase 2' } {'DB00157; DB13952; DB13953; DB13954; DB13955; DB13956' }
{'NAD(P) transhydrogenase, mitochondrial' } {'DB00157; DB01763; DB03461; DB09092' }
{'Alcohol dehydrogenase class-3' } {'DB00157; DB03017; DB03704; DB04153' }
{'Aminomethyltransferase, mitochondrial' } {'DB00116; DB00157; DB04789' }
{'Isocitrate dehydrogenase [NAD] subunit alpha, mitochondrial'} {'DB00157; DB06757; DB09092; DB09130' }
{'Voltage-dependent T-type calcium channel subunit alpha-1I' } {'DB00381; DB00568; DB00617; DB00909; DB01118; DB01388; DB04841; DB06152; DB09061; DB09235; DB14009; DB14011' }
{'Adenosine receptor A1' } {'DB00193; DB00201; DB00277; DB00555; DB00640; DB00651; DB00806; DB00824; DB00996; DB01223; DB01303; DB01412; DB04932; DB04954; DB06471; DB09061; DB11757; DB12569; DB12670' }
{'Tyrosine-protein kinase ABL1' } {'DB00171; DB00619; DB01254; DB03878; DB04868; DB05184; DB06616; DB07831; DB08043; DB08231; DB08339; DB08350; DB08583; DB08896; DB08901; DB12010; DB12267; DB12323' }
{'High affinity immunoglobulin epsilon receptor subunit alpha'} {'DB00043; DB00895; DB05797' }
{'Coagulation factor VIII' } {'DB00055; DB00100; DB06050; DB11300; DB11312; DB11571; DB11572; DB12872; DB13133; DB13151; DB13152; DB13933; DB14700' }
{'Prostaglandin G/H synthase 1' } {'DB00154; DB00159; DB00244; DB00316; DB00328; DB00350; DB00461; DB00465; DB00469; DB00500; DB00554; DB00573; DB00586; DB00605; DB00711; DB00712; DB00749; DB00784; DB00788; DB00795; DB00812; DB00814; DB00821; DB00861; DB00870; DB00936; DB00939; DB00945; DB00963; DB00991; DB01009; DB01014; DB01050; DB01283; DB01397; DB01399; DB01401; DB01419; DB01435; DB01600; DB01837; DB01892; DB02047; DB02110; DB02198; DB02266; DB02379; DB02709; DB02773; DB03667; DB03752; DB03753; DB03783; DB04552; DB04557; DB04817; DB06725; DB06736; DB06802; DB07981; DB07983; DB07984; DB08814; DB09061; DB09212; DB09213; DB09214; DB09215; DB09216; DB09288; DB09295; DB11071; DB11079; DB11201; DB11323; DB12445; DB13346; DB13501; DB13783; DB14009; DB14011'}
T2 = readtable('drugID_names_matched_CCM.csv')
T2 = 188×2 table
Drug Target
___________ ________________________________________
{'DB00112'} {'Vascular endothelial growth factor A'}
{'DB01017'} {'Vascular endothelial growth factor A'}
{'DB01120'} {'Vascular endothelial growth factor A'}
{'DB01136'} {'Vascular endothelial growth factor A'}
{'DB01270'} {'Vascular endothelial growth factor A'}
{'DB03088'} {'Vascular endothelial growth factor A'}
{'DB05294'} {'Vascular endothelial growth factor A'}
{'DB05434'} {'Vascular endothelial growth factor A'}
{'DB05890'} {'Vascular endothelial growth factor A'}
{'DB05932'} {'Vascular endothelial growth factor A'}
{'DB05969'} {'Vascular endothelial growth factor A'}
{'DB06642'} {'Vascular endothelial growth factor A'}
{'DB06779'} {'Vascular endothelial growth factor A'}
{'DB08885'} {'Vascular endothelial growth factor A'}
{'DB09301'} {'Vascular endothelial growth factor A'}
{'DB10772'} {'Vascular endothelial growth factor A'}
C = regexp(T1{:,2},';\s*','split');
idx = arrayfun(@(x,c)x(ones(1,numel(c{1}))),1:numel(C),C.','un',0);
T1_new = table(T1{[idx{:}],1},[C{:}].');
T_out = T1_new(ismember(T1_new{:,2},T2{:,1}),:)
T_out = 2099×2 table
Var1 Var2
_______________________________________________________________ ___________
{'Nitric oxide synthase, inducible' } {'DB01017'}
{'Nitric oxide synthase, inducible' } {'DB03144'}
{'Alcohol dehydrogenase class-3' } {'DB03017'}
{'Isocitrate dehydrogenase [NAD] subunit alpha, mitochondrial'} {'DB06757'}
{'Isocitrate dehydrogenase [NAD] subunit alpha, mitochondrial'} {'DB09130'}
{'Tyrosine-protein kinase ABL1' } {'DB01254'}
{'Tyrosine-protein kinase ABL1' } {'DB05184'}
{'Tyrosine-protein kinase ABL1' } {'DB06616'}
{'Tyrosine-protein kinase ABL1' } {'DB08231'}
{'Tyrosine-protein kinase ABL1' } {'DB08896'}
{'Tyrosine-protein kinase ABL1' } {'DB08901'}
{'Tyrosine-protein kinase ABL1' } {'DB12010'}
{'30S ribosomal protein S4' } {'DB01017'}
{'Vascular endothelial growth factor receptor 3' } {'DB00398'}
{'Vascular endothelial growth factor receptor 3' } {'DB05932'}
{'Vascular endothelial growth factor receptor 3' } {'DB08896'}
result = join(T_out{:,:},' ')
result = 2099×1 cell array
{'Nitric oxide synthase, inducible DB01017' }
{'Nitric oxide synthase, inducible DB03144' }
{'Alcohol dehydrogenase class-3 DB03017' }
{'Isocitrate dehydrogenase [NAD] subunit alpha, mitochondrial DB06757'}
{'Isocitrate dehydrogenase [NAD] subunit alpha, mitochondrial DB09130'}
{'Tyrosine-protein kinase ABL1 DB01254' }
{'Tyrosine-protein kinase ABL1 DB05184' }
{'Tyrosine-protein kinase ABL1 DB06616' }
{'Tyrosine-protein kinase ABL1 DB08231' }
{'Tyrosine-protein kinase ABL1 DB08896' }
{'Tyrosine-protein kinase ABL1 DB08901' }
{'Tyrosine-protein kinase ABL1 DB12010' }
{'30S ribosomal protein S4 DB01017' }
{'Vascular endothelial growth factor receptor 3 DB00398' }
{'Vascular endothelial growth factor receptor 3 DB05932' }
{'Vascular endothelial growth factor receptor 3 DB08896' }
{'Vascular endothelial growth factor receptor 3 DB09079' }
{'Vascular endothelial growth factor receptor 3 DB12010' }
{'Vascular endothelial growth factor receptor 1 DB00398' }
{'Vascular endothelial growth factor receptor 1 DB05932' }
{'Vascular endothelial growth factor receptor 1 DB08896' }
{'Vascular endothelial growth factor receptor 1 DB09079' }
{'Vascular endothelial growth factor receptor 1 DB09221' }
{'Vascular endothelial growth factor receptor 1 DB12010' }
{'Insulin receptor DB12010' }
{'RAF proto-oncogene serine/threonine-protein kinase DB00398' }
{'RAF proto-oncogene serine/threonine-protein kinase DB04973' }
{'RAF proto-oncogene serine/threonine-protein kinase DB05190' }
{'RAF proto-oncogene serine/threonine-protein kinase DB05268' }
{'RAF proto-oncogene serine/threonine-protein kinase DB08862' }
참고 항목
카테고리
Help Center 및 File Exchange에서 Tables에 대해 자세히 알아보기
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!오류 발생
페이지가 변경되었기 때문에 동작을 완료할 수 없습니다. 업데이트된 상태를 보려면 페이지를 다시 불러오십시오.
웹사이트 선택
번역된 콘텐츠를 보고 지역별 이벤트와 혜택을 살펴보려면 웹사이트를 선택하십시오. 현재 계신 지역에 따라 다음 웹사이트를 권장합니다:
또한 다음 목록에서 웹사이트를 선택하실 수도 있습니다.
사이트 성능 최적화 방법
최고의 사이트 성능을 위해 중국 사이트(중국어 또는 영어)를 선택하십시오. 현재 계신 지역에서는 다른 국가의 MathWorks 사이트 방문이 최적화되지 않았습니다.
미주
- América Latina (Español)
- Canada (English)
- United States (English)
유럽
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)
아시아 태평양
- Australia (English)
- India (English)
- New Zealand (English)
- 中国
- 日本Japanese (日本語)
- 한국Korean (한국어)