Merging/Superimposing two tables without using loops

조회 수: 14 (최근 30일)
Nikhil Tawakley
Nikhil Tawakley 2017년 3월 16일
댓글: Nikhil Tawakley 2017년 4월 10일
I have two very large (200000x200) tables, Table_1 & Table_2. They have columns/variables that store different data types- strings, doubles, integers etc.
I want to create a third table, Table_3, that superimposes Table 2 over Table 1 if Table 1 has a missing value. i.e.
if Table_1(row,column)<>{''},
Table_3(row,column)=Table_1(row,column)
else
Table_3(row,column)=Table_2(row,column)
For example,
Table 1 Table 2 Table 3
Record Name Age Record Name Age Record Name Age
1 John 10 1 John 1 John 10
2 Daniela 2 Daniela 5 2 Daniela 5
3 Austin 30 3 Austin 30 3 Austin 30
4 45 4 Mark 45 4 Mark 45
I want to do this without using for loops because the time to traverse them is very very slow.
To consider:
  • Is there a way to do this using the fillmissing() function? This would be the most natural approach however, Table_3=fillmissing(Table_1,'Constant',Table_2) doesn't work. (The 2nd argument (Table_2) cannot be multidimensional.)
  • If this were a simple matrix, Table_3(Table_1=='')=Table_2 would've worked; but I can't seem to do this with tables.
Thanks, NT
  댓글 수: 1
Nikhil Tawakley
Nikhil Tawakley 2017년 3월 17일
Thanks Peter- it's "like" a join but not quite- one typically does a join if there are extra variables to capture; these tables are identical in structure/size. Data in Table_1 is just more reliable than the other (but is comparatively sparsely populated).
I wan't to "superimpose" the two tables such that data from Table 1 is given priority over Table 2.
There could be situation where Mark's age in Table_1 is 85- in that case I would like the output to read:
Table 1 Table 2 Table 3
Record Name Age Record Name Age Record Name Age
4 85 4 Mark 45 4 Mark 85
There's gotta be a way to do this! Thanks again for looking into this.

댓글을 달려면 로그인하십시오.

채택된 답변

Peter Perkins
Peter Perkins 2017년 3월 16일
This isn't exactly a join, but it's close. It's not entirely clear to me what you want to do, for example, Mark is missing from reconrd 4 in table 1, but Age == 45 is not. However, in tables 2 and 3, where Mark is present, it's the same Age: 45. Is that always true? If so, I would think you could delete records from table 1 that had missing values, then outerjoin with table 2, then repeat for table 3.
Otherwise, maybe you're looking for some kind of setdiff operation.
  댓글 수: 4
Peter Perkins
Peter Perkins 2017년 3월 20일
편집: Peter Perkins 2017년 3월 20일
OK, I've been misinterpreting your example, and I didn't realize that "table 3" was you expected output. So after re-reading, I see what you want.
You say, "no loops", but I think you mean "no loops over rows." Here's a version that loops over variables, making use of ismissing. With only 200 vars, I don't think that will be an issue. I'm gonna assume that the two tables contain the same record numbers, if that's not true you would have to do some kind of join before this:
>> t1 = table( ...
[1;2;3;4],{'John';'Daniela';'Austin';''},[10;NaN;30;45], ...
'VariableNames',{'Record' 'Name' 'Age'})
t1 =
4×3 table
Record Name Age
______ _________ ___
1 'John' 10
2 'Daniela' NaN
3 'Austin' 30
4 '' 45
>> t2 = table( ...
[1;2;3;4],{'John';'Daniela';'Austin';'Mark'},[NaN;5;30;45], ...
'VariableNames',{'Record' 'Name' 'Age'})
t2 =
4×3 table
Record Name Age
______ _________ ___
1 'John' NaN
2 'Daniela' 5
3 'Austin' 30
4 'Mark' 45
>> isempty(setxor(t1.Record,t2.Record)) % check the assumption
ans =
logical
1
>> t1 = sortrows(t1,'Record'); % just in case
>> t2 = sortrows(t2,'Record');
>> t3 = t1;
>> missingValues = ismissing(t1);
>> for varIndex = 2:width(t1) % index by var number, not name
i = missingValues(:,varIndex);
t3.(varIndex)(i) = t2.(varIndex)(i);
end
>> t3
t3 =
4×3 table
Record Name Age
______ _________ ___
1 'John' 10
2 'Daniela' 5
3 'Austin' 30
4 'Mark' 45
Nikhil Tawakley
Nikhil Tawakley 2017년 4월 10일
Thanks for the detailed response, Peter; this was very helpful. I ended up "solving" the issue by looping over variables. It's still slow but fortunately it's acceptable :)

댓글을 달려면 로그인하십시오.

추가 답변 (0개)

카테고리

Help CenterFile Exchange에서 Matrix Indexing에 대해 자세히 알아보기

태그

제품

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by