Outerjoin: resulting table has repeated rows and has erased some data

조회 수: 28 (최근 30일)
Ariana
Ariana 2023년 11월 2일
답변: Steven Lord 2023년 11월 27일
Hi everyone!
I have two tables: table 1, and table 2 (sorry it's showing up as table 11 in the pictures). They each have one column in common - it's called "a". They are of different lengths: table 1 is a 96514x2 double, and table 2 is a 95646x2 double.
I want to combine these two tables. Any times they have in common, I'd like them to share the data and put it into one row, and any times they don't have in common, I'd like the table that doesn't have that time to just be NA.
I have used the function outerjoin. however, when I use this, I get the resulting table T, which is a 459,134x3 double. As you can see from the attached file, many of these are repeats, AND it's smaller than the parent two tables, so I lose a lot of data.
I don't know why this is happening, or what I should do instead.
and then finally, table T
here is my code:
%this is me making my tables (which are originally an array, haha, sorry) into actual tables
table1=array2table(table1)
table1.Properties.VariableNames(1:2)={'a', 'b'}
table2=array2table(table2)
table2.Properties.VariableNames(1:2)={'a','c'}
%now me using the outerjoin function
T=outerjoin(table1, table2, 'MergeKeys', true)
thank you for your help!!
  댓글 수: 2
Dyuman Joshi
Dyuman Joshi 2023년 11월 2일
"I don't know why this is happening"
Because the values are not the same, though they might appear to be so.
Can you share the data you are working with? If they are too big to attach, a sample of it would be good enough.
Use the paperclip button to attach.
Ariana
Ariana 2023년 11월 27일
Hi Dyuman,
you were right! I did have many many duplicate rows. I was able to identify them and remove them! Thank you so much for your help

댓글을 달려면 로그인하십시오.

답변 (2개)

Vatsal
Vatsal 2023년 11월 17일
Hi,
The outerjoin function in MATLAB performs an outer join operation on two tables based on one or more key variables. It includes all rows from both tables and fills the missing values with NaN (Not-a-Number) for the non-matching rows. However, it seems that the resulting table T is not as expected.
One possible reason for this behaviour is that the key variable "a" might contain duplicate values in both tables. When performing an outer join, the duplicate values can cause the resulting table to have repeated rows.
To address this issue, you can try the following steps:
  • Remove any duplicate rows from both tables using the unique function. This ensures that each key value appears only once in each table.
  • Perform the outer join on the cleaned tables using the outerjoin‘ function.
To learn more about the usage and syntax ofouterjoin, you may refer to the MathWorks documentation links below:
I hope this helps!
  댓글 수: 1
Ariana
Ariana 2023년 11월 27일
Hi Vatsal,
Thank you for this! It turned out there were many duplicate rows, I just didn't know it. Removing them helped a lot!

댓글을 달려면 로그인하십시오.


Steven Lord
Steven Lord 2023년 11월 27일
Rather than storing your data as tables with one variable containing serial date numbers, consider converting those serial date numbers to datetime and converting your tables to timetables using table2timetable. If you do that you could synchronize the timetables (which would allow you to change the time basis, synchronize them to times that appear in either the intersection or union of the times, etc.)

카테고리

Help CenterFile Exchange에서 Tables에 대해 자세히 알아보기

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by