MATLAB Answers


Brain freeze re: grouping and recombination

Asked by dpb
on 17 Apr 2019
Latest activity Commented on by dpb
on 18 Apr 2019
I have two files of data related through a categorical variable. In the first, the name field has been normalized, so I can find the grouping of the multiple accounts (1 to 3 each, variable) associated with each name. I have the cell array of those accounts by unique group from this file.
The same accounts exist (amongst others not of interest here) in a second file in which the name field has NOT been, and for arcane reasons CANNOT be, normalized. (Well, not so arcane: the file is dynamic and is generated from a database, by an external process over which I have no control, before any normalization has occurred, and that can't be changed for the immediate future.)
I need to find and associate the accounts in the second file by account number, according to the groups in the first list. Where I ran into brain cramp was finding a way to generate the addressing vector by group that associates the multiple accounts, without just iterating through the list and doing an individual lookup for each.
Hopefully that's enough of a description to see the issue. I'll try to desensitize the data sufficiently to post a small sample set, but there's personal, private info in the real set, so I didn't want to just post a subset as is.
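To make the "addressing vector by group" idea concrete, here is a minimal sketch of one way it might be built without a per-group lookup. Everything in it is an assumption for illustration: the account codes, the variable names (acct2, amount, grp) and the amounts are made up, and the second file is assumed to have been read into a cellstr of account numbers.

```matlab
% Hypothetical sample data: each cell holds the 6-character account codes of one group
a = {['100101';'100201']; '200101'; ['300101';'300201';'300301']};

% Hypothetical second file: accounts in arbitrary order, plus some of no interest
acct2  = {'300201';'999999';'100101';'200101';'100201';'300101';'888888';'300301'};
amount = [5; 1; 10; 7; 20; 3; 2; 4];

% Flatten the groups into one list and record which group each account came from
nPerGroup = cellfun(@(c) size(c,1), a);          % accounts per group (1 to 3)
allAccts  = cellstr(vertcat(a{:}));              % every account of interest, one per row
grpOfAcct = repelem((1:numel(a)).', nPerGroup);  % group number for each row of allAccts

% One ismember call maps every row of the second file to its group (0 = not of interest)
[tf, loc] = ismember(acct2, allAccts);
grp = zeros(size(acct2));
grp(tf) = grpOfAcct(loc(tf));

% Example use: total per group without an explicit loop
groupTotal = accumarray(grp(tf), amount(tf), [numel(a) 1]);
```

With this made-up data, grp comes out as [3;0;1;2;1;3;0;3] and groupTotal as [30;7;12]. The sketch relies on every account code having the same width so that vertcat can stack the char arrays, and it needs repelem, which is not available in older releases.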


I think doing an inner table join will solve the problem for you.
[C,idx1,idx2] = innerjoin(Tab1,Tab2, "LeftKeys",{'name'},"RightKeys",{'name'});
on 17 Apr 2019
I'm not terribly adept with join operations, but I was thinking perhaps... The 'name' field, though, is the one that isn't normalized in the second table, so the name codings there are inconsistent. Otherwise I could just regroup that file on name, as I could in the first, where I was able to standardize/normalize the naming amongst the groupings/subaccounts.
What I've got is a cell array that contains the 1-N (N==3, max) accounts that belong together regardless of what the name is. Does that make more sense, maybe?
>> whos a
Name Size Bytes Class Attributes
a 153x1 19548 cell
>> a(1:10)
ans =
10×1 cell array
{2×6 char}
{2×6 char}
{2×6 char}
{2×6 char}
>> a{4}
ans =
2×6 char array
That's a subset. I've not yet cleaned up the names to something anonymous, but each of those accounts is associated with a given name. Ideally that name is the same everywhere, but in TABLE2 it has variations, like different abbreviations or some such, so matching on it directly doesn't work.
Also, even though there's a fixed difference between the two accounts above, that pattern is accidental, not consistent, so I can't compute the others from knowing one. (Although it's not searching for/finding the match by account that's the problem; it's finding those accounts overall efficiently that I'm looking for.)
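Since the account numbers, unlike the names, are consistent between the two files, a join could in principle be keyed on account instead of name. The following is only a sketch under assumptions: Tab2, its variable names (account, name, amount) and all the values are hypothetical stand-ins, not the real tables.

```matlab
% Hypothetical stand-ins for the grouped accounts and the second table
a = {['100101';'100201']; '200101'};
Tab2 = table({'200101';'100201';'999999';'100101'}, ...
             {'Smith J';'J. Smith';'Other';'John Smith'}, [7;20;1;10], ...
             'VariableNames', {'account','name','amount'});

% Key table: one row per account of interest, tagged with its group number
nPerGroup = cellfun(@(c) size(c,1), a);
Key = table(cellstr(vertcat(a{:})), repelem((1:numel(a)).', nPerGroup), ...
            'VariableNames', {'account','group'});

% Join on the account number; the inconsistent name field is never compared
[C, iKey, iTab2] = innerjoin(Key, Tab2, 'Keys', 'account');
```

Rows of Tab2 whose account is not in any group drop out of C, and C.group then serves as the grouping variable while iTab2 gives the matching row numbers in the second table.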
on 18 Apr 2019
Well, in the end I concluded that it's simpler (and fast enough, since the database isn't terribly big) to just loop through the cell array and use a lookup with ismember to operate on the desired rows/accounts. I realized halfway through that I needed to do some other operations on each fund besides the global sum, etc. My initial thinking was that the sum could be done with a grouping variable derived from the lookup, but that isn't possible for the other operations.
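For the record, the loop is roughly of the following shape. This is a sketch with made-up data and hypothetical variable names (acct2, amount, total, largest), and the max is only a placeholder for the other per-fund operations, not the real code.

```matlab
% Hypothetical sample data standing in for the real accounts and amounts
a      = {['100101';'100201']; '200101'};
acct2  = {'200101';'100201';'999999';'100101'};
amount = [7; 20; 1; 10];

nGrp    = numel(a);
total   = zeros(nGrp, 1);
largest = zeros(nGrp, 1);
for k = 1:nGrp
    rows = ismember(acct2, cellstr(a{k}));  % logical mask of this group's rows
    if ~any(rows), continue; end            % group absent from the second file
    total(k)   = sum(amount(rows));         % the global sum for the group
    largest(k) = max(amount(rows));         % placeholder for "other operations"
end
```

With 153 groups, the cost of one ismember call per group is negligible, and the loop body is free to do whatever each fund needs.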


0 Answers
