Brain freeze re: grouping and recombination

dpb on 17 Apr 2019
Commented: dpb on 18 Apr 2019
I have two files of data correlated via a categorical variable. In the first, the name field has been normalized, so I can find the grouping of the multiple accounts (1 to 3 each, variable) associated with each name. I have the cell array of those accounts by unique group from this file.
The same accounts exist (amongst others not of interest here) in a second file in which the name field has NOT been, and for arcane reasons CANNOT be, normalized. (Well, not so arcane: the file is dynamic and is generated from a database by an external process over which I have no control, before the normalization has occurred, and that can't be changed for the immediate future.)
I need to find the accounts in the second file by account number and associate them according to the grouping in the first list. Where I ran into brain cramp was finding a way to do this without just iterating through the list and doing an individual lookup for each group, i.e., a way to generate the addressing vector by group so as to associate the multiple accounts.
Hopefully that's enough of a description to see the issue... I'll try to desensitize the data sufficiently to post a small sample set, but there's personal, private info in the real set, so I didn't want to just post a subset as is...
Comments: 3
dpb on 17 Apr 2019
Edited: dpb on 17 Apr 2019
I'm not terribly adept with join operations, but I was thinking perhaps a join would do... The trouble is that the 'name' field is the one that isn't normalized in the second table, so the name codings there are inconsistent. Otherwise I could just regroup that file on name as I did the first, where I was able to standardize/normalize the naming amongst the groupings/subaccounts.
What I've got is a cell array that contains the 1-N (max N==3) accounts that belong together regardless of what the name is... does that make more sense, maybe?
>> whos a
Name Size Bytes Class Attributes
a 153x1 19548 cell
>> a(1:10)
ans =
10×1 cell array
{'A63009'}
{'A63032'}
{'A63006'}
{2×6 char}
{2×6 char}
{2×6 char}
{'B63022'}
{2×6 char}
{'B63042'}
{'B63052'}
>> a{4}
ans =
2×6 char array
'A63022'
'A64022'
>>
for a subset... I've not yet cleaned up the names to something anonymous, but each of those accounts is associated with a given name. Ideally that name is the same everywhere, but in TABLE2 it has variations like different abbreviations or some such, so matching on name directly doesn't work.
Also, even though there's a fixed difference between the two account numbers above, that pattern is accidental, not consistent, so I can't compute the others from knowing one. (Although it's not finding the match for a given account that's the problem; it's finding all those accounts efficiently overall that I'm looking for.)
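For what it's worth, one way the "addressing vector by group" might be built without a per-group lookup is to flatten the cell array into a single account list alongside a parallel vector of group numbers, and then make one ismember call against the second file. This is only a sketch under assumptions: the small `a` below mimics the cell array shown above, while `acct` and `val` are hypothetical stand-ins for the account column and an amount column of the second file, and all account codes are assumed to be the same width (6 chars, as in the sample).

```matlab
% Hypothetical sample data (not the real accounts).
a    = {'A63009'; ['A63022';'A64022']; 'B63042'};   % grouped accounts, as in a(1:10) above
acct = {'A64022'; 'Z00001'; 'A63009'; 'A63022'; 'B63042'};  % second-file account column (assumed)
val  = [10; 99; 1; 5; 7];                           % second-file amounts (assumed)

% Flatten the groups: one entry per account plus the number of its group.
n     = cellfun(@(c) size(c,1), a);       % accounts per group (1 to 3)
flat  = cellstr(vertcat(a{:}));           % all accounts in group order (needs equal widths)
grpOf = repelem((1:numel(a)).', n);       % group number for each entry of flat

% One lookup maps every row of the second file to its group (0 = not of interest).
[tf, loc] = ismember(acct, flat);
g     = zeros(size(acct));
g(tf) = grpOf(loc(tf));

% Example use of the grouping vector: per-group sum over the matched rows.
grpSum = accumarray(g(tf), val(tf), [numel(a) 1]);
```

With this sample data `g` comes out as [2;0;1;2;3], so rows 1 and 4 of the second file land in the same group even though nothing about their names was used.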
dpb on 18 Apr 2019
Well, in the end I concluded that it's simpler (and fast enough, since the database isn't terribly big) to just loop through the cell array and use a lookup with ismember to operate on the desired rows/accounts. I realized halfway through that, besides the global sum etc. that I'd initially thought I could do with a grouping variable derived from the lookup, I needed to do some other operations on each fund for which that isn't possible.
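Roughly, the loop looks like the sketch below. The data are made up for illustration (`acct` and `val` are hypothetical names for the second file's account and amount columns), and the max is just a placeholder for the other per-fund operations, which aren't spelled out here.

```matlab
% Hypothetical sample data (not the real accounts).
a    = {'A63009'; ['A63022';'A64022']; 'B63042'};   % grouped accounts from the first file
acct = {'A64022'; 'Z00001'; 'A63009'; 'A63022'; 'B63042'};  % second-file account column (assumed)
val  = [10; 99; 1; 5; 7];                           % second-file amounts (assumed)

nGrp   = numel(a);
grpSum = zeros(nGrp,1);
grpMax = zeros(nGrp,1);
for i = 1:nGrp
    % Logical addressing vector: rows of the second file belonging to group i.
    ix = ismember(acct, cellstr(a{i}));
    if ~any(ix), continue, end            % group not present in the second file
    grpSum(i) = sum(val(ix));             % the global sum for the group
    grpMax(i) = max(val(ix));             % placeholder for the other per-fund operations
end
```

Inside the loop `ix` can address whole table rows as well, so anything that needs the full set of rows for a fund can go there.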


Answers (0)
