2 data sets - Need to make the same size!

Mate 2u on 22 June 2012
EDITED: I had an error in the array size of B; it has now been changed.
Hi Everybody..
I have 2 big data sets.
Data set 1 is contained in two arrays, the components are A which is a cell array of size 65,000,000 x 1, and B which is a normal array of size 65,000,000 x 1.
Data set 2 is also contained in two arrays: C, which is a cell array of size 61,500,000 x 1, and D, which is a normal array of size 61,500,000 x 1.
Formats
A and C:
A is the date and time stamp for the corresponding prices in B.
C is the date and time stamp for the corresponding prices in D.
The timestamps increase in both sets but are irregularly spaced, so the prices change at different times in each set. The format is:
'20090501 00:00:00.365'
'20090501 00:00:00.371'
'20090501 00:00:00.605'
'20090501 00:00:00.863'
--------
B and D are the prices which correspond to the dates and times in A and C respectively. The prices are formatted as:
98.9020000000000
98.8990000000000
98.8850000000000
98.8890000000000
What I want
I want to amend this so that A, B, C and D are all the same size. I do not want to lose any information.
So let's look at our first data set. We have A and B (time and price). We now look at C and add into A all the entries of C which are not already in A. Then, for each of these new dates and times in A, we need to add the corresponding price in B, which is just the price of the entry directly above it (the most recent earlier time).
For example, if we add '20090501 00:00:00.645' into A, then the corresponding B entry is the price at the time before it. So if we already had '20090501 00:00:00.605' in A and 98.9020000000000 in B, then the B entry for the newly added time would remain 98.9020000000000.
I look forward to some great and interesting answers.

Answers (1)

Walter Roberson on 24 June 2012
Convert to serial date numbers. Then use the two-output form of histc(); the second output will be the bin number of the highest vector entry that does not exceed the probe times.
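To make the meaning of that second output concrete, here is a minimal sketch on made-up numbers (the variables `edges`, `probes` and `bin` are illustrative and not part of the original data). It assumes the edges are sorted, and it appends `inf` so that probes later than the last edge still fall into a regular bin. Probes earlier than the first edge get bin 0 in this variant.

```matlab
% Toy data (invented for illustration): sorted reference times and probe times.
edges  = [10; 20; 30];        % stand-in for sorted serial date numbers
probes = [12; 20; 29; 35];    % times to locate among the edges

% Second output of histc: for each probe, the index k of the bin with
% edges(k) <= probe < edges(k+1). The trailing inf catches late probes.
[~, bin] = histc(probes, [edges; inf]);

% bin(k) is therefore the index of the last edge that does not exceed probes(k).
disp(bin.')   % expected: 1 2 2 3
```

Note that current MATLAB documentation marks `histc` as not recommended and points to `histcounts` (or `discretize`) instead; the bin convention for the last edge differs slightly, so the call is not a drop-in replacement.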
Comments: 1
Walter Roberson on 24 June 2012
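% Convert the timestamp strings to serial date numbers.
A_datenum = datenum(A, 'yyyymmdd HH:MM:SS.FFF');
C_datenum = datenum(C, 'yyyymmdd HH:MM:SS.FFF');
% Pad the edges with -inf and inf so that every C time falls into some bin.
[count, C_bin] = histc(C_datenum, [-inf; A_datenum(:); inf]);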
Now, the entry C{K} falls before anything in A if C_bin(K) is 1, and otherwise is at least as late as the A entry A{C_bin(K) - 1}. The -1 is needed because of the extra bin that was added at the front to catch times before anything in A.
You did not say anything about how prices should be handled.
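One way the prices could be handled, following the rule described in the question (a newly added time takes the most recent earlier price), is sketched below. This is only a sketch under stated assumptions, not a tested solution for the full 65,000,000-row data: it assumes timestamps are unique and sorted within each set, uses small made-up inputs, and introduces the names `tA`, `tC`, `tAll`, `iA`, `iC`, `Anew`, `Bnew`, `Cnew` and `Dnew` purely for illustration. It also assumes that sorting the fixed-width timestamp strings alphabetically gives chronological order, so that `union(A, C)` lines up with `union(tA, tC)`.

```matlab
% Toy inputs in the question's format (values invented for illustration).
A = {'20090501 00:00:00.365'; '20090501 00:00:00.605'};
B = [98.885; 98.902];
C = {'20090501 00:00:00.371'; '20090501 00:00:00.645'};
D = [97.100; 97.200];

fmt = 'yyyymmdd HH:MM:SS.FFF';
tA = datenum(A, fmt);
tC = datenum(C, fmt);

% Common, sorted timeline holding every distinct time from either set.
tAll = union(tA, tC);

% For each common time, the index of the latest original time not after it.
% An index of 0 means the common time precedes the first entry of that set.
[~, iA] = histc(tAll, [tA; inf]);
[~, iC] = histc(tAll, [tC; inf]);

% Carry the last known price forward; NaN where no earlier price exists.
Bnew = nan(size(tAll));  Bnew(iA > 0) = B(iA(iA > 0));
Dnew = nan(size(tAll));  Dnew(iC > 0) = D(iC(iC > 0));

% Shared timestamp strings, sorted; both sets now use the same time axis.
Anew = union(A, C);
Cnew = Anew;
```

With these toy inputs, the time '20090501 00:00:00.645' added to the first set picks up 98.902, the price recorded at '20090501 00:00:00.605', which matches the example in the question. The first common time has no earlier entry in C, so `Dnew(1)` is left as NaN; what to put there is a decision the question does not settle.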
