Data manipulation of a financial time series

조회 수: 2 (최근 30일)
Mate 2u
Mate 2u 2012년 10월 5일
Hi all I have a 1x1 Struct called "ans". Within this I have a textdata matrix of size 5 million x 4.
Now in this textdata I have 4 columns, 1st column is dates, 2nd column is time, 3rd column is prices and 4th column is symbol.
OBJECTIVE
I need to rearrange the 2nd column times so it gives me the prices in 30 second intervals. Currently it is not regular.
Example of current format of column 2 is:'18:42:20.000' '18:42:22.000' '18:42:30.000' '18:42:40.000' '18:42:40.000' '18:42:40.000' '18:42:40.000' '18:42:40.000' '18:42:43.000' '18:42:49.000' '18:43:00.000'
I need to do it so that it gives me the times only every 30 seconds and the corresponding price (column 3) at that specific time. The goal is to have regular 30 second intervals alongside the price at that current time. So we only need to work with columns 2 and 3.
Please message me if I never explained properly, and thanks so much for the help in advance.
  댓글 수: 4
José-Luis
José-Luis 2012년 10월 5일
Well, you would still need to distinguish between days then. Say the price in Monday at 18:40:02 is 412 bananas and on Tuesday at 18:40:02 it is 24 bananas. That would have an impact in your calculations. Similary, you have several data with the same time stamp:
  • 18:42:00 12 bananas
  • 18:42:00 13 bananas
  • 18:42:00 0 bananas
Which is the valid one? Are the prices guaranteed to be the same? Or how do you calculate it? The average? The minimum? The maximum? The last one? Or maybe I am missing something?
Mate 2u
Mate 2u 2012년 10월 5일
Hi, you are right 1st column is dates in the format of 01/03/2007...
The prices are not guaranteed to be the same.
Let me explain properly: I need the prices in 30 second even intervals. Currently it is uneven. If there are multiple prices for a certain time that corresponds to many trades occuring in a second and the last price is what the asset would remain until the next trade? Does this help?

댓글을 달려면 로그인하십시오.

답변 (1개)

José-Luis
José-Luis 2012년 10월 5일
편집: José-Luis 2012년 10월 5일
Here is what you can do, assuming your_data is a cell matrix containing strings.
your_data{1} = '01/08/2007';
your_data{1,2} = '18:04:01';
your_data{2,1} = '04/08/2007';
your_data{2,2} = '18:05:01';
your_data{1,3} = '12';
your_data{2,3} = '13';
%Getting Matlab's date number
your_dates = cellfun(@(a,b) datenum([a ' ' b],'dd/mm/yy HH:MM:SS'),...
your_data(:,1),your_data(:,2));
%Creating time stamp with 1/2 hour interval
interpDates = (floor(your_dates(1)):1/2880:ceil(your_dates(end)));
interpDates(interpDates < your_dates(1)) = [];
interpDates(interpDates > your_dates(end)) = [];
%Keeping only the last time stamp, from the data
[dummy idx dummy] = unique(your_dates,'last');
your_dates = your_dates(idx);
%Getting data values
your_bananas = cellfun(@(a) str2double(a), your_data(:,3));
your_bananas = your_bananas(idx);
%Function to find the nearest value
find_banana = @(x) find(your_dates>=x,1,'first'); %There is probably a more efficient way
%Interpolating
your_interp_bananas = arrayfun(@(x) your_bananas(find_banana(x)),interpDates);
  댓글 수: 4
José-Luis
José-Luis 2012년 10월 5일
편집: José-Luis 2012년 10월 5일
Modified original code. I thought it was 30 min interval. To do a 30 second interval, use:
interpDates = (floor(your_dates(1)):1/2880:ceil(your_dates(end)));
Also I realized an error in the find_bananas function, it should have been written the other way around:
find_banana = @(x) find(your_dates>=x,1,'first');
Mate 2u
Mate 2u 2012년 10월 5일
OK, I am running it on my 5,000,000 x 4 cell array. It is Busy, I will let you know when it has run and if it works.

댓글을 달려면 로그인하십시오.

카테고리

Help CenterFile Exchange에서 Money.Net에 대해 자세히 알아보기

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by