Rank stock returns by month

조회 수: 3 (최근 30일)
Syabil
Syabil 2023년 5월 12일
답변: Eric Sofen 2023년 6월 5일
I have a table of stock returns for each company in the US, in each month from 2000-2019. Each month has roughly 3000 companies (and forecasted returns). I want to sort these forecasted returns into ten deciles, with decile 1 having the highest 10% of forecasted returns in each month (90th percentile), and so on so decile 10 has the lowest 10% forecasted returns of the month (10th percentile). I tried discretize but this function ranks the whole table together, not by month. Finally, I want to be able to combine all decile 1 ACTUAL returns into one table, decile 2 returns into another table and so on, and then find the average returns for each month, in each decile. I know I can do this using groupsummary, but I need to get the deciles assigned first. Any tips? Thanks.
I attached a sample data that I believe represents the structure of my table. I just took random parts of my table and put it together. The original table is 700k+ rows long. I need to rank by "TPER" (forecasted returns), and in the end I will want to find average "RET" (actual return) for each month in each decile. Thanks!
  댓글 수: 2
William Rose
William Rose 2023년 5월 12일
@Syabil, To help others help you, please provide the table, or a subset of the table, or simulated data with the same table structure, that illustrates how the data is arranged. Someone could waste time devising a solution that doesn;t work for your table. For example, your table could have 3000 rows and 241 columns (col.1=company name, columns 2-241=monthly returns, for the months from Jan 2000 to Dec 2019). Or the table could have 240 x 2 columns, because the company names could change from month to month, as some companies fail and others start. Or the columns could be companies and the rows could be months. Or something else.
Syabil
Syabil 2023년 5월 13일
Hi, sorry for that. I added some sample data in my question now. Thanks!

댓글을 달려면 로그인하십시오.

답변 (2개)

Santosh Fatale
Santosh Fatale 2023년 5월 18일
Hi Syabil,
I understand that you want to process the data available in tabular format in MATLAB.
Kindly refer to the following code snippet, which could help you to understand about extracting rows of interests from the table variable.
load("sampledata.mat");
originalTable = sampledata;
% Sort table as per Year+Month and assign it to new table variable.
[sortedValues, sortingIndices] = sort(originalTable.dates,'ascend');
updatedTable = originalTable(sortingIndices,:);
% Find unique dates from the updated table.
uniqueDates = unique(updatedTable.dates);
% Loop over each unique date and process the data.
for idx = 1 : length(uniqueDates)
% Select the rows as per
tempTable = updatedTable(updatedTable.dates == uniqueDates(idx),:);
% Write your logic to process data from tempTable
end
In “for” loop, you could add logic for processing data extracted from the original table.
Refer to the documentation page of table, unique, sort to learn more about functions used in the code snippet and how to process tabular data in MATLAB.
  댓글 수: 1
Syabil
Syabil 2023년 5월 20일
편집: Syabil 2023년 5월 20일
Hi Santosh,
Thank you for this! Just one final question, I did the processing, but after the loop ends, I'm only left with values of d for the last loop (ie. Dec 2018 since its the last month in my table). I do know that each loop would overwrite previous interations. So how do I extract the values for the previous months (preferably into the original table) before it is overwritten? Most answers I find on this website are when each row has unique i values, such as i=1:10 with also 10 rows, but my data will have thousands of rows with the same idx value. How do I get around that? Thanks.
for idx = 1 : length(uniqueDates)
% Select the rows as per
%fulltable and yyyymm are simply my actual table and variable names
tempTable = fulltable(fulltable.yyyymm == uniqueDates(idx),:);
% Write your logic to process data from tempTable
d= discretize(tempTable{:,"TPER"},quantile(tempTable{:,"TPER"},[0:10]/10));
end

댓글을 달려면 로그인하십시오.


Eric Sofen
Eric Sofen 2023년 6월 5일
First off, get your dates into datetime rather than YYYYMM doubles!
Then, you can use groupsummary to group the data by month and do the discretize you have above. The tricky part is that each month appears to have a different number of companies, so the size of the output varies from month to month. To deal with this, we wrap the output in a cell. Then, for any given month, if you want the company numbers corresponding to the quantiles, you'll have to index back into that list...
load sampledata.mat
sampledata.dates = datetime(string(sampledata.dates),InputFormat="yyyymm");
head(sampledata)
dates companynumber TPER RET ____________________ _____________ _______ _______ 01-Jan-2000 00:01:00 79678 0.10131 0.5681 01-Jan-2000 00:01:00 84828 0.10181 1.4585 01-Jan-2000 00:01:00 81127 0.10192 1.129 01-Jan-2000 00:01:00 80320 0.10388 1.1066 01-Jan-2000 00:01:00 77668 0.10641 0.74181 01-Jan-2000 00:01:00 87092 0.11021 0.1373 01-Jan-2000 00:01:00 80049 0.11454 0.92289 01-Jan-2000 00:01:00 83531 0.1147 0.51898
pctlFun = @(x) {discretize(x,quantile(x,[0:10]/10))};
coNumList = @(y) {horzcat(y)};
monthlyPercentiles = groupsummary(sampledata,"dates","month",pctlFun,"TPER")
monthlyPercentiles = 4×3 table
month_dates GroupCount fun1_TPER ___________ __________ _____________ Jan-2000 30 {30×1 double} Jan-2004 15 {15×1 double} Jan-2011 15 {15×1 double} Jan-2015 30 {30×1 double}
Depending on the format you ultimately want, you could also use rowfun to find which companies are in each quantile, then put each quantile's companies into a cell that goes into the table. Similar to this example.

카테고리

Help CenterFile Exchange에서 Dates and Time에 대해 자세히 알아보기

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by