Creating a variable problem

Alejandro on 14 Jan 2024
Commented: Shivam on 17 Jan 2024
I am trying to create an (instrumental) variable for my linear regression.
The variable is intended to be the number of generic drug products not offered by firm i. That is, I want to count all the generic products sold by all firms, excluding firm i's own.
The key variables are:
firm: the firm identifier; there are 44 firms, numbered from 1 to 44.
indicator: a dummy variable that takes the value 0 if the drug is generic and 1 if it is branded.
productid: the unique identifier of each product in my dataset.
My dataset is panel data, so I want to count only the first instance of a generic for each firm and productid combination. Ideally, I would iterate over each productid for each firm, take the first generic instance for each firm/productid combination, and sum those instances per firm. Once I have that count, I just take the total number of generics in my dataset (82) and subtract each firm's sum from it. This is what I have tried so far:
% Iterate over each firm
uniqueFirms = unique(m.firm);
for i = 1:length(uniqueFirms)
    firm = uniqueFirms(i);
    % Get unique product IDs for the current firm
    firmProductIDs = unique(m.productid(m.firm == firm));
    % Iterate over each productid for the firm
    for j = 1:length(firmProductIDs)
        pid = firmProductIDs(j);
        % Find the first generic product for the current productid within the firm
        firstGenericIndex = find(m.firm == firm & m.productid == pid & m.indicator == 0, 1, 'first');
        if ~isempty(firstGenericIndex)
            m.first_generic_by_firm(firstGenericIndex) = 1;
        end
    end
end
% Total number of generics in the dataset
totalGenerics = 82;
% Initialize a column to store the count of generics not offered by each firm
m.generics_not_offered_by_firm = zeros(height(m), 1);
% Iterate over each firm to perform the subtraction
for i = 1:length(uniqueFirms)
    firm = uniqueFirms(i);
    % Count the first instances of generics for the firm
    countGenericsByFirm = sum(m.first_generic_by_firm(m.firm == firm));
    % Subtract from total and assign to the relevant rows
    m.generics_not_offered_by_firm(m.firm == firm) = totalGenerics - countGenericsByFirm;
end
The final result is just a vector of zeros in the variable m.generics_not_offered_by_firm. Also, the variable firstGenericIndex only stores a vector of zeros.
Could anyone help me with this? Maybe you can propose another approach. If you need further information, just let me know.
Thanks,
Alejandro.

Accepted Answer

Shivam on 14 Jan 2024
Hi,
Based on the information provided, I understand that you want to calculate the number of generic drug products not offered by firm i. This involves identifying the first generic observation for each distinct firm/productid combination, counting those per firm, and subtracting each firm's count from the overall number of generics.
You can use the following approach:
% Sort the table by firm, productid, and then by indicator so generics come first
m = sortrows(m, {'firm', 'productid', 'indicator'});
% Row numbers (in m) of all generic observations (indicator == 0)
genericRows = find(m.indicator == 0);
% First occurrence of each firm/productid combination among the generic rows;
% note that ia indexes into the generic subset, not into m
[~, ia] = unique(m(genericRows, {'firm', 'productid'}), 'stable');
% Create a logical index for the first instance of each unique combination,
% mapping ia back to row numbers of m
firstGenericIndex = false(height(m), 1);
firstGenericIndex(genericRows(ia)) = true;
% Use accumarray to count the first generics for each firm (firms are numbered 1 to 44);
% the size argument keeps a zero entry for firms that sell no generics
countGenericsByFirm = accumarray(m.firm(firstGenericIndex), 1, [max(m.firm), 1]);
% Total number of generics in the dataset
totalGenerics = 82;
% Look up each row's firm count and subtract it from the total
m.generics_not_offered_by_firm = totalGenerics - countGenericsByFirm(m.firm);
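As a quick sanity check, the same idea can be tried on a small made-up table. This is only a sketch: the data below are invented, and computing totalGenerics from the distinct generic productid values is an assumption (in the real dataset it may simply be the known value 82).

```matlab
% Made-up panel: 3 firms, generic products 101-104, branded product 201
firm      = [1; 1; 1; 1; 2; 2; 3; 3; 3];
productid = [101; 101; 102; 201; 101; 103; 201; 201; 104];
indicator = [0; 0; 0; 1; 0; 0; 1; 1; 0];   % 0 = generic, 1 = branded
m = table(firm, productid, indicator);

% Distinct (firm, productid) pairs among generic rows;
% repeated observations of the same pair collapse to one row
genericPairs = unique(m(m.indicator == 0, {'firm', 'productid'}));

% Number of distinct generics per firm; the size argument keeps
% a zero entry for any firm that sells no generics
countGenericsByFirm = accumarray(genericPairs.firm, 1, [max(m.firm), 1]);

% Assumption: total generics = distinct generic productids in the data
totalGenerics = numel(unique(m.productid(m.indicator == 0)));

% Look up each row's firm directly instead of looping over firms
m.generics_not_offered_by_firm = totalGenerics - countGenericsByFirm(m.firm);
```

Here firms 1 and 2 each offer two of the four generics and firm 3 offers one, so the new column holds 2 for the rows of firms 1 and 2 and 3 for the rows of firm 3.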
I hope it helps.
Thanks
Comments: 2
Alejandro on 17 Jan 2024
Hi! Thanks for your answer and your time. :)
I tried using the code you provided. It seems something is not working because the variable m.generics_not_offered_by_firm results in a whole vector of zeros.
Also, countGenericsByFirm should be a vector with one entry per firm, right? It displays as a 1x1 value equal to 9. Maybe the problem is here.
Shivam on 17 Jan 2024
Hey,
Could you attach your files so that I can debug the issue? I tried it with dummy data and it worked.
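On the shape question, it may help to know that accumarray returns a vector whose length equals the largest subscript it receives, so a 1x1 result suggests that every subscript passed in was 1. A minimal sketch with made-up firm labels (the variable names here are illustrative, not from the original data):

```matlab
% Hypothetical firm labels of the rows flagged as "first generic"
subsAllOne = [1; 1; 1];
countsA = accumarray(subsAllOne, 1);          % all subscripts are 1, so the result is 1x1

subsMixed = [1; 1; 3];
countsB = accumarray(subsMixed, 1);           % length 3; firm 2 gets a count of 0

% Passing an explicit size guarantees one entry per firm,
% even for firms that never appear in the subscripts
countsC = accumarray(subsMixed, 1, [5, 1]);
```

Checking size(countGenericsByFirm) and unique(m.firm(firstGenericIndex)) on the real table would show whether the flagged rows all belong to a single firm.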


Release

R2023b
