A philosophical question regarding functions (how many tasks should be wrapped inside a function?)

My data come in two types of tables. They share common columns, and the second type has extra columns. I can't decide whether I should write two separate functions to process the tables, or keep one common function and process the extra columns outside it. I personally prefer the one-function approach. Below are toy examples to show what my question is about. What would be the best practice, and what are the criteria for choosing it? Thanks for any thoughts you'd like to share.
Two types of tables
T = cell2table({'matlab 101'; 'C++ 202'}, "VariableNames", "u");
T2 = cell2table({'algebra 101', 'jack 001'; 'calculus 202', 'jill 002'}, "VariableNames", ["u", "v"]);
Coding Philosophy I. One function handles the common tasks, and the extra columns are dealt with outside the function. What I don't like about this approach is that it doesn't look as clean.
sp = func(T.u);
desired_output1 = cell2table(sp, "VariableNames", ["course", "level"])
desired_output1 = 2×2 table
      course       level
    __________    _______
    {'matlab'}    {'101'}
    {'C++'   }    {'202'}
spu = func(T2.u);
spv = func(T2.v);
desired_output2 = cell2table([spu, spv], "VariableNames", ["course", "level", "teacher", "id"])
desired_output2 = 2×4 table
       course        level     teacher       id
    ____________    _______    ________    _______
    {'algebra' }    {'101'}    {'jack'}    {'001'}
    {'calculus'}    {'202'}    {'jill'}    {'002'}
Coding Philosophy II. Two separate functions handle the two slightly different tables. What I don't like about this approach is that the functions must know the tables' variable names. If the variable names change, the functions must be rewritten.
desired_output1 = gunc(T)
desired_output1 = 2×2 table
      course       level
    __________    _______
    {'matlab'}    {'101'}
    {'C++'   }    {'202'}
desired_output2 = hunc(T2)
desired_output2 = 2×4 table
       course        level     teacher       id
    ____________    _______    ________    _______
    {'algebra' }    {'101'}    {'jack'}    {'001'}
    {'calculus'}    {'202'}    {'jill'}    {'002'}
functions:
function sp = func(C)
    % Split each entry at whitespace, e.g. 'matlab 101' -> 'matlab', '101'.
    sp = split(C);
    % Other processing steps go here. They are omitted to keep the demo simple.
end
function out = gunc(T)
    % Knows the single-column layout: variable u only.
    sp = split(T.u);
    out = cell2table(sp, "VariableNames", ["course", "level"]);
end
function out = hunc(T)
    % Knows the two-column layout: variables u and v.
    spu = split(T.u);
    spv = split(T.v);
    out = cell2table([spu, spv], "VariableNames", ["course", "level", "teacher", "id"]);
end
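One possible middle ground between the two philosophies, offered only as a sketch: a single function that loops over whatever columns the table has and takes the output names from the caller, so no variable name is hard-coded inside it. The name `splitColumns` and its `newNames` parameter are hypothetical and not part of the original post; the sketch also assumes every entry contains exactly one space.

```matlab
% Toy tables from the question.
T  = cell2table({'matlab 101'; 'C++ 202'}, "VariableNames", "u");
T2 = cell2table({'algebra 101', 'jack 001'; 'calculus 202', 'jill 002'}, ...
    "VariableNames", ["u", "v"]);

out1 = splitColumns(T,  ["course", "level"]);
out2 = splitColumns(T2, ["course", "level", "teacher", "id"]);
disp(out1)
disp(out2)

% Hypothetical helper (an assumption, not from the original post):
% splits every text column and labels the pieces with caller-supplied names.
function out = splitColumns(T, newNames)
    arguments
        T table
        newNames (1,:) string
    end
    pieces = cell(1, width(T));
    for k = 1:width(T)
        % T.(k) indexes the k-th variable by position, not by name.
        parts = split(string(T.(k)));
        % reshape keeps one row per table row, even for a one-row table,
        % where split would otherwise return a column vector.
        pieces{k} = reshape(parts, height(T), []);
    end
    allParts = [pieces{:}];
    if size(allParts, 2) ~= numel(newNames)
        error("splitColumns:nameCount", ...
            "Expected %d output names, got %d.", size(allParts, 2), numel(newNames));
    end
    out = array2table(allParts, "VariableNames", newNames);
end
```

The trade-off is that the caller, rather than the function, now has to know which output names go with which table type.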

Accepted Answer

Ive J on 21 Jun 2023
Assuming you're gonna stick with those variable names, you can do something like this:
T1 = array2table(["matlab 101"; "C++ 202"], "VariableNames", "u");
T2 = array2table(["algebra 101", "jack 001"; "calculus 202", "jill 002"], "VariableNames", ["u", "v"]);
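function out = tableSplitter(tab)
    cols = string(tab.Properties.VariableNames);
    % Accept at most two columns, and only the names "u" and "v".
    % all (rather than any) rejects a table in which just one name matches.
    if width(tab) > 2 || ~all(ismember(cols, ["u", "v"]))
        error("input table must have max two columns with var names of u/[v]!")
    end
    % Split every column at whitespace, then flatten the resulting
    % two-column variables into separate table variables.
    out = splitvars(varfun(@split, tab));
    if width(tab) == 1
        out.Properties.VariableNames = ["course", "level"];
    else
        out.Properties.VariableNames = ["course", "level", "teacher", "id"];
    end
end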
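To see what the `splitvars(varfun(@split, tab))` line does, the two steps can be run separately on the toy table. This is only an illustrative sketch; the intermediate names `step1` and `step2` are made up for the demonstration.

```matlab
% Same toy table as in the answer above.
T2 = array2table(["algebra 101", "jack 001"; "calculus 202", "jill 002"], ...
    "VariableNames", ["u", "v"]);

% Step 1: varfun applies split to each table variable. The result still has
% two variables, but each one is now a 2-by-2 string array.
step1 = varfun(@split, T2);
disp(size(step1))

% Step 2: splitvars breaks each multicolumn variable into separate
% single-column variables, giving four variables in total.
step2 = splitvars(step1);
disp(size(step2))

% The automatically generated names are replaced by the final assignment
% to Properties.VariableNames in tableSplitter.
disp(step2.Properties.VariableNames)
```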
  Comments: 4
Simon on 23 Jun 2023
I didn't know exactly what I meant by that statement. I guess I meant that in the if-else condition, the 'if' branch and the 'else' branch have different input formats and different output formats, so it looks like two functions put together with an if-else statement. I don't avoid conditional statements completely; I just think it is better for a condition to handle homogeneous data, such as if (x is positive), if (year is before 2020), if (diagnosis is negative), etc. I do appreciate your code and thoughts!
Simon on 4 Jul 2023
@Ive J I now see some benefits in your idea. Using a conditional statement to wrap related processes in one function does have its advantages. Your answer is accepted.
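For readers who share the concern about an if-else that joins two different input formats, one alternative is to replace the branch with a lookup from input variable name to output names. This is a sketch under assumptions, not the accepted answer's method: the function name `splitByLookup` and the layout of the mapping are hypothetical, and like the accepted answer it assumes at least two rows and exactly one space per entry.

```matlab
T1 = array2table(["matlab 101"; "C++ 202"], "VariableNames", "u");
T2 = array2table(["algebra 101", "jack 001"; "calculus 202", "jill 002"], ...
    "VariableNames", ["u", "v"]);

out1 = splitByLookup(T1);
out2 = splitByLookup(T2);
disp(out1)
disp(out2)

% Hypothetical variant of tableSplitter that uses a lookup instead of if-else.
function out = splitByLookup(tab)
    % Row k of outNames holds the output names for input variable inNames(k).
    inNames  = ["u", "v"];
    outNames = ["course", "level"; "teacher", "id"];

    cols = string(tab.Properties.VariableNames);
    [known, idx] = ismember(cols, inNames);
    if ~all(known)
        error("splitByLookup:unknownVariable", ...
            "Unknown variable name(s): %s", strjoin(cols(~known), ", "));
    end

    % Same split-and-flatten step as in the accepted answer.
    out = splitvars(varfun(@split, tab));

    % Collect the output names in the order the input columns appear.
    labels = outNames(idx, :).';
    out.Properties.VariableNames = labels(:).';
end
```

Supporting a third column type would then mean adding one entry to `inNames` and one row to `outNames`, with no new branch.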

Release: R2023a
