Recommendations on how to approach XLOOKUP/VLOOKUP/SUMIF operations on a table
조회 수: 15 (최근 30일)
이전 댓글 표시
I have a very large table which has defined column names, and several of the initial columns with only strings in them. These are like descriptive variables. The remaining columns (30 in total) store numerical values.
I'm trying to come up with a systematic approach on how to isolate the rows that contain strings from multiple colums, and then do the arithmetic operations on the numbers located in the same rows of the remaining columns.
Most of the answers I found in the community discuss predefined locations in the table, as in "I'd like to sum up the values in rows x to y in column z". I'm trying to come up with a more general and flexible approach.
Allow me to try and illustrate my question on a severely simplified example. Assume I have a table:
| Material | Sigma | L | LR |
|---|---|---|---|
| Aluminum | Plus | .12 | 111 |
| Aluminum | Minus | .34 | 222 |
| Steel | Plus | .56 | 333 |
| PET | Minus | .78 | 444 |
| Steel | Minus | .90 | 555 |
| PET | Plus | .21 | 666 |
I am able to load in the table via:
my_data = readtable('raw_data.csv','PreserveVariableNames',true);
For example, I am able to average all the values in column LR by isolating it first into an array and then using the mean function on it:
sum_LR = mean(my_data.LR)
But what I don't know is how to do this only for the rows where Material is Steel, or where Sigma is Minus, or both at the same time. Say, what is the sum of LR for Steel - how to get 888? Or what is the average value of L where the Material is x and at the same time Sigma is y.
I tried to modify this solution but I'm unable to come up with anything that will work. So I'd like to aks you only for advice on how should I approach this?
Thank you.
댓글 수: 0
답변 (1개)
Cris LaPierre
2021년 8월 17일
We can provide more specific advice if you can be more specific. Can you attach your file and some examples of calculations you need to compute? Use the paperclip icon to attach files to your post.
댓글 수: 4
Eric Sofen
2021년 8월 17일
Chris's suggestion of groupsummary or grpstat (or varfun or findgroups/splitapply) is the right approach if you want to calculate the statistics for each of the groups - that is, mean for steel, mean for Aluminum, mean for PET. You can specify multiple grouping variables in these functions to do the various combinations: means separately for PET/Plus, PET/Minus, Steel/Plus...
If you only want one of these, you can do some logical indexing on the table based on the first few variables to select the subtable or array of interest, then calculate your statistics on that.
For example, if you just want the mean of LR for Steel/Minus, use {} to extract the LR data as an array just where the material and sigma are correct. Then pass that to mean.
mean(t{t.Material =="Steel" & t.Sigma == "Minus","LR"})
참고 항목
카테고리
Help Center 및 File Exchange에서 Data Preprocessing에 대해 자세히 알아보기
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!