Convert table with different types

here is a part of my script, when i read the table, the column 2, 3 and 4 is not recognized as string. According to Matlab these columns are cells. How can I change this? The rest of the columns, except for 1, are strings for which my script works. The solutions I have found in the forums, I could not transfer to my problem.
table=readtable('C:\Users\Hüs\Documents\MATLAB\Datalogger\TFAbisJun21.xlsx');
x=table.Timestamp; %Datum und Uhrzeit auslesen
x=x(end-468:end);
time=datetime(datestr((x)));
%Raum 210
temp210=table.T0; %Temperatur Raum210 auslesen
temp210=temp210(end-468:end);
temp210=str2double(temp210);
hum210=table.RH0; %Luftfeuchtigkeit Raum210 auslesen
hum210=hum210(end-468:end);
hum210=str2double(hum210);
...

댓글 수: 6

I get no such behavior --
>> opt=detectImportOptions('data.xlsx')
opt =
SpreadsheetImportOptions with properties:
Sheet Properties:
Sheet: ''
Replacement Properties:
MissingRule: 'fill'
ImportErrorRule: 'fill'
Variable Import Properties: Set types by name using setvartype
VariableNames: {'Timestamp', 'T0', 'RH0' ... and 25 more}
VariableTypes: {'datetime', 'double', 'double' ... and 25 more}
SelectedVariableNames: {'Timestamp', 'T0', 'RH0' ... and 25 more}
VariableOptions: Show all 28 VariableOptions
Access VariableOptions sub-properties using setvaropts/getvaropts
VariableNamingRule: 'modify'
Range Properties:
DataRange: 'A2' (Start Cell)
VariableNamesRange: 'A1'
RowNamesRange: ''
VariableUnitsRange: ''
VariableDescriptionsRange: ''
To display a preview of the table, use preview
>> preview('data.xlsx',opt)
ans =
8×28 table
Timestamp T0 RH0 DEWI T1 RH1 DEW1 T2 RH2 DEW2 T3 RH3 DEW3 T4 RH4 DEW4 T5 RH5 DEW5 T6 RH6 DEW6 T7 RH7 DEW7 T8 RH8 DEW8
____________________ ____ ___ ____ __ ___ ____ ____ ___ ____ ____ ___ ____ ____ ___ ____ ____ ___ ____ ____ ___ ____ ____ ___ ____ _______ _______ _______
12-Jun-2021 21:20:00 23.4 45 10.8 21 41 7.3 19 44 6.5 20.2 51 9.8 19.7 58 11.2 22.2 51 11.6 26.2 42 12.3 20.3 55 11 {'---'} {'---'} {'---'}
12-Jun-2021 21:30:00 23.4 45 10.8 21 40 6.9 19.2 43 6.3 20.2 51 9.8 19.7 57 11 22.2 50 11.3 26.2 42 12.3 20.4 58 11.9 {'---'} {'---'} {'---'}
12-Jun-2021 21:40:00 23.4 45 10.8 21 40 6.9 19.1 43 6.3 20.1 51 9.7 19.6 56 10.6 22.1 50 11.2 26.1 42 12.2 20.5 55 11.2 {'---'} {'---'} {'---'}
12-Jun-2021 21:50:00 23.3 45 10.7 21 40 6.9 19 43 6.2 20 50 9.3 19.4 56 10.4 22.1 50 11.2 26.1 41 11.9 20.2 58 11.7 {'---'} {'---'} {'---'}
12-Jun-2021 22:00:00 23.3 44 10.4 21 40 6.9 19 42 5.8 20 50 9.3 19.7 58 11.2 22.1 50 11.2 26.1 41 11.9 20.7 56 11.6 {'---'} {'---'} {'---'}
12-Jun-2021 22:10:00 23.2 44 10.3 21 40 6.9 19 42 5.8 19.9 50 9.2 19.7 57 11 22.1 49 10.9 26.1 41 11.9 20.2 56 11.2 {'---'} {'---'} {'---'}
12-Jun-2021 22:20:00 23.2 45 10.6 21 39 6.5 18.9 42 5.7 19.8 50 9.1 19.8 55 10.5 22 49 10.8 26.1 41 11.9 20.6 56 11.5 {'---'} {'---'} {'---'}
12-Jun-2021 22:30:00 23.2 44 10.3 21 39 6.5 19 41 5.5 19.8 50 9.1 19.6 53 9.8 22.1 49 10.9 26.1 40 11.5 20.2 54 10.6 {'---'} {'---'} {'---'}
>> tData=readtable('data.xlsx');
>> head(tData)
ans =
8×28 table
Timestamp T0 RH0 DEWI T1 RH1 DEW1 T2 RH2 DEW2 T3 RH3 DEW3 T4 RH4 DEW4 T5 RH5 DEW5 T6 RH6 DEW6 T7 RH7 DEW7 T8 RH8 DEW8
____________________ ____ ___ ____ __ ___ ____ ____ ___ ____ ____ ___ ____ ____ ___ ____ ____ ___ ____ ____ ___ ____ ____ ___ ____ _______ _______ _______
12-Jun-2021 21:20:00 23.4 45 10.8 21 41 7.3 19 44 6.5 20.2 51 9.8 19.7 58 11.2 22.2 51 11.6 26.2 42 12.3 20.3 55 11 {'---'} {'---'} {'---'}
12-Jun-2021 21:30:00 23.4 45 10.8 21 40 6.9 19.2 43 6.3 20.2 51 9.8 19.7 57 11 22.2 50 11.3 26.2 42 12.3 20.4 58 11.9 {'---'} {'---'} {'---'}
12-Jun-2021 21:40:00 23.4 45 10.8 21 40 6.9 19.1 43 6.3 20.1 51 9.7 19.6 56 10.6 22.1 50 11.2 26.1 42 12.2 20.5 55 11.2 {'---'} {'---'} {'---'}
12-Jun-2021 21:50:00 23.3 45 10.7 21 40 6.9 19 43 6.2 20 50 9.3 19.4 56 10.4 22.1 50 11.2 26.1 41 11.9 20.2 58 11.7 {'---'} {'---'} {'---'}
12-Jun-2021 22:00:00 23.3 44 10.4 21 40 6.9 19 42 5.8 20 50 9.3 19.7 58 11.2 22.1 50 11.2 26.1 41 11.9 20.7 56 11.6 {'---'} {'---'} {'---'}
12-Jun-2021 22:10:00 23.2 44 10.3 21 40 6.9 19 42 5.8 19.9 50 9.2 19.7 57 11 22.1 49 10.9 26.1 41 11.9 20.2 56 11.2 {'---'} {'---'} {'---'}
12-Jun-2021 22:20:00 23.2 45 10.6 21 39 6.5 18.9 42 5.7 19.8 50 9.1 19.8 55 10.5 22 49 10.8 26.1 41 11.9 20.6 56 11.5 {'---'} {'---'} {'---'}
12-Jun-2021 22:30:00 23.2 44 10.3 21 39 6.5 19 41 5.5 19.8 50 9.1 19.6 53 9.8 22.1 49 10.9 26.1 40 11.5 20.2 54 10.6 {'---'} {'---'} {'---'}
>>
whether use detectImportOptions or not; it reads everything except the last three columns as numeric; they are missing values with '---' stored so they are rightfully imported as cellstr(). You could leave those blank in the workbook and they would get imported automagically as NaN for missing value but be of class double.
Or you could fix up the import rule for missing in the import options object.
Hüseyin Uzun
Hüseyin Uzun 2021년 6월 21일
you are right, but when I run my script to read columns 2 and 3, matlab does not recognize them as a number (NaN) - see picture Workspace. Respectively when my script wants to plot columns 2 and 3 against column 1, the Figure is empty. The last three columns are empty, which is how it should be. I have also attached a picture of my table. Maybe something goes wrong with my table when importing it, can that be? As you can see, the values in columns 2, 3 and 4 look different from 5, 6 and so on.
dpb
dpb 2021년 6월 21일
Those aren't the same data as in the file you uploaded; something's different in that file.
Would have to see the same file to be able to see what was/is different, but setting the variable type in the import options object should fix it -- or, you can use str2double on the data after loading.
But, in general it is the first record in the file for each column that determines the data type -- that would imply the first record in that file isn't numeric.
If there's a missing value at that point and used the "---" indicator there as are the last three columns in the file you uploaded, that would do it. If you can't fix the Excel file to use a numeric missing value or empty, then fix up the import options object to handle the "---" missing value indicator.
See the doc for detectImportOptions for all the skinny of how to do that...
Hüseyin Uzun
Hüseyin Uzun 2021년 6월 22일
Thank you for your answer.
I looked at it and that made me understand it a little better, but I couldn't solve the problem. I also tried to solve the problem with the Excel file, but it did not work.
The original file is very large, so I could not upload it. I have shortened it now and attached the data, which causes the error and also my script. When I run opts = detectImportOptions(...) and look at the variabletypes, it looks fine. But when I run my script, I can't read the data from column 2 and 3, it says NaN even though it's a double. Can you track it with the attachements I uploaded now?
Again, it works as expected here without anything special being done...
>> tTF=readtable('TFAbisJun21a.xlsx');
>> head(tTF)
ans =
8×25 table
Timestamp T0 RH0 DEW0 T1 RH1 DEW1 T2 RH2 DEW2 T3 RH3 DEW3 T4 RH4 DEW4 T5 RH5 DEW5 T6 RH6 DEW6 T7 RH7 DEW7
____________________ ____ ___ ____ ____ ___ ____ ____ ___ ____ ____ ___ ____ ____ ___ ____ ___ ___ ____ ___ ___ ____ ___ ___ ____
14-Aug-2020 12:00:00 29.9 49 18.1 21.5 46 9.4 17.1 57 8.5 21.7 70 16 19.3 72 14.1 NaN NaN NaN NaN NaN NaN NaN NaN NaN
14-Aug-2020 12:10:00 30 49 18.2 21.4 46 9.3 16.9 59 8.9 21.9 70 16.2 19.2 72 14.1 NaN NaN NaN NaN NaN NaN NaN NaN NaN
14-Aug-2020 12:20:00 30 49 18.2 21.5 47 9.7 17.6 58 9.3 21.9 70 16.2 19.3 72 14.1 NaN NaN NaN NaN NaN NaN NaN NaN NaN
14-Aug-2020 12:30:00 30.1 49 18.2 21.5 47 9.7 16.7 60 8.9 21.9 70 16.2 19.3 72 14.1 NaN NaN NaN NaN NaN NaN NaN NaN NaN
14-Aug-2020 12:40:00 30.1 49 18.2 21.5 47 9.7 17.5 58 9.2 22 69 16.1 19.3 72 14.1 NaN NaN NaN NaN NaN NaN NaN NaN NaN
14-Aug-2020 12:50:00 30.2 49 18.3 21.5 47 9.7 16.7 60 8.9 22.1 70 16.4 19.3 73 14.4 NaN NaN NaN NaN NaN NaN NaN NaN NaN
14-Aug-2020 13:00:00 30.2 49 18.3 21.5 47 9.7 17.4 59 9.3 22.1 70 16.4 19.3 73 14.4 NaN NaN NaN NaN NaN NaN NaN NaN NaN
14-Aug-2020 13:10:00 30.3 50 18.8 21.5 47 9.7 17.2 59 9.1 22.2 70 16.5 19.4 73 14.5 NaN NaN NaN NaN NaN NaN NaN NaN NaN
>> winopen('TFAbisJun21a.xlsx');
>> tTF(14:25,14:end)
ans =
12×12 table
T4 RH4 DEW4 T5 RH5 DEW5 T6 RH6 DEW6 T7 RH7 DEW7
____ ___ ____ ____ ___ ____ ____ ___ ____ ____ ___ ____
19.4 73 14.5 NaN NaN NaN NaN NaN NaN NaN NaN NaN
19.4 73 14.5 NaN NaN NaN NaN NaN NaN NaN NaN NaN
19.4 73 14.5 31.4 49 19.4 30.7 50 19.1 30.4 51 19.2
19.4 73 14.5 25.6 50 14.4 29.5 54 19.2 31.8 43 17.7
19.4 73 14.5 23.2 54 13.4 28.5 56 18.9 22.8 61 14.9
19.5 74 14.8 23 54 13.2 28.3 57 19 21.6 59 13.3
19.5 73 14.6 22.5 53 12.5 28.3 54 18.1 21.7 60 13.6
19.5 72 14.3 22.3 55 12.9 28.1 53 17.7 21.8 63 14.5
19.4 72 14.2 22.4 53 12.4 28.1 53 17.7 21.6 63 14.3
19.5 72 14.3 22.5 54 12.8 28 54 17.9 21.5 58 12.9
19.5 71 14.1 22.5 53 12.5 28 54 17.9 21.7 61 13.9
19.4 71 14 22.4 54 12.7 28 55 18.2 21.8 64 14.7
>>
All variables are read as numeric; the last columns are the "---" missing values indicator for the first 15 records or so so they are interpreted as NaN.
As for the rest, there are some missing values scattered around here and there...
>> all(isfinite(tTF{:,2:end}))
ans =
1×24 logical array
1 1 1 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>
>> sum(~isfinite(tTF{:,2:end}))
ans =
0 0 0 38 38 38 0 0 0 33 33 33 23 23 23 291 291 291 35 35 35 22 22 22
>>
shows you that only the first and third groups of three columns are all finite; all the rest have at least some missing values that are imported as NaN.
>> arrayfun(@(i)find(~isfinite(tTF{:,i}),1),2:24,'UniformOutput',false)
ans =
1×23 cell array
Columns 1 through 12
{0×1 double} {0×1 double} {0×1 double} {[15152]} {[15152]} {[15152]} {0×1 double} {0×1 double} {0×1 double} {[38164]} {[38164]} {[38164]}
Columns 13 through 23
{[10075]} {[10075]} {[10075]} {[1]} {[1]} {[1]} {[1]} {[1]} {[1]} {[1]} {[1]}
>>
shows you the first location in each column that is missing; looking at that location for the first set of three plus the header lines to see which variables they are shows up the following--
>> tTF([1:5 15152+[0:10]],[1 4:7])
ans =
16×5 table
Timestamp DEW0 T1 RH1 DEW1
____________________ ____ ____ ___ ____
14-Aug-2020 12:00:00 18.1 21.5 46 9.4
14-Aug-2020 12:10:00 18.2 21.4 46 9.3
14-Aug-2020 12:20:00 18.2 21.5 47 9.7
14-Aug-2020 12:30:00 18.2 21.5 47 9.7
14-Aug-2020 12:40:00 18.2 21.5 47 9.7
27-Nov-2020 20:10:00 2.7 NaN NaN NaN
27-Nov-2020 20:20:00 2.7 20.6 33 3.8
27-Nov-2020 20:30:00 2.7 20.5 34 4.1
27-Nov-2020 20:40:00 3.3 20.5 34 4.1
27-Nov-2020 20:50:00 2.7 20.5 34 4.1
27-Nov-2020 21:00:00 2.6 20.5 33 3.7
27-Nov-2020 21:10:00 2.6 20.5 32 3.3
27-Nov-2020 21:20:00 2.6 20.6 35 4.6
27-Nov-2020 21:30:00 2.6 20.6 32 3.4
27-Nov-2020 21:40:00 2.6 20.6 34 4.2
27-Nov-2020 21:50:00 2.6 20.6 31 2.9
>>
For the first case, there's just one set of missing values in the series; but the above sums show there are 37 more locations in the three series overall; although I didn't look to see where they are, specifically.
The data are being read just fine; it's that there are apparently unknown to you and/or unexpectedly, missing values.
You can either remove those, or possibly interpolate to fill in (although the above distinct step change in variable DEW0 makes the latter seem a dubious proposition as it looks like there can be serious discontinuities in the observations).
Just as a comment on style and use of MATLAB, your code makes many copies of the same data in named local variables instead of using the data in the table directly with the variable names that could reduce the amount of code and duplication and make the code general for a particular purpose and just pass the desired variable names to functions.
It is possible to refer to variables in tables by either column position or by variable name as char() variables; look at the section in the documentation on "referencing and accessing data in tables" to see the full illustration of using tables effectively.
Also, it would be better to not bury "magic" numbers like the 468 that is used in subscripting expressions; instead find the location of interest by logic addressing operations so that it can be generalized or, at least, use variables to hold the numeric value if that is not feasible so the value can be changed readily in one location. The difficulty in using constants is that if you remove the missing values, then those locations can change and if they're hard coded as numbers, then that number must change by some unknown number of positions to compensate.
It doesn't appear the problem is in readtable but that you're just not accounting for the nature of the data.
Hüseyin Uzun
Hüseyin Uzun 2021년 6월 22일
thank you for your detailed answer, again.
This really helped me to understand matlab in generell better. I checked the documentation "referencing and accessing data in tables" and this is really helpful and also helps me working with the table.

댓글을 달려면 로그인하십시오.

답변 (0개)

카테고리

도움말 센터File Exchange에서 Tables에 대해 자세히 알아보기

제품

릴리스

R2019b

질문:

2021년 6월 21일

댓글:

2021년 6월 22일

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by