Deleting part of a column between two dates?
daten1=floor(gas_calcorr(1,1));
% daten1=datenum(2018,08,20);
% daten2=floor(gas_calcorr(end,1));
daten2=datenum(2018,08,31);
RemoveData=(gas_calcorr(daten1:daten2,7));
Comments: 3
Please be more specific about your data, problem, and question. I also think you do not want to use daten1 and daten2 as indices into gas_calcorr: you create these values with datenum, so index 1 would correspond to the first of January in the year 0.
Yes, you are right, that was the wrong way. To be more specific: I just want to delete very noisy data from a period when an instrument was malfunctioning, and unfortunately it is a long vector. It goes from 01/02/2018 until 31/08/2018. My matrix is the following:
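To illustrate the point in the comment above: datenum returns a serial date number (a day count), which is far larger than any row index of the matrix. The following is only a minimal sketch on made-up data; the matrix `demo` and its values are hypothetical stand-ins, not the asker's gas_calcorr.

```matlab
% Hypothetical stand-in for gas_calcorr: column 1 = serial date numbers,
% column 7 = CO. Ten rows at 1-minute resolution.
t0   = datenum(2018, 8, 20, 0, 0, 0);
demo = [t0 + (0:9)' / 1440, rand(10, 6)];

% A serial date number is a day count, not a row number.
daten1 = datenum(2018, 8, 20);
fprintf('daten1 = %d, rows in demo = %d\n', daten1, size(demo, 1));

% demo(daten1:daten1+1, 7) would fail because the index is out of range.
% Compare the time column against the dates instead to get row numbers:
rows = find(demo(:, 1) >= daten1 & demo(:, 1) < daten1 + 1);
```

Here `rows` holds the row numbers that fall on the chosen day, which is what the original indexing attempt was presumably after.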
% Columns:
% 1: time
% 2: pressure drop in inlet (provides information on possible jams in inlet)
% 3: O3
% 4: SO2
% 5: NO
% 6: NOx
% 7: CO
All the others are fine except CO, and that must go (become NaN) within these dates. My data is at 1-minute resolution and there is too much of it to manipulate in the Variable Editor. Could you write me an example script? Thanks. MJ
Okay, I think I understand the problem and will prepare an example script. Is column 2 a criterion for the malfunction, so that if column 2 is true then CO should be NaN?
Accepted Answer
Benjamin Großmann
9 March 2020
clearvars
close all
clc
% Let's create the date column (I only use 1 hour with an increment of 1 minute), but this should work for any length
dt_datetime = [datetime(2018,02,01,14,00,00):minutes(1):datetime(2018, 02, 01, 14, 59, 59)];
dt = datenum(dt_datetime); % this should look like your first column transposed
% Generate the rest of your data as random values and attach it to the time
% vector
data_orig = [dt' rand(size(dt,2), 6)];
% now, the variable "data_orig" should have the dimensions of your gas_calcorr variable
% We now can try to manipulate the data
%% Example 1) Give specific start and end date and set the CO values (seventh column) within these dates to NaN
data1 = data_orig; % do not override the original data since we need it for another example
start_date = datenum(2018, 02, 01, 14, 20, 00);
end_date = datenum(2018, 02, 01, 14, 25, 00);
% generate a mask where the date fulfills the criterion
mask1 = (data1(:, 1) >= start_date) & (data1(:, 1) <= end_date); % creates a logical vector with 1s and 0s
% use logical indexing as row index to apply the mask:
% Set the values in the 7th column and each row where the mask is 1 to NaN
data1(mask1, 7) = NaN;
%% Example 2) Search for a criterion in column 2 and apply the mask
% do not override the original data since we may need it for another example
data2 = data_orig;
mask2 = data2(:,2) >= 0.5;
% use logical indexing as row index to apply the mask for the corresponding mask
data2(mask2, 7) = NaN;
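As a possible variation on Example 1, the same mask can be built from datetime values with isbetween, which some readers may find easier to read than serial date numbers. This is a sketch under assumptions, not part of the accepted answer: the names `t`, `tt`, and `data` are hypothetical, and the bounds are deliberately placed half a minute off the sample grid because converting datenum values back to datetime can introduce tiny rounding differences at exact boundaries.

```matlab
% Hypothetical demo data: one hour at 1-minute steps, 7 columns as in the thread.
t    = (datetime(2018,2,1,14,0,0) : minutes(1) : datetime(2018,2,1,14,59,0))';
data = [datenum(t), rand(numel(t), 6)];

% Convert the serial date numbers back to datetime for readable comparisons.
tt = datetime(data(:, 1), 'ConvertFrom', 'datenum');

% isbetween includes both end points by default. The bounds sit between
% samples so that rounding in the conversion cannot change the result.
lower = datetime(2018,2,1,14,19,30);
upper = datetime(2018,2,1,14,25,30);
mask  = isbetween(tt, lower, upper);

% Set CO (column 7) to NaN for the samples from 14:20 to 14:25.
data(mask, 7) = NaN;
```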
Comments: 9
Thanks Benni,
Sorry for my delayed reply; I was in the field with no connection until now. I just tried your solution and it did not work for me. If you are still willing to assist me, could I share my matrix with you?
It is a bit big, so I could upload it to my Google Drive and send you a shareable link if you don't mind. Also, to answer your earlier question: column 2 is not a criterion for column 7.
Thanks in anticipation,
Micky
Yes, your data would be helpful. Furthermore, can you please explain how you get the dates for which the CO value has to be NaN? Do you have a list, or do you obtain those dates by looking at a plot or something like that?
Micky Josipovic
10 March 2020
Edited: Image Analyst
10 March 2020
Thanks Benni,
We ran a field station with a lot of ambient measurements. With any and all interactions we diarise any interference and changes. The CO analyser got faulty (Feb 2018) and we could only replace it at the end of Aug. 2018. Thus we decided that (very) noisy data (Feb - Aug) should be flagged and excluded from our "cleaned" matrix (that we use further in our work). Thus column 7 (CO conc. ) in that period needs to be NaNs. Other columns must remain as they are (also checked and flagged when unreliable on the basis of daily data screening).
I am going to upload my gas_calcorr matrix to a folder and send you a link right now. Please try another code.
Thanks a lot!
M
Micky Josipovic
10 March 2020
Edited: Image Analyst
10 March 2020
Here is the link:
Hi Benni, MATLAB Answers does not allow me to attach the file nor send you a Google link to it (it considers it spam). Is there any other option to send it to you? A Dropbox link?
Hi, I checked your data and there are already a lot of NaNs inside it (in all columns except column 1). Anyway, the following code reads your MAT-file and sets column 7 to NaN for the rows between startDate and endDate. Please let me know if this works for you. In your data there are 180637 NaNs in column 7, and after running the script there are 427606 NaNs in column 7.
clearvars
close all
clc
%% ----- Settings -----
% the file name (and path) of the mat-File to load data
input.matFileName = 'gas_calcorr_O3_SO2_NO_NOx.mat';
% the file name (and path) of the mat-File to save the data
output.matFileName = 'gas_calcorr_O3_SO2_NO_NOx_out.mat';
% The name of the variable inside the input mat-File where the data is stored
input.variableName = 'gas_calcorr';
% The name of the variable inside the output mat-File where the filtered data is stored
output.variableName = 'gas_calcorr';
% The first date and time to set the seventh column NaN (Year, Month, day, hour, minute, second)
startDate = datenum(2018, 02, 01, 00, 00, 00);
% The last date and time to set the seventh column NaN (Year, Month, day, hour, minute, second)
endDate = datenum(2018, 08, 31, 23, 59, 59);
%% ----- Program -----
% load the mat-File
data_struct = load(input.matFileName);
% get the contents of the struct
d = data_struct.(input.variableName);
% generate a mask where the date fulfills the criterion
mask = (d(:, 1) >= startDate) & (d(:, 1) <= endDate); % creates a logical vector with 1s and 0s
% use logical indexing as row index to apply the mask:
% Set the values in the 7th column and each row where the mask is 1 to NaN
d(mask, 7) = NaN;
%% ----- Output -----
out.(output.variableName) = d;
save(output.matFileName, '-struct', 'out')
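The NaN counts quoted before the script can be checked with nnz(isnan(...)) before and after applying the mask. The sketch below uses a small made-up matrix `d` (hypothetical values, not the uploaded data) so that the arithmetic is easy to follow; rows that were already NaN inside the date range are not counted twice.

```matlab
% Hypothetical stand-in for the loaded matrix d (column 1 = time, column 7 = CO).
% Ten rows, one per day, starting on 1 January 2018.
d = [datenum(2018, 1, 1) + (0:9)', (1:10)' * ones(1, 6)];
d([2 5], 7) = NaN;                       % two NaNs already present

startDate = datenum(2018, 1, 4);
endDate   = datenum(2018, 1, 7);

nBefore = nnz(isnan(d(:, 7)));           % NaNs before masking
mask    = (d(:, 1) >= startDate) & (d(:, 1) <= endDate);
d(mask, 7) = NaN;
nAfter  = nnz(isnan(d(:, 7)));           % NaNs after masking

% The difference is the number of rows that were newly set to NaN.
fprintf('NaNs before: %d, after: %d, newly set: %d\n', ...
    nBefore, nAfter, nAfter - nBefore);
```

In this toy case the mask covers four rows, but only three are newly set because one of them was already NaN.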
Thanks Benni,
It worked! Thanks a lot. However, when I try to load and plot column 7 (CO gas), my script (below) plots everything else (from the desired date) but not the CO column at all. Could you please look at where I went wrong?
% Program to visually remove bad data from gas_calcorr measurements
% VV 15.4.2010 :
% - all paths to files are collected in paths.m. Modify it!
% - Using Trailer_ini.xls to read calibrations and filter out maintenance periods
% Columns:
% 1: time
% 2: pressure drop in inlet (provides information on possible jams in inlet)
% 3: O3
% 4: SO2
% 5: NO
% 6: NOx
% 7: CO
load('gas_calcorr_O3_SO2_NO_NOx_out.mat');
% load('gas_calcorrected.mat'); % gas_calcorr
% gas_calcorr=gases_uncor;
% load O3;
load('../raw_data/soil_raw.mat'); % soil data for power indicator
col=7 % check and remove from this column
collabel='CO';
%% remember to save and load corrected matrix "gas_calcorr" before continuing with
%% the next run
% daten1=floor(gas_calcorr(1,1));
daten1=datenum(2018,9,1);
daten2=floor(gas_calcorr(end,1));
% daten2=datenum(2017,8,22);
% for daten=daten1:daten2
for daten= daten1:daten2
grr1=find(gas_calcorr(:,1)>=daten,1,'first');
grr2=find(gas_calcorr(:,1)<=daten+1,1,'last');
figure(1)
subplot(2,1,1)
plot(gas_calcorr(grr1:grr2,1),gas_calcorr(grr1:grr2,3),'.b')
ylabel('O3')
xlim([daten daten+1])
datetick('x','HH:MM','keeplimits')
grid on;
subplot(2,1,2)
plot(gas_calcorr(grr1:grr2,1),gas_calcorr(grr1:grr2,4),'.b')
ylabel('SO2')
xlim([daten daten+1])
datetick('x','HH:MM','keeplimits')
grid on;
figure(2)
subplot(2,1,1)
plot(gas_calcorr(grr1:grr2,1),gas_calcorr(grr1:grr2,5),'.b',gas_calcorr(grr1:grr2,1),gas_calcorr(grr1:grr2,6),'.g')
ylabel('NO, NOx')
xlim([daten daten+1])
datetick('x','HH:MM','keeplimits')
grid on;
subplot(2,1,2)
plot(gas_calcorr(grr1:grr2,1),gas_calcorr(grr1:grr2,7),'.b')
ylabel('CO')
xlim([daten daten+1])
datetick('x','HH:MM','keeplimits')
grid on;
addpath ../raw_data/; % use one function from this folder
plot_power_rh(soil(soil(:,1)>=daten & soil(:,1)<=daten+1 ,[1 17]),[],4);% plot power indicator, not RH, in fig 4
rmpath ../raw_data/;
figure(3) % this is the figure where you remove bad points
plot(gas_calcorr(grr1:grr2,1),gas_calcorr(grr1:grr2,col),'.r')
ylabel(collabel)
xlim([daten daten+1])
datetick('x','HH:MM','keeplimits')
grid on;
title(['Remove bad ' datestr(daten,'dd.mm.yyyy')])
rem=input('Remove? ','s');
while ~isempty(rem)
disp('Removing all data between two x-values.')
[x y]=ginput(2);
if x(1)<daten
x(1)=daten;
end
if x(2)>daten+1
x(2)=daten+1;
end
disp([datestr(x(1),'yyyymmdd HH:MM') ' ' datestr(x(2),'yyyymmdd HH:MM')])
if x(2)>x(1)
hold on
plot(gas_calcorr(gas_calcorr(:,1)>=x(1) & gas_calcorr(:,1)<=x(2),1 ),gas_calcorr(gas_calcorr(:,1)>=x(1) & gas_calcorr(:,1)<=x(2),col ),'ok');
hold off
else
disp('x(2) > x(1)! Period rejected, no data will be removed.');
end
go_on=input('Ok? ','s');
while ~isempty(go_on)
figure(3)
plot(gas_calcorr(grr1:grr2,1),gas_calcorr(grr1:grr2,col),'.r')
ylabel(collabel)
xlim([daten daten+1])
datetick('x','HH:MM','keeplimits')
grid on;
title(['Remove bad ' datestr(daten,'dd.mm.yyyy')])
[x y]=ginput(2);
disp([datestr(x(1),'yyyymmdd HH:MM') ' ' datestr(x(2),'yyyymmdd HH:MM')])
if x(2)>x(1)
hold on
plot(gas_calcorr(gas_calcorr(:,1)>=x(1) & gas_calcorr(:,1)<=x(2),1 ),gas_calcorr(gas_calcorr(:,1)>=x(1) & gas_calcorr(:,1)<=x(2),col ),'ok');
hold off
else
disp('x(2) > x(1)! Period rejected, no data will be removed.');
end
go_on=input('Ok? ','s');
end
if x(2)>x(1)
gas_calcorr(gas_calcorr(:,1)>=x(1) & gas_calcorr(:,1)<=x(2),col )=nan;
else
disp('x(2) > x(1)! Period rejected, no data will be removed.');
end
figure(3)
plot(gas_calcorr(grr1:grr2,1),gas_calcorr(grr1:grr2,col),'.r')
ylabel(collabel)
xlim([daten daten+1])
datetick('x','HH:MM','keeplimits')
grid on;
title(['Remove bad ' datestr(daten,'dd.mm.yyyy')])
rem=input('Remove? ','s');
end
end
Hey Micky,
your code seems fine. It could be improved here and there, but it gets the job done. I think you only looked at data points where the CO value is NaN. Remember, as I said in the earlier comment, the data that you uploaded to Google Drive already contains close to 200,000 NaNs. If you don't see any CO data in the plot, then the whole day contains only NaNs.
Please set your daten1 variable to something like
daten1=datenum(2018, 11, 15);
to get some data for the CO plot.
If you set it to
daten1=datenum(2018, 12, 04);
you can see a gap in the data in all subplots.
Please let me know if you need further help. Do you know where these NaNs in your original data come from? We can also try to investigate the NaNs in your original data, maybe graphically.
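One way to investigate where the NaNs sit, rather than stepping through the plots day by day, is to compute the fraction of NaN values per calendar day and list the days that have no valid CO value at all. This is only a sketch on made-up data; `gas_demo`, `dayNum`, and `nanFrac` are hypothetical names, and the same lines would need gas_calcorr substituted in to be used on the real matrix.

```matlab
% Hypothetical stand-in for gas_calcorr: 3 days at 1-minute resolution.
t = datenum(2018, 9, 1) + (0:3*1440-1)' / 1440;
gas_demo = [t, rand(numel(t), 6)];
gas_demo(floor(t) == datenum(2018, 9, 2), 7) = NaN;   % day 2 has no CO data

dayNum = floor(gas_demo(:, 1));          % calendar day of each row
[days, ~, idx] = unique(dayNum);         % group rows by day

% Fraction of NaN values in column 7 for each day.
nanFrac = accumarray(idx, double(isnan(gas_demo(:, 7))), [], @mean);

emptyDays = days(nanFrac == 1);          % days without any valid CO value
disp(datestr(emptyDays, 'dd.mm.yyyy'))
```

Any day listed in `emptyDays` would show an empty CO subplot in the screening script, which matches the behaviour described above.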
Hi Benni,
Yes indeed, there must have been more days with NaNs after 01/09/18 (I checked a few, not all). Thanks for your assistance, highly appreciated.
The script works now as you indicated! And to answer your question about the NaNs in the raw (and semi-cleaned) data matrix:
Those are generally power-cut periods and interferences with our measurements due to maintenance, checks, calibrations, etc. Indeed there are many, but the main culprit is our electricity grid, which gives the whole of South Africa many hours of cuts and dips. So in order for us to clean the data, all those interferences must be flagged and cut out at the beginning of further work, in which we look at other unrealistic and improbable outliers and cut them out at our discretion. Despite this we have high data retention, and the case of our CO analyser was an odd one: it malfunctioned from February until August (we could not get another one to replace it).
Thank you for offering your further assistance. I am fine for now but will count on the "MATLAB Answers" community in the future, of course.
Kind regards,
MJ