MATLAB Answers

TextScan - flexible formatSpec string how to?

BSantos 20 May 2015
Edited: BSantos 20 May 2015
Hey fellow "Matlabers"!
I have been struggling with this problem for quite a while and so far have not found a solution. Maybe someone here knows one or has a suggestion.
The problem:
I am reading several .txt files that contain all kinds of data (strings, dates, numbers), and I only need to extract a few columns. I need to ignore a certain number of characters (marked as string) until I reach the first column with data I need. The number of characters varies from file to file, so I don't know how to specify it in the formatSpec string passed to textscan. In the example below, 59 is the value that varies; each file has a different number of characters to discard.
Example:
formatSpec = '%*59*s%10{dd/MM/yyyy}D%6{HH:mm}D%*10*s%*14s%10s%*8*s%*14s%10s%[^\n\r]';
textscan(fileID, '%[^\n\r]', startRow-1, 'ReturnOnError', false);
dataArray = textscan(fileID, formatSpec, 'Delimiter', '', 'WhiteSpace', '', 'ReturnOnError', false);
Error message:
Error using textscan
Unable to read the DATETIME data with the format 'dd/MM/yyyy'. If the data is not a time, use %q to get
string data.
Any idea how I can automate this process?
Thanks in advance!
EDIT: I have added two .txt files as an example of what kind of data I am dealing with.

  2 Comments

Stephen Cobeldick 20 May 2015
Your description is great. All that is missing are a few sample text files, so that we can test our code on them and see if it works. You can upload a few test files using the paperclip button; note that you will need to push both the Choose file and Attach file buttons too.
It is much easier for us and also for you if we have real data to work with!
BSantos 20 May 2015
Stephen,
I thought about adding my files, but I'm afraid I can't due to company restrictions. I will try to "edit" my txt files and leave out some information so I don't get into trouble.
Thanks!


Answers (1)

Walter Roberson 20 May 2015
ToSkip = 59;
formatSpec = ['%*', sprintf('%d', ToSkip), '*s%10{dd/MM/yyyy}D%6{HH:mm}D%*10*s%*14s%10s%*8*s%*14s%10s%[^\n\r]'];

  5 Comments

BSantos 20 May 2015
My data should start with a "datetime" kind of field, meaning a date in the format dd/MM/yyyy. But before the Date/Time column there are other columns with numbers as well.
Any idea how to figure out the number of characters to discard before scanning starts?
thanks!
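One way to estimate the skip count is to look for the first dd/MM/yyyy-shaped token in a data line and discard everything before it. This is only a sketch: the sample line is made up for illustration, and it assumes that no earlier column happens to contain something that looks like a date.

```matlab
% Made-up sample line; the real files will look different.
sampleLine = 'ID0001  12.5  ABC  20/05/2015 14:30  more text';

% Start index of the first token shaped like dd/MM/yyyy.
dateStart = regexp(sampleLine, '\d{2}/\d{2}/\d{4}', 'once');

% Everything before that index is what the format should discard.
ToSkip = dateStart - 1;

% Same format as in the question, with the skip width filled in.
formatSpec = [sprintf('%%*%d*s', ToSkip), ...
    '%10{dd/MM/yyyy}D%6{HH:mm}D%*10*s%*14s%10s%*8*s%*14s%10s%[^\n\r]'];
```

With a real file, the line could be read with fgetl after the header rows have been skipped, followed by frewind(fileID) so that textscan starts again from the top of the file.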
Walter Roberson 20 May 2015
The two sample files you provided can be handled by using
repmat('%*s',1,5)
as the data to skip.
By the way, are the columns possibly tab separated?
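To make the repmat suggestion concrete, here is a small sketch of how the skip part could be combined with the rest of a format. The sample line and the trailing '%s%[^\n\r]' are assumptions chosen for illustration, not taken from the attached files.

```matlab
% Five skipped whitespace-delimited fields, as suggested above.
skipPart = repmat('%*s', 1, 5);   % gives '%*s%*s%*s%*s%*s'

% Made-up sample line: five leading fields, then the date.
sampleLine = 'a1 b2 c3 d4 e5 20/05/2015 rest of line';

% Read the sixth field as text and keep the remainder of the line.
out = textscan(sampleLine, [skipPart, '%s%[^\n\r]']);
dateText = out{1}{1};
```

Note that this sketch relies on textscan's default whitespace handling, whereas the call in the question sets both 'Delimiter' and 'WhiteSpace' to empty, so '%*s' will not split fields the same way there.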
BSantos 20 May 2015
Walter,
Thanks, I will try this out in my script. Unfortunately no; the software generating these .txt files is a bit "user unfriendly"... The csv files are even worse to handle than the text files, so I chose to save my results in this kind of text file.
If this works, I will post it here.
EDIT:
Well I get the same error as posted on my question. Any other suggestions?
Thanks!!
