필터 지우기
필터 지우기

String Match for Plotting in Excel

조회 수: 1 (최근 30일)
Amanda
Amanda 2013년 4월 30일
I have a simple CSV Data file. See below. I need to extract data for each individual state with its temperature. For example, I need a script that can find the string New York and get the associated temperature data for New York, and put it in an array for plotting. So the output would be:
New_York = [83,55], Indiana = [70,60]
State Temperature (headerline)
New York 83
Indiana 70
Texas 72
California 80
Indiana 60
Texas 61
California 92
New York 55
Thanks, Amanda

채택된 답변

Cedric
Cedric 2013년 4월 30일
편집: Cedric 2013년 4월 30일
If states had no space in their names, or if you had commas as delimiters in the CSV file, you could go for a variant of
[state, temp] = textread('myFile.csv', '%s %d', 'delimiter', ',', ...
'headerlines', 1) ;
Now as it seems that there are spaces in names and no comma as delimiter, you can read line by line and extract states and temperatures with more specific functions (TEXTSCAN, FSCANF, FGETL+SSCANF, etc) and based on position if needed (e.g. start reading temperatures from char. 15 on). Then you can use STRCMPI to find indices of relevant states, and get corresponding temperatures based on these indices. But you could also go for a solution based on regular expressions (less common approach for this kind of structured data), that I illustrate below:
>> buffer = fileread('myFile.csv') ;
>> state = 'New York' ;
>> temp = str2double(regexpi(buffer, sprintf('(?<=%s\\s*)\\d*', state), ...
'match'))
temp =
83 55
>> state = 'California' ;
>> temp = str2double(regexpi(buffer, sprintf('(?<=%s\\s*)\\d*', state), ...
'match'))
temp =
80 92
Note that I wrote this in a concise manner, but we do the following in fact:
>> pattern = sprintf('(?<=%s\\s*)\\d*', state) ;
>> match = regexpi(buffer, pattern, 'match') ;
>> temp = str2double(match) ;
If you look at the pattern for New Work:
>> pattern
pattern =
(?<=New York\s*)\d*
It tells regexp to match
  • As many numeric characters as possible: \d*
  • Preceded by (positive look behind (?<=)) the literal New York followed by as many white spaces as possible: New York\s*
  댓글 수: 3
Cedric
Cedric 2013년 4월 30일
You're welcome! I updated my answer so you have a bit more explanations about REGEXP.
Amanda
Amanda 2013년 4월 30일
Thanks for the more detail explanation. Some reason, data extraction from a file has always been a challenge.

댓글을 달려면 로그인하십시오.

추가 답변 (0개)

카테고리

Help CenterFile Exchange에서 Spreadsheets에 대해 자세히 알아보기

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by