Handling header name variation

Stephen Devlin, 7 June 2018
Edited: Stephen23, 7 June 2018
Hi,
When I import a text file using readtable I get various column headers. One is called "myheader1", but it could also be "my header_1". How can I access that column once it is in the workspace, regardless of whether that particular variable name contains an underscore, a space, etc.?
Best regards, Steve
2 Comments
Geoff Hayes, 7 June 2018
Stephen, is this column always in the same position? Are you saying that a single file may have column names such as "myheader1" or "my header_1"? Please attach a small example.
Stephen23, 7 June 2018
Stephen Devlin's "Answer" moved here:
Hi Geoff, yes, it is always in the same position: the first cell of an Excel spreadsheet (A1). It would be great, though, if the code could also handle other headers that have a similar issue but are likewise always in the same position.


Accepted Answer

Stephen23, 7 June 2018
Edited: Stephen23, 7 June 2018
One: Position: Can you rely on the column position being fixed? If the position is always the same, then you could simply refer to the variables by their position instead of by their names. Easy.
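As a minimal sketch of the position-based approach (the table below is a hypothetical stand-in for whatever readtable returns, and the names 'my_header_1' and 'other' are made up for illustration), you could either index the column directly or overwrite its name by position:

```matlab
% Hypothetical stand-in for the imported table; the first header varies.
T = table([1;2;3], [4;5;6], 'VariableNames', {'my_header_1', 'other'});

% Option A: index by position and ignore the name entirely.
firstCol = T{:, 1};

% Option B: overwrite the first variable name with the preferred one,
% so that later code can always refer to T.myheader1.
T.Properties.VariableNames{1} = 'myheader1';
sameCol = T.myheader1;
```

Option B is probably the more readable choice if the rest of the code refers to the column by name.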
Two: Remove Chars: This depends on how much variation you accept in the names, and what patterns the differences follow. For example, if you know that only spaces and underscores are involved, then you could simply use strrep or regexprep to remove the superfluous characters, and then update the variable names.
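A rough sketch of the character-removal idea using regexprep, assuming a hypothetical table with made-up names. The lower() call is an extra step that I have assumed here, because readtable may capitalise the letter that follows a removed space when it converts headers into valid names:

```matlab
% Hypothetical stand-in for a table returned by readtable.
T = table([1;2;3], [4;5;6], 'VariableNames', {'myHeader_1', 'Other_Col'});

% Strip underscores and whitespace from every name, then lowercase.
names = T.Properties.VariableNames;
names = lower(regexprep(names, '[\s_]+', ''));

% Assign the cleaned names back (they must remain unique and valid).
T.Properties.VariableNames = names;

% The column is now reachable under one predictable name.
col = T.myheader1;
```

Note that this only works as long as the cleaned names stay unique, otherwise the assignment to VariableNames will throw an error.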
Three: Match Similar Names: If the possible names do not fit a particular pattern (i.e. not just an extra space or underscore, but possibly missing or different characters), then one straightforward solution would be to compare the variable names against your preferred names, e.g. by calculating the Levenshtein edit distance using my implementation of the Wagner-Fischer algorithm (a standard dynamic-programming method), and then replace the "closest" matches with your preferred variable names. Thus in your code you would always use the "correct" variable name. All you would have to do is decide how large a difference you will tolerate, e.g. for your example the difference is:
>> wfEdits('myheader1','my header_1')
ans = 2
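For readers without wfEdits, here is a minimal sketch of the same matching idea. It is not the wfEdits code itself, just a plain inline Wagner-Fischer table. The header list and the tolerance maxEdits are assumptions chosen for illustration:

```matlab
preferred = 'myheader1';                        % name the rest of the code uses
names     = {'my header_1', 'other', 'value2'}; % hypothetical header texts
maxEdits  = 2;                                  % assumed tolerance

dist = zeros(1, numel(names));
for k = 1:numel(names)
    a = preferred;
    b = names{k};
    % Wagner-Fischer table: D(i+1,j+1) = edits to turn a(1:i) into b(1:j).
    D = zeros(numel(a)+1, numel(b)+1);
    D(:,1) = (0:numel(a)).';
    D(1,:) = 0:numel(b);
    for i = 1:numel(a)
        for j = 1:numel(b)
            cost = double(a(i) ~= b(j));
            D(i+1,j+1) = min([D(i,j+1) + 1, ...  % deletion
                              D(i+1,j) + 1, ...  % insertion
                              D(i,j) + cost]);   % substitution or match
        end
    end
    dist(k) = D(end,end);
end

% Replace the closest header with the preferred name, if it is close enough.
[best, idx] = min(dist);
if best <= maxEdits
    names{idx} = preferred;
end
```

With these inputs the first header is two insertions away from the preferred name, which agrees with the wfEdits output above, so it is the one that gets renamed.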
1 Comment
Stephen Devlin, 7 June 2018
Brilliant, thanks Stephen
