How to output HTML text using fscanf or textscan
Greetings,
I'm trying to extract specific data from HTML text using MATLAB.
Here is a sample of the HTML text (lines 12-18):
<FONT SIZE=+1 COLOR="#800000">TONIGHT</FONT>
EAST WINDS 10 TO 15 KNOTS. SEAS 3 TO 4 FEET. ISOLATED
SHOWERS. 
<FONT SIZE=+1 COLOR="#800000">SATURDAY</FONT>
EAST WINDS 12 TO 16 KNOTS. SEAS 3 TO 4 FEET. ISOLATED
SHOWERS.
I want to keep the headings 'TONIGHT' and 'SATURDAY' (lines 12 and 18) together with the data 'EAST WINDS 10 TO 15 KNOTS. SEAS 3 TO 4 FEET. ISOLATED SHOWERS.' and 'EAST WINDS 12 TO 16 KNOTS. SEAS 3 TO 4 FEET. ISOLATED SHOWERS.', leaving me with this output:
TONIGHT
EAST WINDS 10 TO 15 KNOTS. SEAS 3 TO 4 FEET. ISOLATED
SHOWERS.
SATURDAY
EAST WINDS 12 TO 16 KNOTS. SEAS 3 TO 4 FEET. ISOLATED
SHOWERS.
I want to use textscan or fscanf to have MATLAB scan the text file and give me just the plain text, without the HTML tags.
Thank you for your time.
Accepted Answer
Walter Roberson
June 30, 2012
Well, if it is important to use textscan() or fscanf(), then:
DataCell = textscan( fid, '%s', 'Delimiter', '');  %read the entire file as strings, one per line.
Output = regexprep( DataCell, '<[^>]+>', '' );  %remove the HTML
This will be a cell array of strings.
Comments: 3
per isakson
June 30, 2012
Change
regexprep( DataCell ...)
to
regexprep( DataCell{1}  ... )
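Putting the answer and this correction together, a minimal end-to-end sketch might look like the following. It assumes the HTML text has been saved to a local file; here a temporary file is written from the sample in the question so the snippet runs on its own. The names `sample`, `filename`, and `Lines` are illustrative, and `'Delimiter', '\n'` is used in place of the answer's empty delimiter as a common way to read whole lines.

```matlab
% Write the sample from the question to a temporary file (stand-in for the real file).
sample = {
    '<FONT SIZE=+1 COLOR="#800000">TONIGHT</FONT>'
    'EAST WINDS 10 TO 15 KNOTS. SEAS 3 TO 4 FEET. ISOLATED'
    'SHOWERS.'
    '<FONT SIZE=+1 COLOR="#800000">SATURDAY</FONT>'
    'EAST WINDS 12 TO 16 KNOTS. SEAS 3 TO 4 FEET. ISOLATED'
    'SHOWERS.'
    };
filename = [tempname '.txt'];
fid = fopen(filename, 'w');
fprintf(fid, '%s\n', sample{:});
fclose(fid);

% Open the file for reading and check that fopen succeeded.
fid = fopen(filename, 'r');
if fid == -1
    error('Could not open %s', filename);
end

% Read the file as strings, one per line.
DataCell = textscan(fid, '%s', 'Delimiter', '\n');
fclose(fid);

% textscan returns a 1x1 cell that wraps the cell array of lines, hence {1}.
Lines = DataCell{1};

% Remove anything that looks like an HTML tag: '<', one or more non-'>' characters, '>'.
Output = regexprep(Lines, '<[^>]+>', '');

% Print the plain text, one line per row.
fprintf('%s\n', Output{:});

% Remove the temporary file.
delete(filename);
```

The pattern only strips tags that open and close on the same line, which is the case in the sample; a tag split across lines would need the file read as a single string first.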
Walter Roberson
July 1, 2012
You should try hard to avoid eval! http://matlab.wikia.com/wiki/FAQ#How_can_I_create_variables_A1.2C_A2.2C....2CA10_in_a_loop.3F
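The linked FAQ entry is about creating numbered variables such as A1, A2, ... in a loop. As a rough sketch of one common alternative, and assuming the goal is to keep one forecast per heading, the cleaned lines could be stored in a struct with dynamic field names instead of in variables created with eval. The names `Forecast`, `heading`, and `body`, and the fixed three-line layout, are assumptions made for illustration.

```matlab
% Cleaned lines, as the regexprep step in this thread would produce for the sample.
Output = {
    'TONIGHT'
    'EAST WINDS 10 TO 15 KNOTS. SEAS 3 TO 4 FEET. ISOLATED'
    'SHOWERS.'
    'SATURDAY'
    'EAST WINDS 12 TO 16 KNOTS. SEAS 3 TO 4 FEET. ISOLATED'
    'SHOWERS.'
    };

% Assumption: each block is one heading line followed by two lines of text.
Forecast = struct();
for k = 1:3:numel(Output)
    heading = Output{k};                        % 'TONIGHT', then 'SATURDAY'
    body = [Output{k+1} ' ' Output{k+2}];       % join the two text lines with a space
    Forecast.(heading) = body;                  % dynamic field name, no eval needed
end

disp(Forecast.TONIGHT)
```

Dynamic field names must be valid MATLAB identifiers, so a heading containing a space would need to be converted first, or the pairs could be kept in a two-column cell array instead.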