Trying to extract different numbers out of structured text

Or Shem Tov on 14 Mar 2020
Commented: Or Shem Tov on 15 Mar 2020
Hi guys,
I am trying to extract numbers from a set of strings. The numbers differ from line to line, but the lines follow a small set of recurring patterns, like this:
phrases = ["Analyst Actions: Stifel Nicolaus Cuts Apache Price Target to $37 From $40, Maintains Buy Rating"; % to $NEWPT From $OLDPT
"Analyst Actions: Citigroup Initiates Coverage on Expedia With Buy Rating, $130 Price Target"; % $NEWPT Price Target
"Johnson & Johnson's PT cut by Credit Suisse Group AG to $159.00. outperform rating. (NYSE:JNJ)"; % to $NEWPT
"Kroger's equal weight rating reiterated at Stephens. $35.00 PT. (NYSE:KR)"; % $NEWPT PT
"Analyst Actions: Citigroup Initiates Coverage on Booking Holdings With Buy Rating" % this row has no value and should be "None" "None"
% more similar lines in the same structure
]
% Extract NewPT from each line
% Extract OldPT from each line
% Write "None" where NewPT or OldPT values are null
I am trying to create two columns, NewPT and OldPT, fill them with the values described in the comments above, and assign "None" wherever a value doesn't exist.
I'd be thankful to anybody who can help me with this.
Thank you!
2 Comments
Rik on 15 Mar 2020
There are only 3 cases, so it shouldn't be too difficult to write a parser. Did you try locating the values by searching for the dollar signs?
Or Shem Tov on 15 Mar 2020
I'm not sure how to do that. Note that there is one case, "From $OldPT", where the value next to the $ sign has a different meaning, and I'm not sure how to tackle that one.


Accepted Answer

Akira Agata on 15 Mar 2020
I believe a regular expression can extract the target part of each string. The following is an example.
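% Extract the new price target: digits preceded by "to $",
% or digits followed by " Price Target" or " PT"
newpt = regexp(phrases,'((?<=to \$)\d+\.?\d*|\d+\.?\d*(?= (Price Target|PT)))','match','once');
% Extract the old price target: digits preceded by "From $"
oldpt = regexp(phrases,'(?<=From \$)\d+\.?\d*','match','once');
% Convert to numeric arrays (rows with no match become NaN)
newpt = str2double(newpt);
oldpt = str2double(oldpt);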
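
The conversion above leaves NaN in rows without a price target, whereas the question asked for "None". One possible way to bridge that gap is sketched below. It assumes newpt and oldpt are numeric column vectors with NaN marking missing values; the sample values and the table T are illustrative assumptions, not part of the original answer.

```matlab
% Illustrative inputs, assumed to correspond to the five sample phrases
% (NaN marks a row where no price target was found).
newpt = [37; 130; 159; 35; NaN];
oldpt = [40; NaN; NaN; NaN; NaN];

% Start with "None" in every row, then overwrite the rows that have a value.
NewPT = repmat("None", size(newpt));
hasNew = ~isnan(newpt);
NewPT(hasNew) = string(newpt(hasNew));

OldPT = repmat("None", size(oldpt));
hasOld = ~isnan(oldpt);
OldPT(hasOld) = string(oldpt(hasOld));

% Collect both columns in a table; the variable names become the column names.
T = table(NewPT, OldPT);
disp(T)
```

Note that this turns both columns into text. If the values are needed for arithmetic later, it is probably simpler to keep the numeric vectors with NaN and convert to "None" only for display or export.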

Release: R2019b
