Classify Text Data Using Deep Learning

This example shows how to classify text descriptions of weather reports using a deep learning long short-term memory (LSTM) network.

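Text data is naturally sequential. A piece of text is a sequence of words, and there can be dependencies between those words. To learn long-term dependencies and use them to classify sequence data, use an LSTM neural network. An LSTM network is a type of recurrent neural network (RNN) that can learn long-term dependencies between time steps of sequence data.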

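To input text to an LSTM network, first convert the text data into numeric sequences. You can do this with a word encoding, which maps documents to sequences of numeric indices. For better results, also include a word embedding layer in the network. A word embedding maps the words in a vocabulary to numeric vectors rather than scalar indices. These embeddings capture semantic information about the words, so that words with similar meanings have similar vectors. They also model relationships between words through vector arithmetic. For example, the relationship "king is to queen as man is to woman" is described by the equation king - man + woman = queen.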
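The vector-arithmetic idea can be illustrated with a toy calculation. The sketch below is not part of the example and does not use a trained embedding: the two-dimensional vectors are made up for illustration (the first coordinate loosely stands for "royalty", the second for "gender"), and the variable names are hypothetical. It only shows how the analogy "king is to queen as man is to woman" turns into the arithmetic king - man + woman followed by a nearest-vector lookup.

```matlab
% Hypothetical 2-D embedding, one row per word (made-up values, not learned).
vocab = ["king" "queen" "man" "woman"];
E = [1  1;    % king
     1 -1;    % queen
     0  1;    % man
     0 -1];   % woman

% Analogy arithmetic: king - man + woman.
target = E(1,:) - E(3,:) + E(4,:);

% Find the vocabulary word whose vector is closest to the result.
distances = vecnorm(E - target, 2, 2);   % Euclidean distance per row
[~, nearest] = min(distances);
nearestWord = vocab(nearest)
```

With these made-up vectors the closest word is "queen". In a real embedding the vectors have many more dimensions and the match is only approximate.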

This example trains and uses an LSTM network in four steps:

  • Import and preprocess the data.

  • Convert the words to numeric sequences using a word encoding.

  • Create and train an LSTM network with a word embedding layer.

  • Classify new text data using the trained LSTM network.
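To make the second step concrete, here is a minimal sketch of what a word encoding does, written with base MATLAB only. It is an illustration under assumptions, not the encoding used by the example: the token list is made up, and the index of each word is simply its order of first appearance.

```matlab
% Made-up tokens from a short, already preprocessed report.
words = ["large" "tree" "down" "near" "large" "road"];

% Build a vocabulary in order of first appearance and look up each word's index.
[vocabulary, ~, idx] = unique(words, "stable");

% The document as a numeric sequence of vocabulary indices.
sequence = idx'
```

Each distinct word gets one integer, and repeated words (here "large") map to the same index. These scalar indices are what a word embedding layer later turns into vectors.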

Import Data

Import the weather reports data. This data contains textual descriptions of weather events. To import the text data as strings, specify the text type as 'string'.

filename = "weatherReports.csv";
data = readtable(filename,'TextType','string');
head(data)
ans=8×16 table
            Time             event_id          state              event_type         damage_property    damage_crops    begin_lat    begin_lon    end_lat    end_lon                                                                                             event_narrative                                                                                             storm_duration    begin_day    end_day    year       end_timestamp    
    ____________________    __________    ________________    ___________________    _______________    ____________    _________    _________    _______    _______    _________________________________________________________________________________________________________________________________________________________________________________________________    ______________    _________    _______    ____    ____________________

    22-Jul-2016 16:10:00    6.4433e+05    "MISSISSIPPI"       "Thunderstorm Wind"       ""                "0.00K"         34.14        -88.63     34.122     -88.626    "Large tree down between Plantersville and Nettleton."                                                                                                                                                  00:05:00          22          22       2016    22-Jul-0016 16:15:00
    15-Jul-2016 17:15:00    6.5182e+05    "SOUTH CAROLINA"    "Heavy Rain"              "2.00K"           "0.00K"         34.94        -81.03      34.94      -81.03    "One to two feet of deep standing water developed on a street on the Winthrop University campus after more than an inch of rain fell in less than an hour. One vehicle was stalled in the water."       00:00:00          15          15       2016    15-Jul-0016 17:15:00
    15-Jul-2016 17:25:00    6.5183e+05    "SOUTH CAROLINA"    "Thunderstorm Wind"       "0.00K"           "0.00K"         35.01        -80.93      35.01      -80.93    "NWS Columbia relayed a report of trees blown down along Tom Hall St."                                                                                                                                  00:00:00          15          15       2016    15-Jul-0016 17:25:00
    16-Jul-2016 12:46:00    6.5183e+05    "NORTH CAROLINA"    "Thunderstorm Wind"       "0.00K"           "0.00K"         35.64        -82.14      35.64      -82.14    "Media reported two trees blown down along I-40 in the Old Fort area."                                                                                                                                  00:00:00          16          16       2016    16-Jul-0016 12:46:00
    15-Jul-2016 14:28:00    6.4332e+05    "MISSOURI"          "Hail"                    ""                ""              36.45        -89.97      36.45      -89.97    ""                                                                                                                                                                                                      00:07:00          15          15       2016    15-Jul-0016 14:35:00
    15-Jul-2016 16:31:00    6.4332e+05    "ARKANSAS"          "Thunderstorm Wind"       ""                "0.00K"         35.85         -90.1     35.838     -90.087    "A few tree limbs greater than 6 inches down on HWY 18 in Roseland."                                                                                                                                    00:09:00          15          15       2016    15-Jul-0016 16:40:00
    15-Jul-2016 16:03:00    6.4343e+05    "TENNESSEE"         "Thunderstorm Wind"       "20.00K"          "0.00K"        35.056       -89.937      35.05     -89.904    "Awning blown off a building on Lamar Avenue. Multiple trees down near the intersection of Winchester and Perkins."                                                                                     00:07:00          15          15       2016    15-Jul-0016 16:10:00
    15-Jul-2016 17:27:00    6.4344e+05    "TENNESSEE"         "Hail"                    ""                ""             35.385        -89.78     35.385      -89.78    "Quarter size hail near Rosemark."                                                                                                                                                                      00:05:00          15          15       2016    15-Jul-0016 17:32:00

Remove the rows of the table that have empty reports.

idxEmpty = strlength(data.event_narrative) == 0;
data(idxEmpty,:) = [];
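The same filtering pattern can be tried on a small table without the data file. This is a sketch with made-up rows and a hypothetical variable name (toy); only the column names mirror the example.

```matlab
% Toy table with one empty narrative (made-up rows).
event_narrative = ["Large tree down."; ""; "Quarter size hail."];
event_type = ["Thunderstorm Wind"; "Hail"; "Hail"];
toy = table(event_narrative, event_type);

% Logical index of rows whose narrative has zero characters.
idxEmpty = strlength(toy.event_narrative) == 0;

% Deleting by logical index removes the whole row, label included.
toy(idxEmpty,:) = [];
numRemaining = height(toy)
```

Because the whole row is deleted, the label of an empty report is dropped together with its text, so text and labels stay aligned.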

The goal of this example is to classify events by the label in the event_type column. To divide the data into classes, convert these labels to categorical.

data.event_type = categorical(data.event_type);

View the distribution of the classes in the data using a histogram. To make the labels easier to read, increase the width of the figure.

f = figure;
f.Position(3) = 1.5*f.Position(3);

h = histogram(data.event_type);
xlabel("Class")
ylabel("Frequency")
title("Class Distribution")

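The classes of the data are imbalanced: many classes contain only a few observations. When the classes are imbalanced in this way, the network might converge to a less accurate model. To prevent this problem, remove every class that appears fewer than ten times.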

Get the frequency counts of the classes and the class names from the histogram.

classCounts = h.BinCounts;
classNames = h.Categories;

Find the classes that contain fewer than ten observations.

idxLowCounts = classCounts < 10;
infrequentClasses = classNames(idxLowCounts)
infrequentClasses = 1×8 cell array
    {'Freezing Fog'}    {'Hurricane'}    {'Lakeshore Flood'}    {'Marine Dense Fog'}    {'Marine Strong Wind'}    {'Marine Tropical Depression'}    {'Seiche'}    {'Sneakerwave'}

Remove these infrequent classes from the data. Use removecats to remove the unused categories from the categorical data.

idxInfrequent = ismember(data.event_type,infrequentClasses);
data(idxInfrequent,:) = [];
data.event_type = removecats(data.event_type);
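The example reads the class counts from the histogram object. As an alternative sketch, the same counts can be computed directly from the categorical array with countcats and categories, without creating a figure. The labels and the threshold below are made up so that the snippet runs on its own; the variable names are hypothetical.

```matlab
% Made-up labels: "Seiche" appears only once.
labels = categorical(["Hail" "Hail" "Hail" "Flood" "Flood" "Seiche"]);

% Count observations per category without plotting.
counts = countcats(labels);    % one count per category
names = categories(labels);    % category names, in the same order as counts

% Classes below a (toy) threshold of 2 observations.
minCount = 2;
rareClasses = names(counts < minCount);

% Drop the rare observations, then drop the now-unused categories.
keep = ~ismember(labels, rareClasses);
labels = removecats(labels(keep));
remainingClasses = categories(labels)
```

Without the removecats call, "Seiche" would still be listed as a category with zero observations.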

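The data is now sorted into classes of reasonable size. The next step is to partition it into sets for training, validation, and testing. Partition the data into a training partition and a held-out partition for validation and testing. Specify the holdout percentage as 30%.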

cvp = cvpartition(data.event_type,'Holdout',0.3);
dataTrain = data(training(cvp),:);
dataHeldOut = data(test(cvp),:);

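Partition the held-out set again to get a validation set. Specify the holdout percentage as 50%. This results in a partitioning of 70% training observations, 15% validation observations, and 15% test observations.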

cvp = cvpartition(dataHeldOut.event_type,'Holdout',0.5);
dataValidation = dataHeldOut(training(cvp),:);
dataTest = dataHeldOut(test(cvp),:);
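The 70/15/15 split follows from applying the two holdout fractions one after the other. A short worked calculation, using hypothetical variable names:

```matlab
% Fractions used by the two cvpartition calls above.
firstHoldout = 0.3;    % share of all data held out for validation and testing
secondHoldout = 0.5;   % share of the held-out data used for testing

fracTrain = 1 - firstHoldout;                        % 0.70
fracValidation = firstHoldout*(1 - secondHoldout);   % 0.3*0.5 = 0.15
fracTest = firstHoldout*secondHoldout;               % 0.3*0.5 = 0.15

fractions = [fracTrain fracValidation fracTest]
```

These are nominal fractions. The actual set sizes can differ by a few observations, because the number of observations in each class is generally not an exact multiple of them.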

Extract the text data and labels from the partitioned tables.

textDataTrain = dataTrain.event_narrative;
textDataValidation = dataValidation.event_narrative;
textDataTest = dataTest.event_narrative;
YTrain = dataTrain.event_type;
YValidation = dataValidation.event_type;
YTest = dataTest.event_type;

To check that you have imported the data correctly, visualize the training text data using a word cloud.

figure
wordcloud(textDataTrain);
title("Training Data")

Preprocess Text Data

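Create a function that tokenizes and preprocesses the text data. The function preprocessText, listed at the end of the example, performs these steps:

  1. Tokenize the text using tokenizedDocument.

  2. Convert the text to lowercase using lower.

  3. Erase the punctuation using erasePunctuation.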
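The three steps above can be sketched as a single function. The listing at the end of the example is the authoritative version and is not reproduced in this section, so the function below uses a hypothetical name (preprocessTextSketch) and is only an assumption about how the steps fit together. It requires the Text Analytics Toolbox functions named in the steps.

```matlab
% Try the sketch on one narrative taken from the table preview above.
documents = preprocessTextSketch("Quarter size hail near Rosemark.");
tokens = string(documents)

function documents = preprocessTextSketch(textData)
% Hypothetical stand-in for preprocessText, following the three listed steps.

% 1. Tokenize the text.
documents = tokenizedDocument(textData);

% 2. Convert the tokens to lowercase.
documents = lower(documents);

% 3. Erase punctuation.
documents = erasePunctuation(documents);
end
```

In a script file, a local function like this must appear after the code that calls it, which is why the call comes first here.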

Preprocess the training data and the validation data using the preprocessText function.

documentsTrain = preprocessText(textDataTrain);
documentsValidation = preprocessText(textDataValidation)
documentsValidation = 
  4218×1 tokenizedDocument:

      5 tokens: quarter size hail near rosemark
      7 tokens: large tree down on powerlines in caruthersville
      6 tokens: three trees down on hwy 224
      7 tokens: heat indices of 110 degrees or higher
      9 tokens: numerous trees were reported down in the greenback area
      9 tokens: several large tree branches were blown down in osage
     12 tokens: a tree fell onto a car four miles west southwest of knoxville
      7 tokens: two trees were reported in tellico plains
     21 tokens: wind gusts of 40 to 45 mph were common across buffalo county during the morning and early afternoon of april 2nd
     77 tokens: strong southerly gradient winds affected the nashville metro during the afternoon hours on april 6 a peak wind gust of 49 mph 43 knots was measured at the nashville international airport asos at 145 pm cdt a few trees were blown down across davidson county including a tree blown down in front of the blair school of music on the vanderbilt university campus and a large tree blown down in the yard of a home in hermitage
     36 tokens: wind gusts of 40 to 50 mph were common across crawford county during the morning and early afternoon of april 2nd the highest recorded wind gust was 49 mph by a mesonet station near mt sterling
     31 tokens: snowfall amounts between 19 and 32 inches were reported across prince william county snowfall totaled up to 300 inches near bull run and 185 inches of snow was reported near dumfries
     16 tokens: wind driven hail resulted in numerous holes in siding on the south side of a house
     15 tokens: snowfall amounts were estimated to be between 24 and 36 inches based on observations nearby
     33 tokens: snowfall amounts were reported to be between 18 and 30 inches across southern fauquier county a snowfall report of 300 inches was received in opal and 180 inches of snow fell near bealeton
     10 tokens: three to eight inches of snow fell across suffolk county
      9 tokens: approximately nine inches of snow fell in bristol county
     14 tokens: between 2 and 5 inches of snow were reported over the 12 hour period
     13 tokens: trained spotters reported between 02 and 04 inches of ice around the county
     23 tokens: the white river at newport remained above flood stage from december and fell below flood stage during the evening hours on the 9th
    118 tokens: strong high pressure developed across south central arizona including the greater phoenix area during the day on july 22nd leading to excessive heat over the lower deserts the official high temperature at phoenix reached to 112 degrees the heat proved to be deadly according to a report from local broadcast media a 12 year old boy was rushed to the hospital after losing consciousness from heat stroke or heat exhaustion during an afternoon hike the boy had been hiking at the apache wash trailhead located about halfway between deer valley airport and cave creek the boy later died an excessive heat warning had been in effect for the area since noon continuing on and into the next day
     13 tokens: the blackhall mountain snotel site elevation 9820 ft estimated six inches of snow
     13 tokens: the battle mountain snotel site elevation 7440 ft estimated 17 inches of snow
      7 tokens: trees were blown down on persimmon road
     16 tokens: two to five inches of rain fell across central and eastern portions of scotts bluff county
     13 tokens: one to two feet of water covered highway 2692 from mitchell to scottsbluff
     13 tokens: the intersection of highway 97 and highway vv was closed due to flooding
      7 tokens: quarter size hail was reported at federal
    127 tokens: the national weather service baltimore washington weather forecast office has confirmed a waterspout and tornado struck the potomac river moving into st mary s county just south of beauvue on tuesday february 24 2016 a national weather service ground survey along with radar analysis concluded the tappahannock virginia tornado which created a 30 mile path of damage across the middle peninsula and northern neck of virginia crossed the potomac and traveled a mile into st mary s county maryland before dissipating most of the 65 mile path in maryland was over the potomac river the national weather service classified the storm once onshore as an ef0 peak winds were estimated at 65 mph the path width was approximately 75 yards no damage was reported over the water
     14 tokens: the east santa barbara channel buoy reported a thunderstorm wind gust of 34 knots
     19 tokens: there was a report via social media of 73 inches of lake enhanced snow in 24 hours at ironwood
     27 tokens: lake enhanced snow totals over an 18hour period included six inches near harvey and five inches just south of marquette in the higher terrain of sands township
     19 tokens: twoday storm total lake effect snow accumulation included 14 inches at lac la belle and 11 inches at phoenix
     10 tokens: trees were blown down around highway 37 near fort gaines
     13 tokens: the webber springs snotel site elevation 9250 ft estimated 15 inches of snow
     10 tokens: just over two inches of rain fell in 24 hours
      8 tokens: quarter size hail was reported at fort laramie
     12 tokens: between 2 and 4 inches of snow was reported around the county
     13 tokens: estimated wind gusts of 60 mph were reported 17 miles westnorthwest of hemingford
      9 tokens: nickel to quarter size hail was reported at potter
      9 tokens: just over two inches of rain fell with thunderstorms
     73 tokens: the poteau river near poteau rose above its flood stage of 24 feet at 100 pm cst on december 27th the river crested at 3144 feet at 615 am cst on the 28th resulting in major flooding extensive flooding of cropland occurred many county roads were inundated by flood water the river remained in flood through the end of december 2015 finally falling below flood stage at 1245 pm cst on january 3rd
     10 tokens: fdk reported reduced visibilities of one quarter mile or less
     10 tokens: hwy reported reduced visibilities of one quarter mile or less
     11 tokens: heavy rain fell from thunderstorms on the morning hours of 731
     20 tokens: total rainfall from thunderstorms was six and three quarter inches with over four inches falling in a couple of hours
     14 tokens: between 1 and 3 inches of snow were reported over the 12 hour period
      9 tokens: between 1 and 4 inches of snow was reported
     10 tokens: just under three inches of heavy rain fell from thunderstorms
     21 tokens: a large tree was taken down by thunderstorm wind gusts and knocked off several wires as well onto old topton rd
     28 tokens: heavy rainfall over the solimar burn scar resulted in a significant mud and debris flow on highway 101 multiple lanes were closed due to mud and debris flows
     40 tokens: northerly winds gusting to near 50 mph combined with existing snow cover to cause areas of blowing snow with visibilities lowering to below a half mile at times new snow of less than a half inch fell during this time
     24 tokens: a light glaze of freezing rain caused hazardous travel conditions over portions of fulton county several roads were closed due to the icy conditions
      9 tokens: a tree was blown down just south of bonifay
     49 tokens: temperatures reaching daytime highs of 95 to 100 after a period of cool weather were accompanied by humid conditions with the heat index rising to a little above 100 degrees an unknown number of people suffered from heat stress heat exhaustion or dehydration as reported by hospital emergency rooms
      7 tokens: trees were blown down on beechwood drive
     10 tokens: nyg reported reduced visibilities of one quarter mile or less
     10 tokens: mrb reported reduced visibilities of one quarter mile or less
      9 tokens: between 1 and 4 inches of snow was reported
     12 tokens: ping pong ball size hail was reported two miles south of cheyenne
      5 tokens: heavy rain fell from thunderstorms
     13 tokens: trained spotters reported between 02 and 04 inches of ice around the county
     36 tokens: a tree was blown down at avery and spring street in st augustine the time of damage was based on radar the cost of damage was estimated for the event to be included in storm data
     41 tokens: snow amounts reported by spotters were 10 inches at victor and 9 inches in driggs snotel amounts were the following 25 inches at black bear 13 inches at island park 16 inches at phillips bench and 19 inches at white elephant
     11 tokens: harell road was closed at forbes street due to high water
      9 tokens: golf ball size hail was reported southwest of wheatland
      9 tokens: between 5 and 8 inches of snow was reported
      9 tokens: almost three inches of heavy rain fell with thunderstorms
     31 tokens: a lightning strike hit the pulaski county 911 center several of the computer systems equipment and radios inside the building were damaged or destroyed by the lightning strike time was estimated
     14 tokens: the east santa barbara channel buoy reported a thunderstorm wind gust of 34 knots
     13 tokens: a 24 hour storm total rainfall of 550 inches was reported near northview
     15 tokens: a tree was blown down onto a residence in the cottondale area damage was estimated
      8 tokens: trees were blown down in the youngstown area
     14 tokens: the roof was partially blown off of lighthouse church near sylvester damage was estimated
      9 tokens: a tree was blown down on spring creek road
     53 tokens: temperatures reaching daytime highs of 95 to 100 after a period of cool weather were accompanied by humid conditions with the heat index rising to a little above 100 degrees an unknown number of people suffered from heat stress heat exhaustion or dehydration as reported by hospital emergency rooms including one in mitchell
      5 tokens: heavy rain fell from thunderstorms
      5 tokens: heavy rain fell from thunderstorms
     13 tokens: the blackhall mountain snotel site elevation 9820 ft estimated 15 inches of snow
      9 tokens: there was a water rescue reported on cherokee avenue
     13 tokens: trained spotters reported between 02 and 04 inches of ice around the county
     14 tokens: the wind sensor at the torrington airport measured a peak gust of 63 mph
      9 tokens: between 5 and 8 inches of snow was reported
     58 tokens: snow began during the evening hours on the 22nd then continued heavy at times through the 23rd before ending early on the 24th snowfall totals included 277 inches in metuchen 240 inches in east brunswick 230 inches in perth amboy 190 inches in woodbridge 180 inches in milltown 170 inches in highland park and 160 inches in cheesequake
     10 tokens: quarter size hail was reported seven miles west of carpenter
     39 tokens: snowfall of 1 to 4 inches combined with southeast winds gusting around 30 mph at times to produce areas of blowing snow with visibilities lowering locally to below one mile an accumulation of 30 inches was reported at everly
     17 tokens: quarter size hail was reported 4 miles southeast of kentwood the report was relayed by broadcast media
      8 tokens: golfball size hail was reported in downtown franklinton
     17 tokens: a porch roof was blown off a home at louisiana highway 447 and courtney road in walker
     12 tokens: just under four inches of rain fell in 24 hours from thunderstorms
     13 tokens: the webber springs snotel site elevation 9250 ft estimated 17 inches of snow
      7 tokens: trees were blown down on thames street
      9 tokens: between 1 and 6 inches of snow was reported
     11 tokens: four and a half inches of heavy rain fell from thunderstorms
     17 tokens: one quarter to one half of an inch of freezing rain accrual was reported across the county
     20 tokens: a tree was blown down on highway 189 about 5 miles outside of elba power lines were also blown down
      8 tokens: a tree was blown down on highway 162
      8 tokens: trees were blown down on harvey mill road
     18 tokens: a tree was blown down onto a house near the 3200 block of crawfordville highway damage was estimated
     11 tokens: highway j was flooded and there was a high water rescue
      9 tokens: between one and two inches of snow was reported
      9 tokens: between one and two inches of snow was reported
     23 tokens: meteorologist from the 26th operational weather squadron at barksdale air force base reported halfdollar size hail on highway 171 northwest of grand cane
     40 tokens: westerly winds behind a cold front reach sustained speed of 40 to 45 mph for a few hours a gust to 56 mph was measured at storm lake airport the high winds caused spotty power line and traffic light damage
     12 tokens: a wind gust of 60 mph was measured at wunderground site kflpanam37
     12 tokens: snowfall amounts of up to 2 inches were observed across the county
     24 tokens: heavy rain and snowmelt combined to cause minor flooding on the kennebec river at skowhegan flood stage 35000 cfs which crested at 35168 cfs
     27 tokens: a tenth of an inch of freezing rain was reported in dillon a large tree limb was down on hwy 301 south near the church of god
     13 tokens: the wydot sensor at bordeaux measured a peak wind gust of 61 mph
     17 tokens: a spotter reported visibility of 300 yards at el toro rd and aliso creek in aliso viejo
     44 tokens: lake effect snow showers accumulated to between 2 and 6 inches during the evening hours of january 3rd through midmorning january 4th heaviest across northern and eastern portions of the county reduced visibilities and slick roadways led to a few accidents and school delays
     42 tokens: lake effect snow showers accumulated to between 2 and 5 inches during the evening hours of january 3rd through midmorning january 4th heaviest across western portions of the county reduced visibilities and slick roadways led to a few accidents and school delays
     28 tokens: a couple tenths of an inch of freezing rain accrual was reported across the county in addition snowfall sleet amounts of around one half of an inch fell
     11 tokens: a 59 mph wind gust was measured at the cleveland awos
      8 tokens: a tree was blown down on morris road
     11 tokens: a tree fell on a home around 2273 highway 15 south
     46 tokens: lake effect snow showers accumulated to between 2 and 4 inches during the late evening hours of january 3rd through midmorning january 4th heaviest across northwest portions of the county reduced visibilities and slick roadways led to a few accidents and school delays across the region
     14 tokens: a 24 hour storm total rainfall of 381 inches was reported near ash grove
     19 tokens: quarter size hail fell at the intersection of north street and highway 224 on the north side of nacogdoches
     14 tokens: flash flooding washed out portions of county road 19 north of carter canyon road
     16 tokens: a tree was blown down onto county road 5 near the intersection with county road 245
     13 tokens: several trees uprooted along highway 231 between the cities of cleveland and oneonta
     13 tokens: several trees uprooted and power pole downed causing structural damage to a building
     40 tokens: northerly winds gusting to near 50 mph combined with existing snow cover to cause areas of blowing snow with visibilities lowering to below a half mile at times new snow of less than a half inch fell during this time
     29 tokens: westerly winds behind a cold front reach sustained speed of 40 to 45 mph for a few hours the high winds caused spotty power line and traffic light damage
     15 tokens: law enforcement reported a funnel cloud near the intersection of us 98 and conners highway
     18 tokens: flooding reported in plaza del caribe flood waters reached the doors of the vehicles in the parking lot
     49 tokens: temperatures reaching daytime highs of 95 to 100 after a period of cool weather were accompanied by humid conditions with the heat index rising to a little above 100 degrees an unknown number of people suffered from heat stress heat exhaustion or dehydration as reported by hospital emergency rooms
     26 tokens: woodland high school lunchroom roof lifted off along with numerous trees uprooted and power lines downed in the town of woodland trees uprooted across randolph county
     49 tokens: temperatures reaching daytime highs of 95 to 100 after a period of cool weather were accompanied by humid conditions with the heat index rising to a little above 100 degrees an unknown number of people suffered from heat stress heat exhaustion or dehydration as reported by hospital emergency rooms
     49 tokens: temperatures reaching daytime highs of 95 to 100 after a period of cool weather were accompanied by humid conditions with the heat index rising to a little above 100 degrees an unknown number of people suffered from heat stress heat exhaustion or dehydration as reported by hospital emergency rooms
     20 tokens: thunderstorm winds caused tree damage including several branches blown down tree debris damaged power lines and caused a power outage
     49 tokens: reports of slideoffs and accidents along with school delays were common on january 12th due to snow and blowing snow snow accumulations across the country generally ranged between 2 and 3 inches the accumulating snow combined with temperatures falling into the teens and reduced visibilities created difficult driving conditions
     49 tokens: reports of slideoffs and accidents along with school delays were common on january 12th due to snow and blowing snow snow accumulations across the country generally ranged between 2 and 3 inches the accumulating snow combined with temperatures falling into the teens and reduced visibilities created difficult driving conditions
     23 tokens: six utility poles bent on 4000n between 6000e and 8000e and a half dozen trees downed radar indications are this was a microburst
     13 tokens: a tree was blown down along shell point road and spring creek highway
      5 tokens: heavy rain fell from thunderstorms
     19 tokens: trees were taken down at the intersection of 202 and route 10 in morris plains due to thunderstorm winds
     10 tokens: there were 2 reports of trees down in quitman county
      7 tokens: trees were blown down on highway 216
      8 tokens: trees were blown down along val del road
     15 tokens: a public report indicated pea to penny sized hail near majors field in greenville tx
     49 tokens: reports of slideoffs and accidents along with school delays were common on january 12th due to snow and blowing snow snow accumulations across the country generally ranged between 1 and 3 inches the accumulating snow combined with temperatures falling into the teens and reduced visibilities created difficult driving conditions
     11 tokens: golf ball size hail was reported six miles west of hemingford
     25 tokens: heavy rain and snow melt combined to cause minor flooding on the presumpscot river at westbrook flood stage 150 ft which crested at 1717 ft
     13 tokens: trees and wires down on western ave in morristown due to thunderstorm winds
      5 tokens: heavy rain fell from thunderstorms
     10 tokens: trees and power lines were blown down on highway 52
     23 tokens: downed tree blocking the road along the 500 block of ridgewood road radar estimated winds in excess of 60 mph in the vicinity
     17 tokens: a tree was blown down on county road 58 near the border of franklin and gulf counties
     12 tokens: a tree was blown down at courtney grade road near puckett road
     10 tokens: a tree was taken down due to thunderstorm wind gusts
     36 tokens: near lake darby 34 inches of snow was measured a social media post from north of grove city showed that 3 inches of snow fell there the port columbus international airport recorded 23 inches of snow
      5 tokens: reported by ew8188 in frederick
     13 tokens: the squaw peak raws recorded a gust to 77 mph at 130239 pst
      9 tokens: hail from a thunderstorm was estimated at 75 inches
     21 tokens: liberty county dispatch reported power lines down at the intersection of highway 84 and leroy coffer highway due to gusty winds
     24 tokens: liberty county dispatch reported a tree and a power line down at the intersection of highway 17 and phillips road due to gusty winds
     16 tokens: the juniper creek raws recorded wind chill temperatures ranging from 16f to 24f during this interval
     22 tokens: the flynn prairie raws recorded several gusts exceeding 57 mph during this interval the peak gust was 67 mph at 170713 pst
     13 tokens: dutch harbor asos experienced a peak gust to 82 knots during this time
     10 tokens: several large trees were taken down due to thunderstorm winds
      6 tokens: thunderstorm winds took down numerous trees
     19 tokens: three inchesof snow was measured 5 miles south of heath a spotter measured 26 inches of snow in johnstown
     17 tokens: northeast of troy a spotter measured 28 inches of snow southwest of town odot measured 2 inches
      5 tokens: reported by dw3148 falling waters
     10 tokens: w99 reported reduced visibilities of one quarter mile or less
     17 tokens: the umpqua offshore buoy indicated heavy swell that likely generated high surf along the southern oregon coast
     17 tokens: the port orford buoy indicated heavy swell that likely generated high surf along the southern oregon coast
     17 tokens: the port orford buoy indicated heavy swell that likely generated high surf along the southern oregon coast
     15 tokens: offshore buoys indicated heavy swell that likely generated high surf along the southern oregon coast
     16 tokens: a power line was blown down at highway 71 and industrial road monetary damage was estimated
      8 tokens: weatherflow measured thunderstorm wind gust of 64 mph
     14 tokens: an nws employee measured hail to the size of 125 inches during a thunderstorm
     16 tokens: in plymouth warren avenue was closed at the entrance to plymouth beach due to coastal flooding
     30 tokens: a coop observer reported 4 inches of snow new snow in waterville wa the new snow fell between noon on the 19th and 8am on the 20th of january 2015
     12 tokens: several pictures of golf ball size hail was posted to social media
     15 tokens: a measured wind gust of 56 knots occurred with a thunderstorm at a weatherflow site
     25 tokens: em reported 56 inches of snow in marysville early thursday morning the southwest half of marshall county received 57 inches of snow from this storm
    100 tokens: isolated thunderstorms developed and moved north across the greater phoenix metropolitan area during the late afternoon and early evening hours on july 18th one of the stronger storms moved across phoenix sky harbor airport and generated gusty and damaging winds according to a county official with the city of phoenix at 1825mst damaging thunderstorm outflow winds pulled the roof off of an apartment complex located at the intersection of 26th street and van buren street the apartment complex was about 1 mile northwest of the airport peak wind gusts were estimated to be about 70 knots no injuries were reported
      8 tokens: near circleville an inch of snow was measured
      7 tokens: trees were blown down east of somerset
      8 tokens: three inches of snow was measured near wapakoneta
     63 tokens: a coop observer reported 143 inches of new snow from this passing storm system other new snow amounts associated with this storm system include 82 inches at mazama and 5 inches 12 miles northwest of entiat wa the snow started near 10 pm wednesday reached warning criteria amounts near 10am on thursday the 21st before diminishing substantially early friday morning of january 222016
     11 tokens: two and a half inches of snow was measured near greenville
     14 tokens: a report from east of pickerington showed that 2 inches of snow had fallen
     23 tokens: heavy snotel amounts were 9 inches at dollarhide 11 inches at galena summit 9 inches at hyndman and 10 inches at swede peak
     15 tokens: snowfall amounts were estimated to be between 20 and 30 inches based on reports nearby
      9 tokens: just over two inches of rain fell from thunderstorms
     11 tokens: there were numerous reports of downed trees in the haysi vicinity
     16 tokens: broadcast media reported a tree blown down on a home 2 miles north of mead wa
     35 tokens: two weather stations along the south washington coast reported a few hours of sustained winds between 49 and 51 mph in the early morning a peak gust of 61 mph was measured at cape disappointment
     14 tokens: a member of the public reported 6 inches of snow in east wenatchee wa
     14 tokens: snowfall totaled up to 243 inches near keyser and 235 inches near short gap
     18 tokens: thunderstorm winds caused tree damage including several branches blown down the winds were accompanied by penny size hail
     10 tokens: northwest of chillicothe a half inch of snow was measured
     13 tokens: spotters in trenton and southeast of oxford both measured 2 inches of snow
     47 tokens: numerous trees were downed from a storm that would go on to produce a tornado at nugget lake see separate entry for the tornado local law enforcement officials also indicated a trailer house was blown over at highway 10 and 490th avenue in the town of salem
     11 tokens: snow accumulated 2 to 5 inches including 40 inches near pipestone
     21 tokens: snow amounts up to 15 inches were measured across warren county cocorahs station mcminnville 85 ese measured 14 inches of snow
      9 tokens: trees and wires taken down due to thunderstorm winds
     32 tokens: areas of both low visibilities from fog and icy surfaces from freezing drizzle combined to make travel hazardous from the night of january 6th to the early daylight hours of january 7th
     12 tokens: a coop observer reported 71 inches of snow in holden village wa
     41 tokens: measurements and estimates of 4 to 8 inches of snow were received across pottawatomie county the heaviest snow occurred across the northwest half of the county 6 inches was measured by coop in blaine in the northern parts of the county
     32 tokens: areas of both low visibilities from fog and icy surfaces from freezing drizzle combined to make travel hazardous from the night of january 6th to the early daylight hours of january 7th
     32 tokens: areas of both low visibilities from fog and icy surfaces from freezing drizzle combined to make travel hazardous from the night of january 6th to the early daylight hours of january 7th
     12 tokens: several trees and power wires taken down due to thunderstorm wind gusts
     18 tokens: snow accumulated 2 to 5 inches over the southeastern part of yankton county including 32 inches at yankton
      6 tokens: law enforcement reported several trees down
     12 tokens: power lines were blown down on the 4100 block of jordan ave
     12 tokens: local media reported trees and power lines were down in oakland township
     10 tokens: five people died in las vegas of heat related causes
     10 tokens: a woman died of heat related causes in death valley
      7 tokens: eight homes were flooded in dolan springs
     18 tokens: over 9 inches of snow was reported in alamo the heavy wet snow resulted in scattered power outages
      5 tokens: emergency management reported trees down
      8 tokens: a trained spotter reported trees and wires down
      5 tokens: the public reported trees down
     23 tokens: a tree was down on south shore road near old forge in the town of webb blocking the roadway due to thunderstorm winds
      8 tokens: the sacramento wash flooded the oatman topock highway
     11 tokens: a wind gust to 65 mph was reported at joplin 3sw
     37 tokens: frequent wind gusts of 40 to 50 mph resulted in numerous trees down across vance county including on homes cars and power lines numerous customers lost power in vance county as a result of the strong winds
     28 tokens: the stuart airport awos ksua recorded peak wind gusts of 35 knots as a strong thunderstorm crossed the coast and continued across the intracoastal and nearshore atlantic waters
     11 tokens: one to three inches of snowfall with some light ice accretion
      6 tokens: emergency management reported numerous trees down
     16 tokens: highway 95 was impassable from vidal junction to mile marker 24 due to flooding and debris
     35 tokens: usaf wind tower 0300 recorded a peak gust of 41 knots from the southwest as a strong thunderstorm exited merritt island and continued across the banana river barrier island and into the nearshore atlantic waters
     32 tokens: the awos at the new smryna beach airport kevb reported winds up to 38 knots from the westsouthwest as a strong thunderstorm exited the coast and continued over the nearshore atlantic waters
     37 tokens: the vero beach airport asos kvrb measured a gust to 34 knots from the southsouthwest as a line of strong thunderstorms exited the mainland and continued rapidly east across the intracoastal waterways barrier islands and nearshore atlantic
     14 tokens: eight inches of snow was reported in barryton 73 inches was reported in sylvester
     12 tokens: fourteen inches of snow fell in riverdale ten inches fell in alma
     14 tokens: the interstate 80 at grassey sensor reported a peak wind gust of 58 mph
     11 tokens: quarter size hail was reported near p highway near rocky point
     34 tokens: an estimated 35 inches of rain fell causing water to flow over roadways at highway 132 and hayward stabe road and rupe imo and skeleton wood and imo and wheat capital and highway 132
     14 tokens: thunderstorm winds snapped a one to two foot diameter tree at chaparral high school
     10 tokens: mud and debris were on interstate 15 at exit 64
     12 tokens: a wind gust to 74 mph was reported at gallatin gateway 16se
     58 tokens: torrential rainfall of 12 to 15 inches caused widespread flash flooding across the county the heavy rains caused at least 8 dams to breach in cumberland county numerous roads were closed due to flooding including portions of interstate 95 numerous homes and businesses were flooded as well with numerous water rescues from people trapped in homes and vehicles
     19 tokens: measured wind gusts of 40 to 45 mph knocked down isolated tree limbs that resulted in isolated power outages
     15 tokens: a tree was reported down on black hollow road in arlington due to thunderstorm winds
     16 tokens: a trained weather spotter observed pennysized hail falling near state roads 50 and 429 in ocoee
     24 tokens: a foot of snow was reported in comstock park there were numerous reports of ten to eleven inches of snow across southern kent county
     22 tokens: local emergency management relayed a report of a tree down southwest of somerset shingles were also blown off of a roof nearby
      8 tokens: carpet barn road was closed due to flooding
     10 tokens: highway e near barker creek was closed due to flooding
      6 tokens: flash flooding covered kelso cima road
     24 tokens: approximately 10 vehicles were stuck in flood waters at david drive and river drive sections of needles highway near capri road also washed away
     20 tokens: a light pole was blown down on the neil street onramp to westbound i74 in champaign at around 1400 cst
      6 tokens: lightning set fire to a house
     13 tokens: street flooding was reported at orange street and market street near hogans creek
     10 tokens: a trained spotter measured a wind gust of 70 mph
      9 tokens: several large tree limbs were blown down in gainesville
     15 tokens: an nws employee reported heavy freezing rain causing very icy conditions along the glenn highway
     39 tokens: frequent wind gusts of 30 to 40 mph resulted in multiple reports of trees down across person county including on homes cars and power lines some customers lost power in person county as a result of the strong winds
     15 tokens: a wind gust to 67 mph was reported at bynum 13w the dellwo mcscn site
     57 tokens: a short tornado track was determined along cr53 just north of its intersection with cr26 this was a concentrated area of damage with large trees uprooted and snapped near a residence one of the trees had a small amount of debarking with large limbs removed this tornado was rated ef0 with max winds estimated at 85 mph
     19 tokens: numerous trees were blown down in the area along with numerous power outages reported by the wiregrass electric coop
     14 tokens: a severe thunderstorm producing winds estimated near 60 mph knocked down trees near karthus
     13 tokens: wires were reported down on route 41 in sheffield due to thunderstorm winds
     10 tokens: golf ball size hail fell 1 mile south of alanreed
     14 tokens: trees were blown down and roofs and siding were damaged in the laughlin area
     18 tokens: winds caused isolated damage removing the roof from a trailer home a wall came down with the roof
     86 tokens: this was the second tornado to develop in northwest houston county spawned by the same parent thunderstorm after initially developing in houston county the tornado crossed into extreme southeast dale county before moving back into houston county in the murphy mill road area there was a small area of ef1 damage along murphy mill road where many large diameter pine trees were snapped and uprooted this tornado likely lifted before reaching us highway 231 this tornado was rated ef1 with max winds estimated near 100 mph
     14 tokens: bar pilot dispatcher reported a brief waterspout in the columbia river no damage reported
     12 tokens: public reported thor road flooding near track road just south of pelion
      7 tokens: penny size hail was reported via mping
     33 tokens: usaf wind tower 1007 recorded a peak wind gust over 35 knots near playalinda beach as a strong squall line exited the peninsula and raced eastward across the intracoastal and nearshore coastal waters
     23 tokens: the asos at vero beach airport kvrb recorded peak winds of 38 knots as a strong squall line passed by and continued offshore
     11 tokens: estimated wind gust of 60 mph reported north of hazel green
     10 tokens: the wind gust was measured by a davis weather system
     17 tokens: a few dime to quarter sized hailstones fell along with brief heavy rain and very strong winds
     10 tokens: a large tree was downed in dartmouth blocking reed road
     16 tokens: a tree was snapped off at its base and the fordville scale house was blown down
     17 tokens: the grand canyon airport asos measured a peak wind gust of 59 mph at 207 pm mst
     21 tokens: trees were toppled and power lines brought down by wind gusts estimated at up to 60 mph the time is estimated
     93 tokens: weather observers across cumberland county reported snowfall amounts of 3 to 5 inches winds gusting to between 45 and 55 mph created whiteout conditions from 1000 to 1400 cst snowcovered roads and poor visibility due to falling and blowing snow contributed to numerous traffic accidents across the county especially on i57 a fatal traffic accident occurred on il130 south of greenup when a semi truck collided with another vehicle a 57 yearold male in the vehicle was killed in addition many trees and power lines were blown down resulting in scattered power outages
     11 tokens: local media relayed a report of roof damage to a home
     19 tokens: heavy rainfall over southern sections of alexandria produced flooded roadways some roadways had 2 feet of water over them
     39 tokens: strong north winds behind a cold front pushed the tide levels to or below 1 mllw for 2 tide cycles at sabine pass the tide fell to a lowest level of 19 mllw during the morning of the 24th
     21 tokens: two to three inches of snow and gusty southeast winds up to 25 mph created snow covered roads and hazardous travel
     68 tokens: weather observers across edgar county reported snowfall amounts of 4 to 6 inches winds gusting to between 40 and 50 mph created whiteout conditions from 1000 to 1300 cst snowcovered roads and poor visibility due to falling and blowing snow contributed to numerous traffic accidents across the county especially on us150 and us36 in addition many trees and power lines were blown down resulting in scattered power outages
     11 tokens: a picutre of quarter size hail was received through social media
     13 tokens: multiple power poles were knocked down along patton road relayed via social media
      8 tokens: power lines were knocked down on huntsville road
     20 tokens: a large tree was knocked down and blocking the road on mt olive drive at the intersection of section road
     14 tokens: a tree was knocked down along al 277 in stevenson time estimated by radar
      7 tokens: funnel cloud reported did not touch down
      9 tokens: strong winds hit the grand forks air force base
      8 tokens: a tree was knocked down onto a home
     21 tokens: a 30 by 40 foot section of metal roofing was blown onto the intersection of miller and gray roads in gurley
     21 tokens: two to three inches of snow and gusty southeast winds up to 25 mph created snow covered roads and hazardous travel
     21 tokens: two to three inches of snow and gusty southeast winds up to 25 mph created snow covered roads and hazardous travel
      8 tokens: trees were knocked down on paint hollow road
      7 tokens: trees were knocked down on bellview road
      7 tokens: trees were knocked down on blanche road
     12 tokens: large trees were downed by severe storm winds in the spring area
     22 tokens: there was street flooding in the town of coldspring there was also water inundating highway 59 south of the town of goodrich
      8 tokens: trees were blown down across county road 65
     22 tokens: a social media post from haydenville showed that 7 inches of snow fell there the cooperative observer in laurelville measured 4 inches
     10 tokens: winds damaged a lightweight tin roof fences and utility poles
     42 tokens: polk county fire rescue reported that multiple 911 calls were received of a tornado briefly touching down in the lake wales area a few trees were found knocked over and two power poles were partial damaged but no structural damage was reported
      8 tokens: several trees uprooted in the town of vincent
     43 tokens: a spotter west of hebron reported 4 inches of snow in that area another near union had 32 inches while a third spotter and broadcast media reported 3 inches fell near burlington and francisville respectively the cvg airport recorded 27 inches of snow
     10 tokens: several trees uprooted in and near the cedar bluff community
     14 tokens: six large trees were knocked down along cr 23 between red bay and vina
     41 tokens: the cocorahs observer southwest of bethel measured 5 inches of snow a spotter north of williamsburg measured 45 inches of snow a nws employee in goshen had 4 inches of accumulation while the odot county garage near amelia had 35 inches
     25 tokens: the airport at kcvg measured a 47 mph gust as did a cwop station in burlington numerous trees were blown down causing significant power outages
     10 tokens: a peak wind of 52 kt 60 mph was reported
     25 tokens: the wind sensor at the rawlins airport measured sustained winds of 40 mph or higher with a peak gust of 60 mph at 151253 mst
     13 tokens: several trees uprooted along highway 43 between tierece road and old fayette road
     18 tokens: the observer near new carlisle measured 2 inches of snow another observer north of springfield measured an inch
     12 tokens: several trees uprooted and power lines downed in the coates bend community
     40 tokens: a public report southeast of washington court house showed that 6 inches of snow fell there a social media post from new martinsburg had 5 inches of snow while the cooperative observer south of washington court house measured 4 inches
     10 tokens: the cooperative observer near alpine measured 38 inches of snow
     37 tokens: a nws employee near ogden measured 3 inches of snow another employee north of wilmington and the nws office south of town both measured 23 inches while the odot county garage measured 13 inches west of burtonville
     12 tokens: the odot county garage west of springfield measured an inch of snow
     12 tokens: numerous trees uprooted and power lines downed in the city of wetumpka
     16 tokens: the nedor sensor at dalton on highway 385 measured sustained winds of 40 mph or higher
     17 tokens: the nedor sensor at interstate 80 mile post 50 measured sustained winds of 40 mph or higher
      9 tokens: trace amounts of ice were reported around the county
     11 tokens: this wind gust was measured at a lavaca bay mesonet site
      3 tokens: no damage reported
     16 tokens: the public estimated 075 inch hail in wind point and relayed their report via social media
     17 tokens: a home weather station near new port richey measured a wind gust to 48 knots 55 mph
     20 tokens: rainfall totals generally ranged from 5 to 9 inches across the county franklin airport fkn reported 878 inches of rain
     32 tokens: snow melt and around an inch of rainfall produced an ice jam on the kennebec river at augusta flood stage 120 ft resulting in minor flooding and a crest of 1435 ft
     21 tokens: blizzard conditions were estimated based on observations nearby snowfall reports between 19 and 39 inches were received across southeastern montgomery county
     23 tokens: the patrick air force base awos kcof recorded a peak gust of 34 knots from the northwest as a strong thunderstorm moved offshore
     16 tokens: flash flooding was reported at stevens and hazelwood in borger barricades were setup in those locations
     18 tokens: a home weather station located on indian shores beach measured a wind gust to 39 knots 45 mph
     15 tokens: a home weather station near belleair measured a wind gust of 38 knots 44 mph
     90 tokens: torrential rainfall of 8 to 12 inches caused widespread flash flooding across the county additional heavy rainfall upstream caused moderate flooding along the cape river basin flooding damaged approximately 744 structures throughout the county resulting in 91 million in property damage numerous streets and roads were reported flooded including interstate 95 near dunn with several washouts reported on secondary roads the flooding resulted in 1 direct fatality a 74 year old man died when he drove past a barricade near carolina drive and was swept away into a flooded creek
     12 tokens: visibility was estimated to be around onequarter mile based on observations nearby
     17 tokens: a usgs rain gauge near lakewood ranch measured 752 inches of rain in a 6hr time period
     34 tokens: rainfall totals generally ranged from 3 to 6 inches across the county stampers reported 527 inches of rain healys 1 sse reported 426 inches of rain remlik 1 n reported 371 inches of rain
     58 tokens: heavy rainfall of 7 to 10 inches caused widespread flash flooding across the county roads all throughout the county were closed due to flooding numerous homes and businesses were flooded as well with numerous water rescues from people trapped in homes and vehicles flooding damaged approximately 2503 structures throughout the county resulting in 655 million in property damage
    149 tokens: torrential rainfall of 10 to 14 inches caused widespread flash flooding across the county additional rainfall upstream caused alltime record major flooding along the black river near tomahawk flooding damaged approximately 657 structures throughout the county resulting in 41 million in property damage and and at least 25 million in crop damage numerous roads were flooded all througout the county us 701 was flooded going into both newton grove and garland and nc 24 was flooded between turkey and clinton nc 24 was closed at the county line in autryville with water flowing over the bridge bonnetsville road between salemburg and the avenue was washed out washedout areas were also on edmond matthis road bass lake road mount moriah church road five bridge road fleet cooper road and numerous others numerous homes and businesses were flooded as well with numerous water rescues from people trapped in homes and vehicles
     44 tokens: heavy rainfall of 9 to 12 inches caused widespread flash flooding across the county numerous roads were closed due to flooding numerous homes and businesses were flooded as well flooding damaged approximately 433 structures throughout the county resulting in 31 million in property damage
    197 tokens: torrential rainfall of 9 to 12 inches caused widespread flash flooding across the county additional 5 to 6 inches of rainfall upstream caused alltime record major flooding along the neuse river basin flooding damaged approximately 1160 structures throughout the county resulting in 247 million in property damage and 20 million in crop damage numerous streets and roads were reported flooded causing sinkholes to form including a large sinkhole at mile marker 334 on interstate 40 the flooding resulted in 4 direct fatalities a 19 year old female died when her car was swept away by flood waters into hannah creek on interstate 95 at mile marker 83 near four oaks a 30 year old male died when his vehicle was swept off the road when attempting to drive through flood waters on cornwallis road near nc highway 42 a 67 year old male died when his vehicle was swept away when attempting to go across a floodcovered bridge on highway 210 near galilee road a 51 year old male died when he was …
     21 tokens: the tidal gauge at annapolis indicated moderate flooding water levels through the storm drains approached businesses on dock street in annapolis
     18 tokens: heavy rainfall of 5 to 6 inches caused widespread flash flooding across the county with numerous road closures
    130 tokens: torrential rainfall of 9 to 12 inches caused widespread flash flooding across the county additional heavy rainfall upstream caused major flooding along the tar river basin and along contentnea creek flooding damaged approximately 1174 structures throughout the county resulting in 323 million in property damage and 20 million in crop damage numerous streets and roads were reported flooded with several washouts reported on secondary roads the flooding resulted in 2 direct fatalities a 51 year old female died when the car she was driving was swept off the road in rushing floodwaters along nc highway 581 between renfro road and rock ridge a 65year old male died when his car was swept away by swift water in a creek near the 6400 block of good news church road near saratoga
     41 tokens: rainfall totals generally ranged from 5 to 11 inches across the county benns church 1 wsw reported 1038 inches of rain smithfield reported 883 inches of rain comet reported 870 inches of rain carrollton 2 ese reported 668 inches of rain
     27 tokens: rainfall totals generally ranged from 7 to 10 inches across the county norfolk international airport orf reported 924 inches of rain norview reported 910 inches of rain
     22 tokens: heavy rainfall caused street flooding in rhinelander mainly west of the wisconsin river in the area of davenport street and maple street
     28 tokens: rainfall totals generally ranged from 3 to 6 inches across the county mollusk 1 se reported 398 inches of rain kilmarnock 1 sw reported 311 inches of rain
     30 tokens: rainfall totals generally ranged from 1 inch to 3 inches across the county louisa 1 nnw reported 151 inches of rain zion crossroads 1 nne reported 114 inches of rain
     12 tokens: rainfall totals generally ranged from 2 to 4 inches across the county
     36 tokens: the mount pleasant police department reported longpoint road near needlerush parkway closed due to saltwater flooding at 748 am est a maximum tide level of 771 ft mllw was recorded at the charleston harbor tide gauge
      9 tokens: trace amounts of ice were reported around the county
     11 tokens: severe storm winds caused tree damage in the town of deanville
     18 tokens: there were numerous reports of trees and power lines down throughout dickenson county especially between clincho and haysi
     37 tokens: wind chills of 35 to 40 below zero were common across olmsted county on the morning of january 17th the lowest recorded wind chill by the automated weather observing equipment at the rochester airport was 41 below
     12 tokens: there were a few trees down in the county including in grundy
     12 tokens: flash flooding was reported along john b carter road southeast of fayetteville
     19 tokens: blizzard conditions were reported at reagan national airport snowfall reports were between 18 and 26 inches across arlington county
      5 tokens: thunderstorm winds damaged a fence
     10 tokens: just over two inches of rain fell due to thunderstorms
     13 tokens: several roads closed due to flash flooding with some debris washed into roadways
     11 tokens: a trained spotter estimated 60 mile per hour winds in bridgeport
     20 tokens: a public report of quarter size hail in crossroads was relayed by broadcast media event time was estimated by radar
      8 tokens: snowfall totaled up to 225 inches near dayton
     13 tokens: thunderstorm winds caused tree damage including a large tree blown across a road
     12 tokens: a brief waterspout over northern sarasota bay was reported by the public
     15 tokens: thunderstorm winds blew the roof off a mobile home and also blew down power lines
     14 tokens: a wind gust of 58 mph was recorded at the judith gap dot site
     11 tokens: windows were knocked out at the tom steed reservoir bait shop
     10 tokens: a gust of 61 mph was recorded across the area
     12 tokens: a wind gust of 59 mph was recorded at the baker airport
     11 tokens: thunderstorm winds destroyed two grain bins and damaged a light pole
     70 tokens: a nws survery crew found 25 homes that sustained damage mainly to pool cages roofs garages and carports numerous tree limbs were snapped at the top of the trees with a few being uprooted a few business signs in the area were damaged or destroyed sporadic damage was found along the 3 mile path likely indicating the tornado may have lifted off the ground a time or two before dissipating
     11 tokens: branches were reported down in the northern end of pocahontas county
      9 tokens: the butler awos reported a wind chill of 12
     32 tokens: two to six inches of snow fell across the region the larger totals were in higher elevations and in northern sections of the county an isolated report or two exceeded 7 inches
     27 tokens: less than an inch to two inches of snow fell across the region the larger totals were in higher elevations and in northern sections of the county
     13 tokens: the asos at columbia metro airport reported a wind gust of 51 mph
      9 tokens: almost three inches of heavy rainfall fell with thunderstorms
      7 tokens: several trees downed due to thunderstorm winds
     60 tokens: between 18 and 30 inches of snow fell fell near the sierra crest and in the higher elevations south of lake tahoe at lake level periods of rain or rain mixed with snow cut down totals greatly with only 9 inches of snow in tahoma and just under 6 inches at the south lake tahoe airport and in tahoe city
     18 tokens: nickel to quarter sized hail fell and nearly covered the ground over two inches of rain also fell
     11 tokens: two and a half inches of rain fell due to thunderstorms
      7 tokens: over two inches fell due to thunderstorms
      8 tokens: several trees taken down due to thunderstorm winds
      9 tokens: hail with a thunderstorm was measured at 75 inches
      9 tokens: hail was measured at 34 inch from a thunderstorm
     28 tokens: snowfall amounts of 6 to 7 inches were measured above the 5000 foot level wind gusts of 25 to 35 mph produced areas of blowing and drifting snow
     26 tokens: snowfall amounts of 6 to 10 inches were measured across the area wind gusts of 20 to 35 mph produced areas of blowing and drifting snow
     27 tokens: snowfall amounts of 1 to 3 inches were measured across the area wind gusts of 20 to 30 mph produced some areas of blowing and drifting snow
     17 tokens: county comms reported multiple trees and power lines blown down near highway 74 and old fort rd
     31 tokens: heavy rain resulted in flash flooding at a couple of locations in asheboro colony road and the intersection of patton avenue and thomas street were briefly closed due to high water
     14 tokens: two trees were blown down at a residence approximately 4 miles northnortheast of enfield
      8 tokens: one tree was reported down on morganton road
     24 tokens: four to six inches of snow fell across the region the larger totals were in higher elevations and in northern sections of the county
     24 tokens: the wydot sensor at dana ridge measured sustained winds of 40 mph or higher with a peak gust of 60 mph at 291430 mst
      7 tokens: several trees downed due to thunderstorm winds
     11 tokens: trees and wires downed on centerville road due to thunderstorm winds
     38 tokens: snowfall amounts between 25 and 38 inches were received across frederick county snowfall totaled up to 38 inches near gainesboro a snowfall report of 350 inches was received near stephens city and 245 inches was reported in middletown
      8 tokens: quarter to ping pong ball sized hail fell
      7 tokens: flooding was reported on mirror lake drive
     12 tokens: lightning struck a tree which fell on a house damaging several rooms
     32 tokens: two to six inches of snow fell across the region the larger totals were in higher elevations and in northern sections of the county an isolated report or two exceeded 7 inches
     10 tokens: several trees taken down due to thunderstorm winds in bridgeton
     10 tokens: numerous trees taken down due to thunderstorm winds in fairton
     10 tokens: several trees down on straughn mill road near interstate 295
      9 tokens: almost three inches of rain was measured with thunderstorms
      7 tokens: trees taken down by thunderstorm wind gusts
      6 tokens: a house was struck by lightning
     10 tokens: a 63 mph wind gust was measured from a thunderstorm
      9 tokens: a funnel cloud was observed at 9148 centreville road
     13 tokens: public reported heavy rainfall of 211 inches so far beginning time radar estimated
     13 tokens: power pole and wires taken down due to thunderstorm winds trees also downed
      8 tokens: hail was estimated at 1 inch in diameter
     37 tokens: the department of highways relayed a report of flash flooding at highway 20 four miles west of loup loup summit a debris flow went across the road roughly 6 miles east of twisp wa on highway 20
      8 tokens: trees and wires downed on bunker hill road
     12 tokens: a 53 mph thunderstorm wind gust was measured by a weatherflow site
     19 tokens: severe thunderstorm wind gusts around 60 mph downed trees along south mt pleasant avenue between monroeville and highway 84
      9 tokens: severe thunderstorm wind gusts downed trees along oakley road
     12 tokens: numerous wires were reported down at route 27 at davils mill rd
     15 tokens: fd reported a tree blown down on a home causing significant damage on lakeside loop
     18 tokens: one shallow rooted oak tree was blown over wind speeds were estimated to be 60 miles per hour
     22 tokens: snow accumulated 3 to 6 inches including 60 inches near pukwana the snow caused slippery roads which resulted in a few accidents
     48 tokens: a bow echo producing winds estimated at 80 mph produced a corridor of wind damage along and north of straughn school road which is northeast of andalusia numerous trees were uprooted with power lines also downed a tree fell onto a home on country drive causing considerable damage
     11 tokens: a tree fell and damaged utility equipment off of mcdaniel road
     36 tokens: social media reports of at least a couple of dozen trees blown down across far northern iredell county with one on a house causing a brief entrapment the roof of a gas station was also damaged
     47 tokens: county comms and highway patrol reported multiple trees blown down across roads in southwest greenwood county from the intersection of alexander and briarwood rd south to just north and east of bradley part of a roof was reported to be damaged on breezewood rd east of bradley
     39 tokens: westerly winds behind a cold front reach sustained speed of 40 to 45 mph for a few hours a gust to 63 mph was measured near wessington springs the high winds caused spotty power line and traffic light damage
     15 tokens: a few trees were blown down in the cranfield liberty road area south of cranfield
      9 tokens: public reported quarter size hail on sam dee rd
     63 tokens: the stream gauge on potomac river at point of rocks reached flood stage the gauge peaked at 16 feet at 0015 est the parking lots at both the mckimmey and brunswick boat ramps began to flood flooding of an agricultural field adjacent to the mckimmey boat ramp occurred about half the lower parking lot of the point of rocks boat ramp also flooded
     10 tokens: old charles town road was closed near the opequon creek
     17 tokens: spiky hail around ping pong ball size was reported near the intersection of highways 82 and 319
     39 tokens: light snow began around noon on january 17th then continued through the afternoon hours storm totals included 23 inches near little egg harbor 15 inches in berkeley township 13 inches in brick township and 10 inches in jackson township
      6 tokens: several roads flooded in the area
     20 tokens: dallas center fire department reported hail just under ping pong ball in size mixed with larger amounts of smaller hail
     39 tokens: westerly winds behind a cold front reach sustained speed of 40 to 45 mph for a few hours a gust to 54 mph was measured at le mars the high winds caused spotty power line and traffic light damage
     23 tokens: trees and power lines were blown down in ocilla in addition a house fire resulted from a downed power line damage was estimated
     11 tokens: trees damaged an suv and a mobile home damage was estimated
     10 tokens: spotter reported brief 34 inch hail off old river rd
     10 tokens: spotters reported around half of an inch across the county
     39 tokens: while lingering light snow after a blizzard produced little additional accumulation continuing strong north to northwest winds produced blowing and drifting of the heavy new snowpack through the morning hours difficult to impossible travel conditions slowly began to ease
     12 tokens: a large cedar tree about two feet in diameter was reported down
     23 tokens: multiple large trees uprooted and blown onto power lines resulting in toppled power lines all which resulted in blockage of the entire roadway
     12 tokens: a large tree was reported down across calhoun st in west baltimore
     44 tokens: county comms and public via social media reported multiple trees blown down in the uptown and central city area the damage was centered in the elizabeth neighborhood where multiple trees fell on vehicles and one tree fell on an apartment building along greenway ave
     13 tokens: the wydot sensor at strouss hill measured peak wind gusts of 58 mph
     27 tokens: the wydot sensor at interstate 80 mile post 249 measured sustained winds of 40 mph or higher with a peak gust of 60 mph at 151355 mst

View the first few preprocessed training documents.

documentsTrain(1:5)
ans = 
  5×1 tokenizedDocument:

     7 tokens: large tree down between plantersville and nettleton
    37 tokens: one to two feet of deep standing water developed on a street on the winthrop university campus after more than an inch of rain fell in less than an hour one vehicle was stalled in the water
    13 tokens: nws columbia relayed a report of trees blown down along tom hall st
    13 tokens: media reported two trees blown down along i40 in the old fort area
    14 tokens: a few tree limbs greater than 6 inches down on hwy 18 in roseland

Convert Documents to Sequences

To input the documents into an LSTM network, use a word encoding to convert the documents into sequences of numeric indices.

To create a word encoding, use the wordEncoding function.

enc = wordEncoding(documentsTrain);

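The next conversion step is to pad and truncate the documents so that they all have the same length. The trainingOptions function provides options to pad and truncate input sequences automatically. However, these options are not well suited to sequences of word vectors. Instead, pad and truncate the sequences manually. Left-padding and truncating the sequences of word vectors might improve training.

To pad and truncate the documents, first choose a target length, then truncate documents that are longer than the target and left-pad documents that are shorter than it. For best results, the target length should be as short as possible without discarding large amounts of data. To find a suitable target length, view a histogram of the training document lengths.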

documentLengths = doclength(documentsTrain);
figure
histogram(documentLengths)
title("Document Lengths")
xlabel("Length")
ylabel("Number of Documents")

Most of the training documents have fewer than 75 tokens. Use this value as the target length for truncation and padding.
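One way to check a candidate target length numerically is to compute how many documents fit within it and how many tokens truncation would discard. The following is a minimal sketch in base MATLAB. The vector exampleLengths is hypothetical stand-in data, not taken from the data set; in this example you would use documentLengths instead.

```matlab
% Hypothetical document lengths (stand-in for documentLengths).
exampleLengths = [7 37 13 13 14 10 9 48 63 120];

% Candidate target length for truncation and padding.
targetLength = 75;

% Fraction of documents that fit without truncation.
fractionKept = mean(exampleLengths <= targetLength);

% Total number of tokens that truncation would discard.
tokensDiscarded = sum(max(exampleLengths - targetLength,0));

fprintf("Documents kept whole: %.0f%%\n",100*fractionKept)
fprintf("Tokens discarded: %d\n",tokensDiscarded)
```

Trying a few candidate values of targetLength this way shows the trade-off between shorter sequences and discarded tokens.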

Convert the documents to sequences of numeric indices using doc2sequence. To truncate or left-pad the sequences to length 75, set the 'Length' option to 75.

XTrain = doc2sequence(enc,documentsTrain,'Length',75);
XTrain(1:5)
ans = 5×1 cell array
    {1×75 double}
    {1×75 double}
    {1×75 double}
    {1×75 double}
    {1×75 double}
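To illustrate what left-padding and truncation do to an index sequence, here is a minimal sketch in base MATLAB. It is not the implementation of doc2sequence. The sequences, the short target length of 5, the padding value of 0, and the choice to keep the first indices when truncating are all assumptions made for illustration.

```matlab
% Hypothetical index sequences of different lengths (not from the data set).
sequences = {[4 8 15], [16 23 42 7 9 11 3]};
targetLength = 5;      % the example uses 75
paddingValue = 0;      % assumed padding value

padded = cell(size(sequences));
for i = 1:numel(sequences)
    seq = sequences{i};
    if numel(seq) >= targetLength
        % Truncate: this sketch keeps the first targetLength indices.
        padded{i} = seq(1:targetLength);
    else
        % Left-pad: prepend padding so the real indices sit at the end.
        numPad = targetLength - numel(seq);
        padded{i} = [repmat(paddingValue,1,numPad) seq];
    end
end

celldisp(padded)
```

With left-padding, the real tokens end at the final time step, which is the step an LSTM layer in 'last' output mode reads. This is one plausible reason why left-padding can help training.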

Convert the validation documents to sequences using the same options.

XValidation = doc2sequence(enc,documentsValidation,'Length',75);

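Create and Train LSTM Network

Define the LSTM network architecture. To input sequence data into the network, include a sequence input layer and set the input size to 1. Next, include a word embedding layer of dimension 100 with the same number of words as the word encoding. Next, include an LSTM layer and set the number of hidden units to 180. To use the LSTM layer for a sequence-to-label classification problem, set the output mode to 'last'. Finally, add a fully connected layer with the same size as the number of classes, a softmax layer, and a classification layer.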

inputSize = 1;
embeddingDimension = 100;
numWords = enc.NumWords;
numHiddenUnits = 180;
numClasses = numel(categories(YTrain));

layers = [ ...
    sequenceInputLayer(inputSize)
    wordEmbeddingLayer(embeddingDimension,numWords)
    lstmLayer(numHiddenUnits,'OutputMode','last')
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer]
layers = 
  6x1 Layer array with layers:

     1   ''   Sequence Input          Sequence input with 1 dimensions
     2   ''   Word Embedding Layer    Word embedding layer with 100 dimensions and 16954 unique words
     3   ''   LSTM                    LSTM with 180 hidden units
     4   ''   Fully Connected         39 fully connected layer
     5   ''   Softmax                 softmax
     6   ''   Classification Output   crossentropyex
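As a rough check on the size of this architecture, the number of learnable parameters can be estimated from the sizes in the layer summary. The following sketch uses base MATLAB arithmetic only. It assumes that the embedding weights form a Dimension-by-NumWords array with no extra out-of-vocabulary vector, and that the LSTM layer has input weights, recurrent weights, and a bias for each of its four internal blocks (three gates and the cell candidate).

```matlab
% Sizes taken from the layer summary.
embeddingDimension = 100;
numWords = 16954;
numHiddenUnits = 180;
numClasses = 39;

% Word embedding: one vector per word (assumes no extra out-of-vocabulary vector).
numEmbedding = embeddingDimension*numWords;

% LSTM: four blocks, each with input weights, recurrent weights, and a bias.
numLSTM = 4*numHiddenUnits*(embeddingDimension + numHiddenUnits + 1);

% Fully connected: weights plus one bias per class.
numFC = numClasses*(numHiddenUnits + 1);

numTotal = numEmbedding + numLSTM + numFC;
fprintf("Embedding: %d, LSTM: %d, FC: %d, total: %d\n", ...
    numEmbedding,numLSTM,numFC,numTotal)
```

Under these assumptions, the word embedding layer accounts for most of the parameters, so the vocabulary size largely determines the model size.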

Specify the training options. Set the solver to 'adam', train for 10 epochs, and set the gradient threshold to 1. Set the initial learning rate to 0.01. To monitor the training progress, set the 'Plots' option to 'training-progress'. Specify the validation data using the 'ValidationData' option. To suppress verbose output, set 'Verbose' to false.

By default, trainNetwork uses a GPU if one is available (requires Parallel Computing Toolbox™ and a CUDA® enabled GPU with compute capability 3.0 or higher). Otherwise, it uses the CPU. To specify the execution environment manually, use the 'ExecutionEnvironment' name-value pair argument of trainingOptions. Training on a CPU can take significantly longer than training on a GPU.

options = trainingOptions('adam', ...
    'MaxEpochs',10, ...
    'GradientThreshold',1, ...
    'InitialLearnRate',0.01, ...
    'ValidationData',{XValidation,YValidation}, ...
    'Plots','training-progress', ...
    'Verbose',false);

Train the LSTM network using the trainNetwork function.

net = trainNetwork(XTrain,YTrain,layers,options);

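Test LSTM Network

To test the LSTM network, first prepare the test data in the same way as the training data. Then make predictions on the preprocessed test data using the trained LSTM network net.

Preprocess the test data using the same steps as for the training documents.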

textDataTest = lower(textDataTest);
documentsTest = tokenizedDocument(textDataTest);
documentsTest = erasePunctuation(documentsTest);

Convert the test documents to sequences using doc2sequence with the same options as when creating the training sequences.

XTest = doc2sequence(enc,documentsTest,'Length',75);
XTest(1:5)
ans = 5×1 cell array
    {1×75 double}
    {1×75 double}
    {1×75 double}
    {1×75 double}
    {1×75 double}

Classify the test documents using the trained LSTM network.

YPred = classify(net,XTest);

Calculate the classification accuracy. The accuracy is the proportion of labels that the network predicts correctly.

accuracy = sum(YPred == YTest)/numel(YPred)
accuracy = 0.8691
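Overall accuracy can hide weak performance on individual classes, so it can help to look at accuracy per class as well. The following is a minimal sketch in base MATLAB. The label arrays YTestExample and YPredExample are hypothetical stand-ins for YTest and YPred, and the three class names are used only for illustration.

```matlab
% Hypothetical labels (stand-ins for YTest and YPred).
classNames = ["Flash Flood" "Hail" "Thunderstorm Wind"];
YTestExample = categorical(["Hail";"Flash Flood";"Hail";"Thunderstorm Wind"],classNames);
YPredExample = categorical(["Hail";"Hail";"Hail";"Thunderstorm Wind"],classNames);

% Overall accuracy: fraction of matching labels.
overallAccuracy = mean(YPredExample == YTestExample);

% Per-class accuracy: fraction of each true class that is predicted correctly.
perClassAccuracy = zeros(numel(classNames),1);
for k = 1:numel(classNames)
    isClass = YTestExample == classNames(k);
    perClassAccuracy(k) = mean(YPredExample(isClass) == classNames(k));
end

table(classNames.',perClassAccuracy,'VariableNames',{'Class','Accuracy'})
```

In this toy data, the overall accuracy is 0.75 even though the single "Flash Flood" report is misclassified, which is the kind of detail a per-class view exposes.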

Predict Using New Data

Classify the event type of three new weather reports. Create a string array containing the new weather reports.

reportsNew = [ ...
    "Lots of water damage to computer equipment inside the office."
    "A large tree is downed and blocking traffic outside Apple Hill."
    "Damage to many car windshields in parking lot."];

Preprocess the text data using the same preprocessing steps as for the training documents.

documentsNew = preprocessText(reportsNew);

Convert the text data to sequences using doc2sequence with the same options as when creating the training sequences.

XNew = doc2sequence(enc,documentsNew,'Length',75);

Classify the new sequences using the trained LSTM network.

[labelsNew,score] = classify(net,XNew);

Show the weather reports with their predicted labels.

[reportsNew string(labelsNew)]
ans = 3×2 string array
    "Lots of water damage to computer equipment inside the office."      "Flash Flood"      
    "A large tree is downed and blocking traffic outside Apple Hill."    "Thunderstorm Wind"
    "Damage to many car windshields in parking lot."                     "Hail"             

Preprocessing Function

The function preprocessText performs the following steps:

  1. Tokenize the text using tokenizedDocument.

  2. Convert the text to lowercase using lower.

  3. Erase the punctuation using erasePunctuation.

function documents = preprocessText(textData)

% Tokenize the text.
documents = tokenizedDocument(textData);

% Convert to lowercase.
documents = lower(documents);

% Erase punctuation.
documents = erasePunctuation(documents);

end


Related Topics