Finding e-mail addresses that begin with a given two-character prefix and end with a given domain name, using cell arrays

Hello everyone, I have the following cell arrays:
dom_lists={'000' 'hotmal.com';'001' 'gmail.com';'010' 'yahoo.com'; '011' 'mail.com';'100' 'live.com';'101' 'myspace.com';'110' 'msn.com';'111' 'mynet.com'};
sub_g1={'aj';'ih';'vn';'hu';'eg';'is';'rd';'nt';'me';'ah';'zb';'en';'mm'};
sub_g2={'001';'101'; '101';'101'; '000'; '110'; '001'; '111'; '101'; '000'; '110'; '000'; '000'};
list_emialadre={'aakm@hotmail.com';'abomcn@hotmail.com';...............};
The last array holds 5408 e-mail addresses, grouped by domain name: e.g. hotmail.com has 676 addresses, gmail.com also has 676, and so on.
What I want is to translate each binary code in sub_g2 into the corresponding domain string from dom_lists. After that, I want to use sub_g1 to find the e-mail addresses that begin with the two characters given in sub_g1 and end with the corresponding sub_g2 domain from dom_lists. Please help me solve this problem.

Accepted Answer

Stephen23 on 1 May 2015
Edited: Stephen23 on 2 May 2015
If you split the problem into parts, it is much easier to solve. Here are the data cell arrays, which I altered slightly, e.g. by adding one sample email address that actually matches the second pair+domain combination (the two email addresses given in the question do not match any pair, so without it we would not have a positive result):
dom_lists = {'001' 'gmail.com'; '000' 'hotmal.com';'010' 'yahoo.com'; '011' 'mail.com';'100' 'live.com';'101' 'myspace.com';'110' 'msn.com';'111' 'mynet.com'};
sub_g1 = {'aj'; 'ih'; 'vn'; 'hu'; 'eg'; 'is'; 'rd'; 'nt'; 'me'; 'ah'; 'zb'; 'en'; 'mm'};
sub_g2 = {'001';'101'; '101';'101'; '000'; '110'; '001'; '111'; '101'; '000'; '110'; '000'; '000'};
email_list = {'aakm@hotmail.com'; 'abomcn@hotmail.com'; 'ihzzz@myspace.com'};
First convert the binary strings into numeric values with bin2dec, as this makes them easier to work with:
sub_N = bin2dec(cell2mat(sub_g2));
dom_N = bin2dec(cell2mat(dom_lists(:,1)));
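As a small stand-alone illustration of this conversion step (the variable names and the four sample codes below are made up for the example), the following sketch shows what bin2dec returns for a character matrix of binary strings. Note that cell2mat only works here because every code has the same length:

```matlab
% Hypothetical sample codes, all three characters long.
codes = {'001'; '101'; '000'; '110'};

% Stack the codes into a 4x3 char matrix, one code per row.
% cell2mat would error if the strings had different lengths.
code_chars = cell2mat(codes);

% bin2dec converts each row into its decimal value (a 4x1 column vector).
code_nums = bin2dec(code_chars);

disp(code_nums.')   % display the values as a row
```

Comparing numbers instead of strings is what allows the plain equality test in the next step.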
Then simply match these numeric values using bsxfun, and extract the correct domain strings:
[row,~] = find(bsxfun(@eq,dom_N,sub_N.'));
dom_g2 = dom_lists(row,2);
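The same lookup could probably also be done directly on the strings with ismember, skipping the numeric conversion. This is only an alternative sketch, not the method used in this answer, and it uses a shortened copy of the data so that it runs on its own:

```matlab
% Shortened copies of the lookup table and the code list (sample data only).
dom_lists = {'001' 'gmail.com'; '101' 'myspace.com'; '110' 'msn.com'};
sub_g2    = {'001'; '101'; '101'; '110'};

% For each code in sub_g2, find the row of dom_lists that holds the same code.
[found, row] = ismember(sub_g2, dom_lists(:,1));

% Guard against codes that are missing from the lookup table.
assert(all(found), 'Every code in sub_g2 must appear in dom_lists.');

% Pick the domain string that belongs to each code, in the order of sub_g2.
dom_g2 = dom_lists(row, 2);
disp(dom_g2)
```

Either way, dom_g2 ends up with one domain per entry of sub_g2, in the same order, so it lines up with sub_g1 element by element.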
Then create regular expressions for regexp based on the character-pair and domain strings, and use these to locate the matching email addresses:
rgx = strcat('^',sub_g1,'.*@',strrep(dom_g2,'.','\.'),'$');
mtc = cellfun(@(s)regexp(email_list,s),rgx, 'UniformOutput',false);
out = ~cellfun('isempty',[mtc{:}]);
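To make the pattern construction concrete, here is a minimal sketch for a single pair+domain combination ('ih' with myspace.com, the second entry of the data above). The names prefix, domain, one_rgx and hits are hypothetical and exist only for this illustration:

```matlab
% One character pair and its domain (second pair+domain of the sample data).
prefix = 'ih';
domain = 'myspace.com';

% Escape the dots: an unescaped '.' would match any character.
one_rgx = ['^', prefix, '.*@', strrep(domain, '.', '\.'), '$'];
disp(one_rgx)   % the pattern that strcat builds for this pair

% Test the three sample addresses against this single pattern.
emails = {'aakm@hotmail.com'; 'abomcn@hotmail.com'; 'ihzzz@myspace.com'};
hits = ~cellfun('isempty', regexp(emails, one_rgx, 'once'));
disp(hits.')
```

The anchors ^ and $ tie the pair to the start of the address and the domain to its end, so an address that merely contains 'ih' somewhere in the middle is not reported.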
The result out can be displayed in the command window:
>> out
out =
0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0 0 0 0
out is a logical array where each row corresponds to one of the given email addresses (only three in the sample data!) and each column corresponds to one pair+domain combination from sub_g1 and sub_g2 (thus thirteen columns). From this array we can see that the third email address matches the second pair+domain combination ('ih' with code '101', i.e. myspace.com), which is what was stated at the beginning, so the algorithm has successfully detected this positive test case.
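If the matching addresses themselves are wanted rather than the logical array, something along these lines should work. It is a sketch that rebuilds the sample result by hand so that it runs on its own; matched, email_idx and pair_idx are names introduced here for illustration:

```matlab
% Sample data and the 3x13 result from above, rebuilt by hand.
email_list = {'aakm@hotmail.com'; 'abomcn@hotmail.com'; 'ihzzz@myspace.com'};
sub_g1 = {'aj'; 'ih'; 'vn'; 'hu'; 'eg'; 'is'; 'rd'; 'nt'; 'me'; 'ah'; 'zb'; 'en'; 'mm'};
out = false(3, 13);
out(3, 2) = true;

% Addresses that match at least one pair+domain combination.
matched = email_list(any(out, 2));

% Row index = email address, column index = pair+domain combination.
[email_idx, pair_idx] = find(out);

for k = 1:numel(email_idx)
    fprintf('%s matches pair ''%s''\n', ...
        email_list{email_idx(k)}, sub_g1{pair_idx(k)});
end
```

With the full list of 5408 addresses the same indexing applies unchanged; only the number of rows in out grows.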
7 Comments
Stephen23 on 3 May 2015
@abdulkarim hassan: I'm glad to help! On this forum it is also considered polite to accept answers that resolve your questions.
