How to obtain the space between two words in a given sentence?

Hello, I want to obtain the space between words in the given image (i.e., the space between 'A' and 'MOVE', between 'MOVE' and 'to', and so on). I want the space in millimeters (mm). Please help me. Thank you.
Comments: 9
Walter Roberson
17 Dec 2015
The space in millimeters is only available if the image has been calibrated to the real world. You might be able to obtain some camera parameters by using imfinfo(), such as how far away the paper was, the focal length, and the aperture, and use them to calculate the calibration, but more commonly those parameters are not present.
If you are not using a higher-end camera or a scientific instrument, then you should probably not rely on any XResolution or YResolution parameter that imfinfo() reports, though I have seen some smartphones record good information for those.
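As a minimal sketch of what the calibration buys you: once a trustworthy resolution in dots per inch is known, converting a pixel distance to millimeters is plain arithmetic. The gap size and the 300 dpi figure below are assumed values for illustration, not measurements from the posted image.

```matlab
% Hypothetical inputs: a gap measured in pixels and a known scan resolution.
gapPixels = 42;      % assumed gap between two words, in pixels
dpi       = 300;     % assumed resolution (e.g. from scanner settings)
mmPerInch = 25.4;    % exact by definition

% pixels -> inches -> millimeters
gapMm = gapPixels / dpi * mmPerInch;
fprintf('Gap: %.3f mm\n', gapMm);
```

If no reliable resolution is available, photographing a ruler or another object of known size in the same plane as the paper gives the same pixels-per-millimeter factor.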
Meghashree G
17 Dec 2015
Walter Roberson
17 Dec 2015
I suggest that you look at the documentation for the MathWorks ocr() routine. It mentions the name of the tool that they use. You can then search for information about that tool and find the paper that was written about it and the site with the long description of how it works. Included there is a description of the three or four different processing layers that they found they needed to go through in order to determine the number of pixels between the end of one word and the beginning of the next. If I recall correctly, it is something that they were not able to do until they had already tentatively recognized two adjacent words and checked the dictionary and the corpus statistics to determine the probability that the two adjacent words were really one word with a longer spacing between letters.
To phrase this a different way: you cannot just use the spacing between letters to determine whether you have reached the end of a word when you are working with handwritten letters. Because of that, there is no easy way to determine the space between words: without going through a lot of work, you cannot be sure whether a gap is a gap between letters in the same word or a gap between adjacent words. If you were hoping to be able to break up the text into words according to spacing and then apply recognition one word at a time, that is not going to work.
Meghashree G
17 Dec 2015
Walter Roberson
17 Dec 2015
No.
Look at your image again. There is what appears to be an "I", then a gap, then an M, then a gap, then an O, then a gap, and then a V, but the M, O, and V are all part of the same word. You might notice that part of the M is under part of the O and say that the gap there is not a word-ending gap because of the overlap, and that is a good hint (but not a rule). But there does not appear to be a horizontal overlap between the O and the V, so if you just go by whether there is a gap or not, you would say that MO is one word, V is a second word, and E is a third word. You know in your brain that MOVE forms a word there, but you are mentally making assessments about how big the gap between letters is compared to the average letter height and width in that region.
For your first stage you need to look at the gap between letters, not at the gap between words.
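One way to picture the "gap compared to average letter width" assessment is the rough sketch below. It runs on a synthetic image of rectangular "letters", requires the Image Processing Toolbox, and the threshold of 1.0 letter widths is a made-up value chosen for illustration. As noted above, a fixed ratio like this is only a hint and will misjudge real handwriting.

```matlab
% Synthetic binary line of text: true = ink, five 8-pixel-wide "letters".
bw = false(20, 80);
bw(5:15,  3:10) = true;
bw(5:15, 13:20) = true;   % 2 empty columns before this letter
bw(5:15, 24:31) = true;   % 3 empty columns
bw(5:15, 44:51) = true;   % 12 empty columns (intended word break)
bw(5:15, 54:61) = true;   % 2 empty columns

% Bounding boxes are [x y width height]; sort them left to right.
props  = regionprops(bwlabel(bw), 'BoundingBox');
boxes  = sortrows(vertcat(props.BoundingBox), 1);
widths = boxes(:, 3);

% Gap = left edge of the next box minus right edge of the current box.
gaps = boxes(2:end, 1) - (boxes(1:end-1, 1) + boxes(1:end-1, 3));

% Express each gap in units of the typical letter width.
ratio       = gaps / median(widths);
isWordBreak = ratio > 1.0;   % hypothetical threshold, not a rule
disp([gaps, ratio, isWordBreak]);
```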
Meghashree G
17 Dec 2015
Image Analyst
17 Dec 2015
You can sum vertically to get the horizontal profile, then threshold, then use diff() to find the first white pixel and then the next black pixel in the profile. Please try it.
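The profile approach can be sketched as follows. This is an illustration on a synthetic binary image in which ink is true; the image and variable names are assumptions, and a real image would first need to be binarized so that the text is the foreground.

```matlab
% Synthetic binary image: true = ink, three blobs on one text line.
bw = false(20, 60);
bw(5:15,  5:14) = true;   % columns 5..14
bw(5:15, 20:29) = true;   % columns 20..29
bw(5:15, 45:54) = true;   % columns 45..54

colSum = sum(bw, 1);      % sum each column to get the horizontal profile
hasInk = colSum > 0;      % threshold the profile

% Pad with false so that runs touching the image border are also detected.
d        = diff([false, hasInk, false]);
runStart = find(d == 1);        % first ink column of each run
runEnd   = find(d == -1) - 1;   % last ink column of each run

% Number of empty columns between consecutive runs.
gaps = runStart(2:end) - runEnd(1:end-1) - 1;
disp(gaps);
```

Because the profile only sees columns, letters that overlap horizontally merge into one run, which is usually what you want for slanted handwriting.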
Meghashree G
17 Dec 2015
Meghashree G
18 Dec 2015
Accepted Answer
More Answers (1)
harjeet singh
17 Dec 2015
Dear Meghashree, try this code:

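clear
close all
clc
% Read and display the input image.
% (Named img so that it does not shadow the built-in image function.)
img = imread('capture.png');
figure(1)
imshow(img)
drawnow
% Binarize: dark ink becomes true (threshold 150 on the first channel).
img_1 = img(:,:,1) < 150;
figure(2)
imshow(img_1)
drawnow
% Dilate so that the letters of one word merge into a single blob.
se = strel('disk', 5);
img_2 = imdilate(img_1, se);
figure(3)
imshow(img_2)
drawnow
% Label the blobs and crop each one out of the original image.
[lab, num] = bwlabel(img_2);
nCols = ceil(sqrt(num));   % subplot grid large enough for any number of blobs
nRows = ceil(num / nCols);
figure(4)
for i = 1:num
    [r, c] = find(lab == i);
    img_3 = img(min(r):max(r), min(c):max(c), :);
    subplot(nRows, nCols, i)
    imshow(img_3);
    drawnow
end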
Comments: 13
Meghashree G
18 Dec 2015
Walter Roberson
18 Dec 2015
Once you have regions, call regionprops() on them to get their bounding boxes. The difference between the last position implied by one bounding box and the first position implied by the next bounding box is the number of pixels between them.
Meghashree G
18 Dec 2015
Meghashree G
18 Dec 2015
Walter Roberson
18 Dec 2015
propied(2).BoundingBox(1) - (propied(1).BoundingBox(1)+propied(1).BoundingBox(3))
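To make the expression above concrete, here is a small sketch on a synthetic image (Image Processing Toolbox required). BoundingBox is [x y width height], so x + width is the right edge of a region. The variable name propied follows the thread; the image and the sorting step are assumptions added for illustration, since regions are not guaranteed to come back in reading order.

```matlab
% Synthetic binary image with three word-like blobs (true = ink).
bw = false(20, 60);
bw(5:15,  5:14) = true;
bw(5:15, 20:29) = true;
bw(5:15, 45:54) = true;

propied = regionprops(bwlabel(bw), 'BoundingBox');

% The expression from the comment above, for the first two regions.
gap12 = propied(2).BoundingBox(1) - ...
        (propied(1).BoundingBox(1) + propied(1).BoundingBox(3));

% The same calculation for every neighbouring pair, sorted left to right.
boxes = sortrows(vertcat(propied.BoundingBox), 1);
gaps  = boxes(2:end, 1) - (boxes(1:end-1, 1) + boxes(1:end-1, 3));
fprintf('gap12 = %g pixels\n', gap12);
disp(gaps.');
```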
Meghashree G
18 Dec 2015
Walter Roberson
18 Dec 2015
What error did you get in your code when you implemented Image Analyst's suggestion?
Meghashree G
18 Dec 2015
Walter Roberson
18 Dec 2015
Your profiling code is acting on white values, not on black values. To fix it you need to form your binary image from places less than 128 rather than from places greater than 128.
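The difference between the two comparisons can be seen on a tiny synthetic grayscale image with dark strokes on a white page. This is an illustrative sketch; the pixel values and sizes are made up.

```matlab
% White page (255) with two dark strokes.
gray = 255 * ones(10, 30, 'uint8');
gray(3:8,  4:9)  = 20;
gray(3:8, 18:25) = 40;

inkWrong = gray > 128;   % selects the paper, not the writing
ink      = gray < 128;   % selects the dark writing

% Column profile of each mask: which columns contain foreground pixels?
colsWrong = any(inkWrong, 1);
colsInk   = any(ink, 1);

fprintf('Foreground columns with > 128: %d of %d\n', nnz(colsWrong), numel(colsWrong));
fprintf('Foreground columns with < 128: %d of %d\n', nnz(colsInk), numel(colsInk));
```

With the greater-than comparison every column contains foreground, so the profile never drops to zero and no gaps are found.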
Meghashree G
18 Dec 2015
lotus whit
26 Jul 2016
Could you please help me separate each word in a text image, and then separate each letter within a word with a bounding box? I tried this code, but it does not work.
%% Image segmentation and extraction
%% Read image
%% dis_letter = 4;
%% dis_word = 30;
close all;
clear all;
fontSize = 20;
imagen = imread('scan0001.jpg');
%% Show image
figure(1)
imshow(imagen);
title('INPUT IMAGE WITH NOISE')
%% Convert to gray scale
if size(imagen,3) == 3 % RGB image
    imagen = rgb2gray(imagen);
end
%% Convert to binary image
threshold = graythresh(imagen);
imagen = ~im2bw(imagen, threshold);
%% Remove all objects containing fewer than 15 pixels
imagen = bwareaopen(imagen, 15);
pause(1)
%% Show binary image
figure(2)
imshow(~imagen);
title('INPUT IMAGE WITHOUT NOISE')
[m, n] = size(imagen);
%% Label connected components
[lab, num] = bwlabel(imagen);
%% figure(3), imshow(L);
%% imtool(L)
% calculate distance between each label
%% figure(3), imshow(img_1);
%% figure(4), imshow(img_2);
%% Measure properties of image regions
propied = regionprops(lab, 'BoundingBox');
hold on
%% Plot bounding boxes
for n = 1:size(propied,1)
    rectangle('Position', propied(n).BoundingBox, 'EdgeColor', 'g', 'LineWidth', 2)
end
hold off
pause(1)
figure
for i = 1:num-1
    img_1 = lab == i;
    img_2 = lab == i+1;
    [r, c] = find(lab == i);
    [r1, c1] = find(lab == i+1);
    if (min(c1)-max(c) > fontSize && max(c1)-min(c1) > 10)
        line([min(c1) max(c)], [round(m/2) round(m/2)], 'LineWidth', 3);
        text((min(c1)+max(c))/2, 10, num2str(min(c1)-max(c)));
        figure(5), imshow(img_1);
        figure(6), imshow(img_2);
    end
end
%% Objects extraction
% for n = 1:Ne
%     [r,c] = find(L==n);
%     n1 = imagen(min(r):max(r), min(c):max(c));
%     imshow(~n1);
%     pause(0.5)
% end
I am unsure how the code sections should be arranged. Please help me.
Sumita Das
15 Dec 2016
How do I use the values shown in the figure? For example, what if I want to take the average of all the values?
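One possible approach, sketched here on a synthetic image, is to store each gap in a vector while looping instead of only drawing it, and then call mean() on the vector. The gap definition min(c1) - max(c) follows the code posted earlier in this thread; the image and variable names are assumptions. Note that this definition measures the distance between the two edge columns, which is one more than the number of empty columns between them.

```matlab
% Synthetic binary image with three blobs (true = ink).
bw = false(20, 60);
bw(5:15,  5:14) = true;
bw(5:15, 20:29) = true;
bw(5:15, 45:54) = true;

[lab, num] = bwlabel(bw);
gaps = zeros(1, num - 1);          % one value per neighbouring pair
for i = 1:num-1
    [~, c]  = find(lab == i);      % columns of the current blob
    [~, c1] = find(lab == i + 1);  % columns of the next blob
    gaps(i) = min(c1) - max(c);    % same value the thread's code prints
end

meanGap = mean(gaps);
fprintf('Gaps: %s, mean: %g pixels\n', mat2str(gaps), meanGap);
```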
sparsh garg
28 Aug 2021
Hey harjeet, thanks for the code. With it we are able to figure out the spacing between two words. However, for examples like this, I am also interested in the spacing between the characters within a word. If anyone can give me pointers on how to go about this, it would be really useful.




