How to obtain the space between two words in a given sentence?

Hello, I want to obtain the space between the words in the given image (i.e., the space between 'A' and 'MOVE', 'MOVE' and 'to', and so on). I want the space in millimeters (mm). Please help me. Thank you.

Comments: 9

The space in terms of millimeters is only available if the image has been calibrated to the real world. You might be able to obtain some camera parameters by using imfinfo() that give you an idea of how far away the paper was and the focal length and what the aperture was for the image to calculate the calibration, but more commonly those parameters are not present.
If you are not using a higher end camera or a scientific instrument, then you should likely not rely upon any image XResolution or YResolution parameter you find by using imfinfo() -- though I have seen some smartphones record good information for those.
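When the resolution is known from a trustworthy source (for example, the scanner settings), a pixel distance converts to millimeters through the dots-per-inch value. A minimal sketch, assuming a hypothetical 300 DPI scan and a 30-pixel gap; both numbers are made up for illustration and do not come from the image in this thread:

```matlab
% Convert a pixel distance to millimeters, assuming a known resolution.
% dpi and gapPixels are illustrative values, not measured from a real image.
dpi = 300;                 % assumed scan resolution in dots per inch
gapPixels = 30;            % assumed gap measured in pixels
mmPerInch = 25.4;          % exact by definition
gapMm = gapPixels / dpi * mmPerInch;
fprintf('Gap: %d px = %.2f mm at %d DPI\n', gapPixels, gapMm, dpi);
```

For a photograph taken with a camera, this conversion does not hold unless the image has been calibrated as described above.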
OK sir, can we at least obtain the number of pixels between the end of one word and the start of the next word?
I suggest that you look at the documentation for the MathWorks ocr() routine. It mentions the name of the tool that they use. You can then search for information about that tool and find the paper that was written about it and the site with the long description of how it works. Included there is a description of the three or four different processing layers that they found they needed to go through in order to determine the number of pixels between the end of one word and the beginning of the next. If I recall correctly, it is something that they were not able to do until they had already tentatively recognized two adjacent words and checked the dictionary and the corpus statistics to determine the probability that the two adjacent words were really one word with a longer spacing between letters.
To phrase this a different way: you cannot just use the spacing between letters to determine whether you have reached the end of a word when you are working with handwritten letters. Because of that, there is no easy way to determine the space between words: without going through a lot of work you cannot be sure whether a gap is a gap between letters in the same word or a gap between adjacent words. If you were hoping to break the text into words according to spacing and then apply recognition one word at a time, that is not going to work.
Sir, it seems too complex. Can't we count the white pixels between the last black pixel of one word and the first black pixel of the next word? I don't know how to code this; if it is feasible, please help me with the code. Thank you, sir.
No.
Look at your image again. There is what appears to be an "I", then a gap, then an M, then a gap, then an O, then a gap, then a V, but the M, O, and V are all part of the same word. You might notice that part of the M sits under part of the O and conclude that the gap there is not a word-ending gap because of the overlap; that is a good hint (but not a rule). But there does not appear to be any horizontal overlap between the O and the V, so if you go only by whether there is a gap, you would say that MO is one word, V is a second word, and E is a third. You know that MOVE forms a word there, but you are mentally assessing how big the gap between letters is compared to the average letter height and width in that region.
For your first stage you need to look at the gap between letters not at the gap between words.
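To make the point about relative gap sizes concrete, here is a crude heuristic sketch. It assumes a hypothetical list of gap widths and an arbitrary rule (a gap more than three times the median gap counts as a word break). As explained above, a rule like this is only a hint and can easily fail on handwriting:

```matlab
% Hypothetical gap widths (pixels) between consecutive ink runs in one line.
gaps = [3 4 30 2 5 3 28 4];

% Assumed rule of thumb: a gap much larger than the typical (median) gap
% is treated as a word break. The factor 3 is an arbitrary illustration.
factor = 3;
isWordGap  = gaps > factor * median(gaps);
wordGaps   = gaps(isWordGap);     % candidates for gaps between words
letterGaps = gaps(~isWordGap);    % candidates for gaps between letters
disp(wordGaps)
```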
Correct sir, I got it. Thank you :) Leaving that aside, how can I get just the space between the first word (A) and the next word (MOVE) in my image?
You can sum vertically to get the horizontal profile, then threshold, then use diff() to find the first white pixel and then the next black pixel in the profile. Please try it.
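The profile approach described above might look like the following sketch. It uses a small synthetic binary image in place of the real one (an assumption, so that it runs without the attached file), with true marking ink pixels:

```matlab
% Synthetic stand-in for a thresholded text line: true = ink, false = paper.
ink = false(20, 60);
ink(5:15, 5:10)  = true;        % first blob, columns 5..10
ink(5:15, 31:45) = true;        % second blob, columns 31..45

colProfile = sum(ink, 1);       % sum vertically: ink count per column
hasInk = colProfile > 0;        % threshold the profile
d = diff([0 hasInk 0]);         % +1 where an ink run starts, -1 just after it ends
runStart = find(d == 1);        % first column of each ink run
runEnd   = find(d == -1) - 1;   % last column of each ink run

% Number of white columns strictly between consecutive ink runs.
gaps = runStart(2:end) - runEnd(1:end-1) - 1;
disp(gaps)
```

With a real image, `ink` would come from thresholding so that the dark text is true, for example `grayImage < 128`.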
Sure sir, I will try. Thank you.
Sir, this is what I have done; you can see my attached file (space.m). I am still not getting the result, so please have a look and help me. Thank you.

Accepted Answer

harjeet singh, 18 Dec 2015
I used your code and modified it to detect the spacing in pixels, as shown in the attached picture.
%//////////// your code ////////////////////////////////////
clc;
clear all
close all
format long g;
format compact;
fontSize = 20;
fullFileName = fullfile(pwd, 'capture.png');
grayImage = imread(fullFileName);
[rows, columns, numberOfColorBands] = size(grayImage);
if numberOfColorBands > 1
    grayImage = grayImage(:, :, 2); % Take green channel.
end
% Display the original gray scale image.
figure(1)
imshow(grayImage, []);
title('Original Grayscale Image', 'FontSize', fontSize, 'Interpreter', 'None');
binaryImage = grayImage > 128; % true = light background, false = dark ink
figure(2)
imshow(binaryImage);
title('Binary Image', 'FontSize', fontSize, 'Interpreter', 'None');
hold on
%////////////// include this //////////////////////////////////////////
[m,n]=size(binaryImage);
[lab,num]=bwlabel(~binaryImage); % label the dark (ink) components
for i=1:num-1
    % Pixel coordinates of component i and of the next component
    [r,c]=find(lab==i);
    [r1,c1]=find(lab==i+1);
    % fontSize doubles as the minimum gap (in pixels) that is reported
    if(min(c1)-max(c) > fontSize)
        % Draw the gap at mid-height and label it with its width in pixels
        line([min(c1) max(c)],[round(m/2) round(m/2)],'LineWidth',3);
        text((min(c1)+max(c))/2,10,num2str(min(c1)-max(c)));
    end
end
hold off

Comments: 8

Wow! Thanks a lot, sir :) The values are the same as the ones I got with the bounding box method. Many thanks to you and also to Walter Roberson, sir :)
The 30 between A and MOVE is the number of pixels, right?
One last doubt: how do I retrieve that value? I mean, I want that 30 displayed on the terminal, because I need the value for further computations. How do I do that? Sorry for asking so many questions.
Use this to display it on the terminal:
for i=1:num-1
    [r,c]=find(lab==i);
    [r1,c1]=find(lab==i+1);
    if(min(c1)-max(c) > fontSize)
        % Print the gap width in pixels
        disp(['distance: ', num2str(min(c1)-max(c))]);
        line([min(c1) max(c)],[round(m/2) round(m/2)],'LineWidth',3);
        text((min(c1)+max(c))/2,10,num2str(min(c1)-max(c)));
    end
end
Thank you so much :) I am very grateful to you!
Thanks, meghashree.
Hello sir, I used the above code with some modifications to find the space between the words, but I am getting negative values as distances. Please help me.
Try using the code this way:
clc;
clear all
close all
format long g;
format compact;
fontSize = 20;
fullFileName = fullfile(pwd, 'segmented_line.png');
grayImage = imread(fullFileName);
[rows, columns, numberOfColorBands] = size(grayImage);
if numberOfColorBands > 1
    grayImage = grayImage(:, :, 2); % Take green channel.
end
% Display the original gray scale image.
figure(1)
imshow(grayImage, []);
title('Original Grayscale Image', 'FontSize', fontSize, 'Interpreter', 'None');
binaryImage = grayImage; %> 128;  % no thresholding here; the image is used as is
figure(2)
imshow(binaryImage);
title('Binary Image', 'FontSize', fontSize, 'Interpreter', 'None');
hold on
%////////////// include this //////////////////////////////////////////
[m,n]=size(binaryImage);
se=strel('disk',8);
binaryImage1=imdilate(~binaryImage,se);     % dilate the ink so nearby letters merge
binaryImage1=bwareaopen(binaryImage1,500);  % drop blobs smaller than 500 pixels
[lab,num]=bwlabel(binaryImage1);
figure(3)
imshow(binaryImage)
hold on
for i=1:num-1
    [r,c]=find(lab==i);
    [r1,c1]=find(lab==i+1);
    % Report the gap only if it exceeds fontSize pixels and the next blob is wider than 10 pixels
    if(min(c1)-max(c) > fontSize && max(c1)-min(c1)>10)
        line([min(c1) max(c)],[round(m/2) round(m/2)],'LineWidth',3);
        text((min(c1)+max(c))/2,10,num2str(min(c1)-max(c)));
    end
end
hold off

More Answers (1)

harjeet singh, 17 Dec 2015
Dear meghashree, try this code:
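clear all
close all
clc
img=imread('capture.png');   % named img to avoid shadowing the built-in image()
figure(1)
imshow(img)
drawnow
img_1=img(:,:,1)<150;        % dark pixels (ink) become true
figure(2)
imshow(img_1)
drawnow
se=strel('disk',5);
img_2=imdilate(img_1,se);    % dilate so the letters of a word merge into one blob
figure(3)
imshow(img_2)
drawnow
[lab,num]=bwlabel(img_2);
gridSize=ceil(sqrt(num));    % subplot grid large enough for all blobs
for i=1:num
    [r,c]=find(lab==i);
    img_3=img(min(r):max(r),min(c):max(c),:);   % crop the blob's bounding box
    figure(4)
    subplot(gridSize,gridSize,i)
    imshow(img_3);
    drawnow
end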

Comments: 13

Thank you for the reply, sir. The word segmentation part is already done; that is not my question. My question is how to find the number of white pixels between the first word and the next word only.
Once you have regions, call regionprops on them to get their bounding boxes. The difference between the last position implied by one bounding box and the first position implied by the next bounding box is the number of pixels between them.
propied=regionprops(L,'BoundingBox');
boundingbox=propied.BoundingBox;
left=boundingbox(1)
I now have the bounding box, but how do I get the position, sir?
Sir, I can now get the first and second bounding boxes with propied(1) and propied(2);
the bounding box values are extracted with propied.BoundingBox(1) and propied.BoundingBox(2).
Now how do I calculate the position? I mean, which two values should be subtracted? Please help me, thank you.
propied(2).BoundingBox(1) - (propied(1).BoundingBox(1)+propied(1).BoundingBox(3))
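The same subtraction can be applied to every pair of neighboring regions at once. The sketch below is only an illustration: it uses a synthetic binary image (true = ink) in place of the real one, and the variable names are made up for the example. Each BoundingBox is [x y width height], where x is the left edge, so x + width is the right edge:

```matlab
% Requires the Image Processing Toolbox (bwlabel, regionprops).
% Synthetic binary image: true = ink. Two blobs stand in for two words.
bw = false(20, 60);
bw(5:15, 5:10)  = true;                  % columns 5..10
bw(5:15, 31:45) = true;                  % columns 31..45

props = regionprops(bwlabel(bw), 'BoundingBox');
numBoxes = numel(props);                 % total number of bounding boxes

boxes = vertcat(props.BoundingBox);      % one row per region: [x y width height]
boxes = sortrows(boxes, 1);              % order the regions left to right

rightEdges = boxes(1:end-1, 1) + boxes(1:end-1, 3);
leftEdges  = boxes(2:end, 1);
gaps = leftEdges - rightEdges;           % white columns between neighboring boxes
disp(gaps)
```

Here numel(props) also gives the total number of bounding boxes in the image.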
Thank you so much, sir :) I got it right! If I had to do it without the bounding box approach, how would I do it (as Image Analyst mentioned above)?
What error did you get in your code when you implemented Image Analyst's suggestion?
Please have a look at my code, sir. One more doubt: how do I count the total number of bounding boxes in the image?
Your profiling code is acting on white values, not on black values. To fix it you need to form your binary image from places less than 128 rather than from places greater than 128.
Thank you, sir, but where is the value? How do I get the spacing here? Sorry for asking so many questions.
Could you please help me separate each word in a text image and then separate each letter of every word with a bounding box? I tried this code, but it did not work.
%%Image segmentation and extraction
%%Read Image
%%dis_letter =4;
%%dis_word =30;
close all;
clear all;
fontSize = 20;
imagen=imread('scan0001.jpg');
%%Show image
figure(1)
imshow(imagen);
title('INPUT IMAGE WITH NOISE')
%%Convert to gray scale
if size(imagen,3)==3 % RGB image
    imagen=rgb2gray(imagen);
end
%%Convert to binary image
threshold = graythresh(imagen);
imagen =~im2bw(imagen,threshold);
%%Remove all objects containing fewer than 15 pixels
imagen = bwareaopen(imagen,15);
pause(1)
%%Show binary image
figure(2)
imshow(~imagen);
title('INPUT IMAGE WITHOUT NOISE')
[m,n]=size(imagen);
%%Label connected components
[lab num]=bwlabel(imagen);
%%figure(3),imshow(L);
%%imtool(L)
% calculate distance between each label
%% figure(3), imshow(img_1);
%%figure(4), imshow(img_2);
%% Measure properties of image regions
propied=regionprops(lab,'BoundingBox');
hold on
%% Plot Bounding Box
for n=1:size(propied,1)
    rectangle('Position',propied(n).BoundingBox,'EdgeColor','g','LineWidth',2)
end
hold off
pause (1)
figure
for i=1:num-1
    img_1=lab==i;
    img_2=lab==i+1;
    [r,c]=find(lab==i);
    [r1,c1]=find(lab==i+1);
    if(min(c1)-max(c) > fontSize && max(c1)-min(c1)>10)
        line([min(c1) max(c)],[round(m/2) round(m/2)],'LineWidth',3);
        text((min(c1)+max(c))/2,10,num2str(min(c1)-max(c)));
        figure(5), imshow(img_1);
        figure(6), imshow(img_2);
    end
end
%% Objects extraction
%for n=1:Ne
%    [r,c] = find(L==n);
%    n1=imagen(min(r):max(r),min(c):max(c));
%    imshow(~n1);
%    pause(0.5)
%end
I am unsure how the code sections should be arranged; please help me.
How do I use the values shown in the figure? For example, what if I want to take the average of all the values?
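One possible way to reuse the values is to store each distance in a vector inside the loop instead of only drawing it. The sketch below follows the loop from the accepted answer, but runs on a synthetic binary image (true = ink); the names distances, minGap, and averageDistance are made up for this illustration:

```matlab
% Requires the Image Processing Toolbox (bwlabel).
bw = false(20, 100);            % synthetic image, true = ink
bw(5:15, 5:10)  = true;         % columns 5..10
bw(5:15, 41:55) = true;         % columns 41..55
bw(5:15, 81:90) = true;         % columns 81..90
minGap = 20;                    % plays the role of the fontSize threshold

[lab, num] = bwlabel(bw);
distances = [];                 % collected gap values
for k = 1:num-1
    [~, c]  = find(lab == k);
    [~, c1] = find(lab == k+1);
    d = min(c1) - max(c);       % same distance measure as in the answer's loop
    if d > minGap
        distances(end+1) = d;   %#ok<AGROW> keep the value for later use
    end
end
averageDistance = mean(distances);
disp(averageDistance)
```

After the loop, distances holds every reported gap, so any further computation (mean, max, conversion to other units) can work on that vector.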
Hey harjeet, thanks for the code. With it we can work out the spacing between two words. However, for examples like this, I am also interested in the spacing between the characters within a word. If anyone can give me pointers on how to go about this, it would be really useful.


Asked: 17 Dec 2015

Commented: 28 Aug 2021
