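A Window-Based Skew Detection Method for OCR Page Images
Abstract
When a document is scanned, the resulting page image is usually skewed by some angle. If the skew angle is too large, it adversely affects the subsequent layout analysis and character recognition. To detect the skew angle of a page image quickly and accurately while reducing the amount of computation, this paper proposes a skew detection method based on a window transform. The algorithm first blurs the fine details of the text and pictures inside the window and then fits a straight line to the edges of the blurred regions, so that the skew angle of the page image can be detected quickly. Experimental results show that the method detects the skew angle of various kinds of page images quickly and accurately and has good adaptability.
Keywords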
Skew Document Image Detection Method Based on Windows Transform
Abstract
During scanning for OCR (optical character recognition), document pages are almost always placed at a slight slant, so the resulting page image is skewed. When the skew angle is large enough, it degrades document analysis and lowers recognition accuracy, because the algorithms for layout analysis and character recognition are very sensitive to page skew. Skew angle detection is therefore an important step in the preprocessing stage of document analysis. This paper presents a skew detection method based on window analysis. First, it chooses suitable windows that lie within the page layout rather than in the margins of the printed page. Second, it preprocesses each window image with a method chosen according to the kind of content, such as tables, text lines, or images. Third, to reduce the amount of computation, it blurs the text lines and images in the window. Fourth, it detects the edges of the blurred regions. Finally, it fits a straight line to the edges and obtains the skew angle. Experimental results show that the method detects the skew angles of many kinds of document images efficiently and accurately and that it has sufficient adaptability.
Keywords
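The blur, edge detection, and line-fitting steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the blur kernel, the edge detector, or the fitting procedure, so the mean filter, the fixed threshold, the use of the bottom edge only, and the least-squares fit via `numpy.polyfit` are all assumptions. The function names `box_blur`, `estimate_skew_deg`, and `make_synthetic_window` are hypothetical, and the sketch assumes a binary window (1 = ink) that contains a single band of content, such as one text line.

```python
import numpy as np


def box_blur(image, size):
    """Mean-filter a 2-D array with a size x size box (zero padded at the borders)."""
    kernel = np.ones(size) / size

    def smooth(vector):
        return np.convolve(vector, kernel, mode="same")

    blurred = np.apply_along_axis(smooth, 1, image.astype(float))  # along each row
    return np.apply_along_axis(smooth, 0, blurred)                 # along each column


def estimate_skew_deg(window, blur_size=9, threshold=0.5):
    """Estimate the skew of one binary window (1 = ink) in degrees.

    Assumed convention: a positive angle means the content descends to the
    right in image coordinates, where the row index grows downward.
    """
    # Blur away character-level detail, then threshold the blurred region.
    mask = box_blur(window, blur_size) > threshold
    height, width = mask.shape
    cols = np.flatnonzero(mask.any(axis=0))
    # Skip columns near the left/right border, where zero padding weakens the blur.
    cols = cols[(cols >= blur_size) & (cols < width - blur_size)]
    if cols.size < 2:
        raise ValueError("window has too little content to fit a line")
    # Edge detection (assumed form): last row above the threshold in each column.
    bottom = height - 1 - np.argmax(mask[::-1, cols], axis=0)
    # Least-squares straight line through the edge points; its slope gives the angle.
    slope, _intercept = np.polyfit(cols, bottom, 1)
    return float(np.degrees(np.arctan(slope)))


def make_synthetic_window(angle_deg, height=200, width=400, thickness=11):
    """Build a hypothetical test window: one solid band tilted by angle_deg."""
    window = np.zeros((height, width), dtype=np.uint8)
    slope = np.tan(np.radians(angle_deg))
    half = thickness // 2
    for x in range(width):
        center = int(round(height / 2 + slope * (x - width / 2)))
        window[center - half:center + half + 1, x] = 1
    return window


if __name__ == "__main__":
    # Estimated skew of a band drawn at 3 degrees.
    print(round(estimate_skew_deg(make_synthetic_window(3.0)), 2))
```

Fitting a line to one edge of the blurred region uses only one point per column, which is in the spirit of the low computational cost the abstract emphasizes. A window with several text lines would need the edges grouped per line before fitting, which this sketch does not attempt.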