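Determining the number of columns in a newspaper article

How to determine the number of columns in a newspaper article

Suppose the number of columns in the following newspaper article needs to be determined (the answer should be 3 text columns). I tried to retrieve the number of columns using the cv2 library and Python, and found the following suggestion on StackOverflow: Detect number of rows and columns in table image with OpenCV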

Example of a newspaper article

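However, that solution relies on a well-structured table, which makes it easy to extract the number of columns and rows. Based on that solution, this is what I came up with: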

import numpy as np
from imutils import contours
import cv2

# Load image,grayscale,Gaussian blur,Otsu's threshold
image = cv2.imread('example_newspaper_article.png')
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray,(5,5),0)
thresh = cv2.threshold(blur,240,255,cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Find contours and remove text inside cells
cnts = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    area = cv2.contourArea(c)
    if area < 10000:
        cv2.drawContours(thresh,[c],-1,(255,255,255),30)

# Invert image
invert = thresh
offset,old_cY,first = 10,0,True
visualize = cv2.cvtColor(invert,cv2.COLOR_GRAY2BGR)

# Find contours,sort from top-to-bottom and then sum up column/rows
cnts = cv2.findContours(invert,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
(cnts,_) = contours.sort_contours(cnts,method="top-to-bottom")
for c in cnts:
    # Find centroid
    M = cv2.moments(c)
    cX = int(M["m10"] / M["m00"])
    cY = int(M["m01"] / M["m00"])

    # New row
    if (abs(cY) - abs(old_cY)) > offset:
        if first:
            row,table = [],[]
            first = False
        old_cY = cY
        table.append(row)
        row = []
    # Cell in same row
    if ((abs(cY) - abs(old_cY)) <= offset) or first:
        row.append(1)
    # Uncomment to visualize
    #cv2.circle(visualize,(cX,cY),10,(36,255,12),-1)
    #cv2.imshow('visualize',visualize)
    #cv2.waitKey(200)

print('Rows: {}'.format(len(table)))
print('Columns: {}'.format(len(table[1])))

cv2.imshow('invert',invert)
cv2.imshow('thresh',thresh)
cv2.waitKey()

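I thought that increasing the thickness parameter of the drawContours method would help somewhat, but unfortunately it does not solve the problem. The result looks like this: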

Result of my attempt

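Would drawing rectangles over the text regions be more useful? Does anyone know a solution and could help me? Thanks in advance!

Solution

Whenever I have a task like this, I count pixels along the y-axis and look for (large) differences between neighbouring image columns. This would be my complete pipeline:

  1. Convert the image to grayscale; apply an inverse binary threshold using Otsu's method to get white pixels on a black background.
  2. Do some morphological closing, here with a large vertical line kernel, to connect all pixels in the same column.
  3. Count all white pixels per image column; calculate the absolute difference between neighbouring columns.
  4. Find the peaks in that "signal", either manually or with scipy.signal.find_peaks. The peaks mark the start and end of each text column, so the number of text columns is half the number of peaks.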
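Step 4 mentions finding the transitions manually. A minimal sketch of that idea follows, assuming the per-column white pixel counts from step 3 are already available as a plain sequence. The function name `count_text_columns` and the `min_gap` parameter are hypothetical and not part of the original answer; instead of halving a peak count, the sketch counts runs of non-empty positions and merges runs separated by very narrow gaps.

```python
def count_text_columns(profile, min_gap=10):
    """Count runs of non-empty positions in a column profile.

    profile: white-pixel count per x position (any sequence of numbers).
    min_gap: gaps narrower than this many positions do not split a run
             (an assumed tuning parameter, adjust it to your images).
    Returns (number_of_runs, list of [start, end) index pairs).
    """
    profile = list(profile)
    runs = []
    start = None
    for x, value in enumerate(profile):
        if value > 0 and start is None:
            start = x                      # a run of text begins
        elif value <= 0 and start is not None:
            runs.append([start, x])        # the run ends before x
            start = None
    if start is not None:                  # run reaches the right image border
        runs.append([start, len(profile)])

    merged = []
    for run in runs:
        if merged and run[0] - merged[-1][1] < min_gap:
            merged[-1][1] = run[1]         # gap too narrow: extend previous run
        else:
            merged.append(run)
    return len(merged), merged


# Synthetic profile: three text columns, the last one with a 3-pixel hole.
profile = [0]*20 + [50]*100 + [0]*30 + [50]*100 + [0]*30 + [50]*40 + [0]*3 + [50]*57 + [0]*20
n_cols, runs = count_text_columns(profile, min_gap=10)
print(n_cols, runs)
```

Each run has one start and one end, which corresponds to the two peaks per text column described in step 4.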

Here is the whole code, including some visualization:

import cv2
import matplotlib.pyplot as plt     # Only for visualization output
import numpy as np
from scipy import signal
from skimage import io              # Only for web grabbing images

# Read image from web (attention: RGB order here,scikit-image)
image = io.imread('https://i.stack.imgur.com/jbAeZ.png')

# Convert image to grayscale
gray = cv2.cvtColor(image,cv2.COLOR_RGB2GRAY)

# Inverse binary threshold by Otsu's
thr = cv2.threshold(gray,0,255,cv2.THRESH_OTSU + cv2.THRESH_BINARY_INV)[1]

# Morphological closing with large vertical line kernel
thr_mod = cv2.morphologyEx(thr,cv2.MORPH_CLOSE,np.ones((image.shape[0],1)))

# Count white pixels along y-axis
y_count = np.sum(thr_mod / 255,0)

# Calculate absolute difference between neighbouring x-axis values
y_count_diff = np.abs(np.diff(y_count))

# Find peaks in that "signal"
peaks = signal.find_peaks(y_count_diff,distance=50)[0]

# Number of columns is half the number of found peaks
n_cols = int(peaks.shape[0] / 2)    # np.int was removed in newer NumPy releases

# Text output
print('Number of columns: ' + str(n_cols))

# Some visualization output
plt.figure(0)
plt.subplot(221)
plt.imshow(image)
plt.title('Original image')

plt.subplot(222)
plt.imshow(thr_mod,cmap='gray')
plt.title('Thresholded,morphologically closed image')

plt.subplot(223)
plt.plot(y_count)
plt.plot(peaks,y_count[peaks],'r.')
plt.title('Summed white pixels along y-axis')

plt.subplot(224)
plt.plot(y_count_diff)
plt.plot(peaks,y_count_diff[peaks],'r.')
plt.title('Absolute difference in summed white pixels')

plt.tight_layout()
plt.show()

Text output:

Number of columns: 3

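Visualization output: a four-panel figure showing the original image, the thresholded and morphologically closed image, the summed white pixels along the y-axis, and the absolute difference in summed white pixels.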

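Limitations: If the image is skewed or similar, you may get bad results. If you have many (large) images crossing the text columns, you may also get bad results. In general, you will need to adapt the details of the given implementation to your actual needs (no further examples were given).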
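Because the column count is taken as half the number of peaks, an odd number of peaks is a cheap warning sign that one of these limitations applies. The following is a minimal sketch of such a check, assuming the peak positions are available as a sequence of integers; the helper name `pair_column_edges` is hypothetical and not part of the original answer.

```python
def pair_column_edges(peaks):
    """Pair sorted peak positions into (start, end) tuples, one per text column.

    Raises ValueError when the number of peaks is odd, since the pairing
    of start and end edges is then ambiguous.
    """
    peaks = sorted(int(p) for p in peaks)
    if len(peaks) % 2 != 0:
        raise ValueError(f"expected an even number of peaks, got {len(peaks)}")
    # Even-indexed peaks are treated as starts, odd-indexed peaks as ends.
    return list(zip(peaks[0::2], peaks[1::2]))


# Example peak positions (made up for illustration).
edges = pair_column_edges([19, 119, 149, 249, 279, 379])
print(len(edges), edges)
```

An even count does not prove the result is right, so inspecting the returned edge pairs (for example, their widths) is still advisable.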

----------------------------------------
System information
----------------------------------------
Platform:    Windows-10-10.0.16299-SP0
Python:      3.8.5
Matplotlib:  3.3.1
NumPy:       1.19.1
OpenCV:      4.4.0
SciPy:       1.5.2
----------------------------------------

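Alternatively, you could prepare the image a bit differently before searching for columns. For example, you could first connect the text horizontally (with a morphological operation). This gives you contours of a certain height (the headline is merged into one contour per line, as is the text within each column). Then search for all contours and draw a filled bounding rectangle over every contour that is taller than a threshold you set (it can be calculated or set manually). After that, perform the morphological operation again with a bigger kernel (horizontal and vertical) so that all the remaining text is connected nicely.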
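The height threshold can be calculated in several ways. As one possible sketch, assuming most contours are ordinary text lines and only a few are headlines, a multiple of the median height could serve as the boundary. The function name `height_boundary` and the default `factor=2.0` are assumptions chosen for illustration, not values from the original answer.

```python
from statistics import median


def height_boundary(heights, factor=2.0):
    """Return a height above which a contour is treated as headline text.

    heights: bounding-rectangle heights of all contours (non-empty sequence).
    factor:  assumed multiplier on the median height; tune it per data set.
    """
    heights = list(heights)
    if not heights:
        raise ValueError("heights must not be empty")
    return factor * median(heights)


# Made-up heights: five body-text lines and two headline lines.
heights = [12, 11, 13, 12, 12, 60, 58]
boundary = height_boundary(heights)
tall = [h for h in heights if h > boundary]
print(boundary, tall)
```

The median is less affected by a few very tall contours than the mean, which is why it is used in this sketch; whether that matters depends on the images.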

Here is the sample code:

import cv2
import numpy as np

img = cv2.imread("columns.png")  # read image
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)   # grayscale transform
thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)[1]  # Otsu threshold
kernel = np.ones((5,10),dtype=np.uint8)  # kernel for first closing procedure (connect blobs in x direction)
closing = cv2.morphologyEx(thresh,cv2.MORPH_CLOSE,kernel)  # closing
cv2.imwrite("closing1.png",closing)
contours = cv2.findContours(closing,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_NONE)[0]  # search for contours

heights = []  # all of contours heights

for cnt in contours:
    x,y,w,h = cv2.boundingRect(cnt)  # bounding rectangles height,width and coordinates
    heights.append(h)  # append height of one contours

boundary = np.mean(heights,axis=0)  # mean of heights will serve as boundary but
# this will probably not be the case on other samples - you would need to make
# a function to determine this boundary or manually set it

# iterate through contours
for cnt in contours:
    x,y,w,h = cv2.boundingRect(cnt)  # bounding rectangles height,width and coordinates
    if h > boundary:  # condition - contour must be higher than height boundary
        cv2.rectangle(closing,(x,y),(x+w,y+h),(0,0,0),-1)  # draw filled rectangle on the closing image

cv2.imwrite("closing1-filled.png",closing)

kernel = np.ones((25,25),dtype=np.uint8)  # kernel for second closing (connect blobs in x and y direction)
closing = cv2.morphologyEx(closing,cv2.MORPH_CLOSE,kernel)  # closing again

cv2.imwrite("closing2.png",closing)


contours = cv2.findContours(closing,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_NONE)[0]   # search for contours again


# iterate through contours
print("Number of columns: ",len(contours))  # this is the number of columns
for cnt in contours:
    x,y,w,h = cv2.boundingRect(cnt)  # these are height,width and coordinates of the columns
    cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),3)  # draw bounding rectangle on original image (color is an assumption)

cv2.imwrite("result.png",img)


cv2.imshow("img",img)
cv2.waitKey(0)
cv2.destroyAllWindows()

Result:

Number of columns: 3

(The original answer also showed the result image and the intermediate images for steps 1 to 3.)
