Scraping a forum with Beautiful Soup: how do I scrape a table whose <td> cells repeat many times?

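I want to retrieve table data from a forum that requires logging in with a username and password. I have written the code, but I cannot get any values from the forum table. Here is my code: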

from bs4 import BeautifulSoup as bs
import requests

URL = 'http://kingmedia.tv'
LOGIN_ROUTE = '/home/'

HEADERS = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML,like Gecko) Chrome/84.0.4147.125 Safari/537.36','origin': URL,'referer': URL + LOGIN_ROUTE}

s = requests.session()


login_payload = {
        'login': "bachoo786",'password': "abcde12345" 
        }

login_req = s.post(URL + LOGIN_ROUTE,headers=HEADERS,data=login_payload)

print(login_req.status_code)

cookies = login_req.cookies


soup = bs(s.get(URL + '/forumdisplay.php?f=2').text,'html.parser')
tbody = soup.find('table',id='tborder')
print(tbody)

I also tried using Selenium, but could not get the data either. Here is my Selenium code:

from selenium.webdriver.chrome.options import Options 
from selenium import webdriver 
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC 
from selenium.webdriver.common.by import By 
from bs4 import BeautifulSoup
import re
#from bs4 import BeautifulSoup as bs
import requests

URL = 'http://kingmedia.tv'
LOGIN_ROUTE = '/home/'

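HEADERS = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML,like Gecko) Chrome/84.0.4147.125 Safari/537.36','origin': URL,'referer': URL + LOGIN_ROUTE}

s = requests.session()


login_payload = {
        'login': "bachoo786",'password': "abcde123" 
        }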

login_req = s.post(URL + LOGIN_ROUTE,data=login_payload)

print(login_req.status_code)

cookies = login_req.cookies


options = Options()
# Runs Chrome in headless mode.
#options.add_argument("--headless") 
#path of the chrome driver
chrome_options = Options() 
chrome_options.add_argument("--headless") 
chrome_options.add_argument('--no-sandbox') 
chrome_options.add_argument('--disable-dev-shm-usage') 
driver=webdriver.Chrome('/usr/bin/chromedriver',chrome_options=chrome_options) 
driver.headless=True 
driver.get('http://kingmedia.tv/home/forumdisplay.php?f=2') 
WebDriverWait(driver,20).until(EC.visibility_of_element_located((By.CSS_SELECTOR,'div.sidebar-widget.widget_text>div>table'))) 
print("Data rendered successfully!!!")
#Get the page source
html = driver.page_source 
soup = BeautifulSoup(html,'html.parser')

#print (soup)

driver.close() 
table=soup.find('table',class_='tborder').find_next('table').find_next('class') 
for row in table.find_all('tr'):
    name=row.find_all("td")[0].text.strip()

    print(name)

The table data I am trying to extract looks like this:

[Image: General]

Update: here is the HTML of the table shown in the image above:

<table class="tborder" cellpadding="6" cellspacing="1" border="0" width="100%" align="center">
<thead>
    <tr align="center">
     <td class="thead" width="150">Live Screenshot</td> 
      <td class="thead" width="130" align="left">Channel Number</td>
  <td class="thead" width="290">Now Playing</td>
      <td class="thead" width="170">Watching Options</td>
 

      
      
    </tr>
</thead>
<tbody>
    <tr align="left">
        <td class="alt1Active" colspan="2" align="left" id="f13">
        
        <table cellpadding="0" cellspacing="0" border="0">
        <tbody><tr>
            <td class="alt1" width="20">
            
            <input type="image" width="130" height="90" src="http://213.163.74.154/ss2/tv1.jpg" onclick="open_tv1()"></td><td width="20"><center>
                <br></center></td><td height="80"><br><br><strong><img src="/images/uk.gif"><img src="/images/hd2.png"> Channel 1</strong> <br><br>
                
            
        
        


            </td>
        </tr>
        </tbody></table>
        
</td><td class="alt1" align="Center">


    

<div class="smallfont" align="left">
    <div style="clear:both">

    
<span class="smallfont"></span>

 <font size="2"><br><img class="inlineimg" src="http://www.kingmedia.tv/scripts/status.php?file=tv12.nsv" alt="12" border="0" title="12">&nbsp;<strong>Back Soon</strong><br>
    </font><iframe height="25" width="140" frameborder="0" scrolling="no" seamless="seamless" src="http://213.163.74.154:8080/tv1.xsl"></iframe><iframe height="25" width="80" frameborder="0" scrolling="no" seamless="seamless" src="http://207.244.98.215:8080/tv1.xsl"></iframe>
    <br><span class="smallfont"></span></div></div></td>
<td class="alt1" nowrap="nowrap">
<div align="right"><form name="menuformtv1">
<select name="menu1tv1">
<option disabled="">-- Server 1 --</option>
<option value="/buildmtv1s2.php">Watch in Winamp</option>
<option value="/webvlctv1s2.php">Watch in a Web Player</option>
<option value="/directlinktv1s2.php">View Stream URL</option>
<option disabled="">-- Server 2 (Backup) --</option>
<option value="/buildmtv1.php" selected="selected">Watch in Winamp </option>
<option value="/webvlctv1.php">Watch in a Web Player</option>
<option value="/directlinktv1.php">View Stream URL</option>

</select>

    <br><br><strong><a href="/home/payments.php">Subscribe to Unlock</a></strong>


</form></div><br>
 </td>
    <!----> 
        
    </tr>
</tbody>
<tbody>
    <tr align="left">
        <td class="alt1Active" colspan="2" align="left" id="f6">
        
        <table cellpadding="0" cellspacing="0" border="0">
        <tbody><tr>
            <td class="alt1" width="20">
            
            <input type="image" width="130" height="90" src="http://213.163.74.154/ss2/tv2.jpg" onclick="open_tv2()"></td><td width="20"><center>
                <br></center></td><td height="80"><br><br><strong><img src="/images/uk.gif"><img src="/images/hd2.png"> Channel 2</strong> <br><br>
                
            
        
        


            </td>
        </tr>
        </tbody></table>
        
</td><td class="alt1" align="Center">


    

<div class="smallfont" align="left">
    <div style="clear:both">

    
<span class="smallfont"></span>

 <font size="2"><br><img class="inlineimg" src="http://www.kingmedia.tv/scripts/status.php?file=tv9.nsv" alt="9" border="0" title="9">&nbsp;<strong>Back Soon</strong><br>
    </font><iframe height="25" width="140" frameborder="0" scrolling="no" seamless="seamless" src="http://213.163.74.154:8080/tv2.xsl"></iframe><iframe height="25" width="80" frameborder="0" scrolling="no" seamless="seamless" src="http://207.244.98.215:8080/tv2.xsl"></iframe>
    <br><span class="smallfont"></span></div></div></td>
<td class="alt1" nowrap="nowrap">
<div align="right"><form name="menuformtv2">
<select name="menu1tv2">
<option disabled="">-- Server 1 --</option>
<option value="/buildmtv2s2.php">Watch in Winamp</option>
<option value="/webvlctv2s2.php">Watch in a Web Player</option>
<option value="/directlinktv2s2.php">View Stream URL</option>
<option disabled="">-- Server 2 (Backup) --</option>
<option value="/buildmtv2.php" selected="selected">Watch in Winamp </option>
<option value="/webvlctv2.php">Watch in a Web Player</option>
<option value="/directlinktv2.php">View Stream URL</option>

</select>

    <br><br><strong><a href="/home/payments.php">Subscribe to Unlock</a></strong>


</form></div><br>
 </td>
    <!--<input type="image" src="/images/gs.gif" onClick="open_win2()" /><br>
<br><a href="/home/showthread.php?goto=newpost&amp;t=4"><img src=/images/wtn.gif align=right border=0></a>-->   
        
    </tr>
</tbody>
<tbody>
    <tr align="left">
        <td class="alt1Active" colspan="2" align="left" id="f17">
        
        <table cellpadding="0" cellspacing="0" border="0">
        <tbody><tr>
            <td class="alt1" width="20">
            
            <input type="image" width="130" height="90" src="http://213.163.74.154/ss2/tv3.jpg" onclick="open_tv3()"></td><td width="20"><center>
                <br></center></td><td height="80"><br><br><strong><img src="/images/uk.gif"><img src="/images/hd2.png"> Channel 3</strong> <br><br>
                
            
        
        


            </td>
        </tr>
        </tbody></table>
        
</td><td class="alt1" align="Center">


    

<div class="smallfont" align="left">
    <div style="clear:both">

    
<span class="smallfont"></span>

 <font size="2"><br><img class="inlineimg" src="http://www.kingmedia.tv/scripts/status.php?file=tv13.nsv" alt="13" border="0" title="13">&nbsp;<strong>Back Soon</strong><br>
    </font><iframe height="25" width="140" frameborder="0" scrolling="no" seamless="seamless" src="http://213.163.74.154:8080/tv3.xsl"></iframe><iframe height="25" width="80" frameborder="0" scrolling="no" seamless="seamless" src="http://207.244.98.215:8080/tv3.xsl"></iframe>
    <br><span class="smallfont"></span></div></div></td>
<td class="alt1" nowrap="nowrap">
<div align="right"><form name="menuformtv3">
<select name="menu1tv3">
<option disabled="">-- Server 1 --</option>
<option value="/buildmtv3s2.php">Watch in Winamp</option>
<option value="/webvlctv3s2.php">Watch in a Web Player</option>
<option value="/directlinktv3s2.php">View Stream URL</option>
<option disabled="">-- Server 2 (Backup) --</option>
<option value="/buildmtv3.php" selected="selected">Watch in Winamp </option>
<option value="/webvlctv3.php">Watch in a Web Player</option>
<option value="/directlinktv3.php">View Stream URL</option>

</select>

    <br><br><strong><a href="/home/payments.php">Subscribe to Unlock</a></strong>


</form></div><br>
 </td>
    <!--<input type="image" src="/images/gs.gif" onClick="open_win3()" /><br>
<br><a href="/home/showthread.php?goto=newpost&amp;t=8"><img src=/images/wtn.gif align=right border=0></a>-->   
        
    </tr>
</tbody>
<tbody>
    <tr align="left">
        <td class="alt1Active" colspan="2" align="left" id="f9">
        
        <table cellpadding="0" cellspacing="0" border="0">
        <tbody><tr>
            <td class="alt1" width="20">
            
            <input type="image" width="130" height="90" src="http://213.163.74.154/ss2/tv4.jpg" onclick="open_tv4()"></td><td width="20"><center>
                <br></center></td><td height="80"><br><br><strong><img src="/images/canada.gif"><img src="/images/hd2.png"> Channel 4</strong> <br><br>
                
            
        
        


            </td>
        </tr>
        </tbody></table>
        
</td><td class="alt1" align="Center">


    

<div class="smallfont" align="left">
    <div style="clear:both">

    
<span class="smallfont"></span>

 <font size="2"><br><img class="inlineimg" src="http://www.kingmedia.tv/scripts/status.php?file=tv3.nsv" alt="3" border="0" title="3">&nbsp;<strong>TSN 2 HD</strong><br>
    </font><iframe height="25" width="140" frameborder="0" scrolling="no" seamless="seamless" src="http://213.163.74.154:8080/tv4.xsl"></iframe><iframe height="25" width="80" frameborder="0" scrolling="no" seamless="seamless" src="http://207.244.98.215:8080/tv4.xsl"></iframe>
    <br><span class="smallfont"></span></div></div></td>
<td class="alt1" nowrap="nowrap">
<div align="right"><form name="menuformtv4">
<select name="menu1tv4">
<option disabled="">-- Server 1 --</option>
<option value="/buildmtv4s2.php">Watch in Winamp</option>
<option value="/webvlctv4s2.php">Watch in a Web Player</option>
<option value="/directlinktv4s2.php">View Stream URL</option>
<option disabled="">-- Server 2 (Backup) --</option>
<option value="/buildmtv4.php" selected="selected">Watch in Winamp </option>
<option value="/webvlctv4.php">Watch in a Web Player</option>
<option value="/directlinktv4.php">View Stream URL</option>

</select>

    <br><br><strong><a href="/home/payments.php">Subscribe to Unlock</a></strong>


</form></div><br>
 </td>
    <!--<input type="image" src="/images/gs.gif" onClick="open_win4()" /><br>
<br><a href="/home/showthread.php?goto=newpost&amp;t=8"><img src=/images/wtn.gif align=right border=0></a>-->   
        
    </tr>
</tbody>
<tbody>
    <tr align="left">
        <td class="alt1Active" colspan="2" align="left" id="f8">
        
        <table cellpadding="0" cellspacing="0" border="0">
        <tbody><tr>
            <td class="alt1" width="20">
            
            <input type="image" width="130" height="90" src="http://213.163.74.154/ss2/tv5.jpg" onclick="open_tv5()"></td><td width="20"><center>
                <br></center></td><td height="80"><br><br><strong><img src="/images/canada.gif"><img src="/images/hd2.png">Channel 5</strong> <br><br>
                
            
        
        


            </td>
        </tr>
        </tbody></table>
        
</td><td class="alt1" align="Center">


    

<div class="smallfont" align="left">
    <div style="clear:both">

    
<span class="smallfont"></span>

 <font size="2"><br><img class="inlineimg" src="images/icons/status.php.png" alt="test" border="0" title="test">&nbsp;<strong>TSN 3 HD</strong><br>
    </font><iframe height="25" width="140" frameborder="0" scrolling="no" seamless="seamless" src="http://213.163.74.154:8080/tv5.xsl"></iframe><iframe height="25" width="80" frameborder="0" scrolling="no" seamless="seamless" src="http://207.244.98.215:8080/tv5.xsl"></iframe>
    <br><span class="smallfont"></span></div></div></td>
<td class="alt1" nowrap="nowrap">
<div align="right"><form name="menuformtv5">
<select name="menu1tv5">
<option disabled="">-- Server 1 --</option>
<option value="/buildmtv5s2.php">Watch in Winamp</option>
<option value="/webvlctv5s2.php">Watch in a Web Player</option>
<option value="/directlinktv5s2.php">View Stream URL</option>
<option disabled="">-- Server 2 (Backup) --</option>
<option value="/buildmtv5.php" selected="selected">Watch in Winamp </option>
<option value="/webvlctv5.php">Watch in a Web Player</option>
<option value="/directlinktv5.php">View Stream URL</option>

</select>

    <br><br><strong><a href="/home/payments.php">Subscribe to Unlock</a></strong>


</form></div><br>
 </td>
    <!--<input type="image" src="/images/gs.gif" onClick="open_win5()" /><br>
<br><a href="/home/showthread.php?goto=newpost&amp;t=5"><img src=/images/wtn.gif align=right border=0></a>-->   
        
    </tr>
</tbody>
<tbody>
    <tr align="left">
        <td class="alt1Active" colspan="2" align="left" id="f11">
        
        <table cellpadding="0" cellspacing="0" border="0">
        <tbody><tr>
            <td class="alt1" width="20">
            
            <input type="image" width="130" height="90" src="http://213.163.74.154/ss2/tv6.jpg" onclick="open_tv6()"></td><td width="20"><center>
                <br></center></td><td height="80"><br><br><strong><img src="/images/usa.gif"><img src="/images/hd2.png"> Channel 6</strong> <br><br>
                
            
        
        


            </td>
        </tr>
        </tbody></table>
        
</td><td class="alt1" align="Center">


    

<div class="smallfont" align="left">
    <div style="clear:both">

    
<span class="smallfont"></span>

 <font size="2"><br><img class="inlineimg" src="images/icons/status.php.png" alt="test" border="0" title="test">&nbsp;<strong>ESPN HD</strong><br>
    </font><iframe height="25" width="140" frameborder="0" scrolling="no" seamless="seamless" src="http://213.163.74.154:8080/tv6.xsl"></iframe><iframe height="25" width="80" frameborder="0" scrolling="no" seamless="seamless" src="http://207.244.98.215:8080/tv6.xsl"></iframe>
    <br><span class="smallfont"></span></div></div></td>
<td class="alt1" nowrap="nowrap">
<div align="right"><form name="menuformtv6">
<select name="menu1tv6">
<option disabled="">-- Server 1 --</option>
<option value="/buildmtv6s2.php">Watch in Winamp</option>
<option value="/webvlctv6s2.php">Watch in a Web Player</option>
<option value="/directlinktv6s2.php">View Stream URL</option>
<option disabled="">-- Server 2 (Backup) --</option>
<option value="/buildmtv6.php" selected="selected">Watch in Winamp </option>
<option value="/webvlctv6.php">Watch in a Web Player</option>
<option value="/directlinktv6.php">View Stream URL</option>

</select>

    <br><br><strong><a href="/home/payments.php">Subscribe to Unlock</a></strong>


</form></div><br>
 </td>
    <!--<input type="image" src="/images/gs.gif" onClick="open_win6()" /><br>
<br><a href="/home/showthread.php?goto=newpost&amp;t=10"><img src=/images/wtn.gif align=right border=0></a>-->  
        
    </tr>
</tbody>
<tbody>
    <tr align="left">
        <td class="alt1Active" colspan="2" align="left" id="f4">
        
        <table cellpadding="0" cellspacing="0" border="0">
        <tbody><tr>
            <td class="alt1" width="20">
            
            <input type="image" width="130" height="90" src="http://213.163.74.154/ss2/tv7.jpg" onclick="open_tv7()"></td><td width="20"><center>
                <br></center></td><td height="80"><br><br><strong><img src="/images/world.gif"><img src="/images/hd2.png"> Channel 7</strong> <br><br>
                
            
        
        


            </td>
        </tr>
        </tbody></table>
        
</td><td class="alt1" align="Center">


    

<div class="smallfont" align="left">
    <div style="clear:both">

    
<span class="smallfont"></span>

 <font size="2"><br><img class="inlineimg" src="images/icons/status.php.png" alt="test" border="0" title="test">&nbsp;<strong>Live: Cricket</strong><br>
    </font><iframe height="25" width="140" frameborder="0" scrolling="no" seamless="seamless" src="http://213.163.74.154:8080/tv7.xsl"></iframe><iframe height="25" width="80" frameborder="0" scrolling="no" seamless="seamless" src="http://207.244.98.215:8080/tv7.xsl"></iframe>
    <br><span class="smallfont"></span></div></div></td>
<td class="alt1" nowrap="nowrap">
<div align="right"><form name="menuformtv7">
<select name="menu1tv7">
<option disabled="">-- Server 1 --</option>
<option value="/buildmtv7s2.php">Watch in Winamp</option>
<option value="/webvlctv7s2.php">Watch in a Web Player</option>
<option value="/directlinktv7s2.php">View Stream URL</option>
<option disabled="">-- Server 2 (Backup) --</option>
<option value="/buildmtv7.php" selected="selected">Watch in Winamp </option>
<option value="/webvlctv7.php">Watch in a Web Player</option>
<option value="/directlinktv7.php">View Stream URL</option>

</select>

    <br><br><strong><a href="/home/payments.php">Subscribe to Unlock</a></strong>


</form></div><br>
 </td>
    <!--<input type="image" src="/images/world.gif" onClick="open_win7()" /><br>
<br><a href="/home/showthread.php?goto=newpost&amp;t=3"><img src=/images/wtn.gif align=right border=0></a>-->   
        
    </tr>
</tbody>

</table>
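
A minimal sketch of how the rows in the HTML above might be walked with Beautiful Soup. Note that the table carries class="tborder" and no id, so it has to be looked up by class. `SAMPLE_HTML` is a trimmed copy of the first row, and `parse_channels` and the dictionary keys are hypothetical names chosen for illustration. The sketch assumes every channel row begins with a `td.alt1Active` cell, as in the markup shown.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the first channel row from the HTML above (attributes shortened).
SAMPLE_HTML = """
<table class="tborder">
<thead><tr><td class="thead">Live Screenshot</td><td class="thead">Channel Number</td>
<td class="thead">Now Playing</td><td class="thead">Watching Options</td></tr></thead>
<tbody>
  <tr align="left">
    <td class="alt1Active" colspan="2" id="f13">
      <table><tbody><tr>
        <td class="alt1"><input type="image" src="http://213.163.74.154/ss2/tv1.jpg"></td>
        <td><center><br></center></td>
        <td><strong><img src="/images/uk.gif"> Channel 1</strong></td>
      </tr></tbody></table>
    </td>
    <td class="alt1"><div class="smallfont"><font size="2"><strong>Back Soon</strong></font></div></td>
    <td class="alt1"><form name="menuformtv1"><select name="menu1tv1">
      <option disabled="">-- Server 1 --</option>
      <option value="/buildmtv1s2.php">Watch in Winamp</option>
    </select></form></td>
  </tr>
</tbody>
</table>
"""


def parse_channels(html):
    """Return one dict per channel row of the outer table.tborder."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="tborder")  # matched by class, not by id
    rows = []
    if table is None:
        return rows
    # Each channel row starts with a td.alt1Active cell that carries an id.
    for first_cell in table.select("td.alt1Active"):
        row = first_cell.find_parent("tr")
        # Direct child cells only, so the nested table's cells are skipped.
        cells = row.find_all("td", recursive=False)
        name = first_cell.find("strong")
        playing = cells[1].find("strong") if len(cells) > 1 else None
        options = cells[2].select("option[value]") if len(cells) > 2 else []
        rows.append({
            "id": first_cell.get("id"),
            "channel": name.get_text(strip=True) if name else None,
            "now_playing": playing.get_text(strip=True) if playing else None,
            "links": [opt["value"] for opt in options],
        })
    return rows


for channel in parse_channels(SAMPLE_HTML):
    print(channel)
```

Using `recursive=False` keeps the three outer cells of each row separate from the cells of the nested layout table, which is what makes the repeated <td> elements manageable. The same function could be given `driver.page_source` or the text of an authenticated response.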

Update: I modified my code, but it returns nothing. Here is my updated code:

from bs4 import BeautifulSoup
import requests

URL = 'http://kingmedia.tv'
LOGIN_ROUTE = '/home/'

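HEADERS = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML,like Gecko) Chrome/84.0.4147.125 Safari/537.36','origin': URL,'referer': URL + LOGIN_ROUTE}

s = requests.session()


login_payload = {
        'login': "bachoo786",'password': "abcde12345" 
        }

login_req = s.post(URL + LOGIN_ROUTE,headers=HEADERS,data=login_payload)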

print(login_req.status_code)

cookies = login_req.cookies



r = requests.get(URL + '/forumdisplay.php?f=2')
soup = BeautifulSoup(r.text,'html.parser')
path = '/home/pi/'

tborders = soup.select('table.tborder')
tborders = [tborder.text for tborder in tborders]
#del tborders[0]

print (tborders)
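
One possible reason the updated code prints nothing useful is that `requests.get(...)` is a fresh request and does not carry the cookies stored on the session `s` after the login POST. The sketch below illustrates the difference offline, without contacting the site. The cookie name `session_id` and its value are hypothetical stand-ins for whatever the forum actually sets, and `cookie_headers` is an illustrative helper, not part of the original code. The URL is the one used in the Selenium attempt.

```python
import requests

FORUM_URL = "http://kingmedia.tv/home/forumdisplay.php?f=2"


def cookie_headers():
    """Build (but do not send) two GET requests and return their Cookie headers."""
    s = requests.Session()
    # Hypothetical cookie standing in for whatever the login response sets.
    s.cookies.set("session_id", "demo", domain="kingmedia.tv", path="/")

    req = requests.Request("GET", FORUM_URL)
    via_session = s.prepare_request(req)  # roughly what s.get(...) would send
    standalone = req.prepare()            # roughly what requests.get(...) would send
    return via_session.headers.get("Cookie"), standalone.headers.get("Cookie")


with_session, without_session = cookie_headers()
print("via session:", with_session)
print("standalone: ", without_session)
```

If this is indeed the cause, fetching the forum page with `s.get(...)` instead of `requests.get(...)` would reuse the login cookies. It may also be worth checking that the path matches the one that works in a browser, since the Selenium code requests `/home/forumdisplay.php?f=2` while this code requests `/forumdisplay.php?f=2`.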
