How to fix "'NoneType' object has no attribute 'get_text'" in Python
I am scraping some pages from Amazon and I get this error (the one in the title).
Here is my code:
import requests
from bs4 import BeautifulSoup
import smtplib
URL = 'https://www.amazon.co.uk/UGREEN-Adapter-Samsung-Oneplus- Blackview/dp/B072V9CNTK/ref=sr_1_2_sspa?keywords=otg+cable&qid=1578610622&sr=8-2-spons&psc=1&spLa=ZW5jcnlwdGVkUXVhbGlmaWVyPUEzRzRRUUdaR05RVlRJJmVuY3J5cHRlZElkPUEwNjExNjM4MVI4NVZaTFlYTlhGSCZlbmNyeXB0ZWRBZElkPUEwMjg1MTU0OEhROERWQTBSRFAzJndpZGdldE5hbWU9c3BfYXRmJmFjdGlvbj1jbGlja1JlZGlyZWN0JmRvTm90TG9nQ2xpY2s9dHJ1ZQ=='
headers = {
    "User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36'}
page = requests.get(URL, headers=headers)
soup = BeautifulSoup(page.content, 'html.parser')
title = soup.find(id="productTitle").get_text()
price = soup.find(id="priceblock_ourprice").get_text()
converted_price = float(price[0:3])

def check_price():
    print(soup.find(id="priceblock_ourprice").get_text())
    converted_price = float(price[0:3])
    if converted_price < 7.00:
        send_mail()
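Solution
This happens because the page is loaded dynamically with JavaScript, so the elements are missing from the HTML that requests receives and soup.find() returns None. You can use Selenium to get the site's rendered HTML, like this: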
import time
from selenium import webdriver
URL = 'https://www.amazon.co.uk/UGREEN-Adapter-Samsung-Oneplus- Blackview/dp/B072V9CNTK/ref=sr_1_2_sspa?keywords=otg+cable&qid=1578610622&sr=8-2-spons&psc=1&spLa=ZW5jcnlwdGVkUXVhbGlmaWVyPUEzRzRRUUdaR05RVlRJJmVuY3J5cHRlZElkPUEwNjExNjM4MVI4NVZaTFlYTlhGSCZlbmNyeXB0ZWRBZElkPUEwMjg1MTU0OEhROERWQTBSRFAzJndpZGdldE5hbWU9c3BfYXRmJmFjdGlvbj1jbGlja1JlZGlyZWN0JmRvTm90TG9nQ2xpY2s9dHJ1ZQ=='
driver = webdriver.Chrome()
driver.get(URL)
time.sleep(5)  # give the page's scripts time to run
page = driver.page_source  # HTML after rendering
driver.close()
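Whichever way the HTML is fetched, soup.find() returns None when no element matches, and calling .get_text() on that None is what raises the error in the title. A minimal sketch of a guard follows; the helper name text_by_id and the sample HTML string are made up for illustration and are not part of the original code.

```python
from bs4 import BeautifulSoup


def text_by_id(soup, element_id):
    """Return the stripped text of the element with this id, or None if it is absent."""
    tag = soup.find(id=element_id)  # find() returns None when nothing matches
    if tag is None:
        return None
    return tag.get_text(strip=True)


# Hypothetical stand-in for a downloaded page that lacks the price element.
html = '<html><body><span id="productTitle"> USB cable </span></body></html>'
soup = BeautifulSoup(html, "html.parser")

print(text_by_id(soup, "productTitle"))         # USB cable
print(text_by_id(soup, "priceblock_ourprice"))  # None
```

With a check like this, a missing element shows up as None that the caller can handle, instead of an AttributeError in the middle of the script.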
Here is the complete Selenium code:
from bs4 import BeautifulSoup
from selenium import webdriver
import time
URL = 'https://www.amazon.co.uk/UGREEN-Adapter-Samsung-Oneplus- Blackview/dp/B072V9CNTK/ref=sr_1_2_sspa?keywords=otg+cable&qid=1578610622&sr=8-2-spons&psc=1&spLa=ZW5jcnlwdGVkUXVhbGlmaWVyPUEzRzRRUUdaR05RVlRJJmVuY3J5cHRlZElkPUEwNjExNjM4MVI4NVZaTFlYTlhGSCZlbmNyeXB0ZWRBZElkPUEwMjg1MTU0OEhROERWQTBSRFAzJndpZGdldE5hbWU9c3BfYXRmJmFjdGlvbj1jbGlja1JlZGlyZWN0JmRvTm90TG9nQ2xpY2s9dHJ1ZQ=='
driver = webdriver.Chrome()
driver.get(URL)
time.sleep(5)
page = driver.page_source
driver.close()
soup = BeautifulSoup(page, 'html5lib')  # requires the html5lib package to be installed
title = soup.find(id="productTitle")
price = soup.find(id="priceblock_ourprice")
print(price.get_text())
Output:
£6.99
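Note that the question's float(price[0:3]) would not work on this output: the first three characters of "£6.99" are "£6.", which float() rejects. One possible way to get a comparable number is sketched below. The function name parse_price is hypothetical, and the sketch assumes UK-style formatting (comma as thousands separator, dot as decimal point).

```python
import re
from decimal import Decimal


def parse_price(text):
    """Extract the first decimal number from a price string such as '£6.99'."""
    # Drop thousands separators, then look for digits with an optional fraction.
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    return Decimal(match.group())


price = parse_price("£6.99")
print(price)                    # 6.99
print(price < Decimal("7.00"))  # True
```

Decimal is used here rather than float so that currency amounts compare exactly; float(match.group()) would also work for a simple threshold check.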