How to solve: Python requests - scraping https://imgur.com
I'm trying to get photo content (bytes) from https://imgur.com with the following:
from requests import get
from bs4 import BeautifulSoup as soup
divs = soup(get('https://imgur.com/a/...').text,'html.parser').find('div',{'class':'post-image-container'})
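My question is: if I run into a site that blocks requests, is that the end of the road?
(I don't want to use Selenium.)
Solution
Hey, use rotating proxies to avoid the 404:
Step 1: (I'm using free proxies here)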
pip3 install bs4 requests stem
Fetch the list of freely available proxies automatically:
import requests
import random
from bs4 import BeautifulSoup as bs

def get_free_proxies():
    url = "https://free-proxy-list.net/"
    # get the HTTP response and construct soup object
    soup = bs(requests.get(url).content, "html.parser")
    proxies = []
    # skip the header row, then read ip and port from the first two cells
    for row in soup.find("table", attrs={"id": "proxylisttable"}).find_all("tr")[1:]:
        tds = row.find_all("td")
        try:
            ip = tds[0].text.strip()
            port = tds[1].text.strip()
            host = f"{ip}:{port}"
            proxies.append(host)
        except IndexError:
            continue
    return proxies
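To see what the row parsing does without hitting the network, here is a minimal sketch that runs the same logic on a made-up HTML snippet. The function name `parse_proxy_table`, the sample markup, and the addresses are assumptions for illustration only; the real page layout may differ or change, which is why the sketch returns an empty list when the table is missing instead of raising.

```python
from bs4 import BeautifulSoup

# Made-up sample markup (assumption); the real page layout may differ.
SAMPLE_HTML = """
<table id="proxylisttable">
  <tr><th>IP Address</th><th>Port</th></tr>
  <tr><td>203.0.113.5</td><td>8080</td></tr>
  <tr><td>198.51.100.7</td><td>3128</td></tr>
  <tr><td>incomplete row</td></tr>
</table>
"""

def parse_proxy_table(html, table_id="proxylisttable"):
    """Return 'ip:port' strings from the table with the given id."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", attrs={"id": table_id})
    if table is None:
        # The layout changed or the request was blocked: nothing to parse.
        return []
    proxies = []
    for row in table.find_all("tr")[1:]:  # skip the header row
        tds = row.find_all("td")
        if len(tds) < 2:
            continue  # not a complete ip/port row
        proxies.append(f"{tds[0].text.strip()}:{tds[1].text.strip()}")
    return proxies

print(parse_proxy_table(SAMPLE_HTML))
# prints ['203.0.113.5:8080', '198.51.100.7:3128']
```

Keeping the parsing separate from the download makes it easier to tell a blocked request apart from a changed page layout.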
Step 2:
If you want to check the first 10 proxies:
get_free_proxies()[:10]
>> Output : ['45.235.216.112:8080','139.255.11.148:8080','139.255.11.146:8080','152.67.24.187:80','197.220.109.222:80','180.250.12.10:80','87.140.8.148:8080','14.207.60.6:3128','136.243.254.196:80','52.56.100.107:3128']
proxies = ['45.235.216.112:8080','52.56.100.107:3128']
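Free proxies tend to go offline quickly, so instead of picking two by hand it may help to keep only the ones that currently respond. The following is a rough sketch, not part of the original answer: `proxy_works` and `working_proxies` are hypothetical helper names, and the test URL, timeout, and limit are assumed values.

```python
import requests

def proxy_works(proxy, test_url="https://imgur.com/", timeout=3):
    """Return True if test_url can be fetched through the given 'ip:port' proxy."""
    proxy_url = f"http://{proxy}"
    try:
        resp = requests.get(
            test_url,
            proxies={"http": proxy_url, "https": proxy_url},
            timeout=timeout,
        )
        return resp.ok  # True for status codes below 400
    except requests.RequestException:
        # Connection refused, timeout, proxy error, and so on.
        return False

def working_proxies(candidates, limit=2, check=proxy_works):
    """Return up to `limit` candidates for which check(proxy) is true."""
    found = []
    for proxy in candidates:
        if len(found) >= limit:
            break
        if check(proxy):
            found.append(proxy)
    return found
```

Usage would look like `proxies = working_proxies(get_free_proxies())`. The `check` parameter is there only so the selection logic can be tried with a stand-in function instead of real network calls.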
Step 3:
Create a session
def get_session(proxies):
    # construct an HTTP session
    session = requests.Session()
    # choose one random proxy
    proxy = random.choice(proxies)
    session.proxies = {"http": proxy, "https": proxy}
    return session
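A random choice can pick the same proxy several times in a row. As an alternative, here is a small sketch that hands out sessions in round-robin order. The name `proxy_sessions` is hypothetical, the addresses in the usage lines are placeholders, and the explicit `http://` prefix is a choice made here for clarity (the proxy itself is contacted over plain HTTP even when the target URL is HTTPS).

```python
import itertools
import requests

def proxy_sessions(proxies):
    """Yield sessions indefinitely, cycling through the proxies in order."""
    for proxy in itertools.cycle(proxies):
        session = requests.Session()
        # Both keys point at the same proxy URL; HTTPS targets are tunneled
        # through it.
        proxy_url = f"http://{proxy}"
        session.proxies = {"http": proxy_url, "https": proxy_url}
        yield session

# Placeholder addresses from documentation ranges, not real proxies.
sessions = proxy_sessions(["203.0.113.5:8080", "198.51.100.7:3128"])
first = next(sessions)
second = next(sessions)
third = next(sessions)  # wraps around to the first proxy again
print(first.proxies["https"], second.proxies["https"], third.proxies["https"])
```

Nothing is sent over the network until `get` is called on one of the sessions.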
Step 4:
for i in range(5):
    s = get_session(proxies)
    try:
        # send the request through the session so the chosen proxy is used
        html = s.get("https://imgur.com/", timeout=1.5).text.strip()
        print("Request page with IP:", html)
        divs = bs(html, "html.parser").find("div", {"class": "post-image-container"})
        print(divs)
    except Exception as e:
        # this proxy failed or timed out; try another one
        continue
Here is the output:
Request page with IP: <!doctype html> <html lang=en> <head> <meta charset=utf-8> <meta name=viewport content="width=device-width,initial-scale=1"> <meta name=keywords content="funny,image,gif,gifs,memes,jokes,image upload,upload image,lol,humor,vote,comment,share,imgur,imgur.com,wallpaper"/> <meta name=description content="Discover the magic of the internet at Imgur,a community powered entertainment destination. Lift your spirits with funny jokes,trending memes,entertaining gifs,inspiring stories,viral videos,and so much more."/> ....
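Once a page comes back, the remaining part of the question is getting from the HTML to the image bytes. Below is a hedged sketch of the first half: collecting image URLs from `post-image-container` blocks. The helper name `image_urls` and the sample markup are made up for illustration, and whether the HTML returned to `requests` contains these divs at all is an assumption (the page may build them with JavaScript, in which case the list comes back empty).

```python
from urllib.parse import urljoin
from bs4 import BeautifulSoup

def image_urls(html, base_url="https://imgur.com/"):
    """Collect absolute image URLs found inside div.post-image-container blocks."""
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for div in soup.find_all("div", {"class": "post-image-container"}):
        for img in div.find_all("img"):
            src = img.get("src")
            if src:
                # Resolve relative and scheme-relative URLs against the page URL.
                urls.append(urljoin(base_url, src))
    return urls

# Made-up markup (assumption), only to exercise the function.
SAMPLE = '<div class="post-image-container"><img src="//i.imgur.com/example.jpg"></div>'
print(image_urls(SAMPLE))
# prints ['https://i.imgur.com/example.jpg']
```

With a session `s` from `get_session`, something like `s.get(url, timeout=5).content` would then return the raw bytes for each URL, since `content` is the undecoded response body.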