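How to resolve Scrapy-Splash "Ignoring response <405 https://www.controller.com/listings/aircraft/for-sale/list>: HTTP status code is not handled or not allowed"
I am trying to access a website with Scrapy-Splash, but I get a 405 error: Ignoring response <405 https://www.controller.com/>: HTTP status code is not handled or not allowed
The code I am using: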
import scrapy
from scrapy_splash import SplashRequest


class ProxySpider(scrapy.Spider):
    name = "proxyss"

    def start_requests(self):
        urls = [
            'https://controller.com/',
        ]
        for url in urls:
            yield SplashRequest(
                "https://www.controller.com/listings/aircraft/for-sale/list",
                self.parse,
                args={"http_method": 'GET', 'wait': 5, 'proxy': 'http://xxxxxxxxxx'},
            )

    def parse(self, response):
        page = response.url.split("/")[-2]
        filename = 'proxy.html'
        with open(filename, 'wb') as f:
            f.write(response.body)
        self.log('Saved file %s' % filename)
Log:
2020-08-17 21:30:55 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.controller.com> (failed 1 times): 405 Method Not Allowed
2020-08-17 21:30:55 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.controller.com/listings/aircraft/for-sale/list> (failed 1 times): 405 Method Not Allowed
2020-08-17 21:30:55 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.controller.com> (failed 2 times): 405 Method Not Allowed
2020-08-17 21:30:55 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.controller.com/listings/aircraft/for-sale/list> (failed 2 times): 405 Method Not Allowed
2020-08-17 21:30:55 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.controller.com> (failed 3 times): 405 Method Not Allowed
2020-08-17 21:30:55 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.controller.com/listings/aircraft/for-sale/list> (failed 3 times): 405 Method Not Allowed
2020-08-17 21:30:55 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <GET https://www.controller.com> (failed 4 times): 405 Method Not Allowed
2020-08-17 21:30:55 [scrapy.core.engine] DEBUG: Crawled (405) <GET https://www.controller.com> (referer: https://www.controller.com/listings/aircraft/for-sale/list)
2020-08-17 21:30:55 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <GET https://www.controller.com/listings/aircraft/for-sale/list> (failed 4 times): 405 Method Not Allowed
2020-08-17 21:30:55 [scrapy.core.engine] DEBUG: Crawled (405) <GET https://www.controller.com/listings/aircraft/for-sale/list> (referer: https://www.controller.com/listings/aircraft/for-sale/list)
2020-08-17 21:30:56 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <405 https://www.controller.com>: HTTP status code is not handled or not allowed
2020-08-17 21:30:56 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <405 https://www.controller.com/listings/aircraft/for-sale/list>: HTTP status code is not handled or not allowed
2020-08-17 21:30:56 [scrapy.core.engine] INFO: Closing spider (finished)
2020-08-17 21:30:56 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
Solution
This may simply be a retry problem. Add the following to your settings.py file and see whether it helps:
RETRY_ENABLED = True
RETRY_TIMES = 3
RETRY_HTTP_CODES = [405]
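The "HTTP status code is not handled or not allowed" message comes from Scrapy's HttpErrorMiddleware, which drops non-2xx responses before they reach the spider callback. If retrying does not help, one way to see what the server actually sent back is to let 405 responses through. The fragment below is a minimal sketch, assuming it goes in the same settings.py; it does not make the 405 go away, it only allows parse() to receive and save the response body for inspection.

```python
# settings.py (sketch): pass 405 responses to the spider callback
# instead of letting HttpErrorMiddleware discard them.
HTTPERROR_ALLOWED_CODES = [405]
```

The same effect can be limited to a single spider by setting the class attribute handle_httpstatus_list = [405] on it instead of changing the project-wide setting.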