How to fix errors when deploying a Selenium script on Heroku
I'm brand new to Heroku, and I'm trying to deploy a simple Python script that prints the page source of Google. Here is my script:
import os
from selenium import webdriver
op = webdriver.ChromeOptions()
op.binary_location = os.environ.get("GOOGLE_CHROME_BIN")
op.add_argument("--headless")
op.add_argument("--no-sandbox")
op.add_argument("--disable-dev-sh-usage")
driver = webdriver.Chrome(executable_path=os.environ.get("CHROMEDRIVER_PATH"),chrome_options=op)
driver.get('https://google.com')
print(driver.page_source)
These are my config vars on Heroku:
CHROMEDRIVER_PATH = /app/.chromedriver/bin/chromedriver
GOOGLE_CHROME_BIN = /app/.apt/usr/bin/google-chrome
And this is the error I get on Heroku:
2020-08-10T20:18:21.304180+00:00 heroku[web.1]: State changed from crashed to starting
2020-08-10T20:18:48.000000+00:00 app[api]: Build succeeded
2020-08-10T20:18:49.871354+00:00 heroku[web.1]: Starting process with command `python google.py`
2020-08-10T20:18:53.312155+00:00 heroku[web.1]: Process exited with status 1
2020-08-10T20:18:53.367125+00:00 heroku[web.1]: State changed from starting to crashed
2020-08-10T20:18:53.371240+00:00 heroku[web.1]: State changed from crashed to starting
2020-08-10T20:18:53.206728+00:00 app[web.1]: Traceback (most recent call last):
2020-08-10T20:18:53.206831+00:00 app[web.1]: File "google.py", line 8, in <module>
2020-08-10T20:18:53.207147+00:00 app[web.1]: driver = webdriver.Chrome(os.environ.get("CHROMEDRIVER_PATH"),op)
2020-08-10T20:18:53.208271+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/selenium/webdriver/chrome/webdriver.py", line 73, in __init__
2020-08-10T20:18:53.208582+00:00 app[web.1]: self.service.start()
2020-08-10T20:18:53.208623+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/selenium/webdriver/common/service.py", line 71, in start
2020-08-10T20:18:53.208923+00:00 app[web.1]: cmd.extend(self.command_line_args())
2020-08-10T20:18:53.208958+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/selenium/webdriver/chrome/service.py", line 45, in command_line_args
2020-08-10T20:18:53.209208+00:00 app[web.1]: return ["--port=%d" % self.port] + self.service_args
2020-08-10T20:18:53.209262+00:00 app[web.1]: TypeError: %d format: a number is required, not Options
2020-08-10T20:19:06.776347+00:00 heroku[web.1]: Starting process with command `python google.py`
2020-08-10T20:19:10.012273+00:00 heroku[web.1]: Process exited with status 1
2020-08-10T20:19:10.081926+00:00 heroku[web.1]: State changed from starting to crashed
2020-08-10T20:19:09.881424+00:00 app[web.1]: Traceback (most recent call last):
2020-08-10T20:19:09.881458+00:00 app[web.1]: File "google.py", in <module>
2020-08-10T20:19:09.881678+00:00 app[web.1]: driver = webdriver.Chrome(os.environ.get("CHROMEDRIVER_PATH"),op)
2020-08-10T20:19:09.881717+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/selenium/webdriver/chrome/webdriver.py", in __init__
2020-08-10T20:19:09.881941+00:00 app[web.1]: self.service.start()
2020-08-10T20:19:09.881974+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/selenium/webdriver/common/service.py", in start
2020-08-10T20:19:09.882191+00:00 app[web.1]: cmd.extend(self.command_line_args())
2020-08-10T20:19:09.882227+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/selenium/webdriver/chrome/service.py", in command_line_args
2020-08-10T20:19:09.882434+00:00 app[web.1]: return ["--port=%d" % self.port] + self.service_args
2020-08-10T20:19:09.882481+00:00 app[web.1]: TypeError: %d format: a number is required, not Options
Solution
TL;DR: You need to redeploy the correct code (the version that uses chrome_options=op) to Heroku.
Explanation
Look closely at the traceback. It mentions this line:
driver = webdriver.Chrome(os.environ.get("CHROMEDRIVER_PATH"),op)
which differs from the line you think is being executed:
driver = webdriver.Chrome(executable_path=os.environ.get("CHROMEDRIVER_PATH"),chrome_options=op)
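The second positional parameter of webdriver.Chrome.__init__ is indeed expected to be port, so an Options object passed positionally ends up as the port. That fully explains the error: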
return ["--port=%d" % self.port] + self.service_args
TypeError: %d format: a number is required, not Options
The error complains that self.port is an Options object rather than the int that %d expects.
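The mechanism can be reproduced without Selenium or Heroku. The following is a minimal sketch, not Selenium's actual code: FakeChrome and FakeOptions are hypothetical stand-ins, and the parameter order (executable_path, port, options) is an assumption modeled on what the traceback implies. Only the formatting expression is taken from the traceback itself.

```python
class FakeOptions:
    """Hypothetical stand-in for Selenium's Options object."""


class FakeChrome:
    """Hypothetical stand-in for webdriver.Chrome.

    Assumes the parameter order executable_path, port, options.
    """

    def __init__(self, executable_path="chromedriver", port=0, options=None):
        self.executable_path = executable_path
        self.port = port
        self.options = options

    def command_line_args(self):
        # Same formatting expression that appears in the traceback.
        return ["--port=%d" % self.port]


op = FakeOptions()

# Positional call: op lands in the second parameter, which is port.
broken = FakeChrome("/path/to/chromedriver", op)
try:
    broken.command_line_args()
except TypeError as exc:
    print("positional call failed:", exc)

# Keyword call: op goes to options and port keeps its default.
working = FakeChrome(executable_path="/path/to/chromedriver", options=op)
print("keyword call args:", working.command_line_args())
```

Passing the options by keyword avoids depending on the parameter order at all, which is why the keyword form in the question's script does not trigger this error.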
chrome_options is now deprecated and you should use options instead, so a working code block would be:
import os
from selenium import webdriver
op = webdriver.ChromeOptions()
op.binary_location = os.environ.get("GOOGLE_CHROME_BIN")
op.add_argument("--headless")
op.add_argument("--no-sandbox")
# Chrome's flag is spelled --disable-dev-shm-usage (shm, not sh)
op.add_argument("--disable-dev-shm-usage")
driver = webdriver.Chrome(executable_path=os.environ.get("CHROMEDRIVER_PATH"),options=op)
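Because os.environ.get returns None when a config var is missing, a misspelled or unset config var may only surface later as a less obvious failure during driver setup. One option is to fail early with a clear message. The sketch below is an illustration rather than part of the original answer: the helper names require_env and chrome_settings are hypothetical, and only the two config var names come from the question.

```python
import os


def require_env(name):
    """Return the value of config var `name`, or raise if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"config var {name} is not set")
    return value


def chrome_settings():
    """Collect the two paths the script needs from the Heroku config vars."""
    return {
        "binary_location": require_env("GOOGLE_CHROME_BIN"),
        "executable_path": require_env("CHROMEDRIVER_PATH"),
    }
```

With this in place, the script could assign chrome_settings()["binary_location"] to op.binary_location and pass chrome_settings()["executable_path"] as the executable_path keyword argument, so that a missing config var stops the process with a readable message.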
References
You can find some relevant discussions in:
- DeprecationWarning: use setter for headless property instead of set_headless opts.set_headless(headless=True) using Geckodriver and Selenium in Python
- DeprecationWarning: use options instead of chrome_options error using ChromeDriver and Chrome through Selenium on Windows 10 system
- How to configure ChromeDriver to initiate Chrome browser in Headless mode through Selenium?