Scrapy with Splash settings works in scrapy shell but fails otherwise

I am trying to scrape content from this link on my macOS machine using Scrapy with scrapy_splash settings and BeautifulSoup, following the scrapy_splash setup instructions.

  • I tested every command in scrapy shell and each one works fine; I tried this on several pages. When I run the spider with the same commands, it does not detect any of the items (see the sketch of the shell checks below).
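
The shell checks looked roughly like the following (a minimal sketch, assuming scrapy shell is started from the project directory so the Splash settings apply; the selectors are the ones used in the spider below, and url stands for whichever listing or job page was being tested):

# inside scrapy shell, started from the project directory
from scrapy_splash import SplashRequest
from bs4 import BeautifulSoup

fetch(SplashRequest(url, args={'html': 1}))  # url = a listing or job page
soup = BeautifulSoup(response.body, features='lxml')
soup.find('a', {'data-at': 'pagination-next'})                    # pagination link
soup.find('li', {'class': 'at-listing__li: st-icons_location'})   # a job-page field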

settings.py

BOT_NAME = 'stepstone'
SPIDER_MODULES = ['stepstone.spiders']
NEWSPIDER_MODULE = 'stepstone.spiders'
SPLASH_URL = 'http://0.0.0.0:8050'  # changed from the documentation's http://192.168.59.103:8050 which does not work 
DOWNLOADER_MIDDLEWARES = {
    'scrapy_splash.SplashCookiesMiddleware': 723,
    'scrapy_splash.SplashMiddleware': 725,
    'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware': 810,
}
SPIDER_MIDDLEWARES = {
    'scrapy_splash.SplashDeduplicateArgsMiddleware': 100,
}
DUPEFILTER_CLASS = 'scrapy_splash.SplashAwareDupeFilter'
HTTPCACHE_STORAGE = 'scrapy_splash.SplashAwareFSCacheStorage'
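
The Splash instance behind SPLASH_URL runs in Docker (see the Docker log at the end), presumably started with the usual docker run -p 8050:8050 scrapinghub/splash from the scrapy_splash README. A quick way to confirm the endpoint is reachable from the host is to hit the render API directly; a minimal sketch, assuming the requests library is available:

# sanity check for the Splash endpoint configured as SPLASH_URL
import requests

resp = requests.get(
    'http://0.0.0.0:8050/render.html',
    params={'url': 'https://www.stepstone.de/', 'wait': 0.5, 'timeout': 10},
)
print(resp.status_code, len(resp.text))  # expect 200 and a non-empty HTML body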

The spider module:

from scrapy.spiders import Spider
from scrapy_splash import SplashRequest
from scrapy import Request
from bs4 import BeautifulSoup


class StepSpider(Spider):
    name = 'step'
    allowed_domains = ['www.stepstone.de']
    start_urls = [
        'https://www.stepstone.de/5/ergebnisliste.html?stf=freeText&ns=1&qs='
        '%5B%7B%22id%22%3A%22216805%22%2C%22description%22%3A%22Software-Entw'
        'ickler%2Fin%22%2C%22type%22%3A%22jd%22%7D%2C%7B%22id%22%3A%223000001'
        '15%22%2C%22description%22%3A%22Deutschland%22%2C%22type%22%3A%22geoc'
        'ity%22%7D%5D&companyID=0&cityID=300000115&sourceOfTheSearchField=home'
        'pagemex%3Ageneral&searchOrigin=Homepage_top-search&ke=Software-Entwic'
        'kler%2Fin&ws=Deutschland&ra=30/'
    ]

    @staticmethod
    def extract_item(soup, extraction_path):
        result = soup.find(*extraction_path)
        if result:
            return result.getText()

    def parse(self, response):
        soup = BeautifulSoup(response.body, features='lxml')
        listings = [
            response.urljoin(item)
            for item in response.xpath('//div/div/a/@href').extract()
            if 'stellenangebote' in item
        ]
        yield from [
            Request(
                url,
                callback=self.parse_item,
                cb_kwargs={'soup': soup},
                meta={'splash': {'args': {'html': 1, 'png': 1}}},
            )
            for url in listings
        ]
        next_page = soup.find('a', {'data-at': 'pagination-next'})
        if next_page:
            yield SplashRequest(next_page.get('href'), self.parse)

    def parse_header(self, response, soup):
        title = response.xpath('//h1/text()').get()
        location = self.extract_item(
            soup, ('li', {'class': 'at-listing__li: st-icons_location'})
        )
        contract_type = self.extract_item(
            soup, ('li', {'class': 'at-listing__list-icons_contract-type'})
        )
        work_type = self.extract_item(
            soup, ('li', {'class': 'at-listing__list-icons_work-type'})
        )
        return {
            'title': title,
            'location': location,
            'contract_type': contract_type,
            'work_type': work_type,
        }

    def parse_body(self, response, soup):
        titles = response.xpath('//h4/text()').extract()
        intro = self.extract_item(
            soup, ('div', {'class': 'at-section-text-introduction-content'})
        )
        description = self.extract_item(
            soup, ('div', {'class': 'at-section-text-description-content'})
        )
        profile = self.extract_item(
            soup, ('div', {'class': 'at-section-text-profile-content'})
        )
        we_offer = self.extract_item(
            soup, ('div', {'class': 'at-section-text-weoffer-content'})
        )
        contact = self.extract_item(
            soup, ('div', {'class': 'at-section-text-contact-content'})
        )
        return {
            title: text
            for title, text in zip(
                titles, [intro, description, profile, we_offer, contact]
            )
        }

    def parse_item(self, response, soup):
        items = self.parse_header(response, soup)
        items.update(self.parse_body(response, soup))
        yield items
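
For reference, the plain Request objects with meta={'splash': ...} in parse() are what produce the POST http://0.0.0.0:8050/render.json lines in the log below; scrapy_splash can express the same request through its SplashRequest wrapper. A minimal sketch of that equivalent form (the cb_kwargs pass-through and the explicit endpoint are my reading of the scrapy_splash API, not the spider's original code):

# equivalent detail-page requests written with SplashRequest (sketch)
yield from [
    SplashRequest(
        url,
        callback=self.parse_item,
        cb_kwargs={'soup': soup},      # forwards the listing-page soup, exactly as parse() does above
        endpoint='render.json',        # matches the POST /render.json calls seen in the log
        args={'html': 1, 'png': 1},
    )
    for url in listings
]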

Full log:

2020-08-11 17:57:44 [scrapy.utils.log] INFO: Scrapy 2.2.1 started (bot: stepstone)
2020-08-11 17:57:44 [scrapy.utils.log] INFO: Versions: lxml 4.5.0.0,libxml2 2.9.10,cssselect 1.1.0,parsel 1.6.0,w3lib 1.22.0,Twisted 20.3.0,Python 3.8.3 (default,May 27 2020,20:54:22) - [Clang 11.0.3 (clang-1103.0.32.59)],pyOpenSSL 19.1.0 (OpenSSL 1.1.1g  21 Apr 2020),cryptography 3.0,Platform macOS-10.15.6-x86_64-i386-64bit
2020-08-11 17:57:44 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
2020-08-11 17:57:44 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'stepstone','DUPEFILTER_CLASS': 'scrapy_splash.SplashAwareDupeFilter','HTTPCACHE_STORAGE': 'scrapy_splash.SplashAwareFSCacheStorage','NEWSPIDER_MODULE': 'stepstone.spiders','SPIDER_MODULES': ['stepstone.spiders']}
2020-08-11 17:57:44 [scrapy.extensions.telnet] INFO: Telnet Password: 71c7bd3bdaf32c63
2020-08-11 17:57:44 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats','scrapy.extensions.telnet.TelnetConsole','scrapy.extensions.memusage.MemoryUsage','scrapy.extensions.logstats.LogStats']
2020-08-11 17:57:44 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware','scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware','scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware','scrapy.downloadermiddlewares.useragent.UserAgentMiddleware','scrapy.downloadermiddlewares.retry.RetryMiddleware','scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware','scrapy.downloadermiddlewares.redirect.RedirectMiddleware','scrapy.downloadermiddlewares.cookies.CookiesMiddleware','scrapy_splash.SplashCookiesMiddleware','scrapy_splash.SplashMiddleware','scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware','scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware','scrapy.downloadermiddlewares.stats.DownloaderStats']
2020-08-11 17:57:44 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware','scrapy_splash.SplashDeduplicateArgsMiddleware','scrapy.spidermiddlewares.offsite.OffsiteMiddleware','scrapy.spidermiddlewares.referer.RefererMiddleware','scrapy.spidermiddlewares.urllength.UrlLengthMiddleware','scrapy.spidermiddlewares.depth.DepthMiddleware']
2020-08-11 17:57:44 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2020-08-11 17:57:44 [scrapy.core.engine] INFO: Spider opened
2020-08-11 17:57:44 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min),scraped 0 items (at 0 items/min)
2020-08-11 17:57:44 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2020-08-11 17:57:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.stepstone.de/5/ergebnisliste.html?stf=freeText&ns=1&qs=%5B%7B%22id%22%3A%22216805%22%2C%22description%22%3A%22Software-Entwickler%2Fin%22%2C%22type%22%3A%22jd%22%7D%2C%7B%22id%22%3A%22300000115%22%2C%22description%22%3A%22Deutschland%22%2C%22type%22%3A%22geocity%22%7D%5D&companyID=0&cityID=300000115&sourceOfTheSearchField=homepagemex%3Ageneral&searchOrigin=Homepage_top-search&ke=Software-Entwickler%2Fin&ws=Deutschland&ra=30/> (referer: None)
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: utf-8  confidence = 0.99
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: SHIFT_JIS Japanese confidence = 0.01
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: EUC-JP Japanese confidence = 0.01
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: GB2312 Chinese confidence = 0.01
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: EUC-KR Korean confidence = 0.01
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: CP949 Korean confidence = 0.01
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: Big5 Chinese confidence = 0.01
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: EUC-TW Taiwan confidence = 0.01
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: windows-1251 Russian confidence = 0.01
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: KOI8-R Russian confidence = 0.01
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: ISO-8859-5 Russian confidence = 0.0
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: MacCyrillic Russian confidence = 0.01
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: IBM866 Russian confidence = 0.01
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: IBM855 Russian confidence = 0.01
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: ISO-8859-7 Greek confidence = 0.0
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: windows-1253 Greek confidence = 0.0
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: ISO-8859-5 Bulgairan confidence = 0.0
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: windows-1251 Bulgarian confidence = 0.0
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: TIS-620 Thai confidence = 0.041278205445058724
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: ISO-8859-9 Turkish confidence = 0.5186494104315963
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: windows-1255 Hebrew confidence = 0.0
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: windows-1255 Hebrew confidence = 0.0
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: windows-1255 Hebrew confidence = 0.0
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: utf-8  confidence = 0.99
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: SHIFT_JIS Japanese confidence = 0.01
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: EUC-JP Japanese confidence = 0.01
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: GB2312 Chinese confidence = 0.01
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: EUC-KR Korean confidence = 0.01
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: CP949 Korean confidence = 0.01
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: Big5 Chinese confidence = 0.01
2020-08-11 17:57:46 [chardet.charsetprober] DEBUG: EUC-TW Taiwan confidence = 0.01
2020-08-11 17:57:47 [scrapy.dupefilters] DEBUG: Filtered duplicate request: <GET https://www.stepstone.de/stellenangebote--JAVA-Software-Entwickler-m-w-d-Sueddeutschland-TECCON-Consulting-Engineering-GmbH--6582908-inline.html?suid=90b7defb-2854-4c23-98bd-b39bc15a6922&rltr=1_1_25_dynrl_m_0_0_0_0> - no more duplicates will be shown (see DUPEFILTER_DEBUG to show all duplicates)
2020-08-11 17:57:47 [py.warnings] WARNING: /usr/local/lib/python3.8/site-packages/scrapy_splash/request.py:41: ScrapyDeprecationWarning: Call to deprecated function to_native_str. Use to_unicode instead.
  url = to_native_str(url)

2020-08-11 17:57:50 [scrapy.core.engine] DEBUG: Crawled (200) <POST http://0.0.0.0:8050/render.json> (referer: None)
2020-08-11 17:57:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.stepstone.de/stellenangebote--Software-Entwickler-fuer-Windowsapplikationen-m-w-d-Stockach-oder-Boeblingen-Baumer-MDS-GmbH--6568164-inline.html?suid=90b7defb-2854-4c23-98bd-b39bc15a6922&rltr=19_19_25_dynrl_m_0_0_0_0>
{'title': 'Software Entwickler für Windowsapplikationen (m/w/d)','location': None,'contract_type': None,'work_type': None,'Ihre Herausforderung:': None,'Sie verfügen über:': None,'Wir bieten:': None,'Kontakt:': None}
2020-08-11 17:57:51 [scrapy.core.engine] DEBUG: Crawled (200) <POST http://0.0.0.0:8050/render.json> (referer: None)
2020-08-11 17:57:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.stepstone.de/stellenangebote--JAVA-Software-Entwickler-m-w-d-Sueddeutschland-TECCON-Consulting-Engineering-GmbH--6582908-inline.html?suid=90b7defb-2854-4c23-98bd-b39bc15a6922&rltr=1_1_25_dynrl_m_0_0_0_0>
{'title': 'JAVA Software-Entwickler (m/w/d)','Einleitung': None,'Ihre Aufgaben': None,'Ihr Profil': None,'Wir bieten': None,'Weitere Informationen': None}
2020-08-11 17:57:52 [scrapy.core.engine] DEBUG: Crawled (200) <POST http://0.0.0.0:8050/render.json> (referer: None)
2020-08-11 17:57:52 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.stepstone.de/stellenangebote--Software-Entwickler-Business-Engineer-fuer-Blockchain-Team-in-Gruendung-w-m-d-Frankfurt-Main-Deutsche-Bahn-AG--6249570-inline.html?suid=90b7defb-2854-4c23-98bd-b39bc15a6922&rltr=16_16_25_dynrl_m_0_0_0_0>
{'title': 'Software-Entwickler / Business Engineer für Blockchain-Team in Gründung (w/m/d)','Was dich erwartet': None,'Was wir erwarten': None,'Standort': None}
2020-08-11 17:57:55 [scrapy.core.engine] DEBUG: Crawled (200) <POST http://0.0.0.0:8050/render.json> (referer: None)
2020-08-11 17:57:55 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.stepstone.de/stellenangebote--Software-Entwickler-w-m-d-Diagnose-und-Visualisierungssysteme-Mannheim-Halle-Stadler-Mannheim-GmbH--6615613-inline.html?suid=90b7defb-2854-4c23-98bd-b39bc15a6922&rltr=13_13_25_dynrl_m_0_0_0_0>
{'title': 'Software-Entwickler (w/m/d) Diagnose und Visualisierungssysteme','Ihre Aufgaben:': None,'Ihr Profil:': None,'Unser Angebot:': None,'Begeistert?': None}
2020-08-11 17:57:55 [scrapy.core.engine] DEBUG: Crawled (200) <POST http://0.0.0.0:8050/render.json> (referer: None)
2020-08-11 17:57:55 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.stepstone.de/stellenangebote--Software-Entwickler-m-w-d-Rosenheim-Agenda-Informationssysteme-GmbH-Co-KG--6590641-inline.html?suid=90b7defb-2854-4c23-98bd-b39bc15a6922&rltr=17_17_25_dynrl_m_0_0_0_0>
{'title': 'Software-Entwickler (m/w/d)','Das spricht für uns:': None,'Kontakt:': None,'Standort': None}
2020-08-11 17:58:08 [scrapy.core.engine] DEBUG: Crawled (200) <POST http://0.0.0.0:8050/render.json> (referer: None)
2020-08-11 17:58:08 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.stepstone.de/stellenangebote--Software-Entwickler-w-m-d-fuer-Fahrzeugsteuerung-Mannheim-Halle-Stadler-Mannheim-GmbH--6615612-inline.html?suid=90b7defb-2854-4c23-98bd-b39bc15a6922&rltr=11_11_25_dynrl_m_0_0_0_0>
{'title': 'Software-Entwickler (w/m/d) für Fahrzeugsteuerung','Begeistert?': None}
^C2020-08-11 17:58:09 [scrapy.crawler] INFO: Received SIGINT,shutting down gracefully. Send again to force 
2020-08-11 17:58:09 [scrapy.core.engine] INFO: Closing spider (shutdown)
2020-08-11 17:58:13 [scrapy.core.engine] DEBUG: Crawled (200) <POST http://0.0.0.0:8050/render.json> (referer: None)
2020-08-11 17:58:13 [scrapy.core.engine] DEBUG: Crawled (200) <POST http://0.0.0.0:8050/render.json> (referer: None)
2020-08-11 17:58:13 [scrapy.core.engine] DEBUG: Crawled (200) <POST http://0.0.0.0:8050/render.json> (referer: None)
2020-08-11 17:58:13 [scrapy.core.engine] DEBUG: Crawled (200) <POST http://0.0.0.0:8050/render.json> (referer: None)
2020-08-11 17:58:13 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.stepstone.de/stellenangebote--Software-Entwickler-m-w-d-Meissen-Staatliche-Porzellan-Manufaktur-Meissen-GmbH--6462761-inline.html?suid=90b7defb-2854-4c23-98bd-b39bc15a6922&rltr=14_14_25_dynrl_m_0_0_0_0>
{'title': 'Software Entwickler (m/w/d)','Wir gehen neue Wege': None,'unsere Anforderungen': None,'unser Angebot': None,'Kontakt': None}
2020-08-11 17:58:13 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.stepstone.de/stellenangebote--Agiler-Software-Entwickler-m-w-div-Dresden-Otto-Group-Solution-Provider-OSP-GmbH--4573007-inline.html?suid=90b7defb-2854-4c23-98bd-b39bc15a6922&rltr=8_8_25_dynrl_m_0_0_0_0>
{'title': 'Agiler Software Entwickler (m/w/div)','Über uns': None,'Was du mitbringen solltest': None,'Diese und weitere Benefits erwarten dich': None,'Kontakt': None}
2020-08-11 17:58:13 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.stepstone.de/stellenangebote--Software-Entwickler-m-w-d-Essen-Lowell-Group--6615697-inline.html?suid=90b7defb-2854-4c23-98bd-b39bc15a6922&rltr=9_9_25_dynrl_m_0_0_0_0>
{'title': 'Software Entwickler (m/w/d)','Kontakt': None,'Standort': None}
2020-08-11 17:58:13 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.stepstone.de/stellenangebote--Softwareentwickler-m-w-d-Fullstack-Web-Boeblingen-Braunschweig-Deutschlandweit-Ingolstadt-Muenchen-Norddeutschland-Stuttgart-umlaut--6122455-inline.html?suid=90b7defb-2854-4c23-98bd-b39bc15a6922&rltr=15_15_25_dynrl_m_0_0_0_0>
{'title': 'Softwareentwickler (m/w/d) - Fullstack Web','our öffer': None,'yöu': None,'top 5 reasöns': None,'cöntact': None,'Mitarbeiterbewertungen': None}
2020-08-11 17:58:13 [scrapy.core.engine] DEBUG: Crawled (200) <POST http://0.0.0.0:8050/render.json> (referer: None)
2020-08-11 17:58:13 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.stepstone.de/stellenangebote--Software-Entwickler-E-Commerce-m-w-d-Dresden-Otto-Group-Solution-Provider-OSP-GmbH--6550022-inline.html?suid=90b7defb-2854-4c23-98bd-b39bc15a6922&rltr=10_10_25_dynrl_m_0_0_0_0>
{'title': 'Software Entwickler E-Commerce (m/w/d)','Standort': None}
^C2020-08-11 17:58:16 [scrapy.crawler] INFO: Received SIGINT twice,forcing unclean shutdown
2020-08-11 17:58:16 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <POST http://0.0.0.0:8050/render.json> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2020-08-11 17:58:16 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <POST http://0.0.0.0:8050/render.json> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2020-08-11 17:58:16 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <POST http://0.0.0.0:8050/render.json> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2020-08-11 17:58:16 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.stepstone.de/5/ergebnisliste.html?stf=freeText&ns=1&companyid=0&sourceofthesearchfield=homepagemex%3Ageneral&qs=[{"id"%3A216805%2C"description"%3A"Software-Entwickler\%2Fin"%2C"type"%3A"jd"}%2C{"id"%3A300000115%2C"description"%3A"Deutschland"%2C"type"%3A"geocity"}]&cityid=300000115&ke=Software-Entwickler%2Fin&ws=Deutschland&ra=30&suid=90b7defb-2854-4c23-98bd-b39bc15a6922&of=25&action=paging_next via http://0.0.0.0:8050/render.html> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2020-08-11 17:58:16 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <POST http://0.0.0.0:8050/render.json> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2020-08-11 17:58:16 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <POST http://0.0.0.0:8050/render.json> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2020-08-11 17:58:16 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <POST http://0.0.0.0:8050/render.json> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2020-08-11 17:58:16 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <POST http://0.0.0.0:8050/render.json> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]

Docker log (not the complete log, due to the text limit, but it keeps repeating almost the same lines):

2020-08-11 15:57:27+0000 [-] Log opened.
2020-08-11 15:57:27.990815 [-] Xvfb is started: ['Xvfb',':2061643423','-screen','0','1024x768x24','-nolisten','tcp']
QStandardPaths: XDG_RUNTIME_DIR not set,defaulting to '/tmp/runtime-splash'
2020-08-11 15:57:28.135258 [-] Splash version: 3.4.1
2020-08-11 15:57:28.203198 [-] Qt 5.13.1,PyQt 5.13.1,WebKit 602.1,Chromium 73.0.3683.105,sip 4.19.19,Twisted 19.7.0,Lua 5.2
2020-08-11 15:57:28.203826 [-] Python 3.6.9 (default,Nov  7 2019,10:44:02) [GCC 8.3.0]
2020-08-11 15:57:28.204679 [-] Open files limit: 1048576
2020-08-11 15:57:28.205242 [-] Can't bump open files limit
2020-08-11 15:57:28.229336 [-] proxy profiles support is enabled,proxy profiles path: /etc/splash/proxy-profiles
2020-08-11 15:57:28.229855 [-] memory cache: enabled,private mode: enabled,js cross-domain access: disabled
2020-08-11 15:57:28.410540 [-] verbosity=1,slots=20,argument_cache_max_entries=500,max-timeout=90.0
2020-08-11 15:57:28.411484 [-] Web UI: enabled,Lua: enabled (sandbox: enabled),Webkit: enabled,Chromium: enabled
2020-08-11 15:57:28.412634 [-] Site starting on 8050
2020-08-11 15:57:28.412924 [-] Starting factory <twisted.web.server.Site object at 0x7fbfa77591d0>
2020-08-11 15:57:28.414172 [-] Server listening on http://0.0.0.0:8050
2020-08-11 15:57:49.583386 [events] {"path": "/render.json","rendertime": 2.339588165283203,"maxrss": 236848,"load": [0.1,0.05,0.06],"fds": 102,"active": 7,"qsize": 0,"_id": 140461124347104,"method": "POST","timestamp": 1597161469,"user-agent": "Scrapy/2.2.1 (+https://scrapy.org)","args": {"headers": {"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8","Accept-Language": "en","Cookie": "cfid=bef36179-b81a-44e3-9bd1-059f5406a911; cftoken=0; USER_HASH_ID=f896d04a-5348-455b-a0ab-ffd0d5f6e674; V5=1; UXUSER=BLACKLIST%3BA%3B%20%3B; STEPSTONEV5LANG=de; ONLINE_CF=14-190; dtCookie=35$77973CDF4397BDD4A3EF1CAFDB05C9FD","Referer": "https://www.stepstone.de/5/ergebnisliste.html?stf=freeText&ns=1&qs=%5B%7B%22id%22%3A%22216805%22%2C%22description%22%3A%22Software-Entwickler%2Fin%22%2C%22type%22%3A%22jd%22%7D%2C%7B%22id%22%3A%22300000115%22%2C%22description%22%3A%22Deutschland%22%2C%22type%22%3A%22geocity%22%7D%5D&companyID=0&cityID=300000115&sourceOfTheSearchField=homepagemex%3Ageneral&searchOrigin=Homepage_top-search&ke=Software-Entwickler%2Fin&ws=Deutschland&ra=30/","User-Agent": "Scrapy/2.2.1 (+https://scrapy.org)"},"html": 1,"png": 1,"url": "https://www.stepstone.de/stellenangebote--Software-Entwickler-fuer-Windowsapplikationen-m-w-d-Stockach-oder-Boeblingen-Baumer-MDS-GmbH--6568164-inline.html?suid=90b7defb-2854-4c23-98bd-b39bc15a6922&rltr=19_19_25_dynrl_m_0_0_0_0","uid": 140461124347104},"status_code": 200,"client_ip": "172.17.0.1"}
2020-08-11 15:57:49.584498 [-] "172.17.0.1" - - [11/Aug/2020:15:57:48 +0000] "POST /render.json HTTP/1.1" 200 371319 "-" "Scrapy/2.2.1 (+https://scrapy.org)"
2020-08-11 15:57:49.777352 [events] {"path": "/render.json","rendertime": 2.6071407794952393,"maxrss": 243100,"fds": 106,"active": 6,"_id": 140461124981984,"url": "https://www.stepstone.de/stellenangebote--JAVA-Software-Entwickler-m-w-d-Sueddeutschland-TECCON-Consulting-Engineering-GmbH--6582908-inline.html?suid=90b7defb-2854-4c23-98bd-b39bc15a6922&rltr=1_1_25_dynrl_m_0_0_0_0","uid": 140461124981984},"client_ip": "172.17.0.1"}
