使用Python下载PDF并跟踪下载

如何解决使用Python下载PDF并跟踪下载

我正在创建一个从网站下载PDF并将其保存到磁盘的应用程序。我知道“请求”模块可以执行此操作，但不能处理下载背后的逻辑（文件大小，进度，剩余时间等）。

到目前为止，我已经使用硒创建了程序，最终希望将其最终合并到GUI Tkinter应用程序中。

处理下载，跟踪并最终创建进度条的最佳方法是什么？

到目前为止，这是我的代码：

from selenium import webdriver
from time import sleep 
import requests

import secrets

class manual_grabber():
    """ A class creating a manual downloader for the Roger Technology website """
    def __init__(self):
    """ Initialize attributes of manual grabber """
    self.driver = webdriver.Chrome('\\Users\\Joel\\Desktop\\Python\\manual_grabber\\chromedriver.exe')

def login(self):
    """ Function controlling the login logic """
    self.driver.get('https://rogertechnology.it/en/b2b')

    sleep(1)

    # Locate elements and enter login details
    user_in = self.driver.find_element_by_xpath('/html/body/div[2]/form/input[6]')
    user_in.send_keys(secrets.username)   

    pass_in = self.driver.find_element_by_xpath('/html/body/div[2]/form/input[7]')
    pass_in.send_keys(secrets.password)

    enter_button = self.driver.find_element_by_xpath('/html/body/div[2]/form/div/input')
    enter_button.click()
    
    # Click Self Service Area button
    self_service_button = self.driver.find_element_by_xpath('//*[@id="bs-example-navbar-collapse-1"]/ul/li[1]/a')
    self_service_button.click()

def download_file(self):
    """Access file tree and navigate to PDF's and download"""
    # Wait for all elements to load 
    sleep(3)

    # Find and switch to iFrame
    frame = self.driver.find_element_by_xpath('//*[@id="siteOutFrame"]/iframe')
    self.driver.switch_to.frame(frame)

    # Find and click tech manuals button 
    tech_manuals_button = self.driver.find_element_by_xpath('//*[@id="fileTree_1"]/ul/li/ul/li[6]/a')
    tech_manuals_button.click()


bot = manual_grabber()
bot.login()
bot.download_file()

因此，总而言之，我想使此代码在网站上下载PDF，并将它们存储在特定目录（以“ JQuery文件树”中的父文件夹命名），并跟踪进度（文件大小，时间）其余等）。

这是DOM：

我希望这是足够的信息。还有其他要求吗？

使用Python下载PDF并跟踪下载

如何解决使用Python下载PDF并跟踪下载

相关推荐