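How to fix Python's urllib library not working with http URLs
I think I found the answer at the following link: Python urllib doesn't open any http URL
However, the answers on that page do not make clear what needs to be done to fix the problem.
I am trying to run this code in a Jupyter Notebook with Python 3.7.4: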
from urllib.request import urlopen
html = urlopen('http://pythonscraping.com/pages/page1.html')
print(html.read())
Then I get this error:
ValueError Traceback (most recent call last)
C:\Anaconda3\lib\http\client.py in _get_hostport(self,host,port)
886 try:
--> 887 port = int(host[i+1:])
888 except ValueError:
ValueError: invalid literal for int() with base 10: 'port'
During handling of the above exception, another exception occurred:
InvalidURL Traceback (most recent call last)
<ipython-input-2-b7a1a86203a2> in <module>
1 from urllib.request import urlopen
2
----> 3 html = urlopen('http://pythonscraping.com/pages/page1.html')
4 print(html.read())
C:\Anaconda3\lib\urllib\request.py in urlopen(url,data,timeout,cafile,capath,cadefault,context)
220 else:
221 opener = _opener
--> 222 return opener.open(url,timeout)
223
224 def install_opener(opener):
C:\Anaconda3\lib\urllib\request.py in open(self,fullurl,timeout)
523 req = meth(req)
524
--> 525 response = self._open(req,data)
526
527 # post-process response
C:\Anaconda3\lib\urllib\request.py in _open(self,req,data)
541 protocol = req.type
542 result = self._call_chain(self.handle_open,protocol,protocol +
--> 543 '_open',req)
544 if result:
545 return result
C:\Anaconda3\lib\urllib\request.py in _call_chain(self,chain,kind,meth_name,*args)
501 for handler in handlers:
502 func = getattr(handler,meth_name)
--> 503 result = func(*args)
504 if result is not None:
505 return result
C:\Anaconda3\lib\urllib\request.py in http_open(self,req)
1343
1344 def http_open(self,req):
-> 1345 return self.do_open(http.client.HTTPConnection,req)
1346
1347 http_request = AbstractHTTPHandler.do_request_
C:\Anaconda3\lib\urllib\request.py in do_open(self,http_class,**http_conn_args)
1283
1284 # will parse host:port
-> 1285 h = http_class(host,timeout=req.timeout,**http_conn_args)
1286 h.set_debuglevel(self._debuglevel)
1287
C:\Anaconda3\lib\http\client.py in __init__(self,port,source_address,blocksize)
849 self._tunnel_headers = {}
850
--> 851 (self.host,self.port) = self._get_hostport(host,port)
852
853 # This is stored as an instance variable to allow unit
C:\Anaconda3\lib\http\client.py in _get_hostport(self,host,port)
890 port = self.default_port
891 else:
--> 892 raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
893 host = host[:i]
894 else:
InvalidURL: nonnumeric port: 'port'
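The last frame shows http.client splitting a host string at its final colon and failing to convert the text after it to an integer. The requested URL contains no explicit port, so the string being parsed is presumably not the URL's own host. The following is a minimal sketch that reproduces the same message in isolation; the host name proxy.example and the helper parse_error_for are hypothetical and only illustrate the parsing step. Constructing an HTTPConnection parses the host without opening a network connection.

```python
import http.client


def parse_error_for(host):
    """Return the InvalidURL message raised while parsing host, or None."""
    try:
        # The constructor only parses "host:port"; it does not connect.
        http.client.HTTPConnection(host)
    except http.client.InvalidURL as exc:
        return str(exc)
    return None


# Hypothetical host whose "port" is the literal word 'port'.
print(parse_error_for("proxy.example:port"))
# A numeric port parses without error, so None is printed.
print(parse_error_for("proxy.example:8080"))
```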
If I run the same code with the https version of the link,
from urllib.request import urlopen
html = urlopen('https://pythonscraping.com/pages/page1.html')
print(html.read())
then it works:
b'<html>\n<head>\n<title>A Useful Page</title>\n</head>\n<body>\n<h1>An Interesting Title</h1>\n<div>\nLorem ipsum dolor sit amet,consectetur adipisicing elit,sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident,sunt in culpa qui officia deserunt mollit anim id est laborum.\n</div>\n</body>\n</html>\n'
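What can I do to make it work for http as well?
Thank you.
Edit: I also tried this without Anaconda and got the same error. I then created another Anaconda environment with the latest version of Python (currently 3.8.5) and hit the same error again. When I try it in an online interpreter, it works. I suspect this is related to my local setup, but I can't figure out what.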
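One local setting that could produce this error, offered as an assumption rather than a confirmed diagnosis, is a proxy entry that still contains a placeholder such as http://proxy.example:port. urllib picks up proxies from the environment (and from the system settings on Windows), and when a proxy applies, the host handed to http.client is the proxy's host rather than the URL's. If only the http entry is malformed, that would also be consistent with https requests succeeding. The sketch below lists proxy entries whose port is not numeric and builds an opener that ignores proxies entirely; the helper names has_nonnumeric_port, suspicious_proxies, and open_without_proxy are hypothetical.

```python
import urllib.request


def has_nonnumeric_port(proxy_url):
    """Return True if the text after the last ':' of the host is not a number."""
    rest = proxy_url.split("://", 1)[-1]                  # drop the scheme, if any
    hostport = rest.split("/", 1)[0].rpartition("@")[2]   # drop path and credentials
    _, sep, port = hostport.rpartition(":")
    if not sep or "]" in port:                            # no port, or a bare IPv6 literal
        return False
    return port != "" and not port.isdigit()


def suspicious_proxies(proxies=None):
    """Return the proxy entries whose port is not numeric."""
    if proxies is None:
        # Proxy settings as urllib itself detects them on this machine.
        proxies = urllib.request.getproxies()
    return {name: url for name, url in proxies.items() if has_nonnumeric_port(url)}


def open_without_proxy(url, timeout=10):
    """Open url with proxy detection disabled (an empty dict means no proxies)."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    return opener.open(url, timeout=timeout)


if __name__ == "__main__":
    print(suspicious_proxies())
```

If suspicious_proxies() returns an http entry, correcting or unsetting that setting (for example the http_proxy environment variable) would be the thing to try first. open_without_proxy('http://pythonscraping.com/pages/page1.html') is a workaround that only helps when the machine can reach the site directly.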