How to use executor.map() and executor.submit() for Python concurrency
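I am learning how to use executor.map() and executor.submit() for concurrency.
I have a list of 20 URLs and want to send all 20 requests at the same time. The problem is that the results from .submit() come back in a different order from the list I started with. I have read that map() does what I need, but I don't know how to write the code with it.
The code below works perfectly for me.
Question: is there an equivalent of the code below that uses map(), or is there a way to sort the results from submit() into the order of the given list?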
import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/', 'http://www.cnn.com/', 'http://europe.wsj.com/', 'http://www.bbc.co.uk/', 'http://some-made-up-domain.com/']

# Retrieve a single page and return its contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))
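Solution
Here is a map() version of your existing code. Note that the callback now takes a single tuple as its parameter. I added a try/except inside the callback so that reading the results does not raise an error. The results come back in the same order as the input list.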
from concurrent.futures import ThreadPoolExecutor
import urllib.request

URLS = ['http://www.foxnews.com/', 'http://www.cnn.com/', 'http://www.wsj.com/', 'http://www.bbc.co.uk/', 'http://some-made-up-domain.com/']

# Retrieve a single page and return the url and contents
def load_url(tt):  # tt is a (url, timeout) tuple
    url, timeout = tt
    try:
        with urllib.request.urlopen(url, timeout=timeout) as conn:
            return (url, conn.read())
    except Exception as ex:
        print("Error:", url, ex)
        return (url, b"")  # on error, return empty bytes so len() still works

with ThreadPoolExecutor(max_workers=5) as executor:
    results = executor.map(load_url, [(u, 60) for u in URLS])  # pass url and timeout as tuple to callback
    executor.shutdown(wait=True)  # wait for all to complete (leaving the with block would also wait)
    print("Results:")
for r in results:  # ordered results, will throw exception here if not handled in callback
    print(' %r page is %d bytes' % (r[0], len(r[1])))
Output
Error: http://www.wsj.com/ HTTP Error 404: Not Found
Results:
'http://www.foxnews.com/' page is 320028 bytes
'http://www.cnn.com/' page is 1144916 bytes
'http://www.wsj.com/' page is 0 bytes
'http://www.bbc.co.uk/' page is 279418 bytes
'http://some-made-up-domain.com/' page is 64668 bytes
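As a side note, executor.map() also accepts several iterables and passes one item from each to the function, so the original two-argument signature could be kept instead of packing a tuple. The following is only a minimal sketch under stated assumptions: fake_load and DELAYS are hypothetical stand-ins (not part of the answer above) that sleep instead of making a network request, so the example runs offline. The delays are chosen so that the tasks finish in a different order than they were submitted.

```python
import itertools
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for load_url(url, timeout): it sleeps instead of
# fetching anything, so the example needs no network access.
DELAYS = {"a": 0.3, "b": 0.1, "c": 0.2}

def fake_load(name, timeout):
    time.sleep(DELAYS[name])
    return name, timeout

names = ["a", "b", "c"]
with ThreadPoolExecutor(max_workers=3) as executor:
    # One item is taken from each iterable per call: fake_load("a", 60), ...
    # Iteration stops when the shortest iterable (names) is exhausted.
    results = list(executor.map(fake_load, names, itertools.repeat(60)))

# "b" finishes first, but results still follow the order of names.
print(results)
```

Wrapping the iterator in list() inside the with block collects every result before the pool shuts down; any exception raised by a task would surface at that point.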
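Alternatively, without using map(), you can use enumerate to build the future_to_url dictionary so that each value holds not just the URL but also its index in the list. As the futures come back from concurrent.futures.as_completed(future_to_url), store them in a second dictionary keyed by that index. You can then loop an index over the length of that dictionary to read the results in the same order as the corresponding items in the original list: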
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its index and URL
    future_to_url = {
        executor.submit(load_url, url, 60): (i, url) for i, url in enumerate(URLS)
    }
    # Collect the futures as they finish, keyed by their original index
    futures = {}
    for future in concurrent.futures.as_completed(future_to_url):
        i, url = future_to_url[future]
        futures[i] = url, future
    # Read the results back in the order of the original list
    for i in range(len(futures)):
        url, future = futures[i]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))
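If the results only need to be read in input order, and not as each one completes, a simpler variant may be enough: keep the futures in a list in submission order and call result() on each in turn. This is a sketch under assumptions, not the answer's own method. fake_load and jobs are hypothetical stand-ins that sleep instead of fetching a URL, and one job fails on purpose to show per-item error handling.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for load_url: sleeps instead of fetching a URL,
# and fails for one input to simulate an unreachable site.
def fake_load(name, delay):
    time.sleep(delay)
    if name == "bad":
        raise ValueError("simulated failure")
    return name.upper()

jobs = [("first", 0.3), ("bad", 0.1), ("third", 0.2)]
ordered = []
with ThreadPoolExecutor(max_workers=3) as executor:
    # The list keeps the futures in submission order.
    futures = [executor.submit(fake_load, name, delay) for name, delay in jobs]
    for (name, _), future in zip(jobs, futures):
        try:
            # result() blocks until this particular future is done.
            ordered.append((name, future.result()))
        except Exception as exc:
            print('%r generated an exception: %s' % (name, exc))
            ordered.append((name, None))

print(ordered)
```

All tasks still run concurrently; only the reading is sequential, so the total wait is roughly that of the slowest task.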