How to send data fields with requests.post
from bs4 import BeautifulSoup as bs #importing the required libraries
from urllib.request import urlopen
import requests
urls1="https://www.makemytrip.com/hotels/" #initial url which contains the form where we could give our preferences.
#passing the data parameters
data={'checkin': '08152020','city': 'CTGOI','checkout': '08162020','roomStayQualifier': '2e0e','locusId': 'CTGOI','country': 'IN','locusType': 'city','searchText': 'Goa,India','visitorId': '5c68c2fb-0551-4ef2-8dae-1a55bb744e66'
}
req=requests.post(urls1,data,headers={'User-Agent': 'XYZ/3.0'})
page_soup = bs(req.content,"html.parser")
print(page_soup)
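What I actually want is to scrape the hotels that match the data fields above. That is why I send the data parameter to the initial URL with requests.post: I expect the response object to contain the next page, listing the hotels that meet those criteria.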
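Solution
The site you are trying to scrape performs its search with the GET method, not POST.
It also uses a different URL for the hotel search: https://www.makemytrip.com/hotels/hotel-listing/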
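The difference between the two calls can be inspected locally, without contacting the site. The following is a minimal sketch, assuming the requests library is installed; the shortened `search` dictionary is an illustrative subset of the fields in the question, not the full set the site may expect. It builds a GET and a POST request with `requests.Request(...).prepare()` and prints where the fields end up.

```python
import requests

# Illustrative subset of the search fields from the question (assumption:
# the real search may need all of the original fields).
search = {"city": "CTGOI", "country": "IN", "checkin": "08192020"}
url = "https://www.makemytrip.com/hotels/hotel-listing/"

# prepare() only builds the request object; nothing is sent over the network.
get_req = requests.Request("GET", url, params=search).prepare()
post_req = requests.Request("POST", url, data=search).prepare()

print(get_req.url)    # the fields are appended to the URL as a query string
print(get_req.body)   # a GET built this way has no body
print(post_req.url)   # the URL is left unchanged
print(post_req.body)  # the fields are form-encoded into the request body
```

With `params=` the fields become part of the URL, which is what a GET-based search endpoint reads. With `data=` they travel in the request body, which such an endpoint would most likely ignore.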
With the example modified slightly to issue a GET request instead of a POST request, we can get the hotel listing results.
import requests
from bs4 import BeautifulSoup as bs
headers = {"User-Agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML,like Gecko) Chrome/83.0.4103.97 Safari/537.36"}
# setting a "browser" header seems to be required for this site.
data = {'checkin': '08192020','city': 'CTGOI','checkout': '08202020','roomStayQualifier': '2e0e','locusId': 'CTGOI','country': 'IN','locusType': 'city','searchText': 'Goa,India','visitorId': 'aaab4f61-2069-4033-bb97-0791f0f70'}
url = 'https://www.makemytrip.com/hotels/hotel-listing/'
# adding the params argument and supplying the dictionary of search data formats the resulting URL into something that makemytrip.com can understand.
# adding a timeout just in case makemytrip.com doesn't respond
req = requests.get(url,params=data,headers=headers,timeout=5)
page_soup = bs(req.content,'html.parser')
# this finds all the divs in the result with a class name of "listingRow".
listing_results = page_soup.findAll('div',class_='listingRow')
# this results array can then be looped through to find more details about each listing.
for listing in listing_results:
    print(listing.find("p", itemprop="name").getText())
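The loop above raises an AttributeError if a listing has no matching `<p itemprop="name">` element, because `find` returns None in that case. Below is a minimal sketch of a more defensive version, assuming Beautiful Soup is installed. The `sample_html` string and the `extract_hotel_names` helper are hypothetical and only mimic the class and attribute names used in the answer; the markup the site actually serves may differ or may have changed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that imitates the structure the answer relies on.
sample_html = """
<div class="listingRow"><p itemprop="name">Hotel A</p></div>
<div class="listingRow"><span>no name here</span></div>
<div class="listingRow"><p itemprop="name"> Hotel B </p></div>
"""

def extract_hotel_names(html):
    """Return the text of every named listing, skipping rows without a name."""
    soup = BeautifulSoup(html, "html.parser")
    names = []
    for listing in soup.find_all("div", class_="listingRow"):
        name_tag = listing.find("p", itemprop="name")
        if name_tag is None:
            # Skip rows that lack the expected element instead of crashing.
            continue
        names.append(name_tag.get_text(strip=True))
    return names

names = extract_hotel_names(sample_html)
print(names)
```

In the answer's script, the same function could be called as `extract_hotel_names(req.content)`. An empty result would then suggest that the page structure differs from what is assumed here.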