Web scraping search results in R

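I'm new to web scraping and am trying to scrape data produced by a website's search function. I'm using rvest to get the information, but I get no results. This is the website: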

https://www.encompassinsurance.com/agency-locator.aspx#PostalCode=30350&City=&StateProvCd=&Latitude=&Longitude=

This is what I'm running:

library(rvest)

URL <- 'https://www.encompassinsurance.com/agency-locator.aspx#PostalCode=21403&City=&StateProvCd=&Latitude=&Longitude='

# Download and parse the page
webpage <- read_html(URL)

# Select the result-name elements and extract their text
name_html <- html_nodes(webpage, '.locator_result_name')
name_data <- html_text(name_html)

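When I run this code, I get the following response: character(0)

I want the name of each company returned by the ZIP code search (e.g. "Townley-Kenton Insurance Agency", "Bradford Turner Insurance Group LLC").

I know there is some JavaScript on this page and that I may be missing an important piece, but given my limited knowledge of HTML, CSS, and JavaScript, I'm not sure how to apply V8 or PhantomJS to make this work.

Any help is appreciated.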

Solution

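The data is indeed fetched dynamically by JavaScript (through an XHR GET request). However, you can send that request directly from R with the httr package. It returns a JSON string, which is easy to parse with jsonlite.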
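One reason the original `read_html()` call comes back empty is that everything after the `#` in a URL is a fragment, which is never sent to the server; the page's JavaScript reads it and issues the real request. As a small illustration (not part of the original answer), `httr::parse_url()` can split the URL from the question to make this visible:

```r
library(httr)

# URL from the question: the search parameters sit after "#", i.e. in the fragment
url <- paste0(
  "https://www.encompassinsurance.com/agency-locator.aspx",
  "#PostalCode=21403&City=&StateProvCd=&Latitude=&Longitude="
)

parts <- parse_url(url)

parts$path      # the page the server is actually asked for
parts$query     # NULL: no query string is part of the request
parts$fragment  # the search parameters, seen only by client-side JavaScript
```

So rvest receives only the empty search page, without any of the result elements that the script adds later.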

Almost all of the information you want to scrape will be in the data frame info$OfficeInfo:

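library(httr)
library(jsonlite)

# Request the JSON endpoint that the page's JavaScript calls
url <- paste0("https://alr.encompassinsurance.com/",
              "?PostalCode=30350&City=&StateProvCd=&Latitude=&Longitude=")
res <- content(GET(url), as = "text", encoding = "UTF-8")
info <- fromJSON(res)  # parse the JSON string into an R list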

info$OfficeInfo$Name
#>  [1] "Townley-Kenton Insurance Agency"                          
#>  [2] "Bradford Turner Insurance Group LLC"                      
#>  [3] "Arthur J Gallagher Risk Management Services, Inc."
#>  [4] "Lanigan Insurance Group Inc"                              
#>  [5] "Haven Insurance Group"                                    
#>  [6] "The Leavitt Insurance Group of Atlanta, Incorporated"
#>  [7] "Findley Insurance Agency Inc"                             
#>  [8] "Grimes Insurance Agency Inc"                              
#>  [9] "Larry L Talbert Ins Agency DBA Talbert Insurance Services"
#> [10] "The Alliance Group, Inc."
#> [11] "Concierge Insurance Group LLC"                            
#> [12] "Sutter McLellan & Gilbreath Inc"                          
#> [13] "The Wichalonis Insurance Agency"                          
#> [14] "The Beck Agency"                                          
#> [15] "USI Insurance Services LLC"                               
#> [16] "The Insurance Store"                                      
#> [17] "Southern Insurance Associates of Dunwoody"                
#> [18] "D.C.J.D. Corporation DBA The Markey Insurance Group"      
#> [19] "DM Services, Incorporated"
#> [20] "Southern Insurance Advisors"                              
#> [21] "Metro Brokers Insurance Services"                         
#> [22] "1 Source Insurance, LLC"
#> [23] "The Bates Agency II, LLC"
#> [24] "Risk & Insurance Consultants Inc"                         
#> [25] "Integrity Insurance & Financial Services Inc"             
#> [26] "HN Insurance Services Inc"                                
#> [27] "Norton Metro LLC"                                         
#> [28] "The Nsure Network LLC"                                    
#> [29] "Henssler Norton Insurance LLC"                            
#> [30] "Brown & Brown Insurance of Georgia"                       
#> [31] "America Insurance Brokers, Inc. DBA AIB"
#> [32] "Clear View Insurance Agency"                              
#> [33] "Relation Insurance Services"                              
#> [34] "Partners Risk Services LLC"                               
#> [35] "PointeNorth Insurance Group LLC"                          
#> [36] "Advanced Insurors Inc"                                    
#> [37] "Mcever & Tribble, Inc."
#> [38] "The Bethea Insurance Group, LLC"
#> [39] "Watchko - Young Ins Agcy Inc"                             
#> [40] "Sterling Seacrest Partners Inc"                           
#> [41] "Little & Smith, Incorporated"
#> [42] "LMG Insurance Services Inc"                               
#> [43] "Granite Risk Advisors LLC"                                
#> [44] "Mountain Lakes Insurance, LLC"
#> [45] "Hutchinson Traylor Insurance"                             
#> [46] "Edgewood Partners Insurance Center"                       
#> [47] "ADC Agency"                                               
#> [48] "MLG Insurance & Financial Services"                       
#> [49] "Burnette Insurance Agency"                                
#> [50] "Campbell and Company Enterprise, Incorporated"

Created on 2020-08-19 by the reprex package (v0.3.0)
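To run the same lookup for other ZIP codes, the request can be wrapped in a small function. The following is only a sketch: `fetch_agency_names()` and `extract_agency_names()` are hypothetical helper names, the endpoint and parameter names are copied from the answer above, and the response is assumed to keep the `OfficeInfo`/`Name` layout shown there. The endpoint may have changed since the answer was written.

```r
library(httr)
library(jsonlite)

# Hypothetical helper: pull the agency names out of the JSON text returned by
# the locator endpoint. Assumes a top-level "OfficeInfo" array of objects that
# each carry a "Name" field, as in the output above.
extract_agency_names <- function(json_text) {
  info <- fromJSON(json_text)
  agency_names <- info$OfficeInfo$Name
  if (is.null(agency_names)) character(0) else agency_names
}

# Hypothetical helper: query the endpoint used in the answer for one ZIP code.
fetch_agency_names <- function(postal_code) {
  resp <- GET(
    "https://alr.encompassinsurance.com/",
    query = list(
      PostalCode = postal_code, City = "", StateProvCd = "",
      Latitude = "", Longitude = ""
    )
  )
  stop_for_status(resp)  # fail loudly on HTTP errors instead of parsing an error page
  extract_agency_names(content(resp, as = "text", encoding = "UTF-8"))
}

# Offline check of the parsing step with a made-up, two-record payload
sample_json <- '{"OfficeInfo":[{"Name":"Agency A"},{"Name":"Agency B"}]}'
extract_agency_names(sample_json)
```

Calling `fetch_agency_names("30350")` would then perform the live request. Keeping the parsing step in its own function means it can be checked without network access.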
