How to fix a title <H1> scraped via VBA coming back in a different encoding than expected
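I'm using the function below to scrape a website's title through VBA. However, certain characters, such as dashes and apostrophes, come back as HTML entities rather than the characters themselves: for example, the dash is returned as &#8211; and "d'ivoire" is returned as "d&#039;ivoire". The title I'm fetching can be found here: https://www.presidence.ci/communiques-du-conseil-des-ministres/. Is there a way to fix this by changing the existing function?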
Function fgetMetaTitle(ByVal strURL) As String
    Dim stPnt As Long, x As String
    Dim oXH As Object
    'Get URL's HTML Source
    Set oXH = CreateObject("msxml2.xmlhttp")
    With oXH
        .Open "get", strURL, False
        .send
        x = .responseText
    End With
    Set oXH = Nothing
    'Parse HTML Source for Title (search the upper-cased copy so the match is case-insensitive)
    If InStr(1, UCase(x), "<TITLE>") Then
        stPnt = InStr(1, UCase(x), "<TITLE>") + Len("<TITLE>")
        fgetMetaTitle = Mid(x, stPnt, InStr(stPnt, UCase(x), "</TITLE>") - stPnt)
    Else
        fgetMetaTitle = ""
    End If
End Function
Answer
How about something like this (I added handlers for the dash and the apostrophe):
Function fgetMetaTitle(ByVal strURL) As String
    Dim stPnt As Long, x As String
    Dim oXH As Object
    'Get URL's HTML Source
    Set oXH = CreateObject("msxml2.xmlhttp")
    With oXH
        .Open "get", strURL, False
        .send
        x = .responseText
    End With
    Set oXH = Nothing
    'Parse HTML Source for Title (search the upper-cased copy so the match is case-insensitive)
    If InStr(1, UCase(x), "<TITLE>") Then
        stPnt = InStr(1, UCase(x), "<TITLE>") + Len("<TITLE>")
        fgetMetaTitle = Mid(x, stPnt, InStr(stPnt, UCase(x), "</TITLE>") - stPnt)
        'This will handle the apostrophe:
        fgetMetaTitle = Replace(fgetMetaTitle, "&#039;", "'")
        'This will handle the dash:
        fgetMetaTitle = Replace(fgetMetaTitle, "&#8211;", "-")
    Else
        fgetMetaTitle = ""
    End If
End Function
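The two Replace calls only cover the entities that show up in this particular title. As a rough sketch of a more general approach, the helper below decodes any decimal numeric entity (&#NNN;) it finds in a string. The name DecodeNumericEntities is made up for illustration and is not part of the original answer; named entities such as &amp; and hexadecimal ones such as &#x2013; are deliberately left untouched.

```vba
'Hypothetical helper: replaces decimal numeric entities (&#NNN;) with the
'corresponding character. Code points above 65535 are left as-is because
'ChrW cannot represent them.
Function DecodeNumericEntities(ByVal s As String) As String
    Dim pos As Long, semi As Long
    Dim code As String
    Dim decoded As Boolean

    pos = InStr(1, s, "&#")
    Do While pos > 0
        decoded = False
        semi = InStr(pos, s, ";")
        If semi > pos + 2 Then
            code = Mid$(s, pos + 2, semi - pos - 2)
            'Accept only 1 to 5 plain digits
            If Len(code) <= 5 And Not (code Like "*[!0-9]*") Then
                If CLng(code) <= 65535 Then
                    s = Left$(s, pos - 1) & ChrW$(CLng(code)) & Mid$(s, semi + 1)
                    decoded = True
                End If
            End If
        End If
        'Continue after the decoded character, or skip past an unusable "&#"
        If decoded Then
            pos = InStr(pos + 1, s, "&#")
        Else
            pos = InStr(pos + 2, s, "&#")
        End If
    Loop
    DecodeNumericEntities = s
End Function
```

It could be applied to the extracted title in place of the individual Replace calls, e.g. fgetMetaTitle = DecodeNumericEntities(fgetMetaTitle). Note that this keeps the real en dash (character 8211) rather than converting it to a plain hyphen.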
Also take a look at: HTML Entities
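If named entities need handling as well as numeric ones, one option is to let the HTML parser that ships with Windows do the decoding. This is a minimal sketch, assuming the late-bound "htmlfile" (MSHTML) object can be created on the machine; DecodeHtmlText is an illustrative name, not something from the original answer. Any tags in the input are stripped as a side effect of reading innerText.

```vba
'Hypothetical helper: loads the text into an in-memory HTML document and
'reads it back as plain text, which resolves HTML entities.
'Assumes CreateObject("htmlfile") is available (Windows with MSHTML).
Function DecodeHtmlText(ByVal s As String) As String
    Dim doc As Object
    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = s
    DecodeHtmlText = doc.body.innerText
    Set doc = Nothing
End Function
```

With this in place, the last step of the function would be something like fgetMetaTitle = DecodeHtmlText(fgetMetaTitle).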