利用正则表达式获取博客园随笔一

晚上起先和朋友们跑步去了,然后回来之后洗了个澡,打开VS新建项目发现都会弹出一个问题

然后就去找万能的度娘了,http://bbs.csdn.net/topics/390514964?page=1#post-395015041

25楼真相,卸载掉那2个补丁就可以了,不过在卸载第一个补丁的时候你需要停止他指出的那个服务。

我当初刚开始接触正则是去年公司主管让我去学,然后发了个网址给我:http://www.cnblogs.com/ie421/archive/2008/07/23/1249896.html

看完后收益颇大,下面就开始正题。

之所以要获取博客园的内容是因为博客园造就了我,而大家也都是在博客园里相识,所以我们就以博客园为例子。

下面上传的这个是当初主管给我的一个类,大家可以参考参考,我今天的内容用到了里面的GetString()这个方法。在运行之前要引用System.Web

  1 using System;
  2 using System.Collections.Generic;
  3 using System.IO;
  4 using System.IO.Compression;
  5 using System.Linq;
  6 using System.Net;
  7 using System.Text;
  8 using System.Web;
  9 
 10 namespace CnblogsSearch
 11 {
 12     public class HttpClient
 13     {
 14         #region fields
 15         private bool keepContext;
 16         private string defaultLanguage = "zh-CN";
 17         private Encoding defaultEncoding = Encoding.UTF8;
 18         private string accept = "*/*";
 19         private string userAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)";
 20         private HttpVerb verb = HttpVerb.GET;
 21         private HttpClientContext context;
 22         private readonly List<HttpUploadingFile> files = new List<HttpUploadingFile>();
 23         private readonly Dictionary<string,string> postingData = new Dictionary<string,string>();
 24         private string url;
 25         private WebHeaderCollection responseHeaders;
 26         private int startPoint;
 27         private int endPoint;
 28         public bool boundaryed;
 29         private string encodingType = "utf-8";
 30         private int timeOut = 10000;
 31 
 32         #endregion
 33 
 34         #region events
 35         public event EventHandler<StatusUpdateEventArgs> StatusUpdate;
 36 
 37         private void OnStatusUpdate(StatusUpdateEventArgs e)
 38         {
 39             EventHandler<StatusUpdateEventArgs> temp = StatusUpdate;
 40 
 41             if (temp != null)
 42                 temp(this,e);
 43         }
 44         #endregion
 45 
 46         #region properties
 47 
 48         public string EncodingType
 49         {
 50             get
 51             {
 52                 return encodingType;
 53             }
 54             set
 55             {
 56                 encodingType = value;
 57             }
 58         }
 59 
 60         /// <summary>
 61         /// 是否启用gzip压缩传输
 62         /// </summary>
 63         public bool IsGzip { get; set; }
 64 
 65         /// <summary>
 66         /// 是否在数据流中编码
 67         /// </summary>
 68         public bool encodeMemory { get; set; }
 69         /// <summary>
 70         /// 是否自动在不同的请求间保留Cookie,Referer
 71         /// </summary>
 72         public bool KeepContext
 73         {
 74             get { return keepContext; }
 75             set { keepContext = value; }
 76         }
 77         public CookieContainer cookie;
 78         /// <summary>
 79         /// 期望的回应的语言
 80         /// </summary>
 81         public string DefaultLanguage
 82         {
 83             get { return defaultLanguage; }
 84             set { defaultLanguage = value; }
 85         }
 86 
 87         /// <summary>
 88         /// GetString()如果不能从HTTP头或Meta标签中获取编码信息,则使用此编码来获取字符串
 89         /// </summary>
 90         public Encoding DefaultEncoding
 91         {
 92             get { return defaultEncoding; }
 93             set { defaultEncoding = value; }
 94         }
 95 
 96         public int TimeOut
 97         {
 98             get
 99             {
100                 return timeOut;
101             }
102             set
103             {
104                 timeOut = value;
105             }
106         }
107         /// <summary>
108         /// 指示发出Get请求还是Post请求
109         /// </summary>
110         public HttpVerb Verb
111         {
112             get { return verb; }
113             set { verb = value; }
114         }
115 
116         /// <summary>
117         /// 要上传的文件.如果不为空则自动转为Post请求
118         /// </summary>
119         public List<HttpUploadingFile> Files
120         {
121             get { return files; }
122         }
123 
124         public List<RepeatPostData> repeatPostData
125         {
126             get;
127             set;
128         }
129 
130         /// <summary>
131         /// 要发送的Form表单信息
132         /// </summary>
133         public Dictionary<string,string> PostingData
134         {
135 
136             get { return postingData; }
137         }
138 
139         /// <summary>
140         /// 获取或设置请求资源的地址
141         /// </summary>
142         public string Url
143         {
144             get { return url; }
145             set { url = value; }
146         }
147 
148         /// <summary>
149         /// 用于在获取回应后,暂时记录回应的HTTP头
150         /// </summary>
151         public WebHeaderCollection ResponseHeaders
152         {
153             get { return responseHeaders; }
154         }
155 
156         /// <summary>
157         /// 获取或设置期望的资源类型
158         /// </summary>
159         public string Accept
160         {
161             get { return accept; }
162             set { accept = value; }
163         }
164 
165         /// <summary>
166         /// 获取或设置请求中的Http头User-Agent的值
167         /// </summary>
168         public string UserAgent
169         {
170             get { return userAgent; }
171             set { userAgent = value; }
172         }
173 
174         /// <summary>
175         /// 获取或设置Cookie及Referer
176         /// </summary>
177         public HttpClientContext Context
178         {
179             get { return context; }
180             set { context = value; }
181         }
182 
183         /// <summary>
184         /// 获取或设置获取内容的起始点,用于断点续传,多线程下载等
185         /// </summary>
186         public int StartPoint
187         {
188             get { return startPoint; }
189             set { startPoint = value; }
190         }
191 
192         /// <summary>
193         /// 获取或设置获取内容的结束点,多下程下载等.
194         /// 如果为0,表示获取资源从StartPoint开始的剩余内容
195         /// </summary>
196         public int EndPoint
197         {
198             get { return endPoint; }
199             set { endPoint = value; }
200         }
201 
202 
203 
204         #endregion
205 
206         #region constructors
207         /// <summary>
208         /// 构造新的HttpClient实例
209         /// </summary>
210         public HttpClient()
211             : this(null)
212         {
213         }
214 
215         /// <summary>
216         /// 构造新的HttpClient实例
217         /// </summary>
218         /// <param name="url">要获取的资源的地址</param>
219         public HttpClient(string url)
220             : this(url,null)
221         {
222         }
223 
224         /// <summary>
225         /// 构造新的HttpClient实例
226         /// </summary>
227         /// <param name="url">要获取的资源的地址</param>
228         /// <param name="context">Cookie及Referer</param>
229         public HttpClient(string url,HttpClientContext context)
230             : this(url,context,false)
231         {
232         }
233 
234         /// <summary>
235         /// 构造新的HttpClient实例
236         /// </summary>
237         /// <param name="url">要获取的资源的地址</param>
238         /// <param name="context">Cookie及Referer</param>
239         /// <param name="keepContext">是否自动在不同的请求间保留Cookie,Referer</param>
240         public HttpClient(string url,HttpClientContext context,bool keepContext)
241         {
242             this.url = url;
243             this.context = context;
244             this.keepContext = keepContext;
245             if (this.context == null)
246                 this.context = new HttpClientContext();
247             cookie = new CookieContainer();
248         }
249         #endregion
250 
251         #region AttachFile
252         /// <summary>
253         /// 在请求中添加要上传的文件
254         /// </summary>
255         /// <param name="fileName">要上传的文件路径</param>
256         /// <param name="fieldName">文件字段的名称(相当于&lt;input type=file name=fieldName&gt;)里的fieldName)</param>
257         public void AttachFile(string fileName,string fieldName)
258         {
259             HttpUploadingFile file = new HttpUploadingFile(fileName,fieldName);
260             files.Add(file);
261         }
262 
263         /// <summary>
264         /// 在请求中添加要上传的文件
265         /// </summary>
266         /// <param name="data">要上传的文件内容</param>
267         /// <param name="fileName">文件名</param>
268         /// <param name="fieldName">文件字段的名称(相当于&lt;input type=file name=fieldName&gt;)里的fieldName)</param>
269         public void AttachFile(byte[] data,string fileName,string fieldName)
270         {
271             HttpUploadingFile file = new HttpUploadingFile(data,fileName,fieldName);
272             files.Add(file);
273         }
274         #endregion
275 
276         /// <summary>
277         /// 清空PostingData,Files,StartPoint,EndPoint,ResponseHeaders,并把Verb设置为Get.
278         /// 在发出一个包含上述信息的请求后,必须调用此方法或手工设置相应属性以使下一次请求不会受到影响.
279         /// </summary>
280         public void Reset()
281         {
282             verb = HttpVerb.GET;
283             files.Clear();
284             postingData.Clear();
285             responseHeaders = null;
286             startPoint = 0;
287             endPoint = 0;
288             IsGzip = false;
289             if (repeatPostData != null) repeatPostData.Clear();
290         }
291         public string ip;
292         private IPEndPoint BindIPEndPointCallback(ServicePoint servicePoint,IPEndPoint remoteEndPoint,int retryCount)
293         {
294             return new IPEndPoint(IPAddress.Parse(ip),0);
295         }
296 
297         public string cookieStr = "";
298 
299         private HttpWebRequest CreateRequest()
300         {
301             HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
302 
303             req.CookieContainer = cookie;
304             //req.Headers.Add("Accept-Language",defaultLanguage);
305             req.Accept = accept;
306             req.UserAgent = userAgent;
307             req.KeepAlive = true;
308             req.AllowAutoRedirect = true;
309             req.Timeout = TimeOut;
310 
311 
312 
313             if (IsGzip)
314             {
315                 req.Headers.Add("Accept-Encoding","gzip");
316             }
317 
318             if (ip != null)
319             {
320                 req.ServicePoint.BindIPEndPointDelegate = new BindIPEndPoint(BindIPEndPointCallback);
321             }
322             if (context.Cookies != null)
323                 req.CookieContainer.Add(context.Cookies);
324             if (!string.IsNullOrEmpty(context.Referer))
325                 req.Referer = context.Referer;
326 
327             if (verb == HttpVerb.HEAD)
328             {
329                 req.Method = "HEAD";
330                 return req;
331             }
332 
333             if (postingData.Count > 0 || files.Count > 0)
334                 verb = HttpVerb.POST;
335             if (cookieStr != "") req.Headers.Add("Cookie",cookieStr);
336             if (verb == HttpVerb.POST)
337             {
338                 req.Method = "POST";
339 
340                 MemoryStream memoryStream = new MemoryStream();
341 
342                 StreamWriter writer;
343                 if (encodeMemory)
344                 {
345                     writer = new StreamWriter(memoryStream,Encoding.GetEncoding(EncodingType));
346                 }
347                 else
348                     writer = new StreamWriter(memoryStream);
349 
350                 if (files.Count > 0 || boundaryed)
351                 {
352                     string newLine = "\r\n";
353                     string boundary = Guid.NewGuid().ToString().Replace("-","");
354                     req.ContentType = "multipart/form-data; boundary=" + boundary;
355 
356                     foreach (string key in postingData.Keys)
357                     {
358                         writer.Write("--" + boundary + newLine);
359                         writer.Write("Content-Disposition: form-data; name=\"{0}\"{1}{1}",key,newLine);
360                         writer.Write(postingData[key] + newLine);
361                     }
362 
363 
364 
365                     foreach (HttpUploadingFile file in files)
366                     {
367                         writer.Write("--" + boundary + newLine);
368                         writer.Write(
369                             "Content-Disposition: form-data; name=\"{0}\"; filename=\"{1}\"{2}",370                             file.FieldName,371                             file.FileName,372                             newLine
373                             );
374                         writer.Write("Content-Type: image/jpeg" + newLine + newLine);
375                         writer.Flush();
376                         memoryStream.Write(file.Data,0,file.Data.Length);
377                         writer.Write(newLine);
378                         writer.Write("--" + boundary + "--" + newLine);
379                     }
380 
381                 }
382                 else
383                 {
384                     req.ContentType = "application/x-www-form-urlencoded";
385                     StringBuilder sb = new StringBuilder();
386                     foreach (string key in postingData.Keys)
387                     {
388                         sb.AppendFormat("{0}={1}&",HttpUtility.UrlEncode(key,Encoding.GetEncoding(EncodingType)),HttpUtility.UrlEncode(postingData[key],Encoding.GetEncoding(EncodingType)));
389                     }
390 
391                     if (repeatPostData != null)
392                     {
393                         foreach (var item in repeatPostData)
394                         {
395                             sb.AppendFormat("{0}={1}&",HttpUtility.UrlEncode(item.key,HttpUtility.UrlEncode(item.value,Encoding.GetEncoding(EncodingType)));
396                         }
397                     }
398 
399                     if (sb.Length > 0)
400                         sb.Length--;
401                     writer.Write(sb.ToString());
402                 }
403 
404                 writer.Flush();
405 
406                 using (Stream stream = req.GetRequestStream())
407                 {
408                     memoryStream.WriteTo(stream);
409                 }
410             }
411 
412             if (startPoint != 0 && endPoint != 0)
413                 req.AddRange(startPoint,endPoint);
414             else if (startPoint != 0 && endPoint == 0)
415                 req.AddRange(startPoint);
416 
417             return req;
418         }
419 
420         /// <summary>
421         /// 发出一次新的请求,并返回获得的回应
422         /// 调用此方法永远不会触发StatusUpdate事件.
423         /// </summary>
424         /// <returns>相应的HttpWebResponse</returns>
425         public HttpWebResponse GetResponse()
426         {
427 
428             HttpWebRequest req = CreateRequest();
429             HttpWebResponse res = null;
430             try
431             {
432                 res = (HttpWebResponse)req.GetResponse();
433 
434 
435                 responseHeaders = res.Headers;
436                 if (keepContext)
437                 {
438                     context.Cookies = res.Cookies;
439                     context.Referer = url;
440                     cookie.Add(context.Cookies);
441                 }
442             }
443             catch (Exception)
444             { throw; }
445             return res;
446 
447         }
448 
449         /// <summary>
450         /// 发出一次新的请求,并返回回应内容的流
451         /// 调用此方法永远不会触发StatusUpdate事件.
452         /// </summary>
453         /// <returns>包含回应主体内容的流</returns>
454         public Stream GetStream()
455         {
456             return GetResponse().GetResponseStream();
457         }
458         public string responseURL;
459         /// <summary>
460         /// 发出一次新的请求,并以字节数组形式返回回应的内容
461         /// 调用此方法会触发StatusUpdate事件
462         /// </summary>
463         /// <returns>包含回应主体内容的字节数组</returns>
464         public byte[] GetBytes()
465         {
466             byte[] result = new byte[] { 0,1 };
467             try
468             {
469                 HttpWebResponse res = GetResponse();
470                 int length = (int)res.ContentLength;
471                 responseURL = res.ResponseUri.AbsoluteUri;
472                 MemoryStream memoryStream = new MemoryStream();
473                 byte[] buffer = new byte[0x100];
474                 Stream rs = res.GetResponseStream();
475                 for (int i = rs.Read(buffer,0,buffer.Length); i > 0; i = rs.Read(buffer,buffer.Length))
476                 {
477                     memoryStream.Write(buffer,i);
478                     OnStatusUpdate(new StatusUpdateEventArgs((int)memoryStream.Length,length));
479                 }
480                 rs.Close();
481                 result = memoryStream.ToArray();
482             }
483             catch (Exception)
484             {
485                 throw;
486             }
487 
488             return result;
489         }
490 
491         /// <summary>
492         /// 发出一次新的请求,以Http头,或Html Meta标签,或DefaultEncoding指示的编码信息对回应主体解码
493         /// 调用此方法会触发StatusUpdate事件
494         /// </summary>
495         /// <returns>解码后的字符串</returns>
496         public string GetString()
497         {
498             byte[] data = GetBytes();
499             if (responseHeaders.AllKeys.Contains<string>("Content-Encoding") && responseHeaders["Content-Encoding"].Contains("gzip"))
500             {
501                 //Console.WriteLine(responseHeaders["Content-Encoding"].ToString());
502                 data = GZipDecompress(data);
503             }
504 
505             string encodingName = GetEncodingFromHeaders();
506 
507             if (encodingName == null)
508                 encodingName = GetEncodingFromBody(data);
509 
510             Encoding encoding;
511             if (encodingName == null)
512                 encoding = defaultEncoding;
513             else
514             {
515                 try
516                 {
517                     encoding = Encoding.GetEncoding(encodingName);
518                 }
519                 catch (ArgumentException)
520                 {
521                     encoding = defaultEncoding;
522                 }
523             }
524             return encoding.GetString(data);
525         }
526 
527         /// <summary>
528         /// 发出一次新的请求,对回应的主体内容以指定的编码进行解码
529         /// 调用此方法会触发StatusUpdate事件
530         /// </summary>
531         /// <param name="encoding">指定的编码</param>
532         /// <returns>解码后的字符串</returns>
533         public string GetString(Encoding encoding)
534         {
535             byte[] data = GetBytes();
536             return encoding.GetString(data);
537         }
538 
539         /// <summary>
540         /// GZip解压函数
541         /// </summary>
542         /// <param name="data"></param>
543         /// <returns></returns>
544         private byte[] GZipDecompress(byte[] data)
545         {
546             using (MemoryStream stream = new MemoryStream())
547             {
548                 using (GZipStream gZipStream = new GZipStream(new MemoryStream(data),CompressionMode.Decompress))
549                 {
550                     byte[] bytes = new byte[40960];
551                     int n;
552                     while ((n = gZipStream.Read(bytes,bytes.Length)) != 0)
553                     {
554                         stream.Write(bytes,n);
555                     }
556                     gZipStream.Close();
557                 }
558 
559                 return stream.ToArray();
560             }
561         }
562 
563         private string GetEncodingFromHeaders()
564         {
565             string encoding = null;
566             try
567             {
568                 string contentType = responseHeaders["Content-Type"];
569                 if (contentType != null)
570                 {
571                     int i = contentType.IndexOf("charset=");
572                     if (i != -1)
573                     {
574                         encoding = EncodingType = contentType.Substring(i + 8);
575                     }
576                 }
577             }
578             catch (Exception)
579             { }
580             return encoding;
581         }
582 
583         private string GetEncodingFromBody(byte[] data)
584         {
585             //string encodingName = null;
586             string dataAsAscii = Encoding.ASCII.GetString(data);
587             if (dataAsAscii != null)
588             {
589                 int i = dataAsAscii.IndexOf("charset=");
590                 if (i != -1)
591                 {
592                     int j = dataAsAscii.IndexOf("\"",i);
593                     if (j != -1)
594                     {
595                         int k = i + 8;
596                         EncodingType = dataAsAscii.Substring(k,(j - k) + 1);
597                         char[] chArray = new char[2] { '>','"' };
598                         EncodingType = EncodingType.TrimEnd(chArray);
599                     }
600                 }
601             }
602             return EncodingType;
603         }
604 
605         /// <summary>
606         /// 发出一次新的Head请求,获取资源的长度
607         /// 此请求会忽略PostingData,Verb
608         /// </summary>
609         /// <returns>返回的资源长度</returns>
610         public int HeadContentLength()
611         {
612             Reset();
613             HttpVerb lastVerb = verb;
614             verb = HttpVerb.HEAD;
615             using (HttpWebResponse res = GetResponse())
616             {
617                 verb = lastVerb;
618                 return (int)res.ContentLength;
619             }
620         }
621 
622         /// <summary>
623         /// 发出一次新的请求,把回应的主体内容保存到文件
624         /// 调用此方法会触发StatusUpdate事件
625         /// 如果指定的文件存在,它会被覆盖
626         /// </summary>
627         /// <param name="fileName">要保存的文件路径</param>
628         public void SaveAsFile(string fileName)
629         {
630             SaveAsFile(fileName,FileExistsAction.Overwrite);
631         }
632 
633         /// <summary>
634         /// 发出一次新的请求,把回应的主体内容保存到文件
635         /// 调用此方法会触发StatusUpdate事件
636         /// </summary>
637         /// <param name="fileName">要保存的文件路径</param>
638         /// <param name="existsAction">指定的文件存在时的选项</param>
639         /// <returns>是否向目标文件写入了数据</returns>
640         public bool SaveAsFile(string fileName,FileExistsAction existsAction)
641         {
642             byte[] data = GetBytes();
643             switch (existsAction)
644             {
645                 case FileExistsAction.Overwrite:
646                     using (BinaryWriter writer = new BinaryWriter(new FileStream(fileName,FileMode.OpenOrCreate,FileAccess.Write)))
647                         writer.Write(data);
648                     return true;
649 
650                 case FileExistsAction.Append:
651                     using (BinaryWriter writer = new BinaryWriter(new FileStream(fileName,FileMode.Append,FileAccess.Write)))
652                         writer.Write(data);
653                     return true;
654 
655                 default:
656                     if (!File.Exists(fileName))
657                     {
658                         using (
659                             BinaryWriter writer =
660                                 new BinaryWriter(new FileStream(fileName,FileMode.Create,FileAccess.Write)))
661                             writer.Write(data);
662                         return true;
663                     }
664                     else
665                     {
666                         return false;
667                     }
668             }
669         }
670     }
671 
672     public class HttpClientContext
673     {
674         private CookieCollection cookies;
675         private string referer;
676 
677         public CookieCollection Cookies
678         {
679             get { return cookies; }
680             set { cookies = value; }
681         }
682 
683         public string Referer
684         {
685             get { return referer; }
686             set { referer = value; }
687         }
688     }
689 
690     public class RepeatPostData
691     {
692         public string key { get; set; }
693         public string value { get; set; }
694     }
695 
696     public enum HttpVerb
697     {
698         GET,699         POST,700         HEAD,701     }
702 
703     public enum FileExistsAction
704     {
705         Overwrite,706         Append,707         Cancel,708     }
709 
710     public class HttpUploadingFile
711     {
712         private string fileName;
713         private string fieldName;
714         private byte[] data;
715 
716         public string FileName
717         {
718             get { return fileName; }
719             set { fileName = value; }
720         }
721 
722         public string FieldName
723         {
724             get { return fieldName; }
725             set { fieldName = value; }
726         }
727 
728         public byte[] Data
729         {
730             get { return data; }
731             set { data = value; }
732         }
733 
734         public HttpUploadingFile(string fileName,string fieldName)
735         {
736             this.fileName = fileName;
737             this.fieldName = fieldName;
738             using (FileStream stream = new FileStream(fileName,FileMode.Open))
739             {
740                 byte[] inBytes = new byte[stream.Length];
741                 stream.Read(inBytes,inBytes.Length);
742                 data = inBytes;
743             }
744         }
745 
746         public HttpUploadingFile(byte[] data,string fieldName)
747         {
748             this.data = data;
749             this.fileName = fileName;
750             this.fieldName = fieldName;
751         }
752     }
753 
754     public class StatusUpdateEventArgs : EventArgs
755     {
756         private readonly int bytesGot;
757         private readonly int bytesTotal;
758 
759         public StatusUpdateEventArgs(int got,int total)
760         {
761             bytesGot = got;
762             bytesTotal = total;
763         }
764 
765         /// <summary>
766         /// 已经下载的字节数
767         /// </summary>
768         public int BytesGot
769         {
770             get { return bytesGot; }
771         }
772 
773         /// <summary>
774         /// 资源的总字节数
775         /// </summary>
776         public int BytesTotal
777         {
778             get { return bytesTotal; }
779         }
780     }
781 }
View Code

然后我们先根据这个方法获取博客园首页的源代码

 1         /// <summary>
 2         /// 根据网址获取页面源码
 3         /// </summary>
 4         /// <param name="url"></param>
 5         /// <returns></returns>
 6         public string GetHtml(string url)
 7         {
 8             string ContentHtml = "";
 9             try
10             {
11                 HttpClient hc = new HttpClient();
12                 hc.Url = url;
13                 if (!hc.Url.Contains("http://"))//如果输入的网址没有包含http:// 则手动添加
14                 {
15                     hc.Url = "http://" + hc.Url;
16                 }
17                 ContentHtml = hc.GetString();
18             }
19             catch (Exception e)//如果上面的执行出错,则返回继续执行
20             {
21                 return GetHtml(url);
22             }
23             return ContentHtml;
24         }
View Code

然后再观察每条随笔的规律,我们发现没条的开头是<div class="post_item_body">,结尾是<div class="clear">,那我们就可以根据这个规律来写出正则:Regex regexContent = new Regex("<div class=\"post_item_body\">(?<content>.*?)<div class=\"clear\"></div>",RegexOptions.Singleline);
然后可以使用这个正则来获取我们需要匹配的内容了

1         string Html= GetHtml("http://www.cnblogs.com/");
2             Regex regexContent = new Regex("<div class=\"post_item_body\">(?<content>.*?)<div class=\"clear\"></div>",RegexOptions.Singleline);
3             string blog = regexContent.Match(Html).Groups["content"].Value.ToString();

在这里我用到的正则匹配工具是Expresso,有需要的朋友可以留言。当然,如果我有什么地方写的不好的,欢迎各位指出。晚上就先到这里了,该洗洗睡了。

版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 dio@foxmail.com 举报,一经查实,本站将立刻删除。

相关推荐


jquery.validate使用攻略(表单校验) 目录 jquery.validate使用攻略1 第一章&#160;jquery.validate使用攻略1 第二章&#160;jQuery.validate.js API7 Custom selectors7 Utilities8 Validato
/\s+/g和/\s/g的区别 正则表达式/\s+/g和/\s/g,目的均是找出目标字符串中的所有空白字符,但两者到底有什么区别呢? 我们先来看下面一个例子: let name = &#39;ye wen jun&#39;;let ans = name.replace(/\s/g, &#39;&#3
自整理几个jquery.Validate验证正则: 1. 只能输入数字和字母 /^[0-9a-zA-Z]*$/g jQuery.validator.addMethod(&quot;letters&quot;, function (value, element) { return this.optio
this.optional(element)的用法 this.optional(element)是jquery.validator.js表单验证框架中的一个函数,用于表单控件的值不为空时才触发验证。 简单来说,就是当表单控件值为空的时候不会进行表单校验,此函数会返回true,表示校验通过,当表单控件
jQuery.validate 表单动态验证 实际上jQuery.validate提供了动态校验的方法。而动态拼JSON串的方式是不支持动态校验的。牺牲jQuery.validate的性能优化可以实现(jQuery.validate的性能优化见图1.2 jQuery.validate源码 )。 也可
自定义验证之这能输入数字(包括小数 负数 ) &lt;script type=&quot;text/javascript&quot;&gt; function onlyNumber(obj){ //得到第一个字符是否为负号 var t = obj.value.charAt(0); //先把非数字的都
// 引入了外部的验证规则 import { validateAccountNumber } from &quot;@/utils/validate&quot;; validator.js /*是否合法IP地址*/ export function validateIP(rule, value,cal
VUE开发--表单验证(六十三) 一、常用验证方式 vue 中表单字段验证的写法和方式有多种,常用的验证方式有3种: data 中验证 表单内容: &lt;!-- 表单 --&gt; &lt;el-form ref=&quot;rulesForm&quot; :rules=&quot;formRul
正则表达式 座机的: 例子: 座机有效写法: 0316-8418331 (010)-67433539 (010)67433539 010-67433539 (0316)-8418331 (0316)8418331 正则表达式写法 0\d{2,3}-\d{7,8}|\(?0\d{2,3}[)-]?\d
var reg = /^0\.[1-9]{0,2}$/;var linka = 0.1;console.log (reg.test (linka)); 0到1两位小数正则 ^(0\.(0[1-9]|[1-9]{1,2}|[1-9]0)$)|^1$ 不含0、0.0、0.00 // 验证是否是[1-10
input最大长度限制问题 &lt;input type=&quot;text&quot; maxlength=&quot;5&quot; /&gt; //可以 &lt;input type=&quot;number&quot; maxlength=&quot;5&quot; /&gt; //没有效
js输入验证是否为空、是否为null、是否都是空格 目录 1.截头去尾 trim 2.截头去尾 会去掉开始和结束的空格,类似于trim 3.会去掉所有的空格,包括开始,结束,中间 1.截头去尾 trim str=str.trim(); // 强烈推荐 最常用、最实用 or $.trim(str);
正则表达式语法大全 字符串.match(正则):返回符合的字符串,若不满足返回null 字符串.search(正则):返回搜索到的位置,若非一个字符,则返回第一个字母的下标,若不匹配则返回-1 字符串.replace(正则,新的字符串):找到符合正则的内容并替换 正则.test(字符串):在字符串中
正整数正则表达式正数的正则表达式(包括0,小数保留两位): ^((0{1}.\d{1,2})|([1-9]\d.{1}\d{1,2})|([1-9]+\d)|0)$正数的正则表达式(不包括0,小数保留两位): ^((0{1}.\d{1,2})|([1-9]\d.{1}\d{1,2})|([1-9]+
JS 正则验证 test() /*用途:检查输入手机号码是否正确输入:s:字符串返回:如果通过验证返回true,否则返回false /function checkMobile(s){var regu =/[1][3][0-9]{9}$/;var re = new RegExp(regu);if (r
请输入保留两位小数的销售价的正则: /(^[1-9]([0-9]+)?(\.[0-9]{1,2})?$)|(^(0){1}$)|(^[0-9]\.[0-9]([0-9])?$)/ 1.只能输入英文 &lt;input type=&quot;text&quot; onkeyup=&quot;value
判断价格的正则表达式 价格的正则表达式 /(^[1-9]\d*(\.\d{1,2})?$)|(^0(\.\d{1,2})?$)/; 1 解析:价格符合两种格式 ^ [1-9]\d*(.\d{1,2})?$ : 1-9 开头,后跟是 0-9,可以跟小数点,但小数点后要带上 1-2 位小数,类似 2,2
文章浏览阅读106次。这篇文章主要介绍了最实用的正则表达式整理,比如校验邮箱的正则,号码相关,数字相关等等,本文给大家列举的比较多,需要的朋友可以参考下。_/^(?:[1-9]d*)$/ 手机号
文章浏览阅读1.2k次。4、匹配中的==、an==、== an9、i9 == "9i"和99p==请注意下面这部分的作用,它在匹配中间内容的时候排除了说明:当html字符串如下时,可以匹配到两处,表示匹配的字符串不包含and且不包含空白字符。说明:在上面的正则表达式中,_gvim正则表达式匹配不包含某个字符串
文章浏览阅读897次。【代码】正则表达式匹配a标签的href。_auto.js 正则匹配herf