<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.2.2" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
	<title>Comments on: How to avoid VBScript regular expression gotchas</title>
	<link>http://www.xaprb.com/blog/2005/11/04/vbscript-regular-expression-gotchas/</link>
	<description>Stay curious!</description>
	<pubDate>Thu, 28 Aug 2008 23:34:23 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.2.2</generator>

	<item>
		<title>By: Samantha Small</title>
		<link>http://www.xaprb.com/blog/2005/11/04/vbscript-regular-expression-gotchas/#comment-14547</link>
		<author>Samantha Small</author>
		<pubDate>Mon, 12 May 2008 16:53:47 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2005/11/04/vbscript-regular-expression-gotchas/#comment-14547</guid>
		<description>What Rubbish, they work perfectly for me....</description>
		<content:encoded><![CDATA[<p>What Rubbish, they work perfectly for me&#8230;.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David Gray</title>
		<link>http://www.xaprb.com/blog/2005/11/04/vbscript-regular-expression-gotchas/#comment-14335</link>
		<author>David Gray</author>
		<pubDate>Tue, 25 Mar 2008 01:39:24 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2005/11/04/vbscript-regular-expression-gotchas/#comment-14335</guid>
		<description>First, whether the way the VBScript regular expression engine handles the pattern “r\r\n\w” is a bug depends on your perspective. To someone who learned regular expressions in Perl or with a Unix grep tool, the behavior is incorrect. Even Win32 Perl behaves consistently, treating the CR/LF pair as a single atom.

Second, this behavior appears to extend to the System.Text.RegularExpressions.Regex class of the Microsoft .NET Framework.</description>
		<content:encoded><![CDATA[<p>First, whether the way the VBScript regular expression engine handles the pattern “r\r\n\w” is a bug depends on your perspective. To someone who learned regular expressions in Perl or with a Unix grep tool, the behavior is incorrect. Even Win32 Perl behaves consistently, treating the CR/LF pair as a single atom.</p>
<p>Second, this behavior appears to extend to the System.Text.RegularExpressions.Regex class of the Microsoft .NET Framework.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alok Saldanha</title>
		<link>http://www.xaprb.com/blog/2005/11/04/vbscript-regular-expression-gotchas/#comment-13606</link>
		<author>Alok Saldanha</author>
		<pubDate>Fri, 02 Nov 2007 18:56:03 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2005/11/04/vbscript-regular-expression-gotchas/#comment-13606</guid>
		<description>Any ideas why the following two patterns do not both match the embedded numbers in the string?

Sub foo()
    processPattern "\d+"
    processPattern "\d*"
End Sub

Sub processPattern(pat As String)
    Dim re As VBScript_RegExp_55.RegExp
    Set re = New VBScript_RegExp_55.RegExp
    Dim matches As VBScript_RegExp_55.MatchCollection
    re.Pattern = pat
    Set matches = re.Execute("asdf234asdf")
    Debug.Print pat &#38; " returns '" &#38; matches(0) &#38; "'"
End Sub</description>
		<content:encoded><![CDATA[<p>Any ideas why the following two patterns do not both match the embedded numbers in the string?</p>
<p>Sub foo()<br />
    processPattern "\d+"<br />
    processPattern "\d*"<br />
End Sub</p>
<p>Sub processPattern(pat As String)<br />
    Dim re As VBScript_RegExp_55.RegExp<br />
    Set re = New VBScript_RegExp_55.RegExp<br />
    Dim matches As VBScript_RegExp_55.MatchCollection<br />
    re.Pattern = pat<br />
    Set matches = re.Execute("asdf234asdf")<br />
    Debug.Print pat &amp; " returns '" &amp; matches(0) &amp; "'"<br />
End Sub</p>
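<p>For comparison, here is a minimal sketch of the same two searches in Python. It assumes the first pattern was meant to be \d+, and it uses Python's re module as a stand-in for the VBScript engine, so treat it as an analogy rather than a VBScript test. The helper name first_match is made up for illustration.</p>

```python
import re


def first_match(pattern, text):
    """Return the text of the first match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


text = "asdf234asdf"

# \d+ needs at least one digit, so the search moves forward to the digits.
plus_result = first_match(r"\d+", text)

# \d* also accepts zero digits, so it succeeds immediately at position 0
# with an empty match and never reaches the digits.
star_result = first_match(r"\d*", text)

print(repr(plus_result))
print(repr(star_result))
```

<p>If the VBScript engine applies the same leftmost-match rule, the first match for \d* would be the empty string at the start of the input, which would explain why the two patterns disagree.</p>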
]]></content:encoded>
	</item>
	<item>
		<title>By: divVerent</title>
		<link>http://www.xaprb.com/blog/2005/11/04/vbscript-regular-expression-gotchas/#comment-13383</link>
		<author>divVerent</author>
		<pubDate>Sat, 08 Sep 2007 07:42:09 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2005/11/04/vbscript-regular-expression-gotchas/#comment-13383</guid>
		<description>I tried to use that from VBA, and - of course - slammed into a bug.

Dim reObj As New RegExp
With reObj
  .Pattern = re
  .MultiLine = True
End With
Set result = reObj.Execute(someString)
Set RE_Match_And_Capture = result(0).SubMatches

When my string is "hello" &#38; vbCrLf &#38; "world", "^(.*)$" still matches just the "hello". When I instead set MultiLine to False, the expression stops matching AT ALL! I then found that ^ and $ correctly match only at the beginning and end of the string, but that . still refuses to match newlines. I found that [^] is a working replacement for ., but still... this HAS to work. Or maybe I have just missed some of the properties of the object? But the autocompletion SHOULD have shown me all of them...

In Perl, there is the multiline flag /.../m, which seems to do exactly what MultiLine does (change the meaning of ^ and $ to be line-based), but what I really need is an equivalent of /.../s (which changes the meaning of . to match everything, including newlines). Apparently, VBScript REs don't support the "(?s)" way to set such flags either. So... how can I fix that problem?</description>
		<content:encoded><![CDATA[<p>I tried to use that from VBA, and - of course - slammed into a bug.</p>
<p>Dim reObj As New RegExp<br />
With reObj<br />
  .Pattern = re<br />
  .MultiLine = True<br />
End With<br />
Set result = reObj.Execute(someString)<br />
Set RE_Match_And_Capture = result(0).SubMatches</p>
<p>When my string is &#8220;hello&#8221; &amp; vbCrLf &amp; &#8220;world&#8221;, &#8220;^(.*)$&#8221; still matches just the &#8220;hello&#8221;. When I instead set MultiLine to False, the expression stops matching AT ALL! I then found that ^ and $ correctly match only at the beginning and end of the string, but that . still refuses to match newlines. I found that [^] is a working replacement for ., but still&#8230; this HAS to work. Or maybe I have just missed some of the properties of the object? But the autocompletion SHOULD have shown me all of them&#8230;</p>
<p>In Perl, there is the multiline flag /&#8230;/m, which seems to do exactly what MultiLine does (change the meaning of ^ and $ to be line-based), but what I really need is an equivalent of /&#8230;/s (which changes the meaning of . to match everything, including newlines). Apparently, VBScript REs don&#8217;t support the &#8220;(?s)&#8221; way to set such flags either. So&#8230; how can I fix that problem?</p>
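<p>A workaround often used in engines that lack a dot-matches-newline flag is a character class such as [\s\S], which matches any character, line breaks included. The sketch below shows the idea in Python; it assumes Python's re module as a stand-in for the VBScript engine, and the two differ in details (for example, whether . matches a carriage return).</p>

```python
import re

text = "hello\r\nworld"

# Without flags, . stops at the line feed and $ only matches at the very end
# of the text (or before a final newline), so the whole pattern fails.
no_flags = re.search(r"^(.*)$", text)

# [\s\S] matches every character, line breaks included, so the capture
# group spans both lines even without a dot-all flag.
any_char = re.search(r"^([\s\S]*)$", text)

print(no_flags)
print(repr(any_char.group(1)))
```

<p>The same class can be written into a VBScript pattern in place of the dot, although that has not been checked here against the VBScript engine itself.</p>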
]]></content:encoded>
	</item>
	<item>
		<title>By: Dave M</title>
		<link>http://www.xaprb.com/blog/2005/11/04/vbscript-regular-expression-gotchas/#comment-13325</link>
		<author>Dave M</author>
		<pubDate>Wed, 22 Aug 2007 03:11:54 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2005/11/04/vbscript-regular-expression-gotchas/#comment-13325</guid>
		<description>Hmm, I was just playing with VBScript regexps a few days ago, and the behavior of the expression "r\r\n\w" is not a bug at all.

Windows files use two characters as the line delimiter: a 'carriage return' (Cr) followed by a 'line feed' (Lf). That's why the end-of-line delimiter is 'vbCrLf' and not 'vbLf', and that's why \r\n is a match.

\r matches the carriage return character, and \n matches the line feed (or newline) character. Thus "r\r\n\w" matches your example string perfectly.

Dave</description>
		<content:encoded><![CDATA[<p>Hmm, I was just playing with VBScript regexps a few days ago, and the behavior of the expression &#8220;r\r\n\w&#8221; is not a bug at all.</p>
<p>Windows files use two characters as the line delimiter: a &#8216;carriage return&#8217; (Cr) followed by a &#8216;line feed&#8217; (Lf). That&#8217;s why the end-of-line delimiter is &#8216;vbCrLf&#8217; and not &#8216;vbLf&#8217;, and that&#8217;s why \r\n is a match.</p>
<p>\r matches the carriage return character, and \n matches the line feed (or newline) character. Thus &#8220;r\r\n\w&#8221; matches your example string perfectly.</p>
<p>Dave</p>
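<p>The two-character line ending can be checked with a short Python sketch. It assumes the example text is "test of r" and "what will come out?" joined by CR followed by LF, and it uses Python's re module rather than the VBScript engine, so it only illustrates the CR/LF point.</p>

```python
import re

# The two example lines joined by a Windows line ending (CR followed by LF).
text = "test of r\r\nwhat will come out?"

# "r" followed directly by LF does not occur: the CR sits in between.
lf_only = re.search(r"r\n\w", text)

# Spelling out both characters matches across the line break.
cr_lf = re.search(r"r\r\n\w", text)

# Making the CR optional copes with both Windows and Unix line endings.
unix_text = text.replace("\r\n", "\n")
either_windows = re.search(r"r\r?\n\w", text)
either_unix = re.search(r"r\r?\n\w", unix_text)

print(lf_only)
print(repr(cr_lf.group(0)))
```

<p>Writing the carriage return as optional (\r?\n) is one way to keep a pattern working when the same text arrives with either kind of line ending.</p>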
]]></content:encoded>
	</item>
	<item>
		<title>By: deneshac</title>
		<link>http://www.xaprb.com/blog/2005/11/04/vbscript-regular-expression-gotchas/#comment-13309</link>
		<author>deneshac</author>
		<pubDate>Tue, 14 Aug 2007 20:33:53 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2005/11/04/vbscript-regular-expression-gotchas/#comment-13309</guid>
		<description>Thank you for your reply, Potosino. It helped me look harder for a possible bug. I had the following experience:

Given the following two lines separated by a newline:

test of r
what will come out?

I could not get a match with the RegExp.Pattern set to "r\nw". It would match "r\r" (using just the CR) and it would match "\nw". I finally tried them both together and, illogically, it worked: "r\r\nw".

I hate bugs! :)

chris</description>
		<content:encoded><![CDATA[<p>Thank you for your reply, Potosino. It helped me look harder for a possible bug. I had the following experience:</p>
<p>Given the following two lines separated by a newline:</p>
<p>test of r<br />
what will come out?</p>
<p>I could not get a match with the RegExp.Pattern set to &#8220;r\nw&#8221;. It would match &#8220;r\r&#8221; (using just the CR) and it would match &#8220;\nw&#8221;. I finally tried them both together and, illogically, it worked: &#8220;r\r\nw&#8221;.</p>
<p>I hate bugs! :)</p>
<p>chris</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Potosino</title>
		<link>http://www.xaprb.com/blog/2005/11/04/vbscript-regular-expression-gotchas/#comment-1528</link>
		<author>Potosino</author>
		<pubDate>Wed, 16 Aug 2006 18:34:41 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2005/11/04/vbscript-regular-expression-gotchas/#comment-1528</guid>
		<description>&lt;p&gt;I had an experience with VBScript where I was trying to strip trailing spaces and tabs from lines of text. VBScript would not match the regular expression pattern &lt;code&gt;[ \t]+$&lt;/code&gt;. Instead, I had to use something like &lt;code&gt;([ \t]+)[\n\r]{1,2}&lt;/code&gt; and replace it with &lt;code&gt;vbCrLf&lt;/code&gt;.&lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>I had an experience with VBScript where I was trying to strip trailing spaces and tabs from lines of text. VBScript would not match the regular expression pattern <code>[ \t]+$</code>. Instead, I had to use something like <code>([ \t]+)[\n\r]{1,2}</code> and replace it with <code>vbCrLf</code>.</p>
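<p>A variation on that workaround captures the line ending and puts it back, so the original CRLF or LF is preserved instead of being forced to one style. This is a minimal Python sketch of the idea, not VBScript code; the function name strip_trailing_blanks and the sample text are made up for illustration.</p>

```python
import re


def strip_trailing_blanks(text):
    """Remove spaces and tabs that sit directly before a line break or the end of the text."""
    # The group keeps whatever line ending was there (CRLF, LF, or nothing
    # at the very end), and the replacement writes it back unchanged.
    return re.sub(r"[ \t]+(\r?\n|$)", r"\1", text)


sample = "one  \r\ntwo\t\r\nthree "
cleaned = strip_trailing_blanks(sample)
print(repr(cleaned))
```

<p>Blanks in the middle of a line are left alone, because the pattern only matches when a line break or the end of the text follows immediately.</p>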
]]></content:encoded>
	</item>
</channel>
</rss>
