<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.2.2" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
	<title>Comments on: Duplicate index checker version 1.8 released</title>
	<link>http://www.xaprb.com/blog/2006/09/17/duplicate-index-checker-version-18-released/</link>
	<description>Stay curious!</description>
	<pubDate>Sun, 20 Jul 2008 11:25:28 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.2.2</generator>

	<item>
		<title>By: Five great Perl programming techniques to make your life fun again &#171; Jfree&#8217;s dreaming</title>
		<link>http://www.xaprb.com/blog/2006/09/17/duplicate-index-checker-version-18-released/#comment-13497</link>
		<author>Five great Perl programming techniques to make your life fun again &#171; Jfree&#8217;s dreaming</author>
		<pubDate>Tue, 09 Oct 2007 01:32:39 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2006/09/17/duplicate-index-checker-version-18-released/#comment-13497</guid>
		<description>[...] If you want to know how that works, read the comments on my earlier post about a duplicate index and foreign key checker for MySQL. [...]</description>
		<content:encoded><![CDATA[<p>[&#8230;] If you want to know how that works, read the comments on my earlier post about a duplicate index and foreign key checker for MySQL. [&#8230;]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Five great Perl programming techniques to make your life fun again at Xaprb</title>
		<link>http://www.xaprb.com/blog/2006/09/17/duplicate-index-checker-version-18-released/#comment-13493</link>
		<author>Five great Perl programming techniques to make your life fun again at Xaprb</author>
		<pubDate>Mon, 08 Oct 2007 16:12:43 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2006/09/17/duplicate-index-checker-version-18-released/#comment-13493</guid>
		<description>[...] If you want to know how that works, read the comments on my earlier post about a duplicate index and foreign key checker for MySQL. [...]</description>
		<content:encoded><![CDATA[<p>[&#8230;] If you want to know how that works, read the comments on my earlier post about a duplicate index and foreign key checker for MySQL. [&#8230;]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Xaprb</title>
		<link>http://www.xaprb.com/blog/2006/09/17/duplicate-index-checker-version-18-released/#comment-1890</link>
		<author>Xaprb</author>
		<pubDate>Tue, 19 Sep 2006 16:53:05 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2006/09/17/duplicate-index-checker-version-18-released/#comment-1890</guid>
		<description>&lt;p&gt;That's right.  There are several of these (?something) constructs.  Whenever there's a ? right inside the (, it's magical :-)  You can do "man perlre" to find out the full syntax.&lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>That&#8217;s right.  There are several of these (?something) constructs.  Whenever there&#8217;s a ? right inside the (, it&#8217;s magical :-)  You can do &#8220;man perlre&#8221; to find out the full syntax.</p>
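<p>A minimal sketch of how group numbering treats these constructs, assuming a made-up sample string (the variable names <code>$text</code> and <code>$first</code> are only for illustration). Only plain parentheses are numbered, so the capture is group 1 even though other groups come before it.</p>

```perl
use strict;
use warnings;

# Hypothetical sample text, used only for illustration.
my $text = 'KEY (a, b)';

# Three "(?" constructs appear here: a lookbehind (?<=...), a non-capturing
# group (?:...) and a lookahead (?=...). None of them is numbered, so the one
# plain pair of parentheses is capture group 1.
my ($first) = $text =~ m#(?<=\()(?:\s*)([^\)]+)(?=\))#;

print "group 1: $first\n";   # group 1: a, b
```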
]]></content:encoded>
	</item>
	<item>
		<title>By: Roland Bouman</title>
		<link>http://www.xaprb.com/blog/2006/09/17/duplicate-index-checker-version-18-released/#comment-1888</link>
		<author>Roland Bouman</author>
		<pubDate>Tue, 19 Sep 2006 15:19:53 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2006/09/17/duplicate-index-checker-version-18-released/#comment-1888</guid>
		<description>&lt;p&gt;Thanks Baron, that is quite enlightening!&lt;/p&gt;

&lt;p&gt;I just assumed that the parentheses would define a capturing group, but they don't. If I understand it well, parentheses are necessary, but not sufficient to form a capturing group.&lt;/p&gt;

&lt;p&gt;I have some experience with regular expressions but this perl dialect really is new to me (shame on Roland).&lt;/p&gt;

&lt;p&gt;If I understand well, the non-capturing patterns are quite like the "anchors" $ and ^. So, they match a particular *position* in the text, *not* a piece of text as such, right?&lt;/p&gt;

&lt;p&gt;kind regards, &lt;/p&gt;

&lt;p&gt;Roland Bouman&lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>Thanks Baron, that is quite enlightening!</p>
<p>I just assumed that the parentheses would define a capturing group, but they don&#8217;t. If I understand it well, parentheses are necessary, but not sufficient to form a capturing group.</p>
<p>I have some experience with regular expressions but this perl dialect really is new to me (shame on Roland).</p>
<p>If I understand well, the non-capturing patterns are quite like the &#8220;anchors&#8221; $ and ^. So, they match a particular *position* in the text, *not* a piece of text as such, right?</p>
<p>kind regards, </p>
<p>Roland Bouman</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Xaprb</title>
		<link>http://www.xaprb.com/blog/2006/09/17/duplicate-index-checker-version-18-released/#comment-1884</link>
		<author>Xaprb</author>
		<pubDate>Tue, 19 Sep 2006 14:09:29 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2006/09/17/duplicate-index-checker-version-18-released/#comment-1884</guid>
		<description>&lt;p&gt;Very close... I shouldn't have just posted it without explaining it.  I'll try to explain it:&lt;/p&gt;

&lt;pre&gt;$fk =~ s#&lt;strong&gt;(?&#60;=\()([^\)]+)(?=\))&lt;/strong&gt;#&lt;strong&gt;join(', ', sort(split(/, /, $1)))&lt;/strong&gt;#ge;&lt;/pre&gt;

&lt;p&gt;&lt;code&gt;$fk =~ s&lt;/code&gt; means "match and replace on the $fk variable."  This alters the variable in place.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;#&lt;/code&gt; is an alternate delimiter for the &lt;code&gt;s&lt;/code&gt; operation, which is usually of the form &lt;code&gt;s/pattern/replacement/modifiers&lt;/code&gt;.  But you can choose your own delimiters if you're going to have to backslash a lot of &lt;code&gt;/&lt;/code&gt; characters within the pattern or something.  I chose &lt;code&gt;#&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The whole line is then &lt;code&gt;$fk =~ s#pattern#replacement#ge&lt;/code&gt;.  The modifiers are &lt;code&gt;g&lt;/code&gt; for 'global' and &lt;code&gt;e&lt;/code&gt; for 'evaluate'.  The 'evaluate' option means to search for the pattern, then instead of just doing string replacement, evaluate the replacement as Perl code, with $1 (the first capturing group) available as a variable, and use the result as the replacement value.&lt;/p&gt;

&lt;p&gt;As for the pattern itself, there are three groups: &lt;code&gt;(?&#60;=\()&lt;/code&gt;, &lt;code&gt;([^\)]+)&lt;/code&gt;, and &lt;code&gt;(?=\))&lt;/code&gt;.  The first is a non-capturing assertion that says "the pattern must be preceded by a left paren."  This is called a lookbehind assertion.  It does not get included in the match; it just asserts that something must be true at the lefthand side of the match.  The second is the actual capture, which matches one or more characters that aren't a right paren, and the third is a lookahead assertion, which says "the pattern must be followed by a right paren."  So this pattern is grabbing whatever is inside the parentheses.  Those are the column names.&lt;/p&gt;

&lt;p&gt;Finally, when the substitution code executes, it splits the column names on a comma followed by a space, which returns a list of the column names themselves; this gets sorted alphabetically, and then joined together with a comma and space again.&lt;/p&gt;

&lt;p&gt;Whew!&lt;/p&gt;

&lt;p&gt;Perl is a neat language.  It can be the devil to maintain, but when it's written with care, it's as maintainable as any other language.  That said, I should follow the suggestions in &lt;em&gt;Perl Best Practices&lt;/em&gt; (the "Dog Book") and make this pattern more readable with inline comments.  There's another modifier, &lt;code&gt;x&lt;/code&gt;, which allows patterns to span multiple lines and have comments.  That helps a lot.&lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>Very close&#8230; I shouldn&#8217;t have just posted it without explaining it.  I&#8217;ll try to explain it:</p>
<pre>$fk =~ s#<strong>(?&lt;=\()([^\)]+)(?=\))</strong>#<strong>join(', ', sort(split(/, /, $1)))</strong>#ge;</pre>
<p><code>$fk =~ s</code> means &#8220;match and replace on the $fk variable.&#8221;  This alters the variable in place.</p>
<p>The <code>#</code> is an alternate delimiter for the <code>s</code> operation, which is usually of the form <code>s/pattern/replacement/modifiers</code>.  But you can choose your own delimiters if you&#8217;re going to have to backslash a lot of <code>/</code> characters within the pattern or something.  I chose <code>#</code>.</p>
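<p>A small sketch of the delimiter choice, assuming a made-up path string (the variable names are hypothetical). Both substitutions do the same thing; only the amount of backslashing differs.</p>

```perl
use strict;
use warnings;

# Hypothetical sample data: a path with several slashes.
my $with_slashes = '/usr/local/lib';
my $with_hashes  = '/usr/local/lib';

# Default delimiter: every / in the pattern and replacement needs a backslash.
$with_slashes =~ s/\/usr\/local/\/opt/;

# Alternate delimiter: the same substitution without the backslashes.
$with_hashes =~ s#/usr/local#/opt#;

print "$with_slashes\n";   # /opt/lib
print "$with_hashes\n";    # /opt/lib
```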
<p>The whole line is then <code>$fk =~ s#pattern#replacement#ge</code>.  The modifiers are <code>g</code> for &#8216;global&#8217; and <code>e</code> for &#8216;evaluate&#8217;.  The &#8216;evaluate&#8217; option means to search for the pattern, then instead of just doing string replacement, evaluate the replacement as Perl code, with $1 (the first capturing group) available as a variable, and use the result as the replacement value.</p>
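<p>To illustrate what the <code>e</code> modifier changes, here is a sketch on a made-up string (the data and variable names are assumptions, not part of the tool). Without <code>e</code> the replacement is an interpolated string; with it, the replacement is run as code.</p>

```perl
use strict;
use warnings;

# Hypothetical sample data.
my $plain = 'a=(2) b=(3)';
my $evald = 'a=(2) b=(3)';

# Without /e the replacement is just a string with $1 interpolated into it.
$plain =~ s#\((\d+)\)#($1 * 10)#g;

# With /e the replacement is Perl code and its result becomes the new text.
$evald =~ s#\((\d+)\)#'(' . ($1 * 10) . ')'#ge;

print "$plain\n";   # a=(2 * 10) b=(3 * 10)
print "$evald\n";   # a=(20) b=(30)
```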
<p>As for the pattern itself, there are three groups: <code>(?&lt;=\()</code>, <code>([^\)]+)</code>, and <code>(?=\))</code>.  The first is a non-capturing assertion that says &#8220;the pattern must be preceded by a left paren.&#8221;  This is called a lookbehind assertion.  It does not get included in the match; it just asserts that something must be true at the lefthand side of the match.  The second is the actual capture, which matches one or more characters that aren&#8217;t a right paren, and the third is a lookahead assertion, which says &#8220;the pattern must be followed by a right paren.&#8221;  So this pattern is grabbing whatever is inside the parentheses.  Those are the column names.</p>
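<p>A sketch of what &#8220;not included in the match&#8221; means in practice, on a made-up key definition (the sample text and the <code>COLS</code> placeholder are assumptions). With the assertions the parentheses survive the replacement; with plain escaped parentheses they are replaced along with everything else.</p>

```perl
use strict;
use warnings;

# Hypothetical sample data.
my $lookaround = 'KEY (`b`, `a`)';
my $consuming  = 'KEY (`b`, `a`)';

# Lookarounds: the parens are checked but are not part of the match,
# so they are still there after the substitution.
$lookaround =~ s#(?<=\()([^\)]+)(?=\))#COLS#;

# Plain escaped parens: they are part of the match, so they are replaced too.
$consuming =~ s#\(([^\)]+)\)#COLS#;

print "$lookaround\n";   # KEY (COLS)
print "$consuming\n";    # KEY COLS
```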
<p>Finally, when the substitution code executes, it splits the column names on a comma followed by a space, which returns a list of the column names themselves; this gets sorted alphabetically, and then joined together with a comma and space again.</p>
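<p>The replacement code can be tried on its own, outside the substitution. This sketch assumes a made-up column list in <code>$columns</code> standing in for <code>$1</code>.</p>

```perl
use strict;
use warnings;

# Hypothetical column list, standing in for what the capture group holds.
my $columns = '`b`, `c`, `a`';

my @names  = split(/, /, $columns);    # three separate column names
my $sorted = join(', ', sort(@names)); # default sort is by string comparison

print "$sorted\n";   # `a`, `b`, `c`
```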
<p>Whew!</p>
<p>Perl is a neat language.  It can be the devil to maintain, but when it&#8217;s written with care, it&#8217;s as maintainable as any other language.  That said, I should follow the suggestions in <em>Perl Best Practices</em> (the &#8220;Dog Book&#8221;) and make this pattern more readable with inline comments.  There&#8217;s another modifier, <code>x</code>, which allows patterns to span multiple lines and have comments.  That helps a lot.</p>
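<p>As a sketch of what that could look like (not the code that ships in the tool), here is the same substitution with the <code>x</code> modifier, run on a made-up foreign key definition. The delimiters are switched to braces as an assumption of this example, because <code>#</code> now starts a comment inside the pattern.</p>

```perl
use strict;
use warnings;

# Hypothetical foreign key definition, used only for illustration.
my $fk = 'FOREIGN KEY (`b`, `a`) REFERENCES `t2` (`y`, `x`)';

$fk =~ s{
    (?<=\()      # lookbehind: a left paren must come just before the match
    ([^\)]+)     # capture group 1: one or more characters that are not a right paren
    (?=\))       # lookahead: a right paren must come just after the match
}{
    join(', ', sort(split(/, /, $1)))
}gex;

print "$fk\n";
# FOREIGN KEY (`a`, `b`) REFERENCES `t2` (`x`, `y`)
```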
]]></content:encoded>
	</item>
	<item>
		<title>By: Roland Bouman</title>
		<link>http://www.xaprb.com/blog/2006/09/17/duplicate-index-checker-version-18-released/#comment-1882</link>
		<author>Roland Bouman</author>
		<pubDate>Tue, 19 Sep 2006 13:37:12 +0000</pubDate>
		<guid>http://www.xaprb.com/blog/2006/09/17/duplicate-index-checker-version-18-released/#comment-1882</guid>
		<description>&lt;p&gt;Wow - Baron I already explained I know nothing about perl, but I sure did not expect it could be written as compact as this. Phew! &lt;/p&gt;

&lt;p&gt;Lemme see if I get this:&lt;/p&gt;

&lt;pre&gt;$fk =~ s#(?&#60;=\()([^\)]+)(?=\))#join(', ', sort(split(/, /, $1)))#ge;&lt;/pre&gt;

&lt;p&gt;So, &lt;/p&gt;

&lt;p&gt;match a literal "(":&lt;/p&gt;

&lt;pre&gt;(
?&#60;=
\(
)&lt;/pre&gt;

&lt;p&gt;followed by one or more characters not like ")": &lt;/p&gt;

&lt;pre&gt;(
[
^\)
]+
)&lt;/pre&gt;

&lt;p&gt;followed by a literal ")":&lt;/p&gt;

&lt;pre&gt;(?=\))&lt;/pre&gt;

&lt;p&gt;So far I get it. Now it works "inward-out" starting at the split&lt;/p&gt;

&lt;pre&gt;split(/, /, $1)&lt;/pre&gt;

&lt;p&gt;which probably means: split the matched group $1 at the commas, but I expected $1 to correspond with the match made by &lt;code&gt;(?&lt;=\()&lt;/code&gt;, but clearly this is wrong as it should split whatever was matched by &lt;code&gt;([^\)]+)&lt;/code&gt;. Or start the indices at 0? &lt;/p&gt;

&lt;p&gt;I think I know what sort and join then do, but I don't understand the semantics of the # and the leading ~ s (ignore whitespace? guessing..)&lt;/p&gt;

&lt;p&gt;Anyway, thanks for this addition. I'm still not a perl user, but this kind of thing is seriously tempting me...:)&lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>Wow - Baron I already explained I know nothing about perl, but I sure did not expect it could be written as compact as this. Phew! </p>
<p>Lemme see if I get this:</p>
<pre>$fk =~ s#(?&lt;=\()([^\)]+)(?=\))#join(', ', sort(split(/, /, $1)))#ge;</pre>
<p>So, </p>
<p>match a literal &#8220;(&#8221;:</p>
<pre>(
?&lt;=
\(
)</pre>
<p>followed by one or more characters not like &#8220;)&#8221;: </p>
<pre>(
[
^\)
]+
)</pre>
<p>followed by a literal &#8220;)&#8221;:</p>
<pre>(?=\))</pre>
<p>So far I get it. Now it works &#8220;inward-out&#8221; starting at the split</p>
<pre>split(/, /, $1)</pre>
<p>which probably means: split the matched group $1 at the commas, but I expected $1 to correspond with the match made by <code>(?&lt;=\()</code>, but clearly this is wrong as it should split whatever was matched by <code>([^\)]+)</code>. Or start the indices at 0? </p>
<p>I think I know what sort and join then do, but I don&#8217;t understand the semantics of the # and the leading ~ s (ignore whitespace? guessing..)</p>
<p>Anyway, thanks for this addition. I&#8217;m still not a perl user, but this kind of thing is seriously tempting me&#8230;:)</p>
]]></content:encoded>
	</item>
</channel>
</rss>
