Comments on: Duplicate index checker version 1.8 released http://www.xaprb.com/blog/2006/09/17/duplicate-index-checker-version-18-released/ Stay curious! Fri, 10 May 2013 18:25:19 +0000

By: Five great Perl programming techniques to make your life fun again « Jfree’s dreaming http://www.xaprb.com/blog/2006/09/17/duplicate-index-checker-version-18-released/#comment-13497 Five great Perl programming techniques to make your life fun again « Jfree’s dreaming Tue, 09 Oct 2007 01:32:39 +0000 http://www.xaprb.com/blog/?p=231#comment-13497 [...] If you want to know how that works, read the comments on my earlier post about a duplicate index and foreign key checker for MySQL. [...]

By: Five great Perl programming techniques to make your life fun again at Xaprb http://www.xaprb.com/blog/2006/09/17/duplicate-index-checker-version-18-released/#comment-13493 Five great Perl programming techniques to make your life fun again at Xaprb Mon, 08 Oct 2007 16:12:43 +0000 http://www.xaprb.com/blog/?p=231#comment-13493 [...] If you want to know how that works, read the comments on my earlier post about a duplicate index and foreign key checker for MySQL. [...]

By: Xaprb http://www.xaprb.com/blog/2006/09/17/duplicate-index-checker-version-18-released/#comment-1890 Xaprb Tue, 19 Sep 2006 16:53:05 +0000 http://www.xaprb.com/blog/?p=231#comment-1890 That’s right. There are several of these (?something) constructs. Whenever there’s a ? right inside the (, it’s magical :-) You can do “man perlre” to find out the full syntax.

By: Roland Bouman http://www.xaprb.com/blog/2006/09/17/duplicate-index-checker-version-18-released/#comment-1888 Roland Bouman Tue, 19 Sep 2006 15:19:53 +0000 http://www.xaprb.com/blog/?p=231#comment-1888 Thanks Baron, that is quite enlightening!

I just assumed that the parentheses would define a capturing group, but they don’t. If I understand it well, parentheses are necessary, but not sufficient, to form a capturing group.

I have some experience with regular expressions but this Perl dialect really is new to me (shame on Roland).

If I understand well, the non-capturing patterns are quite like the “anchors” $ and ^. So, they match a particular *position* in the text, *not* a piece of text as such, right?

kind regards,

Roland Bouman

By: Xaprb http://www.xaprb.com/blog/2006/09/17/duplicate-index-checker-version-18-released/#comment-1884 Xaprb Tue, 19 Sep 2006 14:09:29 +0000 http://www.xaprb.com/blog/?p=231#comment-1884 Very close… I shouldn’t have just posted it without explaining it. I’ll try to explain it:

$fk =~ s#(?<=\()([^\)]+)(?=\))#join(', ', sort(split(/, /, $1)))#ge;

$fk =~ s means “match and replace on the $fk variable.” This alters the variable in place.

The # is an alternate delimiter for the s operation, which is usually of the form s/pattern/replacement/modifiers. But you can choose your own delimiters if you’re going to have to backslash a lot of / characters within the pattern or something. I chose #.

The whole line is then $fk =~ s#pattern#replacement#ge. The modifiers are g for ‘global’ and e for ‘evaluate’. The g modifier replaces every match rather than just the first. The e modifier means the replacement is treated as a Perl expression instead of a plain string: for each match, Perl evaluates that expression with $1 set to the first capturing group, and uses the result as the replacement value.
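To see the g and e modifiers in isolation, here is a minimal sketch. The variable names and the input string are made up for illustration and have nothing to do with the index checker itself.

```perl
use strict;
use warnings;

# Hypothetical input, for illustration only.
my $text = 'a=1, b=2, c=3';

# Copy $text, then substitute on the copy. # is the delimiter here too.
# Because of /e, the replacement "$1 * 10" is run as Perl code for each
# match, and its value is used as the replacement. /g repeats this for
# every match in the string.
(my $scaled = $text) =~ s#(\d+)#$1 * 10#ge;

print "$scaled\n";    # prints: a=10, b=20, c=30
```

Without the e modifier, the same replacement would be interpolated as a string, so each number would be followed by the literal text " * 10" instead of being multiplied.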

As for the pattern itself, there are three groups: (?<=\(), ([^\)]+), and (?=\)). The first is a non-capturing assertion that says “the pattern must be preceded by a left paren.” This is called a lookbehind assertion. It does not get included in the match; it just asserts that something must be true at the lefthand side of the match. The second is the actual capture, which matches one or more characters that aren’t a right paren. The third is a lookahead assertion, which says “the pattern must be followed by a right paren.” So this pattern grabs whatever is inside the parentheses. Those are the column names.
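Here is a small sketch of the pattern used on its own, without the substitution. The foreign key text in $fk is a made-up example, not output from the tool.

```perl
use strict;
use warnings;

# Hypothetical foreign key definition, for illustration only.
my $fk = 'FOREIGN KEY (b, a) REFERENCES t2 (y, x)';

# In list context with /g, a match returns what the capturing group
# captured on each match. The lookbehind and lookahead check for the
# parentheses but consume nothing, so only the text between them
# ends up in the result.
my @inside = $fk =~ /(?<=\()([^\)]+)(?=\))/g;

print join(' | ', @inside), "\n";    # prints: b, a | y, x
```

Note that the parentheses themselves never show up in @inside, which is why the substitution can replace the column list and leave the surrounding parens alone.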

Finally, when the substitution code executes, it splits the column names on a comma followed by a space, which returns a list of the column names themselves; this gets sorted alphabetically, and then joined together with a comma and space again.
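Putting it together, this sketch runs the original line against a sample string. The value of $fk is an assumption made for illustration; real input would come from the table definition.

```perl
use strict;
use warnings;

# Hypothetical foreign key definition, for illustration only.
my $fk = 'FOREIGN KEY (b, a) REFERENCES t2 (y, x)';

# For each parenthesized list: split on ", ", sort the pieces
# alphabetically, and join them back together with ", ".
$fk =~ s#(?<=\()([^\)]+)(?=\))#join(', ', sort(split(/, /, $1)))#ge;

print "$fk\n";    # prints: FOREIGN KEY (a, b) REFERENCES t2 (x, y)
```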

Whew!

Perl is a neat language. It can be the devil to maintain, but when it’s written with care, it’s as maintainable as any other language. That said, I should follow the suggestions in Perl Best Practices (the “Dog Book”) and make this pattern more readable with inline comments. There’s another modifier, x, which allows patterns to span multiple lines and have comments. That helps a lot.
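As a rough sketch of what that could look like, here is the same substitution with the x modifier. This is one possible layout, not the version that ships in the tool, and the sample $fk value is made up. I switched the delimiters from # to braces, because under the x modifier a # inside the pattern starts a comment and so could not double as the delimiter.

```perl
use strict;
use warnings;

# Hypothetical foreign key definition, for illustration only.
my $fk = 'FOREIGN KEY (b, a) REFERENCES t2 (y, x)';

$fk =~ s{
    (?<=\()      # lookbehind: preceded by a left paren, not part of the match
    ([^\)]+)     # capture one or more characters that are not a right paren
    (?=\))       # lookahead: followed by a right paren, not part of the match
}{
    join(', ', sort(split(/, /, $1)))
}gex;

print "$fk\n";    # prints: FOREIGN KEY (a, b) REFERENCES t2 (x, y)
```

The x modifier only affects the pattern, so the space in the split(/, /, ...) call inside the replacement code is still significant.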
