<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Seeking input on a badness score for query execution</title>
	<atom:link href="http://www.xaprb.com/blog/2009/06/26/seeking-input-on-a-badness-score-for-query-execution/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.xaprb.com/blog/2009/06/26/seeking-input-on-a-badness-score-for-query-execution/</link>
	<description>Stay curious!</description>
	<lastBuildDate>Thu, 09 Feb 2012 20:41:20 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: Fernando Ipar</title>
		<link>http://www.xaprb.com/blog/2009/06/26/seeking-input-on-a-badness-score-for-query-execution/#comment-16647</link>
		<dc:creator>Fernando Ipar</dc:creator>
		<pubDate>Thu, 09 Jul 2009 14:10:27 +0000</pubDate>
		<guid isPermaLink="false">http://www.xaprb.com/blog/?p=1140#comment-16647</guid>
		<description>Baron: 

The advice on inspecting the number of rows returned is good. Since you&#039;re asking for pointers in CS literature, I&#039;d suggest anything on Big &#039;O&#039; notation, since what you need is more of a theoretical, architecture-agnostic way of determining how bad a query is. 

Together with the number of rows, I&#039;d compute the data type length of each column; this should give you a better idea of how much data the server will have to process and transfer back to get the results to the client. 

I&#039;d also take into account if columns can take NULLs or not, since this puts a performance penalty on MySQL, but you know a lot more about this than me :P
Either way, my ideal algorithm would use this information plus the output from EXPLAIN, to take into account what stages MySQL goes through to bring you the data. 

But maybe I digress.</description>
		<content:encoded><![CDATA[<p>Baron: </p>
<p>The advice on inspecting the number of rows returned is good. Since you&#8217;re asking for pointers in CS literature, I&#8217;d suggest anything on Big &#8216;O&#8217; notation, since what you need is more of a theoretical, architecture-agnostic way of determining how bad a query is. </p>
<p>Together with the number of rows, I&#8217;d compute the data type length of each column; this should give you a better idea of how much data the server will have to process and transfer back to get the results to the client. </p>
<p>I&#8217;d also take into account if columns can take NULLs or not, since this puts a performance penalty on MySQL, but you know a lot more about this than me :P<br />
Either way, my ideal algorithm would use this information plus the output from EXPLAIN, to take into account what stages MySQL goes through to bring you the data. </p>
<p>But maybe I digress.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Shlomi Noach</title>
		<link>http://www.xaprb.com/blog/2009/06/26/seeking-input-on-a-badness-score-for-query-execution/#comment-16607</link>
		<dc:creator>Shlomi Noach</dc:creator>
		<pubDate>Sun, 28 Jun 2009 05:29:26 +0000</pubDate>
		<guid isPermaLink="false">http://www.xaprb.com/blog/?p=1140#comment-16607</guid>
		<description>Baron,
Joining Tim&#039;s advice.

But also, back from my Math/CS studies, in order to give more weight to larger numbers, you usually use the log() function.
So for example, to see how far apart two numbers a,b are, you&#039;d use something like:
log(&#124;a&#124;)*&#124;a-b&#124;/&#124;a&#124;

&#124;a-b&#124;/&#124;a&#124; gives you the relative change from a to b;
multiplying by log(&#124;a&#124;) gives very little weight to &quot;small&quot; numbers (whatever that means) and more weight to &quot;large&quot; numbers.</description>
		<content:encoded><![CDATA[<p>Baron,<br />
Joining Tim&#8217;s advice.</p>
<p>But also, back from my Math/CS studies, in order to give more weight to larger numbers, you usually use the log() function.<br />
So for example, to see how far apart two numbers a,b are, you&#8217;d use something like:<br />
log(|a|)*|a-b|/|a|</p>
<p>|a-b|/|a| gives you the relative change from a to b;<br />
multiplying by log(|a|) gives very little weight to &#8220;small&#8221; numbers (whatever that means) and more weight to &#8220;large&#8221; numbers.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tim McCormack</title>
		<link>http://www.xaprb.com/blog/2009/06/26/seeking-input-on-a-badness-score-for-query-execution/#comment-16605</link>
		<dc:creator>Tim McCormack</dc:creator>
		<pubDate>Sat, 27 Jun 2009 00:42:07 +0000</pubDate>
		<guid isPermaLink="false">http://www.xaprb.com/blog/?p=1140#comment-16605</guid>
		<description>I think you want to get in touch with a statistician. They have tools to determine what constitutes a &quot;significant&quot; difference.</description>
		<content:encoded><![CDATA[<p>I think you want to get in touch with a statistician. They have tools to determine what constitutes a &#8220;significant&#8221; difference.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: rich</title>
		<link>http://www.xaprb.com/blog/2009/06/26/seeking-input-on-a-badness-score-for-query-execution/#comment-16604</link>
		<dc:creator>rich</dc:creator>
		<pubDate>Sat, 27 Jun 2009 00:17:11 +0000</pubDate>
		<guid isPermaLink="false">http://www.xaprb.com/blog/?p=1140#comment-16604</guid>
		<description>It seems the reason going from 100 rows to 125 rows (a 25% change) is unlikely to be significant is that it probably doesn&#039;t cost any I/O.  Leaving aside what&#039;s possible, it would seem like you really want to measure the I/O cost of the query, and perhaps combine it with a CPU cost of the query.  (Alternatively, have two separate badness measures.)</description>
		<content:encoded><![CDATA[<p>It seems the reason going from 100 rows to 125 rows (a 25% change) is unlikely to be significant is that it probably doesn&#8217;t cost any I/O.  Leaving aside what&#8217;s possible, it would seem like you really want to measure the I/O cost of the query, and perhaps combine it with a CPU cost of the query.  (Alternatively, have two separate badness measures.)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Arjen Lentz</title>
		<link>http://www.xaprb.com/blog/2009/06/26/seeking-input-on-a-badness-score-for-query-execution/#comment-16603</link>
		<dc:creator>Arjen Lentz</dc:creator>
		<pubDate>Fri, 26 Jun 2009 22:32:54 +0000</pubDate>
		<guid isPermaLink="false">http://www.xaprb.com/blog/?p=1140#comment-16603</guid>
		<description>Baron, right now we tend to use some info from the query plan as provided by the microslow patch: memory/disk temp tables, memory/disk sorts, as well as the number of rows examined/returned. Rather than timing, it looks more at the flow of the query through the server, given the current dataset. It does of course also give time, so you can take that into account as well.

A &quot;badness&quot; score could be a good extra for mk-query-digest.</description>
		<content:encoded><![CDATA[<p>Baron, right now we tend to use some info from the query plan as provided by the microslow patch: memory/disk temp tables, memory/disk sorts, as well as the number of rows examined/returned. Rather than timing, it looks more at the flow of the query through the server, given the current dataset. It does of course also give time, so you can take that into account as well.</p>
<p>A &#8220;badness&#8221; score could be a good extra for mk-query-digest.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

