Xaprb

Stay curious!

Seeing things from the user’s point of view

with 14 comments

I was discussing how to avoid surprising users and someone pointed out that what seems intuitive and rational to one person is often complete insanity for others. The mental gap between a developer and a user can often be a chasm far too wide to cross. Of all the bug reports I’ve filed against MySQL, here is my all-time favorite:

select * from t where a >= 1.0order by a;

Does not cause an error. I believe it should, because there should be a whitespace before ORDER BY.

Similar syntax errors such as “select 1e0from dual” were also accepted as valid SQL. Much soul-searching later, the official reply from MySQL’s development team:

The server behaves properly here:
- “1″ alone can not be an identifier, because it’s followed by a “.”
- therefore, the lexer parses “1″ as the beginning of a number, and ends up with “1.0″ as one token.
The next token will be “order”, a keyword.

There is nothing special to document here either. What was reported is not a bug in the server, and is not a documentation bug.

What just happened here? It’s fairly simple. Never attribute to malice that which can be adequately explained by tunnel vision. The developer decided that the SQL isn’t a syntax error because the lexer is designed to parse it. If the developer were able to stand in a user’s shoes for a moment, this answer would be obviously absurd. But when you live deep in the dungeons of a large codebase, you lose the ability (despite your best intentions) to see a normal user’s point of view.

For this reason I often doubt my own judgment. I’m ignorant of my ignorance. It’s a very “meta” problem. I have to rely on others to give me a sanity check.

I suggest, as a topic of discussion, that the willingness and ability to accept others’ judgment over one’s own is an essential characteristic of good leadership. I also suggest that knowing when this is appropriate is another attribute a good leader must have. Both of these are very difficult in my experience.

Written by Xaprb

March 4th, 2013 at 3:43 pm

Posted in Commentary,SQL

14 Responses to 'Seeing things from the user’s point of view'

Subscribe to comments with RSS

  1. As someone who has been maintaining the same product for almost 10 years this is something that I grapple with constantly.

    I’m fortunate to be easily accessible to a pool of veteran and new users, which definitely helps keep things on the straight and narrow. Trying to be conscious of it and actively put yourself in a user’s shoes is hard, but also helpful.

    AussieDan

    4 Mar 13 at 4:18 pm

  2. If you recall, Baron, my bachelors is in philosophy. Consequently, one of my maxims is: “Those who study logic too long may lose sight of reason.” I think that applies to programming, too. In fact, at the moment, I’m fixing a tool to do what users expect, whereas previously my advice to users was “wrap the tool in a script and make the script do that.” Easy for me; less so for the user.

    But as you note: it’s difficult. Bloat, feature creep, truly one-off features… sometimes the user has the proper insight, and other times they want the proverbial “faster horse”.

    My general rule of thumb is: there’s an inverse relationship between “good for the user” and “good for the programmer”. So when I find myself slogging away at seemingly mundane things–error handling, documentation, etc.–I know I’m on the right track. It’s like pt-table-checksum 2.0: it took months of painstaking work to make, but it’s been nearly infallible for the user ever since.

    Daniel Nichter

    4 Mar 13 at 7:21 pm

  3. Although the user perspective is an important factor, in this particular case the guiding document should be the SQL standard. I’m sure users wouldn’t appreciate a SQL parser that accepts invalid SQL and/or rejects valid SQL.

    So, the question is whether that is a valid SQL construct. IIRC, that is not valid because the standard requires such tokens to be followed by a delimiter token (a keyword is not a delimiter token) or a separator (blank space).

    Davi Arnaut

    4 Mar 13 at 9:05 pm

  4. Thanks.

    I will leave you as an exercise, to read your own post from my point of view.

    You want want to read this again:

    http://www.xaprb.com/blog/2010/04/19/great-things-afoot-in-mysql-5-5/

    – Marc Alff

    Marc Alff

    5 Mar 13 at 5:00 am

  5. I’d suggest reading up a bit of lexing, e.g. level three grammars which correspond precisely to true *regular* expressions. MySQL is correct in this case and is not alone, e.g. Perl has an `x’ operator yet a space isn’t required in

    $ perl -e ‘print 1.x42, “\n”‘
    111111111111111111111111111111111111111111

    In many languages, whitespace is only required in order to separate two tokens/lexemes; without it, the standard rule of `leftmost longest’ match applies and the two intended tokens can become one if and only if the second could be considered an extension of the first. For example, in C the space in `return foo;’ is required otherwise the token is `returnfoo’ which may result in a compilation error. (It might not if there’s an identifier with that name.)

    A lot of the time, the second token can’t be a continuation of the first so the space can be skipped, though it can lessen readability, e.g. `1*2+3/4′, but C, again, has `-’ and `–’ as tokens and if one means `a – -b’ then it can’t be written `a–b’ as `leftmost longest’ makes that three tokens.

    I’ve defended MySQL; it’s a funny old world!

    Ralph Corderoy

    5 Mar 13 at 8:39 am

  6. Marc, this isn’t about you personally, it’s about the system’s behavior and a generic programmer’s point of view on it. It could have been anyone, including me as I pointed out. (This is why I didn’t mention your name in the blog post.)

    Ralph, you’ve defended lexers, not MySQL. Davi Arnaut is correct in my opinion: the language spec is the reference of truth (and I pointed this out in the bug report, by the way). Frankly my dear, I don’t give a damn if this is how lexers work :-) The lexer is the tool of choice for implementing the language. If the tool can’t implement the language correctly, is that an excuse? Or should we say that SQL is buggy because it requires something the lexer doesn’t implement?

    To heat the flame war up even more, I’ll ask whether PostgreSQL has this behavior. Or — better yet — SQLite.

    sqlite> select * from t where a >= 1.0order by a;
    Error: unrecognized token: "1.0order"
    sqlite> select 1e0from dual;
    Error: unrecognized token: "1e0from"

    Looks like it’s possible to do the right thing. I rest my case.

    Xaprb

    5 Mar 13 at 9:42 am

  7. Ah, thank you for saving me; you’re right, I’ve defended lexers, and the ANSI SQL spec. is the arbiter of correctness.

    Ralph Corderoy

    5 Mar 13 at 10:19 am

  8. Actually, this may be invalid according to the standard. The standard says “A shall be followed by a or a “. 1.0 is an unsigned numeric literal, which is a (NDT). order is a key word, which is also an NDT. So here we have two consecutive NDTs without an intervening separator, which is not allowed.

    Without this rule, I would say the syntax was legal and valid…

    Roy Lyseng

    5 Mar 13 at 11:26 am

  9. Marc Alff

    5 Mar 13 at 1:20 pm

  10. That behaviour is simply stupid. “Fixing” it by changing the documentation is even more so. No other DBMS would accept such a statement.

    Hans

    5 Mar 13 at 2:24 pm

  11. Btw: the SQL standard requires tokens in SQL to be separated by whitespace:

    From ISO/IEC JTC 1/SC 32, “SQL/Foundation”:

    white space: Characters used to separate tokens in SQL text”

    This again shows how much the MySQL development actually cares for SQL standards

    Hans

    5 Mar 13 at 2:35 pm

  12. Hans,
    MySQL development cares a lot about the SQL standard. New features are usually developed to conform strictly with the standard. Bugs describing non-standard behavior are also reported and fixed all the time.

    However, MySQL has a lot of legacy functions that are not necessarily in compliance, and sometimes one will face the decision of implementing a standard-compliant solution or breaking compatibility with existing applications.

    In this particular case, I think the problem is minor, and the risk of breaking existing applications could easily be more significant than the benefit gained by eliminating this confusion.

    Roy Lyseng

    6 Mar 13 at 4:22 am

  13. Hi, I’ve had some customers with a similar problem on Oracle. A bug in the parser in 10g was allowing joins without qualifier.

    Bug 5368296 : SQL NOT GENERATING ORA-00918 WHEN USING JOIN

    The 11g has fixed the bug but introduced a code incompatibility for developers relying on the buggy syntax, so the upgrade to 11g has been painful.

    IMO this behavior of mysql should be fixed with a new “hidden” variable that enable or disable it… so in new versions developers can revert the variable to maintain code compatibility.

    This is just my 2 cents…

    Ludovico

  14. I absolutely agree with Roy that the problem is very minor and the fix is probably much riskier than is merited.

    Xaprb

    7 Mar 13 at 10:27 am

Leave a Reply