Archive for January, 2006

How to implement CAPTCHAs without images

I’ve started getting a lot of spam comments, so I decided the time has come to put a simple system in place to foil the spam robots. In an earlier article I asserted CAPTCHAs are terrible and there are easy ways to foil naive robots without making the site inaccessible or unusable. I’ve implemented a simple question-and-answer system to prove my point. Comment forms show a randomly chosen predefined question (I’ve only put three in the system) and display several predefined answers, only one of which is correct. This took about 30 minutes and maybe 20 lines of actual code to do. Right now the questions and answers are hard-coded in a new include file, but it would be trivial to database them too.

How I did it

  1. Create a new file, say captchas.php
  2. Fill it with an array of entries, one per question:
    <?php
    $captchas = Array();
    
    # Create a single entry
    $captchas[] = array(
        "question" => "What color is the sky?",
        "answer" => "blue",
        "options" => array("blue", "red", "orange"));
    
    # Create as many as desired by copy-and-paste...
    ?>
  3. Modify the comment form. Include the above file in wp-content/themes/default/comments.php, and change a few lines where the form is displayed. At the top of the file,
    <?php require_once("captchas.php"); ?>
    Then, just before the SUBMIT button for the form,
    <?php
    $tabindex = 5;
    $captcha_index = rand(0, count($captchas) - 1);
    ?>
    <input type="hidden"
        name="captcha_index" value="<?php echo $captcha_index; ?>" />
    <p><?php echo $captchas[$captcha_index]["question"]; ?>
    <?php foreach ($captchas[$captcha_index]["options"] as $captcha_answer) { ?>
    <br /><label for="captcha_<?php echo $captcha_answer; ?>">
    <input tabindex="<?php echo $tabindex++; ?>"
        id="captcha_<?php echo $captcha_answer; ?>" type="radio"
        name="captcha" value="<?php echo $captcha_answer; ?>"
        /><?php echo $captcha_answer; ?></label>
    <?php } ?></p>
  4. Now the randomly chosen question’s ID and the user’s answer are submitted along with the form.
  5. On the receiving end of the form, which is in wp-comments-post.php, all I have to do is check the answer against the correct answer for the question. First, I include the captchas.php file as before. Then I grab the two new inputs where the rest of the input is grabbed:
    $comment_author       = trim($_POST['author']);
    $comment_author_email = trim($_POST['email']);
    $comment_author_url   = trim($_POST['url']);
    $comment_content      = trim($_POST['comment']);
    $comment_captcha_idx  = trim($_POST['captcha_index']);
    $comment_captcha      = trim($_POST['captcha']);
    Only the last two lines are changed in that code sample — I included the first lines for context. I use the input a bit later, where the input checking occurs:
    if ( !is_numeric($comment_captcha_idx) || !$comment_captcha
        || $captchas[$comment_captcha_idx]["answer"] != $comment_captcha)
    {
            die( __("Error: wrong answer to the CAPTCHA question"));
    }
  6. That’s it. I’m done. It took longer to explain than to actually write the code.
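For readers who don't use PHP, here is a minimal sketch of the same pick-and-check logic in JavaScript. The names `captchas`, `pickCaptchaIndex` and `isCaptchaCorrect` are hypothetical, and the range check on the submitted index is an extra safeguard that is not in my PHP above.

```javascript
// Hypothetical question bank, mirroring the $captchas array above.
const captchas = [
  { question: "What color is the sky?", answer: "blue", options: ["blue", "red", "orange"] },
];

// Pick the index that would be written into the hidden form field.
function pickCaptchaIndex(bank, random = Math.random) {
  return Math.floor(random() * bank.length);
}

// Validate the two submitted fields; form values arrive as strings.
function isCaptchaCorrect(bank, submittedIndex, submittedAnswer) {
  const text = String(submittedIndex).trim();
  // Accept only a plain non-negative integer that points inside the bank.
  if (!/^\d+$/.test(text)) return false;
  const index = Number(text);
  if (index >= bank.length) return false;
  return typeof submittedAnswer === "string" &&
    submittedAnswer.trim() === bank[index].answer;
}

console.log(isCaptchaCorrect(captchas, "0", "blue")); // correct answer
console.log(isCaptchaCorrect(captchas, "0", "red"));  // wrong answer
```

Rejecting out-of-range indexes up front means a robot that posts a made-up index fails cleanly instead of relying on a missing array entry.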

How hard is it to circumvent this?

If you’re a human, or if you’re trying to bypass just my site, it would be easy to do, but if you’re a spam robot, bypassing the system means learning something about my site (the questions and answers) that is different from other sites. I’m counting on spam robots to be dumber than that, and to assume all WordPress sites are the same. Time will tell if I’m right; I anticipate not getting any more spam comments, though. I think the crucial thing is to get it “right enough” that humans find it easy and it’s just a little too hard for spam robots to make it worth anyone’s while. This is, in my opinion, the essential trade-off in security and any other application where you want to stop the Wrong Thing from happening. You can never truly guarantee the Wrong Thing won’t happen, but you can make yourself a less attractive target so all the easy targets will get the heat instead.


You might also like:

  1. CAPTCHAs without images, part 2
  2. My unorthodox CAPTCHA blocked thousands of spam comments every week
  3. Why CAPTCHAs don’t work well

Firefox vs. Opera on slow hardware

My main computer is a medieval laptop running Ubuntu GNU/Linux. I used to run Gentoo but tried Ubuntu on a lark, and haven’t been motivated enough to change back to Gentoo (or even decide whether I want to, since Ubuntu works fine too). There is one problem, though: Firefox is running more slowly with each release. What to do?

The background

I’ve been running Firefox since way back when the project got started. When it was Phoenix 0.4, I was on board. I was buying t-shirts, displaying buttons and logos on my websites, and telling my friends. At 0.5 or so, my brother got hooked too. I was there for the name changes, to Firebird and then Firefox. I’ve submitted, discussed, and voted for bugs and patches. I’ve donated to the Mozilla project. In short, I feel attached to this piece of software. For me, Firefox is not just a good web browser. It represents freedom, adherence to standards, respect for privacy, cooperation, and so much more.

Most of all, Firefox demonstrates to the world that you don’t have to sell your soul to Them. I love Free Software as a philosophy. I love the ethics. It speaks clearly to me of That Which Is Right. I’m serious about that. I have a lot of problems with non-Free software, and I really see it as the root of or enabler to many of our current evils (loss of privacy, rootkits, credit card thefts, election fraud). But all through my career with computing, anytime I run into someone who says “screw the ethics, show me practical reasons why I should stop using Excel or SQL Server or IE” I’ve come up against a wall: for every reason I can give, someone else’s marketing department has created a counter-argument. Just to name one example, Microsoft has commissioned lots of studies “showing” the equality or superiority of their products (they call it getting the “facts”). On the other hand, there are tons of studies and benchmarks and whitepapers showing the opposite, too — measurably higher code quality, fewer security incidents, lower total cost of ownership, and on and on. If you’re not an expert, you don’t know who to believe. It’s my word against theirs, and statistics are worse than lies.

I view Firefox as the tipping point. Finally, we who believe don’t have to sway people with words alone. It’s blatantly obvious to many people at this point that Microsoft’s offering is categorically inferior in ways that matter to everyone. Since Firefox has caught hold, I no longer try to convince people. They ask me when they see my t-shirt, and I just say “you might consider giving it a try. Read their website and see if you think it’s worth looking into.” Things just seem to progress after that. A week later they tell me they’re really excited about it, too. That’s when I try to let them know it’s part of a much larger picture; I tell them about the GNU project, about the ethics and philosophy behind it all. I try to give a bit of context. I hope the snowball picks up speed — I’m trying to push it faster.

Practical concerns

As time goes on my old, slow hardware has a harder and harder time with newer software (whose features I love and don’t want to live without). It’s gotten to the point that my laptop doesn’t feel responsive when browsing the web. I’m not griping about little things; I’m talking about the browser being unresponsive for many seconds while a new tab opens or something. I want to keep this old tanker around, though. First of all, it works just fine. There’s nothing wrong with it, as long as I’m running XFCE or Fluxbox (or ratpoison, better yet!) and lynx. Second, it is also proof that Linux can run just fine on old hardware. Hell, Windows 98 had a hard time on this thing, so it’s pretty amazing to see it boot up in less time than XP takes to boot on my spankin’ new laptop from work. (For those who don’t know: every version of the Linux kernel gets faster, not slower like Windows.) It’s just this new breed of software that’s getting harder and harder to run on it. Finally, I detest the “consumer” culture that says “stuff” is OK to make and throw into landfills when it becomes boring. I don’t want to contribute to that any more than I have to. I want to run this thing until it melts into an unrecognizable blob.

Enter Opera. I’ve also been a longtime fan of Opera. I bought a license for an early version on Windows, back in the bad old days when I used Windows. I have always liked Opera’s support for standards, small size and speed. I’ve had my share of gripes, but overall, it’s not all that bad to use. And there is one critical thing that makes it attractive on this old laptop: it’s much faster than Firefox. You folks with processors that go faster than 10 mph might not appreciate this fact, but use it on my old laptop and you will definitely see the difference. Opera 8.51 is fast and lightweight enough to browse the web in a reasonably usable way on my machine.

Here is my list of Opera pros:

  • I like it OK
  • it’s fast(er)
  • it has reasonably good privacy controls (a cookie whitelist)

And the cons:

  • it’s not extensible like Firefox
  • there’s no adblocking capability (you can block ads with stylesheets, but it doesn’t prevent the content from ever being loaded, which is really important for privacy in my opinion)
  • there are limited JavaScript tools
  • overall I want my features — I want Aardvark, I want Venkman, I want the Web Developer Toolbar, I want AdBlock. I feel starved for features.
  • it’s not Free Software. I balk at the feeling of betraying my ideals.

Solutions (or not) and fun

I feel conflicted. I’m thinking I might just need to bite the bullet. I might use this laptop for a file and print server, to run LAMP as a development box, and so forth. It might be time for me to build myself another computer for use as a desktop machine. After all, I’ve gotten about 7 years out of this laptop, so if I build a decent desktop machine, maybe it’ll be good for 10 more or so.

There’s more. My fiancée has a schmancy new dual-core Mac G5, which according to her can do “eighteen billion billion” of something or other. I’m not sure she knows what that means, but she told me I can quote her:

I only need to know three words: Eighteen. Billion. Billion. Are you going to quote me on your blog? Quote me where?

Hmmm, that sounds like a challenge. I might need to spend a little extra money and get the biggest and baddest now. And you thought I was all Mr. EgoDontMatter, did you?

She tells me jealousy is a horrible thing, and I can touch her computer if I’m feeling envious. When I bring up how often it crashes and forces her to reboot (how is it that a computer with eighteen billion billion somethings can’t run a few programs without crashing?) she says

It only crashes when you’re around. I’ve had 20 years of using a Mac and it never used to crash. Now you’re around, and it’s crashing.

For the record, I never did anything to her computer to cause crashes. I did show her once how, since it’s built on UNIX, you can use killall to kill programs when the point-and-click interface’s command to “forcibly kill” something gets laughed down by the offending app. Remember, Real Men Don’t Click.

I will leave you with another quote from my younger brother, who recently built a computer himself. He’s talking about my computer, after my fiancée looked online to see “how many bits her computer has”:

His computer probably only has twelve bits.

I suppose, whether it’s Opera or Firefox, the most important thing is to keep it fun. Next to Freedom, of course.


You might also like:

  1. How to prelink mozilla-firefox-bin
  2. How to guard your privacy with blacklists and whitelists
  3. Ubuntu on Dell Inspiron 1501
  4. How to set up dual monitors in Ubuntu on Dell Inspiron 1501
  5. How to auto-mount removable devices in GNU/Linux

How to label Excel and OpenOffice.org XY scatter plots

In an earlier post I compared number formatting in Excel vs. OpenOffice.org Calc. I’ve learned some more interesting things about both spreadsheets, as regards opening CSV files and adding labels to XY scatter charts (spoiler: both spreadsheets have problems).

Excel vs. Calc

Opening CSV files with Excel

Maybe someone else can answer this one for me, because I’m stumped and can’t seem to find the right search phrase to turn up relevant results in Google: I can’t get Excel to open a .csv file on my friend’s Mac. It runs OSX, and the “About Excel” dialog says “Excel X for Mac” (can you tell what a dummy I am when it comes to Mac? The only thing that saves me is the presence of the Terminal, so I can resort to the command line to do things). Both of us have tried all the ways we know. No matter what we do in the Open dialog, including choosing “All Readable Files,” it leaves CSV files grayed out. The only way we’ve found is to rename it to something else such as .txt, open the file, and then do Data->Text To Columns.

Labelling XY scatter charts

I’ve been working with cemetery data again. Recently we took a total station out to a cemetery and mapped it, then downloaded the data as tab-separated values. For a quick and dirty map of the data, it’s great to import it into a spreadsheet, select the Northing and Easting columns, and map it as a scatter plot. This gives a quick sense of what the map looks like. Of course, when you’ve got hundreds of points on the map, you want them labelled so you can see what they are, like so:

The desired result

The first column in the spreadsheet is the point’s name. We tried and tried but couldn’t get Excel to plot the points with nice labels next to them. A bit of Googling revealed lots of other frustrated folks with the same problem. This has been a limitation in Excel for many years, and so many people want this feature, I wonder why they aren’t implementing it. The good news is, someone has written a little utility which will label XY scatter plots in Excel, both for PC and Mac (here’s another link). So it’s possible to do after all — just not easy, and not built-in.

On the other hand, opening the same file with OpenOffice.org Calc and creating the same graph led me to believe it is supported in Calc. The graphing autopilot has a step where I specified the first column as labels:

Step 1, choosing the data

But after following through the rest of the steps — choose chart type, etc etc — the final result has no labels. I fooled around with it for a while, read the documentation and surfed the web, but still couldn’t get it to show the labels. Only after I posted on the OpenOffice.org forums did I find an answer:

  1. Select the data, start the graphing AutoPilot, check “First column as labels” and create the graph
  2. Place the cursor over the unselected graph and right click. Select “Edit”
  3. Select “Insert > Data Labels…” and check “Show Label Text”

I probably would not have solved this on my own. The way to select and unselect charts, and how to modify their properties, is really unintuitive, I’m afraid. Even after fooling with charts a while, I’m still blundering through things like exactly what sequence of actions is necessary to make a chart editable, what I need to do to alter the scale on the axes, and so forth. Even if I had known all that in advance, though, I wouldn’t think to go to the Insert menu to add labels to the chart. I told it to do that when I created the chart — why doesn’t it show them by default? If I didn’t want them to show, I wouldn’t have specified the first column as labels.

I conclude that both Excel and OpenOffice.org have some room for improvement. I’m sure that comes as no surprise! The good news is, OpenOffice.org is a community-driven effort, with an open bug-tracking system and active forums, not to mention it’s Free Software. You know who I’m backing… take ‘em to the mat!


You might also like:

  1. How to convert text to columns in OpenOffice.org Calc
  2. Excel vs. OpenOffice.org Calc in number formatting
  3. Seldom-used HTML form elements

JavaScript regular expression toolkit

I have created a web page that matches regular expressions against arbitrary input text and displays the results graphically, so you can take some sample text and build regular expressions the easy way, with immediate feedback about what matches and where, where you have errors, and more.

Regular Expression Toolkit Screenshot

I’ve been working on this for a couple of weeks, a minute here and there as I get time. A few days ago I saw someone created a very nifty similar app over at Rex V. Apparently the pundits are right — never assume you’re the only one with an idea.

Mine is simpler and doesn’t use AJAX. It’s JavaScript only. Thanks to the folks at ActiveState for the idea — I was inspired by Komodo. I have to wonder whether Rex V was too!

Here it is: the JavaScript regular expression toolkit.

I used my work on grouping data visually with row groups and browser variations in RegExp.exec() to build this tool.


You might also like:

  1. How to avoid VBScript regular expression gotchas
  2. Javascript date parsing and formatting, Part 2
  3. Browser variations in RegExp.exec()
  4. MySQL Find 0.9.0 released
  5. How to block ads effectively with AdBlock regular expressions

Browser variations in RegExp.exec()

IE6, Firefox 1.5 and Opera 8.5 handle regular expressions slightly differently. I analyzed them and learned some interesting things about the browsers. For this article, I’m using abcde as the input, and /abc|d(e)/g as the regular expression. The expression should match twice, and the second match should capture the letter “e” because it’s in a capturing subexpression.

Properties of match results

Calling RegExp.prototype.exec() with a global regular expression returns an Array object with some extra properties. IE and Opera don’t quite agree with ECMA-262 — they add extra properties and don’t create properties that should exist (Firefox gets it right).

In case you don’t know the ins and outs of JavaScript regexes, there is a subtlety about capturing subexpressions and global regular expressions. exec actually only looks for one match even though the expression is global, but sets the index at which the match happened. If I call it again, it keeps looking from where it left off, so I can loop through the successive matches. Each time the result is an array containing the current match, plus the captures and information about where the match occurred. As usual, index 0 of the array contains the text of the entire match, index 1 contains the first captured subexpression, and so on.
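The loop described above can be sketched as follows. The helper name `collectMatches` is my own invention for illustration; the only real API it relies on is `exec` and the regex's `lastIndex` property.

```javascript
// Collect every match of a global regex by calling exec() repeatedly.
function collectMatches(re, input) {
  if (!re.global) throw new Error("collectMatches needs a regex with the g flag");
  const matches = [];
  let result;
  // exec() resumes from re.lastIndex and returns null when nothing is left.
  while ((result = re.exec(input)) !== null) {
    matches.push({
      text: result[0],                                  // the entire match
      index: result.index,                              // where it occurred
      captures: Array.prototype.slice.call(result, 1),  // captured subexpressions
    });
    // Guard against an infinite loop on zero-length matches.
    if (result[0] === "") re.lastIndex++;
  }
  return matches;
}

const found = collectMatches(/abc|d(e)/g, "abcde");
console.log(found);
```

Without the `g` flag, `exec` would start from the beginning every time and the loop would never end, which is why the sketch refuses non-global expressions.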

Here’s what I get when I call /abc|d(e)/g.exec("abcde") twice in a row on the browsers in question:

I got the results with a for/in loop, like so:

var re = /abc|d(e)/g;
var result = re.exec("abcde");
for (var prop in result) {
    ...

Here are the differences:

  • Opera doesn’t enumerate over the first captured subexpression in the first result. In Firefox, it exists without a value (has the special value undefined), and in IE it exists with a value — the empty string.
  • IE adds the proprietary lastIndex property to the result.

Subexpressions

I said Opera doesn’t enumerate over the property named “1” in the first result. According to the spec, the property named “1” should still exist. Opera knows the property should exist, as I proved by examining the length property. Its value is 2 in all browsers, which is correct as specified by section 15.10.6.2 of the spec:

15.10.6.2 RegExp.prototype.exec(string)

Performs a regular expression match of string against the regular expression and returns an Array object containing the results of the match, or null if the string did not match

…snip…

  …[steps 1-11 omitted]…
  12. Let n be the length of r’s captures array. (This is the same value as 15.10.2.1’s NCapturingParens.)
  13. Return a new array with the following properties:
    • The index property is set to the position of the matched substring within the complete string S.
    • The input property is set to S.
    • The length property is set to n + 1.
    • The 0 property is set to the matched substring (i.e. the portion of S between offset i inclusive and offset e exclusive).
    • For each integer i such that i > 0 and i ≤ n, set the property named ToString(i) to the ith element of r’s captures array.

In other words, the length of the array should be 2 even in the first match, because the length of the array depends only on the number of capturing subexpressions in the pattern — so the browsers are doing the right thing.

If the property exists, Opera should enumerate it in the for/in loop. The spec is clear about what properties are enumerable (section 15.2.4.7), and it never says such a property should get the dontEnum attribute, so I think Opera’s behavior is incorrect. In fact, I’m pretty sure Opera is actually never creating the property. I ran some tests with an Array and set one of the properties to undefined. Opera still enumerates it, so it’s not as though Opera doesn’t enumerate properties that have no value. I think Opera is setting length to 2, but never creating properties for capturing subexpressions that don’t participate in the match. Technically this does not violate the spec’s instructions on an Array’s length property, but it is suspicious.
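The distinction I'm drawing, between a property that exists with the value undefined and a property that was never created, can be reproduced with plain arrays. This is only a sketch of the kind of test I mean; `enumeratedKeys` is a hypothetical helper, and it assumes nothing enumerable has been added to Array.prototype.

```javascript
// Index 1 exists here and holds the value undefined.
const withUndefined = ["abc", undefined];

// Here length is 2, but index 1 was never created.
const sparse = ["abc"];
sparse.length = 2;

// List the property names a for/in loop visits.
function enumeratedKeys(obj) {
  const keys = [];
  for (const key in obj) keys.push(key);
  return keys;
}

console.log(enumeratedKeys(withUndefined)); // visits "0" and "1"
console.log(enumeratedKeys(sparse));        // visits only "0"
console.log(1 in withUndefined, 1 in sparse);
```

Both arrays report a length of 2 and both return undefined for index 1, so only enumeration or the `in` operator tells them apart.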

The moral of the story is you shouldn’t use a for/in loop when iterating through subexpressions. Just iterate from 0 through length minus one.
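Here is a minimal sketch of that advice. `listCaptures` is a hypothetical helper; it walks the capture indexes directly and normalizes a non-participating group to null, so the result does not depend on how a browser enumerates the array.

```javascript
// Read captures by numeric index rather than with for/in.
function listCaptures(result) {
  const captures = [];
  // Index 0 is the whole match, so captures start at 1.
  for (let i = 1; i < result.length; i++) {
    // A group that did not participate is reported here as null.
    captures.push(result[i] === undefined ? null : result[i]);
  }
  return captures;
}

const re = /abc|d(e)/g;
const first = re.exec("abcde");
const second = re.exec("abcde");
console.log(listCaptures(first), listCaptures(second));
```

In a browser that hands back the empty string for a non-participating group, as IE6 does, this sketch would pass the empty string through unchanged.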

I take exception to IE giving the capture a value. The subexpression doesn’t capture anything and doesn’t participate in the match, so it should not have a value — not even the empty string or null. I suppose this one is up for debate, but that’s my personal opinion.

IE’s lastIndex property

Technically, this property shouldn’t be on the result at all. ECMA-262 (section 15.10.7.5) makes lastIndex a property of the regular expression object itself, which is where exec reads and updates the position between calls. IE copies it onto the result array as well; that’s proprietary, though I can see the appeal of having the position travel with the match.

Other stuff

I spent a lot of time messing with the various browsers to see if I could find obscure bugs enumerating properties that don’t exist, setting a value and then unsetting it on subsequent calls, and so forth. The good news is I didn’t find any more bugs (though they could still exist!), and I found that the quasi-bugs discussed above are really trivial.


You might also like:

  1. How to avoid VBScript regular expression gotchas
  2. JavaScript regular expression toolkit
  3. How to create input masks in HTML
  4. How to find duplicate and redundant indexes in MySQL
  5. Javascript date parsing and formatting, Part 2

How to display an HTML table as a folder tree

XHTML tables provide several elements to group and structure data, including row groups (thead, tbody, and tfoot). Styling row groups with CSS can make data relationships visually obvious. One familiar way to group data visually is with Explorer-style folder icons.

Data grouped as a folder view

The basic idea is to use tbody as many times as needed to group each set of rows together. The image above shows a single tbody element. I think using multiple tbody elements may not occur to developers because it sounds like there ought to be only one — but that’s not true. Tables can have as many tbody elements as you want. You can optionally have one (and only one) thead and tfoot too. Read the Tables in HTML documents part of the HTML spec for more, if you want (there’s no need to for this article).

The next thing to do is add some CSS. The image will go at the far left of the leftmost (first) td as a background image, and I’ll add some left-padding to keep the text from overlapping the image. I identify the leftmost column with the first-child class.

Next, the first and last rows in the group need special treatment. The middle rows get a little dotted “tree-view” extender line, but the first row needs a folder icon and the last needs the extender line not to continue downward (because there’s nothing below it to connect to). To accomplish this, the first row in the group gets the first-child class, and the last gets class="last-child". Now I can use these to set different background images for the first and last rows in the group.
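To make the markup concrete, here is a rough sketch that generates it from data. The function names `renderGroups` and `escapeHtml` and the sample data are hypothetical; the class names are the ones described above. The CSS itself is not shown.

```javascript
// Escape text so it is safe to place inside a table cell.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build one <tbody> per group, tagging the first and last rows
// and the leftmost cell of every row.
function renderGroups(groups) {
  return groups.map((rows) => {
    const body = rows.map((cells, rowIndex) => {
      const rowClasses = [];
      if (rowIndex === 0) rowClasses.push("first-child");
      if (rowIndex === rows.length - 1) rowClasses.push("last-child");
      const attr = rowClasses.length ? ` class="${rowClasses.join(" ")}"` : "";
      const tds = cells.map((cell, i) =>
        i === 0
          ? `<td class="first-child">${escapeHtml(cell)}</td>`
          : `<td>${escapeHtml(cell)}</td>`
      ).join("");
      return `<tr${attr}>${tds}</tr>`;
    }).join("\n");
    return `<tbody>\n${body}\n</tbody>`;
  }).join("\n");
}

console.log(renderGroups([[["Fruit", "3"], ["Apple", "1"], ["Pear", "2"]]]));
```

A group with a single row gets both classes on the same tr, so the stylesheet has to decide which background wins in that case.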

If I knew that my browser was a Good Browser such as Opera, Firefox or Konqueror, I could use the CSS selector :first-child instead of adding classes, but since IE is still popular, I’m adding the classes to the HTML instead.

This part is optional, but I like to do it because it keeps the number of images down: use exactly the same image for every row (first, middle and last), and set the background-position property so a different part of the image will show up (top, middle, and bottom of the image).

That’s it! Here’s a demo.

Another option is to re-code the first row of each group with th elements instead of td. The scope attribute can then be set to rowgroup, which conveys additional semantic meaning about the row and eliminates the need to add the first-child class to the tr. Whether I do that depends on the data. I don’t think it makes sense for my demo, but I can imagine data applications where it does. I can also imagine making the leftmost column th instead of td in my sample data; that strikes me as appropriate. Regardless of how I do it, if I mark the data up semantically, I can use CSS to reflect that meaning visually.


You might also like:

  1. How to find duplicate rows with SQL
  2. Simple and complex types in XML Schema
  3. Advanced HTML table features, Part 1
  4. How to find data distributions with SQL
  5. JavaScript regular expression toolkit

ASP.NET’s Profile DB schema

ASP.NET has built-in functionality to store profile information about a user. The DB table schema has several design trade-offs that make it somewhat inflexible for certain uses.

ASP.NET will write a custom class, given the properties you want, such as name and birthdate. It will also take care of hooking the plumbing up in the database (there is a little script to create the profile tables in the database). It then stores and retrieves the data on subsequent requests. The feature can handle both text and binary data, but for simplicity’s sake, I’ll just ignore the binary. Since the profile could contain arbitrary information, the table has to be designed to accommodate any type of data — essentially name/value pairs. Here’s the table schema:

CREATE TABLE dbo.aspnet_Profile (
    UserId uniqueidentifier NOT NULL PRIMARY KEY CLUSTERED,
    PropertyNames ntext NOT NULL,
    PropertyValuesString ntext NOT NULL,
    PropertyValuesBinary image NOT NULL,
    LastUpdatedDate datetime NOT NULL
)

Hmmm, that’s an interesting schema. How do you store name/value pairs in that? I’d expect to see a UserID column and a Name column, with the primary key on UserID and Name, but it looks like they must be storing the data another way. For one thing, there can’t be multiple rows per user — all the values have to be in one row. I could see someone arguing that’s a good idea, because it would keep the data all on one page — but the columns are ntext and image so they’re not stored in-page anyway. That results in a compact table, with a small clustered index to seek for the user’s row, but then the DB has to seek to other pages and find the data stored in those three columns. So how is the data stored?

select top 1 UserId, PropertyNames, PropertyValuesString from aspnet_Profile;

Results:

Yuck! So the object just dehydrates itself in a fashion similar to PHP’s serialize and re-writes the entire row whenever it saves itself into the database (I’m guessing it re-writes the entire row; perhaps it’s smart enough to know that the binary data doesn’t need to be re-written if only the text has changed, though the design doesn’t instill much confidence about that). This is a very bad design. The table isn’t even in first normal form. There is also no decent way to use this data except through the Profile objects. I can’t grab the data and query it for reports or whatnot. And finally, those ubiquitous Microsoft uniqueidentifier 128-bit surrogate keys are rearing their ugly heads.
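To show what reading this data outside the Profile objects would involve, here is a hedged sketch of a parser. It rests on an assumption: that PropertyNames is a run of name:type:start:length: entries (S for string, B for binary) indexing into the concatenated PropertyValuesString. That layout is how I understand the format, not something verified here, and the sample row is made up for illustration.

```javascript
// ASSUMED layout: names = "Name:S:start:length:" repeated,
// values = every string value concatenated into one column.
function parseProfile(propertyNames, propertyValuesString) {
  const parts = propertyNames.split(":");
  const profile = {};
  // Each property uses four fields: name, type, start offset, length.
  for (let i = 0; i + 3 < parts.length; i += 4) {
    const [name, type, start, length] = parts.slice(i, i + 4);
    if (type !== "S") continue; // this sketch skips binary ("B") entries
    const from = Number(start);
    profile[name] = propertyValuesString.substring(from, from + Number(length));
  }
  return profile;
}

// Made-up sample row, not real output from the table.
const sample = parseProfile("FirstName:S:0:4:LastName:S:4:5:", "JohnSmith");
console.log(sample);
```

Even if the assumption holds, every report would need code like this in front of it, which is exactly what a normalized name/value table avoids.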

I’m surprised and nonplussed. After all the gazillions of dollars that went into ASP.NET 2.0… I’ll give them some credit and say “it’s good that they found a way to store the data in the table without customizing the table schema based on the desired profile properties,” but this design is barely a step up from that. This schema is missing all the obvious benefits of normalization.


You might also like:

  1. What is your favorite database design book?
  2. How to simulate the GROUP_CONCAT function
  3. How to escalate privileges in MySQL

Don’t change a constant variable

A company for whom I have done some coding advertises their years of service on their website. Every year after the New Year, someone notices the dates are out of whack, sends around an email and it has to be fixed. It’s not quite hard-coded, if that’s what you’re thinking. It’s just that the wrong thing is hard-coded in the website’s configuration file, Config.asp:

Const YearsOfService = 31

I’ve seen someone update that variable literally every year I’ve been involved with the company in question. Today it happened again:

Const YearsOfService = 32

A moment’s thought shows there is something wrong with this code. YearsOfService cannot possibly be a constant, right? Unless it’s posthumous and the company will never add another year of service. The issue is that we’re holding the wrong data constant: the real constant, which will not change (hence the name) is the year the company began offering service.

I proposed the following code:

YearsOfService = DateDiff("YYYY", "1/1/1974", Now())

I got the terse reply “Go for it.” Of course I did. Why is this so hard? I can see missing the obvious once, but year after year after year? In a team of six or seven people? How can you explain everyone missing it time after time? I don’t get it. This is really, really easy. Even if you postulate that it takes a deep thinker to notice the incongruence about a “constant” that has to be updated, it doesn’t take a genius to notice a pattern after you do something really simple a number of times.
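For comparison, the same idea sketched in JavaScript: hold the founding year constant and derive the rest. The names `FIRST_YEAR_OF_SERVICE` and `yearsOfService` are hypothetical, and the optional `today` parameter exists only so the function can be checked against a fixed date.

```javascript
// The real constant: the year the company began offering service.
const FIRST_YEAR_OF_SERVICE = 1974;

// Like DateDiff("YYYY", ...), this counts calendar-year boundaries crossed,
// so the number ticks over on January 1.
function yearsOfService(today = new Date()) {
  return today.getFullYear() - FIRST_YEAR_OF_SERVICE;
}

console.log(yearsOfService());
```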

I really want to wrap this post up by saying “it only seems easy, but there’s a factor that’s not obvious, which explains the whole thing.” But I can’t. I don’t see any such factor. If you were hoping for some insight, sorry, I can’t offer it. *sigh*


You might also like:

  1. MySQL Community Member of the Year

Credit card expiration dates should conform to standards

My credit card says it expires “06/07”. What is that? Is it June 2007, or July 2006? You may think I’m being silly, but it confuses me. I’m not as smart as some people, but if it confuses me, it’s gonna confuse some others too.

expires WHEN???

I recently placed an order online and got the expiration date wrong. It wouldn’t have been all that bad if the entry form had mimicked exactly what’s on the front of my card, but the online form had a 4-digit year pulldown, followed by a two-digit space for the month — exactly the opposite order from my card. As a result, my card didn’t go through and the order became a big hassle. Yuck!

All this could have been avoided if the expiration date were specified as YYYY-MM. Is there a reason not to do this? Maybe the machines that stamp the cards would have to be replaced in order to change the dates, I don’t know. Regardless, it’s confusing for poor little me.
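As a small illustration of the unambiguous form, here is a sketch that turns a card-style date into YYYY-MM. It assumes the card prints MM/YY (month first), which I believe is the usual convention, and that the year falls in the 2000s; `expiryToIso` is a hypothetical name.

```javascript
// Convert a card-style "MM/YY" string to ISO 8601 "YYYY-MM".
// ASSUMPTION: month comes first and the century is 2000.
function expiryToIso(cardText, century = 2000) {
  const match = /^(\d{2})\/(\d{2})$/.exec(cardText.trim());
  if (!match) throw new Error("Expected MM/YY, got: " + cardText);
  const month = Number(match[1]);
  if (month < 1 || month > 12) throw new Error("Month out of range: " + match[1]);
  const year = century + Number(match[2]);
  return year + "-" + match[1];
}

console.log(expiryToIso("06/07"));
```

The month range check is what catches a swapped year and month once the two-digit year is larger than 12; before then, nothing in the string itself can tell the two readings apart.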

This won’t get any better until 2013, when two-digit years will be distinguishable from months. Alas.

Hey you credit card companies, why don’t you use ISO-8601 date formats?


You might also like:

  1. How to create input masks in HTML
  2. Favorite USB wireless card for Ubuntu?
  3. Javascript date parsing and formatting, Part 2

How to format numbers in JavaScript flexibly and efficiently

Download number-functions

This article continues my series on parsing and formatting data with JavaScript, this time with numeric data. I don’t need to do number parsing, but formatting is very useful. The technique is similar to my date formatting code — code that writes code (for raw speed), using custom format specifier strings (for flexibility and ease of use). The result is number formatting functionality that is highly efficient, flexible, and easy to use.

First, the idea: you have a number, you want it formatted a certain way. Here’s how:

var dollars = 5.001;
alert(dollars.numberFormat("$0.00"));
// result: "$5.00"
var percent = .08134;
alert(percent.numberFormat("0.00%"));
// result: "8.13%"
var bignum = 12831242485472;
alert(bignum.numberFormat("0,0,, million"));
// result: "12,831,243 million"

My custom date formatting code used PHP’s date-formatting syntax because it’s much less context-sensitive and (I think) more useful than Microsoft’s, but my number-formatting syntax is similar to Microsoft’s because it’s much more widely used and I don’t see an existing, better alternative. Rather than documenting it separately, I’ll just point you to the (poor quality) Microsoft documentation for the .NET Custom Numeric Format Strings functionality, and list the differences from my implementation:

  • Rounding works differently in multi-section format strings. In .NET with a two-section string,

    If the number to be formatted is negative, but becomes zero after rounding according to the format in the second section, then the resulting zero is formatted according to the first section.

    This is not true in my code — the number is formatted according to its value, and once the code decides which section applies, that section will be used no matter what happens during rounding.
  • Question marks are digit placeholders just like the number sign (#), but if there’s no digit to insert, they get replaced with spaces, not removed. They can be used for space-padding, which might be useful for, say, accounting notation.
  • You don’t have to enter quotes around strings that should be mixed in with the number placeholders. In fact, my syntax is much more permissive than the Microsoft syntax: anything can go anywhere. You can put arbitrary strings smack in the middle of your number if you want.
  • It’s not internationalized.

I’ve only implemented a subset of the various number-formatting syntaxes I’ve seen in spreadsheets and so forth. The subset is about 85% complete in my opinion. However, I think it’s functionally about 99% complete, which means I think 99% of the time you want to format a number, it will do what you want. The tradeoff is simplicity and speed. Number formatting is actually much more difficult than date formatting, and I’ve tried to keep the code sane.
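To give a flavor of the "code that writes code" approach, here is a deliberately tiny sketch. It is not the library's implementation: `compileFormat` is a hypothetical name, and it understands only a literal prefix, a single 0, an optional run of decimal zeros, and a suffix that may start with a percent sign. The point is that the format string is parsed once, turned into a function body, and cached.

```javascript
// One compiled function per format string.
const formatterCache = new Map();

// Compile a very small subset of format strings into a function.
function compileFormat(format) {
  if (formatterCache.has(format)) return formatterCache.get(format);
  const m = /^([^0]*)0(?:\.(0+))?(.*)$/.exec(format);
  if (!m) throw new Error("Unsupported format: " + format);
  const [, prefix, decimals = "", suffix] = m;
  // A leading "%" in the suffix means the value is scaled by 100.
  const scale = suffix.indexOf("%") === 0 ? "n * 100" : "n";
  // Generate the source of the formatter, then compile it once.
  const body = "return " + JSON.stringify(prefix) +
    " + (" + scale + ").toFixed(" + decimals.length + ") + " +
    JSON.stringify(suffix) + ";";
  const formatter = new Function("n", body);
  formatterCache.set(format, formatter);
  return formatter;
}

console.log(compileFormat("$0.00")(5.001));
console.log(compileFormat("0.00%")(0.08134));
```

After the first call for a given format, formatting a number costs one cache lookup and one call to a function with no parsing or branching in it, which is where the speed comes from.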

I have a set of unit tests, which use the excellent JsUnit library. Bring up the unit test page and enter the following URL to be tested: www.xaprb.com/articles/number-test.html.

Of course there’s the obligatory demo page, too.


You might also like:

  1. Javascript date parsing and formatting, Part 2
  2. JavaScript date parsing and formatting, Part 1
  3. Excel vs. OpenOffice.org Calc in number formatting
  4. JavaScript formatting library update
  5. JavaScript Number Formatting Library v1.3 released