Archive for October, 2005

How to block ads effectively with AdBlock regular expressions

One of the greatest things about Firefox is the ability to customize the browser with extensions. By far my favorite extension is Adblock. It allows you to specify arbitrary regular expression patterns to block your browser from fetching and displaying content. The regular expression syntax is standard JavaScript syntax, based on Perl 5.

Blocking content is a delicate dance. It’s very easy to block too much, so the patterns need to be fairly specific. The following patterns will match almost any content I don’t want to see on the Internet, and rarely block something I do want to see. Note that these are regular expressions.

  • \.swf
  • \bads\b
  • 2o7
  • a1\.yimg
  • adbrite
  • adclick
  • adfarm
  • adrevolver
  • adserver
  • adtech
  • advert
  • atdmt
  • atwola
  • banner
  • bizrate
  • blogads
  • bluestreak
  • burstnet
  • casalemedia
  • coremetrics
  • doubleclick
  • falkag
  • fastclick
  • feedstermedia
  • googlesyndication
  • hitbox
  • httpads
  • imiclk
  • intellitxt
  • js\.overture
  • kanoodle
  • kontera
  • mediaplex
  • nextag
  • pointroll
  • qksrv
  • rightmedia
  • speedera
  • statcounter
  • tribalfusion
  • webtrends

To use it, join the list with | and start and end it with /. The following is everything except \.swf (for the reason given at the end of this article) together as a single regular expression. This is my one and only Adblock filter:

/\bads\b|2o7|a1\.yimg|adbrite|adclick|adfarm|adrevolver|adserver|adtech|advert|atdmt|atwola|banner|bizrate|blogads|bluestreak|burstnet|casalemedia|coremetrics|doubleclick|falkag|fastclick|feedstermedia|googlesyndication|hitbox|httpads|imiclk|intellitxt|js\.overture|kanoodle|kontera|mediaplex|nextag|pointroll|qksrv|rightmedia|speedera|statcounter|tribalfusion|webtrends/

If you’re familiar with regular expressions, you will have realized some of the entries can be combined into one with grouping. Without taking this to extremes, here is the shorter version:

/\bads\b|2o7|a1\.yimg|ad(brite|click|farm|revolver|server|tech|vert)|at(dmt|wola)|banner|bizrate|blogads|bluestreak|burstnet|casalemedia|coremetrics|(double|fast)click|falkag|(feedster|right)media|googlesyndication|hitbox|httpads|imiclk|intellitxt|js\.overture|kanoodle|kontera|mediaplex|nextag|pointroll|qksrv|speedera|statcounter|tribalfusion|webtrends/
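The joining step can be sketched in a few lines. This is only an illustration in Python, not how Adblock itself works; Adblock evaluates the filter with JavaScript's regular expression engine, although these particular patterns use syntax common to both. The test URLs and the is_blocked helper are made up for the example.

```python
import re

# The patterns from the list above, minus \.swf (left to Flashblock).
PATTERNS = r"""
\bads\b 2o7 a1\.yimg adbrite adclick adfarm adrevolver adserver adtech
advert atdmt atwola banner bizrate blogads bluestreak burstnet casalemedia
coremetrics doubleclick falkag fastclick feedstermedia googlesyndication
hitbox httpads imiclk intellitxt js\.overture kanoodle kontera mediaplex
nextag pointroll qksrv rightmedia speedera statcounter tribalfusion webtrends
""".split()

# Join with | and wrap in /.../ to get the single Adblock filter.
joined = "|".join(PATTERNS)
adblock_filter = "/" + joined + "/"

# The shorter, grouped version of the same expression.
GROUPED = (
    r"\bads\b|2o7|a1\.yimg|ad(brite|click|farm|revolver|server|tech|vert)"
    r"|at(dmt|wola)|banner|bizrate|blogads|bluestreak|burstnet|casalemedia"
    r"|coremetrics|(double|fast)click|falkag|(feedster|right)media"
    r"|googlesyndication|hitbox|httpads|imiclk|intellitxt|js\.overture"
    r"|kanoodle|kontera|mediaplex|nextag|pointroll|qksrv|speedera"
    r"|statcounter|tribalfusion|webtrends"
)


def is_blocked(url, pattern=joined):
    """Return True if the pattern matches anywhere in the URL."""
    return re.search(pattern, url) is not None


if __name__ == "__main__":
    # Hypothetical URLs, chosen to show why \bads\b needs the word boundaries.
    for url in (
        "http://ads.example.com/img.gif",
        "http://example.com/downloads/file.zip",
        "http://ad.doubleclick.net/x",
    ):
        print(url, is_blocked(url), is_blocked(url, GROUPED))
```

The second URL contains "ads" inside "downloads", which is exactly the kind of false positive the \b anchors are there to avoid.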

I don’t block swf because the Flashblock extension blocks Flash more conveniently than Adblock, in my opinion.

How to quote and encode XML attribute values

Attribute values in XML are usually double-quoted, but single-quotes can be used as well, according to the relevant part of the XML Spec. Here is the production:

[10] AttValue ::= '"' ([^<&"] | Reference)* '"'
                  | "'" ([^<&'] | Reference)* "'"

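In plain English: an attribute value consists of

  • a double or single quote
  • any number of the following:
    • any character except <, &, or the quote character that opened the value, OR
    • a reference (an entity reference such as &amp; or a character reference such as &#60;)
  • the same quote character that was used to begin the value (double or single quote)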

What’s most interesting about this to me is that a < is forbidden inside attribute values, but a > is not. I always assumed both were illegal.
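Here is a minimal sketch of how the production translates into an encoding routine. The function name quote_attribute is my own invention for illustration, and the sketch ignores attribute-value whitespace normalization (tabs and newlines), so treat it as a starting point rather than a complete encoder.

```python
def quote_attribute(value):
    """Return value as a quoted XML attribute value (the AttValue production)."""
    # & and < may never appear literally, whichever quote is used.
    # A literal > is allowed, so it is left alone.
    escaped = value.replace("&", "&amp;").replace("<", "&lt;")

    # Prefer double quotes; fall back to single quotes if that avoids escaping.
    if '"' not in escaped:
        return '"' + escaped + '"'
    if "'" not in escaped:
        return "'" + escaped + "'"

    # Both kinds of quote are present: use double quotes and
    # replace the inner double quotes with an entity reference.
    return '"' + escaped.replace('"', "&quot;") + '"'


if __name__ == "__main__":
    for raw in ('a > b', 'a < b & c', 'say "hi"', '"it\'s"'):
        print(quote_attribute(raw))
```

Python's standard library has xml.sax.saxutils.quoteattr, which does a similar job but also escapes >, even though the spec does not require it.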

This is why I love reading specs. The XML spec is a great example of clear and terse writing. There is no chance for confusion when reading the productions themselves! Any secondary source can only obscure the matter, in my opinion.

How to read the clipboard from JavaScript

Microsoft Internet Explorer exposes the contents of the clipboard to JavaScript on websites. This is a bad thing.

This is just another in the long string of examples of how Microsoft, in an attempt to get developers on board and build the largest software empire in the world, has made everything easy for developers and totally disregarded security.

My personal opinion, as always, is Internet Explorer should never be used.

How to create a VB6 console program

Visual Basic 6 programs can be run as console programs, if configured correctly. There are four basic requirements to create a useful console program in VB6:

  • Remove all forms and dialogs
  • Provide access to standard input, output, and error streams
  • Provide access to the command-line arguments
  • Re-link the program for the Windows Console subsystem

Remove forms and dialogs

By default, a VB6 project has “forms” or “windows,” which can contain application code. When running a program in the console, you don’t want anything but the console, ever. When you create a VB6 project, just remove all the forms from it, and add a module. You need at least one module, which will contain a subroutine called Main(). When you look at the project properties, you will see the “startup object” set to Sub Main.

There is still a possibility that some dialogs can be created. For example, a runtime error will pop up a dialog. To avoid this, choose the “Unattended Execution” checkbox in the Project Properties dialog. By default, dialogs will now be shunted to the Windows Event Log. You can control this with the App.StartLogging method, if desired.

Get access to stdio streams

A console app usually needs to work with standard input and output. There are at least two ways to accomplish this: by using the Win32 API, and by using the Scripting.FileSystemObject’s text streams. In either case, the streams will not be available when running the app in the debugger, so it may be a good idea to create a wrapper around the calls and only try to use them if they are available. The Win32 API calls are easy to use, and I have posted sample code for your reading pleasure. The Scripting.FileSystemObject’s text streams are equally easy to use. Microsoft’s FileSystemObject documentation should help you get started on those. You will need to add a reference to “Microsoft Scripting Runtime” in your project to use the FileSystemObject.

Get access to command-line arguments

The text of the command-line arguments with which the VB6 console app was invoked is available by calling the Command() function, but it is non-trivial to parse the text into individual arguments such as those C programmers are used to using. It’s not impossible; depending on your needs you may be able to use regular expressions, the Split() function, a tokenizer (finite state machine), or invoke the Win32 API again by calling the CommandLineToArgvW function. The latter uses Unicode, so you will need to convert between VB strings and Unicode. The StrConv() function will help here, but on the reverse conversion you will need to do a bit more. Google will provide many links to examples of using these two functions for this job.
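To show what the tokenizer (finite state machine) option looks like, here is a minimal sketch. It is written in Python rather than VB6 purely to keep it short, and the rules are a simplifying assumption of mine: whitespace separates arguments and double quotes group text into one argument. It does not reproduce the exact rules CommandLineToArgvW follows (backslash escapes, for instance), and the function name is hypothetical.

```python
def split_command_line(text):
    """Split a raw command-line string into a list of arguments.

    Simplified rules (an assumption, not the exact Windows rules):
    whitespace separates arguments, and double quotes group text
    containing whitespace into a single argument.
    """
    args = []
    current = []          # characters of the argument being built
    in_quotes = False     # state: inside a double-quoted section?
    have_token = False    # lets an empty quoted argument ("") count

    for ch in text:
        if ch == '"':
            # Toggle the quoted state; the quote itself is not kept.
            in_quotes = not in_quotes
            have_token = True
        elif ch.isspace() and not in_quotes:
            # Unquoted whitespace ends the current argument, if any.
            if have_token:
                args.append("".join(current))
                current = []
                have_token = False
        else:
            current.append(ch)
            have_token = True

    # Flush the last argument.
    if have_token:
        args.append("".join(current))
    return args


if __name__ == "__main__":
    print(split_command_line('/in "C:\\My Files\\a.txt" -v'))
```

The same two-state loop translates directly to VB6 using Mid$() to walk the string returned by Command().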

Re-link the program for the Windows Console subsystem

There seems to be no option in the VB project properties or compile options to do this automatically when making the program, so you will need to re-link after compilation. If you don’t do this, your program will not run correctly. The standard streams will not be available, for one thing. Fortunately, it is quite easy to do:

"C:\Program Files\Microsoft Visual Studio\vb98\LINK.EXE" /EDIT /SUBSYSTEM:CONSOLE <yourfile.exe> (this code should all be on one line).

A handy shortcut is to create a batch file with the command in it. You can then drag your .EXE file onto the batch file. Assuming LINK.EXE is in your path, the following will work:

LINK.EXE /EDIT /SUBSYSTEM:CONSOLE %1

Don’t name the batch file “link.bat” or it will call itself! Another of Microsoft’s insecure default behaviors.

Acknowledgements

I have gleaned this code from all over the Internet. Very little of it is my own.

How to use the Visual SourceSafe automation interface

Microsoft Visual SourceSafe provides an automation interface that can be used from within VBScript, VB6 and other languages. This article lists the options for automating SourceSafe, provides links to documentation, and discusses some bugs that are impossible to work around.

Documentation

I am in the unenviable position of needing to write a VBScript program to interact with SourceSafe. Unfortunately the documentation is hard to find and not at all clear or user-friendly. Here are the two links I find useful:

Bugs

I notice a number of bugs. For instance, no matter what I do, I cannot get the IVSSItem.Get or IVSSItem.CheckOut methods working in VBScript (they do work in VB6). I just get strange errors about “Type mismatch” or “Invalid DOS path:” depending on how I call the method. I see others on the web have had the same problem, but no solution. It’s pretty miserable.

Command-line interface

There is a command-line tool, SS.EXE, but it is almost totally unusable for scripting purposes. It has strange dependencies on the Visual SourceSafe client program, requires environment variables instead of accepting command-line arguments, and does very frustrating things that cannot be overridden easily. For example, the “get” command gets the file into the current directory even if you specify a subdirectory; in other words, the command ss Get dir/file.txt will get the file into the current directory, not dir. This is a shame, as this is the command I really want to use because it doesn’t work via the automation interface in VBScript.

How to style HTML lists consistently in all browsers

IE’s and Mozilla’s ordered and unordered lists are rendered similarly by default, but the way the list is indented is opposite in the two browsers. Understanding how to style lists correctly is key to avoiding unexpected ugliness. In this article, I explain how UL and OL are styled by default, how to re-style them so they behave consistently, and uncover an incompatibility that cannot be fixed.

The example list is the same as the example list used in the CSS list-style-position property definition. Each sample image shows Mozilla’s rendering on the left and IE’s on the right. To show visually what is happening, I styled the left border of the content area, the list, and the list items red, black and blue respectively.

With default styling, the colored borders made it clear that the left borders of the list were in different places in the two browsers, even though the content was in the same position. In Mozilla, the list’s box extended all the way left to the content area. There was about 40px of space between the list’s left edge and the list item’s left edge. It was not obvious whether this was created by the UL’s padding or the LI’s margin. In IE, the left edges of the UL and LI were next to each other, so I guessed the indentation was created by the UL’s left margin. In both cases, it was clear the LI had no padding, but there was no way to know if it had a margin in Mozilla.

default styling

To understand whether Mozilla adds padding to the UL or margin to the LI, I removed the padding and margin from the elements and watched the results. First, I removed the margin from the UL:

margin-left: 0

There was no change in Mozilla, so that wasn’t it. Based on that, I decided there was probably padding on the UL. IE collapsed the list all the way to the left edge, so as expected, IE must use the margin on the UL to indent the bullets. Next, I removed the padding from the UL and reset the margin to the default:

padding-left: 0

This time IE was unchanged from the default, and Mozilla collapsed to the left edge, so I guessed right.

At this point, I understood enough to know how to make the browsers render the lists identically, but I didn’t know whether one way was better. I think either will do equally well for general purposes, but for some purposes, it is better to use Mozilla’s method. For example, when placing lists on the right side of a float, there are issues with margins. CSS defines special rules for margins on and around floating elements. In general, I think it is best to style every UL with padding-left, and remove the margins. This expands the left edge of the content box so there are no margins to behave strangely around floats.

So far so good, but I have also noticed strange behavior with text-indent applied to LI elements. I was trying to style certain LI elements as “new,” with an icon to the left of the text. My first idea was to add a background image and indent the text so it didn’t overlay the image. I saw strange behavior again, though. That led me to experiment further with list items, namely with list-style-position and text-indent.

To figure out how the text-indent was implemented, I first set list-style-position to outside. I saw no change in the rendering at all, which makes sense because outside is the default. I then set it to inside, and the results looked very much like the CSS spec’s example:

list-style-position: inside

The CSS spec says when list-style-position is inside, the marker should become the first inline box in the LI. Given that, I expect the marker to be indented with the text when it is inside the LI, and to remain independent when it is outside the LI. I experimented with this, adding text-indent with list-style-position outside:

text-indent: 40px

Mozilla did as expected, indenting the content but not the marker. IE indented the marker too though, indicating the marker is not rendered independently from the content. Next I added text-indent with the marker inside:

list-style-position: inside; text-indent: 40px

This time both browsers rendered the text the same, as per the spec. With the marker outside, though, it seems IE doesn’t follow the spec. To be fair, the spec is deliberately vague on markers to be backwards-compatible with the ambiguity in CSS1 on markers.

There seems to be no way to indent the text in an LI without also moving the marker, at least in some browsers. I recommend not relying on list-style-position because different browsers treat it differently and the spec doesn’t indicate what is absolutely correct. As a side note, Opera treats markers exactly as IE does in this regard.

PS: Guess what? It turns out I’m not the first to notice this.

Review of the iRiver HD340

The iRiver HD340 is a 40GB hard-drive-based multi-codec music player, radio, text reader, and image viewer. I have had mine for about 6 months now. I have found some strengths and weaknesses that do not seem to be common knowledge. As usual, I will try to avoid giving information available elsewhere on the Internet.

I use my unit solely to listen to my CD collection and have not used its other functions much. My motivation for buying it was to be able to take my music with me conveniently, without risking loss or damage. In this respect I’m just like millions of others. But I don’t use non-Free Software compression formats (and I don’t want to have to encode at a high bitrate to get good sound quality, another reason not to use mp3), so just any old player won’t do. I need something that will play Ogg Vorbis files. That’s why I chose this unit over the others available on the market.

I have a lot of criticism below, but don’t be fooled: I really like this little gizmo and would definitely buy it again.

What’s good

The unit is compatible with Ogg Vorbis and other Free Software formats. The firmware is upgradable and iRiver has a good track record with updates.

The unit itself is little more than a hard drive, a screen, and some software. It has a USB2.0 interface and when I plug it in, it shows up as an external hard drive like any other. The filesystem is FAT32. This means interfacing with my Gentoo GNU/Linux system is trivial.

The player is well-built; it is very solid. It looks and feels like quality.

Filesystem layout is very simple: the unit simply browses the directory structure in a familiar tree view. There is a single four-way up/down/left/right button for navigation, which is very easy to use.

Sound quality is excellent. People often shrug off sound quality reviews and assume that every player will be great, but I can tell you from experience, some players can take a great file and make it sound like crap. Playback quality is very dependent on the software and hardware used. iRiver’s sounds great. I listen through a set of Sennheiser HD590s.

Battery life is very good. I’m not sure exactly how long it lasts; I listen at low levels and it will play for a couple of work days, so maybe 16 to 20 hours.

Room for improvement

The interface is fairly geeky. Buttons do unintuitive things in different modes. After a little practice it’s not hard to use, but it is not straightforward. A person who isn’t used to it can easily choose the wrong song or album by accident.

The instructions are unclear, in part because everything has a different function in different modes. For example, I tried and tried to get .m3u playlists to work, and concluded that the unit didn’t work right, but finally found something on the iRiver support website that explained how to do it. You have to press (and hold?) a certain button in a certain mode. I’ve forgotten how to do it now, and I usually don’t forget these types of things.

Navigating through the list of artists is slow. I have a lot of music and it would be nice to have a page-up/page-down function instead of clicking through them one at a time.

There is about a half-second gap between tracks. Gapless playback would be really nice, especially for my many albums that have no space between tracks.

The display doesn’t really show all the information in a good way. Most song titles are too long to fit on the display. Instead of wrapping, they are scrolled. I have to wait for the text to scroll into view. In general, there is a lot of display real estate that’s poorly used.

The unit takes about 20 seconds to start. I understand that if I used Windows to transfer files to it, an internal database would be maintained and boot-up would be faster. I don’t think this is the reason for slow boot-up though. The system just takes a while to boot, period.

There is both a headphone and line-level output. The line output seems to be the same as the headphone though, and its level is controlled by the volume setting. A line output ought to be a certain voltage and impedance range. This isn’t a true line output.

When the battery goes low, playback is interrupted by a beep every few minutes as a warning. This is annoying and can’t be turned off. Worse, the battery really isn’t that close to dying. I’ve found there are typically two to three hours left in the battery when it starts beeping. To make the battery last as long as possible, it really needs to be drained every time before recharging, so there’s no way around it except to put it aside and not listen to it while it plays itself out of battery.

The unit is bigger than an iPod. It’s small, just not that small. It’s about the length and thickness of a deck of cards, and slightly narrower. When it is in its leather case it’s bulkier. The case has a small nub for attaching to a belt or other means of carrying, and the nub is not removable, which is a poor design choice.

Changing the settings is confusing. Many settings are unclear. Once the settings are saved it’s hard to get back to playback mode.

It is possible to charge the unit by plugging it into a USB cable. Sometimes it started to charge when I really wanted it to connect for data transfer or vice versa. There is a setting to change this, but it’s hard to find, and unclear what really does what I want. It took some experimentation, but I was able to just disable USB charging, and I’m happy with that.

Conclusion

Now that I have my settings the way I want, I really enjoy using this unit. I just turn it on and let it play, and that’s really all I want out of it. I’m very satisfied. I do not expect it to be a very popular toy for the masses, though.

How to understand SQL joins

I have noticed many people do not understand SQL joins, even after somewhat successfully using them for a time. Joins are key to understanding SQL. This article explains what joins really are and what they really do.

Many programmers learn SQL by writing it. I learned it by studying relational algebra under the tutelage of a theoretically-minded specialist in real-time databases. I never spoke of tables and columns; I thought in sigmas and other funny letters, and I spoke of tuples and relations. When I got a real job, I had a lot to learn about SQL in the real world, though my theoretical background helped me in some ways. I think a thorough grounding in theory is good, so I will approach this article (somewhat) from that angle.

SQL is a functional language. Try to think of a SELECT statement as a function. That is, a mathematical function, or mapping, which — this is important — maps an input to an output. When you select data from a table, think of the table as a source. Data streams out of the table. If it helps you, think of a little grinding cog icon. Then it streams out of the cog onto your screen as a familiar tabular result set. The cog is the SELECT statement, the function. It transforms the data. Maybe it just passes it straight through, but it really is a mapping of input to output. (By the way, if you take this approach when programming in XSLT or LISP, you will grasp things much more easily.)

A join is a SELECT statement with multiple data sources. The data streams from those sources into your cog icon, and a single stream flows out again. A SELECT statement always has one and only one output. (Why? Think of a function… think back to your math classes). Joins are functions that perform matching between data streams. The matching is necessary to merge the multiple input streams into a single output.

Let’s look at two tables of data, apples and oranges.

Here is an example SELECT statement:

select apples.Variety, oranges.Price
from apples
    inner join oranges on apples.Price = oranges.Price

Here is (conceptually) what happens when we join these tables:

  1. Choose a left-hand table (the first table named in the FROM clause).
  2. For each row in the right-hand table, take the entire left-hand table and stack its rows next to the row in the right-hand table.
  3. Fill in the missing rows in the right-hand table by duplicating them into the empty spaces.
  4. The result is a large table containing the cross-product or Cartesian product of the two data sets. Now satisfy the matching criteria by applying them as a predicate to each row in this new data set. If the predicate is true for the row, include it, otherwise exclude it. The result contains a single row:
  5. Now choose only the desired columns from the result:

This may not be what a given query optimizer really does to execute a join, but the result is the same regardless of the algorithm. If a query optimizer does something different, it is for efficiency, not correctness. Conceptually, every join is a cross product followed by choosing the desired data from the result.
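The conceptual steps can be sketched in a few lines of Python. The rows below are hypothetical stand-ins I made up for illustration, not the article's own data, and inner_join is just a name for the sketch. It builds the full cross product, applies the ON condition as a predicate, and then picks the selected columns.

```python
from itertools import product

# Hypothetical rows, invented for this sketch.
apples = [
    {"Variety": "Fuji", "Price": 5},
    {"Variety": "Gala", "Price": 6},
]
oranges = [
    {"Variety": "Valencia", "Price": 4},
    {"Variety": "Navel", "Price": 5},
]


def inner_join(left, right, predicate):
    """Cross product of left and right, filtered by the join predicate."""
    # Every left row paired with every right row: the Cartesian product.
    cross = product(left, right)
    # Keep only the pairs for which the matching criteria hold.
    return [(l, r) for l, r in cross if predicate(l, r)]


# ... from apples inner join oranges on apples.Price = oranges.Price
matched = inner_join(apples, oranges, lambda a, o: a["Price"] == o["Price"])

# select apples.Variety, oranges.Price
result = [(a["Variety"], o["Price"]) for a, o in matched]

if __name__ == "__main__":
    print(result)
```

A real database would almost never materialize the whole cross product, but whatever it does instead has to give the same rows as this naive version.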

How to prelink mozilla-firefox-bin

Gentoo GNU/Linux users can enjoy additional performance enhancements by prelinking binaries. The documentation is unclear on whether binary packages can be prelinked. I tried it and it seems to work fine.

It is clear that any software I’ve compiled on my own machine can be safely prelinked, but since I have a very slow, old laptop, I also use some software precompiled as binary packages. Mozilla Firefox is one (emerge mozilla-firefox-bin). This is installed under /opt, which is masked out of the prelink path, so it doesn’t get prelinked automatically. I prelinked it manually by running prelink -Rm /opt/firefox and everything seems to be fine.

This should be safe, because prelinking is reversible: prelink can undo its changes with the -u (--undo) option.
