Archive for August, 2006

Why you should take microformats seriously

Microformats are getting some attention from the group of people who are clueful about the semantic web, but some people are not yet convinced they’re useful. While microformats aren’t as robust and complex as some more fully-featured technologies, some apparent limitations are actually possible to overcome, but with different methods than might be expected. In this article I’ll address some common concerns about microformats, and explain how they are either solvable or already solved.

The problem in a nutshell

The problem many see with microformats is they apparently don’t identify themselves as such; they rely on conventions, which are not yet agreed-upon. To recognize a microformat, you might think that you have to look for some markup such as <div class="bio vcard">. That’s not as formal as some standards, like XML, which require a DTD, a schema, or something equivalent (for the rest of this article, I’ll just call it a doctype).

A doctype is important for several reasons. Lack of a doctype means microformats can’t:

  • be discovered automatically
  • be validated automatically
  • be versioned

These could be show-stopping reasons not to use microformats for some applications. Fortunately, as I’ll explain, microformats can have the equivalent of a doctype — the mechanism is just a bit different from XML.

But wait, there are specs, right?

Yes, there are microformat specifications. The people at microformats.org have done lots of hard work, but these specs are not doctypes. They lack

  • a formal grammar
  • a formal means of identifying and locating the grammar given an instance document
  • a formal means of identifying the very presence of a microformat in the instance document

I’m a purist. Maybe it’s silly, but I’ve seen a lot of data, parsed a lot of data, done a lot of work with interoperability and interchange, and I’ve seen and used a lot of both good and bad data and programs to work with data. People like me are the reason many of the WC3 specs are so complicated. We want data that machines can know how to handle, automatically, without preconceived notions. The specifications at microformats.org explain things to humans, but that doesn’t automatically make them usable as doctypes.

What does “automatic” mean, and why do we need it?

It’s relatively easy to write, for example, a parser for hCards, because they’re pretty simple. It’s pretty easy to take such a parser and give it an XHTML document, and have it figure out whether there are any hCards in the document. So far, so good — one goal of a doctype is achieved, without even needing a doctype.

However, this isn’t everything a doctype gives you. In the example I just gave, the hCard parser has some things built into it. It knows what an hCard looks like, and it has the smarts to find one. In fact, it is built with a pre-conceived notion: “I am an hCard parser, and I look for hCards.” Notice, “hCard parser,” not “microformat parser.” That’s important. This is a “type one” parser (that’s my own naming convention, which I’m inventing just for this article).

Imagine for a moment that I upgrade the hCard parser to a microformat parser. I can give it an instance document, and it can detect the presence of any microformat in the document, parse it, and deliver it to me — suppose it’s a C library and it delivers it to me as a struct of the appropriate type. Still not too hard to do, right? There are only a few different types of microformats as I write this — fewer than twenty. The C library can certainly check for the presence of each. No big deal. Now I have a type two parser.

Now imagine that I have a microformat parser, and it knows nothing about any specific type of microformat, yet it can accept an instance document, find all microformats in it, and deliver them all to me in an appropriate struct. How can it do this? With a formal grammar, of course. That’s what a doctype is for. My new microformat parser is not so much a microformat parser as it is a doctype interpreter. It reads a doctype, generates some way to recognize microformats, and then does it. This is a type three parser.

This third parser is infinitely more powerful than the second and first. From type one to two is really not much of a difference; it’s just incremental enhancement. From two to three, though, is a quantum leap. Type three is a totally different animal. It’s like the difference between writing functions in a procedural language, and writing lambda functions in a functional language (if you’re curious about this, you may want to read about my Javascript date functions).

You cannot do this without a doctype.

Isn’t that too complicated?

That’s a good question. I just said I’m a purist, and many W3C specs are written by purists too, but I’ll be the first to say SOAP is way over-engineered (and don’t even mention all the WS-* specs). On the other hand, the microformats folks are actually creating highly complex specs, so maybe they’re not really trying to avoid complex specs like SOAP and WS-*. In that case, the lack of a doctype could be seen as a serious imbalance — all that complexity, and no doctype to show for it?

In my opinion, the microformat work is going in the right direction — it just isn’t at the stage where uber-formalism has set in, partly because microformats are not ubiquitously used yet. To create microformats, then create uses for them and demonstrate their effectiveness and usefulness, is a very good idea, even if it can’t all be as automated as I said above. I do not think microformat work should start with a committee, and produce a standard before anything else happens. That is an approach some have taken in the past, and it’s flawed. Tim Bray says it well:

I’m deeply suspicious of “standards” built by committees in advance of industry experience…

And Bruce Eckel:

A standards body should formalize existing practice, rather than inventing new practice without experience.

In any case, you could easily argue that doctypes for microformats are not the right thing to do, at least not yet. And I’d agree with you, but I’d also point out that many things doctypes enable are already possible with microformats — you just have to do them a bit differently than you’d think, depending on your background.

Who wants doctypes?

Lots of people, like Steve Farrell, have raised these issues in conversations, on the web, and in mailing lists. In fact, a lot of the discussion has started since I started drafting this article about seven months ago. Unfortunately, there’s some PR work to be done. Some discussions I see are sometimes phrased as a bolt-on solution, or a means of appeasing the purists, which defeats the point. And the existing solutions can be hard for experts in data interoperability to understand — probably because of their expertise — so they sometimes don’t take them seriously enough, thinking the solutions are bolt-on, appease-the-purist jokes.

I hope people don’t continue to be pessimistic about the future of microformats. Many standards have failed because of some show-stopping problem or other, sure, but the world hasn’t ended. One excellent example is the IPv6 mess. Even if there are some shortcomings to microformats, whatever they are will be addressed somehow.

To a man with a hammer…

Microformats are built on XHTML, and therefore XML, so there’s a huge pre-existing toolset for addressing the requirement for a doctype. If you come from the XML world, as I do, your first thought might be “we need to use XML tools to solve this XML problem.” And the first places you’d probably look are XML Schema and good old-fashioned DTDs. These have been proven over and over again, especially DTDs. Let me explain these a bit, and show you why they’re not the solution for microformats.

XML itself, and DTDs, upon which XML is built, also provide methods for extending any document type. One is simply to redefine the DTD through extensions or referring to external DTDs. I’ll give an example of this done brilliantly: TEI, the Text Encoding Initiative. Most web programmers, even those who are hip to the latest and greatest, have never heard of TEI, but behind the scenes, it’s widely used in text encoding, and has been hugely successful; vast quantities of information have been marked up with MEI. For example, historians, librarians, and other academic types use it lots. And I do mean vast quantities, even on the scale of things you consider vast, whoever you are.

DTDs and XML Schema serve similar purposes, so I won’t go into Schema, especially since XHTML is defined by a DTD, not an XML Schema instance. Both tools really do what I’m talking about — doctypes. But we’re talking about XHTML here. Yes, by its nature the X part of it is meant to be extensible, but in practice, web browsers do not read the XHTML doctype to tell them how to parse and render web pages. Web browsers are specialized parsers that know XHTML only. You can’t just feed a browser some arbitrary XML, give it a doctype, and expect it to somehow render this. You also can’t extend XHTML and expect browsers to continue parsing and rendering it correctly, because they are not “type three” parsers in my informal taxonomy. You need to work within the constraints of the technology’s end users, and the place you have some degree of freedom is in the meta-data allowed in XHTML documents. I refer to the class attribute, of course — and remember, one of its official purposes is “general purpose processing by user agents.” This is exactly where microformats are targeted.

Another tool is XML Namespaces, which are partially a formal means of doing what many web programmers try to do informally (using “semantic class names”). I suspect most web programmers don’t know what XML Namespaces are, and probably many who do consider them black magic. There’s definitely a steep learning curve, but they do make sense and are absolutely necessary, so I hope more web programmers will take the time to understand them. (By the way, XML Namespaces and namespaces for a variable in a programming language, for example JavaScript, are not the same thing; there’s been lots of thought lately on scoping and namespaces in JavaScript libraries, but I don’t want you to read this article and think you already know what XML Namespaces are just because you’re a Prototype expert. It’s totally different.)

Back to namespaces — you might be inclined to define an XML Namespace for microformats, define them with that namespace, and think that somehow this provides what you need. Unfortunately, this will break things even worse than extending the XHTML DTD, because a <div> with a custom namespace is, to an XML parser, totally unrelated to an XHTML <div>. The fact that they happen to have the same element name is a coincidence, and an unhappy coincidence at that — hence the need for a namespace to disambiguate and prevent name clashes. That is the problem XML Namespaces are designed to solve, not “semantic extensions” or some such. XML Namespaces gain you nothing here.

If you’re an XML expert, I hope by this point I’ve convinced you that you need to put down your XML tools. Just because you have a hammer doesn’t mean everything is a nail. By the way, I originally started drafting this article because I thought XML tools would be the solution, but I’ve changed my mind. It has been difficult for me to see and understand new ways of defining doctypes for microformats. Now you know why I’ve been drafting this for so long!

So, what’s the solution?

I think, but I’m not sure, the solution is XHTML Meta Data Profiles (XMDP).

It might look like a “hey, would this work?” idea got an acronym, and maybe that’s true. I’m not really clear on where the GMPG work is going to end up, but at this point it looks more promising to me than anything else (digression: Their homepage sounds like an artist’s statement. “…GMPG efforts provoke optimism and empowerment, nevertheless reawakening criticism of complexity…” Go read the whole thing. I still don’t know whether it’s a joke. If it’s a joke, it’s wicked funny. If they’re serious, it’s wicked funny).

Anyway, I want to mention some of the background for this idea of meta-data profiles. This is not something made up out of whole cloth. HTML has many un-explored possibilities. A famous one is the class attribute, which didn’t get recognition as a meta-data container for a long time, and was just relegated to the role of CSS beast of burden. Similarly, HTML 4 defines the profile attribute, which is — like the mechanisms built into DTD and XML Schema — a means of extending (X)HTML, without changing or extending the doctype. This is exactly what’s needed for extensions like microformats, which embed documents within documents. The profile extension mechanism is a generic hook. When an HTML document defines a meta-data profile, it is really saying “hey, there’s more you can find out about me. Go to such-and-such URI to learn more.” And at the referenced URI can be a meta-data definition, which is completely outside the scope of the HTML specification. This is important, because it means the HTML document’s meaning can be extended arbitrarily, not just in the ways the HTML spec’s authors might have foreseen.

This is the furthest thing from a bolt-on solution. If and when the microformat work progresses to the point it’s needed, that’s exactly how doctypes for microformats could be defined.

How does this solve the problems?

Recall my three problems. I said doctypes are needed so microformats can:

  • be discovered automatically
  • be validated automatically
  • be versioned

If you haven’t already, please go read the parts of the HTML spec I linked in the previous section. Then you will understand a great deal more.

Meta-data profiles already solve the problem of automatic discoverability, so the first point is taken care of. If someone creates doctypes for microformats, the second two can be addressed then. That’s not to say they will be, or will be addressed adequately — for instance, someone could create doctypes without a versioning mechanism, which would be an unfortunate oversight. But it can be done.

Conclusion

There’s no need to be pessimistic about microformats. Sure, not everything is formalized yet, but that’s the way it should be! People are proving that microformats are useful before writing formal specs, which is a great thing. And when the formalities are needed, nothing stands in the way. It’s just a question of when and how it happens, and who does it, and whether they do it well or not.

Technorati Tags:No Tags

You might also like:

  1. A PHP implementation of the XML DOM
  2. How to use meta-data to sort itself
  3. How to use extended properties as documentation with sp_showdoc

How to find duplicate and redundant indexes in MySQL

Download version 0.1.9

Peter Zaitsev over at the excellent MySQL Performance Blog recently wrote an article on duplicated and redundant indexes — any indexes which cover exactly the same columns as another index, or cover a leftmost prefix of another index. While there are subtleties, such as FULLTEXT indexes not being the same as non-FULLTEXT, for the most part this is sufficient criteria to raise possible duplicates to a DBA’s attention. I opened my big mouth in the comments and said I could write a quick Perl script to discover possible offenders in just a few lines of code. Once I did that, I had to do it and give you the script. Here it is.

The reason this is really easy to do in Perl is that the output of SHOW CREATE TABLE lists each index with its columns in order, in an easy-to-parse way, and therefore all one needs to do is compare the string that defines each index with each other index to find duplication and redundancy. Note: you just need to compare the string definition! You don’t need to actually parse out the columns and do any advanced computer science on them. And a quick regular expression to anchor each index definition to the beginning of the one to which you’re comparing it will satisfy the “leftmost prefix” requirement.

Why use SHOW CREATE TABLE’s output? Why not query SHOW INDEXES FROM ____ and use that instead? Well, first of all it’s way faster, as I also said in the comments on Peter’s blog. When I do something like this I like it to be zippy. SHOW INDEXES can take a long time, as it has to calculate stats on the indexes. Plus, even if I did use SHOW INDEXES, or query the INFORMATION_SCHEMA tables (also slow) I’d then have a result set of individual columns, which frankly I’d just concatenate together and do a string comparison on.

OK, on to my “advanced, patented algorithm.” Here’s a sample SHOW CREATE statement (I’m using a table from my recent article on role-based access control for an example):

mysql > show create table t_privilege\G
*************************** 1. row ***************************
       Table: t_privilege
Create Table: CREATE TABLE `t_privilege` (
  `c_role` varchar(30) NOT NULL default 'other',
  `c_who` int(11) NOT NULL default '0',
  `c_action` varchar(100) NOT NULL,
  `c_type` varchar(30) NOT NULL default '',
  `c_related_table` varchar(100) NOT NULL default '',
  `c_related_uid` int(11) NOT NULL default '0',
  PRIMARY KEY  (`c_role`,`c_who`,`c_action`,`c_type`,`c_related_table`,`c_related_uid`),
  KEY `c_role` (`c_role`,`c_who`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

You’ll notice I added a key on (c_role, c_who) which is a leftmost prefix of the primary key. In general, indexes always appear in this output as KEY (column names), with a possible PRIMARY or UNIQUE in front (update: it should not have FOREIGN in front, because that’s not an index). That’s pretty easy to parse with a regular expression, and grab just the columns. A global match captures every index into an array. Then it’s just a matter of looping through the array and comparing. Here is the code:

foreach my $table ( @tables ) {
   my $ddl = $dbh->selectall_arrayref("show create table $table")
      ->[0]->[1];

   my @indexes = $ddl =~ m/(?<!FOREIGN) KEY .*?\((.*?)\)[^\)]*$/mg;

   my $has_dupes = 0;
   foreach my $i ( 0..$#indexes ) {
      my $index = $indexes[$i];
      foreach my $j ( 0..$#indexes ) {
         next if $i == $j;
         my $other_index = $indexes[$j];
         if ( $index =~ m/^$other_index/ ) {
            print "Table $table has possible duplicate indexes:\n",
               "\t$index\n\t$other_index\n";
            $has_dupes = 1;
         }
      }
   }

   if ( $has_dupes ) {
      print "Here is the CREATE TABLE statement:\n$ddl\n\n";
   }
}

I literally wrote that in five minutes, so it may not catch everything, but it caught the duplicate I defined above:

Table t_privilege has possible duplicate indexes:
        `c_role`,`c_who`,`c_action`,`c_type`,`c_related_table`,`c_related_uid`
        `c_role`,`c_who`
Here is the CREATE TABLE statement:
CREATE TABLE `t_privilege` (
  `c_role` varchar(30) NOT NULL default 'other',
  `c_who` int(11) NOT NULL default '0',
  `c_action` varchar(100) NOT NULL,
  `c_type` varchar(30) NOT NULL default '',
  `c_related_table` varchar(100) NOT NULL default '',
  `c_related_uid` int(11) NOT NULL default '0',
  PRIMARY KEY  (`c_role`,`c_who`,`c_action`,`c_type`,`c_related_table`,`c_related_uid`),
  KEY `c_role` (`c_role`,`c_who`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

As I said in my comments on Peter’s blog, I don’t really need to have something generate statements that can correct the problem for me, or anything like that. It’s nice, but it’s not essential. First of all, I’d never just trust a tool to go “fix” my tables for me. I’d want it to tell me where it found potential problems. Then I’d go inspect and alter the table by myself if I want to.

With my program, I don’t really have to “go inspect the table,” since it’s kind enough to print out the SHOW CREATE statement for me :-) Its output has everything I need to make a good decision about the table, unless it’s someone else’s table which I don’t understand well.

I wrapped the above Perl code into a script you can run from the command-line with familiar command-line arguments (plus it reads from your .my.cnf file to get defaults). You can download it and have fun with it. Execute perldoc duplicate-index-checker for all the gory details, or just use the --help command-line argument. Let me know if you want me to tweak it — I’m happy to. If you find a scenario it doesn’t work for, please put the SHOW CREATE statement in your comment.

One thing I also want to make it do, but it’s past my bedtime so I won’t do it tonight, is report duplicate foreign keys. I sometimes find this (actually I found a lot of them at my current employer). Maybe later this week.

For those who want more features, or don’t like Perl, check out a nice (and far more mature) Java implementation of a similar tool: MySQL Index Analyzer.

Technorati Tags:No Tags

You might also like:

  1. Role-based access control in SQL, part 2
  2. Introducing MySQL Duplicate Key Checker
  3. Duplicate index checker version 1.9 released
  4. How to understand key length limitations in MySQL
  5. How to simulate optional parameters in SQL

If you only learn one thing about database transactions, it should be this

I’ve been writing a lot of articles about locks, deadlocks, and transactions recently, and it occurs to me I’ve neglected to mention the single most important thing to know. If you only learn one thing about transactions in database systems, you should learn this, and learn it thoroughly — burn it into your brain permanently, if possible.

The single most important thing you can do is keep your transactions as small as possible.

That simple practice will improve performance, increase concurrency, reduce deadlocks, and generally create world peace. Plus it’ll make you think hard about your queries, which will probably make them higher-quality and better to maintain.

Now, how can you do it? Ah, that’s the trick. I offer you six simple suggestions.

1. Have the right attitude

The way to think about transactions is as an urgent mission. The instant you say START TRANSACTION, the race is on.

It’s like when you’re waiting in the car for the pouring rain to stop, before you make a mad dash for the house. You prepare yourself, unlock the door, unbuckle the seat belt… take a deep breath… START TRANSACTION go go go go COMMIT!

2. Do as few statements as possible

Do only the statements you need to inside the transaction. Don’t make updates to big_huge_table and inserts to giant_table, then poke around inside other tables looking at little things of no consequence before finishing your work. Remember, a transaction is a set of statements that must all succeed together as a unit, or must all fail together as a unit. Include in your transaction only the statements that belong to that unit of work.

If you can do it all in a single query, you don’t even need a transaction. By definition, a single query is a one-statement transaction.

3. Prepare as much ahead of time as possible

To help include as few statements as possible in your transaction, look at the order of the queries. Can some of them be moved earlier in the sequence, before the START TRANSACTION statement? If so, good. Do as much preparation as possible before you start the work. Especially think about whether you can check to make sure there’s even work to be done, or whether you’re likely to be able to finish the work — if you can find that out ahead of time, you might be able to avoid even doing anything.

4. Touch the smallest amount of data possible

If possible, avoid changing data that doesn’t need to be changed. For example, if you’re updating a summary table and you know what was just changed in the table from which the summary is calculated, you may not need to update the entire summary — maybe just part of it. Use indexes wisely to constrain your work to just part of a table instead of doing the entire table. Use every bit of information at your disposal to avoid working with more data than you need to.

5. Don’t wait around before committing

The goal is to lock as few resources as possible, for the shortest time possible. To that end, look at whether you can re-order the statements within your transaction. Is it possible to make the big changes later in the transaction? Can you delay getting locks on the really important table, which everyone else is accessing at the same time, until near the end? If so, you might cut down the number of locks and the duration they’re held. And definitely commit the transaction as soon as you’re done.

6. Don’t sacrifice consistency

Transactions have a purpose, and you should not be so afraid of holding locks that you commit when only half the work is done. Use transactions deliberately and carefully to group a set of work together into a logical unit. By all means examine whether the unit is bigger than it needs to be, but don’t shoot yourself in the foot by committing before the work is all done, out of fear of deadlocks.

Conclusion

Keeping transactions as small as possible is the most important thing to do, but it may not be obvious, especially if you’re less experienced with databases. A few easy practices, combined with simple awareness, can go a very long way. But remember — don’t defeat the purpose by cheating yourself out of the very benefits transactions give you.

Technorati Tags:No Tags

You might also like:

  1. How to deliberately cause a deadlock in MySQL
  2. How to give locking hints in MySQL
  3. An introduction to InnoDB error handling
  4. A little-known way to cause a database deadlock
  5. How to monitor InnoDB lock waits

MySQL’s ERROR 1025 explained

MySQL issues a cryptic error message, “Error on rename,” when you try to alter a table in such a way that it would break a foreign key constraint:

create table test1(a int not null primary key)engine=innodb;
create table test2(a int not null, foreign key(a) references test1 (a)) engine=innodb;
alter table test2 modify a smallint not null;       
ERROR 1025 (HY000): Error on rename of './test/#sql-2fa8_1' to './test/test2' (errno: 150)

This happens because ALTER TABLE really works by making a copy of the table, then renaming to move the old table out of the way and move the new table into its place. It is certainly one of the less meaningful error messages I’ve seen in MySQL.

There’s slightly more information in the output of SHOW ENGINE INNODB STATUS, if you are looking there (of course, if you’re looking there you’re probably already clued in to what’s going on). And innotop can parse that information for you:

innotop FK screenshot

In case you didn’t understand why the foreign key constraint was failing, the error message innotop parses out is much clearer. It’s because the foreign key columns in the parent and child table have to have the same data type. I was trying to change the child’s column to an incompatible type.

Technorati Tags:No Tags

You might also like:

  1. Why does InnoDB create two indexes per foreign key?
  2. Version 0.1.123 of innotop released
  3. How to deliberately cause a deadlock in MySQL
  4. Version 0.1.132 of innotop released
  5. What to do when MySQL says skip-innodb is defined

Version 0.1.146 of innotop released

I’ve released version 0.1.146 of the innotop MySQL and InnoDB monitor. You can download innotop from its homepage.

I re-arranged some information to be more compact and readable in this version, but there isn’t really much new functionality. This is mostly a bug-fix release to prevent crashes when innotop encounters unexpected information, or doesn’t find some information it expects to exist. It’s still very much beta software, so it may die unexpectedly. See this article about what information I need to debug and fix crashes.

Crashes should not cause any loss of information or other problems, by the way. It’s completely safe to run, because it doesn’t modify anything, it just reads status information. Up till now I’ve preferred for it to die so I notice any deficiencies, rather than failing silently.

Technorati Tags:No Tags

You might also like:

  1. What to do when innotop crashes
  2. Version 0.1.149 of innotop released
  3. Version 0.1.132 of innotop released
  4. Version 0.1.106 of innotop MySQL/InnoDB monitor released
  5. Version 0.1.123 of innotop released

Role-based access control in SQL, part 2

This is my second article on how to build a role-based access control system in SQL. In the first article I gave a high-level overview of access control systems in general, especially in the web-application context, and talked about how some ACLs are implemented. I introduced the problems I designed my system to solve, and gave a roadmap for where this series of articles will end. I finished that article with a sketch of some basics to provide row-level read, write, and delete access control.

This article picks up where I left off. I want to revisit some things I swept under the rug in the first article, because I didn’t want to throw all the complexity in at once. I’ll explain my current system’s full functionality, which includes roles, status, type checking, and table-level and set-level privileges. I will show you the design in detail, and give working examples and ready-to-run SQL queries. I’ll also explore ideas for extending or restricting functionality, because your application isn’t likely to be the same as mine. I’ll mention possible optimizations, because performance and scalability are important design goals. I’ll end with a brief explanation of how I’ve used this system to make my own web applications simpler and more secure.

If you haven’t already, you should read the first article before continuing with the rest of this article, because I’ll assume you have the context it provides.

Roles

Let’s finish the discussion about roles I pushed aside in the first article. When a user acts in a role, it “acts in the capacity of” that role. When a user is a member of a group, my system permits the user to act in that capacity — to assume that role. I don’t want to go into the details of roles and role-based access control, partly because it’s way outside my expertise, but I want to point out that acting as a member of a group is only one way to implement roles. There are others.

The trick to making roles work well is finding the right level of granularity. To accomplish this, I’ve defined some special roles in my system. In this article I’ll demonstrate how to implement the “self” role, but many others are possible. Anything you can deduce about the relationship between a user and an object is a candidate for a role, should you need it. For example, if your system records who creates each row, you could implement a “creator” role. If your schema tracks who supervises employees, you could have a “supervisor” role.

As I said, groups are subsets of roles, which is why they are often a convenient way to implement a role. To carry the example a bit farther, you can implement the “supervisor” role by looking at data to determine if the user is in a supervisory relationship to the object, or you can just add all supervisors to the “supervisor” group. The latter approach is a bit coarser, because it would allow a user to act as a supervisor on a user she doesn’t actually supervise, but you gain simplicity and speed. The “self” role is similar, but in this case you obviously want the fine-grained control of saying “a user is only allowed to act in the self role upon herself.” Adding everyone to the “self” group would accomplish nothing.

I’ve already used roles in the previous article, though I didn’t encourage you to think of them that way yet. The roles I demonstrated are “owner,” “owner_group,” and “other.” These are implicit in UNIX privileges. Later I’ll show you how privileges are assigned to roles explicitly.

Actions

Another topic I put aside, but used implicitly, was actions. Actions are important because they’re the verbs in the “can user X do Y to object Z” question. The three basic UNIX-style actions I introduced in the first article are “read,” “write,” and “delete.” As with roles, I’m going to make them explicit in this article.

I’ve already introduced the “event” data type. What actions can a user take on an event, besides the basic three? I can think of “join” and “withdraw.” Since actions are verbs, chances are your application already defines a lot of actions as class methods, and you may even maintain a list of actions as part of your design process.

Actions are the first place where type matters. I’ve mentioned that objects in my system are both typed and identified; that is, you need to know both the table and the ID of a row to apply privileges to it (you didn’t need to know the table in the last article). Certain actions will apply to all types, such as read/write/delete, but others will only apply to specific types – a natural concept in object-oriented programming, which I assume you’re doing if you’re using an ORM system.

I find it useful to define a generic set of semi-UNIX-ish actions, which can be applied uniformly to every type by common code. These let me build “property pages” automatically, which are important when I need to administer the properties of different types of objects. Some examples are “stat”, “chmod”, “chgrp”, “chown”, “chmeta”, “view_acl”, and “add_privilege.” Other actions apply only to specific types.

Actions are stored in the database, because they need to be involved in some of the queries I’ll show you later. This means they could theoretically be subject to ACL privileges like other objects, but in practice I find this is needless complexity (what good would it do to define a new action in the database, unless there is application code to implement it? Perhaps in an application with plug-in functionality this would be useful, but I’m not doing that.) For that reason I don’t give them extra columns like other tables, and I exclude them from my ORM system. Here’s the basic schema:

create table t_action (
    c_title           varchar(100) not null primary key,
    c_apply_object    tinyint      not null
);

The c_apply_object column specifies whether an action applies to objects or tables. Certain actions, like “create,” apply only to tables. I find the system is easier to manage if I choose my actions so they can only apply to one or the other, not both.

In my applications I typically add more columns to define user-interface labels and so on, but I omit them here. Actions will return soon, but I want to talk about statuses first.

Status

Your privilege system can represent reality better if it respects the object’s status, because some things can only be done when an object is in a certain status. Flashback to the imaginary application code:

if ( $user->can('join', $event) ) {
   if ( $event->status() == 'active' ) {
      // the user joins the event
   }
   else {
      print_error("Sorry, this event isn't active");
   }
}

It’s happening again — the code is taking too much responsibility. Shouldn’t the code just be asking if the event is joinable? In fact, isn’t it cleaner to make privileges contingent upon the event’s status? If you haven’t really explored this possibility in your own code, I encourage you to do so. My personal experience is it’s a much better way to do it. Think about the places in your code where you could omit checking something’s status before doing something to it. You might get rid of a lot of code.

While this represents reality better in one way, it mis-represents it another way. What if $user->can('join', $event) returns false? Is permission denied? Maybe not; maybe the event just isn’t active. This makes it a little harder to understand why a user can’t join an event, but in practice I find this almost never happens in my applications. The applications are built by asking what the user can do, so no user ever gets a link to join an event that’s not “joinable,” whatever that means. Mashing status and privileges together is a trade-off, but the upside (performance and managability) is so great, I think it’s overwhelmingly worth the slightly unfaithful representation. In a bit I’ll show you how to do this.

Now that I’ve explained my decision to include status in the privilege system, let’s look at status values themselves. They’re like groups and permissions: they are so static in the applications I use, they probably ought to be defined in the code. What statuses should you define? Well, for the “event” data type, maybe you need “active,” “inactive,” “cancelled,” and so forth. A membership, for example, might be “pending” when the user signs up, and “active” when the administrators receive payment and activate it. Statuses are powers of two, like groups:

$statuses = array(
   "deleted"     => 1,
   "inactive"    => 2,
   "active"      => 4,
   "cancelled"   => 16,
   "pending"     => 32
);

An object’s status is stored in its c_status column, which I need to add to my generic set of columns. Now my table template looks like this:

create table t_foo (
    c_uid             int not null auto_increment primary key,
    c_owner           int not null default 1,
    c_group           int not null default 1,
    c_unixperms       int not null default 500,
    c_status          int not null default 0,
    -- other columns ...
);

Where’s the type information?

I said actions are only valid for certain types of objects, but that’s not represented anywhere in the privilege system. I’ve said I’m going to mix privileges and status together for the many benefits it gives. Should I also mix types into the recipe?

There could be one good reason not to do this: Once And Only Once. The code might already define what actions are valid for what types, if your system is very object-oriented. On the other hand, your code might not be that strictly object-oriented, and you might want the application to be able to take actions that don’t correspond to an object method. I find this is the case in my code, because I don’t live in the Kingdom of Nouns. In this case you do want the privilege system to know which actions are valid for which types.

But there’s another reason, too: your types probably share some actions. Here’s one example from my application: memberships and events can both be activated, so the ‘activate’ action applies in more than one place. In fact, some actions are even shared more than twice in my system, and in some cases they don’t apply in the same statuses. In this case, I think the relational database is the best place to store this information.

The type-action-status information lives in t_implemented_action, another “system table” that doesn’t get all the auto-increment baggage, like t_action:

create table t_implemented_action (
    c_table           varchar(100)    not null,
    c_action          varchar(100)    not null,
    c_status          int             not null default 0, 
    primary key (c_table, c_action)
);

This is the first place I’ve shown you where a column contains a table name. Remember, in the ORM worldview, a table’s name is its data type (I hope your code has an easy way to translate between an object’s data type and the name of the table it lives in). A row in t_implemented_action, then, says “this type of object implements this action in this status.”

In this table, the c_status column is not a single-valued integer; since an object might support a given action (such as “delete”) in several statuses, the statuses are packed in bitwise. If an action is always valid for a given data type, I set the c_status to 0.

Rows only need to be in this table if the action applies to objects, because tables don’t have a type (they are a type!) or status in my worldview.

Privileges

I’ve now covered enough background to explain how my system represents privileges. There are three types of privileges:

  1. “object”: a regular object-level (row-level) privilege.
  2. “table”: a privilege granted upon a table itself, as opposed to its contents. For example, “create” cannot be applied to an object, because an object has to exist for a privilege to apply to it. “create” can be granted upon a table, which allows a user to create a row in that table.
  3. “global”: a privilege granted on all rows in a given table. For example, officers ought to be able to view details on every user — details which might be hidden from other users. A single global privilege in the ACL can grant this.

I store privileges two different ways. First, there are the UNIX-style privileges I’ve already explained. These are clearly object privileges, because they are defined directly in the row. I find these take care of nearly all my needs.

Next, if I need additional granularity, there’s the t_privilege table. I have denormalized the design for performance and space efficiency, and crammed all three privilege types into one table:

create table t_privilege (
    c_role            varchar(30)     not null,
    c_who             int             not null default 0,
    c_action          varchar(100)    not null,
    c_type            varchar(30)     not null,
    c_related_table   varchar(100)    not null,
    c_related_uid     int             not null default 0,
    primary key(c_role, c_who, c_action, c_type, c_related_table, c_related_uid)
);

From top to bottom, these columns mean the following:

  • c_role specifies whether the privilege is granted to a user, a group, or in the case of an “object” privilege, the object’s owner or owner_group. A further special case, in my system, is “self.”
  • c_who is needed if c_role is user or group, and holds the user or group ID to which the privilege is granted.
  • c_action is the action the privilege grants. This is always required.
  • c_type specifies whether the privilege is “object”, “table”, or “global.”
  • c_related_table holds the name of the table to which the privilege applies. This is always required, though in the case of a “self” privilege it’s redundant because a “self” privilege always applies to the t_user table.
  • c_related_uid stores the ID of the object to which the privilege applies, if it’s an object privilege. This has no meaning for table and global privileges, of course. The one applies to a table, not an object, and the second applies to all rows in a table, so an ID is immaterial. This is also not used for self privileges, because by definition a self privilege has to apply to the user requesting permission to do something.

It’s a fairly complex table because of all the different types of things it holds. I generally hate columns that mean one thing sometimes and something else other times. When I first designed this system, I put different types and roles in different tables, which ended up being almost but not quite identical. I’ll let you imagine how horrible that was to actually work with. This is much better, and having a single table is more efficient for querying.

Example privileges

Examples might help understand the table structure. Suppose I have the following entries in t_privilege:

  • The first row grants group 4 (users) the privilege to join every event (all rows in the t_event table).
  • The second row grants all users the right to list the contents of the t_event table. This is the first example of a table-level privilege I’ve given. This is equivalent to setting the executable bit on a directory in UNIX.
  • The third row grants everyone the privilege to change our own password.
  • The fourth row grants user 3 the right to delete event 1.

Sample schema

Here’s a complete script (again, for MySQL) to create and populate a sample schema with some sample data. This is what I’ll run queries against later:

drop table if exists t_user;
create table t_user (
   c_uid             int             not null auto_increment primary key,
   c_owner           int             not null default 1,
   c_group           int             not null default 1,
   c_unixperms       int             not null default 500,
   c_status          int             not null default 0,
   c_username        varchar(50)     not null,
   c_group_memberships int           not null
);

insert into t_user (c_username, c_group_memberships)
   values('root', 1), ('xaprb', 4), ('sakila', 5);

drop table if exists t_event;
create table t_event (
   c_uid             int             not null auto_increment primary key,
   c_owner           int             not null default 1,
   c_group           int             not null default 1,
   c_unixperms       int             not null default 500,
   c_status          int             not null default 2,
   c_description     varchar(50)     not null
);

insert into t_event(c_owner, c_group, c_status, c_description) values
   (1, 1, 2, 'MySQL Camp'), (1, 4, 4, 'Microsoft Keynote');

drop table if exists t_action;
create table t_action (
   c_title           varchar(100) not null primary key,
   c_apply_object    tinyint      not null
);

insert into t_action(c_title, c_apply_object) values
   ('read',     1),
   ('write',    1),
   ('delete',   1),
   ('join',     1),
   ('activate', 1),
   ('passwd',   1),
   ('list_all', 0);

drop table if exists t_implemented_action;
create table t_implemented_action (
   c_table           varchar(100)    not null,
   c_action          varchar(100)    not null,
   c_status          int             not null default 0, 
   primary key (c_table, c_action)   
);

insert into t_implemented_action(c_table, c_action, c_status) values
   ('t_user',       'read',     0),
   ('t_user',       'write',    0),
   ('t_user',       'delete',   0),
   ('t_user',       'passwd',   0),
   ('t_event',      'read',     0),
   ('t_event',      'write',    0),
   ('t_event',      'delete',   0),
   ('t_event',      'join',     4),
   ('t_event',      'activate', 2),
   ('t_membership', 'read',     0),
   ('t_membership', 'write',    0),
   ('t_membership', 'delete',   0),
   ('t_membership', 'activate', 2);

drop table if exists t_privilege;
create table t_privilege (
   c_role            varchar(30)     not null default 'other',
   c_who             int             not null default 0,
   c_action          varchar(100)    not null,
   c_type            varchar(30)     not null default '',
   c_related_table   varchar(100)    not null default '',
   c_related_uid     int             not null default 0,
   primary key(c_role, c_who, c_action, c_type, c_related_table, c_related_uid)
);

insert into t_privilege
   (c_role, c_who, c_action, c_type, c_related_table, c_related_uid)
   values
   ('self',  0, 'passwd',   'object', 't_user',  0),
   ('group', 4, 'join',     'global', 't_event', 0),
   ('group', 4, 'list_all', 'table',  't_event', 0),
   ('user',  3, 'delete',   'object', 't_event', 1);

One note on this schema — I have not included the indexes good performance will require. I’ve only included primary keys to ensure data validity. My real application has more indexes on t_implemented_action and t_privilege.

How to determine whether a user can take an action

It is now a much more complex process to determine whether a user can take a given action on an object or table. You need to start with some of the same information as in the last article:

  • The user’s ID and group memberships.
  • The type and identity of the thing in question. In case it’s an object, you need to know the table it lives in, and its c_uid. In case it’s a table, you just need to know the table name.
  • The desired action.

How about seeing if ‘xaprb’ can join the ‘MySQL Camp’ event:

  1. The UNIX-style permissions don’t specify anything about the “join” action, so I’ll skip them.
  2. The “join” action is valid for objects.
  3. The object’s type is “t_event,” and the “join” action is defined for that type, but not in status 2 — only in status 4.

xaprb cannot join the event. Can he join the ‘Microsoft Keynote’ event?

  1. The first two steps are the same, and this time the event’s status matches, so we can go to the next step.
  2. We need to look in the t_privilege table for a row that matches any of the following:
    • c_role is ‘user’, c_who is 2, c_action is ‘join’, c_type is ‘object’, c_related_table is ‘t_event’, and c_related_uid is 2,
    • or, c_role is ‘user’, c_who is 2, c_action is ‘join’, c_type is ‘global’, and c_related_table is ‘t_event’,
    • … this goes on for a long time.

I’m not going to type all the possible combinations of columns and values. This is exactly why my old privilege system was buggy and bloated. Fortunately, it’s really not bad to express this in SQL.

How to determine privileges with an SQL query

I’ve mentioned several times that the trick to doing this easily is to ask the question the right way. Instead of trying to enumerate every possible way a privilege could be granted, as I did above, it’s much easier to simply ask “tell me every privilege this user has on this object.” Maybe it doesn’t sound simpler, but it ends up being true. The query is a bit lengthy, but nothing in it is complicated. Here are links to view and download queries:

The queries are built dynamically with a few substitution variables. You can test these queries by saving them to a .php file and piping their output directly into mysql, like so:

php all-object-privileges.php | mysql

How efficient is it, really?

Pretty darn efficient. The all-object-privileges query really only needs to join a single row (the object) through the t_implemented_action table to the t_privileges table, so the query is only as big as those two tables. My sample queries include the t_action table, because I tend to use it to define GUI labels for my application, but otherwise t_action really isn’t used in these queries.

Remember, one of my design goals is to keep the privileges table small. That’s why I built in the complex, but efficient, self roles and global privilege types. You can really say a lot with a single global privilege.

If you need to optimize this system further, there are lots of opportunities to omit information, shrink column sizes if you don’t need 32 bits (for example, you may only have a few groups), and so forth. You can also pick and choose which features you want; drop the UNIX-style permissions if you need to, for example.

If anyone wants to benchmark this system, I’d be interested in the results. I have never done it. All I know is it works better than it used to, and when I run the queries, they are fast. The largest application I use this on has 122 rows in t_privilege and 672 in t_implemented_action. This is really what determines the size of the query; you should be able to have ten or ten million rows in the tables that store your objects, and that should make no difference.

If you do want to benchmark the system, please don’t do it without putting indexes on tables. As I’ve said, I only included primary keys in this article.

What’s ugly about this system?

I’ve just shown you essentially the same design I use in my production system. It is a strange mixture of different things that don’t always make much sense together. For example, why not just drop the UNIX-style privileges and put a couple extra rows in t_privilege instead? I could do that easily, and my queries would be simpler. The entire system would be more consistent. In fact, if I were starting from scratch, I might do that. I’ve let you see this design in this article so you can make up your own mind about which features you want to include.

One thing I’d be reluctant to give up is the laziness the UNIX-style privileges give my code. For the vast majority of uses, my code just wants to know whether the user is allowed to read or update an object. That can be answered without a trip to the database, so it’s an efficiency.

I have also never really needed the “other” role in the t_privilege table, so I haven’t written the queries for it (check the queries!). That means my “other” role is implemented in the UNIX-style privileges only. It works for me, though it’s slightly inconsistent.

What’s missing? How can you extend this system?

I’ve deliberately omitted a few things. One is negative privileges, which deny someone the right to do something. This would not be hard to add — I’ve just never needed it! You could do an exclusion self-join against negative privileges to implement this, and store the negative privileges in the same table. Another possibility would be using more bitwise logic to negate privileges. I’ve honestly never put too much thought into it.

I hope my sample queries (which are almost identical to my production queries, by the way) give you enough insight to figure out other special things you may need, such as the “creator” or “supervisor” roles I mentioned. Another possibility is packing more bits into the UNIX-style permissions. My examples only use 9 bits. If your application is constantly asking whether some other action, besides read/write/delete, is possible — hey, use those extra bits. Or if you want, put another role besides user/group_owner/other into the UNIX-style bits. Just because I modelled after UNIX doesn’t mean you can’t do it differently.

There’s lots of room to play with the table structures. For example, my production system has extra columns on t_action and t_implemented_action to define labels and other things my user interface wants.

How does this compare to other systems?

I hate comparisons for their own sake, so I’ll only say my table schema is as simple as I could make it and still jam all the special cases in. There are only a couple of tables, and no complex hierachical relationships between them — everything is flattened out and de-normalized as much as I can think to do. By comparison, phpGACL uses 18 tables to represent relationships among ACOs and ARO. Here’s another system I’ve found on the web at sqlrecipes.com. Its schema is also more complex.

These systems may work better for you, especially if you need a more traditional hierarchical ACL. Fortunately, I’ve always been able to accomplish what I need without hierarchy and with just a few groups, some custom roles, column defaults, lots of bitwise arithmetic, and special types of privileges. And I have managed quite complex data with this schema, such as inventory and accounting systems together in the same application.

How can you integrate this with a web application?

I currently use this system to implement a lot of the logic behind my web application. Between the privileges, type checking in the queries, and status-awareness, all my web application needs to do is ask the database what the user can do, and then display it. For example, I often use a “tab” metaphor, and the tabs are generated from the query. Here’s a screenshot of the property pages I mentioned:

Property Pages Screenshot

I use the same query to populate drop-down menus and so on.

My web app also uses url rewriting to funnel requests through a central dispatcher, which reads the URL and decides what file should handle the request. Before that happens, though, it verifies that the user is authorized to do what the URL requests! For example, my URLs look like “http://www.site.com/event/list_all/”. The dispatcher looks at the URL and realizes the user is trying to take the list_all action on the event type. It runs the appropriate fetch-all query and sees if list_all is allowed. If so, it dispatches the request. The file that handles the request can focus on business logic instead of handling authorization, because it knows the user is pre-authorized. This is what I meant in my first article, when I said access control should be at the heart of the application, not bolted on. As a result of this design, most of the pages that actually do something only need a few lines of very focused code. This approach has also allowed me to factor out virtually all common code. I can’t think of any duplicated code at all right now, though I’m sure there is some.

I’m not trying to brag; I just want to share with you how this has been a success for me.

Summary

It’s pretty hard to summarize such a complex article, but I’d like to wrap up by saying I’m keen to hear your comments. Suggestions for improvements are especially welcome. I hope this article has been helpful for you.

And of course, if you did find it helpful, you might like to subscribe and receive free, convenient updates by email or feeds when I post new articles.

Technorati Tags:No Tags

You might also like:

  1. How to build role-based access control in SQL
  2. How to escalate privileges in MySQL
  3. How fast is MySQL replication?
  4. How to use meta-data to sort itself
  5. How to find duplicate and redundant indexes in MySQL

How to build role-based access control in SQL

The posts I’ve been reading and writing recently have reminded me how Object-Relational Mapping (ORM) systems make it fun and convenient to interact with databases. For some of the reasons they’re a developer’s favorite, they can be a database administrator’s nightmare (think surrogate keys). But designing tables with a consistent set of columns has its benefits. Just because the columns are meta-data that have no intrinsic meaning doesn’t mean they have no value. In this series of articles I’ll show you several ways to use such “meaningless” meta-data to enable powerful, efficient application-level role-based access control (RBAC) in the database, with a focus on web applications, though you could do this for any application.

The systems I’ve built are complex, so I’ll split this into at least two articles. This first article will discuss other privilege systems I’ve seen in web applications, including Access Control Lists (ACL), and introduce a simplified row-only version of the privilege system I currently use. The second article will discuss the full scope of my current system, which is much more complex and powerful. Along the way I’ll explain how to add or remove features and complexity, to achieve the right balance of control and simplicity for your application.

My goal is to explain the systems I’ve built so you can design your own, without taking years to learn how, as I did. I will present sample schemas and functional queries.

Introduction

When people think of privilege systems, if they’ve been around computers and worked with security for a while, they typically think first of Access Control Lists. This is one of those concepts that means different things to different people. While there are some standards, not everyone agrees on them, and ideas about what ACLs really are and how they should work vary widely.

This article is about a tabular ACL implemented in a database to define privileges a user has on the data. This isn’t about SQL privileges; this is about implementing your application’s security model with database tables and queries. Specifically, it’s about how to do this with the table designs you commonly see in an ORM system. This article is also not about authentication, it’s about authorization. Your system should already be able to authenticate a user.

I’ll use some terms interchangeably throughout this article. “Privilege” and “permission” generally mean the same thing. Since I assume your application uses ORM, “object” and “row” are basically the same thing from two viewpoints (in the code it’s an object, in the database it’s a row). An object’s “type” or “class” has some direct relationship to the table in which the row is stored; the class is probably named the same as the table. Finally, “role” and “group” are similar, but not the same thing; a group is a role, but roles are a superset of groups.

Requirements for my ACL system

The system I’ll explain in these articles grew over the course of several years. It was driven by the need to manage an increasingly complex membership-based website in my university. Access control was always the Achilles heel until I found an elegant way to do it; then it became the system’s greatest strength, allowing us to use role-based access control to enforce row-level privileges on every row in the database. Fine-grained, tightly integrated control was one goal. In fact, the ACL is so pervasive in the website, the user interface is built by asking the database “what can this user do here?” A single query tells the website what the user should see. Another important design goal was scalability. The complexity and speed should remain virtually constant, no matter how many rows are in the database.

My design has the following core features, some of which I’ll save for the next article:

  1. Users are defined individually.
  2. Users can belong to one or several roles.
  3. Roles can be granted permissions to take actions.
  4. Privileges can be defined on individual objects, as well as classes and collections of objects.
  5. Built-in defaults handle nearly every privilege, and ACL entries are only needed to override or enhance the defaults.
  6. Actions aren’t static; I can define whatever actions I need. The system contains definitions of actions, as well as defining what types of objects they apply to.

I omitted several features commonly found in other systems — namely, the ability to deny privileges explicitly, and an inheritance tree of privileges. I’ve never found a need for denying privileges as long as roles are set up right, and hierarchies always make things more complex in relational systems. They also tend to make things less efficient to compute in a relational system, so for that reason my system is not hierarchical at all.

What problems does my design solve?

Most of the privilege systems I’ve seen — for example, those that control photo galleries, bulletin boards, shared calendars and the like — are not very fine-grained. They usually take the form of a few database tables that define users, groups, and a mapping between the two. The table of groups usually has a few entries with a column for each privilege that applies. For example, you might get the following if you select everything from the groups table:

+-------+------------+------------+
| group | can_delete | can_update |
+-------+------------+------------+
| admin | 1          | 1          | 
| user  | 0          | 1          | 
+-------+------------+------------+

The application code often looks like this:

if ( $user->is_in_group("admin") ) {
   $message->delete();
}
else {
   print_error("Sorry, you can't delete messages.");
}

if ( $user->is_in_group("users" || $user->is_in_group("officers") ) {
   // display some link here... ad nauseum
}

Does that look familiar? That’s how a lot of applications start, but as it grows larger, I guarantee it will become a disaster. Such systems tend to get extremely buggy and hard to work on, and may perform very badly. I’ve written before about separation of concerns; for example, today’s web programmers know about using CSS to separate content and presentation. Separating your privilege system from the other logic in your code is a very good thing too; perhaps one of the most important design decisions you can make. It should be the foundation of your system, rather than woven through it.

One thing that’s wrong with the code above, though it may not stand out, is that the code has to know what an “admin” user can do. In other words, the code is asking what group the user is in, and then deciding whether to let the user delete the message. This is not a good way to do things. Instead, the code should be asking for permission to do what it needs — delete the message — and let the privilege system decide. The code shouldn’t know what groups have what privileges, because then any change in privileges requires the code to change, making the system very brittle (hard to change, easy to break). This may seem like an unimportant point, but separation of logic and privileges becomes crucial as time passes. One way you can tell if you have a good separation is if any of your code refers to a group. If you have any calls like if ( $user->is_in_group("admin") ), you already have a problem. If it doesn’t seem like a serious problem, give it some time and you’ll probably see :-)

Some web application frameworks, such as CakePHP, have a more traditional ACL. There’s also a PHP implementation of generic access control lists, phpGACL. Used properly, these separate concerns very nicely, which is a Good Thing.

People have clearly built good privilege systems before, but when I started building the website that led me to my current design, web searches for these concepts turned up nothing. Even today there’s a relative dearth of good, comprehensible information about this subject. Why is this? I think it’s because it’s very hard to do well. If you’re an operating systems programmer, or you work with filesystem security, you’ve probably been exposed to it (especially to standards like POSIX), but most web application developers haven’t, and it’s frankly something most people can’t be expected to do well.

I know, because I didn’t do it well at first either (and some of you, after reading this, will think I’m still not doing it well). My first attempts were many lines of code that made round trips to the database and had recursive calls and nested if statements six ways to Tuesday. It was slow, it consumed a lot of memory and CPU, and as the database got more and more data, it grew too large to perform well. And boy, it was buggy — it was just too complex for me. In fact, it wasn’t till I designed the present system that I finally found some of the bugs I’d had for years!

Some of these problems are common even in well-implemented systems. For example, the phpGACL demo site runs out of memory when I use it:

Fatal error: Allowed memory size of 524288000 bytes exhausted (tried to allocate 57958 bytes) in [snip]/gacl_api.class.php on line 1359

Why is phpGACL trying to allocate tens of megabytes of memory just to check whether something is permitted? It’s manipulating tree structures, recursing, and so forth. [Update: this is incorrect; see the comments — however, you’ll see more about this in the next article] It’s a complex approach, and can be inefficient. I don’t say that to dismiss the developer’s hard work. Trees are wonderful in theory. But it’s like the difference between DOM and SAX parsing in XML — if you’re parsing very large XML files, you almost certainly want to use SAX or another incremental parser. The system I’ve designed used to work like DOM parsing — like phpGACL, in fact. Now it works more like SAX.

The end result is that, for me at least, my design solves these problems:

  1. It provides table-level and row-level control.
  2. It keeps the privilege system and the code separated.
  3. It is easy to make correct.
  4. It is efficient and scales well with the number of objects (not linear scaling, constant scaling).

How can you keep your privilege system small and fast?

The trick is asking a different question. A traditional ACL asks “given this user (ARO, or ‘Access Request Object’), and this object (ACO, or ‘Access Control Object’), can the user do such-and-such?” The answer, and how much work has to be done to get it, depends on a lot of factors, and if you want to ask about another action, you usually have to make another call to the ACL system. Instead, my current system asks “what can this user do with this row?” and touches a small, essentially constant amount of data, but gets back a complete answer of everything the user can do, which can be cached for future calls.

To help minimize extra data, my system is modelled after a blend of UNIX-style privileges, which I’ll introduce in this article, and a more conventional (but not hierachical) ACL, which I’ll save for the next article. Not only is it a very powerful model, but it’s familiar to a lot of people, which makes it easier to understand and administer.

Here are the ways in which it’s like UNIX:

  1. Every object (row in a table) is owned by both a user and a group.
  2. Users can belong to multiple groups.
  3. Privileges can be granted to a row’s owner, or to its group-owner.
  4. Privileges can be granted not only on rows, but on tables too (in UNIX, privileges apply to both files and directories).
  5. There is a “root” group which always has permission to do everything.
  6. By default, an object (row) stores its own minimal set of read/write/delete privileges, which are sufficient for most common tasks. These are similar to UNIX’s read, write, and execute privileges.
  7. The minimal read/write/delete privileges specify User, Group, and Other, just as in UNIX.
  8. Schema defaults (default column values) are similar to “sticky bits” in UNIX directories.

If these concepts aren’t familiar to you, I always point people to the ls man page. The hardest thing for people to grasp is usually the concept of a group owner.

And here are the ways it’s unlike UNIX, topics I’ll cover in the next article:

  1. There are no primary and secondary groups; all group memberships are equal.
  2. Privileges are dependent on the object’s type (i.e. the table in which it’s stored).
  3. Users and ACL entries are objects just like any other, because they’re stored as rows in the database, and thus subject to privileges too.
  4. Privileges can be granted on rows and tables, as I said above, but a special class of privileges can also be granted on all rows in a given table. Not the row itself, not the table itself, but all rows in the table. This is essentially the same as extending object and class privileges to set privileges.
  5. Privileges can be contingent on external factors, such as an object’s status.
  6. Most people think of groups and roles as the same thing, but they’re not quite the same in my system. Special types of roles can be defined, which are not directly related to groups. For example, because a user of a website or similar system usually needs special privileges to “itself,” there’s a special “self” role that grants a user rights to do things to itself that it doesn’t have to any other user. These roles aren’t static; you can define more, such as “creator” if you want. Privileges can also be granted to an arbitrary group, which can be a different group than the object’s owner group.

As you’ll see in the next article, these properties permit O(1) scaling by letting one entry in the ACL apply default privileges to an entire set of objects, instead of having one entry per object that needs a privilege. For example, the “self” role can allow a user to update its own password — a privilege it shouldn’t have on other users. Without the “self” role, every user might require an ACL entry to allow it to change its own password.

The database schema

I’ll start with a simplified system to control access to rows (you’ll see how to control access to tables in the next article). My system treats every row uniformly, which requires every table to have some extra columns to support the ACL.

A word on naming: In this article I’ll use a prefix of t_ for tables and c_ for columns. Not only does this keep the queries clear, as you’ll see later, but it lets me use reserved words as identifiers (such as c_group). I’ll also use MySQL as an example, though of course you could use other databases. Here is the basic table schema:

create table t_foo (
    c_uid             int not null auto_increment primary key,
    c_owner           int not null default 1,
    c_group           int not null default 1,
    c_unixperms       int not null default 500,
    -- other columns ...
);

You need these columns in every table. I’ll introduce the columns here, and explain them in more detail as I go.

  1. c_uid is the primary or surrogate key for each row (it doesn’t have to be meaningless auto_increment, as long as it’s an integer).
  2. c_owner is the ID of the row’s owner. This corresponds to c_uid in the t_user table.
  3. c_group defines which group owns the object. As I mentioned above, this is frequently a source of confusion, because people tend to think this defines a user’s group memberships. That’s a special case I’ll cover later.
  4. c_unixperms defines the object’s UNIX-style read/write/delete permissions.

Groups and group membership

Groups could be defined in the database, but in practice I find them so static that it’s better to hard-code a lookup table or enumeration in the application code, eliminating a trip to the database to fetch the group definitions for every request. Here’s a typical definition for PHP:

$groups = array(
   "root"          => 1,
   "officer"       => 2,
   "user"          => 4,
   "wheel"         => 8
);

Notice these are powers of two. That’s because I’m going to use a lot of bitwise arithmetic to do the queries for groups. This limits the number of groups to the number of bits in an integer, but for simplicity and speed, I’ve found it better to accept that limitation (I’ve never needed more than 8 groups to define a nicely granular system of privileges, and an unsigned integer allows 32). One thing to keep in mind here is that groups and roles do not have a one-to-one relationship, so this design doesn’t limit your total roles to the number of bits in the column, just your groups.

A row’s c_group column contains the ID of the group that owns the row; so if the officer group owns a row, it will have the value 2.

A user’s group memberships are handled differently. Since group IDs are powers of two, instead of being normalized into a separate table, they can be packed into a single integer. This is stored in the c_group_memberships column on the t_user table. This de-normalization removes complexity and data from your system, and makes queries much more efficient. I’ll be using the bit-packing optimization a lot, as you’ll see.

People often get confused about group memberships, because the t_user table also has a c_groups column like every table, but that has nothing to do with which groups the user is a member of; it stores the group that owns the user.

A user who is in both the “root” and “user” groups has a c_group_memberships value of 5.

UNIX-style read/write/delete permissions

The UNIX-style read, write, and delete permissions are defined in another array in the code, and packed into each row’s c_unixperms column:

$permissions = array(
   "owner_read"   => 256,
   "owner_write"  => 128,
   "owner_delete" => 64,
   "group_read"   => 32,
   "group_write"  => 16,
   "group_delete" => 8,
   "other_read"   => 4,
   "other_write"  => 2,
   "other_delete" => 1
);

A row whose c_unixperms column has the value 500 (decimal) has the value 111110100 in binary, so that means, from most to least significant bit, the owner can read, write and delete; members of the owner group can read and write; and other users can just read. This is probably familiar to you if you know UNIX filesystem permissions.

Sample schema

Sample data is helpful at this point, so I’m going to script out a minimal schema and populate it for some queries. Here’s the script:

drop table if exists t_user;
create table t_user (
   c_uid             int             not null auto_increment primary key,
   c_owner           int             not null default 1,
   c_group           int             not null default 1,
   c_unixperms       int             not null default 500,
   c_username        varchar(50)     not null,
   c_group_memberships int           not null
);

insert into t_user (c_username, c_group_memberships)
   values('root', 1), ('xaprb', 4), ('sakila', 5);

drop table if exists t_event;
create table t_event (
   c_uid             int             not null auto_increment primary key,
   c_owner           int             not null default 1,
   c_group           int             not null default 1,
   c_unixperms       int             not null default 500,
   c_description     varchar(50)     not null
);

insert into t_event(c_owner, c_group, c_description) values
   (1, 1, 'MySQL Camp'), (1, 4, 'Microsoft Keynote');

How to determine whether a user can take an action

That’s a complete set of data, so now you can start asking questions about whether a user is allowed to do things to an object. To figure this out, you need the following information:

  • The user’s ID and group memberships.
  • The type and identity of the thing in question. Since this article deals only with objects, you need to know the table it lives in, and its c_uid.
  • The desired action (read, write, delete).

I’ll start by asking questions the way a traditional ACL does: can user X do Y to object Z? For example, let’s see if user ‘xaprb’ has the right to read the ‘MySQL Camp’ event:

  1. xaprb’s user ID is 2 and c_group_memberships is 4.
  2. The event’s c_unixperms column is 500, which grants owner_read, owner_write, owner_delete, group_read, group_write, and other_read.
  3. The event’s c_owner column is 1, so xaprb is not the object’s owner, and none of the owner read/write/delete privileges applies.
  4. The event’s c_group column is 1, and xaprb is not in the group that owns the object. None of the group privileges applies.
  5. xaprb is in the ‘other’ role (everyone always is). So the other_read privilege applies.

Therefore, xaprb can read the event. Can user ’sakila’ update (write) the ‘Microsoft Keynote’ event? Let’s see:

  1. sakila isn’t the event’s owner, so none of the owner privileges applies.
  2. sakila is in group 1 and 4, and the event’s group owner is 1, so group_read and group_write apply.

So sakila can update the event.

Because the privileges are packed into bits, you can reduce this to logical and bitwise operators. Assuming $u is the user and $e is the event, this expression determines whether the user can write the event:

$can_write
   =  (( $e->owner == $u->id ) 
         && ( $e->unixperms & $permissions['owner_write'] ))
   || (( $e->group & $u->group_memberships )
         && ( $e->unixperms & $permissions['group_write'] ))
   ||       ( $e->unixperms & $permissions['other_write'] );

And so it goes; you can do similar calculations for read and delete permissions. You would probably want to write a class method that would allow you to express this as $u->can('write', $e) or something similar. This is also an easy query to write in SQL, although you don’t need to go to the database to find data you already have in your objects.

Notice how the code doesn’t have to know anything about whether a user is in a particular group! That knowledge is completely encapsulated in the integers the objects carry around with them. This is part of the separation of code and privileges I mentioned above.

What’s next?

You can begin to see how powerful the system is just from this humble beginning. Read, write and delete access are by far the most used privileges, so just by emulating UNIX file permissions I’ve shown you how to get the basics with very little work.

The above is not how you will ultimately determine privileges, however. As I said above, asking the question in this fashion is eventually not scalable. You want to be able to ask a question such as $privs = $e->permits($u) and get everything at once. That capability is a superset of what I’ve shown you above, so when you have that capability, you can ask the question both ways if it makes sense for your code.

We’re not ready for that yet. It will involve quite a bit more complexity, which I don’t want to introduce in this article. The next article will expand this system greatly, explaining the extra capabilities I’ve mentioned throughout this article, and then some. What I’ve shown you here only scratches the surface. I’ll also discuss how to pick and choose features you need, a bunch of optimizations for size and speed, and how you can build your own customizations into the full-blown system so it does exactly what you need.

Finally, the next article will explain how to build truly role-based access control. What I’ve shown you so far doesn’t really have to be implemented as role-based, though you’ll see how it fits into the role-based paradigm in the next article.

Technorati Tags:No Tags

You might also like:

  1. Role-based access control in SQL, part 2
  2. How to escalate privileges in MySQL
  3. How to find duplicate rows with SQL
  4. Microsoft’s sp_grep system stored procedure improved

How to reverse a sequence in SQL

I wrote an article a while back about how to order updates in MySQL so you don’t violate a unique index. I said I’d write another article on how to swap numbers in a sequence with a unique index. This is that article, but I’m going to make it a little more generic: how to reverse a (possibly ordered) sequence.

I’ve been thinking about this for a while, wondering if there’s a way I can do it in-place in one statement (I like to pile challenge upon difficulty). I’ve thought of a number of techniques, some using one statement, some using more, some that won’t work on MySQL, some that will.

This article assumes a sequence of non-negative numbers in the following table:

create table t (
   pk int not null auto_increment primary key,
   i int not null,
   unique key(i)
);
insert into t(i) values (1), (2), (3), (4), (5);

The pk column is so there’s something to compare the original ordering against, and I will not use it in any query other than to show the original order. I don’t assume the sequence is composed of adjacent numbers — there could be gaps, and they don’t have to be strictly increasing.

Method 1: in-place, one statement

The most straightforward way to reverse a sequence is to do it in-place, in one statement, with a little subtraction. First, here’s a SELECT statement to reverse the sequence:

select pk, (select min(i) from t) + (select max(i) from t) - i from t;

Given that statement, you can write an UPDATE statement based on it:

update t set i = (select min(i) from t) + (select max(i) from t) - i;

That is valid SQL, and will work on other RDBMSs, but for a number of reasons it won’t work in MySQL. The first problem is “ERROR 1093 (HY000): You can’t specify target table ‘t’ for update in FROM clause.” You can solve this with variables:

select @min := min(i) from t;
select @max := max(i) from t;
update t set i = @min + @max - i;

Now you get the unique index violation: “ERROR 1062 (23000): Duplicate entry ‘5′ for key 1.” There’s just no way around this in MySQL. You must turn to more devious methods!

Method 2: bitwise XOR

Ah, here’s a solution in search of a problem! Let’s see, bitwise XOR is really cool, what can you use it for today?

Well, you can use it to swap some numbers, for starters. As you may know, you can XOR two numbers together three times and they will magically trade places. It’s a great way to swap numbers without using a temporary variable for extra storage, though that’s really of dubious value here. In any case, it’s a fun exercise to write in one statement. Can you do it more simply than the following?

select pk, ((i ^ (@min + @max - i)) ^ ((i ^ (@min + @max - i)) ^ (@min + @max - i))) from t;

Unfortunately, once you’ve proven you can use XOR to reverse the sequence, the problem of assigning it back to t is still the same; you can’t do that in MySQL without violating unique indexes.

Method 3: mirror, delete and shift

Another method is to copy the sequence back into the table, with an offset so the unique index isn’t violated, then delete the original sequence and shift the copy back into its place (or some variation of this scheme). This will work as long as the column’s data type has enough room to store the copy. In this case, the sequence is non-negative, so negative numbers are a good place to hold the copy.

insert into t(i) select -i from t;
delete from t where i between @min and @max;
update t set i = @max + 1 + i;

This is the first method I’ve shown that will work on MySQL. Other variations on this method include copying the rows off to a temporary table. Regardless, the basic thing that makes this work on MySQL is deleting the entire original sequence before trying to put anything back in its place. If you don’t do that, the index will be violated.

Other methods

Besides variations on method 3, there are some other creative things you could do:

  • Add another column, reverse the sequence into that column, then delete the original column and rename the new.
  • Use the XOR technique on only half the table at once: XOR the bottom half against the top, the top against the bottom, and the bottom against the top again. This doesn’t work if there’s a unique index, as in the example I give.
  • Use an ORDER BY clause to reverse the numbers. For my examples shown here, this is the obvious solution, but it doesn’t work if the numbers aren’t in increasing order.
  • Allow NULLs in the column, then instead of deleting in method 3, just set to NULL temporarily.

Summary

I just showed you a bunch of different ways to reverse a sequence in SQL, something that may or may not be useful. I’ve been amusing myself for months trying to find fund and different ways to solve this problem. I’m interested in any ideas you have too.

Technorati Tags:No Tags

You might also like:

  1. How to avoid unique index violations on updates in MySQL
  2. How to find contiguous ranges with SQL
  3. How to generate sequences and surrogate keys in generic SQL
  4. How to implement a queue in SQL
  5. How to write INSERT IF NOT EXISTS queries in standard SQL

Four types of database abstraction layers

Quite a few people have chimed in on a recent discussion about PHP, MySQL, database abstraction layers, and performance. I think enough viewpoints have been covered that I don’t need to comment, but one question I don’t see answered is “what are the qualities of a good SQL abstraction layer?” I think it’s a very interesting — and complicated — question. As it turns out, the term has several meanings, and I think it’s important to understand them.

I started drafting this article in February, but put it aside until the recent spate of articles prompted me to finish it. Here are links to some of those articles:

This issue has been discussed before, too. For example,

That explains why I’m not going to jump into the fray with my thoughts on the developerWorks article; everything I’d say has been said before. I just want to see if I can clarify what people mean by “database abstraction layer,” and lay out some ways to think about what’s good and bad in such a system.

Types of abstraction layers

People sometimes say “SQL abstraction layer” or “database interface” fairly loosely, assuming everyone knows what they mean. Not so — I’ve seen at least four distinct meanings in common usage:

  1. A software library to connect to a database server and issue queries, fetch results etc.
  2. A software library to present a common API to different database servers.
  3. A software library to automatically generate portable SQL queries.
  4. A software library to map Object-Oriented Programming to a relational database (Object-Relational Mapping, or ORM)

Most libraries will also provide related functionality such as escaping quotes for preventing SQL injection attacks, getting server status, controlling transactions, and so forth.

Each type of interface usually builds upon the types that precede it in my numbering scheme. Each type has different goals, which you have to understand before you can decide what criteria to use whe