Tag Archive for 'mysqlconf07'

Slides for the innotop workshop at MySQL Conference and Expo 2007

Speaker at MySQLConf 2007

The slides for my innotop presentation at the recent MySQL Conference have been posted, along with other presenters’ slides, on the conference presentations page. Love that stock photography!


MySQL Conference and Expo 2007 Audio

Download Ogg Vorbis Files

I recorded many of the sessions I attended at the conference. You can download the audio files in Ogg Vorbis format here. These files will not stay up forever — I will probably remove them after a few weeks.

My recorder only records in mp3 format, so I was forced to crank the bitrate down pretty far to avoid ending up with gigabytes of data. Too bad it doesn’t record directly to Ogg Vorbis format; if it did, I could get natural-sounding voice-quality at something like 8 kB/sec. Anyway, it is what it is.

Some of the files begin with a little silence, or begin partway into the talk. If you don’t hear anything, try skipping forward a few minutes.

UPDATE: Kevin Burton kindly hosted an iPod-compatible podcast of these files in mp3 format (more than twice the size, but… I think he has lots of disk space and bandwidth).

MySQL Conference and Expo 2007, Day 4

In my fourth day at the MySQL Conference and Expo 2007, I attended several great sessions, starting with my own.

If you’re wondering why this is a day late, it’s because the conference ended in the late afternoon, and almost immediately (within a half-hour or so) they removed the free wireless Internet access in the hotel. That was uncool.

Keynotes etc

I skipped the opening keynotes to polish off last-minute changes to innotop and my presentation. I was sort of interested in the keynotes, but by this time I had begun to understand that the keynotes are often as much about sponsorship as about disseminating great ideas. I didn’t see Eben Moglen on the schedule again.

Here’s the lunch menu:

  • Sun-dried tomato and onion salad with Balsamic reduction
  • Napa cheese board with marinated rosemary and garlic
  • Seasonal vegetables
  • Roasted vegetable salad
  • Herb buttered tomatoes
  • Scented olives and crusty bread
  • A wedge of spinach quiche Florentine
  • Garlic rubbed rosemary breast of chicken
  • Pacific salmon filet with roasted tomato sause [sic]

Lunch was served atop a premium heavy white cotton table dressing resting on a circular quadri-pedal planar table. The center dressing was a cluster of upright wheat grass nestled on a sparkling porcelain platter, garnished with…

… two small ripe-looking plastic pears.

Seriously.

The innotop workshop

This was my own session. Pre-show entertainment was provided via the sound system by my iRiver. It was Mark Knopfler’s album Shangri-La.

I stepped through a little history and an overview of innotop’s features, then went into more advanced functions, demonstrating its SQL-like ability to compile expressions into subroutines and let you easily write formulas to monitor whatever you want. I showed how to sort, filter, colorize and otherwise customize any display. There are many features I didn’t get into because they’re either too geeky or not fully baked yet, but I did show how to add and delete columns from tables.

In the second part of the hour I demonstrated some more advanced things you can do, such as watching InnoDB’s undo log entry count to see how much work your transactions have done. If you know how many rows a transaction will affect, this is a good way to see how close to completion it is.
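To make that arithmetic concrete, here is a minimal Python sketch. It assumes you have already read the transaction's undo log entry count (for example from SHOW INNODB STATUS) and that each affected row produces roughly one undo entry, which is a simplification. The function name and its parameters are hypothetical and are not part of innotop.

```python
def estimate_progress(undo_entries, expected_rows):
    """Return the estimated fraction of a transaction's work completed.

    Assumes about one undo log entry per modified row, which is a
    simplification; treat the result as a rough guide only.
    """
    if expected_rows <= 0:
        raise ValueError("expected_rows must be positive")
    # Cap at 1.0 because the estimate of expected rows may be too low.
    return min(undo_entries / expected_rows, 1.0)


# Hypothetical numbers: 250,000 undo entries so far, 1,000,000 rows expected.
print(f"{estimate_progress(250_000, 1_000_000):.0%} done")
```

Sampling the count twice, a few seconds apart, would also give a rate from which to guess the remaining time.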

I had only one slight problem as I discovered a bug in the quick-filters functionality I promised to write and demo. This is what live demos are for, no?

There were quite a few people in the room, but it wasn’t packed (packed would have been upwards of 100 I think; I maybe had 50 or so). But, it’s quality not quantity, right? Quality is when Heikki Tuuri himself (the man who created InnoDB) attends. He taught me a few things I didn’t know, too.

The best part of the session was the feedback and suggestions I got.

I also spent a few moments at the end of the talk mentioning MySQL Table Checksum and MySQL Table Sync, which can efficiently detect when a slave server is out of sync with its master and repair it. As far as I know I’m the only person who’s written good tools for this purpose.

The new version of innotop will be in theaters near you as release 1.4.2 in a few days.

Incidentally, I got to meet Rohit Nadhani from webyog, who wrote the SQLYog Job Agent I benchmarked against MySQL Table Sync recently. I saw a demo of their new monitoring tool, MONyog. It looks really impressive, but I will try it out some more and let you know. A well-known MySQL user asked in my session if there could be any chance of innotop becoming an interactive web application. The answer is no. But take a look at MONyog.

Rohit said they plan to integrate some of the data from SHOW INNODB STATUS in future releases. He looked terrified and said “oh, no!” when I told him I had to read the InnoDB source code to figure out some of the SHOW INNODB STATUS parsing. To anyone who wants to parse it — my advice is “rip off my Perl code. It’s GPL and I have an enormous test suite.” This will save you a huge amount of work.

On a related topic, I got a chance to ask the NitroEDB developers about the license for the NitroEDB storage engine for MySQL. The answer was how much it cost, which is A Lot. “Price isn’t license,” I pointed out. “What about GPLing it?” The answer to that was “It’s millions of lines of Modula-2. If you think you can do anything with that, we’ll GPL it.”

It’s a shame I didn’t get this remark on the record. It doesn’t matter what the language is, Free is Free. Heck, I’m in the middle of a textbook on compiler design right now, so I’ve been reading a lot of Modula-2 lately. Give it to me in Elbonian, I don’t care. Interestingly, for about $200 you can apparently get the source code from the Department of Defense up to a certain point. I’m sure that would at least enable learning about some of the indexing algorithms.

I didn’t have a ready response at the time, so I cracked a joke about converting it into Perl and left for the next session.

What Happens After You’re Scalable: Capacity Planning for LAMP

This session was very good. It was by a Flickr engineer, and focused on how to plan for capacity. I would say the Big Idea here was: don’t try to calculate demand, just measure and extrapolate. Another might be: don’t fall into the trap of squeezing out another few percent of performance; just take what you have as a given and make sure you can scale effortlessly.
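As an illustration of "measure and extrapolate" (my own sketch, not anything shown in the talk), the following Python fits a straight line to measured weekly peaks and projects when they would reach a ceiling. The ceiling is assumed to come from measuring a real server at its limit; all names and numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def weeks_until(ceiling, weeks, peaks):
    """Extrapolate measured weekly peaks to estimate when a ceiling is hit."""
    slope, intercept = fit_line(weeks, peaks)
    if slope <= 0:
        return None  # load is flat or falling; no projected crossing
    return (ceiling - intercept) / slope


# Hypothetical measurements: peak queries per second in weeks 0..4.
weeks = [0, 1, 2, 3, 4]
peak_qps = [1000, 1100, 1200, 1300, 1400]
print(weeks_until(2000, weeks, peak_qps))
```

With these made-up numbers the load grows by 100 queries per second each week, so the projected crossing of 2000 is at week 10. A straight line is the simplest possible model; real growth may need something else.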

I’m sorry I didn’t take extensive notes on this one. I would say the slides are well worth reading through, though.

InnoDB Performance Optimization

This session was presented by Heikki Tuuri and Peter Zaitsev, two pretty much undisputed experts on InnoDB. The big idea in this talk was to learn how InnoDB really works so you can use it smartly. Peter did most of the talking and Heikki occasionally picked up the microphone to add a comment. It was informative, but also quite entertaining.

Here are a few of the topics Peter and Heikki covered:

  • Application design is the biggest factor in InnoDB performance.
  • Use transactions, but use them smartly.
  • Don’t use LOCK TABLES with InnoDB.
  • Use clustered indexes to your advantage. Understand how secondary indexes work as a consequence, and understand how clustered indexes interact with locking and isolation levels.
  • Know how updates, the hash index, and the query cache interact with InnoDB.
  • Understand automatic clustering keys and how keys might be promoted to clustered status.
  • Learn about row versions and purging, and how they interact with the READ COMMITTED isolation level.

Heikki quote of the week: when Peter said “SELECT FOR UPDATE cannot use a covering index because it must access the primary key to place locks,” Heikki said “this is self-evident,” which made my day.

By the way, one interesting thing I learned from Heikki is how to pronounce InnoDB. He says “in-no-db” not “eye-no-db” or “ee-no-db.”

I only stayed for the first part of this session, because I wanted to catch the Data Warehousing talk running at the same time.

High Performance Data Warehousing with MySQL: Tricks and Tips from the Field

Brian Miezejevski (it’s spelled just like it sounds, if you’re Polish) gave this talk. The focus was on how to use MySQL to build very large data warehouses that you can query, back up and maintain efficiently. Brian has a lot of experience in this area, including working with data warehouses over a dozen terabytes in size.

The main practice he espoused was partition, partition, partition. When you get into large data volumes, everything you do needs to be done a partition at a time; it is not practical to do any kind of operation on tables with hundreds of millions of rows, so you have to split them up somehow. Load, back up, archive and purge a partition at a time.

Partitioning can be done many different ways, but the primary way in production-ready versions of MySQL is to use MyISAM tables and group them together into a single table with the MERGE storage engine. Compressed MyISAM tables can help if you have free CPU; if you are CPU-bound, compressed tables are slower of course.

Another technique is simply to use many tables and access them via UNION queries or views. Of course upcoming releases have genuine partitioned tables, which will be even better.

Brian talked for a few minutes about real-time data warehousing, where you need to continuously stream data into very large tables. It is impractical to do this with MyISAM if you have to do any UPDATE or DELETE queries at all (concurrent INSERT queries can run as long as they are appending to the table). For high concurrency, you need to use InnoDB. I asked about the possibility of making all partitions MyISAM under MERGE except for the “live” partition and then creating a UNION ALL view over them. Brian said he hadn’t tried it but it sounded like a good idea to him. He came back to this later and said he’d probably load by time slice in this fashion, then alter the “live” table to MyISAM and put it into the MERGE periodically.
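The shape of that UNION ALL view can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and SQLite has no storage engines, so the MyISAM, MERGE and InnoDB distinction is not modeled at all. Only the idea of one logical table over a closed partition plus a live one is shown. Table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stand-ins for a closed, read-mostly partition and the "live" one.
    CREATE TABLE events_2007_03 (ts TEXT, hits INTEGER);
    CREATE TABLE events_live (ts TEXT, hits INTEGER);
    INSERT INTO events_2007_03 VALUES ('2007-03-30', 40), ('2007-03-31', 2);
    INSERT INTO events_live VALUES ('2007-04-26', 8);

    -- Queries see one logical table; loads touch only events_live.
    CREATE VIEW events AS
        SELECT ts, hits FROM events_2007_03
        UNION ALL
        SELECT ts, hits FROM events_live;
""")

total = conn.execute("SELECT SUM(hits) FROM events").fetchone()[0]
print(total)
```

In MySQL the first branch of the view would presumably select from the MERGE table and the second from the InnoDB table; I have not tested that arrangement.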

The data warehousing Holy Grail is to use all hash joins for joining fact and dimension tables, but MySQL doesn’t (yet) support hash joins. However, you can simulate it with MEMORY tables, which have HASH indexes. You have to do some tricks like keeping a disk (MyISAM) and memory table updated together, and loading the memory tables automatically on server start, but apparently this is a really good technique. Your joins then take the general form

SELECT ..
FROM (
   SELECT [MyISAM tables]
) INNER JOIN [MEMORY tables]
GROUP ...

The effect is similar to a hash join.
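For readers who have not met hash joins, here is a minimal Python sketch of the idea being simulated. It is not MySQL code and not Brian's; the in-memory dictionary plays the role that the MEMORY table's HASH index plays above, and the data and names are hypothetical.

```python
from collections import defaultdict


def hash_join(fact_rows, dim_rows, key):
    """Join two lists of dict rows on `key` using an in-memory hash table."""
    # Build phase: index the (small) dimension table by the join key.
    index = defaultdict(list)
    for dim in dim_rows:
        index[dim[key]].append(dim)
    # Probe phase: one hash lookup per fact row instead of a scan.
    joined = []
    for fact in fact_rows:
        for dim in index.get(fact[key], []):
            joined.append({**dim, **fact})
    return joined


# Hypothetical star-schema data.
products = [{"product_id": 1, "name": "widget"}, {"product_id": 2, "name": "gadget"}]
sales = [
    {"product_id": 1, "amount": 10},
    {"product_id": 2, "amount": 5},
    {"product_id": 1, "amount": 7},
]

# Equivalent of the GROUP BY in the outer query.
totals = defaultdict(int)
for row in hash_join(sales, products, "product_id"):
    totals[row["name"]] += row["amount"]
print(dict(totals))
```

The point is that each fact row costs one constant-time lookup, however the dimension table is ordered, which is why keeping the dimension tables in memory with hash indexes gets close to the same effect.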

Brian spoke for a few minutes about NitroEDB (amazing indexing and aggregate performance) and InfoBright (extremely high compression) for data warehousing. But of MySQL’s own storage engines, the preferred solution is still MyISAM.

Other topics included how to back up really large data sets and how to build aggregates with ON DUPLICATE KEY UPDATE. Brian also mentioned the critical tuning parameters for large data sets and talked about the need to monitor slow queries and certain server status values to understand when performance changes so you have some advance warning when you hit some kind of performance ceiling.

Closing Keynote

The closing keynote was by one of the engineers who created Yahoo! Pipes, which in my opinion is the beginning of the semantic Web’s promise becoming reality. It’s built on MySQL and Perl among other things, of course. The demo was pretty neat, but the neater thing yet is you can go do it yourself just as easily.

I had an elevator conversation with someone afterwards about a Pipes storage engine for MySQL. As cool as Pipes is, MySQL’s storage engine architecture is no less cool. It just might take a little more imagination to see it.

Next

I too have some cool things in the works, including a whole new way to ensure slave servers stay in sync. I need to recover from my sleep deficit, snuggle with my wife and dog, and then I’ll get back to you.

MySQL Conference and Expo 2007, Day 3

In my third day at the MySQL Conference and Expo 2007, I again attended keynotes and sessions, one of which I participated in. This evening I had dinner with a fellow community member and arrived late to the Quiz Show, even though I was supposed to be on one of the teams! I blame it on the restaurant, because they took too long to figure out what I meant when I said “können wir einen Hubschrauber essen heute abend?”

Today I attended what were, by a decent margin, the best sessions I’ve been to all week. If you don’t think they’re saving the best for last, come to my tutorial and demo on monitoring MySQL and InnoDB with innotop tomorrow and see!

Just two quick notes: I am recording the sessions I attend on my iRiver when possible, and will post the audio for download after I get home. Also, you can click on the headings of each of the talks; I have linked them to the session description.

Keynotes

There were again three keynotes this morning. Eben Moglen delivered a fantastic, thought-provoking speech with which I mostly agreed. I was working on innotop during the others, though I was in the room.

Lunch was… I forgot to write it down. A salad and mixed vegetables, a roll, tomatoes that I had to cut. I don’t know. I was trying to meet some folks in the exhibit hall and it’s all a blur now.

NitroEDB for MySQL Storage Engine

This session was mostly a demo and/or sales pitch by three engineers from NitroSecurity. The technology seems well-done, but as far as I can tell the storage engine is not going to be GPL’ed. Too bad they’re missing a big opportunity. On the plus side, they did write some software to ease integration with the storage engine API, and that’s GPL’ed (probably because it has to be, since I guess it’s going to be linked with MySQL).

The Declarative Power of Views

This session was amazing! It was standing-room-only. Beat Vontobel started out with the classic “animals” guessing game, except it was a series of questions to figure out which programming language you were thinking of. The demo was live, running on his own server on another continent. You simply select from the questions table, and it asks you one question, which is just a single column of text. You insert your answer into the answers table, which is a single yes/no enum column. Then you select from the questions table again… and there’s a different question now. As you continue, it narrows down the choices and eventually guesses what you’re thinking of.

Behind the scenes, even though all you can see is the questions and answers, is a series of views. No procedures, triggers, or functions.

The idea? SQL is a declarative language, and can — and should — be used as a logic language, much like Prolog or Lisp. And the basic building block of the language is the view, which expresses a predicate. I would have said it’s a functional language, and the SELECT statement is the way to express a predicate; a view is an abstraction over the SELECT. But I’m not going to argue with such eloquence.

(And for those of you who saw me raise my hand to the “do you program in Lisp” question, no, it’s not because I’m an Emacs user. I’m not, I’m a Vim user. I use Lisp for artificial intelligence and expert systems).

As if his demo had not made the point forcefully, Beat then proceeded to show us Prolog and SQL code for “who is a sibling of who,” side by side. The parallel is obvious. It was tremendously impressive. If you weren’t there, download the slides and read them.
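To give a flavor of the parallel, here is my own small reconstruction of the idea, not Beat's code. It runs the SQL through Python's built-in sqlite3 module so it is self-contained; the table, the view and the data are hypothetical, and each child has only one recorded parent so that no sibling pair appears twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Prolog analogue, for comparison:
#   sibling(X, Y) :- parent_of(P, X), parent_of(P, Y), X \= Y.
conn.executescript("""
    CREATE TABLE parent_of (parent TEXT, child TEXT);
    INSERT INTO parent_of VALUES
        ('alice', 'carol'), ('alice', 'dave'), ('bob', 'erin');

    -- The view states the predicate; it says what a sibling is, not how to find one.
    CREATE VIEW sibling AS
        SELECT a.child AS person, b.child AS other
        FROM parent_of AS a
        JOIN parent_of AS b ON a.parent = b.parent
        WHERE a.child <> b.child;
""")

rows = conn.execute(
    "SELECT person, other FROM sibling ORDER BY person"
).fetchall()
print(rows)
```

The rule body maps onto the view almost clause for clause: the shared variable P becomes the join condition, and the inequality becomes the WHERE clause.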

Transaction Processing and Durability with solidDB for MySQL

This session was a very technical discussion of how isolation, Multi-Version Concurrency Control, durability and locking work in solidDB. The engine implements both pessimistic and optimistic locking models on a per-table basis, though the terms “pessimistic” and “optimistic” are somewhat misnomers, as is “locking.” It has more to do with record version numbers than locks, as I understand it. Durability (the D in ACID) can also be set to relaxed or strict, at the session level. These features, and the discussion around them, brought the compromises of speed, concurrency, storage, and durability into stark clarity. I’m impressed by solidDB’s range of choices for the DBA. Of course they are also going to support some notion of compatibility with InnoDB’s behavior.

Interestingly, you can mix pessimistically and optimistically locked tables in a single transaction, and the behavior is not upgraded or downgraded — you get pessimistic behavior on one table and optimistic on the other.

I have not yet learned how solidDB will be licensed.

MySQL Server Roadmap

This session was packed. Robin Schumacher took the first part of it, showing what’s planned for the entire MySQL product line over the next year or so. It was a talk calculated to make the audience spend the next year squirming in anticipation. Oooh, finally I’ll get enhanced replication monitoring, and subqueries will get decently optimized, and…!!!!

Robin is a confident, eloquent speaker. The kind of person whom I imagine promises things that make the developers in the audience cringe slightly. “Replication conflict detection! Next slide.”

He gave a demo of the upcoming online backup of a large table while selecting from the table in another session, but amusingly didn’t seem to notice that the SELECT queries in the other window were failing with syntax errors. (Never mind, though… you can do the demo yourself if you want; he included the code you’ll need in a recent article. Robin, if you’re reading this: if you noticed the statements failing, you are one cool customer to continue without missing a beat!)

He followed this with a quick mention of storage engine partners and then proceeded to pitch MySQL Enterprise and MySQL Workbench. Afterwards he finished up with a quote that went something like this: “Backup is coming. It’s real. It’s working.” ;-) Seriously, I believe him.

Jeffrey Pugh took the microphone at this point. He showed us a timeline of MySQL history and features, where we are today, and again what’s planned. Interestingly, he made a public admission of 5.0 being released before being ready, and said this mistake will not be repeated. Yet apparently sometime in the last week or so MySQL decided to skip directly from version 5.1 to 6.0, omitting 5.2. Is this version number inflation? It seems like it. Here are some semi-quotes: “We don’t want bugs in 6.0. We don’t want to repeat 5.0.” So, how about not jumping into it? Give it some breathing room.

Someone asked if MySQL will embed other programming languages, such as a JVM, and Robin and Jeffrey made soothing noises into the microphone. My reaction, in case anyone’s listening, is: for the love of all that’s good and pure, keep those things the heck away from MySQL. Please! It is and should be an RDBMS, or at least tries to be and ought to keep trying. If you embed these things in it, next thing you know it’ll be like Microsoft SQL Server where you can run a fricking web service from a stored procedure.

And of course, there was a discussion about the perpetual topic: what is the difference between Community and Enterprise? I was surprised to hear Robin and Jeffrey correcting each other on this!

All in all, a lively session. Nothing is boring around here.

Lightning Rounds with Top MySQL Community Contributors

I got talking to an engineer, who shall not be named, from a Big Company we all know and love (but which shall not be named) in the hall. Thus I arrived late to the session in which I was supposed to participate. Fortunately, I was not listed first. It was a series of community contributors giving lightning-fast (well, sometimes) talks about our experiences as community members. While I sat listening, something strange happened; I began to think in a different way than I had prepared to speak. Thus when it was my turn, I ignored my slides and spoke extemporaneously. I suppose this is a good thing; one is not supposed to read one’s slides.

Query Optimizer Internals and What’s New in the MySQL 5.2 Optimizer

Holy cow was this a great session. This was the most riveting thing I’ve seen all week. You could really tell who was into this kind of geekiness, because there weren’t that many people in the room. I even tried to record the question and answer session during the intermission as we all crowded around Timour Katchaounov at the lectern.

This session went deep into how the optimizer really works. Topics included how it is similar to and different from other database systems (most of them actually generate machine code; MySQL does not), what it does and why, and what’s coming in the next versions. And for the first time I really understand why MySQL’s core developers think the output of EXPLAIN is somehow understandable to an ordinary mortal (by the way, I have been planning for a while to reverse translate EXPLAIN into a tree view for the rest of us. I’ll get to it, really).

Timour explained MySQL’s cost-based query optimization, which is built on “units of disk access.” He showed its evolution from pre-5.0, where it was an exhaustive search of all possible execution plans, which is O(n!) and didn’t perform well on more than a handful of tables in a join. I never had this happen to me, but apparently you could quite easily write queries that would take hours, days, or weeks just to generate a query plan, and that’s before you even started to execute! These days you can join up to 62 tables, and the algorithm uses exhaustive left-deep search up to a threshold (currently 7 tables), after which it becomes greedy and can choose a non-optimal plan. At least it’ll terminate, though.
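The difference between the two search strategies can be sketched in a few lines of Python. This is a toy, not MySQL's optimizer: the cost model (sum of estimated intermediate row counts) is my own simplification rather than MySQL's units of disk access, and the table sizes and selectivities are hypothetical.

```python
from itertools import permutations


def join_size(current_rows, joined, table, rows, sel):
    """Estimated row count after joining `table` to the already-joined tables."""
    size = current_rows * rows[table]
    for other in joined:
        # Pairs without a join condition default to a cross join (1.0).
        size *= sel.get(frozenset((other, table)), 1.0)
    return size


def plan_cost(order, rows, sel):
    """Toy cost of a left-deep plan: the sum of intermediate result sizes."""
    current, cost = rows[order[0]], 0.0
    for i in range(1, len(order)):
        current = join_size(current, order[:i], order[i], rows, sel)
        cost += current
    return cost


def exhaustive(rows, sel):
    """Try every join order: n! plans for n tables."""
    return min(permutations(rows), key=lambda order: plan_cost(order, rows, sel))


def greedy(rows, sel):
    """Start from the smallest table, then always add the cheapest next join."""
    remaining = sorted(rows, key=lambda t: rows[t])
    order = [remaining.pop(0)]
    current = rows[order[0]]
    while remaining:
        nxt = min(remaining, key=lambda t: join_size(current, order, t, rows, sel))
        current = join_size(current, order, nxt, rows, sel)
        order.append(nxt)
        remaining.remove(nxt)
    return tuple(order)


# Hypothetical table sizes and join selectivities.
rows = {"orders": 100_000, "customers": 5_000, "items": 400_000, "regions": 20}
sel = {
    frozenset(("orders", "customers")): 1 / 5_000,
    frozenset(("orders", "items")): 1 / 100_000,
    frozenset(("customers", "regions")): 1 / 20,
}

best = exhaustive(rows, sel)
quick = greedy(rows, sel)
print("exhaustive:", best, plan_cost(best, rows, sel))
print("greedy:    ", quick, plan_cost(quick, rows, sel))
```

The exhaustive search examines 4! = 24 orders here, which is nothing, but the count grows factorially with the number of tables. The greedy search evaluates only a handful of candidates at each step, so it always finishes quickly, at the price that its plan can never be cheaper than the exhaustive one and is sometimes more expensive.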

I have good news for the query optimization team, though: my brother has solved the Travelling Salesperson problem, which is NP-complete of course. Obviously left-deep search can be transformed into this, so this problem is solved as well. I’m sure it will only be a matter of time before the patents go through, so who’s the highest bidder for the best query optimizer on the planet? Anyone?

This talk brought up a bunch of questions, which I need to follow up on. I’ll report more in a future article.

What fun! I haven’t been this excited since my days at University, scribbling notes as I struggled to understand my teacher’s thick accent and predilection for thinking of everything in terms of real-time databases, sigmas, and so on.

Dinner

I went with Martin Friebe for supper at a Thai restaurant. On the way there we got talking about table checksum algorithms to detect when a slave is or isn’t in sync with its master. Martin had some great ideas, which I will implement into MySQL Table Checksum to provide another way for you to guarantee two tables have the same data. This particular method will have lower impact on the servers (no locking) and guarantee a consistent read at exactly the same point in the binlog. It will be very useful in certain circumstances. Thank you Martin for the company and the great conversation!

Quiz Show

I stumped the judges and picked up a spare copy of Programming Perl. You can never have enough, eh?

Okay, I didn’t really stump the judges; someone asked a question nobody knew the answer to, and I proposed an answer nobody could refute. Let’s see, what does the NDB option ndb_report_thresh_binlog_epoch_slip mean? Is it really the amount of clock skew NDB will permit between the data nodes?

This is a threshold on the number of epochs to be behind before reporting binlog status. For example, a value of 3 (the default) means that if the difference between which epoch has been received from the storage nodes and which epoch has been applied to the binlog is 3 or more, a status message will be sent to the cluster log.

Nope. But I got the book anyway.

Next

Well, I’m fairly slap-happy at this point with jet lag and lack of sleep, but I still want to make a plug for my innotop session tomorrow at 10:45 in Ballroom C. Even if you don’t use InnoDB, you will find this tool has something to offer you. And my presentation and demo is going to be fun, with gratuitous use of stock images. Come on out.

And by the way, I just spoke to someone from another Large Company We All Know, who asked me to implement a new feature in innotop. As Monty is famous for saying, “Trivial. It’s trivial.” If you want to see it, be there; I’ll have it done in time for the session.

Now if you’ll excuse me, I have to fire up Vim…

MySQL Conference and Expo 2007, Day 2

In my second day at the MySQL Conference and Expo 2007, I attended keynotes, several sessions, and three BoF (Birds of a Feather) sessions. This article is about these sessions. Again, I’ll focus on the Big Ideas and let you read other people’s blog posts for the small details.

Keynotes

There were three keynotes this morning. Two I won’t comment on, but I want to mention the third because it was mostly about the One Laptop Per Child project. I was glad to hear about it instead of what sounded like it was going to be a Red Hat pitch.

Building Scalable OLAP Applications with Mondrian and MySQL

This session introduced the Mondrian component of the Pentaho business intelligence suite. Mondrian connects to a SQL backend and converts the flat SQL view of the data into a navigable hierarchical view. The point is to make OLAP scalable on top of MySQL. As such, it touched on tactics for tuning both MySQL and Mondrian — especially aggregation, caching, and cache control in Mondrian. Also on the agenda were near-real-time OLAP (aka “active data warehousing”), and how to cache and invalidate in that scenario. There’s a high cost for doing this, but there can be great benefits as well.

Technology at Digg.com

This session featured Digg’s lead developer and lead DBA discussing how Digg built their systems (as opposed to many other sessions, which tell you how you ought to do things). The major components are

  • a cluster of web servers
  • a memcached farm caching chunks (not whole pages) of content with write-through and some nimble dancing to handle stale data after losing and regaining a server
  • MySQL replication with data partitioning and separation for scale-out, with separation into farms by function (search, data warehousing, atomic data)

There was another debate from audience members about what the words “shard” and “partition” mean. Someone in the audience even told the Digg people the correct definition of the terms, which did not match what the developers were talking about. *sigh*

Interestingly, it seems Digg is in the lucky position of being able to scale with replication for reads extremely well, since their load is about 98% reads. They also only have about 30GB of data. I assumed it would be in the terabytes.

InnoDB Performance Potential in High-end Environments

This talk was the most technical I’ve been to so far. Yasufumi Kinoshita dove deep into InnoDB to analyze points of contention in many-CPU machines under various workloads. His results are impressive; before his changes, InnoDB did not scale beyond four or sometimes even two CPUs, and could even perform dramatically worse on more CPUs than on fewer! After he identified the points of contention, scaling looked quite good up to at least 8 CPUs, with no indication of other problems caused as side effects. Though there’s still work to do and apparently much debugging needs to be done, this is hugely important for MySQL and InnoDB. I’m glad there are people who can do this kind of work. I couldn’t begin to; the speaker even wrote certain parts of the fixes in assembly.

MySQL Server Settings Tuning

This session by Peter Zaitsev focused on learning which MySQL server settings to configure and how to find out whether they need to be tuned. Topics included memory allocation, how to fight swapping, and a guided tour of the server status variables.

Falcon Transactions

MySQL’s own Jim and Ann Starkey discussed concurrency control in the still-in-beta Falcon storage engine. They talked about all kinds of database systems, the official standards, and other storage engines, not just Falcon (even PostgreSQL came up). Topics included transaction isolation levels, problems and challenges with those, and how InnoDB’s repeatable read really isn’t. In fact, they are trying to decide what to name that level of transaction isolation. Jim calls it “benchmark mode,” because even though it’s not really standard, it is extremely practical and does very well on benchmarks. It sounds like Falcon will provide a means to emulate InnoDB’s behavior for compatibility if for no other reason.

This talk’s Big Idea was that Falcon is both like and unlike other storage engines.

This made me think of Guy Kawasaki’s keynote from this morning. Who knows what people will use and abuse Falcon for? I’m glad MySQL and the Starkeys are doing what they believe is right, even though a lot of people (including me, frankly) don’t really understand what they are doing and why. My impression is that Falcon is so different from what people are used to that most of us do not “get it,” and probably will not for a long time. Someone will, though. And when they do, and learn how to make it sing and dance in ways nothing else can do, it’ll make a lot of people* mad for not seeing it themselves sooner. Especially when it makes someone really successful.

* People in Redmond, I’m guessing.

Professional Cat Herding

I dropped in on the end of this session briefly. One community member suggested MySQL should use OpenID for authentication. Bravo! It’s a capital idea. Another suggestion brought up the fact that MySQL uses BitKeeper for source control. I voiced my regret that MySQL, a company that believes in and promotes software freedom, has fallen into the trap of using non-Free software themselves. It’s sad to see them handcuffed in such a way. Who else remembers when the use of BitKeeper burned the Linux kernel developers? I know Richard Stallman does, because he’d been predicting that fiasco for many years by the time it finally happened. To choose non-Free software is to choose to be a victim.

Exploiting MySQL 5.1 for Advanced Business Intelligence Applications

Pentaho’s Matt Casters spoke on how to extract data from many disparate sources and store it in a Pentaho data warehouse, and how to use Pentaho and MySQL 5.1’s advanced features to make OLAP queries fast (there are two Big Ideas because the talk was double-length). The first part of the talk focused a lot on Spoon, a user interface for telling Pentaho what to do with data (not how). Next he spoke on MySQL 5.1’s table partitions, followed by data partitioning across databases or servers. The idea here is to retrieve and process data in parallel for greater speed.

Have you been reading Matt’s blog? Do you remember his understated post on processing a large volume of data in parallel with near-linear scalability? I’ve been eagerly reading his articles for a while and it was great to hear him speak and see him demo these things.

The techniques he showed are great, but may result in CPU bottlenecks on the server that does the processing, because you can easily get enough data from a bunch of servers in parallel to peg the CPU. The next level of parallelization is the Carte server, which runs on remote machines and is basically grid computing for business intelligence. He gave a demo of this, which looked great. (Hmmm, I wonder if I could get SETI@home to run BI for me? Yeah….) Matt finished up with a demo and overview of the Pentaho product overall.

Birds of a Feather Sessions

This evening I went to three BoF sessions: the first on DBD::mysql, the next as a fly on the wall at Paul McCullagh’s streaming blob server BoF, and finally to learn more about MySQL Proxy, which I’ve been excited about ever since I read about it a few weeks ago.

Next

Today’s expert session was a wash because the session and the official lunch were in different places, and people couldn’t bring their lunch to the meeting room. It might come together better tomorrow, it might not.

Of course, I’ll still be doing the two official sessions tomorrow and Thursday.

MySQL Conference and Expo 2007, Day 1

In my first day at the MySQL Conference and Expo 2007, I attended the Scaling and High Availability Architectures tutorial in the morning, and Real-world MySQL Performance Tuning in the afternoon. This is a brief article on each session’s Big Ideas, and a short blurb about the conference overall so far.

I’ll also be involved in at least three sessions at the conference, which I describe at the end of this article.

If you’re interested in short overviews of the sessions I attend, keep watching for my articles. I will give you each session’s major ideas instead of writing stream-of-thought notes. You can look at the presenter’s slides for more.

The conference overall

This conference is well-organized and friendly. Attire is casual; most people are wearing jeans and t-shirts, or khakis and three-button shirts. I found lunch basic but good — catered food, with tables set up in a grassy area in the beautiful California sunshine; nicely dressed tables. I had a nice salad with vinaigrette and crumbled bleu cheese, penne with a sun-dried tomato sauce, red potato salad, and bread.

Pretty much everyone seems to be here. I don’t want to drop names, so I’ll just leave it at that (though I cannot avoid mentioning that I’m rooming with Alexey Kovyrin, who has just released an update to the MySQL Master-Master Replication tool). It is such a pleasure to meet the people I’ve been emailing with; people from all over the world, who use MySQL for all different kinds of things. I also saw some people I’d met at previous events, and whom I consider friends now. Here’s to all of my friends, new and old!

The downside is I miss my wife and Carbon, our loving Rhodesian Ridgeback dog. But I know he will Guard The House® even while I’m gone.

Scaling and High Availability Architectures

This tutorial featured Jeremy Cole and Eric Bergen of Proven Scaling LLC. You probably know them for their generous help giving people rides and passes to various parts of various events, and for contributing patches for things we all need.

Jeremy did most of the talking. The talk was organized roughly into identifying what scaling and high availability are (they’re not the same thing), what problems typically present at various stages of an application’s lifecycle, and some strategies to use and avoid. It promoted application partitioning for scaling, and master-master replication for high availability. All in all a very good discussion of the pros and cons of many concepts, both big and small.

This tutorial was mostly pretty high-level, but frequently got down to specifics.

One of the things I noticed the most from the audience’s questions was how differently many people understand the concept “partitioning.” There were at least three working definitions I heard, and they are not at all the same thing. I think one of the primary obstacles to teaching the principles this talk covered is conveying accurately what it means to partition. The definitions I heard were:

  • Dividing data into partitions (aka “shards”) and locating each on one of some number of servers.
  • Dividing a large table into smaller tables on the same server.
  • Using the partitioned tables available in MySQL 5.1.

I also heard people talking about partitioning by date, which I usually associate with archiving.

In the context of the talk, partitioning data means dividing it by something like user ID and locating each partition on one of some number of servers. This is key to horizontal scaling.
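A minimal sketch of that definition, assuming a simple modulo rule on the user ID. The server names and function names here are made up for illustration, and the talk did not prescribe this particular mapping.

```python
# Hypothetical shard servers; the host names are placeholders.
SHARD_SERVERS = ["db0.example.com", "db1.example.com", "db2.example.com"]

def shard_for_user(user_id, servers=SHARD_SERVERS):
    """Return the server that holds the partition for this user ID."""
    return servers[user_id % len(servers)]

def group_by_shard(user_ids, servers=SHARD_SERVERS):
    """Group user IDs by the server their data lives on."""
    groups = {}
    for uid in user_ids:
        groups.setdefault(shard_for_user(uid, servers), []).append(uid)
    return groups

print(shard_for_user(4))
print(group_by_shard([1, 2, 3, 4]))
```

The modulo rule is the simplest possible choice, but it ties every user to the current server count, so adding a server moves most users. A lookup table that maps each user ID to a server is a common alternative when partitions need to be relocated individually.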

Real-world MySQL Performance Tuning

This session featured Ask Bjørn Hansen for the first half and MySQL’s own Jay Pipes for the second half. The two halves were quite different; it was not one tutorial, but two.

Ask Hansen’s talk was a broad overview of how to scale web applications, from start to finish. It included not only a lot of advice on MySQL, but also suggestions not related to MySQL, such as application-level caching, proxies, failover, etc. He covered a huge amount of material; his slides are interesting and varied, with nice illustrations. He often gave high-level advice, such as “cache aggressively,” but at least as often devoted entire slides to a low-level topic.

Jay’s talk covered much less ground. He focused on specific performance optimizations for MySQL. The topics included indexes, how to know when and how indexes are used, query plans, and server tuning. The slides showed a lot of code examples and the results of various query strategies and indexing changes.

Next

You can catch me, with Mark Callaghan and Peter Zaitsev, tomorrow at lunchtime at an experts’ session on migrating MySQL from 4.0 to 5.0 (organized by solidDB). On Wednesday I’ll be part of a lightning session, Lightning Rounds with Top MySQL Community Contributors.

On Thursday I will be giving a session myself, on how to use the innotop MySQL and InnoDB monitor. I designed this session to show you how to go beyond the surface with innotop; my design strategy with innotop is that you should be able to start it and see something useful immediately without the advanced features even being visible, but there’s a tremendous amount of power lurking in it.

I hope to meet you if you’re here, whether at one of my sessions or just in the hallways. Till then, be well and enjoy the conference!

The innotop session at MySQLConf 2007

I’ll present a session on the innotop MySQL and InnoDB monitoring tool at the 2007 MySQL Conference and Expo in a couple of weeks.

The innotop session will focus on using innotop’s basic and intermediate-level features. I’ll demonstrate how to install it and get the initial configuration set up. I’ll show you what innotop is good at doing, and how to do some of the things I do frequently, such as watch queries, check replication status, and look at what transactions are currently open. And I’ll demonstrate some of innotop’s many small features that can help you use it to watch and control your MySQL servers.

You’ll leave the session with a comprehensive understanding of how innotop works, and how to make it work for you.

By the way, though I originally designed innotop to monitor InnoDB, it goes far beyond that now. It has a lot to offer everyone, not just InnoDB users.

I hope to see you there.
