Archive for the ‘Oracle’ tag
Sessions of interest at the Percona Performance Conference
Having written about what I think is cool about the upcoming MySQL Conference and the MySQL Camp, now I want to finish up with what I’d like to see at the Percona Performance Conference. Just to recap, this is a conference we created to serve those who want to learn about performance — not “learn about MySQL,” not “learn about database performance,” just learn about performance, period.
I want to see everything. I think this is going to be the single best conference I’ve ever been to. Even the way the conference is organized is exciting. For example, it’s running from early morning till late at night, nonstop. The sessions are also (mostly) only 25 minutes. This means if you decide a session isn’t all that interesting, you didn’t spend much time on it, and you don’t have long to wait for the next one.
So here is a small sample of the sessions:
- CouchDB: Behind the Buzz (Jan Lehnardt)
- Performance Instrumentation: Beyond What You Do Now (Cary Millsap)
- Hive: Distributed Data Warehousing with Hadoop (Ashish Thusoo and Prasad Chakka, Facebook)
- High Performance Erlang (Jan Henry Nystrom)
These are not just people who’ve learned about something and want to talk at you. These are the inventors, the originators, the gurus. It is truly the who’s who, and that’s just a few of them. If you aren’t familiar with those names, Google them and see. And after that, why not Google Theo Schlossnagle, Eric Burton, Monty Widenius, Andrew Aksyonoff, and a few others.
I hope to see you there. Bring your business cards and introduce yourself to me!
50 things to know before migrating Oracle to MySQL
A while back I was at a seminar on migrating database applications to MySQL. A lot of the attendees were Oracle users. Based on their questions, comments and conversations during lunch, I made the following list of things Oracle users need to know about migrating to MySQL. Most of these are “gotchas” that would be contraindications or require some thought about a workaround.
Note: this is not meant to be MySQL-bashing. Some of these limitations are going to be fixed in future versions of MySQL, but they generally apply to current GA version 5.1. Some things are possible to achieve by choosing one particular way to use the server, at the exclusion of other things (e.g. Cluster rules out foreign keys, spatial data types rules out transactions). I am posting this list at the request of some Oracle DBAs who asked me to produce it for them.
- Subqueries are poorly optimized.
- Complex queries are a weak point.
- The query executioner (aka query optimizer / planner) is less sophisticated.
- Performance tuning and metrics capabilities are limited.
- There is limited ability to audit.
- Security is unsophisticated, even crude. There are no groups or roles, no ability to deny a privilege (you can only grant privileges). A user who logs in with the same username and password from different network addresses may be treated as a completely separate user. There is no built-in encryption comparable to Oracle.
- Authentication is built-in. There is no LDAP, Active Directory, or other external authentication capability.
- Clustering is not what you think it is.
- Stored procedures and triggers are limited.
Vertical scalability is poor.No longer true; with Percona XtraDB or MySQL 5.5 (unreleased) you get excellent scalability- There is zero MPP support.
- SMP is supported, but
MySQL doesn’t scale well to more than 4 or 8 cores/CPUs. No longer true; with Percona XtraDB or MySQL 5.5 (unreleased) you get excellent scalability - There is no fractional-second storage type for times, dates, or intervals.
- The language used to write stored procedures, triggers, scheduled events, and stored functions is very limited.
- There is no roll-back recovery. There is only roll-forward recovery.
- There is no support for snapshots.
- There is no support for database links. There is something called the Federated storage engine that acts as a relay by passing queries along to a table on a remote server, but it is crude and buggy.
- Data integrity checking is very weak, and even basic integrity constraints cannot always be enforced.
- There are very few optimizer hints to tune query execution plans.
- There is only one type of join plan: nested-loop. There are no sort-merge joins or hash joins.
- Most queries can use only a single index per table; some multi-index query plans exist in certain cases, but the cost is usually underestimated by the query optimizer, and they are often slower than a table scan.
- There are no bitmap indexes. Each storage engine supports different types of indexes. Most engines support B-Tree indexes.
- There are fewer and less sophisticated tools for administration.
- There is no IDE and debugger that approaches the level of sophistication you may be accustomed to. You’ll probably be writing your stored procedures in a text editor and debugging them by adding statements that insert rows into a table called debug_log.
- Each table can have a different storage backend (“storage engine”).
- Each storage engine can have widely varying behavior, features, and properties.
- Foreign keys are not supported in most storage engines.
- The default storage engine is non-transactional and corrupts easily.
- Oracle owns InnoDB, the most advanced and popular storage engine.
- Certain types of execution plans are only supported in some storage engines. Certain types of COUNT() queries execute instantly in some storage engines and slowly in others.
- Execution plans are not cached globally, only per-connection.
- Full-text search is limited and only available for non-transactional storage backends. Ditto for GIS/spatial types and queries.
- There are no resource controls. A completely unprivileged user can effortlessly run the server out of memory and crash it, or use up all CPU resources.
- There are no integrated or add-on business intelligence, OLAP cube, etc packages.
- There is nothing analogous to Grid Control.
- There is nothing even remotely like RAC. If you are asking “How do I build RAC with MySQL,” you are asking the wrong question.
- There are no user-defined types or domains.
- The number of joins per query is limited to 61.
- MySQL supports a smaller subset of SQL syntax. There are no recursive queries, common table expressions, or windowing functions. There are a few extensions to SQL that are somewhat analogous to MERGE and similar features, but are very simplistic in comparison.
- There are no functional columns (e.g. a column whose value is calculated as an expression).
- You cannot create an index on an expression, you can only index columns.
- There are no materialized views.
- The statistics vary between storage engines and regardless of the storage engine, are limited to simple cardinality and rows-in-a-range. In other words, statistics on data distribution are limited. There is not much control over updating of statistics.
- There is no built-in promotion or failover mechanism.
- Replication is asynchronous and has many limitations and edge cases. For example, it is single-threaded, so a powerful slave can find it hard to replicate fast enough to keep up with a less powerful master.
- Cluster is not what you think it is. Maybe I already said that, but it bears repeating.
- The data dictionary (INFORMATION_SCHEMA) is limited and very slow (it can easily crash a busy server).
- There is no online ALTER TABLE.
- There are no sequences.
- DDL such as ALTER TABLE or CREATE TABLE is non-transactional. It commits open transactions and cannot be rolled back or crash-recovered. Schema is stored in the filesystem independently of the storage engine.
I hope this is helpful.
My new favorite comic: The Adventures of Ace, DBA
I follow only a few comics in my feed reader: Get Fuzzy, Dilbert, XKCD, and now The Adventures of Ace, DBA. It’s kind of XKCD-ish, only it’s about Oracle and it doesn’t have the extra punch line when you hover your mouse over the picture. And it’s proof that a DBMS (even one I don’t use) can be pretty damn funny.


