A review of Pentaho Solutions by Roland Bouman and Jos van Dongen

Pentaho Solutions: Business Intelligence and Data Warehousing with Pentaho and MySQL. By Roland Bouman and Jos van Dongen, Wiley 2009. Page count: about 570 pages. (Here’s a link to the publisher’s site.)

The book is big in part because it’s about a GUI tool, so there is the requisite number of screenshots (but not too many). It is structured into four parts, each on a different topic.

The first part is four chapters on getting started with Pentaho: from a quick-start through installing, configuring, and understanding the Pentaho BI Stack. Pentaho is a complex suite of tools, and there’s a handy architecture diagram to help you grok what the parts are and how they fit together. You’ll learn about topics such as Mondrian and configuring the database connection pool.

The second part is a primer on dimensional modeling and DW design. It uses a sample database that the authors developed for the book, which you can download from the publisher’s website. You’ll learn about star schemas and data marts. You can skip this part if you’re familiar with BI concepts in general, and just want to learn how Pentaho implements them.

Part three is about Pentaho data integration. The first chapter is a primer on integration, which again you’ll be able to skip if you know your stuff. Then you’ll walk through topics such as generating dimensions, and designing and deploying data integration solutions with Kettle and Spoon.

Part four is about designing and building BI applications with Pentaho: learning about its metadata layer; using the reporting tools; scheduling, subscriptions, and bursting; OLAP; data mining; and building dashboards. This is about half the book, really. There’s a lot to it – it’s all about how to take a generic and flexible suite of tools and get something specific and useful out of it. If you’ve ever done that, you’ll know why this could occupy half a book. This isn’t a simple suite of tools that only does one thing well.

In the end, this is a good beginner-to-intermediate book for people who want to learn about data warehousing, business intelligence, Pentaho, or all of the above. If you don’t know anything about these topics, you’ll find the entire book quite useful. If you know a lot about BI and DW, you’ll probably get the most out of the Pentaho-specific bits. On the other hand, people who already have an advanced level of proficiency with Pentaho will probably know much of what’s in this book, and those seeking to build advanced solutions presumably know a lot about general BI concepts, too. So this is probably not the book for you if you already know what you’re doing with Pentaho.

Proprietary BI systems cost at least an arm and a leg, and possibly more. That’s why open-source BI is such a hot topic. If you’re looking to get acquainted with Pentaho, I think this is an excellent book – that’s what I got it for, and I wasn’t disappointed. Now if only I could find a similar book for Jaspersoft.

I'm Baron Schwartz, the founder and CEO of VividCortex. I am the author of High Performance MySQL and lots of open-source software for performance analysis, monitoring, and system administration. I contribute to various database communities such as Oracle, PostgreSQL, Redis and MongoDB.