Keep It Simple and Fast: Memory management in Power Pivot: Column oriented databases.

Wednesday, February 27, 2013

Memory management in Power Pivot: Column oriented databases.

Power Pivot is a perfect personal Business Intelligence tool. It is simple to use and the performance of the Power Pivot engine is really great. To better understand this engine, so you can even better make use of it, I will explain how this engine is working.

Row oriented versus column oriented databases.

All traditional relational databases, including SQL Server, are row oriented databases. They store data in tables row by row. The row of a table is the main unit of storage. Indexes are used to point to all columns of a certain row. It depends on the definition of the index which records belongs to this index.

A column-oriented database, like Power Pivot, uses a different approach. Every column is considered as a separate entity. Data is stored for every column in a separate way. I will explain this with an example.

ID	Car	Engine	Color
1	Audi A4	Petrol	Silver
2	Audi A4	Gazole	Red
3	Audi A4	Gazole	Blue
4	BMW	Petrol	Silver
5	BMW	Gazole	Silver
6	BMW	Gazole	Red
7	Mercedes	Gazole	Blue

Every column will have it's own sorted dictionary with all distinct values and a bitmap index references the actual values of each item in the column by using a zero-based index to the dictionary. Next table will show the dictionary values and index values.

Column	Dictionary	Values
ID	32,23,10,43,57,65,71	2,1,0,3,4,5,6
Car	Audi,BMW,Mercedes	0,0,0,1,1,1,2
Engine	Petrol, Gazole	0,1,1,0,1,1,1
Color	Silver, Red, Blue	0,1,2,0,0,1,2

As you can see, the dictionary can be the most expansive part of the index. Especially if a high number of distinct values exists in a column. The lower the number of distinct values in a column the smaller the size of dictionary for this column. This will make the value bitmap index more efficient.

The xVelocity engine, which is implemented on Power Pivot, is an in-memory database. This means that it has been designed and optimized assuming that the whole database is loaded in memory. Data is compressed in memory and dynamically uncompressed during each query. Because all data is kept in memory it is essential to be critical which data to import in your Power Pivot sheet. For instance customer data can be useful like, country, state. However street name is not efficient. Every customer will have a unique address which will result in a big dictionary without a low number of distinct values. It will have a high number of distinct values.

Enjoy the power of Power Pivot.

1 comment:

sarine said...: Amazing content! You explained everything so clearly. Looking forward to your next post. spirit airlines; May 14, 2025 at 8:07 PM

Keep It Simple and Fast

Wednesday, February 27, 2013

Memory management in Power Pivot: Column oriented databases.

1 comment:

About Me

Search This Blog

Popular Posts

Blog Archive

Followers of my blog

Blogs I Read

My Places