MrExcel
.
- Joined
- Feb 8, 2002
- Messages
- 3,414
- Office Version
- 365
- Platform
- Windows
I spent the last 12 days writing a book about PowerPivot, a new add-in from Microsoft for Excel 2010.
While all of this is still fresh in my mind, I wanted to share some facts about PowerPivot.
1) The client version of PowerPivot is free for anyone with Excel 2010.
2) It is written in VSTO and feels almost like a part of Excel.
3) It won't be on the Office 2010 CD, but it will be a free download.
4) You are not constrained by the 1,048,576 row limit in Excel. While writing the book, I loaded up an 18 million row text file into a grid. I could scroll, sort, filter that data set. If you have the 64-bit version of Excel 2010, you can in theory load up data sets with hundreds of millions of rows.
5) When the workbook is open, the data is stored in memory in a columnwise fashion. Think about a huge column of data. There are usually some columns that contain similar values over and over and over. In my 18 million row data set, I have one column with only 11 unique values. PowerPivot then can almost create rainbow of stripes in that column. When I look for "X", they know that they can filter to this big stripe of similar data. I can't say that I understand exactly what is happening behind the scenes, but I know this: The 18 million row data set took 58MB of hard drive as a text file, and in the Excel workbook, it compressed to 4MB. Microsoft is calling this VertiPaq compression. It allegedly was invented by the PowerPivot team.
6) You can load up multiple tables into PowerPivot. They will attempt to figure out the relationship between the tables or you can explicitly define a relationship in three clicks.
7) When you return to Excel to create a pivot table from the data in PowerPivot, a bizarre version of the PivotTable Field List appears with all columns from all tables in the list. If you choose something from three different tables, those fields are reported in the same pivot table without ever writing a VLOOKUP formula.
8) There is a new function language called DAX (Data Analysis Expressions) with 134 functions. 81 of these are lifted directly from Excel. The new ones are stunning. Some of these functions are for adding formulas to the grid. Other functions are for calculating values in the pivot table. We can FINALLY calculate Count Distinct! We can invent time comparisons like "running 30 days" or "compare to parallel period from N years ago". The CALCULATE function has parameters where you can say to respect the filters applied on this cell in the pivot table, or override the filters. Regular pivot tables let you do "% of Total". DAX will let you do "% of Red Units Sold".
9) You can have 1, 2, or 4 (or more with a little work) pivot tables and/or pivot charts that respect the same filters, but each show something different.
10) Microsoft had a blog post about 4 months ago where they described something new in pivot tables called Named Sets. These allow for asymetric reporting, where you might have 2009|Budget, 2009|Actual, 2010|Budget, 2010|Actual and only want to show 2009|Actual, 2010|Budget. Not supported in regular pivot tables, but you can do it with a named set in Excel 2010. Problem: they don't work with regular Excel data . They only work with OLAP data sets. So...along the way, I discovered that named sets work with PowerPivot data, so I've considered running all of my regular Excel data through PowerPivot just to enable the Named Sets feature.
I hope to come up with more blog posts, but you can see an example of a PowerPivot mashup where I took daily sales data for 3 years and mashed it up by the amount of rainfall and daily high temperature to see if I could find any trends. See that guest blog post here: http://powerpivotpro.com/2010/02/11/jewelry-mashup-for-alpha-geek-challenge/
And, in a gratuitious bit of self promotion, I would love to win the trip to New Orleans, so please consider voting for my entry in the PowerPivot contest here: http://tinyurl.com/yhovp9e
Bill Jelen
While all of this is still fresh in my mind, I wanted to share some facts about PowerPivot.
1) The client version of PowerPivot is free for anyone with Excel 2010.
2) It is written in VSTO and feels almost like a part of Excel.
3) It won't be on the Office 2010 CD, but it will be a free download.
4) You are not constrained by the 1,048,576 row limit in Excel. While writing the book, I loaded up an 18 million row text file into a grid. I could scroll, sort, filter that data set. If you have the 64-bit version of Excel 2010, you can in theory load up data sets with hundreds of millions of rows.
5) When the workbook is open, the data is stored in memory in a columnwise fashion. Think about a huge column of data. There are usually some columns that contain similar values over and over and over. In my 18 million row data set, I have one column with only 11 unique values. PowerPivot then can almost create rainbow of stripes in that column. When I look for "X", they know that they can filter to this big stripe of similar data. I can't say that I understand exactly what is happening behind the scenes, but I know this: The 18 million row data set took 58MB of hard drive as a text file, and in the Excel workbook, it compressed to 4MB. Microsoft is calling this VertiPaq compression. It allegedly was invented by the PowerPivot team.
6) You can load up multiple tables into PowerPivot. They will attempt to figure out the relationship between the tables or you can explicitly define a relationship in three clicks.
7) When you return to Excel to create a pivot table from the data in PowerPivot, a bizarre version of the PivotTable Field List appears with all columns from all tables in the list. If you choose something from three different tables, those fields are reported in the same pivot table without ever writing a VLOOKUP formula.
8) There is a new function language called DAX (Data Analysis Expressions) with 134 functions. 81 of these are lifted directly from Excel. The new ones are stunning. Some of these functions are for adding formulas to the grid. Other functions are for calculating values in the pivot table. We can FINALLY calculate Count Distinct! We can invent time comparisons like "running 30 days" or "compare to parallel period from N years ago". The CALCULATE function has parameters where you can say to respect the filters applied on this cell in the pivot table, or override the filters. Regular pivot tables let you do "% of Total". DAX will let you do "% of Red Units Sold".
9) You can have 1, 2, or 4 (or more with a little work) pivot tables and/or pivot charts that respect the same filters, but each show something different.
10) Microsoft had a blog post about 4 months ago where they described something new in pivot tables called Named Sets. These allow for asymetric reporting, where you might have 2009|Budget, 2009|Actual, 2010|Budget, 2010|Actual and only want to show 2009|Actual, 2010|Budget. Not supported in regular pivot tables, but you can do it with a named set in Excel 2010. Problem: they don't work with regular Excel data . They only work with OLAP data sets. So...along the way, I discovered that named sets work with PowerPivot data, so I've considered running all of my regular Excel data through PowerPivot just to enable the Named Sets feature.
I hope to come up with more blog posts, but you can see an example of a PowerPivot mashup where I took daily sales data for 3 years and mashed it up by the amount of rainfall and daily high temperature to see if I could find any trends. See that guest blog post here: http://powerpivotpro.com/2010/02/11/jewelry-mashup-for-alpha-geek-challenge/
And, in a gratuitious bit of self promotion, I would love to win the trip to New Orleans, so please consider voting for my entry in the PowerPivot contest here: http://tinyurl.com/yhovp9e
Bill Jelen