Is this way of using Excel 2007 Pivot table for BI scalable?

0

Background:

  • We need to consolidate sales data across the country to do analysis

  • Our Internet connection/IT expertise/IT investment is not quite strong, therefore full BI solution is out of question

  • I tried several SaaS BI solution (GoodData, ZohoReports) and while they're good, they seem not to fully support what we need

  • We're looking at 'bout 2 millions record for every 2 months

My current approach

  • Our (10) sites currently gathers data from all their branches and consolidate them into 1 Excel file with Pivot table and embed source data

  • In HQ, I will request 10 sites to send back those Excel files periodically

  • We will import those Excel to our MSSQL server

  • There will be a master Excel file, that will also have the same pivot table (as those came from site Excel file), and datasource is the MSSQL server

More details

  • For testing, I currently use MSSQL 2008 Express on my laptop

  • So far, I imported our transactions for the past 2 months and there are 2 millions+ row in 1 table in MSSQL (we just use 1 table, corresponding to our common pivot table structure). DB size is ~ 600 MB

  • In the master Excel file, if not including the source data, it's just < 10MB. Including the source data will increase the size to 60 MB (so I supposed Office 2007 automatically zip the data ?)

  • I try using the Pivot (drag-and-drop fields) and the performance so far is OK (my laptop specs: C2D T7200, 3GB RAM, Windows XP)

So my question is :

  • If we're looking at full year transaction (roughly 15 millions rows in MSSQL 2008 Express, 3.6 GB in size), is there any issue with that 15 million rows in 1 table in SQL Express ?

  • Is there any performance issue with the pivot table at that time ? Can it still embed the source data ? (I google-ed but didn't find the maximum size of source data Excel 2007 can embed)

  • Any other suggestions on how we can better do this ? Given that we can't afford the full BI solution, any light-weight/budget/SaaS BI that you can recommend ?

Thanks

Sim

Posted 2009-11-14T11:13:43.737

Reputation: 201

Answers

0

There is a 4Gb limit on SQLServer Express 2008 (including log files), so it looks like you will be very close to the limit initially, and over the limit rather quickly. If it's just a single table, maybe you'd consider SQLite (or possibly mySQL).

In my experience with Pivot tables (and excel generally) is that it works well until a point, then performance degrades significantly.

Generally, I think your approach is the one I would choose though. Rather than trying to pivot the whole table, I would try to pre-summarize the data back in the database through a view, even if I had to have two separate worksheets with two different views on summarized data (e.g. maybe one that summarizes up to a daily or hourly and another that summarizes up a level in another key dimension -- perhaps by order or login or whatever). This should keep more of the labor back in the database and will leave the spreadsheet lighter.

Good luck!

CB in Aus

Posted 2009-11-14T11:13:43.737

Reputation:

Hi,

Use view to reduce the data coming in Excel is a good suggestion. Will definitely try it out.

Regards the 4GB limitation, actually we can afford the license for Enterprise edition. Might try out MySQL as well.

Thanks – Sim – 2009-11-15T02:48:29.113