Excel 2010 slow with large sets of data

1

Problem:

Recently I have been working with fairly large sets of data.

However, Excel seems to handle these very slowly (it takes 3-5 minutes just to open or save the file).

Spreadsheet details

  • Rows: 50,000
  • Coloumns: 90
  • Worksheets: 1
  • File size: 157mb.

The sheet consists of pure data. No formulae or VBA script has been added yet. There are no cell references (eg =H3) or arrays.

Specs:
Office 2010 32 bit
Processor : i7 3.4 GHZ Quad Core
Ram : 6 Gb
Windows 7

Steps taken thus far

I tried saving it as an Excel binary (.xlsb) file, which helped a bit (and reduced file size a bit as well).

However, 50,000 rows is nowhere near the limit of Excel 2010 (which is 1,048,576 rows), yet it's really slow. I also remember older versions of Excel being able to deal with large chunks of data relatively fast.

Is there any way of speeding Excel up?

BKen

Posted 2016-11-17T10:39:49.527

Reputation: 15

Can you define what you mean by really slow? Perhaps give an example of something you do and how long it takes. Also, while you've given us the number of rows, you haven't told us how many columns of data your working with? For example, 50,000 rows and 26 columns is a lot more data than say 50,000 rows and 3 columns, but we have no idea what your scenario is. Finally, you haven't provided the speed of your processor etc. – Monomeeth – 2016-11-17T12:45:11.983

1Thanks for the response.

Takes a few minutes (3-5 just to open up or save), I am worried that once I start converting it to a table and adding formulae it's going to increase significantly. Will update my question to reflect the additional details It's 90 coloumns. – BKen – 2016-11-17T12:49:51.240

1So you have 4.5 Million different data points. I don't have an absolute answer on why your excel would be slow but at that size why not use an actual database for it? Especially if you're considering adding logic on top of it. – Seth – 2016-11-17T13:08:01.497

you should use 64-bit Office for such large set of data – phuclv – 2016-11-18T03:28:53.323

Thanks for the responses, unfortunately 64 bit Office isn't an option at the moment, but I will definitely be looking into simple database solutions – BKen – 2016-11-18T06:52:47.390

Answers

0

I decided to try and mimic as closely as possible your scenario. To do this I:

  • created a MS Excel spreadsheet with 50,000 rows and 90 columns of data (thankfully I work with large data sets, so this wasn't too hard to do)
  • ensured the data was a mixture of formats: Text, Dates, Number and General
  • did not use any formulas or VBA code

However, for me this spreadsheet was only 24MB in size, which is only about 15% of the size yours is, so I cannot account for that.

Unfortunately, the only context you've provided is the read/write speed for opening/saving your spreadsheet, so this is all I could test for. My results are as follows:

  • 7.68 seconds to save the data to a local hard drive
  • 3 mins 53 seconds to save the data to a network drive
  • 10.4 seconds to open the data from a local drive
  • 50.43 seconds to open the data from a network drive

Now, since my 50,000 rows and 90 columns of data only amounted to 24MB, my data is obviously going to take a lot less to save and load than your 157MB file (which is about 6.5 times larger).

However, as you can see, the speed of opening and saving files is not dictated purely by the size of the file itself. Working with an SSD is a lot faster than a local hard drive which in turn is a lot faster than a networked drive. Also, the type of network has a huge impact on data transfer speeds (e.g. Gigabit ethernet is a lot faster than fast ethernet).

Unfortunately, the closest computer I could find to run this test had the following specs:

  • Windows 7 Enterprise (64-bit)
  • i5 2.6GHz
  • 8GB RAM
  • MS Excel 2010 (32-bit)

Summary

In a nutshell:

  • there are a lot of factors which impact read/write speeds, not just the amount of data but also your hardware and/or network environment
  • a better test would be to see how long it takes for Excel to do some actual number crunching with your data, as opposed to read/write transfer speeds
  • another factor in the case of Excel is if you're loading any add-ins etc and where they're located
  • your data is 157MB in size, 6.5 times the size of my spreadsheet also containing 4.5 million cells of data (so what accounts for this?)
  • install more RAM (I have worked with large data sets in Excel - Windows and Mac - for nearly two decades and the biggest factor in Excel's performance is the amount of RAM you have installed
  • with 4.5 million cells of data you may need to reconsider whether a database would serve your needs better

Monomeeth

Posted 2016-11-17T10:39:49.527

Reputation: 1 047

Thank you for taking the time and effort for troubleshooting, I truly appreciate it.

I tried reducing the data size by saving it to a .csv and .txt formats (in an attempt to remove any formatting I am not aware of) however it didn't help reduce the file size.

I'll take your advise, is there any database programs you'd recommend for database beginners? – BKen – 2016-11-18T06:51:14.467

Well, if you've already got MS Access as part of MS Office, you could use that. There are lots of resources to get you started and it's well worth putting the effort into. You can even create your database by importing your current Excel data. – Monomeeth – 2016-11-18T07:53:22.163

0

Spreadsheet is definitely not the right tool to handler large set of data because cell are loaded as object in memory (a very inefficient way).

You can use CSV (or TSV) format for your data and use dedicated tools to manipulate them. Such as open-refine or GNU awk, etc.

Setop

Posted 2016-11-17T10:39:49.527

Reputation: 151