
We are upgrading our database server, and I'm running into a weird performance issue.

Our old server is a dual-processor system with 8 cores and 4 GB of RAM, running Win2k3 R2 Standard (32 bit), MS SQL Server 2005, and Solr 4.2 on Tomcat 7.0.37, all on Java 6u22. We are using the MS SQL JDBC 3.0 driver to run DIH to import our records into Solr. This import process takes roughly 4.5 hours.

Our new server is a dual-processor system with 16 cores and 32 GB of RAM, running Win2k12 Standard (64 bit), MS SQL Server 2008 R2, and Solr 4.2 on Tomcat 7.0.39, all on Java 7u17. I used the same MS SQL JDBC 3.0 driver to run DIH. The import process took over 8 hours.

I am currently running an import test using the MS SQL JDBC 4.0 driver, but if it keeps going at the rate I'm seeing now, this will also take 7-8 hours.

Can anyone help me figure out this performance anomaly, and help me correct it? Ideally I'd expect to see the import process shorten (the server has more resources so it should), but I'd settle for getting the same speed.

Thanks.

Chris
  • That might be disk or memory bound, not processor related at al... – vonbrand Apr 08 '13 at 15:01
  • The hard disks are 15k versus 10k on the old server, and as I mentioned above the new server has 32 GB of RAM versus 4 GB on the old. I'm not discounting those possibilities, but I'm not seeing how they could be the case either. I don't know the stats on the old server, but during the current import on the new one, SQL Server is using 1.5% CPU / 7,237.7 MB RAM, and Tomcat/Solr is using 0.2% CPU / 359 MB RAM. The SQL Server RAM DOES seem high, as it is above even the total capacity of the old server. – Chris Apr 08 '13 at 15:16

1 Answer


I found the major slow-down in the import process. It was one, maybe two, of the 10 child entities and their respective queries. Of the 10, these 2 were also the only ones not using the CachedSqlEntityProcessor. When I tried to rework them to allow caching, though, I got out-of-memory exceptions. These are the same queries the old server used, so I'm not certain why they slowed down on the new server.
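For anyone wondering why the uncached entities hurt so much, and why caching them can run out of memory, here is a rough Python sketch of the general idea. It is an analogy only, not DIH's actual code: the `parent` and `child` tables, the column names, and both functions are made up for illustration, and an in-memory SQLite database stands in for SQL Server.

```python
import sqlite3
from collections import defaultdict

# Hypothetical schema: each parent record has zero or more child tags.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE child (parent_id INTEGER, tag TEXT);
    INSERT INTO parent VALUES (1, 'first'), (2, 'second');
    INSERT INTO child VALUES (1, 'a'), (1, 'b'), (2, 'c');
""")

def import_uncached(conn):
    """Run one child query per parent row: N + 1 queries in total."""
    docs, queries = [], 1  # 1 for the parent query itself
    parents = conn.execute("SELECT id, title FROM parent ORDER BY id").fetchall()
    for pid, title in parents:
        rows = conn.execute(
            "SELECT tag FROM child WHERE parent_id = ? ORDER BY tag", (pid,)
        )
        queries += 1
        docs.append({"id": pid, "title": title, "tags": [t for (t,) in rows]})
    return docs, queries

def import_cached(conn):
    """Run two queries in total, but hold every child row in memory."""
    cache = defaultdict(list)
    for pid, tag in conn.execute("SELECT parent_id, tag FROM child ORDER BY tag"):
        cache[pid].append(tag)
    parents = conn.execute("SELECT id, title FROM parent ORDER BY id")
    docs = [
        {"id": pid, "title": title, "tags": cache.get(pid, [])}
        for pid, title in parents
    ]
    return docs, 2

uncached_docs, uncached_queries = import_uncached(conn)
cached_docs, cached_queries = import_cached(conn)
```

Both functions build the same documents. The uncached version pays one round trip per parent row, while the cached version pays with memory proportional to the size of the child table, which is where a very large child entity can blow the heap.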

I decided to rework the entire process. I figured I would have better results from importing preprocessed files, rather than individually run sub-queries. So I created a stored procedure for bcp to use to export everything I needed as an XML file in a format ready for import with useSolrAddSchema turned on.
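To give an idea of the target format: with useSolrAddSchema turned on, DIH expects the file to be in Solr's standard update XML, that is, an `<add>` element containing `<doc>` elements made of `<field name="...">` children. The Python sketch below only illustrates that shape; it is not the stored procedure I used, and the function name and sample field names are made up.

```python
import xml.etree.ElementTree as ET

def to_solr_add_xml(records):
    """Serialize an iterable of dicts as a Solr <add> update document."""
    add = ET.Element("add")
    for record in records:
        doc = ET.SubElement(add, "doc")
        for name, value in record.items():
            # A multi-valued field becomes repeated <field> elements.
            values = value if isinstance(value, (list, tuple)) else [value]
            for v in values:
                field = ET.SubElement(doc, "field", {"name": name})
                field.text = str(v)
    return ET.tostring(add, encoding="unicode")

# Hypothetical record; real field names come from your Solr schema.
sample = [{"id": 1, "title": "First record", "tag": ["a", "b"]}]
xml_text = to_solr_add_xml(sample)
```

Whatever produces the file (bcp in my case), the point is that all the joining happens on the database side in one pass, so the import itself only has to read a flat file.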

The bcp export takes roughly 20 minutes, and the DIH import using a FileDataSource takes another 5.

So in the end I'm satisfied with my performance gains. The import started out at 10+ hours with Solr 1.3 on our old server. With the upgrade to Solr 4.2 I was able to cut it down to 4.5 hours, and now on the new server it is down to roughly 25 minutes. I'd call this a win for now.

Chris