Looking at the SO question, I'm not sure this is a systems-level problem -- the description over there sounds like an app bug. Either way, upgrading your environment is always worth thinking about, so I'll take a swing :-)
A general plan of action for a major software change or migration usually looks like this (from your SO question: everywhere I say DB/database, you should be thinking about your App2 server):
- Duplicate your environment as best you can on new hardware (and optionally upgraded software -- latest OS, web server, DB, etc.)
This can include cloning all your production databases (which is great if you don't have convenient test data).
- Test the bejeebus out of it to make sure your problem is gone.
(This part is problematic in your case since you said you haven't been able to reliably reproduce the problem)
- Clean up the detritus from your testing
- Pick a convenient time to make the switch-over
("convenient" for your users: Unfortunately that typically means 3AM on a Saturday or something equally loathsome for the admin team)
- Make the switch-over - This includes (roughly in this order)
- Disconnecting the old environment from the network / disabling user access
- Putting the old environment into a quiescent state so it's not changing anymore
- Synchronizing any databases/volatile data to the new environment
- Doing any tests you can do before you make the new environment live
- Turning on access to the new environment if the tests pass
(or being ready to put the old one back)
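The switch-over steps above can be sketched as a script. Everything here is illustrative -- the hostnames, service names, paths, and the `update-backend` command on the load balancer are placeholders you'd replace with your own; the `DRY_RUN` guard just prints each step so you can review the sequence before running it for real:

```shell
#!/bin/sh
# Hypothetical hosts -- substitute your own.
OLD_HOST=app2-old.example.com
NEW_HOST=app2-new.example.com
DRY_RUN=1   # set to 0 only when you're really cutting over
STEPS=""

run() {
    STEPS="${STEPS}x"           # count steps (handy for sanity checks)
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "WOULD RUN: $*"
    else
        "$@"
    fi
}

# 1. Disconnect users from the old environment (here: stop the front end)
run ssh "$OLD_HOST" systemctl stop nginx

# 2. Quiesce the old environment so nothing is changing anymore
run ssh "$OLD_HOST" systemctl stop app2

# 3. Synchronize volatile data over to the new environment
run rsync -az "$OLD_HOST:/var/lib/app2/" "$NEW_HOST:/var/lib/app2/"

# 4. Whatever pre-flight tests you can do before going live
run ssh "$NEW_HOST" curl -fsS http://localhost/healthz

# 5. Flip traffic to the new host (DNS, load balancer, etc. --
#    'update-backend' is a stand-in for your LB's actual tooling)
run ssh lb.example.com update-backend app2 "$NEW_HOST"
```

Keeping step 5 as the last action is what makes "being ready to put the old one back" cheap: until you flip traffic, rollback is just restarting the old services.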
In your case, depending on where the funky behavior comes up, you may be able to short-circuit most of this around step 3: if your admins are the only ones who see the misbehaving portion of the application, then your admins can beat on a testing copy of the environment until they either reproduce the bug or are satisfied that it's gone (and if the bug pops up, you're back in application-land).
If the problem is user-facing the only real solution is putting the new stuff out where users can get at it, which basically means going through the whole process.
You also have a few different challenges because you want to run your environments in parallel: if both environments will be writing to a database, you will need to take precautions to ensure either that both environments write the same information to their own copy of the database (e.g. by multiplexing the connections at your load balancer), or that both environments can safely interact with a single shared database.
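If you do run with two copies of the database, you'll want a way to verify they haven't silently diverged. A minimal sketch: dump each copy and compare checksums (with `mysqldump` you'd add `--skip-dump-date` so timestamps don't cause false mismatches). The file paths and dump contents below are throwaway placeholders just to make the comparison concrete:

```shell
#!/bin/sh
# Compare two database dumps by checksum; returns success if identical.
checksums_match() {
    [ "$(md5sum < "$1" | cut -d' ' -f1)" = "$(md5sum < "$2" | cut -d' ' -f1)" ]
}

# Illustration with two throwaway dump files standing in for
# "mysqldump from the old env" and "mysqldump from the new env":
printf 'INSERT INTO t VALUES (1);\n' > /tmp/old.sql
printf 'INSERT INTO t VALUES (1);\n' > /tmp/new.sql

if checksums_match /tmp/old.sql /tmp/new.sql; then
    echo "dumps match"
else
    echo "DIVERGED -- investigate before trusting the parallel run"
fi
```

Run a check like this periodically during the parallel phase; catching divergence early is far cheaper than reconciling two databases after the fact.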
Running in parallel pretty much eliminates the first and third bullets from #5 above (you don't duplicate the back-ends, and the "old" environment keeps running - you just prop up the new one next to it).
In your specific case with identical applications on App1 you may be able to use App2 as a shared database, but that's something you need to think about from a software standpoint (would App2 freak out if it saw multiple hosts talking to it?).
No matter what you do, definitely hang on to your old environment for a while without touching it. How long depends on your particular situation -- for example, at my company, about 8 hours after a major DB schema change we've accumulated so much data that we can't roll back: the data loss would be catastrophic and recovery protracted.
Once you're sure the new environment has solved your problem (or at least works as well as the old environment with no new problems) you can turn the old stuff into a development lab.