4

I work in a company that has a strict antivirus policy. Each computer must have an up-to-date antivirus with all the bells and whistles turned on.

But as developers, this causes us problems.

Most noticeable is the performance degradation. Building a big solution can create, delete, and move around lots of files, and the antivirus tends to check each one, which slows things down.

Second, less noticeable and more problematic, are random failures on either a developer's machine or on our CI build machines. These often occur when the antivirus locks a file for analysis while the build is trying to access that file. Because lots of files are moved around, this is highly unpredictable and non-deterministic. At times, as many as 5-10% of our builds fail because of this problem.

The suggested solution is to exclude the build executables and build folders from antivirus analysis. But so far, we have had no success convincing our security department to do that for us.

My question is: what are best practices for minimizing virus risk on developer and build machines? If that question is too broad, then: what are the possible security risks of excluding build folders from antivirus analysis?

Euphoric
  • Maybe try to gather some detailed evidence of the problem. For example, I would run (from an admin prompt) while the slowdown is occurring: `wpr.exe -start GeneralProfile`. Leave it for, say, 1 minute and then run: `wpr.exe -stop C:\GP.etl`. Then you can open it in WPA and see if you can account for the issue. It may be you can provide your IT guy with something they can forward to the security vendor to get an explanation. If they respond with "we recommend x, y, z", maybe it will happen; maybe they will fix it if it's a bug, etc. – HelpingHand Sep 30 '19 at 21:14
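For reference, the capture sequence from the comment above written out as a small script; the one-minute wait and the output path are only examples taken from the comment, not fixed values:

```
:: Run from an elevated (admin) prompt while the build slowdown is happening.
:: GeneralProfile is one of WPR's built-in recording profiles.
wpr.exe -start GeneralProfile

:: Let the trace run while the slowdown occurs (duration is an example).
timeout /t 60

:: Stop recording and write the trace; open the resulting .etl file in WPA.
wpr.exe -stop C:\GP.etl
```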

1 Answer

4

This is a common problem, and different dev teams solve it in a variety of ways.

The best practice is to mitigate the risks appropriately. That sounds like a way of avoiding the question, but in cases like these, it's the only real way to approach the problem.

What's "appropriate"?

The threats are:

  1. Devs use their machines to create and release critically important product code that will run on production servers, and both the code and the servers have access to the business' most critical data. A compromise of the dev's machine can result in the compromise of the code, the machines, and the data that the dev's code runs on.

That means that there needs to be a level of control/protection/defence between the dev machine, which has a low critical value to the business (you could shoot it with a shotgun and the business would not care beyond the cost of replacement hardware), and the things with the highest critical value to the business. Some form of risk mitigation.

  2. Devs need the appropriate tools and resources to create the critically important product code upon which the business relies. Hampering the development process or inconveniencing the devs could negatively impact code quality, product innovation, and speed of release.

That means that whatever mitigations are put in place need to have an impact on the devs that falls within some definition of tolerance, and they need to support the devs in their ability to meet productivity targets.

Given these threats, how does anti-virus fit in?

Anti-virus helps to respond automatically when the machine is compromised. The more "whistles and bells" are turned on, the more types of compromise can be detected and responded to. Reducing AV protection reduces the protection between the dev machine and the "crown jewels".

If the current level of AV protection that mitigates risk #1 is increasing risk #2, then you need to start looking at reducing risk #1 in new ways, so that you can lower the level of protection and bring risk #2 back into tolerance.

There are two ways to reduce risk #1 in dev environments:

  1. reduce the likelihood of compromise of dev machines, and/or
  2. reduce the impact of compromise of dev machines

The more a dev machine is used only for development and not for personal activities, the lower the likelihood of compromise of that machine. Devs like to do a ton of things on their dev machine other than coding: music, videos, email, social media, browsing, games, installing random tools they find on whatever site they fancy. If those things could be done away from the coding machine, or the coding environment, then you lower the likelihood of dev machine compromise. The AV protection level could then be reduced to a more tolerable level.

Alternatively, the coding activity could be separated from the machine itself. This is typically done by hosting coding environments in a protected remote environment. That way, the dev machine can be riddled with malware and the code in production is unaffected. In an extreme example, it might then be possible to have no AV on the dev machine at all. The remote coding environment can similarly have reduced protection, because the likelihood of its compromise is much lower.

However, if you want the devs to do whatever they want on their machines and code and upload whatever code they want to the CI process, then increasing your tolerance for delays and bugs introduced by the protection mechanisms is the only way forward while also keeping risk #1 at a tolerable level.

Specific settings

Checking each file when it is accessed (moved around or run in the CI steps) is important in a dynamic file-system environment like a laptop. If the environment were more static and predictable, then on-access scanning could be turned off.

There could be a file upload area on the CI servers where the AV scans incoming files before the CI process passes them on to the rest of the pipeline. Alternatively, if you are passing source code and not compiled binaries, there is no need to run AV on each file when it is accessed; scanning can be scheduled instead. Binaries, however, should be scanned before they are run in CI.
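As a concrete illustration of that last point (an assumption on my part, not something prescribed in the answer), a CI post-build step could run an on-demand scan of the freshly built artifacts before anything executes them. The sketch below assumes Windows Defender's command-line scanner (MpCmdRun.exe) and a hypothetical drop folder:

```
:: Hypothetical post-build step: on-demand scan of the build output before it is run.
:: MpCmdRun.exe is Windows Defender's command-line scanner; the drop folder path is
:: only an example.
set DROP_DIR=C:\agent\_work\drop

"%ProgramFiles%\Windows Defender\MpCmdRun.exe" -Scan -ScanType 3 -File "%DROP_DIR%"

:: Treat anything other than a clean result as a failed step.
if %ERRORLEVEL% neq 0 (
    echo Antivirus scan of build output did not come back clean - failing this step.
    exit /b 1
)
```

The point of a step like this is that scanning becomes a predictable, scheduled part of the pipeline instead of an on-access interception in the middle of the build.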

Excluding files or file locations just means that you intentionally set up a blind spot where malware can hide permanently, and that typically increases risk #1 unacceptably. Instead, make the environment more predictable and safe, and you are freer to reduce the on-access, on-demand, and dynamic protections.

What about increasing tolerance in risk #1?

For completeness' sake, I will also address this.

As a dev, you have no control over the acceptable level of risk #1; it is entirely in the hands of management. However, you could, and should, ask about and challenge that acceptable level. An honest and accurate assessment of the business risk posed by compromised machines drives all the other related risks, so it is important to set it at the correct level. Just throwing the highest level of mitigations/protections at things in the hope that "bad things don't happen" doesn't help anyone and wastes the company's time and money.

How devs can express their needs to the security department

  • Express your general understanding and buy-in of risk #1
  • Ask to get to know management's acceptable levels of risk #1
  • Ask for their, and management's, understanding and buy-in of risk #2
  • Work together to come up with suggestions to reduce the likelihood and impact of risk #1 in different ways so that the level of mitigation brings risk #2 to a more tolerable level
  • Understand that promises and trust will only go so far and that technical boundaries and controls have to come into play at some point
  • Commit to regular reviews with the security team of the risks and the mitigations, to make sure it is all still working for everyone and to seek improvements and conveniences
schroeder
  • Thanks for your answer. I find it hard not to agree with everything, but I feel that it doesn't really help me in any way. I find it especially difficult to see how I would quantify both risk #1 and risk #2. Right now, I feel that the problems caused by the over-zealous antivirus are more annoying than work-stopping. I would appreciate more options than just "antivirus everything" and "antivirus ignores dev stuff". – Euphoric Sep 30 '19 at 10:58
  • The problem is that I, or anyone outside your company, cannot give you the best options. There are so many factors involved. You cannot quantify risk #1; as I said, that's for management. You can quantify risk #2, and you need to be very vocal about that. I have given some pretty solid, industry-standard approaches here: dev-only machines, remote dev environments, and seeking stable, predictable dev actions. – schroeder Sep 30 '19 at 12:18