Is that a realistic assumption in practice? Are there ways to automatically generate such benign inputs? Or is that still in its academic infancy?
That depends heavily on what kind of input data you're trying to simulate. So the short answer is: only someone who's familiar with your domain can decide that.
Here's what I mean: if the "benign inputs" you're trying to simulate are realistic user data from Google Location Services, or typical browsing behaviour on Amazon.com, then yes, the ability to simulate those inputs is "in its academic infancy".
On the other hand, if you're trying to pen test an application that speaks a standardized protocol, for example the Certificate Management Protocol (CMP), which has a very small number of accepted message types (~30 for CMP), then no, it's actually quite easy to generate an exhaustive set of example inputs: you can enumerate every message type and produce at least one valid sample of each.
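To make the second case concrete, here is a minimal sketch of enumerating a benign corpus for a protocol with a small, fixed set of message types. The type names are an illustrative subset of CMP request body types as named in RFC 4210; the field names and values in `FIELD_CHOICES` and the `benign_corpus` helper are hypothetical, and the output is plain dicts rather than real ASN.1-encoded CMP messages.

```python
from itertools import product

# Illustrative subset of CMP request body types (names as in RFC 4210).
# A real corpus would list every type the target actually accepts.
MESSAGE_TYPES = ["ir", "cr", "kur", "rr", "genm"]

# Hypothetical field values, chosen only for illustration.
FIELD_CHOICES = {
    "sender": ["CN=client-a", "CN=client-b"],
    "protection": ["mac", "signature"],
}

def benign_corpus(message_types, field_choices):
    """Yield one dict per combination of message type and field values."""
    names = list(field_choices)
    for msg_type in message_types:
        for values in product(*(field_choices[n] for n in names)):
            yield {"type": msg_type, **dict(zip(names, values))}

corpus = list(benign_corpus(MESSAGE_TYPES, FIELD_CHOICES))
print(len(corpus), "benign samples")
```

Because the input space is a small Cartesian product, exhaustive enumeration is feasible here. That is exactly what breaks down for open-ended inputs such as user location traces or browsing sessions.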
So what are you trying to do? What type of input data are you trying to simulate? If you edit your question to provide more details, we can give you a better answer.