Let's consider the oldest biometric method - fingerprinting.
Although fingerprinting has been around since the 1890s as a forensic tool, there is still no uniform international standard for recording or comparing fingerprint minutiae. The closest de facto standard is ANSI/NIST-ITL 1-2011 (see section 8.9). There is also no legal rigor in fingerprint matching: it essentially comes down to the opinions and best practices of forensic examiners (human beings) used as expert witnesses when needed by the court. And like any two expert witnesses, two examiners can disagree.
This is relevant to biometric encryption keys because the information reduction (loss) on the way to a "match" is subjective and based on trading off Type-I and Type-II errors (false rejections and false acceptances). For this reason most fingerprint databases still store the full image as the authoritative source for when two or more expert witnesses have to duke it out in court.
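As a toy illustration (not any real matcher), a "match" boils down to a threshold decision, and sliding that threshold is exactly the Type-I/Type-II trade-off. Everything here is a made-up assumption: the minutiae format, the tolerance, and the point count.

```python
# Hypothetical minutiae represented as (x, y) pixel coordinates.
def count_matching_minutiae(probe, reference, tolerance=8):
    """Count probe points that land within `tolerance` pixels of some reference point."""
    matched = 0
    for (px, py) in probe:
        if any(abs(px - rx) <= tolerance and abs(py - ry) <= tolerance
               for (rx, ry) in reference):
            matched += 1
    return matched

def is_match(probe, reference, required_points=12):
    # Raising `required_points` reduces false acceptances but increases
    # false rejections; lowering it does the opposite. The "right" value
    # is a policy decision, not a mathematical one.
    return count_matching_minutiae(probe, reference) >= required_points
```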
But you can't just store the whole image as the key either, due to distortions from oil, grit, finger-pad pressure, abrasions, micro-cuts, optical noise, axial orientation, and so forth.
The net result is that at least three stages of selective, or simply implicit, information loss occur (a code sketch of the pipeline follows the list):
- Human to Image: Information discarded because it exceeds the lowest common denominator of all scanned images.
  - The extra information could not be used anyway without raising the bar for what qualifies as the common denominator. Any user-friendly scanner will have a different error light or noise for cases where it rejects the scan for failing a minimum quality metric (too smudged, too much axial rotation, too much ambient light, etc.).
- Image to Integer Set: Information discarded when choosing which and how many minutiae to record.
  - This should be consistent, given the deterministic nature of normal computer algorithms, but there is no guarantee that the information loss is uniformly distributed across the data set. That is, if this were a hash (which it is, in a way), the collisions would not be evenly distributed.
- Integer Set Reduction: Information discarded because the algorithm or expert system for matching the fingerprints considers two fingerprints with different minutiae to be identical on some statistical or forensic basis.
  - This is typically done to recover some of the uniform distribution lost in Stage 2. The second and third stages are usually not condensed together because Stage 2 produces points for matching, while Stage 3 decides how many points are worth matching in a given context. Different end users of fingerprint matching can have different requirements for point-matching while retaining the same minutiae identification standard. For example, a laptop scanner may use a 4-point match for user convenience while a criminal court may require a 16-point match.
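Here is a minimal sketch of those three stages as a pipeline, purely to mark where information is discarded. The quality metric, the extraction routine, and the per-context point counts are all assumed placeholders, not any real fingerprint stack.

```python
MIN_QUALITY = 0.6          # Stage 1 gate: assumed quality score in [0, 1]
POINTS_FOR_CONTEXT = {     # Stage 3 policy: how many points each use case demands
    "laptop_login": 4,
    "criminal_court": 16,
}

def stage1_capture(raw_scan):
    """Human -> Image: keep only scans meeting the lowest common denominator."""
    if raw_scan["quality"] < MIN_QUALITY:
        raise ValueError("rescan: too smudged / too rotated / too much ambient light")
    return raw_scan["image"]

def stage2_extract_minutiae(image):
    """Image -> Integer Set: deterministic, but the loss is not evenly spread."""
    # A real extractor derives (x, y, angle, type) tuples from ridge analysis;
    # here we pretend the image object already carries them.
    return sorted(image["minutiae"])

def stage3_reduce(minutiae, context):
    """Integer Set Reduction: keep only as many points as the context requires."""
    return minutiae[:POINTS_FOR_CONTEXT[context]]
```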
Regarding use in encryption
This problem of unevenly distributed, highly compressed information loss exists for all human bio-markers (fingers, eyes, voice, face, earprints) except perhaps DNA, since DNA has discrete GATC nucleobases (you just need several samples to avoid site-specific mutations).
Using biometrics for encryption is possible if you treat the biometric reduction and matching process as a non-cryptographic hash with a very small digest range. That seems counter-intuitive given the vast range of fingerprints, which are unique even for identical twins. But the fundamental difference between using a biometric marker for authentication versus encryption is that the digest reduction required to tolerate real-world analog noise leaves a usable key space that is trivially small against brute-force attacks.
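A back-of-the-envelope estimate of that key space, using assumed quantization numbers picked only for illustration:

```python
import math

grid_cells = 16 * 16   # assume minutiae positions quantized to a 16x16 grid
angles = 8             # assume ridge orientation quantized to 8 directions
points_kept = 12       # assume a 12-point match survives the reduction

# Optimistic upper bound on distinct reduced templates (ignores ordering
# constraints and ridge correlation, both of which shrink the real space).
states_per_point = grid_cells * angles
keyspace_bits = points_kept * math.log2(states_per_point)
print(f"on-paper key space: ~{keyspace_bits:.0f} bits")   # ~132 bits

# The matcher's tolerance window then collapses this further: an attacker
# only needs to land *inside* the acceptance region, not hit the exact
# template, so the effective key space is a small fraction of the above.
```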
Companies providing biometric solutions for encryption typically embed a key derivation function into the hardware scanner to produce keys of a decent size and randomness.
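A minimal sketch of what such a scanner-side derivation might look like, assuming the device can already produce a stable, canonicalized template (which is the hard part); the function name and parameters are illustrative, not any vendor's API.

```python
import hashlib
import os

def derive_key(template: bytes, salt: bytes, length: int = 32) -> bytes:
    """Stretch a low-entropy biometric template into a fixed-size key."""
    # PBKDF2 adds a work factor and mixes in the device salt, but it cannot
    # add entropy that the template itself does not have.
    return hashlib.pbkdf2_hmac("sha256", template, salt, 200_000, dklen=length)

device_salt = os.urandom(16)   # generated once, then stored on the device
key = derive_key(b"canonicalized-minutiae-template", device_salt)
print(key.hex())
```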
Is it possible to reliably derive a key from a biometric fingerprint?
Yes, but the more reliable you make it, the weaker the inherent key becomes, until almost all of the cryptographic security comes from the static salt stored on the device or computer.
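Toy numbers for that trade-off, assuming a hypothetical 256x256 sensor and increasingly coarse quantization so that noisy rescans still fall into the same bin:

```python
import math

sensor_px = 256
for bin_size in (4, 16, 64):             # larger bins tolerate more noise
    bins_per_axis = sensor_px // bin_size
    bits_per_point = math.log2(bins_per_axis ** 2)
    print(f"{bin_size:>2}px bins: {bits_per_point:4.1f} bits per minutia point")

# 4px -> 12.0, 16px -> 8.0, 64px -> 4.0 bits per point: every step toward
# "always matches, even on a noisy rescan" removes key material, until the
# device-stored salt is doing nearly all of the cryptographic work.
```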