I am looking for recommendations on where to apply rate limiting for APIs. We run microservices on Kubernetes (k8s) with an ingress controller that sits behind an API gateway, which in turn sits behind a firewall.
The ingress controller and the API gateway are on private subnets. VPN users who are members of specific VPN groups can access both subnets. The firewall can also reach the API gateway via a subnet integration. The firewall is public and is the entry point for all our applications.
We are having a hard time deciding at which of these layers rate limiting should be enforced, because we have found few articles or discussions on the subject.
Should rate limiting be implemented in the application itself, at the ingress controller, at the API gateway, at the firewall before traffic even reaches our infrastructure, or at multiple layers?