We currently have an ECS service behind ALBs. Our autoscaling is based on the RequestCountPerTarget metric (https://aws.amazon.com/about-aws/whats-new/2017/07/application-load-balancer-adds-support-for-new-requestcountpertarget-cloudwatch-metric/).
As we need to expose the service as a VPC endpoint, we are considering migrating from an ALB to an NLB. Is it possible to autoscale based on request count per target when using an NLB?
As far as I can tell from the CloudWatch docs (https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-cloudwatch-metrics.html), no such metric is exposed, but I might be missing something.
As far as I know, NLBs expose no request count metric. An NLB works at the connection level (layer 4), so it counts flows rather than HTTP requests. You could use ActiveFlowCount or ProcessedBytes to get an idea of the load balancer's activity, but that's not really the same thing.
For other reasons, we are using an NLB in front of an ALB (not with containers), and I can tell you that there's a huge difference between ActiveFlowCount and RequestCount.
To scale based on request counts, you could put your NLB in front of your ALB, but you'll have to refresh the NLB's target IPs through a Lambda function (since ALB IPs aren't static), and you will incur additional costs. There is a full tutorial on how to do this on the AWS blog: Using static IP addresses for Application Load Balancers
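To give an idea of what that Lambda has to do, here is a minimal sketch in Python. It is not the code from the AWS blog post: the function names, the `port` default and the injectable `resolver` are my own assumptions, and `elbv2` is expected to be a boto3 `elbv2` client (or anything with the same three methods). It resolves the ALB's DNS name, compares the result with the IPs registered in the NLB target group, and registers or deregisters the difference.

```python
import socket


def resolve_ips(hostname, resolver=socket.getaddrinfo):
    """Return the set of IPv4 addresses the hostname currently resolves to."""
    infos = resolver(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def plan_changes(current_ips, registered_ips):
    """Work out which IPs must be added to / removed from the target group."""
    to_register = sorted(set(current_ips) - set(registered_ips))
    to_deregister = sorted(set(registered_ips) - set(current_ips))
    return to_register, to_deregister


def sync_targets(elbv2, target_group_arn, alb_dns_name, port=80,
                 resolver=socket.getaddrinfo):
    """Align the NLB target group with the ALB's current IPs.

    `elbv2` is assumed to be a boto3 "elbv2" client; names and defaults
    here are illustrative, not taken from the AWS tutorial.
    """
    current = resolve_ips(alb_dns_name, resolver)
    health = elbv2.describe_target_health(TargetGroupArn=target_group_arn)
    registered = {d["Target"]["Id"] for d in health["TargetHealthDescriptions"]}

    to_register, to_deregister = plan_changes(current, registered)
    if to_register:
        elbv2.register_targets(
            TargetGroupArn=target_group_arn,
            Targets=[{"Id": ip, "Port": port} for ip in to_register],
        )
    if to_deregister:
        elbv2.deregister_targets(
            TargetGroupArn=target_group_arn,
            Targets=[{"Id": ip, "Port": port} for ip in to_deregister],
        )
    return to_register, to_deregister
```

In a real Lambda you would call `sync_targets(boto3.client("elbv2"), ...)` on a schedule. Keep in mind that a single DNS lookup may not return every IP the ALB is using, so deregistering an address the first time it goes missing is probably too aggressive; treat this as a starting point only.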
I guess you have a good reason not to scale on your backend containers' aggregated resource metrics, but in my opinion you should autoscale based on your target group(s) metrics (which you can tweak as much as you want by sending custom metrics from your containers through the AWS API). That way you will only scale when more resources are actually needed.
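As a rough illustration of the custom-metric idea, each container could publish its own request count to CloudWatch with `put_metric_data`. The sketch below is Python; the namespace, the metric name `RequestCount`, the `ServiceName` dimension and the function names are placeholders I made up, and `cloudwatch` is assumed to be a boto3 `cloudwatch` client.

```python
from datetime import datetime, timezone


def build_metric_datum(name, value, service_name, unit="Count"):
    """Build one MetricData entry for CloudWatch put_metric_data.

    The "ServiceName" dimension is a hypothetical choice; use whatever
    dimension identifies your ECS service.
    """
    return {
        "MetricName": name,
        "Dimensions": [{"Name": "ServiceName", "Value": service_name}],
        "Timestamp": datetime.now(timezone.utc),
        "Value": float(value),
        "Unit": unit,
    }


def publish_request_count(cloudwatch, namespace, service_name, request_count):
    """Send the number of requests this container handled since the last call.

    `cloudwatch` is assumed to be a boto3 "cloudwatch" client.
    """
    datum = build_metric_datum("RequestCount", request_count, service_name)
    cloudwatch.put_metric_data(Namespace=namespace, MetricData=[datum])
    return datum
```

You would call something like `publish_request_count(boto3.client("cloudwatch"), "MyApp/ECS", "my-service", count)` once per reporting interval from every task. If all tasks publish under the same dimension, the Average statistic of that metric should approximate requests per task, which you can then reference from a target tracking policy as a customized metric.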
Finally, I'm not sure whether this is useful to you, but according to the docs you can set up VPC endpoints for ECS through AWS PrivateLink: Amazon ECS Interface VPC Endpoints