I entrusted my dreams to Cloud Service Mesh for Cloud Run
SRG (Service Reliability Group) mainly provides cross-sectional support for the infrastructure of our media services, including improving existing services, launching new ones, and contributing to OSS.
This article is the day 13 entry of the QualiArts Advent Calendar 2024. I'm writing this post because I'm involved as an Embedded SRE.
We investigated and tested whether the issues we wanted to solve on Cloud Run could be solved with Cloud Service Mesh for Cloud Run.
The information in this blog is current as of December 1, 2024.
- What is Cloud Service Mesh for Cloud Run?
- Issues to be resolved
  - Issue A: Increasing number of backend services in multiple dev environments for gRPC-based applications
  - Issue B: Blue/Green deployment of gRPC-based applications with immediate all-traffic switchover
- The ideal solution I dreamed of, and the conclusion that it can't be done
  - Ideal A for Issue A and the results of the investigation
  - Ideal B for Issue B
- Consideration of Ideal B with Bookinfo
  - Configuration Description
  - Terraform Sample
- Consideration result: If we could specify Cloud Run tags in NEG...
- Other dreams for Cloud Service Mesh for Cloud Run
- Conclusion
What is Cloud Service Mesh for Cloud Run?
At the time of writing this article, the feature is in preview.
Advanced traffic management capabilities with Cloud Service Mesh are now available on Cloud Run.
For more information, please refer to the official documentation.
Issues to be resolved
Issue A: Increasing number of backend services in multiple dev environments for gRPC-based applications
In the dev environment, a different version of the application is deployed to Cloud Run for each environment, such as dev01, dev02, and so on. Each Cloud Run service needs its own Backend Service, as shown in Figure 1.
One issue with this configuration is that if you enable Cloud Armor Enterprise Paygo/Annual*1, you are charged according to the number of Backend Services even if no policy is attached to them. This is an obstacle if you want to try out the same Cloud Armor Enterprise settings in the Dev environment (or Stg environment) as in the Prod environment.

*1 Cloud Armor Enterprise Annual billing
The flat rate is $3,000/month for up to 100 protected resources, plus $30/month for each protected resource beyond 100 (see Figure 2).
Protected resources are Backend Services and Backend Buckets, and they are counted even if no Cloud Armor policy is applied. For details, please refer to the documentation.
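As a rough worked example of the Annual rates quoted above, with n as the number of protected resources (Backend Services plus Backend Buckets) and the monthly cost in USD. The value n = 120 is purely illustrative and is not a figure from our environment.

```latex
\begin{aligned}
\mathrm{cost}(n) &= 3000 + 30 \cdot \max(0,\ n - 100) \\
\mathrm{cost}(120) &= 3000 + 30 \cdot 20 = 3600
\end{aligned}
```

Every additional dev environment that brings its own Backend Service increases n, which is why we wanted to consolidate them.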

Issue B: Blue/Green deployment of gRPC-based applications with immediate all-traffic switchover
When switching revisions using traffic shifting, even if you direct 100% of traffic to the Green revision, requests may still be sent to the Blue revision between the time the change is applied and the time the traffic actually switches. *2
For backward-incompatible releases, this behavior is not acceptable.
*2 For REST APIs, session affinity can be used to ensure that users who have accessed a newer revision are not routed back to an older one.
The ideal solution I dreamed of, and the conclusion that it can't be done
Ideal A for Issue A and the results of the investigation
In order to reduce the number of backend services, we wondered if we could aggregate NEGs using Cloud Service Mesh (Figure 4).
Figure 4 is the ideal diagram I had in mind before researching Cloud Service Mesh for Cloud Run.
We used Bookinfo, a sample application provided by Istio, as the example.
(In the dev environment where I actually want to use this, communication would go through GRPCRoute.)

Figure 5 shows the results of our investigation into Cloud Service Mesh for Cloud Run. Cloud Service Mesh controls routing to Backend Services, but it has no feature for consolidating NEGs or Backend Services.

Ideal B for Issue B
Figure 6 shows the configuration I came up with for Issue B in the same way. Ideally, by changing the HTTPRoute in this configuration, reviews-v2 would stop returning responses at the same moment that reviews-v3 starts returning them.
(The API I actually want to apply this to communicates over gRPC, so it would use GRPCRoute, but I used HTTPRoute here in order to test with Bookinfo.)

Figure 7 shows what we were able to build. In the case of Cloud Service Mesh, we were unable to register a NEG with a revision tag in the Backend Service, so we had to create a separate Cloud Run service for each release version. We concluded that this configuration was unacceptable from an operational perspective.
We have not conducted performance tests to confirm that all traffic is switched as expected.

Consideration of Ideal B with Bookinfo
Here is the Terraform we used to test whether Ideal B could actually be built, using Istio's sample application Bookinfo.
Configuration Description
Since we are using Bookinfo this time, we use HTTPRoute, but the same settings can also be configured with GRPCRoute.
The route for reviews is defined in google_network_services_http_route.reviews.
I will skip the behavior verification, as it is the same as in translucens' blog post "I tried out Cloud Run's service mesh".
Terraform Sample
Here is a sample of the Terraform code to build Bookinfo.
Please modify it as appropriate for your environment.
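As a reduced sketch of the routing part only (not the full Bookinfo configuration), the idea of switching reviews-v2 to reviews-v3 through the route could look roughly like the following. Only the resource google_network_services_http_route.reviews comes from the configuration described above. The mesh name, hostname, project_id variable, and active_reviews local are hypothetical, and the service_name string follows the documented projects/*/locations/global/backendServices/* form, which you should verify against your provider version.

```hcl
# Minimal sketch; every name other than the route resource is hypothetical.
variable "project_id" {
  type = string
}

locals {
  # Change "v2" to "v3" to switch which version receives all traffic.
  active_reviews = "v2"
}

resource "google_network_services_mesh" "bookinfo" {
  name = "bookinfo"
}

# One Backend Service per Cloud Run service.
# The serverless NEG backends are omitted here for brevity.
resource "google_compute_backend_service" "reviews" {
  for_each              = toset(["v2", "v3"])
  name                  = "reviews-${each.key}"
  load_balancing_scheme = "INTERNAL_SELF_MANAGED"
}

resource "google_network_services_http_route" "reviews" {
  name      = "reviews"
  hostnames = ["reviews.example.internal"] # hypothetical hostname
  meshes    = [google_network_services_mesh.bookinfo.id]

  rules {
    action {
      destination {
        # Assumed format: projects/*/locations/global/backendServices/*
        service_name = "projects/${var.project_id}/locations/global/backendServices/${google_compute_backend_service.reviews[local.active_reviews].name}"
      }
    }
  }
}
```

With this shape, a release becomes a one-line change to active_reviews followed by terraform apply. For the gRPC case the route would presumably be a google_network_services_grpc_route instead.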
Consideration result: If we could specify Cloud Run tags in NEG...
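We could not register a NEG that specifies a Cloud Run revision tag to a Backend Service whose load balancing scheme is INTERNAL_SELF_MANAGED.

This constraint required us to create a Cloud Run service for each release version.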
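For reference, the configuration we had hoped to use would look roughly like the sketch below: a serverless NEG that points at a revision tag, attached to the Backend Service used by the mesh. The service name, tag, region, and resource names are hypothetical. As far as we could tell, this is the combination that could not be registered, so treat it as an illustration of the wish rather than a working setup.

```hcl
# Hypothetical names; assumes a Cloud Run service "reviews" that has a
# revision tag "green" in the region below.
resource "google_compute_region_network_endpoint_group" "reviews_green" {
  name                  = "reviews-green"
  region                = "asia-northeast1"
  network_endpoint_type = "SERVERLESS"

  cloud_run {
    service = "reviews"
    tag     = "green" # the revision tag we wanted to target
  }
}

resource "google_compute_backend_service" "reviews_green" {
  name                  = "reviews-green"
  load_balancing_scheme = "INTERNAL_SELF_MANAGED"

  backend {
    # Attaching a tagged NEG here is the step that did not work for us.
    group = google_compute_region_network_endpoint_group.reviews_green.id
  }
}
```

If this were accepted, Blue and Green could stay as two tagged revisions of one Cloud Run service instead of two separate services.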
Other dreams for Cloud Service Mesh for Cloud Run
I would be happy if I could access Cloud Run directly from the LB via Cloud Service Mesh. However, the feature seems to have been introduced to improve traffic between Cloud Run services, so this is probably outside its intended role.
Conclusion
In this article, we tested whether Cloud Service Mesh for Cloud Run could solve the following issues and concluded that it could not.
- Unifying Backend Services across multiple dev environments for applications using gRPC
- Blue/Green deployment with instantaneous traffic switching for gRPC-based applications
SRG is looking for people to work with us. If you are interested, please contact us here.