Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Firebase functions container silently freezes on making a get request to Firestore if multiple requests were created before container starts #774

Open
@antopolskiy

Description

Environment

  • Operating System version: Firebase Functions for Python (v2), Cloud Run
  • Firebase SDK version: firebase-admin==6.4.0
  • Firebase Product: firestore
  • Python version: 3.11
  • Pip version: 23.0.1

Description

Context

We've noticed this issue on our production and dev environments in Firebase, and launched an investigation together with GCP support (Support Case 49715474). GCP acknowledged the strange behaviour, but they were eventually unhelpful, offering generic advice on how it can be partially mitigated, which does not work in our use case, and it doesn't solve the issue. I continued the investigation on my own, and I think I pinned down the cause.

The issue seems to be rather generic and in my opinion very serious (silent failure -> no recovery until container goes out of scope), and happens for at least 1 use case which I presume is common in Firebase Functions: reading a document in Firestore using firebase_admin library. Since it seems one of the major use cases for firebase_admin, I decided to raise an issue also here.

I'd be very grateful for any suggestions to solve or mitigate this issue, as we've completely blocked on that right now.

Here is a reference to the issue on firebase-functions-python: firebase/firebase-functions-python#181

What happens

  • Python function is silently freezing on an attempt to read a document in firestore (firestore.client().document.get()) if multiple invocations were requested in a short period of time (in my tests I used as low as 5 invocations submitted as quickly as possible).
  • Further attempts to invoke the function do not produce any result.

This behaviour is observed in both http-triggered functions and firestore-triggered functions.

Reproduction and results

Setup

Standard Firebase Functions for Python setup, as suggested by the official documentation.

In the section below I provide minimal code to reproduce the behaviour. I provide the code for http-triggered function, as it is easier to test (for firestore-triggered need to get service key to write to db, etc; but the behavior is exactly the same).

I deploy the function to a Firebase Project (with Blaze plan).

npx firebase-tools --version
12.5.3

npx firebase-tools deploy --only functions:functions-python

Additional setup:

  • In Firestore, create a collection "test" with document "testDoc" (and any content, e.g. {"test" : "test"}). I did it manually through Firebase Console. However, in reality this is not necessary, as the issue is reproducible even without it.
  • In Cloud Run -> open Function -> Security -> Allow unauthenticated invocations

Reproduction

I use hey to generate requests.

hey -n 10 -c 10 https://http-trig-read-fs-l-<...>.a.run.app

Summary:
 Total:	20.0057 secs
 Slowest:	0.0000 secs
 Fastest:	0.0000 secs
 Average:	 NaN secs
 Requests/sec:	0.4999
Response time histogram:
Latency distribution:
Details (average, fastest, slowest):
 DNS+dialup:	 NaN secs, 0.0000 secs, 0.0000 secs
 DNS-lookup:	 NaN secs, 0.0000 secs, 0.0000 secs
 req write:	 NaN secs, 0.0000 secs, 0.0000 secs
 resp wait:	 NaN secs, 0.0000 secs, 0.0000 secs
 resp read:	 NaN secs, 0.0000 secs, 0.0000 secs
Status code distribution:
Error distribution:
 [10]	Get "https://http-trig-read-fs-l-<...>-uc.a.run.app": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

As you can see, the endpoint is unresponsive. In the Cloud Run logs we can see the issue (I am posing the end of the log, there are no further log statements in response to this run:

image

The container becomes completely unresponsive, and functions run until they time out (can be seen in Cloud Run metrics -- run times just above 1 minute is when 1 did this load test):

image

However, if after the container goes out of score, or the function is redeployed to renew the container revision, I first try to make a single request to "wake up" the container, it can then process very high loads.

hey -n 1 -c 1 https://http-trig-read-fs-l-<...>-uc.a.run.app

Summary:
 Total:	0.6506 secs
 Slowest:	0.6505 secs
 Fastest:	0.6505 secs
 Average:	0.6505 secs
 Requests/sec:	1.5372
....
Status code distribution:
 [200]	1 responses

hey -n 100 -c 100 https://http-trig-read-fs-l-<...>-uc.a.run.app

Summary:
 Total:	1.9869 secs
 Slowest:	1.9861 secs
 Fastest:	0.4788 secs
 Average:	1.2266 secs
 Requests/sec:	50.3297
...
Status code distribution:
 [200]	100 responses

Relevant Code:

firebase.json

...
"functions": [{
 "source": "functions-python",
 "codebase": "functions-python",
 "ignore": ["venv", ".git", "firebase-debug.log", "firebase-debug.*.log"],
 "runtime": "python311"
 }],
...

functions-python/requirements.txt

firebase_functions==0.2.0

functions-python/main.py

(I added print statements before every code line so that it is apparent where the function freezes)

from firebase_admin import firestore, initialize_app
from firebase_functions import https_fn, options
initialize_app()
counter = 0
# note that defining firestore client inside or outside the function doesn't make any difference, I tried both
print("firestore.client()")
db = firestore.client()
@https_fn.on_request(
 cors=options.CorsOptions(
 cors_origins=["*"],
 cors_methods=["get", "post"],
 )
)
def http_trig_read_fs_l(req: https_fn.Request) -> https_fn.Response:
 print("global counter")
 global counter
 print("db.collection('test')")
 col = db.collection("test")
 print("col.document('testDoc')")
 doc = col.document("testDoc")
 print("doc.get()")
 get = doc.get()
 print("get.to_dict()")
 data_dict = get.to_dict()
 print("counter += 1")
 counter += 1
 print("counter", counter, "data", data_dict)
 return https_fn.Response(f"counter {counter} data {data_dict}")

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

      Relationships

      None yet

      Development

      No branches or pull requests

      Issue actions

        AltStyle によって変換されたページ (->オリジナル) /