Prompt recipe - controlling request rate?
TrevorHall
Is it possible to control the rate at which a prompt recipe issues requests? I ran my first one yesterday and very quickly hit the TPM rate cap on our OpenAI model deployment.
Best Answer
Hi,
The connection has settings that control the rate of queries. There is no direct limit on TPM (tokens per minute), but reducing the parallelism will usually keep you under the cap.
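To see why lowering parallelism helps, here is a rough back-of-the-envelope sketch in Python. It assumes every request uses about the same number of tokens and takes about the same time to complete. The function names and the numbers in the usage note are made up for illustration and are not part of any product API.

```python
def estimated_tpm(parallelism: int, tokens_per_request: float,
                  seconds_per_request: float) -> float:
    """Rough tokens-per-minute estimate when `parallelism` requests run at once.

    Assumption: all requests have similar size and latency.
    """
    if parallelism < 1 or seconds_per_request <= 0:
        raise ValueError("parallelism must be >= 1 and latency must be > 0")
    # Each worker completes 60 / latency requests per minute.
    requests_per_minute = parallelism * 60.0 / seconds_per_request
    return requests_per_minute * tokens_per_request


def max_parallelism_for_cap(tpm_cap: float, tokens_per_request: float,
                            seconds_per_request: float) -> int:
    """Largest parallelism whose estimated TPM stays at or under `tpm_cap`.

    Never returns less than 1, since at least one worker is needed to run.
    """
    per_worker = estimated_tpm(1, tokens_per_request, seconds_per_request)
    return max(1, int(tpm_cap // per_worker))
```

For example, with hypothetical requests of 1500 tokens that take 3 seconds each, `estimated_tpm(8, 1500, 3.0)` gives 240000.0, while `max_parallelism_for_cap(100_000, 1500, 3.0)` gives 3. Real requests vary in size, so treat the result as a starting point and leave some headroom.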
Answers
Thanks! I set "Max parallelism" to 1 and that seems to be working. I also upped the retry delay to 2 seconds.
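For anyone curious what a retry delay does in practice, here is a minimal Python sketch of the general idea: run requests one at a time and, when the API reports a rate limit, wait a fixed delay before trying again. This is only an illustration of the technique, not how the recipe is actually implemented. `RateLimitError`, `call_with_retry`, and the default of 3 retries are hypothetical; the 2 second delay is the value from the post above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a rate-limit (HTTP 429) response."""


def call_with_retry(send: Callable[[], T],
                    max_retries: int = 3,
                    retry_delay_s: float = 2.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `send`, waiting a fixed delay after each rate-limit error.

    Makes at most `max_retries + 1` attempts, then re-raises the last error.
    `sleep` is injectable so the function can be tested without waiting.
    """
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries, surface the error to the caller
            sleep(retry_delay_s)
    raise AssertionError("unreachable")  # the loop always returns or raises
```

With a parallelism of 1, requests are issued sequentially, so each call to `call_with_retry` finishes (or gives up) before the next one starts. A longer delay gives the per-minute token window more time to clear before the next attempt.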