How it works
This documentation section aims to give a high-level explanation of what happens under the hood when you run batchling through the SDK or CLI.
There are basically three components working together hand in hand, in order:
-
The context manager
-
The HTTP hooks
-
The batcher
Let's detail the role of each of them next.
Context manager
This is the batchify function you call in the SDK (or that is called for you if you use the CLI).
What it does:
-
Install HTTP hooks to capture target incoming HTTP requests
-
Initialize the batcher, which will do most of the hard work later
-
Activate a
BatchingContextcontext var when the user enters the context, basically saying "batch mode is activated" at the context manager scope
HTTP Hooks
The HTTP hooks are installed by the context manager.
Essentially, they patch the httpx and aiohttp libraries such that incoming requests matching certain characteristics (hostname, endpoint, method..) are captured.
The requests that are not captured just flow naturally like if nothing happened, which makes batchling non-intrusive.
The captured requests are repurposed and sent to the batcher for further processing.
Batcher
The batcher receives tons of requests coming from differents sources.
His role is the one of the orchestrator:
-
Sorts requests by the provider, endpoint, model triplet. This is required because most Batch APIs don't allow mixing models and endpoints.
-
In practice, the batcher manages a set of queues that represent future batches (stack of requests).
-
Detects cached requests and fast-tracks them to the polling state.
-
Monitors time elapsed since a request spawned a queue and each queue size.
-
Uses the providers abstraction to submit, poll and download results from batches.
-
Returns results as a dump of HTTP Responses, like if nothing had happened in the middle.
Finally, the batcher sends back those responses to the patched HTTP client, which returns them to your framework of choice, and the code continues its execution.