Incremental model provider stats computation #6290
d3a38c6 to 9cac4cf
@BugBot review
/merge-queue
🚀 Merge queue workflow triggered! View the run: https://github.com/tensorzero/tensorzero/actions/runs/21927347534
Cursor Bugbot has reviewed your changes and found 1 potential issue.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9cac4cfb5b
...ero-core/src/db/postgres/migrations/20260209183000_incremental_model_provider_statistics.sql
0aa3538 to de24a8c
virajmehta left a comment:
This makes sense, but please add a bit more documentation.
It seems like one of the PG tests is failing too: stderr ─── Cancelling due to test failure: 3 tests still running
de24a8c to 9aced72
The PG test was failing because I tried to be clever and coalesced null values to 0, but I think we actually expect null values.
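To illustrate the difference described in the comment above, here is a minimal Python sketch of a null-preserving aggregate versus one that coalesces to 0. It does not reproduce the migration's SQL; the function names and the choice of an average are assumptions made for illustration only.

```python
from typing import Optional, Sequence


def avg_preserving_null(values: Sequence[Optional[float]]) -> Optional[float]:
    """Mimic SQL AVG: skip NULLs and return NULL (None) if nothing is left."""
    present = [v for v in values if v is not None]
    if not present:
        return None  # "no data" stays distinguishable from a real 0
    return sum(present) / len(present)


def avg_coalesced(values: Sequence[Optional[float]]) -> float:
    """Same aggregate wrapped in the equivalent of COALESCE(..., 0)."""
    result = avg_preserving_null(values)
    return 0.0 if result is None else result


# A bucket where every observation is missing:
print(avg_preserving_null([None, None]))  # None
print(avg_coalesced([None, None]))        # 0.0
```

A test that expects "no value recorded" would pass against the first function and fail against the second, which matches the failure described here.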
9aced72 to 108658f
This adds a way to incrementally refresh model provider stats and schedules a pg_cron job that runs every 5 minutes and refreshes the last 10 minutes of data. It also adds a way to do a full refresh (we use this in our tests). We still need to figure out the right way to schedule this job for existing users, but this is sufficient for new users.
This can be merged independently of the model inference table split because it only touches the metadata columns.
A step towards #5691.
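The refresh scheme in the description can be sketched in Python as follows. This is an illustrative model, not the PR's PL/pgSQL: the per-minute bucketing, the row shape `(timestamp, model, provider)`, the count-only statistic, and all function names are assumptions. The idea shown is that each run discards and rebuilds only the buckets inside the lookback window, which makes repeated runs idempotent.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def minute_bucket(ts):
    """Truncate a timestamp to the start of its minute (assumed granularity)."""
    return ts.replace(second=0, microsecond=0)


def recompute_since(rows, stats, cutoff):
    """Drop every bucket at or after `cutoff`, then rebuild it from source rows."""
    for key in [k for k in stats if k[0] >= cutoff]:
        del stats[key]
    counts = Counter(
        (minute_bucket(ts), model, provider)
        for ts, model, provider in rows
        if ts >= cutoff
    )
    stats.update(counts)
    return stats


def refresh_incremental(rows, stats, now, lookback=timedelta(minutes=10)):
    """Rebuild only the buckets inside the lookback window."""
    # Align the cutoff to a minute boundary so no bucket is half-recomputed.
    return recompute_since(rows, stats, minute_bucket(now - lookback))


def refresh_full(rows, stats):
    """Rebuild everything from scratch (the kind of backfill a test would use)."""
    stats.clear()
    return recompute_since(rows, stats, datetime.min.replace(tzinfo=timezone.utc))
```

Under these assumptions, a row that arrives with a timestamp older than the lookback window is only picked up by a full refresh, so the window has to be at least as long as the schedule interval plus any expected ingestion delay.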
Note
Medium Risk
Touches Postgres migration logic and scheduled refresh behavior for production aggregates; correctness depends on the incremental window/watermark and could lead to stale or incorrect stats if misconfigured.
Overview
Switches model_provider_statistics from a materialized view to a regular table that is maintained via new PL/pgSQL refresh functions and a persisted watermark, avoiding full refreshes on large datasets.
Updates pg_cron setup to run refresh_model_provider_statistics_incremental every 5 minutes (with a 10-minute lookback), adjusts e2e tests to assert this job is scheduled exactly once, and changes the Postgres fixture loader to call a full backfill function instead of REFRESH MATERIALIZED VIEW.
Written by Cursor Bugbot for commit 9cac4cf.
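One way a persisted watermark could interact with the lookback is sketched below in Python. The summary does not spell out how the PR uses its watermark, so everything here is an assumption: the `WatermarkStore` class, the `plan_window` and `run_refresh` helpers, and the rule that a refresh starts at the last watermark minus the lookback.

```python
from datetime import datetime, timedelta, timezone


class WatermarkStore:
    """In-memory stand-in for a single-row watermark table (hypothetical)."""

    def __init__(self):
        self.value = None  # None means no refresh has completed yet


def plan_window(watermark, now, lookback):
    """Return (start, end); start is None when a full backfill is needed."""
    if watermark is None:
        return None, now
    # Start from the last completed refresh, minus the lookback, so that a
    # skipped or delayed run does not leave a gap in the statistics.
    return min(watermark, now) - lookback, now


def run_refresh(store, now, refresh, lookback=timedelta(minutes=10)):
    """Run `refresh(start, end)` and advance the watermark only on success."""
    start, end = plan_window(store.value, now, lookback)
    refresh(start, end)  # if this raises, the watermark is left unchanged
    store.value = end
    return start, end
```

With this design, a job that misses several scheduled runs resumes from the stored watermark instead of from "now minus 10 minutes", and a failed run is retried over the same range on the next tick.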