-
Notifications
You must be signed in to change notification settings - Fork 309
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add indexes for user_management_user #3054
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution! A few minor comments added.
models.Index(fields=["organization", "is_active", "username", "role"]), | ||
models.Index(fields=["organization", "is_active", "email", "role"]), | ||
models.Index(fields=["username", "organization", "is_active", "permissions"]), | ||
models.Index(fields=["email", "organization", "is_active", "permissions"]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason for ordering fields differently? (ie. "organization", "is_active", "username"
vs "username", "organization", "is_active"
)
+1 to add some indexing to the User model, wondering if keeping "role"
and "permissions"
out wouldn't be enough (thinking on space)? so we only add:
"organization", "is_active", "username"
"organization", "is_active", "email"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The order of the indexing matters for performance reasons depending on the SQL engine.
Cardinality matters as it reduces the search time for rows, which this is attempting to eliminate as many rows as possible to remove the very costly table scan.
All 4 indexes are required as they're effectively used as a set of joins to reduce the number of rows down to (preferably) 1 row (in the best case).
By any chance did you check if the recent updates around schedules improved your loading times (before applying these changes)? (e.g. besides some indexing added, there was some refactoring on how users details are fetched when RBAC is enabled) OTOH, creating an index including a JSON column (
So it will require some extra work to enable it (eventually we may need to re-think how permissions are stored/checked). |
No, technically speaking these changes aren't exactly what I had to apply since django won't accept:
Yeah I can see that in the tests, I'm not sure Django's ORM is giving a good experience here. |
Also #3067 is pretty poor form to pull the rug out from this PR. |
as larger instances struggle with slow queries
What this PR does
While running Grafana Oncall, I found that the select statements behind showing the current schedule of all users would generate sequential queries using regex to parse a JSON column and also using
lower()
on the email column, causing a high CPU load on the DB.Which issue(s) this PR fixes
This doesn't have a github issue.
We found that a larger instance of Oncall (207 rows in
user_management_user
) could have this table take up to 30 seconds to complete the table scan and that these indexes reduced that down to 600ms-900ms (averaging at about 700ms)Checklist
pr:no public docs
PR label added if not required)CHANGELOG.md
updated (orpr:no changelog
PR label added if not required)