Issue/4466/api access/project-specific-access by lsabor · Pull Request #4487 · Metaculus/metaculus

lsabor · 2026-03-14T19:23:39Z

Addresses the main-site part of the optional feature of #4466.
Follow-up to #4488.

Summary
Renames WhitelistUser model to UserDataAccess and decouples API access tier grants from user-level data permissions.

Key changes
Model rename: WhitelistUser → UserDataAccess, with related_name updated from whitelists to data_accesses across all FKs
New view_user_data field: Replaces view_forecaster_data. Previously, the existence of a whitelist entry implied user data access. Now entries can exist solely for API tier overrides — only entries with view_user_data=True grant user-level data access. Existing rows are backfilled to True.
ApiAccessTier extracted to users/constants.py; BOT_BENCHMARKING renamed to BENCHMARKING
/users/me endpoint: reduced_api_restriction_projects replaced with project_data_access, only returned when ?with_data_access=true is passed
Renamed API surface: get-whitelist-status/ → get-data-access-status/, response key is_whitelisted → has_data_access
Frontend updated to match all backend renames (types, API client, download modal)
All changes consolidated into the existing 0008 migration
Database constraints added: project or post must be null, and user, project, and post must be unique together.
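
The gating rule described above (an entry can exist purely as an API tier override, and only `view_user_data=True` grants user-level data access) can be illustrated with a small framework-free sketch. The `DataAccessEntry` dataclass and `grants_user_data` helper are hypothetical stand-ins, not the Django model from this PR. The null constraint is read here as "at most one of project and post is set", and the nullable `api_access_tier` is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataAccessEntry:
    """Plain-Python stand-in for a UserDataAccess row (not the Django model)."""

    user_id: int
    project_id: Optional[int] = None
    post_id: Optional[int] = None
    api_access_tier: Optional[str] = None  # assumed nullable: no tier override
    view_user_data: bool = False

    def __post_init__(self) -> None:
        # Assumed reading of the constraint: an entry is never scoped to
        # both a project and a post at the same time.
        if self.project_id is not None and self.post_id is not None:
            raise ValueError("an entry may target a project or a post, not both")


def grants_user_data(entries: list[DataAccessEntry]) -> bool:
    """Only entries with view_user_data=True grant user-level data access."""
    return any(entry.view_user_data for entry in entries)


# An entry that only overrides the API tier grants no user-level data access.
tier_only = DataAccessEntry(user_id=1, project_id=10, api_access_tier="benchmarking")
full = DataAccessEntry(user_id=1, project_id=11, view_user_data=True)
```

Under these assumptions, `grants_user_data([tier_only])` is false, while adding `full` to the list makes it true.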

Summary by CodeRabbit

New Features
- New API access tiers: Restricted, Benchmarking, Unrestricted.
- Per-project and per-post data-access grants with a toggle to allow viewing user-level (de‑anonymized) data.
- User profile can include project-level data-access info when requested.
Refactor
- Terminology, UI text, and API routes renamed from “whitelist” to “data access” across web UI, exports, and downloads.

coderabbitai · 2026-03-14T19:23:59Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Walkthrough

Renames WhitelistUser → UserDataAccess, adds api_access_tier and view_user_data, moves ApiAccessTier to users.constants, updates related migrations, serializers, utils, views, URLs, frontend types/API/UI, admin, and tasks to use "data access" terminology and behavior.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Models & Migrations: `misc/models.py`, `misc/migrations/0008_whitelistuser_api_access_tier.py`, `users/models.py` | Model renamed `WhitelistUser` → `UserDataAccess`; added `api_access_tier` and `view_user_data`; FK `related_name` changed to `data_accesses`; data migration sets existing `view_user_data=True`. |
| Constants & User migration: `users/constants.py`, `users/migrations/0016_user_api_access_tier.py` | Introduced `ApiAccessTier` TextChoices (RESTRICTED, BENCHMARKING, UNRESTRICTED); updated user migration choice value from `bot_benchmarking` → `benchmarking`. |
| Backend utils, views & URLs: `misc/utils.py`, `misc/views.py`, `misc/urls.py`, `utils/views.py` | Replaced whitelist helpers/endpoints with data-access equivalents (`get_whitelist_status` → `get_data_access_status`); checks query `UserDataAccess` (filter `view_user_data=True`); response key renamed to `has_data_access`. |
| Serializers & User API: `users/serializers.py`, `users/views.py`, `utils/serializers.py` | Added `project_data_access` and `UserPrivateDataAccessSerializer`; `current_user_api_view` can return data-access-aware serializer via `with_data_access` flag; replaced internal `is_whitelisted` checks with `has_data_access`. |
| Frontend (types, API, UI): `front_end/src/types/utils.ts`, `front_end/src/services/api/posts/posts.shared.ts`, `front_end/src/app/.../download_question_data_modal/index.tsx` | Renamed `WhitelistStatus` → `DataAccessStatus`, `is_whitelisted` → `has_data_access`; API endpoint `/get-data-access-status/`; UI state/logic updated to `dataAccessStatus`. |
| Admin: `misc/admin.py` | Admin registration updated to `UserDataAccess` and admin class renamed to `UserDataAccessAdmin`. |
| Utilities & Tasks: `utils/csv_utils.py`, `utils/tasks.py` | Renamed parameter `is_whitelisted` → `has_data_access` and propagated through CSV export and email task. |
| Other: `misc/utils.py`, `utils/views.py`, `utils/serializers.py` | Consistent renames from "whitelist"/`is_whitelisted` to "data access"/`has_data_access`; filtering now considers `view_user_data` and `api_access_tier` where applicable. |
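
As a rough illustration of the tier constants listed above, here is a framework-free sketch that uses the standard-library `Enum` in place of Django's `TextChoices`. Only the `benchmarking` value is confirmed by the migration note; the other two string values are assumptions, and `LEGACY_TIER_VALUES` and `parse_tier` are hypothetical helpers that do not appear in the PR.

```python
from enum import Enum


class ApiAccessTier(str, Enum):
    """Stand-in for the TextChoices class; not the code from users/constants.py."""

    RESTRICTED = "restricted"      # string value assumed
    BENCHMARKING = "benchmarking"  # value named in the migration change
    UNRESTRICTED = "unrestricted"  # string value assumed


# Hypothetical mapping from the pre-rename stored value to the new one.
LEGACY_TIER_VALUES = {"bot_benchmarking": "benchmarking"}


def parse_tier(raw: str) -> ApiAccessTier:
    """Resolve a stored string to a tier, tolerating the old benchmarking name."""
    return ApiAccessTier(LEGACY_TIER_VALUES.get(raw, raw))
```

A call such as `parse_tier("bot_benchmarking")` resolves to `ApiAccessTier.BENCHMARKING`, which is one way a consumer could tolerate values written before the rename.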

Sequence Diagram(s)

sequenceDiagram
    participant FE as Frontend (Client)
    participant API as Backend API (misc.views)
    participant Logic as Business Logic (misc.utils)
    participant DB as Database (UserDataAccess)
    FE->>API: GET /get-data-access-status?post_id&project_id
    API->>Logic: get_data_access_status(user, post_id, project_id)
    Logic->>DB: query UserDataAccess (user, project/post/null, view_user_data, api_access_tier)
    DB-->>Logic: matching entries / tiers
    Logic-->>API: (has_data_access, view_deanonymized_data)
    API-->>FE: JSON { has_data_access, view_deanonymized_data }

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

issue/4466/api-access/comments-and-bot_benchmarking #4488 — touches ApiAccessTier choices and the same user migration adjusting benchmarking enum value.

Suggested reviewers

elisescu
ncarazon
hlbmtc

Poem

🐰 I hopped from "whitelist" to brighter grass,
New tiers and flags in one tidy pass.
Models migrated, endpoints renamed,
Data-access rules now neatly framed.
🥕 Hop on—data's ready!

🚥 Pre-merge checks | ✅ 1 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The pull request title is vague and uses non-descriptive branch naming conventions rather than clearly summarizing the main change. | Improve the title to clearly describe the main change, e.g., 'Rename WhitelistUser model to UserDataAccess with project-specific data access grants' or similar. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@comments/services/feed.py`:
- Around line 73-79: The current checks treat author_is_staff as truthy so False
is treated like "not provided"; update the conditional logic to detect presence
and explicit True/False values: use "author_is_staff is not None" to detect a
provided boolean and "author_is_staff is True" / "author_is_staff is False" for
behavior decisions. Concretely, change the branch conditions around author and
author_is_staff (the if that currently reads "if author is not None and
author_is_staff", the "elif author_is_staff", and related qs.filter calls) to
explicitly check for is not None and compare to True/False, and implement the
corresponding filters (author_id, author__is_staff=True, author__is_staff=False,
and parent=None where needed).
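
The truthiness pitfall behind this finding can be shown without Django. The sketch below is illustrative only: `describe_filter` is a hypothetical helper that returns the keyword arguments a queryset filter might receive, and it leaves out the `parent=None` handling mentioned above.

```python
from typing import Optional


def describe_filter(author_id: Optional[int], author_is_staff: Optional[bool]) -> dict:
    """Return the filter kwargs a queryset would receive (illustrative only)."""
    filters: dict = {}
    if author_id is not None:
        filters["author_id"] = author_id
    # `is not None` distinguishes an explicit False from "not provided";
    # a bare `if author_is_staff:` would silently drop the False case.
    if author_is_staff is not None:
        filters["author__is_staff"] = author_is_staff
    return filters
```

With this shape, passing `author_is_staff=False` yields a filter on non-staff authors instead of being treated as an omitted parameter.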

In `@users/serializers.py`:
- Around line 138-142: get_reduced_api_restriction_projects is returning
duplicate project IDs and loads full WhitelistUser objects; change the query on
user.whitelists to select only project_id and deduplicate in the DB by using
values_list('project_id', flat=True).distinct() combined with the existing
project_id__isnull=False filter so the method returns a lean, unique list of
project IDs without instantiating full model instances.
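
The effect this finding asks for (a lean list of unique, non-null project IDs) can be sketched in plain Python. This is only an in-memory analogue of the suggested `values_list('project_id', flat=True).distinct()` query; `unique_project_ids` and the dictionary rows are hypothetical, and a database `DISTINCT` does not promise the first-seen ordering shown here.

```python
def unique_project_ids(rows: list[dict]) -> list[int]:
    """Collect non-null project IDs once each, preserving first-seen order."""
    seen = dict.fromkeys(
        row["project_id"] for row in rows if row["project_id"] is not None
    )
    return list(seen)
```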

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 912be90c-bc22-4669-8060-485dd631298d

📥 Commits

Reviewing files that changed from the base of the PR and between ec238b4 and 2e0916b.

📒 Files selected for processing (8)

comments/serializers/common.py
comments/services/feed.py
misc/migrations/0008_whitelistuser_view_forecaster_data.py
misc/models.py
misc/utils.py
users/migrations/0016_user_api_access_tier.py
users/models.py
users/serializers.py

comments/services/feed.py

users/serializers.py

github-actions · 2026-03-14T19:34:44Z

Cleanup: Preview Environment Removed

The preview environment for this PR has been destroyed.

| Resource | Status |
| --- | --- |
| 🌐 Preview App | Deleted |
| 🗄️ PostgreSQL Branch | Deleted |
| ⚡ Redis Database | Deleted |
| 🔧 GitHub Deployments | Removed |
| 📦 Docker Image | Retained (auto-cleanup via GHCR policies) |

Cleanup triggered by PR close at 2026-03-26T19:37:30Z

addresses main site parts of primary spec of #4466
- add bot_benchmarking to api access tiers
- add author_is_staff optional param to comments endpoint

…/api-access/comments-and-bot_benchmarking

…/api-access/endpoint-updates

…from user data permissions
- Rename WhitelistUser model to UserDataAccess across backend, frontend, and migrations
- Replace view_forecaster_data field with view_user_data (default False) to explicitly gate user-level data access, separate from API tier grants
- Add api_access_tier field to UserDataAccess for project/post-scoped API tier overrides
- Extract ApiAccessTier enum to users/constants.py
- Rename all related identifiers: whitelists -> data_accesses, is_whitelisted -> has_data_access, get-whitelist-status -> get-data-access-status
- Update /users/me endpoint: rename reduced_api_restriction_projects to project_data_access, only return it when ?with_data_access=true is passed
- Migration backfills view_user_data=True for all existing rows

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

misc/utils.py (1)

27-45: ⚠️ Potential issue | 🟠 Major

Check all projects attached to the post, not just default_project.

utils/views.py:156-172 now treats any post.projects membership as sufficient for has_data_access. Here, Lines 29-31 only match post.default_project, and Lines 37-45 reuse that same narrowed project for the admin shortcut. Users whose access is granted through a non-default project on the post will get a false negative from this helper.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@misc/utils.py` around lines 27 - 45, The helper narrows checks to
post.default_project which misses permissions granted via other projects on a
post; update the logic in the has_data_access helper to iterate over
post.projects (or use post.projects.all()) instead of using
post.default_project: when post_id is set, collect data_access_entries for every
project in post.projects and also check
ProjectUserPermission.objects.filter(user=user, project__in=post.projects.all(),
permission=ObjectPermission.ADMIN).exists() so the admin shortcut and
data_access_entries include all projects attached to the post rather than only
the default_project.
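
The false negative described here can be reproduced with plain sets. This is a sketch under stated assumptions: `has_project_grant` and its parameters are hypothetical and stand in for the queries in the real helper.

```python
def has_project_grant(
    post_project_ids: set[int],
    default_project_id: int,
    granted_project_ids: set[int],
    check_all_projects: bool,
) -> bool:
    """Return True if any candidate project of the post carries a grant."""
    # Narrow mode mirrors the reported behaviour (default project only);
    # wide mode mirrors the suggested fix (every project attached to the post).
    candidates = post_project_ids if check_all_projects else {default_project_id}
    return bool(candidates & granted_project_ids)
```

For a post attached to projects 1 and 2 with project 1 as default, a grant on project 2 is missed in narrow mode and found in wide mode.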

utils/csv_utils.py (1)

214-236: ⚠️ Potential issue | 🔴 Critical

Tighten score scoping for only_include_user_ids and anonymous callers.

user_forecasts is re-scoped after Line 146, but the score branch is not. If a non-privileged caller provides only_include_user_ids, Lines 221-228 will return those users’ score rows. And when user is None, Line 232 collapses to Q(user__isnull=True) | Q(), which matches every score. Reapply the caller restriction after the optional ID filter, and use only Q(user__isnull=True) for anonymous exports.

Possible tightening

-        elif only_include_user_ids:
+        elif only_include_user_ids:
+            allowed_user_ids = set(only_include_user_ids)
+            if not (has_data_access or is_staff):
+                allowed_user_ids &= {user.id} if user else set()
             # only include user-specific scores for the given user_ids
             scores = scores.filter(
-                Q(user_id__in=only_include_user_ids) | Q(user__isnull=True)
+                Q(user_id__in=allowed_user_ids) | Q(user__isnull=True)
             )
             archived_scores = archived_scores.filter(
-                Q(user_id__in=only_include_user_ids) | Q(user__isnull=True)
+                Q(user_id__in=allowed_user_ids) | Q(user__isnull=True)
             )
         elif not (has_data_access or is_staff):
             # only include user-specific scores for the logged-in user
             scores = scores.filter(
-                Q(user__isnull=True) | (Q(user=user) if user else Q())
+                Q(user__isnull=True)
+                | (Q(user=user) if user else Q(user__isnull=True))
             )
             archived_scores = archived_scores.filter(
-                Q(user__isnull=True) | (Q(user=user) if user else Q())
+                Q(user__isnull=True)
+                | (Q(user=user) if user else Q(user__isnull=True))
             )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@utils/csv_utils.py` around lines 214 - 236, The scores/archived_scores query
branch incorrectly allows broader access when only_include_user_ids is set and
when user is None; update the logic so that after applying the optional
only_include_user_ids filter you reapply the caller restriction when not
(has_data_access or is_staff), and for anonymous callers (user is None) use only
Q(user__isnull=True) instead of Q(user__isnull=True) | Q(), i.e. ensure scores
and archived_scores are additionally filtered to (Q(user__isnull=True) |
Q(user=user)) for logged-in callers and to Q(user__isnull=True) for anonymous
callers while still honoring only_include_user_ids.
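
The caller restriction proposed above boils down to a set intersection. The following framework-free sketch, assuming a hypothetical `visible_score_user_ids` helper, returns which user IDs may appear in per-user score rows; rows without a user (aggregate rows) are taken to remain visible in every case.

```python
from typing import Optional


def visible_score_user_ids(
    only_include_user_ids: Optional[list[int]],
    caller_id: Optional[int],
    privileged: bool,
) -> Optional[set[int]]:
    """Return the user IDs whose per-user scores may be exported.

    None means "no restriction"; an empty set means only rows without a
    user remain visible.
    """
    own = {caller_id} if caller_id is not None else set()
    if only_include_user_ids:
        requested = set(only_include_user_ids)
        # Non-privileged callers can narrow the export to themselves at most.
        return requested if privileged else requested & own
    if privileged:
        return None
    return own
```

Here `privileged` stands for `has_data_access or is_staff`, and an anonymous caller is modelled as `caller_id=None`.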

🧹 Nitpick comments (1)

misc/models.py (1)
111-120: Disallow deanonymized-only grants that never take effect.

Lines 111-119 allow view_deanonymized_data=True while view_user_data=False, but misc/utils.py filters entries by view_user_data=True before it checks deanonymization. That makes these rows look more permissive in admin than they are at runtime. A small CheckConstraint or matching admin validation would keep the permission matrix consistent.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@misc/models.py` around lines 111 - 120, Add a constraint and/or admin
validation to prevent rows where view_deanonymized_data is True while
view_user_data is False: in the model that defines the fields view_user_data and
view_deanonymized_data, add a CheckConstraint enforcing "NOT
view_deanonymized_data OR view_user_data" (i.e., view_deanonymized_data implies
view_user_data) and add matching clean()/ModelAdmin form validation to reject or
warn on such combinations so admin UI and runtime filtering remain consistent.
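
The implication behind the suggested constraint ("NOT view_deanonymized_data OR view_user_data") can be checked in isolation. This is a minimal sketch of the validation logic only; `flags_are_consistent` and `validate_flags` are hypothetical names, and the real change would live in a model constraint or admin form.

```python
def flags_are_consistent(view_user_data: bool, view_deanonymized_data: bool) -> bool:
    """view_deanonymized_data implies view_user_data."""
    return (not view_deanonymized_data) or view_user_data


def validate_flags(view_user_data: bool, view_deanonymized_data: bool) -> None:
    """Raise if a row would grant deanonymized access that never takes effect."""
    if not flags_are_consistent(view_user_data, view_deanonymized_data):
        raise ValueError("view_deanonymized_data requires view_user_data")
```

Of the four flag combinations, only `view_user_data=False` with `view_deanonymized_data=True` is rejected.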

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b4f09573-398e-498b-8ce8-e437f1a7a28f

📥 Commits

Reviewing files that changed from the base of the PR and between 2e0916b and 21688ce.

📒 Files selected for processing (17)

front_end/src/app/(main)/questions/[id]/components/download_question_data_modal/index.tsx
front_end/src/services/api/posts/posts.shared.ts
front_end/src/types/utils.ts
misc/admin.py
misc/migrations/0008_whitelistuser_api_access_tier.py
misc/models.py
misc/urls.py
misc/utils.py
misc/views.py
users/constants.py
users/models.py
users/serializers.py
users/views.py
utils/csv_utils.py
utils/serializers.py
utils/tasks.py
utils/views.py

misc/migrations/0008_whitelistuser_api_access_tier.py

elisescu

Had one more inline comment, but looks good otherwise.

misc/models.py

elisescu · 2026-03-26T09:49:16Z

I added @hlbmtc to review as well

…/api-access/endpoint-updates

misc/migrations/0008_whitelistuser_api_access_tier.py

users/constants.py

lsabor temporarily deployed to testing_env March 14, 2026 19:23 — with GitHub Actions Inactive

coderabbitai bot reviewed Mar 14, 2026

comments/services/feed.py (resolved)

users/serializers.py (outdated, resolved)

lsabor added 5 commits March 14, 2026 13:03

issue/4466/api-access/comments-and-bot_benchmarking

724a295

addresses main site parts of primary spec of #4466
- add bot_benchmarking to api access tiers
- add author_is_staff optional param to comments endpoint

Merge branch 'main' of github.com:Metaculus/metaculus into issue/4466…

7326928

…/api-access/comments-and-bot_benchmarking

save work

7c0c84c

add view_forecaster_data to WhitelistUser model

d1bd3de

add bot_benchmarking api restriction tier

55a2c5f

lsabor force-pushed the issue/4466/api-access/endpoint-updates branch from 2e0916b to 55a2c5f Compare March 14, 2026 20:05

lsabor temporarily deployed to testing_env March 14, 2026 20:05 — with GitHub Actions Inactive

lsabor changed the base branch from main to issue/4466/api-access/comments-and-bot_benchmarking March 14, 2026 20:05

simplify api restriction serialized value

7c61770

lsabor changed the title ~~Issue/4466/api access/endpoint updates~~ Issue/4466/api access/project-specific-access Mar 14, 2026

Base automatically changed from issue/4466/api-access/comments-and-bot_benchmarking to main March 14, 2026 21:45

lsabor mentioned this pull request Mar 14, 2026

API Data Access Changes #4466

Open

6 tasks

lsabor added 2 commits March 16, 2026 08:12

Merge branch 'main' of github.com:Metaculus/metaculus into issue/4466…

de777d8

…/api-access/endpoint-updates

lsabor had a problem deploying to testing_env March 16, 2026 16:10 — with GitHub Actions Failure

lsabor temporarily deployed to testing_env March 16, 2026 16:10 — with GitHub Actions Inactive

coderabbitai bot reviewed Mar 16, 2026

misc/migrations/0008_whitelistuser_api_access_tier.py (outdated, resolved)

rename bot_benchmarking tier to benchmarking

b35ab3c

lsabor had a problem deploying to testing_env March 16, 2026 16:24 — with GitHub Actions Failure

lsabor temporarily deployed to testing_env March 16, 2026 16:25 — with GitHub Actions Inactive

lsabor requested review from elisescu March 16, 2026 16:47

update missing migration change

5fe25e8

lsabor temporarily deployed to testing_env March 16, 2026 17:37 — with GitHub Actions Inactive

lsabor had a problem deploying to testing_env March 21, 2026 21:02 — with GitHub Actions Error

remove unused import

55bec9c

lsabor had a problem deploying to testing_env March 21, 2026 21:03 — with GitHub Actions Failure

remove field from serializer

8cb8ced

lsabor temporarily deployed to testing_env March 21, 2026 21:32 — with GitHub Actions Inactive

elisescu approved these changes Mar 26, 2026

misc/models.py (resolved)

elisescu requested a review from hlbmtc March 26, 2026 09:49

Merge branch 'main' of github.com:Metaculus/metaculus into issue/4466…

2ea870f

…/api-access/endpoint-updates

lsabor had a problem deploying to testing_env March 26, 2026 14:45 — with GitHub Actions Error

add db constraints

80633a5

lsabor had a problem deploying to testing_env March 26, 2026 14:46 — with GitHub Actions Failure

lsabor temporarily deployed to testing_env March 26, 2026 14:46 — with GitHub Actions Inactive

ruff

847db19

lsabor temporarily deployed to testing_env March 26, 2026 14:54 — with GitHub Actions Inactive

hlbmtc reviewed Mar 26, 2026

misc/migrations/0008_whitelistuser_api_access_tier.py (outdated, resolved)

hlbmtc reviewed Mar 26, 2026

users/constants.py (outdated, resolved)

replace table not rename it

05a6af8

lsabor temporarily deployed to testing_env March 26, 2026 18:18 — with GitHub Actions Inactive

resolve nit

2d78f9e

lsabor temporarily deployed to testing_env March 26, 2026 19:07 — with GitHub Actions Inactive

lsabor merged commit 2ecf086 into main Mar 26, 2026
13 of 14 checks passed

lsabor deleted the issue/4466/api-access/endpoint-updates branch March 26, 2026 19:37

4 participants
