fix(#149): replace hardcoded patch count with dynamic positional embedding in CNNT#199

Open
vignathi123-vi wants to merge 2 commits into ML4SCI:main from vignathi123-vi:fix/149-cnnt-dynamic-positional-embedding

Conversation

@vignathi123-vi vignathi123-vi commented Mar 22, 2026

Closes #149
The CNNT model hardcoded num_patches = 16 * 16 in __init__, causing shape-mismatch errors for variable input resolutions.

Changes made:

- Replaced the hardcoded num_patches with max_patches = 256
- Added dynamic slicing in forward() for inputs smaller than max_patches
- Added interpolation in forward() for inputs larger than max_patches

This makes CNNT truly resolution-agnostic across all DeepLense datasets and input sizes. Happy to make adjustments if needed!
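The two code paths described above (slice when the input yields fewer patches than the table, interpolate when it yields more) can be sketched as follows. This is a minimal NumPy illustration of the idea, not the actual CNNT code; the function name `resize_pos_embed`, the embedding dimension, and the use of per-channel linear interpolation are assumptions based on the PR description, and the real model presumably operates on PyTorch tensors.

```python
import numpy as np

MAX_PATCHES = 256  # from the PR: max_patches = 256
EMBED_DIM = 8      # illustrative; the real embedding dim will differ

# Stand-in for the learned positional embedding table, shape (MAX_PATCHES, EMBED_DIM)
rng = np.random.default_rng(0)
pos_embed = rng.standard_normal((MAX_PATCHES, EMBED_DIM))

def resize_pos_embed(table: np.ndarray, num_patches: int) -> np.ndarray:
    """Return a (num_patches, dim) positional table by slicing or interpolating."""
    max_patches, dim = table.shape
    if num_patches <= max_patches:
        # Dynamic slicing: take the first num_patches positions
        return table[:num_patches]
    # Interpolation: resample each embedding channel along the patch axis
    old_x = np.linspace(0.0, 1.0, max_patches)
    new_x = np.linspace(0.0, 1.0, num_patches)
    return np.stack(
        [np.interp(new_x, old_x, table[:, d]) for d in range(dim)], axis=1
    )

small = resize_pos_embed(pos_embed, 64)   # smaller input -> slice
large = resize_pos_embed(pos_embed, 400)  # larger input -> interpolate
print(small.shape, large.shape)  # (64, 8) (400, 8)
```

In a forward() pass, the resized table would simply be added to the patch embeddings for whatever patch count the current input produces, which is what removes the fixed-resolution assumption.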



Development

Successfully merging this pull request may close these issues.

CNNT assumes fixed patch count causing positional embedding mismatch for different input sizes
