SP-2501: NB 300 adapt the DP0.2 anomalies tutorial for DP1 #51
Open
Hi @plazas, Thanks for your patience with my review. This is an informative, instructive, clear, and well-written notebook. The comments below are from reading the notebook; I haven't run it yet, but I will do that soon.
0306590 to 2b778ce
1bceac9 to 2f01be4
85964de to 4abd2dc
Update notebook
…ion with isolation forests
- Rename notebook from Anomaly_detection to Outlier_rejection_isolation_forests
- Add definitions of outliers, anomalies, and novelties
- Add citations to Liu et al. 2008 (Isolation Forest), Malanchev et al. 2021 (ZTF)
- Add context about ZTF anomaly detection pipelines
- Mention other techniques (OCSVM, autoencoders, LOF) with citations
- Retarget notebook as pedagogical tool for understanding ML with different dimensions
- Limit data to bright but not-too-bright stars (r_scienceFluxMean 1e4-5e5 nJy)
- Expand exercises section with feature exploration tasks
- Reference tutorials 207.1 and 207.2 for feature descriptions
- Follow RTN-045 guidelines for notebook structure
aafe76d to ee2500d
…tutorial
Major changes:
- Run Isolation Forest with 2, 3, 4, 5, and 6 features progressively
- Add comparison visualizations showing how outliers change with dimensionality
- Create 6 subplots comparing score distributions across feature sets
- Add overlap analysis showing which outliers are consistent across dimensions
- Plot outliers in feature space (psfFluxMean vs psfFluxSigma and psfFluxLinearSlope)
- Update to use 6D outliers for light curve visualization
- Add citation to arXiv:2510.23702 for outlier/anomaly definitions
- Expand feature list to include Chi2, MAD, and Skew
- Shorten exercises section to 4 focused tasks
- Update section numbering and structure
- Run pre-commit (nbstripout)
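For readers of this thread, the progressive-dimensionality run and the overlap analysis listed in the commit above could look roughly like the sketch below. It is a minimal illustration and not the notebook's code. It assumes scikit-learn's `IsolationForest`, uses synthetic Gaussian data in place of the DP1 object table, and treats the feature order, the `contamination` value, and the helper name `find_outliers` as hypothetical choices.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic stand-in for the object table (assumption: the notebook
# retrieves real DP1 measurements instead).
rng = np.random.default_rng(0)
n_objects = 500

# Labels borrowed from the commit message; the order in which features
# are added is an assumption made for this sketch.
feature_names = [
    "psfFluxMean", "psfFluxSigma", "psfFluxLinearSlope",
    "Chi2", "MAD", "Skew",
]
X = rng.normal(size=(n_objects, len(feature_names)))
X[:5] += 8.0  # shift five objects far from the bulk in every feature


def find_outliers(X_sub, contamination=0.02, seed=0):
    """Return the row indices that an Isolation Forest labels as outliers."""
    forest = IsolationForest(
        n_estimators=100, contamination=contamination, random_state=seed
    )
    labels = forest.fit_predict(X_sub)  # -1 marks outliers, +1 marks inliers
    return set(np.flatnonzero(labels == -1).tolist())


# Fit one forest per feature subset: the first 2 columns, then 3, up to 6.
outliers_by_dim = {
    n: find_outliers(X[:, :n]) for n in range(2, len(feature_names) + 1)
}

# Overlap analysis: objects flagged at every dimensionality.
consistent = set.intersection(*outliers_by_dim.values())

for n, flagged in outliers_by_dim.items():
    print(f"{n}D ({', '.join(feature_names[:n])}): {len(flagged)} outliers")
print(f"Flagged at every dimensionality: {sorted(consistent)}")
```

Fixing `random_state` keeps the comparison repeatable between runs, so differences between the sets come from the added features and not from the random tree construction.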
- Add 7th feature (r_psfFluxPercentile95) for 7D analysis
- Update all figures to 7 panels (3x3 grid with 2 empty)
- Remove 'we' usage throughout (use passive voice)
- Format markdown cells: one sentence per line
- Remove all comments from Python code cells
- Add proper figure captions after each figure following RTN-045
- Add SNAD collaboration citation (Ishida et al. 2021)
- Add Section 6: Next steps after outlier identification
- Describe ZTF workflow for validation
- Mention visual inspection, cross-matching, follow-up
- Cite SNAD/Malanchev et al. as example
- Update section numbering (exercises now Section 7)
- Update all references to use 7D outliers for light curves
- Fix typo: bulter -> butler, pallettes -> palettes
- Run pre-commit (nbstripout applied)
No description provided.