Fetch remote FITS headers to provide accurate end times in e-Callisto search results#164
Fetch remote FITS headers to provide accurate end times in e-Callisto search results#164AbhiLohar wants to merge 2 commits intosunpy:mainfrom
Conversation
samaloney
left a comment
There was a problem hiding this comment.
Thanks for the PR. A couple of housekeeping issues: the title isn't very descriptive, and the big O notation isn't useful or meaningful in this context. Please retain and use the PR template which is created when the PR is opened. (You can see it here: https://github.com/sunpy/.github/blob/main/.github/PULL_REQUEST_TEMPLATE.md)
The main problem #60 was describing was in the search phase, not in the when the spectrogram is loaded where we already have as accurate as possible information.
During the search phase, Fido Scraper clients extract metadata from the file name and source-specific details. Most importantly for this issue is the start and end time of the data in the file. For Callisto, the file name doesn't contain this information, so we can either not give an end time as we currently do or assume one. Not having and end time reduce the usefulness of the search.
So the header extraction code you've written would need to be called in the client during the search phase. The headers already contain the DATE-OBS / DATE-END or similar keys which contain this information.
|
Thank you @samaloney sir, for the guidance, I will work upon the requested changes and on the PR and will try to fix it as soon as possible. |
92e4e90 to
f5e6016
Compare
|
Hi @samaloney sir, I have updated my PR as requested, please have a look on this. I am ready for any further changes if required. Thank you! |
|
I haven't forgotten this will have another look at this later this week. |
|
Sure @samaloney sir, i will wait for the review. |
samaloney
left a comment
There was a problem hiding this comment.
Interesting I guess since the tests pass this works but is the End Date include in the returned table?
I'd like to see the output before and with this PR for and also the time it takes before after for a few durations a few hours, a day and a week maybe? ipython's %%timeit should work
| ): | ||
| return False | ||
| return True | ||
| def _fetch_remote_header(self, url): |
There was a problem hiding this comment.
Need a test or two maybe one mocked and one against the real server?
| end = f"{date_end} {time_end}".strip() | ||
|
|
||
| return (start if date_obs else None), (end if date_end else None) | ||
| except Exception: |
There was a problem hiding this comment.
Very broad exception need to narrow down a bit to what you expect
|
|
||
| return (start if date_obs else None), (end if date_end else None) | ||
| except Exception: | ||
| return None, None |
There was a problem hiding this comment.
should probably log or warn the user
78193d6 to
f5e6016
Compare
|
Hello @samaloney sir, I've pushed an update that addresses all of your feedback: Replaced the broad Exception block with specific catches for URLError, ConnectionError, zlib.error, and ValueError.Added logging to warn the user. Sir, please let me know if any further changes are required. Thank you! |

This PR fixes #60 by implementing remote FITS header fetching for the e-Callisto client during the search phase.
Earlier, the 'END TIME' column in Fido.search results was either blank or inaccurate because the filename scraper cannot determine the duration of an observation. I have updated the eCALLISTOClient to use the post_search_hook to perform an HTTP range request, peeking at the first few kilobytes of the remote file to extract the DATE-END and TIME-END keywords.
Key Changes:
Added _fetch_remote_header to handle O(1) data transfer of gzipped FITS headers.
Updated post_search_hook to populate the End Time column with actual metadata.
Checked Start Time is also updated from the header for better precision.
Fixes #60