Skip to content

Fetch remote FITS headers to provide accurate end times in e-Callisto search results#164

Open
AbhiLohar wants to merge 2 commits intosunpy:mainfrom
AbhiLohar:fix-ecallisto-metadata
Open

Fetch remote FITS headers to provide accurate end times in e-Callisto search results#164
AbhiLohar wants to merge 2 commits intosunpy:mainfrom
AbhiLohar:fix-ecallisto-metadata

Conversation

@AbhiLohar
Copy link
Copy Markdown

@AbhiLohar AbhiLohar commented Feb 27, 2026

This PR fixes #60 by implementing remote FITS header fetching for the e-Callisto client during the search phase.

Earlier, the 'END TIME' column in Fido.search results was either blank or inaccurate because the filename scraper cannot determine the duration of an observation. I have updated the eCALLISTOClient to use the post_search_hook to perform an HTTP range request, peeking at the first few kilobytes of the remote file to extract the DATE-END and TIME-END keywords.

Key Changes:
Added _fetch_remote_header to handle O(1) data transfer of gzipped FITS headers.
Updated post_search_hook to populate the End Time column with actual metadata.
Checked Start Time is also updated from the header for better precision.

Fixes #60

Copy link
Copy Markdown
Member

@samaloney samaloney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. A couple of housekeeping issues: the title isn't very descriptive, and the big O notation isn't useful or meaningful in this context. Please retain and use the PR template which is created when the PR is opened. (You can see it here: https://github.com/sunpy/.github/blob/main/.github/PULL_REQUEST_TEMPLATE.md)

The main problem #60 was describing was in the search phase, not in the when the spectrogram is loaded where we already have as accurate as possible information.

During the search phase, Fido Scraper clients extract metadata from the file name and source-specific details. Most importantly for this issue is the start and end time of the data in the file. For Callisto, the file name doesn't contain this information, so we can either not give an end time as we currently do or assume one. Not having and end time reduce the usefulness of the search.

So the header extraction code you've written would need to be called in the client during the search phase. The headers already contain the DATE-OBS / DATE-END or similar keys which contain this information.

@AbhiLohar
Copy link
Copy Markdown
Author

Thank you @samaloney sir, for the guidance, I will work upon the requested changes and on the PR and will try to fix it as soon as possible.

@AbhiLohar AbhiLohar changed the title Implement O(1) metadata resolution for e-Callisto Fetch remote FITS headers to provide accurate end times in e-Callisto search results Feb 27, 2026
@AbhiLohar AbhiLohar force-pushed the fix-ecallisto-metadata branch 2 times, most recently from 92e4e90 to f5e6016 Compare February 27, 2026 18:01
@AbhiLohar
Copy link
Copy Markdown
Author

Hi @samaloney sir, I have updated my PR as requested, please have a look on this. I am ready for any further changes if required.

Thank you!

@AbhiLohar AbhiLohar requested a review from samaloney March 2, 2026 16:40
@samaloney
Copy link
Copy Markdown
Member

I haven't forgotten this will have another look at this later this week.

@AbhiLohar
Copy link
Copy Markdown
Author

Sure @samaloney sir, i will wait for the review.

Copy link
Copy Markdown
Member

@samaloney samaloney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting I guess since the tests pass this works but is the End Date include in the returned table?

I'd like to see the output before and with this PR for and also the time it takes before after for a few durations a few hours, a day and a week maybe? ipython's %%timeit should work

):
return False
return True
def _fetch_remote_header(self, url):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a test or two maybe one mocked and one against the real server?

end = f"{date_end} {time_end}".strip()

return (start if date_obs else None), (end if date_end else None)
except Exception:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very broad exception need to narrow down a bit to what you expect


return (start if date_obs else None), (end if date_end else None)
except Exception:
return None, None
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably log or warn the user

@AbhiLohar AbhiLohar force-pushed the fix-ecallisto-metadata branch from 78193d6 to f5e6016 Compare March 30, 2026 06:00
@AbhiLohar
Copy link
Copy Markdown
Author

Hello @samaloney sir, I've pushed an update that addresses all of your feedback:

Replaced the broad Exception block with specific catches for URLError, ConnectionError, zlib.error, and ValueError.Added logging to warn the user.
Added a new test suite (test_ecallisto.py) containing:
i. A Mock Test using unittest.mock to validate the Range request, gzip decompression, and FITS parsing offline. And also did a real server test.
I am inserting the test result which i ran locally:
Screenshot 2026-03-30 124256

Sir, please let me know if any further changes are required.

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

eCallisto archive frequency, time and other information

2 participants