HST_Notebooks: HST Data Reduction Pipeline
- HST_Notebooks is a suite of Jupyter notebooks demonstrating an end-to-end reduction workflow for HST/WFC3 exoplanet transit data.
- The pipeline employs methods like atlas-based background subtraction, cosmic ray correction, and polynomial spectral trace fitting for accurate light curve generation.
- Its modular design supports reproducible, open-access research crucial for planetary transmission spectroscopy and broader exoplanet atmospheric studies.
The HST_Notebooks GitHub repository, maintained by the Space Telescope Science Institute, provides a suite of Jupyter notebooks demonstrating methods for the reduction and analysis of astronomical datasets from the Hubble Space Telescope (HST). One of its prominent workflows, detailed in Alam et al. “Analyzing Exoplanet Transits Observed with the WFC3/UVIS G280 Grism” (Alam et al., 12 Nov 2025), implements a complete, modular reduction pipeline for time-series exoplanet transit observations with the WFC3/UVIS G280 grism. The notebook, available at https://github.com/spacetelescope/hst_notebooks/tree/main/notebooks/WFC3/uvis_g280_transit, illustrates every step necessary to derive normalized broadband and spectroscopic light curves suitable for planetary transmission spectroscopy.
1. Objectives and Repository Structure
The central objective of the G280 transit notebook is to present an end-to-end data reduction workflow, converting calibrated (FLT) HST/WFC3 exposures taken with the G280 grism into science-ready light curves. This supports limb-darkened transit fitting and planetary transmission spectrum analysis. The structure of the example directory under notebooks/WFC3/uvis_g280_transit/ highlights the modularity:
| File/Directory | Contents |
|---|---|
| requirements.txt | List of pinned Python dependencies (astroquery, numpy, scipy, astropy, matplotlib, etc.) |
| g280_transit_tools.py | Modular helper routines for each processing stage |
| G280_Exoplanet_Transits.ipynb | The main tutorial and workflow notebook |
The repository is organized to facilitate both didactic review and modular re-use in research pipelines.
2. Computational Environment and Data Acquisition
Environment preparation is achieved by cloning the repository and installing all required Python packages via the provided requirements.txt. Dependencies include, but are not limited to, astroquery, numpy, scipy, astropy, matplotlib, GRISMCONF, and wfc3tools. Data retrieval leverages Astroquery for seamless, programmatic access to HST/MAST, specifically downloading G280-calibrated (FLT) files for targets such as HAT-P-41b.
Example setup:
```bash
git clone https://github.com/spacetelescope/hst_notebooks.git
cd hst_notebooks/notebooks/WFC3/uvis_g280_transit
pip install -r requirements.txt
```
The downloaded files populate a local data/flt/ directory, with subsequent processing outputs organized under data/flt_clean/ (post background/cosmic-ray cleaning) and data/flt_full/ (full-frame embedded images). The workflow also requires users to acquire G280 sky frames for both the UVIS1 and UVIS2 detectors from the STScI grism-resources site.
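As a minimal sketch of the retrieval step, the query below uses Astroquery's MAST interface; the target name, query criteria, and download directory are illustrative assumptions, not necessarily the notebook's exact call:

```python
from astroquery.mast import Observations

# Query MAST for HST/WFC3 UVIS observations of HAT-P-41 taken with the
# G280 grism. Criteria values here are illustrative.
obs = Observations.query_criteria(target_name="HAT-P-41",
                                  obs_collection="HST",
                                  instrument_name="WFC3/UVIS",
                                  filters="G280")

# Fetch the product list and keep only the calibrated FLT files.
products = Observations.get_product_list(obs)
flt = Observations.filter_products(products,
                                   productSubGroupDescription="FLT")
Observations.download_products(flt, download_dir="data")
```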
3. Data Reduction Methodology: Preprocessing and Cleaning
This workflow emphasizes accurate photometric recovery and outlier mitigation through several preprocessing steps.
Background Subtraction
Rather than utilizing local histogram-based approaches, the pipeline applies a median-stacked, source-masked G280 sky image, scaled to each exposure's median flux. For each exposure $I$:
- Compute the exposure median $m = \mathrm{median}(I)$.
- Compute the scaled sky frame $S = m \cdot A / \mathrm{median}(A)$, where $A$ is the atlas sky image.
- Subtract: $I_{\mathrm{clean}} = I - S$.
This atlas approach is intended to provide superior precision to local pixel-based background modeling.
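A minimal sketch of the scaling and subtraction, assuming NumPy arrays for the exposure and atlas sky image (the function name and optional source-mask argument are illustrative, not the notebook's actual helper):

```python
import numpy as np

def subtract_sky(image, sky_atlas, source_mask=None):
    """Scale the master G280 sky image to the exposure's median level and
    subtract it. Sketch only; the notebook's routine may differ in detail."""
    # Exclude masked (source) pixels from the median estimates if a mask is given.
    good = np.ones(image.shape, dtype=bool) if source_mask is None else ~source_mask
    scale = np.median(image[good]) / np.median(sky_atlas[good])
    return image - scale * sky_atlas
```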
Cosmic Ray Correction
Cosmic ray (CR) correction is implemented as a two-stage process:
- Temporal Correction: For each pixel position $(x, y)$, if $\lvert I_t(x, y) - \mathrm{median}_t\, I(x, y) \rvert$ exceeds a sigma threshold computed along the time axis, replace the value with the cube median at that position.
- Spatial Correction: For individual frames, apply a median filter and replace any pixel deviating from its local neighborhood by more than a chosen sigma threshold.
Both steps yield cleaned science frames and Boolean masks indicating all pixel replacements.
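The sketch below illustrates one way to implement the two-stage cleaning with NumPy and SciPy; the sigma thresholds and filter size are assumed values, not the notebook's defaults:

```python
import numpy as np
from scipy.ndimage import median_filter

def clean_cosmic_rays(cube, n_sigma_time=5.0, n_sigma_space=5.0, size=5):
    """Two-stage cosmic-ray cleaning sketch.

    Stage 1 (temporal): flag pixels deviating from the time-axis median by
    more than n_sigma_time standard deviations; replace with the cube median.
    Stage 2 (spatial): median-filter each frame and replace pixels that
    deviate strongly from their local neighborhood.
    Thresholds are illustrative, not the notebook's exact values.
    """
    med_t = np.median(cube, axis=0)
    std_t = np.std(cube, axis=0)
    mask_t = np.abs(cube - med_t) > n_sigma_time * std_t
    cleaned = np.where(mask_t, med_t, cube)

    mask_s = np.zeros(cube.shape, dtype=bool)
    for i in range(cleaned.shape[0]):
        local = median_filter(cleaned[i], size=size)
        resid = cleaned[i] - local
        bad = np.abs(resid) > n_sigma_space * np.std(resid)
        cleaned[i] = np.where(bad, local, cleaned[i])
        mask_s[i] = bad

    # Return cleaned frames plus a Boolean mask of every replaced pixel.
    return cleaned, mask_t | mask_s
```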
4. Astrometric Alignment and Spectral Trace Analysis
The G280 grism data are acquired as subarray images, necessitating their embedding into full UVIS2 detector frames (using header keywords NAXIS, LTV1, LTV2). This reconstitution enables robust trace and dispersion mapping.
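A sketch of the embedding step, assuming the standard WFC3 convention that LTV1/LTV2 record the negative of the subarray corner in physical pixels and a 2051 × 4096 pixel calibrated UVIS chip (keyword handling in the notebook may differ):

```python
import numpy as np
from astropy.io import fits

def embed_subarray(flt_path):
    """Place a UVIS subarray SCI image into a zero-padded full-chip frame
    using the LTV1/LTV2 header offsets. Sketch under assumed conventions."""
    with fits.open(flt_path) as hdul:
        sci = hdul["SCI", 1].data
        hdr = hdul["SCI", 1].header
        x0 = int(-hdr["LTV1"])  # subarray corner column in full-frame pixels
        y0 = int(-hdr["LTV2"])  # subarray corner row in full-frame pixels
    full = np.zeros((2051, 4096), dtype=sci.dtype)
    ny, nx = sci.shape
    full[y0:y0 + ny, x0:x0 + nx] = sci
    return full
```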
Spectral trace fitting leverages polynomial coefficients supplied by the GRISMCONF reference files (typically of the form $y(x) = \sum_i a_i x^i$ and $\lambda(x) = \sum_i b_i x^i$), yielding per-frame arrays of trace coordinates and wavelength solutions. The procedures produce both the pixel-wise trace and the calibration arrays: $x$, $y_{\mathrm{trace}}(x)$, $\lambda(x)$ (wavelength per pixel), and a sensitivity array for count-to-flux conversion.
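A brief GRISMCONF usage sketch for the +1 order; the configuration file path and source position below are placeholder assumptions:

```python
import numpy as np
import grismconf

# Load a G280 grism configuration (file name is illustrative).
C = grismconf.Config("WFC3_G280_CCD2.conf")

x0, y0 = 1024.0, 1024.0        # assumed direct-image source position
t = np.linspace(0, 1, 500)     # normalized parameter along the trace

dx = C.DISPX("+1", x0, y0, t)  # x offsets of the +1 order trace
dy = C.DISPY("+1", x0, y0, t)  # y offsets (the trace shape)
wav = C.DISPL("+1", x0, y0, t) # wavelength at each trace point

x_trace = x0 + dx
y_trace = y0 + dy
```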
5. Spectral Extraction, Time-series Construction, and Light Curve Generation
1D Spectral Extraction
For each full-frame exposure and spectral order, a 1D spectrum is extracted by summing pixels along the cross-dispersion ($y$) direction for each column $x$ lying on the trace:

$$F(x) = \sum_{y = y_{\mathrm{trace}}(x) - a}^{y_{\mathrm{trace}}(x) + a} I(x, y),$$

where $a$ defines the half-aperture (e.g., $a = 10$ pixels).
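A minimal box-extraction sketch consistent with the sum above (function and variable names are illustrative):

```python
import numpy as np

def extract_1d(image, x_trace, y_trace, half_aperture=10):
    """Box-sum extraction: for each column on the trace, sum a fixed
    cross-dispersion aperture of +/- half_aperture pixels."""
    flux = np.empty(len(x_trace))
    for i in range(len(x_trace)):
        x = int(round(x_trace[i]))
        yc = int(round(y_trace[i]))
        flux[i] = image[yc - half_aperture : yc + half_aperture + 1, x].sum()
    return flux
```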
Time-series and Light Curve Formation
Looping the extraction process over all cleaned images yields a 2D flux array $F(t, \lambda)$ indexed by time and wavelength. Broadband (“white-light”) light curves are formed by summing fluxes over a specified wavelength range $[\lambda_1, \lambda_2]$ and normalizing to the out-of-transit baseline:

$$F_{\mathrm{white}}(t) = \frac{\sum_{\lambda_1 \le \lambda \le \lambda_2} F(t, \lambda)}{\mathrm{median}_{t \in \mathrm{OOT}} \sum_{\lambda_1 \le \lambda \le \lambda_2} F(t, \lambda)}.$$
Spectroscopic light curves result from repeating this operation in user-specified wavelength bins, producing a set of time series suitable for wavelength-resolved transit depth analysis.
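A sketch of the binning and normalization step, assuming a time-by-wavelength flux array and a Boolean out-of-transit mask (all names are illustrative):

```python
import numpy as np

def bin_light_curve(flux_2d, wavelength, wmin, wmax, oot_mask):
    """Sum the (time, wavelength) flux array over one wavelength bin and
    normalize to the out-of-transit (OOT) median."""
    in_bin = (wavelength >= wmin) & (wavelength < wmax)
    lc = flux_2d[:, in_bin].sum(axis=1)
    return lc / np.median(lc[oot_mask])
```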
6. Data Export, Integration with Public Tools, and Downstream Analysis
The notebook provides routines to export both broadband and spectroscopic light curves as CSV or ASCII tables:
```python
import numpy as np

np.savetxt('white_light_curve_+1.csv',
           np.column_stack([wlc_times, wlc_flux]),
           header='time_MJD,flux_norm')
```
These outputs are structured for direct ingestion by public transit-fitting tools:
- PacMan: ASCII tables of time, flux, and uncertainty per channel.
- WFC3 pipeline: Similarly structured inputs.
- Eureka!: Dataframes or CSVs.
The notebook explicitly concludes at the delivery of normalized light curves. Users are encouraged to perform transit light curve fitting (e.g., extracting $R_p/R_s$ in each wavelength bin) using packages appropriate to their modeling needs, such as those implementing the Mandel & Agol (2002) quadratic limb darkening formalism:

$$F(t) = 1 - \lambda(p, z(t)),$$

where $p = R_p/R_s$ and $\lambda(p, z)$ gives the occulted area fraction as a function of the normalized planet–star separation $z$.
Compilation of $R_p/R_s$ across wavelength bins forms the basis for deriving the planetary transmission spectrum.
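For example, the batman package implements the Mandel & Agol formalism; the sketch below evaluates a quadratic limb-darkened transit model with purely illustrative parameter values (the notebook itself does not prescribe a fitting package):

```python
import numpy as np
import batman

# Illustrative parameter values only, not fitted HAT-P-41b results.
params = batman.TransitParams()
params.t0 = 0.0              # mid-transit time [days]
params.per = 2.69            # orbital period [days]
params.rp = 0.10             # Rp/Rs for this wavelength bin
params.a = 5.4               # semi-major axis in stellar radii (a/Rs)
params.inc = 87.0            # orbital inclination [deg]
params.ecc = 0.0             # eccentricity
params.w = 90.0              # argument of periastron [deg]
params.u = [0.3, 0.2]        # quadratic limb-darkening coefficients
params.limb_dark = "quadratic"

t = np.linspace(-0.1, 0.1, 500)  # time from mid-transit [days]
model_flux = batman.TransitModel(params, t).light_curve(params)
```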
7. Context, Adaptability, and Significance
The described pipeline is generalizable to any HST/WFC3 G280 time series where the source is observed in the 0th, 1st, and higher spectral orders. The modular design of g280_transit_tools.py enables adaptation to alternate targets, aperture definitions, and spectral orders. A plausible implication is that these scripts may serve as a template for extended analyses of UV-optical exoplanet atmospheres observed with HST/WFC3/UVIS G280, supporting rapid data reduction reproducibility and facilitating comparison with previous analyses.
The HST_Notebooks repository thus provides not only a canonical reduction solution but a model for open-access, reproducible, and extensible time-domain spectrophotometric workflows in exoplanetary astronomy.