
Getsources: Multi-Scale Source Extraction

Updated 8 September 2025
  • Getsources is a multi-scale, multi-wavelength extraction tool that decomposes images into single-scale components to isolate astrophysical sources from complex, structured backgrounds.
  • It integrates cross-band normalization, iterative noise filtering, and segmentation-based deblending to achieve robust photometry and source completeness in crowded fields.
  • Validated on Herschel and simulated data, the algorithm enhances core catalog generation, supports accurate star formation studies, and improves statistical reliability in astronomical imaging.

The getsources algorithm is an automated, multi-scale, multi-wavelength source extraction tool originally developed for the analysis of far-infrared Galactic star-forming region data from the Herschel Space Observatory. Its primary innovation lies in decomposing observed images into many finely spaced single-scale images to disentangle astrophysical source signals from structured filamentary backgrounds and noise. Unlike conventional single-threshold algorithms, getsources combines fine spatial decomposition, cross-band integration, noise filtering, segmentation-mask tracking, and iterative deblending with spatially realistic background estimation to achieve robust photometry and completeness in crowded and variable fields.

1. Overview and Conceptual Foundations

Getsources was designed to address the challenges of extracting compact and extended sources such as prestellar cores, protostars, and filaments in Herschel maps where backgrounds are highly non-Gaussian and beam sizes vary significantly between bands. The algorithm first resamples and aligns all bands to a common grid. It then produces a set of single-scale images (typically 50–100 per band), where each image isolates fluctuations on a narrow spatial frequency range using a “successive unsharp masking” process:

$$\mathcal{I}^D_{l,j} = \mathcal{G}_{j-1} * \mathcal{I}^D_l - \mathcal{G}_j * \mathcal{I}^D_l$$

with $\mathcal{G}_j$ denoting Gaussian smoothing kernels of increasing FWHM $S_j$.

At each spatial scale, the algorithm preserves structures with sizes $\sim S_j$ and filters out both larger and smaller scales. This approach yields a semi-Gaussian distribution of noise and background in each single-scale image, greatly facilitating robust thresholding.
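The decomposition above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the getsources implementation: the function names, the FFT-based smoothing with periodic edges, and the choice to treat $\mathcal{G}_0$ as the identity (no smoothing) are all assumptions made here for brevity.

```python
import numpy as np

def gaussian_smooth(image, fwhm):
    """Smooth with a Gaussian of the given FWHM (pixels) via FFT.

    Assumption: periodic boundaries, which is only adequate for a sketch.
    """
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    ky = np.fft.fftfreq(image.shape[0])[:, None]
    kx = np.fft.fftfreq(image.shape[1])[None, :]
    # Fourier transform of a unit-area Gaussian (frequencies in cycles/pixel).
    transfer = np.exp(-2.0 * np.pi**2 * sigma**2 * (kx**2 + ky**2))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * transfer))

def single_scale_images(image, fwhms):
    """Successive unsharp masking: G_{j-1} * I - G_j * I for increasing FWHMs.

    Returns the list of single-scale images and the residual smoothed at the
    largest scale. G_0 is taken to be the identity (an assumption).
    """
    image = np.asarray(image, dtype=float)
    scales = []
    previous = image
    for fwhm in fwhms:
        current = gaussian_smooth(image, fwhm)
        scales.append(previous - current)
        previous = current
    return scales, previous

# Hypothetical usage: nine geometrically spaced scales between 2 and 32 pixels.
rng = np.random.default_rng(0)
demo = rng.normal(size=(64, 64))
demo_scales, demo_residual = single_scale_images(demo, np.geomspace(2.0, 32.0, 9))
```

Because the differences telescope, the single-scale images plus the final smoothed residual add back up to the input image, which is a convenient sanity check on any implementation of this step.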

2. Multi-Scale, Multi-Wavelength Integration

Getsources distinguishes itself by constructing wavelength-independent combined detection images for each scale, essential when beam sizes differ by up to $\sim 7$-fold. Each band’s cleaned single-scale image is normalized and weighted according to:

$$\mathcal{I}^{Dc}_j = \frac{1}{N} \sum_l \frac{f_{l,j}}{\varpi_{l,j}} \max\left(\mathcal{I}^{Dc}_{l,j}, T_{l,j}\right)$$

where $\varpi_{l,j}$ is the cleaning threshold and $f_{l,j}$ accounts for spatial resolution cutoff effects. By summing over wavelengths at each scale after normalization, getsources localizes sources at resolutions limited only by the shortest-wavelength data while retaining sensitivity gains from longer wavelength bands.
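As a rough illustration of the combination formula, the following sketch applies it to a stack of cleaned single-scale images at one scale. The function name and argument layout are hypothetical, and since the text does not spell out how the lower bound $T_{l,j}$ is chosen, it is simply passed in as a per-band number.

```python
import numpy as np

def combine_bands(cleaned, thresholds, weights, floors):
    """Combine N cleaned single-scale images (one per band) at a fixed scale j.

    cleaned    : array of shape (N, ny, nx), the cleaned images per band
    thresholds : length-N cleaning thresholds (varpi_{l,j})
    weights    : length-N resolution weights (f_{l,j})
    floors     : length-N lower bounds (T_{l,j}); their choice is an assumption
    """
    cleaned = np.asarray(cleaned, dtype=float)
    varpi = np.asarray(thresholds, dtype=float)[:, None, None]
    f = np.asarray(weights, dtype=float)[:, None, None]
    floor = np.asarray(floors, dtype=float)[:, None, None]
    # (1/N) * sum over bands of (f / varpi) * max(image, floor)
    return np.mean(f / varpi * np.maximum(cleaned, floor), axis=0)
```

Dividing each band by its own threshold puts all bands on a common, dimensionless significance scale before they are averaged, which is what makes the result independent of wavelength.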

3. Noise and Background Filtering

In each single-scale image, noise and background fluctuations are iteratively “Gaussianized” via sigma-clipping. The algorithm estimates the threshold as:

$$\varpi_{l,j} = n_{l,j} \cdot \sigma_{l,j}$$

where $\sigma_{l,j}$ is computed outside detected peaks. Higher-order statistics (skewness $s_{l,j}$, kurtosis $k_{l,j}$) are monitored and empirical constraints are imposed, such as $s^{\max}_l = k^{\max}_l = \max\{2.14 \ln[(I^{D,\max}_l/\sigma_l)+220]-11.3,\ 0.25\}$, to prevent excessive false positives or negatives. Iterative masking and recomputation of $\sigma$ continue until convergence, facilitating effective thresholding within complex backgrounds.
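A minimal sketch of the iterative thresholding is given below. It is a simplification rather than the published procedure: pixels are masked purely by amplitude instead of by detected peaks, the image is assumed to have zero mean (as single-scale images approximately do), and the function names, the default $n = 3$, and the stopping rule are assumptions. The second function evaluates the empirical skewness and kurtosis limit quoted above.

```python
import math
import numpy as np

def clipping_threshold(image, n_sigma=3.0, max_iter=50, rtol=1e-4):
    """Iteratively estimate sigma outside bright pixels; return (varpi, sigma)."""
    data = np.asarray(image, dtype=float).ravel()
    sigma = math.sqrt(np.mean(data**2))       # rms about zero
    for _ in range(max_iter):
        keep = np.abs(data) < n_sigma * sigma  # mask pixels above the threshold
        if not keep.any():
            break
        new_sigma = math.sqrt(np.mean(data[keep] ** 2))
        converged = abs(new_sigma - sigma) <= rtol * sigma
        sigma = new_sigma
        if converged:
            break
    return n_sigma * sigma, sigma

def moment_limit(peak, sigma):
    """Empirical upper limit on skewness and kurtosis from the text."""
    return max(2.14 * math.log(peak / sigma + 220.0) - 11.3, 0.25)
```

For an image dominated by noise the logarithmic term falls just below 0.25, so the limit stays at its floor; it only relaxes when the brightest pixel is many times the noise level.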

4. Detection, Segmentation, and Measurement

Source detection is accomplished on the combined, cleaned, wavelength-independent detection images. A connected-component labeling (“Tint Fill”) identifies contiguous 4-connected groups of significant pixels, and segmentation masks are tracked over increasing scales to capture the growth and merging of sources. The optimal “footprinting scale” $j_F$ is determined for each source, i.e., the scale where its contrast is maximized.
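The 4-connected labeling step can be illustrated with a generic breadth-first flood fill. This is a textbook version offered as a sketch; it is not taken from the getsources code, and the function name is hypothetical.

```python
from collections import deque
import numpy as np

def label_4connected(mask):
    """Label contiguous groups of True pixels using 4-connectivity.

    Returns (labels, count): labels is 0 for background and 1..count for
    each connected group. Diagonal neighbours are NOT considered connected.
    """
    mask = np.asarray(mask, dtype=bool)
    ny, nx = mask.shape
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for y0 in range(ny):
        for x0 in range(nx):
            if not mask[y0, x0] or labels[y0, x0]:
                continue
            count += 1
            labels[y0, x0] = count
            queue = deque([(y0, x0)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    yy, xx = y + dy, x + dx
                    inside = 0 <= yy < ny and 0 <= xx < nx
                    if inside and mask[yy, xx] and not labels[yy, xx]:
                        labels[yy, xx] = count
                        queue.append((yy, xx))
    return labels, count
```

Running such a labeling on the thresholded detection image at each successive scale, and comparing the label maps between scales, is one way to follow how segmentation masks grow and merge.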

Measurements are then made on background-subtracted originals within elliptical footprints of size $\sim 2.3\, S_{j_F}$. All photometric quantities (peak and integrated fluxes, sizes, elongations, orientation) are computed via moment analysis:

$$E(x) = \frac{\int x\, I(x,y)\, d^2 r}{\int I(x,y)\, d^2 r}$$

Second moments yield major/minor axis FWHMs and position angles. Detection significance is defined by:

$$\Xi_{(i),l} = \frac{I_{(i),l}(j_F)}{\sigma_l(j_F)}$$
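The moment analysis can be sketched as follows for a single background-subtracted cutout. Several choices here are assumptions rather than details from the text: the conversion from second moments to FWHM uses the Gaussian factor $2\sqrt{2\ln 2}$, the angle is measured from the array's $+x$ axis toward $+y$, and the function name and returned keys are hypothetical.

```python
import numpy as np

def moment_measurements(image):
    """Intensity-weighted centroid, FWHM sizes and orientation of one source."""
    img = np.asarray(image, dtype=float)
    y, x = np.indices(img.shape)
    total = img.sum()
    xc = (x * img).sum() / total          # first moments E(x), E(y)
    yc = (y * img).sum() / total
    dx, dy = x - xc, y - yc
    cxx = (dx * dx * img).sum() / total   # central second moments
    cyy = (dy * dy * img).sum() / total
    cxy = (dx * dy * img).sum() / total
    # Eigenvalues of the 2x2 moment matrix give variances along the axes.
    half_trace = 0.5 * (cxx + cyy)
    root = np.sqrt(0.25 * (cxx - cyy) ** 2 + cxy**2)
    to_fwhm = 2.0 * np.sqrt(2.0 * np.log(2.0))  # Gaussian assumption
    return {
        "x": xc,
        "y": yc,
        "flux": total,
        "fwhm_major": to_fwhm * np.sqrt(half_trace + root),
        "fwhm_minor": to_fwhm * np.sqrt(max(half_trace - root, 0.0)),
        "angle": 0.5 * np.arctan2(2.0 * cxy, cxx - cyy),  # radians
    }
```

The detection significance then follows by dividing the source's peak intensity at the footprinting scale by the noise level $\sigma_l(j_F)$ at that scale.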

5. Iterative Deblending and Realistic Background Estimation

Deblending in crowded regions is addressed via iterative allocation of background-subtracted intensity according to empirically chosen Moffat-like profiles:

$$I^{M}_{(i),l} = F^P_{(i),l} \left[1 + (r/R_0)^2\right]^{-\zeta}$$

(usually $\zeta = 10$). For pixels included in overlapping footprints, each source’s share is proportional to its profile, with:

$$I_{(i),l} = \frac{I^{M}_{(i),l}}{\sum_k I^{M}_{(k),l}} \times I^{O,BS}_l$$

(repeated until convergence).
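A simplified sketch of this proportional splitting is shown below. It departs from the description in several labeled ways: profiles are circular with a single shared $R_0$, source centres are integer pixel positions, peaks are re-estimated from the allocated intensity at each centre, and a fixed number of passes stands in for a convergence test. All names are hypothetical.

```python
import numpy as np

def moffat(shape, xc, yc, peak, r0, zeta=10.0):
    """Circular Moffat-like profile: peak * [1 + (r/r0)^2]^(-zeta)."""
    y, x = np.indices(shape)
    r2 = (x - xc) ** 2 + (y - yc) ** 2
    return peak * (1.0 + r2 / r0**2) ** (-zeta)

def deblend(image_bs, centers, r0, zeta=10.0, n_iter=30):
    """Split a background-subtracted image among overlapping sources.

    centers : list of integer (xc, yc) pixel positions (assumption)
    Returns (allocated, peaks), where allocated has shape (n_sources, ny, nx)
    and the per-source images sum to image_bs. Peaks must stay positive.
    """
    image_bs = np.asarray(image_bs, dtype=float)
    # Initial guess: each peak is the image value at the source centre.
    peaks = np.array([image_bs[yc, xc] for xc, yc in centers], dtype=float)
    for _ in range(n_iter):
        profiles = np.array([
            moffat(image_bs.shape, xc, yc, p, r0, zeta)
            for (xc, yc), p in zip(centers, peaks)
        ])
        # Each source receives a share proportional to its model profile.
        allocated = profiles / profiles.sum(axis=0) * image_bs
        peaks = np.array([allocated[i, yc, xc]
                          for i, (xc, yc) in enumerate(centers)])
    return allocated, peaks
```

With $\zeta = 10$ the profile has a FWHM of $2 R_0 \sqrt{2^{1/\zeta} - 1} \approx 0.54\, R_0$, so two sources a few pixels apart with $R_0 = 10$ pixels are strongly blended; in that situation the first pass overestimates both peaks and later passes correct them.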

Background estimation is performed by linear interpolation from regions just outside the footprint along four principal directions, averaged per-pixel. The interpolated background image $\mathcal{I}^{O}_{CB}$ is subtracted, yielding background-subtracted images $\mathcal{I}^{O}_{BS}$ used for photometry.
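One way to picture the interpolation is the sketch below. It assumes that the four principal directions are the horizontal, the vertical, and the two diagonals, and that each estimate interpolates between the first pixels found outside the footprint on either side; neither detail is specified in the text, and the function names are hypothetical.

```python
import numpy as np

# Assumed principal directions as (dy, dx): horizontal, vertical, two diagonals.
DIRECTIONS = ((0, 1), (1, 0), (1, 1), (1, -1))

def _walk(image, footprint, y, x, dy, dx):
    """Step from (y, x) until leaving the footprint.

    Returns (value, steps) at the first outside pixel, or None if the image
    edge is reached first.
    """
    ny, nx = image.shape
    steps = 0
    while True:
        y, x, steps = y + dy, x + dx, steps + 1
        if not (0 <= y < ny and 0 <= x < nx):
            return None
        if not footprint[y, x]:
            return image[y, x], steps

def interpolate_background(image, footprint):
    """Replace footprint pixels by the mean of directional linear interpolations."""
    image = np.asarray(image, dtype=float)
    footprint = np.asarray(footprint, dtype=bool)
    background = image.copy()
    for y, x in zip(*np.nonzero(footprint)):
        estimates = []
        for dy, dx in DIRECTIONS:
            fwd = _walk(image, footprint, y, x, dy, dx)
            bwd = _walk(image, footprint, y, x, -dy, -dx)
            if fwd is None or bwd is None:
                continue  # this direction runs off the image
            (vf, sf), (vb, sb) = fwd, bwd
            # Linear interpolation weighted by distance to each end point.
            estimates.append((vb * sf + vf * sb) / (sf + sb))
        if estimates:
            background[y, x] = np.mean(estimates)
    return background
```

Subtracting the returned array from the original image gives a background-subtracted image in the sense used above. A background that varies linearly across the footprint is recovered exactly by this scheme, whatever the source inside the footprint looks like.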

6. Performance Validation and Data Products

Getsources was validated on simulated star-forming clouds and actual Herschel data (Aquila, Rosette). It achieves reliable extraction of both extended and compact sources across spatial resolutions and in the presence of highly structured backgrounds. Because bands are combined scale by scale, photometry and source separation can be carried out at the highest spatial resolution available in each band.

Key products include:

  • Large catalogs enumerating source coordinates, fluxes, sizes, position angles, S/N, flags.
  • Cleaned single-scale images per band.
  • Combined (wavelength-independent) detection images; segmentation maps tracking source growth with scale.
  • De-blended source images.
  • Background and noise maps for reliability analysis.

7. Scientific Applications and Implications

Getsources has enabled systematic construction of core catalogs, mass functions, and spatially accurate studies of star-forming regions. Its output catalogs and auxiliary images are now standard for Herschel surveys and core mass function analyses. For instance, getsources results have demonstrated robust similarities between core mass functions and the initial mass function, reinforced the role of filaments in prestellar core and protostar formation, and enabled quantitative completeness and reliability studies necessary for statistical inference.

Its general framework—multi-scale decomposition, multi-band normalization, iterative deblending, and local background estimation—is extensible to other astronomical imaging contexts with variable resolution and intricate backgrounds.


Getsources represents a comprehensive methodological advance for source extraction in astronomical imaging, integrating multi-scale spatial decomposition and multi-wavelength data fusion with rigorous noise modeling, segmentation tracking, and physically motivated measurement. Its robust completeness and photometric reliability are underpinned by quantitative validation, systematic error analysis, and extensive deployment in surveys of star-forming regions (Men'shchikov et al., 2012).
