Providing datasets to measurements through blobs¶
A common pattern is to reduce a raw dataset and share that dataset between several measurements.
lsst.validate.base
expresses such datasets as blobs.
In the context of the lsst.validate.base
framework, a blob is an object that contains Datum
objects.
There are several advantages of storing input datasets in blob objects:
- Any pre-reduction of a raw dataset can be done in the blob object, keeping a codebase organized.
- Blobs can be passed to measurement objects, which simplifies the construction of measurements.
- Blobs are automatically serialized alongside measurements and are available to the SQUASH dashboard. Blobs can be shared among several measurements, with the blob data only being stored once.
Template for a blob class¶
Blobs are subclasses of BlobBase
that register one or more Datum
objects.
import astropy.units as u
from lsst.validate.base import BlobBase
class SimpleBlob(BlobBase):
name = 'SimpleBlob'
def __init__(self, g_mags, i_mags):
BlobBase.__init__(self)
self.register_datum('g', quantity=g_mags*u.mag, description='g-band magnitudes')
self.register_datum('i', quantity=i_mags*u.mag, description='i-band magnitudes')
self.register_datum('gi', description='g-i colour')
self.gi = self.g - self.i
name
is a required attribute for BlobBase
subclasses.
This name identifies the blob in the JSON output.
In this example, the g
and i
attributes are initially registered with quantities.
A third blob attribute, gi
, is also declared and its quantity is computed afterwards.
Notice that, like MeasurementBase.parameters
and MeasurementBase.extras
attributes of measurement classes, quantities contained in BlobBase
-type objects can be accessed and updated directly through instance attributes.
Accessing datum objects¶
Internally, blob attributes are stored as Datum
objects that can be accessed as items of the BlobBase.datums
attribute.
blob = SimpleBlob(g, i)
blob.datums['gi'].quantity # == blob.gi
blob.datums['gi'].unit # u.Unit('mag')
blob.datums['gi'].label # 'gi', this was automatically set from the name
blob.datums['gi'].description # 'g-i colour'
Linking measurements to blobs¶
When a blob is used by a measurement, the measurement class should declare that usage so that the SQUASH dashboard can provide rich context to measurements. Measurement classes can accomplish this simply by making the blob an instance attribute. For example:
class MeanColor(MeasurementBase):
def __init__(self, simple_blob):
self.metric = Metric.from_yaml(self.label)
self.simple_blob = simple_blob
self.quantity = np.mean(self.simple_blob.gi)
Accessing blobs in measurements¶
In addition to simply accessing blobs associated with a measurement through the instance attribute, blobs are also available as items of the measurement’s MeasurementBase.blobs
attribute:
color = SimpleBlob(g, i)
mean_color = MeanColor(color)
mean_color.blobs['simple_blob'].gi # array of g-i colours