MultiScene (Experimental)

Scene objects in Satpy represent a single geographic region at a single instant or range of time. This means they are not suited for handling multiple orbits of polar-orbiting satellite data, multiple time steps of geostationary satellite data, or other special data cases. To handle these cases Satpy provides the MultiScene class. The examples below walk through some basic use cases of the MultiScene.

Warning

These features are still early in development and may change over time as more user feedback is received and more features are added.

MultiScene Creation

There are two ways to create a MultiScene: either by manually creating and providing the Scene objects,

>>> from satpy import Scene, MultiScene
>>> from glob import glob
>>> scenes = [
...    Scene(reader='viirs_sdr', filenames=glob('/data/viirs/day_1/*t180*.h5')),
...    Scene(reader='viirs_sdr', filenames=glob('/data/viirs/day_2/*t180*.h5'))
... ]
>>> mscn = MultiScene(scenes)
>>> mscn.load(['I04'])

or by using the MultiScene.from_files class method to create a MultiScene from a series of files. This uses the group_files() utility function to group files by start time or other filename parameters.

>>> from satpy import MultiScene
>>> from glob import glob
>>> mscn = MultiScene.from_files(glob('/data/abi/day_1/*C0[12]*.nc'), reader='abi_l1b')
>>> mscn.load(['C01', 'C02'])

New in version 0.12: The from_files and group_files functions were added in Satpy 0.12. See below for an alternative solution.

For older versions of Satpy we can manually create the Scene objects used. The glob() function and for loops are used to group files into Scene objects that, if used individually, could load the data we want. The code below is equivalent to the from_files code above:

>>> from satpy import Scene, MultiScene
>>> from glob import glob
>>> scene_files = []
>>> for time_step in ['1800', '1810', '1820', '1830']:
...     scene_files.append(glob('/data/abi/day_1/*C0[12]*s???????{}*.nc'.format(time_step)))
>>> scenes = [
...     Scene(reader='abi_l1b', filenames=files) for files in sorted(scene_files)
... ]
>>> mscn = MultiScene(scenes)
>>> mscn.load(['C01', 'C02'])

Blending Scenes in MultiScene

Scenes contained in a MultiScene can be combined in different ways.

Stacking scenes

The code below uses the blend() method of the MultiScene object to stack two separate orbits from a VIIRS sensor. By default, the blend method uses the stack() function, which takes the first dataset as the base of the image and then iteratively overlays the remaining datasets on top.

>>> from satpy import Scene, MultiScene
>>> from glob import glob
>>> from pyresample.geometry import AreaDefinition
>>> my_area = AreaDefinition(...)
>>> scenes = [
...    Scene(reader='viirs_sdr', filenames=glob('/data/viirs/day_1/*t180*.h5')),
...    Scene(reader='viirs_sdr', filenames=glob('/data/viirs/day_2/*t180*.h5'))
... ]
>>> mscn = MultiScene(scenes)
>>> mscn.load(['I04'])
>>> new_mscn = mscn.resample(my_area)
>>> blended_scene = new_mscn.blend()
>>> blended_scene.save_datasets()
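To make the default stacking behavior concrete, here is a minimal numpy sketch of what this overlaying conceptually does (an illustration only, not Satpy's actual stack() implementation):

```python
import numpy as np

def stack_sketch(arrays):
    """Overlay later arrays onto the first wherever they have valid (finite) data."""
    result = arrays[0].copy()
    for arr in arrays[1:]:
        result = np.where(np.isfinite(arr), arr, result)
    return result

# Two toy "scenes" on a shared 2x2 grid; NaN marks pixels outside the swath.
base = np.array([[1.0, np.nan], [3.0, np.nan]])
overlay = np.array([[np.nan, 20.0], [30.0, np.nan]])
stacked = stack_sketch([base, overlay])
# Valid overlay pixels replace the base; pixels invalid in both stay NaN.
```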

Stacking scenes using weights

Instead of simply stacking scenes on top of each other as described above, scenes can also be blended in a more sophisticated manner using per-pixel weighting. This can be useful, for instance, to build a cloud parameter (cover, height, etc.) composite that combines cloud parameters derived from both geostationary and polar-orbiting satellite data close in time and over a given area. This is particularly valuable at high latitudes, where geostationary data quality degrades quickly with latitude and polar-orbiting overpasses are frequent.

This weighted blending can be accomplished via the built-in partial() function (see functools.partial()) and the default stack() function. The stack() function accepts an optional weights argument (None by default), which should be a sequence of arrays of pixel weights, one array per scene being blended.

The code below gives an example of how two cloud scenes can be blended using the satellite zenith angles to weight which pixels to take from each of the two scenes. The idea is that the cloud parameter is more reliable where the satellite zenith angle is small.

>>> from glob import glob
>>> from functools import partial
>>> from satpy import Scene, MultiScene, DataQuery
>>> from satpy.multiscene import stack
>>> from satpy.resample import get_area_def
>>> areaid = get_area_def("myarea")
>>> geo_scene = Scene(filenames=glob('/data/to/nwcsaf/geo/files/*nc'), reader='nwcsaf-geo')
>>> geo_scene.load(['ct'])
>>> polar_scene = Scene(filenames=glob('/data/to/nwcsaf/pps/noaa18/files/*nc'), reader='nwcsaf-pps_nc')
>>> polar_scene.load(['cma', 'ct'])
>>> mscn = MultiScene([geo_scene, polar_scene])
>>> groups = {DataQuery(name='CTY_group'): ['ct']}
>>> mscn.group(groups)
>>> resampled = mscn.resample(areaid, reduce_data=False)
>>> weights = [1./geo_satz, 1./n18_satz]  # precomputed satellite zenith angle arrays
>>> stack_with_weights = partial(stack, weights=weights)
>>> blended = resampled.blend(blend_function=stack_with_weights)
>>> blended.save_dataset('CTY_group', filename='./blended_stack_weighted_geo_polar.nc')
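The geo_satz and n18_satz variables above are assumed to be precomputed arrays of satellite zenith angles, one per scene. As a rough numpy illustration of why 1/zenith-angle weights favor the scene with the better viewing geometry at each pixel (a conceptual sketch, not Satpy's stack() implementation):

```python
import numpy as np

# Hypothetical satellite zenith angles (degrees) for two scenes on the same grid.
geo_satz = np.array([[20.0, 70.0], [40.0, 80.0]])
polar_satz = np.array([[60.0, 30.0], [50.0, 10.0]])

# One weight array per scene; smaller zenith angle -> larger weight.
weights = np.stack([1.0 / geo_satz, 1.0 / polar_satz])  # shape (n_scenes, y, x)
# Toy data: scene 0 is all 1.0, scene 1 is all 2.0, so the result shows
# which scene "won" each pixel.
data = np.stack([np.full((2, 2), 1.0), np.full((2, 2), 2.0)])

# Per pixel, take the value from the scene with the highest weight.
winner = np.argmax(weights, axis=0)
blended = np.take_along_axis(data, winner[np.newaxis], axis=0)[0]
```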

Grouping Similar Datasets

By default, MultiScene only operates on datasets shared by all scenes. Use the group() method to specify groups of datasets that shall be treated equally by MultiScene, even if their names or wavelengths are different.

Example: Stacking scenes from multiple geostationary satellites acquired at roughly the same time. First, create scenes and load datasets individually:

>>> from satpy import Scene
>>> from glob import glob
>>> h8_scene = Scene(filenames=glob('/data/HS_H08_20200101_1200*'),
...                  reader='ahi_hsd')
>>> h8_scene.load(['B13'])
>>> g16_scene = Scene(filenames=glob('/data/OR_ABI*s20200011200*.nc'),
...                   reader='abi_l1b')
>>> g16_scene.load(['C13'])
>>> met10_scene = Scene(filenames=glob('/data/H-000-MSG4*-202001011200-__'),
...                     reader='seviri_l1b_hrit')
>>> met10_scene.load(['IR_108'])

Now create a MultiScene and group the three similar IR channels together:

>>> from satpy import MultiScene, DataQuery
>>> mscn = MultiScene([h8_scene, g16_scene, met10_scene])
>>> groups = {DataQuery(name='IR_group', wavelength=(10, 11, 12)): ['B13', 'C13', 'IR_108']}
>>> mscn.group(groups)

Finally, resample the datasets to a common grid and blend them together:

>>> from pyresample.geometry import AreaDefinition
>>> my_area = AreaDefinition(...)
>>> resampled = mscn.resample(my_area, reduce_data=False)
>>> blended = resampled.blend()  # you can also use a custom blend function

You can access the results via blended['IR_group'].

Timeseries

Using the blend() method with the timeseries() function combines multiple scenes from different time slots by time. A single Scene is returned, with each dataset/channel extended along the time dimension. Used together with the to_geoviews() method, this enables interactive time-series Bokeh plots.

>>> from satpy import Scene, MultiScene
>>> from satpy.multiscene import timeseries
>>> from glob import glob
>>> from pyresample.geometry import AreaDefinition
>>> my_area = AreaDefinition(...)
>>> scenes = [
...    Scene(reader='viirs_sdr', filenames=glob('/data/viirs/day_1/*t180*.h5')),
...    Scene(reader='viirs_sdr', filenames=glob('/data/viirs/day_2/*t180*.h5'))
... ]
>>> mscn = MultiScene(scenes)
>>> mscn.load(['I04'])
>>> new_mscn = mscn.resample(my_area)
>>> blended_scene = new_mscn.blend(blend_function=timeseries)
>>> blended_scene['I04']
<xarray.DataArray (time: 2, y: 1536, x: 6400)>
dask.array<shape=(2, 1536, 6400), dtype=float64, chunksize=(1, 1536, 4096)>
Coordinates:
  * time     (time) datetime64[ns] 2012-02-25T18:01:24.570942 2012-02-25T18:02:49.975797
Dimensions without coordinates: y, x
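Conceptually, timeseries() concatenates the per-scene arrays along a new leading time dimension (Satpy does this with xarray internally). A rough numpy analogue of that concatenation:

```python
import numpy as np

# Two single-time-step arrays of shape (y, x) from consecutive scenes.
frame1 = np.array([[270.0, 271.0]])
frame2 = np.array([[268.0, 269.0]])

# Stacking along a new leading axis yields a (time, y, x) array,
# analogous to the (time: 2, y: ..., x: ...) DataArray shown above.
combined = np.stack([frame1, frame2], axis=0)
```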

Saving frames of an animation

The MultiScene can take "frames" of data and join them together in a single animation movie file. Saving animations requires the imageio Python library, and for most available formats the ffmpeg command line tool suite should also be installed. The example below saves a series of GOES-EAST ABI channel 1 and channel 2 frames to MP4 movie files.

>>> from satpy import Scene, MultiScene
>>> from glob import glob
>>> mscn = MultiScene.from_files(glob('/data/abi/day_1/*C0[12]*.nc'), reader='abi_l1b')
>>> mscn.load(['C01', 'C02'])
>>> mscn.save_animation('{name}_{start_time:%Y%m%d_%H%M%S}.mp4', fps=2)
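The output filename pattern is ordinary Python str.format() substitution using dataset attributes such as name and start_time. For example (with an arbitrary illustrative time):

```python
from datetime import datetime

# The same pattern passed to save_animation(), filled in the way Satpy
# fills it from each dataset's attributes.
pattern = '{name}_{start_time:%Y%m%d_%H%M%S}.mp4'
filename = pattern.format(name='C01', start_time=datetime(2017, 9, 20, 18, 0, 31))
# -> 'C01_20170920_180031.mp4'
```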

This will compute one video frame (image) at a time and write it to the MPEG-4 video file. For users with more powerful systems it is possible to use the client and batch_size keyword arguments to compute multiple frames in parallel using the dask distributed library (if installed). See the dask distributed documentation for information on creating a Client object. If working on a cluster you may want to use dask jobqueue to take advantage of multiple nodes at a time.

It is possible to add an overlay or decoration to each frame of an animation. For text added as a decoration, string substitution will be applied based on the attributes of the dataset, for example:

>>> mscn.save_animation(
...     "{name:s}_{start_time:%Y%m%d_%H%M}.mp4",
...     enh_args={
...         "decorate": {
...             "decorate": [
...                 {"text": {
...                     "txt": "time {start_time:%Y-%m-%d %H:%M}",
...                     "align": {
...                         "top_bottom": "bottom",
...                         "left_right": "right"},
...                     "font": '/usr/share/fonts/truetype/arial.ttf',
...                     "font_size": 20,
...                     "height": 30,
...                     "bg": "black",
...                     "bg_opacity": 255,
...                     "line": "white"}}]}})

If your files cover ABI MESO data for channel 2 from 2020-04-12 01:00-01:59, then the output file will be called C02_20200412_0100.mp4 (because the first dataset/frame corresponds to an image whose acquisition started at 01:00). It will consist of sixty frames (one per minute for MESO data), and each frame will have its start time, floored to the minute, rendered into the image. Note that this text is "burned" into the video and cannot be switched on or off later.

Warning

GIF images, although supported, are not recommended due to the large file sizes that can be produced from only a few frames.

Saving multiple scenes

The MultiScene object includes a save_datasets() method for saving the data from multiple Scenes to disk. By default this will operate on one Scene at a time, but similar to the save_animation method above this method can accept a dask distributed Client object via the client keyword argument to compute scenes in parallel (see documentation above). Note however that some writers, like the geotiff writer, do not support multi-process operations at this time and will fail when used with dask distributed. To save multiple Scenes use:

>>> from satpy import Scene, MultiScene
>>> from glob import glob
>>> mscn = MultiScene.from_files(glob('/data/abi/day_1/*C0[12]*.nc'), reader='abi_l1b')
>>> mscn.load(['C01', 'C02'])
>>> mscn.save_datasets(base_dir='/path/for/output')

Combining multiple readers

New in version 0.23.

The from_files() constructor can automatically combine multiple readers into a single MultiScene, so it is no longer necessary to create the Scene objects manually. For example, you can combine Advanced Baseline Imager (ABI) and Geostationary Lightning Mapper (GLM) measurements. Constructing a multi-reader MultiScene requires more parameters than a single-reader one, because Satpy cannot reliably guess how to group files belonging to different instruments. The example below creates a video with lightning superimposed on ABI channel 14 (11.2 µm) using the built-in composite C14_flash_extent_density, which overlays flash extent density from GLM (read with the NCGriddedGLML2 or glm_l2 reader) on ABI channel 14 data (read with the NC_ABI_L1B or abi_l1b reader), and therefore needs Scene objects that combine both readers:

>>> import glob
>>> import satpy
>>> glm_dir = "/path/to/GLMC/"
>>> abi_dir = "/path/to/ABI/"
>>> ms = satpy.MultiScene.from_files(
...        glob.glob(glm_dir + "OR_GLM-L2-GLMC-M3_G16_s202010418*.nc") +
...        glob.glob(abi_dir + "C*/OR_ABI-L1b-RadC-M6C*_G16_s202010418*_e*_c*.nc"),
...        reader=["glm_l2", "abi_l1b"],
...        ensure_all_readers=True,
...        group_keys=["start_time"],
...        time_threshold=30)
>>> ms.load(["C14_flash_extent_density"])
>>> ms = ms.resample(ms.first_scene["C14"].attrs["area"])
>>> ms.save_animation("/path/for/output/{name:s}_{start_time:%Y%m%d_%H%M}.mp4")

In this example, we pass to from_files() the additional parameters ensure_all_readers=True, group_keys=["start_time"], and time_threshold=30, so we only get scenes at times when both ABI and GLM have a file starting within 30 seconds of each other, ignoring all other differences for the purposes of grouping the two. In this case, the ABI files occur every 5 minutes, but the GLM files (processed with glmtools) every minute. Times where a GLM file has no ABI file starting within ±30 seconds are skipped. The group_keys and time_threshold keyword arguments are passed on to the group_files() function. The heavy work of blending the two instruments together is performed by the BackgroundCompositor class through the "C14_flash_extent_density" composite.
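To illustrate the effect of the time threshold, here is a simplified sketch of matching file start times within 30 seconds (hypothetical times; a conceptual illustration, not the group_files() implementation):

```python
from datetime import datetime, timedelta

# Hypothetical start times parsed from ABI (every 5 min) and GLM (every min) filenames.
abi_starts = [datetime(2020, 4, 12, 1, 0, 20), datetime(2020, 4, 12, 1, 5, 20)]
glm_starts = [datetime(2020, 4, 12, 1, 0, 0), datetime(2020, 4, 12, 1, 1, 0),
              datetime(2020, 4, 12, 1, 5, 0)]

threshold = timedelta(seconds=30)
# Keep only GLM times that have an ABI time within the threshold, mimicking
# the effect of ensure_all_readers=True with time_threshold=30: the 01:01
# GLM file has no ABI counterpart and is dropped.
matched = [g for g in glm_starts
           if any(abs(g - a) <= threshold for a in abi_starts)]
```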