SatPy is designed to provide easy access to common operations for processing meteorological remote sensing data. Any details needed to perform these operations are configured internally to SatPy, so users should not have to worry about how something is done and only need to ask for what they want. Most of the features provided by SatPy can be configured by keyword arguments (see the API Documentation or other specific section for more details). For more complex customizations or added features, SatPy uses a set of configuration files that can be modified by the user. The various components and concepts of SatPy are described below. The Quickstart guide also provides simple example code for the available features of SatPy.
SatPy provides most of its functionality through the
Scene class. The Scene acts as a container for the datasets
being operated on and provides methods for acting on those datasets. It
attempts to reduce the amount of low-level knowledge needed by the user while
still providing a pythonic interface to the functionality underneath.
A Scene object represents a single geographic region of data, typically over a single continuous time range. It is possible to combine Scenes to form a Scene with multiple regions or multiple time observations, but it is not guaranteed that all functionality works in these situations.
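As a rough sketch of this workflow, the hypothetical helper below creates a Scene from a list of files, loads one dataset by name, and returns it. The helper name and its arguments are assumptions for illustration; which readers and dataset names are available depends on the installed SatPy version and on the input files.

```python
def load_channel(filenames, reader, channel):
    """Create a Scene from files and return one loaded dataset.

    Hypothetical helper for illustration; it is not part of SatPy.
    """
    # Imported inside the function so the sketch can be read and
    # imported even where SatPy is not installed.
    from satpy import Scene

    # The Scene is the container for everything read from the files.
    scn = Scene(filenames=filenames, reader=reader)

    # Ask for what is wanted by name; SatPy works out how to read it.
    scn.load([channel])

    # Loaded datasets are accessed from the Scene by name.
    return scn[channel]
```

It might be called as load_channel(my_files, "abi_l1b", "C01"), where my_files is a list of input file paths; the reader and channel names here are placeholders that must match the data being read.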
SatPy’s lower-level container for data is the
xarray.DataArray. For historical reasons DataArrays are often
referred to as “Datasets” in SatPy. These objects act similarly to normal
NumPy arrays, but add metadata and attributes for describing the
data. Metadata is stored in a .attrs dictionary and named dimensions can
be accessed in a .dims attribute, along with other attributes.
In most use cases these objects can be operated on like normal NumPy arrays
with special care taken to make sure the metadata dictionary contains
expected values. See the XArray documentation for more info on handling
DataArray objects.
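A minimal sketch of these ideas, using only xarray and NumPy, is shown here. The array values, dimension names, and attribute names are made up for illustration and are not taken from any SatPy reader.

```python
import numpy as np
import xarray as xr

# A small 2 x 3 array with named dimensions and a metadata dictionary.
data = xr.DataArray(
    np.arange(6, dtype=np.float64).reshape(2, 3),
    dims=("y", "x"),
    attrs={"name": "example_band", "units": "K"},
)

print(data.dims)            # the named dimensions
print(data.attrs["units"])  # one entry from the metadata dictionary

# Arithmetic works as it does with NumPy arrays. Whether attrs carry
# over depends on xarray's keep_attrs setting, so copy them explicitly
# to be sure the metadata contains the expected values.
scaled = data * 2
scaled.attrs = data.attrs.copy()
```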
Additionally, SatPy uses a special form of DataArrays where data is stored in
dask.array.Array objects, which allows SatPy to perform
multi-threaded lazy operations, vastly improving the performance of processing.
For help on developing with dask and xarray see
Migrating to xarray and dask or the documentation for the specific project.
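To illustrate what "lazy" means here, the sketch below wraps a chunked dask array in a DataArray by hand. The array size and chunk size are arbitrary assumptions; in SatPy the readers create such arrays for you.

```python
import dask.array as da
import xarray as xr

# A DataArray backed by a dask array split into 250 x 250 chunks.
lazy = xr.DataArray(
    da.ones((1000, 1000), chunks=(250, 250)),
    dims=("y", "x"),
)

# This only builds a task graph; no pixel values are computed yet.
result = (lazy * 2.0).mean()

# The work happens here, chunk by chunk, when the value is requested.
value = float(result.compute())
```

Until compute() is called, result still holds a dask array, so further operations can be chained at little cost.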
To uniquely identify
DataArray objects SatPy uses DatasetID. A
DatasetID consists of various pieces of available metadata. This usually
includes name and wavelength as identifying metadata, but also includes
resolution, calibration, polarization, and additional modifiers
to further distinguish one dataset from another.
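The idea of identifying data by a combination of metadata fields can be sketched with a plain named tuple. ExampleID below is a hypothetical stand-in written for this illustration, not SatPy's DatasetID class; the field values, including the wavelength given as a (minimum, central, maximum) tuple, are assumptions.

```python
from collections import namedtuple

# Hypothetical stand-in for illustration only; not SatPy's DatasetID.
ExampleID = namedtuple(
    "ExampleID",
    ["name", "wavelength", "resolution", "calibration",
     "polarization", "modifiers"],
)

base = ExampleID(
    name="band_1",
    wavelength=(0.60, 0.65, 0.70),
    resolution=500,
    calibration="reflectance",
    polarization=None,
    modifiers=(),
)

# Same name and wavelength, but other fields tell the datasets apart.
coarser = base._replace(resolution=1000)
radiance = base._replace(calibration="radiance")

# Because the identifiers are hashable, they can key a container.
container = {base: "data A", coarser: "data B", radiance: "data C"}
```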
XArray includes its own object type called a “Dataset” (xarray.Dataset). This is different from the “Datasets” mentioned in SatPy, which are DataArrays.
One of the biggest advantages of using SatPy is the large number of input file formats that it can read. It encapsulates this functionality into individual Readers. SatPy Readers handle all of the complexity of reading whatever format they represent. Meteorological satellite file formats can be extremely complex and formats are rarely reused across satellites or instruments. No matter the format, SatPy’s Reader interface is meant to provide a consistent data loading interface while still providing flexibility to add new complex file formats.
Many users of satellite imagery combine multiple sensor channels to bring out certain features of the data. This includes using one dataset to enhance another, combining three or more datasets into an RGB image, or any other combination of datasets. SatPy comes with many common composite combinations built in and allows the user to request them like any other dataset. SatPy also makes it possible to create your own custom composites and have SatPy treat them like any other dataset. See Composites for more information.
Satellite imagery data comes in two forms when it comes to geolocation: native satellite swath coordinates and uniform gridded projection coordinates. It is also common to see the channels from a single sensor in multiple resolutions, making it complicated to combine or compare the datasets. Many use cases of satellite data require the data to be in a certain projection other than the native projection or to have output imagery cover a specific area of interest. SatPy makes it easy to resample datasets so that users can combine them or grid them to these projections or areas of interest. SatPy uses the PyTroll pyresample package to provide nearest neighbor, bilinear, or elliptical weighted averaging resampling methods. See Resampling for more information.
When making images from satellite data, the data has to be manipulated to be compatible with the output image format and still look good to the human eye. SatPy calls this functionality “enhancing” the data, also commonly called scaling or stretching the data. This process can become complicated not just because of how subjective the quality of an image can be, but also because of historical expectations of forecasters and other users for how the data should look. SatPy tries to hide the complexity of all the possible enhancement methods from the user and just provide the best-looking image by default. SatPy still makes it possible to customize these procedures, but in most cases it shouldn’t be necessary. See the documentation on Writers for more information on what’s possible for output formats and enhancing images.
SatPy is designed to make data loading, manipulation, and analysis easy. However, the best way to get satellite imagery data out to as many users as possible is to make it easy to save it in multiple formats. SatPy allows users to save data in image formats like PNG or GeoTIFF as well as data file formats like NetCDF. Each format’s complexity is hidden behind the interface of individual Writer objects, which include keyword arguments for accessing specific format features like compression and output data type. See the Writers documentation for the available writers and how to use them.
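The resampling and saving steps described above can be chained on a Scene that already has datasets loaded. The helper below is a hypothetical sketch: its name and arguments are assumptions, and the area and writer names that actually work depend on the SatPy installation and its configuration.

```python
def resample_and_save(scn, area, output_dir):
    """Resample a loaded Scene and save every dataset as GeoTIFF.

    Hypothetical helper for illustration; it is not part of SatPy.
    `scn` is expected to be a Scene with datasets already loaded.
    """
    # Resampling returns a new Scene on the target area;
    # "nearest" requests nearest neighbor resampling.
    local_scn = scn.resample(area, resampler="nearest")

    # The writer hides the details of the output format.
    local_scn.save_datasets(writer="geotiff", base_dir=output_dir)
    return local_scn
```

Choosing a different writer name would switch the output format without changing the rest of the code.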