satpy.writers.cf_writer module
Writer for netCDF4/CF.
Example usage
The CF writer saves datasets in a Scene as a CF-compliant netCDF file. Here is an example with MSG SEVIRI data in HRIT format:
>>> from satpy import Scene
>>> import glob
>>> filenames = glob.glob('data/H*201903011200*')
>>> scn = Scene(filenames=filenames, reader='seviri_l1b_hrit')
>>> scn.load(['VIS006', 'IR_108'])
>>> scn.save_datasets(writer='cf', datasets=['VIS006', 'IR_108'], filename='seviri_test.nc',
...                   exclude_attrs=['raw_metadata'])
You can select the netCDF backend using the engine keyword argument. If None, it follows xarray's to_netcdf() engine choices with a preference for 'netcdf4'.
For datasets with an area definition you can exclude lat/lon coordinates by setting include_lonlats=False. If the area has a projected CRS, units are assumed to be in metre. If the area has a geographic CRS, units are assumed to be in degrees. The writer does not verify that the CRS is supported by the CF conventions. One commonly used projected CRS not supported by the CF conventions is the equirectangular projection, such as EPSG 4087.
By default, non-dimensional coordinates (such as scanline timestamps) are prefixed with the corresponding dataset name, because they are likely to be different for each dataset. If a non-dimensional coordinate is identical for all datasets, the prefix can be removed by setting pretty=True.
Some dataset names start with a digit, like AVHRR channels 1, 2, 3a, 3b, 4 and 5. This does not comply with the CF naming conventions (see https://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/build/ch02s03.html). These channels are therefore prefixed with "CHANNEL_" by default. This can be controlled with the numeric_name_prefix keyword argument to save_datasets. Setting it to None or '' will skip the prefixing.
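These options can be combined in a single call; a minimal sketch (the choice of values is illustrative):
>>> scn.save_datasets(writer='cf', datasets=['VIS006', 'IR_108'], filename='seviri_test.nc',
...                   engine='netcdf4', include_lonlats=False, pretty=True,
...                   numeric_name_prefix='CHANNEL_')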
Grouping
All datasets to be saved must have the same projection coordinates x
and y
. If a scene holds datasets with
different grids, the CF compliant workaround is to save the datasets to separate files. Alternatively, you can save
datasets with common grids in separate netCDF groups as follows:
>>> scn.load(['VIS006', 'IR_108', 'HRV'])
>>> scn.save_datasets(writer='cf', datasets=['VIS006', 'IR_108', 'HRV'],
...                   filename='seviri_test.nc', exclude_attrs=['raw_metadata'],
...                   groups={'visir': ['VIS006', 'IR_108'], 'hrv': ['HRV']})
Note that the resulting file will not be fully CF compliant.
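Each group can then be read back separately, for example with xarray (a sketch, assuming the file written above):
>>> import xarray as xr
>>> visir = xr.open_dataset('seviri_test.nc', group='visir')
>>> hrv = xr.open_dataset('seviri_test.nc', group='hrv')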
Dataset Encoding
Dataset encoding can be specified in two ways:
Via the encoding keyword argument of save_datasets:
>>> import numpy as np
>>> my_encoding = {
...     'my_dataset_1': {
...         'compression': 'zlib',
...         'complevel': 9,
...         'scale_factor': 0.01,
...         'add_offset': 100,
...         'dtype': np.int16
...     },
...     'my_dataset_2': {
...         'compression': None,
...         'dtype': np.float64
...     }
... }
>>> scn.save_datasets(writer='cf', filename='encoding_test.nc', encoding=my_encoding)
Via the encoding attribute of the datasets in a scene. For example:
>>> scn['my_dataset'].encoding = {'compression': 'zlib'}
>>> scn.save_datasets(writer='cf', filename='encoding_test.nc')
See the xarray encoding documentation for all encoding options.
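To check which encoding was actually applied, you can reopen the file and inspect the variable encoding; a minimal sketch, assuming the first encoding example above:
>>> import xarray as xr
>>> ds = xr.open_dataset('encoding_test.nc')
>>> print(ds['my_dataset_1'].encoding)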
Note
Chunk-based compression can be specified with the compression keyword since:
- netCDF4-1.6.0
- libnetcdf-4.9.0
- xarray-2022.12.0
The zlib keyword is deprecated. Make sure that the versions of these modules are all above or all below that reference. Otherwise, compression might fail or be ignored silently.
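One way to check the installed versions is with plain Python (not part of the writer API; a sketch only):
>>> import netCDF4
>>> import xarray
>>> print(netCDF4.__version__)             # python netCDF4 bindings
>>> print(netCDF4.__netcdf4libversion__)   # underlying libnetcdf
>>> print(xarray.__version__)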
Attribute Encoding
In the above examples, raw metadata from the HRIT files has been excluded. If you want all attributes to be included,
just remove the exclude_attrs
keyword argument. By default, dict-type dataset attributes, such as the raw metadata,
are encoded as a string using json. Thus, you can use json to decode them afterwards:
>>> import xarray as xr
>>> import json
>>> # Save scene to nc-file
>>> scn.save_datasets(writer='cf', datasets=['VIS006', 'IR_108'], filename='seviri_test.nc')
>>> # Now read data from the nc-file
>>> ds = xr.open_dataset('seviri_test.nc')
>>> raw_mda = json.loads(ds['IR_108'].attrs['raw_metadata'])
>>> print(raw_mda['RadiometricProcessing']['Level15ImageCalibration']['CalSlope'])
[0.020865 0.0278287 0.0232411 0.00365867 0.00831811 0.03862197
0.12674432 0.10396091 0.20503568 0.22231115 0.1576069 0.0352385]
Alternatively it is possible to flatten dict-type attributes by setting flatten_attrs=True. This is more human-readable, as it creates a separate nc-attribute for each item in every dictionary. Keys are concatenated with underscore separators. The CalSlope attribute can then be accessed as follows:
>>> scn.save_datasets(writer='cf', datasets=['VIS006', 'IR_108'], filename='seviri_test.nc',
...                   flatten_attrs=True)
>>> ds = xr.open_dataset('seviri_test.nc')
>>> print(ds['IR_108'].attrs['raw_metadata_RadiometricProcessing_Level15ImageCalibration_CalSlope'])
[0.020865 0.0278287 0.0232411 0.00365867 0.00831811 0.03862197
0.12674432 0.10396091 0.20503568 0.22231115 0.1576069 0.0352385]
This is what the corresponding ncdump
output would look like in this case:
$ ncdump -h seviri_test.nc
...
IR_108:raw_metadata_RadiometricProcessing_Level15ImageCalibration_CalOffset = -1.064, ...;
IR_108:raw_metadata_RadiometricProcessing_Level15ImageCalibration_CalSlope = 0.021, ...;
IR_108:raw_metadata_RadiometricProcessing_MPEFCalFeedback_AbsCalCoeff = 0.021, ...;
...
- class satpy.writers.cf_writer.CFWriter(name=None, filename=None, base_dir=None, **kwargs)[source]
Bases: Writer
Writer producing NetCDF/CF compatible datasets.
Initialize the writer object.
- Parameters:
name (str) – A name for this writer for log and error messages. If this writer is configured in a YAML file its name should match the name of the YAML file. Writer names may also appear in output file attributes.
filename (str) – Filename to save data to. This filename can and should specify certain python string formatting fields to differentiate between data written to the files. Any attributes provided by the .attrs of a DataArray object may be included. Format and conversion specifiers provided by the trollsift package may also be used. Any directories in the provided pattern will be created if they do not exist. Example: {platform_name}_{sensor}_{name}_{start_time:%Y%m%d_%H%M%S}.tif
base_dir (str) – Base destination directories for all created files.
kwargs (dict) – Additional keyword arguments to pass to the Plugin class.
- static da2cf(dataarray, epoch=None, flatten_attrs=False, exclude_attrs=None, include_orig_name=True, numeric_name_prefix='CHANNEL_')[source]
Convert the dataarray to something cf-compatible.
- Parameters:
dataarray (xr.DataArray) – The data array to be converted.
epoch (str) – Reference time for encoding of time coordinates. If None, the default reference time defined in satpy.cf.coords (EPOCH) is used.
flatten_attrs (bool) – If True, flatten dict-type attributes.
exclude_attrs (list) – List of dataset attributes to be excluded.
include_orig_name (bool) – Include the original dataset name in the netcdf variable attributes.
numeric_name_prefix (str) – Prepend dataset name with this if starting with a digit.
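For illustration, da2cf can also be called directly on a single DataArray; a sketch, assuming scn['IR_108'] has been loaded as in the examples above:
>>> from satpy.writers.cf_writer import CFWriter
>>> cf_da = CFWriter.da2cf(scn['IR_108'], flatten_attrs=True, exclude_attrs=['raw_metadata'])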
- save_dataset(dataset, filename=None, fill_value=None, **kwargs)[source]
Save the dataset to a given filename.
- save_datasets(datasets, filename=None, groups=None, header_attrs=None, engine=None, epoch=None, flatten_attrs=False, exclude_attrs=None, include_lonlats=True, pretty=False, include_orig_name=True, numeric_name_prefix='CHANNEL_', **to_netcdf_kwargs)[source]
Save the given datasets in one netCDF file.
Note that all datasets (or, when grouping is used, all datasets within one group) must have the same projection coordinates.
- Parameters:
datasets (list) – List of xr.DataArray to be saved.
filename (str) – Output file.
groups (dict) – Group datasets according to the given assignment: {‘group_name’: [‘dataset1’, ‘dataset2’, …]}. The group name None corresponds to the root of the file, i.e., no group will be created. Warning: The results will not be fully CF compliant!
header_attrs – Global attributes to be included.
engine (str, optional) – Module to be used for writing netCDF files. Follows xarray's to_netcdf() engine choices with a preference for 'netcdf4'.
epoch (str, optional) – Reference time for encoding of time coordinates. If None, the default reference time defined in satpy.cf.coords (EPOCH) is used.
flatten_attrs (bool, optional) – If True, flatten dict-type attributes.
exclude_attrs (list, optional) – List of dataset attributes to be excluded.
include_lonlats (bool, optional) – Always include latitude and longitude coordinates, even for datasets with area definition.
pretty (bool, optional) – Don’t modify coordinate names, if possible. Makes the file prettier, but possibly less consistent.
include_orig_name (bool, optional) – Include the original dataset name as a variable attribute in the final netCDF.
numeric_name_prefix (str, optional) – Prefix to add to each variable with a name starting with a digit. Use ‘’ or None to leave this out.
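For example, global attributes can be supplied via header_attrs; a sketch (the attribute values are illustrative):
>>> scn.save_datasets(writer='cf', datasets=['VIS006', 'IR_108'], filename='seviri_test.nc',
...                   header_attrs={'institution': 'My institution', 'title': 'SEVIRI test data'})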
- satpy.writers.cf_writer._check_backend_versions()[source]
Issue warning if backend versions do not match.