The date format is specified as 'seconds since 1970-01-01 00:00:00', but the missing values are stored as -1e+34, which is not supported by the default parsing mechanism in xarray.

This function replaces the missing values with NaN and returns a datetime instance.
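The decoding described above can be sketched with NumPy alone. Note that `decode_gdp_time` is a hypothetical helper name, not the module's actual function, and the sentinel threshold is an assumption for illustration:

```python
import numpy as np

def decode_gdp_time(seconds):
    """Replace the -1e+34 sentinel with NaT and convert
    'seconds since 1970-01-01 00:00:00' to datetime64.
    Hypothetical helper; not the actual GDP adapter function."""
    seconds = np.asarray(seconds, dtype="float64")
    missing = seconds <= -1e33  # sentinel values fall far below any real time
    # datetime64 has no NaN; cast valid values, then patch missing ones to NaT
    decoded = np.where(missing, 0.0, seconds).astype("int64").astype("datetime64[s]")
    return np.where(missing, np.datetime64("NaT", "s"), decoded)
```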
This module provides functions and metadata to convert the Global Drifter Program (GDP) data to a clouddrift.RaggedArray instance. The functions defined in this module are common to both the hourly (clouddrift.adapters.gdp1h) and six-hourly (clouddrift.adapters.gdp6h) GDP modules.
From the previously sorted DataFrame of directory files, return the unique set of drifter IDs sorted by their start date (the date of the first quality-controlled data point).
This module provides functions and metadata that can be used to convert the hourly Global Drifter Program (GDP) data to a clouddrift.RaggedArray instance.
Extract and preprocess the Lagrangian data and attributes.

This function takes an identification number that can be used to create a file or URL pattern, or to select data from a DataFrame. It then preprocesses the data and returns a clean Xarray Dataset.
This module provides functions and metadata that can be used to convert the 6-hourly Global Drifter Program (GDP) data to a clouddrift.RaggedArray instance.
Extract and preprocess the Lagrangian data and attributes.

This function takes an identification number that can be used to create a file or URL pattern, or to select data from a DataFrame. It then preprocesses the data and returns a clean Xarray Dataset.
This module defines functions to adapt a collection of data from 2193 trajectories of SOFAR, APEX, and RAFOS subsurface floats, from 52 experiments across the world between 1989 and 2015, as a ragged-array dataset.
Returns the ANDRO dataset as a ragged array Xarray dataset.

The function will first look for the ragged-array dataset on the local filesystem. If it is not found, the dataset will be downloaded using the corresponding adapter function and stored for later access. The upstream data is available at https://www.seanoe.org/data/00360/47077/.
If True, decode the time coordinate into a datetime object. If False, the time coordinate will be an int64 or float64 array of increments since the origin time indicated in the units attribute. Default is True.
Returns the NOAA Global Drifter Program (GDP) 6-hourly dataset as a ragged array Xarray dataset.

The data is accessed from a public HTTPS server at NOAA's Atlantic Oceanographic and Meteorological Laboratory (AOML), accessible at https://www.aoml.noaa.gov/phod/gdp/index.php. Note that the data-loading method is platform-dependent: Linux and Darwin (macOS) machines lazily load the datasets by leveraging the byte-range feature of the netCDF-C library (the dataset-loading engine used by xarray), whereas Windows machines download the entire dataset into a memory buffer which is then passed to xarray.
If True, decode the time coordinate into a datetime object. If False, the time coordinate will be an int64 or float64 array of increments since the origin time indicated in the units attribute. Default is True.
Returns the Grand LAgrangian Deployment (GLAD) dataset as a ragged array Xarray dataset.

The function will first look for the ragged-array dataset on the local filesystem. If it is not found, the dataset will be downloaded using the corresponding adapter function and stored for later access.
If True, decode the time coordinate into a datetime object. If False, the time coordinate will be an int64 or float64 array of increments since the origin time indicated in the units attribute. Default is True.
Özgökmen, Tamay. 2013. GLAD experiment CODE-style drifter trajectories (low-pass filtered, 15 minute interval records), northern Gulf of Mexico near DeSoto Canyon, July-October 2012. Distributed by: Gulf of Mexico Research Initiative Information and Data Cooperative (GRIIDC), Harte Research Institute, Texas A&M University–Corpus Christi. doi:10.7266/N7VD6WC8
This module provides functions to easily access ragged array datasets. If the datasets are not accessed via cloud storage platforms or are not found on the local filesystem, they will be downloaded from their upstream repositories and stored for later access (~/.clouddrift for UNIX-based systems).
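The look-locally-then-download pattern described above can be sketched as follows. The `fetch_dataset` helper and its cache layout are illustrative assumptions, not the library's actual implementation, which also handles cloud storage and platform differences:

```python
from pathlib import Path
import urllib.request

def fetch_dataset(url: str, cache_dir: Path = Path.home() / ".clouddrift") -> Path:
    """Return a local copy of the file at `url`, downloading it only once.
    Illustrative sketch of the cache-then-download pattern only."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    local = cache_dir / Path(url).name
    if not local.exists():
        # download on first access; subsequent calls reuse the cached file
        urllib.request.urlretrieve(url, local)
    return local
```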
Returns the MOSAiC sea-ice drift dataset as a ragged array Xarray dataset.

The function will first look for the ragged-array dataset on the local filesystem. If it is not found, the dataset will be downloaded using the corresponding adapter function and stored for later access.
Angela Bliss, Jennifer Hutchings, Philip Anderson, Philipp Anhaus, Hans Jakob Belter, Jørgen Berge, Vladimir Bessonov, Bin Cheng, Sylvia Cole, Dave Costa, Finlo Cottier, Christopher J Cox, Pedro R De La Torre, Dmitry V Divine, Gilbert Emzivat, Ying-Chih Fang, Steven Fons, Michael Gallagher, Maxime Geoffrey, Mats A Granskog, … Guangyu Zuo. (2022). Sea ice drift tracks from the Distributed Network of autonomous buoys deployed during the Multidisciplinary drifting Observatory for the Study of Arctic Climate (MOSAiC) expedition 2019-2021. Arctic Data Center. doi:10.18739/A2KP7TS83.
If True, decode the time coordinate into a datetime object. If False, the time coordinate will be an int64 or float64 array of increments since the origin time indicated in the units attribute. Default is True.
Returns the subsurface floats dataset as a ragged array Xarray dataset.

The data is accessed from a public HTTPS server at NOAA's Atlantic Oceanographic and Meteorological Laboratory (AOML), accessible at https://www.aoml.noaa.gov/phod/gdp/index.php.
This dataset of subsurface float observations was compiled by the WOCE Subsurface Float Data Assembly Center (WFDAC) in Woods Hole, maintained by Andree Ramsey and Heather Furey, and copied to NOAA/AOML in October 2014 (version 1) and in December 2017 (version 2). Subsequent updates will be included as additional float data, quality controlled by the appropriate principal investigators, are submitted for inclusion.

Note that these observations are collected by ALACE/RAFOS/Eurofloat-style acoustically tracked, neutrally buoyant subsurface floats which collect data while drifting beneath the ocean surface. These data are the result of the effort and resources of many individuals and institutions. You are encouraged to acknowledge the work of the data originators and Data Centers in publications arising from use of these data.

The float data were originally divided by project at the WFDAC. Here they have been compiled into a single MATLAB dataset.
If True, decode the time coordinate into a datetime object. If False, the time coordinate will be an int64 or float64 array of increments since the origin time indicated in the units attribute. Default is True.
Returns the YoMaHa dataset as a ragged array Xarray dataset.

The function will first look for the ragged-array dataset on the local filesystem. If it is not found, the dataset will be downloaded using the corresponding adapter function and stored for later access. The upstream data is available at http://apdrc.soest.hawaii.edu/projects/yomaha/.
If True, decode the time coordinate into a datetime object. If False, the time coordinate will be an int64 or float64 array of increments since the origin time indicated in the units attribute. Default is True.
Lebedev, K. V., Yoshinari, H., Maximenko, N. A., & Hacker, P. W. (2007). Velocity data assessed from trajectories of Argo floats at parking level and at the sea surface. IPRC Technical Note, 4(2), 1-16.
Extract inertial oscillations from consecutive geographical positions.

This function performs a time-frequency analysis of horizontal displacements with analytic Morse wavelets. It extracts the portion of the wavelet transform signal that follows the inertial frequency (the opposite of the Coriolis frequency) as a function of time, potentially shifted in frequency by a measure of relative vorticity. The result is a pair of zonal and meridional relative displacements in meters.

This function is equivalent to a bandpass filtering of the horizontal displacements. The characteristics of the filter are defined by the relative bandwidth of the wavelet transform or by the duration of the wavelet; see the parameters below.
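As a point of reference for the frequency being tracked, the inertial frequency at latitude φ is -2Ω sin φ, the opposite of the Coriolis frequency. A small illustrative helper (not part of the library's documented API):

```python
import numpy as np

OMEGA = 7.2921159e-5  # Earth's rotation rate in rad/s

def inertial_frequency(lat_deg):
    """Inertial frequency in rad/s, the opposite of the Coriolis
    frequency f = 2*OMEGA*sin(latitude). Illustrative helper only."""
    return -2.0 * OMEGA * np.sin(np.radians(lat_deg))
```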
relative_bandwidth : float, optional
    Bandwidth of the frequency-domain equivalent filter for the extraction of the inertial oscillations; a number less than or equal to one which is a fraction of the inertial frequency. A value of 0.1 leads to a bandpass filter equivalent of +/- 10 percent of the inertial frequency.
wavelet_duration : float, optional
    Duration of the wavelet, or inverse of the relative bandwidth, which can be passed instead of the relative bandwidth.
time_step : float, optional
    The constant time interval between data points in seconds. Default is 3600.
relative_vorticity : float or array-like, optional
    Relative vorticity adding to the local Coriolis frequency. If "f" is the Coriolis frequency then "f" + relative_vorticity will be the effective Coriolis frequency as defined by Kunze (1985). Positive values correspond to cyclonic vorticity, irrespective of the latitudes of the data points.
To extract displacements due to inertial oscillations from sequences of longitude and latitude values, equivalent to a bandpass of around 20 percent of the local inertial frequency:
If longitude and latitude arrays do not have the same shape.
If relative_vorticity is an array and does not have the same shape as longitude and latitude.
If time_step is not a float.
If both relative_bandwidth and wavelet_duration are specified.
If neither relative_bandwidth nor wavelet_duration is specified.
If the absolute value of relative_bandwidth is not in the range (0, 1].
If the wavelet duration is not greater than or equal to 1.
Compute positions from arrays of velocities and time and a pair of origin coordinates.

The units of the result are degrees if coord_system == "spherical" (default). If coord_system == "cartesian", the units of the result are equal to the units of the input velocities multiplied by the units of the input time. For example, if the input velocities are in meters per second and the input time is in seconds, the units of the result will be meters.

The integration scheme can take one of three values:

- "forward" (default): integration from x[i] to x[i+1] is performed using the velocity at x[i].
- "backward": integration from x[i] to x[i+1] is performed using the velocity at x[i+1].
- "centered": integration from x[i] to x[i+1] is performed using the arithmetic average of the velocities at x[i] and x[i+1]. Note that this method introduces some error due to the averaging.
u, v, and time can be multi-dimensional arrays. If the time axis, along which the finite differencing is performed, is not the last one (i.e. u.shape[-1]), use the time_axis optional argument to specify along which axis the differencing should be done. u, v, and time must have the same shape.

This function will not do any special handling of longitude ranges. If the integrated trajectory crosses the antimeridian (dateline) in either direction, the longitude values will not be adjusted to stay in any specific range such as [-180, 180] or [0, 360]. If you need your longitudes to be in a specific range, recast the resulting longitude from this function using the function clouddrift.sphere.recast_lon().
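A minimal Cartesian sketch of the forward scheme (assumed names; the backward and centered variants differ only in which velocity multiplies each time increment):

```python
import numpy as np

def integrate_forward(u, v, time, x_origin=0.0, y_origin=0.0):
    """Cartesian position from velocity with the forward scheme:
    the step from x[i] to x[i+1] uses the velocity at index i.
    Sketch only; assumes the time axis is the last (and only) one."""
    u, v, time = (np.asarray(a, dtype=float) for a in (u, v, time))
    dt = np.diff(time)
    # cumulative sum of per-step displacements, starting at the origin
    x = x_origin + np.concatenate(([0.0], np.cumsum(u[:-1] * dt)))
    y = y_origin + np.concatenate(([0.0], np.cumsum(v[:-1] * dt)))
    return x, y
```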
If u and v do not have the same shape.
If the time axis is outside of the valid range ([-1, N-1]).
If lengths of u, v, and time along time_axis are not equal.
If the input coordinate system is not "spherical" or "cartesian".
If the input integration scheme is not "forward", "backward", or "centered".
Return residual longitudes and latitudes along a trajectory on the spherical Earth after correcting for zonal and meridional displacements x and y in meters.

This is applicable, for example, when one seeks to correct a trajectory for horizontal oscillations due to inertial motions, tides, etc.

Obtain the new geographical position for a displacement of 1/360th of the circumference of the Earth from the original position (longitude, latitude) = (1, 0):
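The example above can be checked with a small spherical sketch; `EARTH_RADIUS` and `apply_displacement` are assumptions for illustration, using a small-displacement approximation rather than the library's exact geometry:

```python
import numpy as np

EARTH_RADIUS = 6.3781e6  # assumed mean Earth radius in meters

def apply_displacement(lon, lat, x, y):
    """Shift (lon, lat) in degrees by zonal/meridional displacements
    x, y in meters, using a small-displacement approximation."""
    dlat = np.degrees(y / EARTH_RADIUS)
    dlon = np.degrees(x / (EARTH_RADIUS * np.cos(np.radians(lat))))
    return lon + dlon, lat + dlat

# A displacement of 1/360th of the circumference eastward at the equator
# moves the position by one degree of longitude, from (1, 0) to (2, 0).
lon, lat = apply_displacement(1.0, 0.0, 2 * np.pi * EARTH_RADIUS / 360, 0.0)
```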
Compute spin continuously from velocities and times.

Spin is traditionally (Sawford, 1999; Veneziani et al., 2005) defined as (<u'dv' - v'du'>) / (2 dt EKE), where u' and v' are eddy perturbations of the velocity field, EKE is the eddy kinetic energy, dt is the time step, du' and dv' are the velocity component increments during dt, and < > denotes the ensemble average.

To allow computing spin based on full velocity fields, this function does not do any demeaning of the velocity fields. If you need the spin based on velocity anomalies, make sure to demean the velocity fields before passing them to this function. This function also returns instantaneous spin values, so the rank of the result is not reduced relative to the input.

u, v, and time can be multi-dimensional arrays. If the time axis, along which the finite differencing is performed, is not the last one (i.e. u.shape[-1]), use the time_axis optional argument to specify along which axis the spin should be calculated. u, v, and time must either have the same shape, or time must be a 1-d array with the same length as u.shape[time_axis].

The difference scheme can be one of three values:

- "forward" (default): the finite difference is evaluated as dx[i] = x[i+1] - x[i];
- "backward": the finite difference is evaluated as dx[i] = x[i] - x[i-1];
- "centered": the finite difference is evaluated as dx[i] = (x[i+1] - x[i-1]) / 2.

Forward and backward schemes are effectively the same except that the position at which the velocity is evaluated is shifted one element down in the backward scheme relative to the forward scheme. In the case of a forward or backward difference scheme, the last or first element of the velocity, respectively, is extrapolated from its neighboring point. In the case of a centered difference scheme, the start and end boundary points are evaluated using the forward and backward difference schemes, respectively.
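The spin definition above can be sketched for a single time series with the forward scheme; `spin_forward` is a hypothetical reduced version (no demeaning, constant dt, last increment extrapolated), not the library implementation:

```python
import numpy as np

def spin_forward(u, v, dt):
    """Instantaneous spin (u*dv - v*du) / (2*dt*EKE), EKE = (u**2 + v**2)/2,
    with forward differences; the last increment is extrapolated linearly.
    Hypothetical sketch of the definition, not the library routine."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    du = np.diff(u, append=2 * u[-1] - u[-2])  # extrapolate the last point
    dv = np.diff(v, append=2 * v[-1] - v[-2])
    eke = 0.5 * (u**2 + v**2)
    return (u * dv - v * du) / (2.0 * dt * eke)
```

For solid-body rotation u = cos(ωt), v = sin(ωt), the computed spin should recover ω.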
If u and v do not have the same shape.
If the time axis is outside of the valid range ([-1, N-1]).
If lengths of u, v, and time along time_axis are not equal.
If difference_scheme is not "forward", "backward", or "centered".
Sawford, B. L., 1999. Rotation of trajectories in Lagrangian stochastic models of turbulent dispersion. Boundary-Layer Meteorology, 93, pp. 411-424. https://doi.org/10.1023/A:1002114132715

Veneziani, M., Griffa, A., Garraffo, Z. D. and Chassignet, E. P., 2005. Lagrangian spin parameter and coherent structures from trajectories released in a high-resolution ocean model. Journal of Marine Research, 63(4), pp. 753-788. https://elischolar.library.yale.edu/journal_of_marine_research/100/
Compute velocity from arrays of positions and time.

x and y can be provided as longitude and latitude in degrees if coord_system == "spherical" (default), or as easting and northing if coord_system == "cartesian".

The units of the result are meters per unit of time if coord_system == "spherical". For example, if the time is provided in seconds, the resulting velocity is in meters per second. Otherwise, if coord_system == "cartesian", the units of the resulting velocity correspond to the units of the input. For example, if the zonal and meridional displacements are in kilometers and the time is in hours, the resulting velocity is in kilometers per hour.

x, y, and time can be multi-dimensional arrays. If the time axis, along which the finite differencing is performed, is not the last one (i.e. x.shape[-1]), use the time_axis optional argument to specify along which axis the differencing should be done. x, y, and time must have the same shape.

The difference scheme can take one of three values:

- "forward" (default): the finite difference is evaluated as dx[i] = x[i+1] - x[i];
- "backward": the finite difference is evaluated as dx[i] = x[i] - x[i-1];
- "centered": the finite difference is evaluated as dx[i] = (x[i+1] - x[i-1]) / 2.

Forward and backward schemes are effectively the same except that the position at which the velocity is evaluated is shifted one element down in the backward scheme relative to the forward scheme. In the case of a forward or backward difference scheme, the last or first element of the velocity, respectively, is extrapolated from its neighboring point. In the case of a centered difference scheme, the start and end boundary points are evaluated using the forward and backward difference schemes, respectively.
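A Cartesian sketch of the forward scheme with the boundary extrapolation described above (assumed names, not the library function):

```python
import numpy as np

def velocity_forward(x, y, time):
    """Forward-difference velocity in Cartesian coordinates:
    u[i] = (x[i+1] - x[i]) / (time[i+1] - time[i]), with the last
    element extrapolated from its neighbor. Sketch only."""
    x, y, time = (np.asarray(a, dtype=float) for a in (x, y, time))
    dt = np.diff(time)
    u = np.empty_like(x)
    v = np.empty_like(y)
    u[:-1] = np.diff(x) / dt
    v[:-1] = np.diff(y) / dt
    u[-1], v[-1] = u[-2], v[-2]  # extrapolate the boundary point
    return u, v
```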
If x and y do not have the same shape.
If time_axis is outside of the valid range.
If lengths of x, y, and time along time_axis are not equal.
If coord_system is not "spherical" or "cartesian".
If difference_scheme is not "forward", "backward", or "centered".
Extract inertial oscillations from consecutive geographical positions.
+
This function acts by performing a time-frequency analysis of horizontal displacements
+with analytic Morse wavelets. It extracts the portion of the wavelet transform signal
+that follows the inertial frequency (opposite of Coriolis frequency) as a function of time,
+potentially shifted in frequency by a measure of relative vorticity. The result is a pair
+of zonal and meridional relative displacements in meters.
+
This function is equivalent to a bandpass filtering of the horizontal displacements. The characteristics
+of the filter are defined by the relative bandwidth of the wavelet transform or by the duration of the wavelet,
+see the parameters below.
Bandwidth of the frequency-domain equivalent filter for the extraction of the inertial
+oscillations; a number less or equal to one which is a fraction of the inertial frequency.
+A value of 0.1 leads to a bandpass filter equivalent of +/- 10 percent of the inertial frequency.
+
+
wavelet_durationfloat, optional
Duration of the wavelet, or inverse of the relative bandwidth, which can be passed instead of the
+relative bandwidth.
+
+
time_stepfloat, optional
The constant time interval between data points in seconds. Default is 3600.
+
+
relative_vorticity: Optional, float or array-like
Relative vorticity adding to the local Coriolis frequency. If “f” is the Coriolis
+frequency then “f” + relative_vorticity will be the effective Coriolis frequency as defined by Kunze (1985).
+Positive values correspond to cyclonic vorticity, irrespectively of the latitudes of the data
+points.
To extract displacements from inertial oscillations from sequences of longitude
+and latitude values, equivalent to bandpass around 20 percent of the local inertial frequency:
If longitude and latitude arrays do not have the same shape.
+If relative_vorticity is an array and does not have the same shape as longitude and latitude.
+If time_step is not a float.
+If both relative_bandwidth and wavelet_duration are specified.
+If neither relative_bandwidth nor wavelet_duration are specified.
+If the absolute value of relative_bandwidth is not in the range (0,1].
+If the wavelet duration is not greater than or equal to 1.
Compute positions from arrays of velocities and time and a pair of origin
+coordinates.
+
The units of the result are degrees if coord_system=="spherical" (default).
+If coord_system=="cartesian", the units of the result are equal to the
+units of the input velocities multiplied by the units of the input time.
+For example, if the input velocities are in meters per second and the input
+time is in seconds, the units of the result will be meters.
+
Integration scheme can take one of three values:
+
+
+
+
“forward” (default): integration from x[i] to x[i+1] is performed
using the velocity at x[i].
+
+
+
+
+
“backward”: integration from x[i] to x[i+1] is performed using the
velocity at x[i+1].
+
+
+
+
+
“centered”: integration from x[i] to x[i+1] is performed using the
arithmetic average of the velocities at x[i] and x[i+1]. Note that
+this method introduces some error due to the averaging.
+
+
+
+
+
+
u, v, and time can be multi-dimensional arrays. If the time axis, along
+which the finite differencing is performed, is not the last one (i.e.
+x.shape[-1]), use the time_axis optional argument to specify along which
+axis should the differencing be done. x, y, and time must have
+the same shape.
+
This function will not do any special handling of longitude ranges. If the
+integrated trajectory crosses the antimeridian (dateline) in either direction, the
+longitude values will not be adjusted to stay in any specific range such
+as [-180, 180] or [0, 360]. If you need your longitudes to be in a specific
+range, recast the resulting longitude from this function using the function
+clouddrift.sphere.recast_lon().
If u and v do not have the same shape.
+If the time axis is outside of the valid range ([-1, N-1]).
+If lengths of x, y, and time along time_axis are not equal.
+If the input coordinate system is not “spherical” or “cartesian”.
+If the input integration scheme is not “forward”, “backward”, or “centered”
Return residual longitudes and latitudes along a trajectory on the spherical Earth
+after correcting for zonal and meridional displacements x and y in meters.
+
This is applicable as an example when one seeks to correct a trajectory for
+horizontal oscillations due to inertial motions, tides, etc.
Obtain the new geographical position for a displacement of 1/360-th of the
+circumference of the Earth from original position (longitude,latitude) = (1,0):
Compute spin continuously from velocities and times.
+
Spin is traditionally (Sawford, 1999; Veneziani et al., 2005) defined as
+(<u’dv’ - v’du’>) / (2 dt EKE) where u’ and v’ are eddy-perturbations of the
+velocity field, EKE is eddy kinetic energy, dt is the time step, and du’ and
+dv’ are velocity component increments during dt, and < > denotes ensemble
+average.
+
To allow computing spin based on full velocity fields, this function does
+not do any demeaning of the velocity fields. If you need the spin based on
+velocity anomalies, ensure to demean the velocity fields before passing
+them to this function. This function also returns instantaneous spin values,
+so the rank of the result is not reduced relative to the input.
+
u, v, and time can be multi-dimensional arrays. If the time
+axis, along which the finite differencing is performed, is not the last one
+(i.e. u.shape[-1]), use the time_axis optional argument to specify along
+which the spin should be calculated. u, v, and time must either have the
+same shape, or time must be a 1-d array with the same length as
+u.shape[time_axis].
+
Difference scheme can be one of three values:

- "forward" (default): finite difference is evaluated as dx[i]=dx[i+1]-dx[i];
- "backward": finite difference is evaluated as dx[i]=dx[i]-dx[i-1];
- "centered": finite difference is evaluated as dx[i]=(dx[i+1]-dx[i-1])/2.

Forward and backward schemes are effectively the same except that the
position at which the velocity is evaluated is shifted one element down in
the backward scheme relative to the forward scheme. In the case of a
forward or backward difference scheme, the last or first element of the
velocity, respectively, is extrapolated from its neighboring point. In the
case of a centered difference scheme, the start and end boundary points are
evaluated using the forward and backward difference schemes, respectively.
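The definition above can be sketched in a few lines of numpy for the forward scheme on 1-d inputs. This is a minimal illustration of the formula, not the library routine (which also handles multi-dimensional arrays, other schemes, and a time_axis argument); the helper name `spin_forward` is an assumption:

```python
import numpy as np

def spin_forward(u, v, time):
    # Instantaneous spin with the forward scheme described above:
    # s[i] = (u[i]*dv[i] - v[i]*du[i]) / (2 * dt[i] * EKE[i]),
    # with the last element extrapolated from its neighbor.
    u, v, time = map(np.asarray, (u, v, time))
    du, dv, dt = np.diff(u), np.diff(v), np.diff(time)
    eke = 0.5 * (u[:-1] ** 2 + v[:-1] ** 2)
    s = (u[:-1] * dv - v[:-1] * du) / (2 * dt * eke)
    return np.append(s, s[-1])

# For circular motion at angular frequency omega, the spin recovers omega:
t = np.linspace(0, 10, 1001)
omega = 2.0
s = spin_forward(np.cos(omega * t), np.sin(omega * t), t)
```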
If u and v do not have the same shape.
If the time axis is outside of the valid range ([-1, N-1]).
If lengths of u, v, and time along time_axis are not equal.
If difference_scheme is not "forward", "backward", or "centered".
Sawford, B.L., 1999. Rotation of trajectories in Lagrangian stochastic
models of turbulent dispersion. Boundary-Layer Meteorology, 93, pp. 411-424.
https://doi.org/10.1023/A:1002114132715

Veneziani, M., Griffa, A., Garraffo, Z.D. and Chassignet, E.P., 2005.
Lagrangian spin parameter and coherent structures from trajectories released
in a high-resolution ocean model. Journal of Marine Research, 63(4),
pp. 753-788. https://elischolar.library.yale.edu/journal_of_marine_research/100/
Compute velocity from arrays of positions and time.

x and y can be provided as longitude and latitude in degrees if
coord_system == "spherical" (default), or as easting and northing if
coord_system == "cartesian".

The units of the result are meters per unit of time if
coord_system == "spherical". For example, if the time is provided in the
units of seconds, the resulting velocity is in the units of meters per
second. Otherwise, if coord_system == "cartesian", the units of the
resulting velocity correspond to the units of the input. For example,
if zonal and meridional displacements are in the units of kilometers and
time is in the units of hours, the resulting velocity is in the units of
kilometers per hour.

x, y, and time can be multi-dimensional arrays. If the time axis, along
which the finite differencing is performed, is not the last one (i.e.
x.shape[-1]), use the time_axis optional argument to specify along which
axis the differencing should be done. x, y, and time must have the same
shape.

Difference scheme can take one of three values:

- "forward" (default): finite difference is evaluated as dx[i]=dx[i+1]-dx[i];
- "backward": finite difference is evaluated as dx[i]=dx[i]-dx[i-1];
- "centered": finite difference is evaluated as dx[i]=(dx[i+1]-dx[i-1])/2.

Forward and backward schemes are effectively the same except that the
position at which the velocity is evaluated is shifted one element down in
the backward scheme relative to the forward scheme. In the case of a
forward or backward difference scheme, the last or first element of the
velocity, respectively, is extrapolated from its neighboring point. In the
case of a centered difference scheme, the start and end boundary points are
evaluated using the forward and backward difference schemes, respectively.
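The three schemes and their boundary treatment can be sketched for a 1-d Cartesian track. This is an illustration of the behavior described above, not the library function (which also handles spherical coordinates and multi-dimensional arrays); the helper name `velocity_1d` is an assumption:

```python
import numpy as np

def velocity_1d(x, time, scheme="forward"):
    # 1-d Cartesian finite-difference velocity with the boundary treatment
    # described above.
    x, time = np.asarray(x, float), np.asarray(time, float)
    v = np.empty_like(x)
    if scheme == "forward":
        v[:-1] = np.diff(x) / np.diff(time)
        v[-1] = v[-2]  # last point extrapolated from its neighbor
    elif scheme == "backward":
        v[1:] = np.diff(x) / np.diff(time)
        v[0] = v[1]  # first point extrapolated from its neighbor
    elif scheme == "centered":
        v[1:-1] = (x[2:] - x[:-2]) / (time[2:] - time[:-2])
        v[0] = (x[1] - x[0]) / (time[1] - time[0])       # forward at start
        v[-1] = (x[-1] - x[-2]) / (time[-1] - time[-2])  # backward at end
    else:
        raise ValueError('scheme must be "forward", "backward", or "centered"')
    return v
```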
If x and y do not have the same shape.
If time_axis is outside of the valid range.
If lengths of x, y, and time along time_axis are not equal.
If coord_system is not "spherical" or "cartesian".
If difference_scheme is not "forward", "backward", or "centered".
Given two sets of longitude, latitude, and time arrays, return in pairs
the indices of collocated data points that are within prescribed distances
in space and time. Also known as chance pairs.

space_distance : float, optional
Maximum allowable space distance in meters for a pair to qualify as a
chance pair. If the separation is within this distance, the pair is
considered to be a chance pair. Default is 0, or no distance, i.e. the
positions must be exactly the same.

time_distance : float, optional
Maximum allowable time distance for a pair to qualify as a chance pair.
If a separation is within this distance, and the space distance
condition is satisfied, the pair is considered a chance pair. Default is
0, or no distance, i.e. the times must be exactly the same.

In the following example, we load the GLAD dataset, extract the first
two trajectories, and find between these the array indices that satisfy
the chance pair criteria of 6 km separation distance and no time separation:
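The pairing logic can be sketched with a brute-force numpy comparison. This is an illustration only: the separation here is estimated with a small-separation planar approximation rather than a great circle distance, the Earth radius is an assumed value, and the helper name `chance_indices` is not the library's function:

```python
import numpy as np

def chance_indices(lon1, lat1, time1, lon2, lat2, time2,
                   space_distance=0.0, time_distance=0.0, r=6.3781e6):
    # Brute-force O(N*M) sketch: compare every point of track 1 against
    # every point of track 2 and keep pairs within both distances.
    lon1, lat1 = np.asarray(lon1)[:, None], np.asarray(lat1)[:, None]
    lon2, lat2 = np.asarray(lon2)[None, :], np.asarray(lat2)[None, :]
    dx = np.deg2rad(lon2 - lon1) * r * np.cos(np.deg2rad(0.5 * (lat1 + lat2)))
    dy = np.deg2rad(lat2 - lat1) * r
    close_in_space = np.hypot(dx, dy) <= space_distance
    close_in_time = np.abs(np.asarray(time1)[:, None]
                           - np.asarray(time2)[None, :]) <= time_distance
    return np.nonzero(close_in_space & close_in_time)

# With the zero-distance defaults, only exactly collocated and simultaneous
# points qualify:
i, j = chance_indices([0, 1, 2], [0, 0, 0], [0, 1, 2],
                      [2, 1, 0], [0, 0, 0], [0, 1, 2])
```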
Return all chance pairs of contiguous trajectories in a ragged array,
and their collocated points in space and (optionally) time, given input
ragged arrays of longitude, latitude, and (optionally) time, and chance
pair criteria as maximum allowable distances in space and time.

If time and time_distance are omitted, the search will be done
only on the spatial criteria, and the result will not include the time
arrays.

If time and time_distance are provided, the search will be done
on both the spatial and temporal criteria, and the result will include the
time arrays.

space_distance : float, optional
Maximum space distance in meters for the pair to qualify as a chance pair.
If the separation is within this distance, the pair is considered to be
a chance pair. Default is 0, or no distance, i.e. the positions must be
exactly the same.

time : array_like, optional
Array of times.

time_distance : float, optional
Maximum time distance allowed for the pair to qualify as a chance pair.
If the separation is within this distance, and the space distance
condition is satisfied, the pair is considered a chance pair. Default is
0, or no distance, i.e. the times must be exactly the same.

List of tuples, each tuple containing a tuple of integer indices that
correspond to the trajectory rows in the ragged array, indicating the
pair of trajectories that satisfy the chance pair criteria, and a tuple
of arrays containing the indices of the collocated points for each
trajectory in the chance pair.

In the following example, we load the GLAD dataset as a ragged array
dataset, subset the result to retain the first five trajectories, and
finally find all trajectories that satisfy the chance pair criteria of
12 km separation distance and no time separation, as well as the indices
of the collocated points for each pair.
Given two arrays of times (or any other monotonically increasing
quantity), return indices where the times are within a prescribed distance.

Although higher-level array containers like xarray and pandas are supported
for input arrays, this function is an order of magnitude faster when passing
in numpy arrays.
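The search can be sketched with plain numpy broadcasting. This is an illustration of the behavior, not the library function, which exploits the monotonicity of the inputs for speed; the helper name `pair_time_indices` is an assumption:

```python
import numpy as np

def pair_time_indices(time1, time2, distance=0.0):
    # Return index pairs (i, j) with |time1[i] - time2[j]| <= distance.
    # O(N*M) broadcasting; fine for small arrays, slow for large ones.
    diff = np.abs(np.asarray(time1)[:, None] - np.asarray(time2)[None, :])
    return np.nonzero(diff <= distance)

i, j = pair_time_indices([0.0, 1.0, 2.0], [1.5, 3.0], distance=0.5)
```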
Plot trajectories from a ragged array dataset on a Matplotlib Axes
or a Cartopy GeoAxes object ax.

This function wraps Matplotlib's plot function (plt.plot) and
LineCollection (matplotlib.collections) to efficiently plot
trajectories from a ragged array dataset.

colors : array_like, optional
Colors to use for plotting. If colors is the same shape as longitude and
latitude, the trajectories are split into segments and each segment is
colored according to the corresponding color value. If colors is the same
shape as rowsize, the trajectories are uniformly colored according to the
corresponding color value.

tolerance : float
Longitude tolerance gap between data points (in degrees) for segmenting
trajectories. For periodic domains, the tolerance parameter should be set
to the maximum allowed gap between data points. Defaults to 180.

ValueError
If longitude and latitude arrays do not have the same shape.
If colors do not have the same shape as the longitude and latitude arrays
or rowsize.
If ax is not a matplotlib Axes or GeoAxes object.
If ax is a GeoAxes object and the transform keyword argument is not provided.

ImportError
If matplotlib is not installed.
If the axis is a GeoAxes object and cartopy is not installed.
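The tolerance-based segmentation described above (breaking a line at dateline crossings, for instance) can be sketched in one numpy expression. This is an illustration of the idea, not the plotting function itself; the helper name `segment_break_indices` is an assumption:

```python
import numpy as np

def segment_break_indices(lon, tolerance=180.0):
    # Indices where a trajectory should be broken into separate line
    # segments because the longitude gap between consecutive points
    # exceeds the tolerance (e.g. a dateline crossing in a periodic domain).
    return np.nonzero(np.abs(np.diff(lon)) > tolerance)[0] + 1

# A track crossing the dateline between 175 and -180 is split at index 2:
breaks = segment_break_indices(np.array([170.0, 175.0, -180.0, -175.0]))
```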
The function func will be applied to each contiguous row of arrays as
indicated by row sizes rowsize. The output of func will be
concatenated into a single ragged array.

You can pass arrays as NumPy arrays or xarray DataArrays; however,
the result will always be a NumPy array. Passing rows as an integer or
a sequence of integers will make apply_ragged process and return only
those specific rows; otherwise, all rows in the input ragged array will
be processed. Further, you can use the axis parameter to specify the
ragged axis of the input array(s) (default is 0).

By default this function uses concurrent.futures.ThreadPoolExecutor to
run func in multiple threads. The number of threads can be controlled by
passing the max_workers argument to the executor instance passed to
apply_ragged. Alternatively, you can pass a concurrent.futures.ProcessPoolExecutor
instance to use processes instead. Passing alternative (third-party)
concurrent executors may work if they follow the same executor interface as
that of concurrent.futures; however, this has not been tested yet.

The row(s) of the ragged array to apply func to. If rows is
None (default), then func will be applied to all rows.

axis : int, optional
The ragged axis of the input arrays. Default is 0.

executor : concurrent.futures.Executor, optional
Executor to use for concurrent execution. Default is ThreadPoolExecutor
with the default number of max_workers.
Another supported option is ProcessPoolExecutor.

Using velocity_from_position with apply_ragged, calculate the velocities of
multiple particles, the coordinates of which are found in the ragged arrays
x, y, and t that share row sizes 2, 3, and 4:
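The core apply-and-concatenate pattern can be sketched serially in a few lines. This is an illustration only, without the executor machinery, axis handling, or xarray support of the library function; the helper name `apply_ragged_serial` is an assumption:

```python
import numpy as np

def apply_ragged_serial(func, arrays, rowsize, *args, **kwargs):
    # Apply func to each contiguous row of one or more ragged arrays and
    # concatenate the outputs into a single ragged array.
    starts = np.concatenate(([0], np.cumsum(rowsize)))
    out = [
        func(*(a[i:j] for a in arrays), *args, **kwargs)
        for i, j in zip(starts[:-1], starts[1:])
    ]
    return np.concatenate(out)

# Demean each of three rows (sizes 2, 3, and 4) of a ragged array:
x = np.arange(9.0)
anomaly = apply_ragged_serial(lambda a: a - a.mean(), [x], [2, 3, 4])
```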
Divide an array x into equal chunks of length length. The result
is a 2-dimensional NumPy array of shape (num_chunks, length). The resulting
number of chunks is determined based on the length of x, length,
and overlap.

chunk can be combined with apply_ragged() to chunk a ragged array.

overlap : int, optional
The number of overlapping array elements across chunks. The default is 0.
Must be smaller than length. For example, if length is 4 and
overlap is 2, the chunks of [0,1,2,3,4,5] will be
np.array([[0,1,2,3],[2,3,4,5]]). Negative overlap can be used
to offset chunks by some number of elements. For example, if length
is 2 and overlap is -1, the chunks of [0,1,2,3,4,5] will
be np.array([[0,1],[3,4]]).

align : str, optional ["start", "middle", "end"]
If the remainder of the length of x divided by the chunk length is a number
N different from zero, this parameter controls which part of the array will
be kept in the chunks. If align="start", the elements at the beginning of
the array will be part of the chunks and N points are discarded at the end.
If align="middle", floor(N/2) and ceil(N/2) elements will be discarded from
the beginning and the end of the array, respectively. If align="end", the
elements at the end of the array will be kept, and the N first elements are
discarded. The default is "start".
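The chunking behavior described above, including overlap and alignment, can be sketched as follows. This is an illustrative re-implementation under the stated semantics, not the library's code; the helper name `chunk_array` is an assumption:

```python
import numpy as np

def chunk_array(x, length, overlap=0, align="start"):
    # Divide x into chunks of the given length; consecutive chunks start
    # (length - overlap) elements apart, so a negative overlap skips
    # elements between chunks.
    x = np.asarray(x)
    step = length - overlap
    num = 0 if len(x) < length else 1 + (len(x) - length) // step
    if num == 0:
        return np.empty((0, length), dtype=x.dtype)
    remainder = len(x) - ((num - 1) * step + length)
    offset = {"start": 0, "middle": remainder // 2, "end": remainder}[align]
    return np.stack([x[offset + i * step: offset + i * step + length]
                     for i in range(num)])
```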
Convert a ragged array to a two-dimensional array such that each contiguous
segment of a ragged array is a row in the two-dimensional array. Each row of
the two-dimensional array is padded with NaNs as needed. The length of the
first dimension of the output array is the length of rowsize. The length of
the second dimension is the maximum element of rowsize.

Note: Although this function accepts parameters of type xarray.DataArray,
passing NumPy arrays is recommended for performance reasons.
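The padding described above can be sketched with plain numpy. This is an illustration of the semantics, not the library's implementation; the helper name `ragged_to_regular_np` is an assumption:

```python
import numpy as np

def ragged_to_regular_np(ragged, rowsize):
    # Pad each ragged row with NaN up to the length of the longest row.
    out = np.full((len(rowsize), int(np.max(rowsize))), np.nan)
    start = 0
    for i, n in enumerate(rowsize):
        out[i, :n] = ragged[start:start + n]
        start += n
    return out

# Two rows of sizes 2 and 1 become a 2-by-2 array with one NaN pad:
regular = ragged_to_regular_np(np.array([1.0, 2.0, 3.0]), [2, 1])
```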
The maximum signed difference between consecutive points in a segment.
The array x will be segmented wherever differences exceed the tolerance.

rowsize : np.ndarray[int], optional
The size of rows if x is originally a ragged array. If present, x will be
divided both by gaps that exceed the tolerance, and by the original rows
of the ragged array.
If the array is already previously segmented (e.g. multiple rows in
a ragged array), then the rowsize argument can be used to preserve
the original segments:

The tolerance can also be negative. In this case, the input array is
segmented where the negative difference exceeds the negative
value of the tolerance, i.e. where x[n+1]-x[n]<-tolerance:

To segment an array for both positive and negative gaps, invoke the function
twice, once with a positive tolerance and once with a negative tolerance.
The result of the first invocation can be passed as the rowsize argument
to the second segment invocation:
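The single-invocation, positive-tolerance case can be sketched as follows. This is an illustration only, without the rowsize argument or negative-tolerance handling of the library function; the helper name `segment_rowsize` is an assumption:

```python
import numpy as np

def segment_rowsize(x, tolerance):
    # Row sizes after splitting x wherever the forward difference between
    # consecutive points exceeds a positive tolerance.
    breaks = np.nonzero(np.diff(np.asarray(x)) > tolerance)[0] + 1
    return np.diff(np.concatenate(([0], breaks, [len(x)])))
```

For example, a gap of 8 between 2 and 10 splits the array into two rows of three points each: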
Subset a ragged array dataset as a function of one or more criteria.
The criteria are passed with a dictionary, where a dictionary key
is a variable to subset and the associated dictionary value is either a range
(valuemin, valuemax), a list [value1, value2, valueN], a single value, or a
masking function applied to every row of the ragged array using apply_ragged.

This function needs to know the names of the dimensions of the ragged array
dataset (traj_dim_name and obs_dim_name), and the name of the rowsize
variable (rowsize_var_name). Default values are provided for these arguments
(see below), but they can be changed if needed.

Dictionary containing the variables (as keys) and the ranges/values/functions
(as values) to subset.

id_var_name : str, optional
Name of the variable containing the ID of the trajectories (default is "id").

rowsize_var_name : str, optional
Name of the variable containing the number of observations per trajectory
(default is "rowsize").

traj_dim_name : str, optional
Name of the trajectory dimension (default is "traj").

obs_dim_name : str, optional
Name of the observation dimension (default is "obs").

full_trajectories : bool, optional
If True, it returns the complete trajectories (rows) where at least one
observation matches the criteria, rather than just the segments where the
criteria are satisfied. Default is False.
Criteria are combined on any data or metadata variables that are part of the
Dataset. The following examples are based on NOAA GDP datasets, which can be
accessed with the clouddrift.datasets module.

Retrieve a region, like the Gulf of Mexico, using ranges of latitude and
longitude:

>>> subset(ds, {"lat": (21, 31), "lon": (-98, -78)})

The parameter full_trajectories can be used to retrieve trajectories passing
through a region, for example all trajectories passing through the Gulf of
Mexico:

Retrieve trajectory segments with temperature higher than 30°C (303.15K):

>>> subset(ds, {"sst": (303.15, np.inf)})

You can use the same approach to return only the trajectories that are
shorter than some number of observations (similar to prune() but for
the entire dataset):

>>> subset(ds, {"rowsize": (0, 1000)})

Retrieve specific drifters from their IDs:

>>> subset(ds, {"id": [2578, 2582, 2583]})

Sometimes, you may want to retrieve specific rows of a ragged array.
You can do that by filtering along the trajectory dimension directly, since
this one corresponds to row numbers:

Note that to subset the time variable, the range has to be defined as a
function of the variable's type. By default, xarray uses np.datetime64 to
represent datetime data. If the datetime data is a datetime.datetime or
pd.Timestamp, the range would have to be defined accordingly.
Unpack a ragged array into a list of regular arrays.

Unpacking a np.ndarray ragged array is about two orders of magnitude
faster than unpacking an xr.DataArray ragged array, so unless you need a
DataArray as the result, we recommend passing np.ndarray as input.

Unpacking longitude arrays from a ragged Xarray Dataset:

lon = unpack(ds.lon, ds["rowsize"])  # list[xr.DataArray] (slower)
lon = unpack(ds.lon.values, ds["rowsize"])  # list[np.ndarray] (faster)
first_lon = unpack(ds.lon.values, ds["rowsize"], rows=0)  # only the first row
first_two_lons = unpack(ds.lon.values, ds["rowsize"], rows=[0, 1])  # first two rows

Looping over trajectories in a ragged Xarray Dataset to compute velocities
for each:
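For the fast numpy path, the unpacking itself reduces to np.split at the cumulative row boundaries. This is a minimal sketch of the semantics, not the library function; the helper name `unpack_np` is an assumption:

```python
import numpy as np

def unpack_np(ragged, rowsize, rows=None):
    # Split a 1-d ragged numpy array into a list of per-row arrays;
    # optionally return only the requested row(s).
    pieces = np.split(np.asarray(ragged), np.cumsum(rowsize)[:-1])
    if rows is None:
        return pieces
    return [pieces[i] for i in np.atleast_1d(rows)]

rows = unpack_np(np.arange(6.0), [2, 4])            # both rows
first = unpack_np(np.arange(6.0), [2, 4], rows=0)   # only the first row
```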
This module defines the RaggedArray class, which is the intermediate data
structure used by CloudDrift to process custom Lagrangian datasets to Xarray
Datasets and Awkward Arrays.
Return the analytic signal from a real-valued signal, or the analytic and
conjugate analytic signals from a complex-valued signal.

If the input is a real-valued signal, the analytic signal is calculated as
the inverse Fourier transform of the positive-frequency part of the Fourier
transform. If the input is a complex-valued signal, the conjugate analytic
signal is additionally calculated as the inverse Fourier transform of the
positive-frequency part of the Fourier transform of the complex conjugate
of the input signal.

For a complex-valued signal, the mean is evenly divided between the analytic
and conjugate analytic signals.

The calculation is performed along the last axis of the input array by
default. Alternatively, the user can specify the time axis of the input. The
user can also specify the boundary conditions to be applied to the input
array (default is "mirror").

Analytic signal. It is a tuple if the input is a complex-valued signal,
with the first element being the analytic signal and the second element
being the conjugate analytic signal.
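The positive-frequency construction for a real-valued input can be sketched with the FFT. This illustration uses a periodic boundary rather than the function's default "mirror" condition, and the helper name `analytic` is an assumption:

```python
import numpy as np

def analytic(x):
    # Analytic signal of a real-valued 1-d array: zero the negative
    # frequencies of the FFT and double the positive ones, keeping the
    # DC (and, for even lengths, Nyquist) bins unchanged.
    n = len(x)
    weight = np.zeros(n)
    weight[0] = 1.0
    if n % 2 == 0:
        weight[n // 2] = 1.0
        weight[1:n // 2] = 2.0
    else:
        weight[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weight)

# The real part of the analytic signal recovers the input, and for a pure
# cosine the imaginary part is the corresponding sine:
t = np.arange(64) / 64
xa = analytic(np.cos(2 * np.pi * 4 * t))
```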
Return rotary signals (wp, wn) from analytic Cartesian signals (ua, va).

If ua is the analytic signal from the real-valued signal u, and va the
analytic signal from the real-valued signal v, then the positive
(counterclockwise) and negative (clockwise) signals are defined by
wp = 0.5*(ua + 1j*va) and wn = 0.5*(ua - 1j*va).
Converts Cartesian three-dimensional coordinates to latitude and longitude on a
+spherical body.
+
The Cartesian coordinate system is a right-handed system whose
+origin lies at the center of the sphere. It is oriented with the
+Z-axis passing through the poles and the X-axis passing through
+the point lon = 0, lat = 0. This function is inverted by spherical_to_cartesian.
Project a three-dimensional Cartesian vector on a plane tangent to a
spherical Earth.

The Cartesian coordinate system is a right-handed system whose origin lies
at the center of a sphere. It is oriented with the Z-axis passing through
the north pole at lat = 90, the X-axis passing through the point lon = 0,
lat = 0, and the Y-axis passing through the point lon = 90, lat = 0.
Return elementwise great circle distance in meters between one or more
points from arrays of their latitudes and longitudes, using the Haversine
formula.

d = 2⋅r⋅asin √[sin²(Δφ/2) + cos φ1 ⋅ cos φ2 ⋅ sin²(Δλ/2)]

where (φ, λ) is (lat, lon) in radians and r is the radius of the sphere in
meters.
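The formula above translates directly into NumPy. This is an illustrative sketch, not the library function; the radius value is an assumption (the library defines its own Earth-radius constant):

```python
import numpy as np

EARTH_RADIUS_METERS = 6.3781e6  # assumed value for this sketch

def haversine_sketch(lat1, lon1, lat2, lon2):
    # All inputs in degrees; the formula itself works in radians.
    p1, p2 = np.radians(lat1), np.radians(lat2)
    dp = p2 - p1
    dl = np.radians(lon2) - np.radians(lon1)
    a = np.sin(dp / 2) ** 2 + np.cos(p1) * np.cos(p2) * np.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_METERS * np.arcsin(np.sqrt(a))

# One degree of latitude along a meridian is roughly 111 km.
d = haversine_sketch(0.0, 0.0, 1.0, 0.0)
```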
Convert Cartesian coordinates on a plane to spherical coordinates.

The arrays of input zonal and meridional displacements x and y are assumed
to follow a contiguous trajectory. The spherical coordinate of each
successive point is determined by following a great circle path from the
previous point. The spherical coordinate of the first point is determined by
following a great circle path from the origin, by default (0, 0).

The output arrays have the same floating-point output type as the input.

If projecting multiple trajectories onto the same plane, use apply_ragged()
for highest accuracy.
Return elementwise new position in degrees from arrays of latitude and
longitude in degrees, distance in meters, and bearing in radians, based on
the spherical law of cosines.

The formula is:

φ2 = asin( sin φ1 ⋅ cos δ + cos φ1 ⋅ sin δ ⋅ cos θ )
λ2 = λ1 + atan2( sin θ ⋅ sin δ ⋅ cos φ1, cos δ − sin φ1 ⋅ sin φ2 )

where (φ, λ) is (lat, lon) and θ is bearing, all in radians. Bearing is
defined as zero toward East and positive counterclockwise.
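The formula can be sketched as follows. For simplicity this sketch works entirely in radians (the library function takes and returns degrees), and the radius value is assumed. Whatever the bearing, the angular separation of the new point from the start equals distance / r, which the example checks:

```python
import numpy as np

R = 6.3781e6  # assumed sphere radius in meters

def next_position_sketch(lat1, lon1, distance, bearing):
    # Direct application of the spherical law of cosines formula above;
    # lat/lon and bearing in radians, distance in meters.
    delta = distance / R  # angular distance traveled
    lat2 = np.arcsin(np.sin(lat1) * np.cos(delta)
                     + np.cos(lat1) * np.sin(delta) * np.cos(bearing))
    lon2 = lon1 + np.arctan2(np.sin(bearing) * np.sin(delta) * np.cos(lat1),
                             np.cos(delta) - np.sin(lat1) * np.sin(lat2))
    return lat2, lon2

# Starting from (0, 0), the great-circle separation of the result from
# the origin is distance / R regardless of the bearing.
lat2, lon2 = next_position_sketch(0.0, 0.0, 1000.0, 0.3)
sep = np.arccos(np.cos(lat2) * np.cos(lon2))  # separation from (0, 0)
assert np.isclose(sep, 1000.0 / R)
```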
Convert spherical coordinates to a tangent (Cartesian) plane.

The arrays of input longitudes and latitudes are assumed to be following a
contiguous trajectory. The Cartesian coordinate of each successive point is
determined by following a great circle path from the previous point. The
Cartesian coordinate of the first point is determined by following a great
circle path from the origin, by default (0, 0).

The output arrays have the same floating-point output type as the input.

If projecting multiple trajectories onto the same plane, use apply_ragged()
for highest accuracy.
Converts latitude and longitude on a spherical body to three-dimensional
Cartesian coordinates.

The Cartesian coordinate system is a right-handed system whose origin lies
at the center of a sphere. It is oriented with the Z-axis passing through
the poles and the X-axis passing through the point lon = 0, lat = 0. This
function is inverted by cartesian_to_spherical().
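In the coordinate system described, the conversion is the standard spherical-to-Cartesian mapping. A minimal sketch (the radius value is an assumption for illustration):

```python
import numpy as np

EARTH_RADIUS = 6.3781e6  # assumed radius in meters for this sketch

def spherical_to_cartesian_sketch(lon_deg, lat_deg, radius=EARTH_RADIUS):
    # Right-handed system: Z through the poles, X through (lon=0, lat=0).
    lon, lat = np.radians(lon_deg), np.radians(lat_deg)
    x = radius * np.cos(lat) * np.cos(lon)
    y = radius * np.cos(lat) * np.sin(lon)
    z = radius * np.sin(lat)
    return x, y, z

# (lon=0, lat=0) lies on the X-axis; the north pole lies on the Z-axis.
x, y, z = spherical_to_cartesian_sketch(0.0, 0.0)
```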
Return the three-dimensional Cartesian components of a vector contained in a
plane tangent to a spherical Earth.

The Cartesian coordinate system is a right-handed system whose origin lies
at the center of a sphere. It is oriented with the Z-axis passing through
the north pole at lat = 90, the X-axis passing through the point lon = 0,
lat = 0, and the Y-axis passing through the point lon = 90, lat = 0.
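For the coordinate system described above, this amounts to the standard east/north-to-Cartesian rotation. A sketch with illustrative argument names (not the library's signature):

```python
import numpy as np

def tangentplane_to_cartesian_sketch(u_east, v_north, lon_deg, lat_deg):
    # Rotate an (east, north) tangent-plane vector into the right-handed
    # Cartesian frame: Z through the north pole, X through (lon=0, lat=0),
    # Y through (lon=90, lat=0).
    lon, lat = np.radians(lon_deg), np.radians(lat_deg)
    x = -u_east * np.sin(lon) - v_north * np.sin(lat) * np.cos(lon)
    y = u_east * np.cos(lon) - v_north * np.sin(lat) * np.sin(lon)
    z = v_north * np.cos(lat)
    return x, y, z

# At (lon=0, lat=0), east points along the Y-axis and north along the Z-axis.
```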
This module provides functions for computing wavelet transforms and
time-frequency analyses, notably using generalized Morse wavelets.

The Python code in this module was translated from the MATLAB implementation
by J. M. Lilly in the jWavelet module of jLab (http://jmlilly.net/code.html).

Lilly, J. M. (2021), jLab: A data analysis package for Matlab, v.1.7.1,
doi:10.5281/zenodo.4547006, http://www.jmlilly.net/software.

jLab is licensed under the Creative Commons
Attribution-Noncommercial-ShareAlike License
(https://creativecommons.org/licenses/by-nc-sa/4.0/). The code that is
directly translated from jLab/jWavelet is licensed under the same license.
Any other code that is added to this module and that is specific to Python
and not the MATLAB implementation is licensed under CloudDrift's MIT
license.
Calculate the amplitude coefficient of the generalized Morse wavelets. By
default, the amplitude is calculated such that the maximum of the
frequency-domain wavelet is equal to 2, which is the bandpass normalization.
Optionally, specify normalization="energy" in order to return the
coefficient giving the wavelets unit energies. See Lilly and Olhede (2009),
doi: 10.1109/TSP.2008.2007607.
Normalization for the wavelets. By default it is assumed to be "bandpass",
which uses a bandpass normalization, meaning that the FFT of the wavelets
has a peak value of 2 for all central frequencies radian_frequency. The
other option is "energy", which uses the unit energy normalization. In this
last case the time-domain wavelet energies np.sum(np.abs(wave)**2) are
always unity.
Frequency measures for generalized Morse wavelets. This function calculates
three different measures fm, fe, and fi of the frequency of the lowest-order
generalized Morse wavelet specified by parameters gamma and beta.

Note that all frequency quantities here are in radians, as in cos(f t), and
not cyclic, as in np.cos(2 np.pi f t).

For beta=0, the corresponding wavelet becomes an analytic lowpass filter,
and fm is not defined in the usual way but as the point at which the filter
has decayed to one-half of its peak power.

For details see Lilly and Olhede (2009), doi: 10.1109/TSP.2008.2007607.
Compute logarithmically-spaced frequencies for generalized Morse wavelets
with parameters gamma and beta. This is a useful function to obtain the
frequencies needed for time-frequency analyses using wavelets. If
radian_frequencies is the output, np.log(radian_frequencies) is uniformly
spaced, following convention for wavelet analysis. See Lilly (2017),
doi: 10.1098/rspa.2016.0776.

Default settings to compute the frequencies can be changed by passing
optional arguments lowset, highset, and density. See below.
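The log-uniform spacing itself is easy to picture. The sketch below builds such a grid with np.geomspace between assumed low and high cutoffs; the actual function derives its cutoffs from gamma, beta, and the lowset/highset tuples, and its point count from density:

```python
import numpy as np

# Illustrative cutoffs only; morse_logspace_freq computes these itself.
low, high, n = 2 * np.pi / 1000, np.pi, 50
radian_frequencies = np.geomspace(high, low, n)  # decreasing, log-spaced

# np.log of the result is uniformly spaced, as the docstring states.
steps = np.diff(np.log(radian_frequencies))
assert np.allclose(steps, steps[0])
```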
highset : tuple of floats, optional
Tuple of values (eta, high) used for the high-frequency cutoff calculation.
The highest frequency is set based on a Nyquist overlap condition: it is the
minimum of the specified value high and the largest frequency for which the
wavelet satisfies the threshold level eta. Here eta is a number between zero
and one specifying the ratio of a frequency-domain wavelet at the Nyquist
frequency to its peak value. Default is (eta, high) = (0.1, np.pi).

lowset : tuple of floats, optional
Tuple of values (P, low) used for the low-frequency cutoff calculation based
on an endpoint overlap condition. The lowest frequency is set such that the
lowest-frequency wavelet will reach some number P, called the packing
number, times its central window width at the ends of the time series. A
choice of P=1 corresponds to roughly 95% of the time-domain wavelet energy
being contained within the time series endpoints for a wavelet at the center
of the domain. The second value of the tuple is the absolute lowest
frequency. Default is (P, low) = (5, 0).

density : int, optional
This optional argument controls the number of points in the returned
frequency array. Higher values of density mean more overlap in the frequency
domain between transforms. When density=1, the peak of one wavelet is
located at the half-power point of the adjacent wavelet. The default
density=4 means that four other wavelets will occur between the peak of one
wavelet and its half-power point.
radian_frequency : np.ndarray
The radian frequencies at which the Fourier transform of the wavelets reach
their maximum amplitudes. radian_frequency is between 0 and 2 * np.pi * 0.5,
the normalized Nyquist radian frequency.

order : int, optional
Order of wavelets, default is 1.

normalization : str, optional
Normalization for the wavelet output. By default it is assumed to be
"bandpass", which uses a bandpass normalization, meaning that the FFT of the
wavelets has a peak value of 2 for all central frequencies radian_frequency.
The other option is "energy", which uses the unit energy normalization. In
this last case, the time-domain wavelet energies np.sum(np.abs(wave)**2) are
always unity.
Apply a continuous wavelet transform to an input signal using the
generalized Morse wavelets of Olhede and Walden (2002). The wavelet
transform is normalized differently for complex-valued input than for
real-valued input, and this in turn depends on whether the optional argument
normalization is set to "bandpass" or "energy".
x : np.ndarray
Real- or complex-valued signals. The time axis is assumed to be the last. If
not, specify the optional argument time_axis.

gamma : float
Gamma parameter of the Morse wavelets.

beta : float
Beta parameter of the Morse wavelets.

radian_frequency : np.ndarray
An array of radian frequencies at which the Fourier transform of the
wavelets reach their maximum amplitudes. radian_frequency is typically
between 0 and 2 * np.pi * 0.5, the normalized Nyquist radian frequency.

complex : boolean, optional
Specify explicitly if the input signal x is a complex signal. Default is
False, which means that the input is real, but that is not explicitly tested
by the function. This choice affects the normalization of the outputs and
their interpretation. See examples below.

time_axis : int, optional
Axis on which the time is defined for input x (default is last, or -1).

normalization : str, optional
Normalization for the wavelet transforms. By default it is assumed to be
"bandpass", which uses a bandpass normalization, meaning that the FFT of the
wavelets has a peak value of 2 for all central frequencies radian_frequency.
However, if the optional argument complex=True is specified, the wavelets
are divided by 2 so that the total variance of the input complex signal is
equal to the sum of the variances of the returned analytic (positive) and
conjugate analytic (negative) parts. See examples below. The other option is
"energy", which uses the unit energy normalization. In this last case, the
time-domain wavelet energies np.sum(np.abs(wave)**2) are always unity.

boundary : str, optional
The boundary condition to be imposed at the edges of the input signal x.
Allowed values are "mirror", "zeros", and "periodic". Default is "mirror".
If the input signal is real as specified by complex=False:

wtx : np.ndarray
Time-domain wavelet transform of input x with shape ((x shape without
time_axis), orders, frequencies, time_axis), but with dimensions of length 1
removed (squeezed).

If the input signal is complex as specified by complex=True, a tuple is
returned:

wtx_p : np.ndarray
Time-domain positive wavelet transform of input x with shape ((x shape
without time_axis), frequencies, orders), but with dimensions of length 1
removed (squeezed).

wtx_n : np.ndarray
Time-domain negative wavelet transform of input x with shape ((x shape
without time_axis), frequencies, orders), but with dimensions of length 1
removed (squeezed).
Apply a wavelet transform with a Morse wavelet with gamma parameter 3, beta
parameter 4, for a complex input signal at radian frequency 0.2 cycles per
unit time. This case returns the analytic and conjugate analytic components:

The same result as above can be obtained by applying the Morse transform on
the real and imaginary components of z and recombining the results as
follows for the "bandpass" normalization:
>>> wtz_real = morse_wavelet_transform(np.real(z), 3, 4, np.array([2*np.pi*0.2]))
>>> wtz_imag = morse_wavelet_transform(np.imag(z), 3, 4, np.array([2*np.pi*0.2]))
>>> wtz_p, wtz_n = (wtz_real + 1j*wtz_imag) / 2, (wtz_real - 1j*wtz_imag) / 2

For the "energy" normalization, the analytic and conjugate analytic
components are obtained as follows with this alternative method:
>>> wtz_real = morse_wavelet_transform(np.real(z), 3, 4, np.array([2*np.pi*0.2]), normalization="energy")
>>> wtz_imag = morse_wavelet_transform(np.imag(z), 3, 4, np.array([2*np.pi*0.2]), normalization="energy")
>>> wtz_p, wtz_n = (wtz_real + 1j*wtz_imag) / np.sqrt(2), (wtz_real - 1j*wtz_imag) / np.sqrt(2)

The input signal can have an arbitrary number of dimensions, but its
time_axis must be specified if it is not the last:

This function can be used to conduct a time-frequency analysis of the input
signal by specifying a range of radian frequencies using the
morse_logspace_freq function as an example:
If the time axis is outside of the valid range ([-1, np.ndim(x)-1]).
If the boundary optional argument is not in ["mirror", "zeros", "periodic"].
If the normalization optional argument is not in ["bandpass", "energy"].
wavelet : np.ndarray
A suite of time-domain wavelets, typically returned by the function
morse_wavelet. The time axis of the wavelets must be the last one and must
match the length of the time axis of x. The other dimensions (axes) of the
wavelets (such as orders and frequencies) are typically organized as orders,
frequencies, and time, unless specified by the optional arguments freq_axis
and order_axis. The normalization of the wavelets is assumed to be
"bandpass"; if not, use the keyword argument normalization="energy", see
morse_wavelet.

boundary : str, optional
The boundary condition to be imposed at the edges of the input signal x.
Allowed values are "mirror", "zeros", and "periodic". Default is "mirror".

time_axis : int, optional
Axis on which the time is defined for input x (default is last, or -1).
Note that the time axis of the wavelets must be last.

freq_axis : int, optional
Axis of wavelet for the frequencies (default is second, or 1).

order_axis : int, optional
Axis of wavelet for the orders (default is first, or 0).

Time-domain wavelet transform of x with shape ((x shape without time_axis),
orders, frequencies, time_axis), but with dimensions of length 1 removed
(squeezed).

If the time axis is outside of the valid range ([-1, N-1]).
If the shape of the time axis is different for the input signal and the wavelet.
If the boundary optional argument is not in ["mirror", "zeros", "periodic"].
"""
This module provides functions and metadata to convert the Global Drifter
Program (GDP) data to a ``clouddrift.RaggedArray`` instance. The functions
defined in this module are common to both hourly (``clouddrift.adapters.gdp1h``)
and six-hourly (``clouddrift.adapters.gdp6h``) GDP modules.
"""

from clouddrift.adapters.utils import download_with_progress
import numpy as np
import os
import pandas as pd
import xarray as xr

GDP_COORDS = [
    "ids",
    "time",
]

GDP_METADATA = [
    "ID",
    "rowsize",
    "WMO",
    "expno",
    "deploy_date",
    "deploy_lat",
    "deploy_lon",
    "start_date",
    "start_lat",
    "start_lon",
    "end_date",
    "end_lat",
    "end_lon",
    "drogue_lost_date",
    "typedeath",
    "typebuoy",
    "location_type",
    "DeployingShip",
    "DeploymentStatus",
    "BuoyTypeManufacturer",
    "BuoyTypeSensorArray",
    "CurrentProgram",
    "PurchaserFunding",
    "SensorUpgrade",
    "Transmissions",
    "DeployingCountry",
    "DeploymentComments",
    "ManufactureYear",
    "ManufactureMonth",
    "ManufactureSensorType",
    "ManufactureVoltage",
    "FloatDiameter",
    "SubsfcFloatPresence",
    "DrogueType",
    "DrogueLength",
    "DrogueBallast",
    "DragAreaAboveDrogue",
    "DragAreaOfDrogue",
    "DragAreaRatio",
    "DrogueCenterDepth",
    "DrogueDetectSensor",
]


def cast_float64_variables_to_float32(
    ds: xr.Dataset, variables_to_skip: list[str] = ["time", "lat", "lon"]
) -> xr.Dataset:
    """Cast all float64 variables except ``variables_to_skip`` to float32.
    Extra precision from float64 is not needed and takes up memory and disk
    space.

    Parameters
    ----------
    ds : xr.Dataset
        Dataset to modify
    variables_to_skip : list[str]
        List of variables to skip; default is ["time", "lat", "lon"].

    Returns
    -------
    ds : xr.Dataset
        Modified dataset
    """
    for var in ds.variables:
        if var in variables_to_skip:
            continue
        if ds[var].dtype == "float64":
            ds[var] = ds[var].astype("float32")
    return ds


def parse_directory_file(filename: str) -> pd.DataFrame:
    """Read a GDP directory file that contains metadata of drifter releases.

    Parameters
    ----------
    filename : str
        Name of the directory file to parse.

    Returns
    -------
    df : pd.DataFrame
        List of drifters from a single directory file as a pandas DataFrame.
    """
    GDP_DIRECTORY_FILE_URL = "https://www.aoml.noaa.gov/ftp/pub/phod/buoydata/"
    df = pd.read_csv(
        os.path.join(GDP_DIRECTORY_FILE_URL, filename), delimiter=r"\s+", header=None
    )

    # Combine the date and time columns to easily parse dates below.
    df[4] += " " + df[5]
    df[8] += " " + df[9]
    df[12] += " " + df[13]
    df = df.drop(columns=[5, 9, 13])
    df.columns = [
        "ID",
        "WMO_number",
        "program_number",
        "buoys_type",
        "Start_date",
        "Start_lat",
        "Start_lon",
        "End_date",
        "End_lat",
        "End_lon",
        "Drogue_off_date",
        "death_code",
    ]
    for t in ["Start_date", "End_date", "Drogue_off_date"]:
        df[t] = pd.to_datetime(df[t], format="%Y/%m/%d %H:%M", errors="coerce")

    return df


def get_gdp_metadata() -> pd.DataFrame:
    """Download and parse GDP metadata and return it as a Pandas DataFrame.

    Returns
    -------
    df : pd.DataFrame
        Sorted list of drifters as a pandas DataFrame.
    """
    directory_file_pattern = "dirfl_{low}_{high}.dat"

    dfs = []
    start = 1
    while True:
        name = directory_file_pattern.format(low=start, high=start + 4999)
        try:
            dfs.append(parse_directory_file(name))
            start += 5000
        except Exception:
            break

    name = directory_file_pattern.format(low=start, high="current")
    dfs.append(parse_directory_file(name))

    df = pd.concat(dfs)
    df.sort_values(["Start_date"], inplace=True, ignore_index=True)
    return df


def order_by_date(df: pd.DataFrame, idx: list[int]) -> np.ndarray[int]:
    """From the previously sorted DataFrame of directory files, return the
    unique set of drifter IDs sorted by their start date (the date of the
    first quality-controlled data point).

    Parameters
    ----------
    df : pd.DataFrame
        Previously sorted DataFrame of directory files
    idx : list
        List of drifters to include in the ragged array

    Returns
    -------
    idx : np.ndarray
        Unique set of drifter IDs sorted by their start date.
    """
    return df.ID[np.where(np.in1d(df.ID, idx))[0]].values


def fetch_netcdf(url: str, file: str):
    """Download and save the file from the given url, if not already downloaded.

    Parameters
    ----------
    url : str
        URL from which to download the file.
    file : str
        Name of the file to save.
    """
    download_with_progress([(url, file)])
+
+
+
+
+[docs]
+defdecode_date(t):
+"""The date format is specified as 'seconds since 1970-01-01 00:00:00' but
+ the missing values are stored as -1e+34 which is not supported by the
+ default parsing mechanism in xarray.
+
+ This function returns replaced the missing value by NaN and returns a
+ datetime instance.
+
+ Parameters
+ ----------
+ t : array
+ Array of time values
+
+ Returns
+ -------
+ out : array
+ Time array with the missing values replaced by NaN
+ """
+ nat_index = np.logical_or(np.isclose(t, -1e34), np.isnan(t))
+ t[nat_index] = np.nan
+ return t
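A runnable sketch of the replacement logic, independent of the rest of the module:

```python
import numpy as np

def decode_date_sketch(t: np.ndarray) -> np.ndarray:
    # Entries equal to the -1e+34 sentinel (or already NaN) become NaN.
    nat_index = np.logical_or(np.isclose(t, -1e34), np.isnan(t))
    t[nat_index] = np.nan
    return t

# Seconds since 1970-01-01, with one missing value stored as -1e+34.
times = decode_date_sketch(np.array([0.0, 3600.0, -1e34]))
```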
+
+
+
+
+def fill_values(var, default=np.nan):
+"""Change fill values (-1e+34, inf, -inf) in var array to the value
+ specified by default.
+
+ Parameters
+ ----------
+ var : array
+ Array to fill
+ default : float
+ Default value to use for fill values
+ """
+ missing_value = np.logical_or(np.isclose(var, -1e34), ~np.isfinite(var))
+ if np.any(missing_value):
+ var[missing_value] = default
+ return var
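For example, the same logic applied to a small array (a standalone sketch mirroring the function above):

```python
import numpy as np

def fill_values_sketch(var: np.ndarray, default: float = np.nan) -> np.ndarray:
    # The -1e+34 sentinel and any non-finite entry (inf, -inf, nan) count as missing.
    missing_value = np.logical_or(np.isclose(var, -1e34), ~np.isfinite(var))
    if np.any(missing_value):
        var[missing_value] = default
    return var

sst = fill_values_sketch(np.array([290.5, -1e34, np.inf, 291.0]))
```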
+
+
+
+
+def str_to_float(value: str, default: float = np.nan) -> float:
+"""Convert a string to float, while returning the value of default if the
+ string is not convertible to a float, or if it's a NaN.
+
+ Parameters
+ ----------
+ value : str
+ String to convert to float
+ default : float
+ Default value to return if the string is not convertible to float
+
+ Returns
+ -------
+ out : float
+ Float value of the string, or default if the string is not convertible to float.
+ """
+ try:
+ fvalue = float(value)
+ if np.isnan(fvalue):
+ return default
+ else:
+ return fvalue
+ except ValueError:
+ return default
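This conversion is used when parsing free-form metadata strings such as "35.5 cm"; a standalone sketch:

```python
import numpy as np

def str_to_float_sketch(value: str, default: float = np.nan) -> float:
    # Return `default` when the string does not parse, or parses to NaN.
    try:
        fvalue = float(value)
        return default if np.isnan(fvalue) else fvalue
    except ValueError:
        return default

diameter = str_to_float_sketch("35.5")      # numeric string parses normally
missing = str_to_float_sketch("n/a", -1.0)  # non-numeric string falls back
```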
+
+
+
+
+def cut_str(value: str, max_length: int) -> np.chararray:
+"""Cut a string to a specific length and return it as a numpy chararray.
+
+ Parameters
+ ----------
+ value : str
+ String to cut
+ max_length : int
+ Length of the output
+
+ Returns
+ -------
+ out : np.chararray
+ String with max_length characters
+ """
+ charar = np.chararray(1, max_length)
+ charar[:max_length] = value
+ return charar
+
+
+
+
+def drogue_presence(lost_time, time) -> bool:
+"""Create drogue status from the drogue lost time and the trajectory time.
+
+ Parameters
+ ----------
+ lost_time
+ Timestamp of the drogue loss (or NaT)
+ time
+ Observation time
+
+ Returns
+ -------
+ out : array
+ Boolean array which is True where the drogue is present and False otherwise
+ """
+ if pd.isnull(lost_time) or lost_time >= time[-1]:
+ return np.ones_like(time, dtype="bool")
+ else:
+ return time < lost_time
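A self-contained sketch of the drogue-status logic with plain numeric times (any comparable time type behaves the same way):

```python
import numpy as np
import pandas as pd

def drogue_presence_sketch(lost_time, time) -> np.ndarray:
    # No recorded loss, or loss after the last observation: drogued throughout.
    if pd.isnull(lost_time) or lost_time >= time[-1]:
        return np.ones_like(time, dtype="bool")
    # Otherwise the drifter is drogued only before the loss time.
    return time < lost_time

time = np.array([0.0, 3600.0, 7200.0])  # observation times in seconds
status = drogue_presence_sketch(1800.0, time)
```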
+"""
+This module provides functions and metadata that can be used to convert the
+hourly Global Drifter Program (GDP) data to a ``clouddrift.RaggedArray``
+instance.
+"""
+
+import clouddrift.adapters.gdp as gdp
+from clouddrift.raggedarray import RaggedArray
+from clouddrift.adapters.utils import download_with_progress
+from datetime import datetime, timedelta
+import numpy as np
+import urllib.request
+import re
+import tempfile
+from typing import Optional
+import os
+import warnings
+import xarray as xr
+
+GDP_VERSION = "2.01"
+
+GDP_DATA_URL = "https://www.aoml.noaa.gov/ftp/pub/phod/buoydata/hourly_product/v2.01/"
+GDP_DATA_URL_EXPERIMENTAL = (
+ "https://www.aoml.noaa.gov/ftp/pub/phod/lumpkin/hourly/experimental/"
+)
+GDP_TMP_PATH = os.path.join(tempfile.gettempdir(), "clouddrift", "gdp")
+GDP_TMP_PATH_EXPERIMENTAL = os.path.join(tempfile.gettempdir(), "clouddrift", "gdp_exp")
+GDP_DATA = [
+ "lon",
+ "lat",
+ "ve",
+ "vn",
+ "err_lat",
+ "err_lon",
+ "err_ve",
+ "err_vn",
+ "gap",
+ "sst",
+ "sst1",
+ "sst2",
+ "err_sst",
+ "err_sst1",
+ "err_sst2",
+ "flg_sst",
+ "flg_sst1",
+ "flg_sst2",
+ "drogue_status",
+]
+
+
+
+def download(
+ drifter_ids: list = None,
+ n_random_id: int = None,
+ url: str = GDP_DATA_URL,
+ tmp_path: str = None,
+):
+"""Download individual NetCDF files from the AOML server.
+
+ Parameters
+ ----------
+ drifter_ids : list
+ List of drifters to retrieve (Default: all)
+ n_random_id : int
+ Randomly select n_random_id drifter IDs to download (Default: None)
+ url : str
+ URL from which to download the data (Default: GDP_DATA_URL). Alternatively, it can be GDP_DATA_URL_EXPERIMENTAL.
+ tmp_path : str, optional
+ Path to the directory where the individual NetCDF files are stored
+ (default varies depending on operating system; /tmp/clouddrift/gdp on Linux)
+
+ Returns
+ -------
+ out : list
+ List of retrieved drifters
+ """
+
+ # adjust the tmp_path if using the experimental source
+ if tmp_path is None:
+ tmp_path = GDP_TMP_PATH if url == GDP_DATA_URL else GDP_TMP_PATH_EXPERIMENTAL
+
+ print(f"Downloading GDP hourly data from {url} to {tmp_path}...")
+
+ # Create a temporary directory if it doesn't already exist.
+ os.makedirs(tmp_path, exist_ok=True)
+
+ if url == GDP_DATA_URL:
+ pattern = "drifter_hourly_[0-9]*.nc"
+ filename_pattern = "drifter_hourly_{id}.nc"
+ elif url == GDP_DATA_URL_EXPERIMENTAL:
+ pattern = "drifter_hourly_[0-9]*.nc"
+ filename_pattern = "drifter_hourly_{id}.nc"
+
+ # retrieve all drifter ID numbers
+ if drifter_ids is None:
+ urlpath = urllib.request.urlopen(url)
+ string = urlpath.read().decode("utf-8")
+ filelist = re.compile(pattern).findall(string)
+ drifter_ids = np.unique([int(f.split("_")[-1][:-3]) for f in filelist])
+
+ # retrieve only a subset of n_random_id trajectories
+ if n_random_id:
+ if n_random_id > len(drifter_ids):
+ warnings.warn(
+ f"Retrieving all listed trajectories because {n_random_id} is larger than the {len(drifter_ids)} listed trajectories."
+ )
+ else:
+ rng = np.random.RandomState(42)
+ drifter_ids = sorted(rng.choice(drifter_ids, n_random_id, replace=False))
+
+ download_requests = [
+ (os.path.join(url, file_name), os.path.join(tmp_path, file_name))
+ for file_name in map(lambda d_id: filename_pattern.format(id=d_id), drifter_ids)
+ ]
+ download_with_progress(download_requests)
+ # Download the metadata so we can order the drifter IDs by start date.
+ gdp_metadata = gdp.get_gdp_metadata()
+
+ return gdp.order_by_date(gdp_metadata, drifter_ids)
+
+
+
+
+def preprocess(index: int, **kwargs) -> xr.Dataset:
+"""Extract and preprocess the Lagrangian data and attributes.
+
+ This function takes an identification number that can be used to create a
+ file or url pattern or select data from a Dataframe. It then preprocesses
+ the data and returns a clean Xarray Dataset.
+
+ Parameters
+ ----------
+ index : int
+ Drifter's identification number
+
+ Returns
+ -------
+ ds : xr.Dataset
+ Xarray Dataset containing the data and attributes
+ """
+ ds = xr.load_dataset(
+ os.path.join(kwargs["tmp_path"], kwargs["filename_pattern"].format(id=index)),
+ decode_times=False,
+ decode_coords=False,
+ )
+
+ # parse the date with custom function
+ ds["deploy_date"].data = gdp.decode_date(np.array([ds.deploy_date.data[0]]))
+ ds["end_date"].data = gdp.decode_date(np.array([ds.end_date.data[0]]))
+ ds["drogue_lost_date"].data = gdp.decode_date(
+ np.array([ds.drogue_lost_date.data[0]])
+ )
+ ds["time"].data = gdp.decode_date(np.array([ds.time.data[0]]))
+
+ # convert fill values to nan
+ for var in [
+ "err_lon",
+ "err_lat",
+ "err_ve",
+ "err_vn",
+ "sst",
+ "sst1",
+ "sst2",
+ "err_sst",
+ "err_sst1",
+ "err_sst2",
+ ]:
+ try:
+ ds[var].data = gdp.fill_values(ds[var].data)
+ except KeyError:
+ warnings.warn(f"Variable {var} not found; skipping.")
+
+ # fix missing values stored as str
+ for var in [
+ "longitude",
+ "latitude",
+ "err_lat",
+ "err_lon",
+ "ve",
+ "vn",
+ "err_ve",
+ "err_vn",
+ "sst",
+ "sst1",
+ "sst2",
+ ]:
+ try:
+ ds[var].encoding["missing value"] = -1e-34
+ except KeyError:
+ warnings.warn(f"Variable {var} not found in upstream data; skipping.")
+
+ # convert the types of some variables
+ target_dtype = {
+ "ID":"int64",
+ "WMO":"int32",
+ "expno":"int32",
+ "typedeath":"int8",
+ "flg_sst":"int8",
+ "flg_sst1":"int8",
+ "flg_sst2":"int8",
+ }
+
+ for var in target_dtype.keys():
+ if var in ds.keys():
+ ds[var].data = ds[var].data.astype(target_dtype[var])
+ else:
+ warnings.warn(f"Variable {var} not found in upstream data; skipping.")
+
+ # new variables
+ ds["ids"]=(["traj","obs"],[np.repeat(ds.ID.values,ds.sizes["obs"])])
+ ds["drogue_status"]=(
+ ["traj","obs"],
+ [gdp.drogue_presence(ds.drogue_lost_date.data,ds.time.data[0])],
+ )
+
+ # convert attributes to variable
+ ds["location_type"]=(
+ ("traj"),
+ [Falseifds.get("location_type")=="Argos"elseTrue],
+ )# 0 for Argos, 1 for GPS
+ ds["DeployingShip"]=(("traj"),gdp.cut_str(ds.DeployingShip,20))
+ ds["DeploymentStatus"]=(("traj"),gdp.cut_str(ds.DeploymentStatus,20))
+ ds["BuoyTypeManufacturer"]=(("traj"),gdp.cut_str(ds.BuoyTypeManufacturer,20))
+ ds["BuoyTypeSensorArray"]=(("traj"),gdp.cut_str(ds.BuoyTypeSensorArray,20))
+ ds["CurrentProgram"]=(
+ ("traj"),
+ np.int32([gdp.str_to_float(ds.CurrentProgram,-1)]),
+ )
+ ds["PurchaserFunding"]=(("traj"),gdp.cut_str(ds.PurchaserFunding,20))
+ ds["SensorUpgrade"]=(("traj"),gdp.cut_str(ds.SensorUpgrade,20))
+ ds["Transmissions"]=(("traj"),gdp.cut_str(ds.Transmissions,20))
+ ds["DeployingCountry"]=(("traj"),gdp.cut_str(ds.DeployingCountry,20))
+ ds["DeploymentComments"]=(
+ ("traj"),
+ gdp.cut_str(
+ ds.DeploymentComments.encode("ascii","ignore").decode("ascii"),20
+ ),
+ )# remove non ascii char
+ ds["ManufactureYear"]=(
+ ("traj"),
+ np.int16([gdp.str_to_float(ds.ManufactureYear,-1)]),
+ )
+ ds["ManufactureMonth"]=(
+ ("traj"),
+ np.int16([gdp.str_to_float(ds.ManufactureMonth,-1)]),
+ )
+ ds["ManufactureSensorType"]=(("traj"),gdp.cut_str(ds.ManufactureSensorType,20))
+ ds["ManufactureVoltage"]=(
+ ("traj"),
+ np.int16([gdp.str_to_float(ds.ManufactureVoltage[:-6],-1)]),
+ )# e.g. 56 V
+ ds["FloatDiameter"]=(
+ ("traj"),
+ [gdp.str_to_float(ds.FloatDiameter[:-3])],
+ )# e.g. 35.5 cm
+ ds["SubsfcFloatPresence"]=(
+ ("traj"),
+ np.array([gdp.str_to_float(ds.SubsfcFloatPresence)],dtype="bool"),
+ )
+ ds["DrogueType"]=(("traj"),gdp.cut_str(ds.DrogueType,7))
+ ds["DrogueLength"]=(
+ ("traj"),
+ [gdp.str_to_float(ds.DrogueLength[:-2])],
+ )# e.g. 4.8 m
+ ds["DrogueBallast"]=(
+ ("traj"),
+ [gdp.str_to_float(ds.DrogueBallast[:-3])],
+ )# e.g. 1.4 kg
+ ds["DragAreaAboveDrogue"]=(
+ ("traj"),
+ [gdp.str_to_float(ds.DragAreaAboveDrogue[:-4])],
+ )# 10.66 m^2
+ ds["DragAreaOfDrogue"]=(
+ ("traj"),
+ [gdp.str_to_float(ds.DragAreaOfDrogue[:-4])],
+ )# e.g. 416.6 m^2
+ ds["DragAreaRatio"]=(("traj"),[gdp.str_to_float(ds.DragAreaRatio)])# e.g. 39.08
+ ds["DrogueCenterDepth"]=(
+ ("traj"),
+ [gdp.str_to_float(ds.DrogueCenterDepth[:-2])],
+ )# e.g. 20.0 m
+ ds["DrogueDetectSensor"]=(("traj"),gdp.cut_str(ds.DrogueDetectSensor,20))
+
+ # vars attributes
+ vars_attrs={
+ "ID":{"long_name":"Global Drifter Program Buoy ID","units":"-"},
+ "longitude":{"long_name":"Longitude","units":"degrees_east"},
+ "latitude":{"long_name":"Latitude","units":"degrees_north"},
+ "time":{"long_name":"Time","units":"seconds since 1970-01-01 00:00:00"},
+ "ids":{
+ "long_name":"Global Drifter Program Buoy ID repeated along observations",
+ "units":"-",
+ },
+ "rowsize":{
+ "long_name":"Number of observations per trajectory",
+ "sample_dimension":"obs",
+ "units":"-",
+ },
+ "location_type":{
+ "long_name":"Satellite-based location system",
+ "units":"-",
+ "comments":"0 (Argos), 1 (GPS)",
+ },
+ "WMO":{
+ "long_name":"World Meteorological Organization buoy identification number",
+ "units":"-",
+ },
+ "expno":{"long_name":"Experiment number","units":"-"},
+ "deploy_date":{
+ "long_name":"Deployment date and time",
+ "units":"seconds since 1970-01-01 00:00:00",
+ },
+ "deploy_lon":{"long_name":"Deployment longitude","units":"degrees_east"},
+ "deploy_lat":{"long_name":"Deployment latitude","units":"degrees_north"},
+ "start_date":{
+ "long_name":"First good date and time derived by DAC quality control",
+ "units":"seconds since 1970-01-01 00:00:00",
+ },
+ "start_lon":{
+ "long_name":"First good longitude derived by DAC quality control",
+ "units":"degrees_east",
+ },
+ "start_lat":{
+ "long_name":"Last good latitude derived by DAC quality control",
+ "units":"degrees_north",
+ },
+ "end_date":{
+ "long_name":"Last good date and time derived by DAC quality control",
+ "units":"seconds since 1970-01-01 00:00:00",
+ },
+ "end_lon":{
+ "long_name":"Last good longitude derived by DAC quality control",
+ "units":"degrees_east",
+ },
+ "end_lat":{
+ "long_name":"Last good latitude derived by DAC quality control",
+ "units":"degrees_north",
+ },
+ "drogue_lost_date":{
+ "long_name":"Date and time of drogue loss",
+ "units":"seconds since 1970-01-01 00:00:00",
+ },
+ "typedeath":{
+ "long_name":"Type of death",
+ "units":"-",
+ "comments":"0 (buoy still alive), 1 (buoy ran aground), 2 (picked up by vessel), 3 (stop transmitting), 4 (sporadic transmissions), 5 (bad batteries), 6 (inactive status)",
+ },
+ "typebuoy":{
+ "long_name":"Buoy type (see https://www.aoml.noaa.gov/phod/dac/dirall.html)",
+ "units":"-",
+ },
+ "DeployingShip":{"long_name":"Name of deployment ship","units":"-"},
+ "DeploymentStatus":{"long_name":"Deployment status","units":"-"},
+ "BuoyTypeManufacturer":{"long_name":"Buoy type manufacturer","units":"-"},
+ "BuoyTypeSensorArray":{"long_name":"Buoy type sensor array","units":"-"},
+ "CurrentProgram":{
+ "long_name":"Current Program",
+ "units":"-",
+ "_FillValue":"-1",
+ },
+ "PurchaserFunding":{"long_name":"Purchaser funding","units":"-"},
+ "SensorUpgrade":{"long_name":"Sensor upgrade","units":"-"},
+ "Transmissions":{"long_name":"Transmissions","units":"-"},
+ "DeployingCountry":{"long_name":"Deploying country","units":"-"},
+ "DeploymentComments":{"long_name":"Deployment comments","units":"-"},
+ "ManufactureYear":{
+ "long_name":"Manufacture year",
+ "units":"-",
+ "_FillValue":"-1",
+ },
+ "ManufactureMonth":{
+ "long_name":"Manufacture month",
+ "units":"-",
+ "_FillValue":"-1",
+ },
+ "ManufactureSensorType":{"long_name":"Manufacture Sensor Type","units":"-"},
+ "ManufactureVoltage":{
+ "long_name":"Manufacture voltage",
+ "units":"V",
+ "_FillValue":"-1",
+ },
+ "FloatDiameter":{"long_name":"Diameter of surface floater","units":"cm"},
+ "SubsfcFloatPresence":{"long_name":"Subsurface Float Presence","units":"-"},
+ "DrogueType":{"drogue_type":"Drogue Type","units":"-"},
+ "DrogueLength":{"long_name":"Length of drogue.","units":"m"},
+ "DrogueBallast":{
+ "long_name":"Weight of the drogue's ballast.",
+ "units":"kg",
+ },
+ "DragAreaAboveDrogue":{"long_name":"Drag area above drogue.","units":"m^2"},
+ "DragAreaOfDrogue":{"long_name":"Drag area drogue.","units":"m^2"},
+ "DragAreaRatio":{"long_name":"Drag area ratio","units":"m"},
+ "DrogueCenterDepth":{
+ "long_name":"Average depth of the drogue.",
+ "units":"m",
+ },
+ "DrogueDetectSensor":{"long_name":"Drogue detection sensor","units":"-"},
+ "ve":{"long_name":"Eastward velocity","units":"m/s"},
+ "vn":{"long_name":"Northward velocity","units":"m/s"},
+ "gap":{
+ "long_name":"Time interval between previous and next location",
+ "units":"s",
+ },
+ "err_lat":{
+ "long_name":"95% confidence interval in latitude",
+ "units":"degrees_north",
+ },
+ "err_lon":{
+ "long_name":"95% confidence interval in longitude",
+ "units":"degrees_east",
+ },
+ "err_ve":{
+ "long_name":"95% confidence interval in eastward velocity",
+ "units":"m/s",
+ },
+ "err_vn":{
+ "long_name":"95% confidence interval in northward velocity",
+ "units":"m/s",
+ },
+ "drogue_status":{
+ "long_name":"Status indicating the presence of the drogue",
+ "units":"-",
+ "flag_values":"1,0",
+ "flag_meanings":"drogued, undrogued",
+ },
+ "sst":{
+ "long_name":"Fitted sea water temperature",
+ "units":"Kelvin",
+ "comments":"Estimated near-surface sea water temperature from drifting buoy measurements. It is the sum of the fitted near-surface non-diurnal sea water temperature and fitted diurnal sea water temperature anomaly. Discrepancies may occur because of rounding.",
+ },
+ "sst1":{
+ "long_name":"Fitted non-diurnal sea water temperature",
+ "units":"Kelvin",
+ "comments":"Estimated near-surface non-diurnal sea water temperature from drifting buoy measurements",
+ },
+ "sst2":{
+ "long_name":"Fitted diurnal sea water temperature anomaly",
+ "units":"Kelvin",
+ "comments":"Estimated near-surface diurnal sea water temperature anomaly from drifting buoy measurements",
+ },
+ "err_sst":{
+ "long_name":"Standard uncertainty of fitted sea water temperature",
+ "units":"Kelvin",
+ "comments":"Estimated one standard error of near-surface sea water temperature estimate from drifting buoy measurements",
+ },
+ "err_sst1":{
+ "long_name":"Standard uncertainty of fitted non-diurnal sea water temperature",
+ "units":"Kelvin",
+ "comments":"Estimated one standard error of near-surface non-diurnal sea water temperature estimate from drifting buoy measurements",
+ },
+ "err_sst2":{
+ "long_name":"Standard uncertainty of fitted diurnal sea water temperature anomaly",
+ "units":"Kelvin",
+ "comments":"Estimated one standard error of near-surface diurnal sea water temperature anomaly estimate from drifting buoy measurements",
+ },
+ "flg_sst":{
+ "long_name":"Fitted sea water temperature quality flag",
+ "units":"-",
+ "flag_values":"0, 1, 2, 3, 4, 5",
+ "flag_meanings":"no-estimate, no-uncertainty-estimate, estimate-not-in-range-uncertainty-not-in-range, estimate-not-in-range-uncertainty-in-range estimate-in-range-uncertainty-not-in-range, estimate-in-range-uncertainty-in-range",
+ },
+ "flg_sst1":{
+ "long_name":"Fitted non-diurnal sea water temperature quality flag",
+ "units":"-",
+ "flag_values":"0, 1, 2, 3, 4, 5",
+ "flag_meanings":"no-estimate, no-uncertainty-estimate, estimate-not-in-range-uncertainty-not-in-range, estimate-not-in-range-uncertainty-in-range estimate-in-range-uncertainty-not-in-range, estimate-in-range-uncertainty-in-range",
+ },
+ "flg_sst2":{
+ "long_name":"Fitted diurnal sea water temperature anomaly quality flag",
+ "units":"-",
+ "flag_values":"0, 1, 2, 3, 4, 5",
+ "flag_meanings":"no-estimate, no-uncertainty-estimate, estimate-not-in-range-uncertainty-not-in-range, estimate-not-in-range-uncertainty-in-range estimate-in-range-uncertainty-not-in-range, estimate-in-range-uncertainty-in-range",
+ },
+ }
+
+ # global attributes
+ attrs={
+ "title":"Global Drifter Program hourly drifting buoy collection",
+ "history":f"version {GDP_VERSION}. Metadata from dirall.dat and deplog.dat",
+ "Conventions":"CF-1.6",
+ "time_coverage_start":"",
+ "time_coverage_end":"",
+ "date_created":datetime.now().isoformat(),
+ "publisher_name":"GDP Drifter DAC",
+ "publisher_email":"aoml.dftr@noaa.gov",
+ "publisher_url":"https://www.aoml.noaa.gov/phod/gdp",
+ "license":"freely available",
+ "processing_level":"Level 2 QC by GDP drifter DAC",
+ "metadata_link":"https://www.aoml.noaa.gov/phod/dac/dirall.html",
+ "contributor_name":"NOAA Global Drifter Program",
+ "contributor_role":"Data Acquisition Center",
+ "institution":"NOAA Atlantic Oceanographic and Meteorological Laboratory",
+ "acknowledgement":"Elipot, Shane; Sykulski, Adam; Lumpkin, Rick; Centurioni, Luca; Pazos, Mayra (2022). Hourly location, current velocity, and temperature collected from Global Drifter Program drifters world-wide. [indicate subset used]. NOAA National Centers for Environmental Information. Dataset. https://doi.org/10.25921/x46c-3620. Accessed [date]. Elipot et al. (2022): A Dataset of Hourly Sea Surface Temperature From Drifting Buoys, Scientific Data, 9, 567, https://dx.doi.org/10.1038/s41597-022-01670-2. Elipot et al. (2016): A global surface drifter dataset at hourly resolution, J. Geophys. Res.-Oceans, 121, https://dx.doi.org/10.1002/2016JC011716.",
+ "summary":"Global Drifter Program hourly data",
+ "doi":"10.25921/x46c-3620",
+ }
+
+ # set attributes
+ for var in vars_attrs.keys():
+ if var in ds.keys():
+ ds[var].attrs = vars_attrs[var]
+ else:
+ warnings.warn(f"Variable {var} not found in upstream data; skipping.")
+ ds.attrs = attrs
+
+ # rename variables
+ ds = ds.rename_vars({"longitude":"lon","latitude":"lat"})
+
+ # Cast float64 variables to float32 to reduce memory footprint.
+ ds = gdp.cast_float64_variables_to_float32(ds)
+
+ return ds
+
+
+
+
+def to_raggedarray(
+ drifter_ids: Optional[list[int]] = None,
+ n_random_id: Optional[int] = None,
+ url: Optional[str] = GDP_DATA_URL,
+ tmp_path: Optional[str] = None,
+) -> RaggedArray:
+"""Download and process individual GDP hourly files and return a RaggedArray
+ instance with the data.
+
+ Parameters
+ ----------
+ drifter_ids : list[int], optional
+ List of drifters to retrieve (Default: all)
+ n_random_id : int, optional
+ Randomly select n_random_id drifter NetCDF files
+ url : str, optional
+ URL from which to download the data (Default: GDP_DATA_URL).
+ Alternatively, it can be GDP_DATA_URL_EXPERIMENTAL.
+ tmp_path : str, optional
+ Path to the directory where the individual NetCDF files are stored
+ (default varies depending on operating system; /tmp/clouddrift/gdp on Linux)
+
+ Returns
+ -------
+ out : RaggedArray
+ A RaggedArray instance of the requested dataset
+
+ Examples
+ --------
+
+ Invoke `to_raggedarray` without any arguments to download all drifter data
+ from the 2.01 GDP feed:
+
+ >>> from clouddrift.adapters.gdp1h import to_raggedarray
+ >>> ra = to_raggedarray()
+
+ To download a random sample of 100 drifters, for example for development
+ or testing, use the `n_random_id` argument:
+
+ >>> ra = to_raggedarray(n_random_id=100)
+
+ To download a specific list of drifters, use the `drifter_ids` argument:
+
+ >>> ra = to_raggedarray(drifter_ids=[44136, 54680, 83463])
+
+ To download the experimental 2.01 GDP feed, use the `url` argument to
+ specify the experimental feed URL:
+
+ >>> from clouddrift.adapters.gdp1h import GDP_DATA_URL_EXPERIMENTAL, to_raggedarray
+ >>> ra = to_raggedarray(url=GDP_DATA_URL_EXPERIMENTAL)
+
+ Finally, `to_raggedarray` returns a `RaggedArray` instance which provides
+ a convenience method to emit a `xarray.Dataset` instance:
+
+ >>> ds = ra.to_xarray()
+
+ To write the ragged array dataset to a NetCDF file on disk, do
+
+ >>> ds.to_netcdf("gdp1h.nc", format="NETCDF4")
+
+ Alternatively, to write the ragged array to a Parquet file, first create
+ it as an Awkward Array:
+
+ >>> arr = ra.to_awkward()
+ >>> arr.to_parquet("gdp1h.parquet")
+ """
+
+ # adjust the tmp_path if using the experimental source
+ if tmp_path is None:
+ tmp_path = GDP_TMP_PATH if url == GDP_DATA_URL else GDP_TMP_PATH_EXPERIMENTAL
+
+ ids = download(drifter_ids, n_random_id, url, tmp_path)
+
+ if url == GDP_DATA_URL:
+ filename_pattern = "drifter_hourly_{id}.nc"
+ elif url == GDP_DATA_URL_EXPERIMENTAL:
+ filename_pattern = "drifter_hourly_{id}.nc"
+ else:
+ raise ValueError(f"url must be {GDP_DATA_URL} or {GDP_DATA_URL_EXPERIMENTAL}.")
+
+ ra = RaggedArray.from_files(
+ indices=ids,
+ preprocess_func=preprocess,
+ name_coords=gdp.GDP_COORDS,
+ name_meta=gdp.GDP_METADATA,
+ name_data=GDP_DATA,
+ rowsize_func=gdp.rowsize,
+ filename_pattern=filename_pattern,
+ tmp_path=tmp_path,
+ )
+
+ # set dynamic global attributes
+ ra.attrs_global[
+ "time_coverage_start"
+ ] = f"{datetime(1970, 1, 1) + timedelta(seconds=int(np.min(ra.coords['time']))):%Y-%m-%d:%H:%M:%SZ}"
+ ra.attrs_global[
+ "time_coverage_end"
+ ] = f"{datetime(1970, 1, 1) + timedelta(seconds=int(np.max(ra.coords['time']))):%Y-%m-%d:%H:%M:%SZ}"
+
+ return ra
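The time-coverage attributes above are built by formatting epoch seconds through `datetime`; a standalone sketch of that conversion (`epoch_to_attr` is a hypothetical helper name):

```python
from datetime import datetime, timedelta

def epoch_to_attr(seconds: int) -> str:
    # Epoch seconds -> timestamp string in the same format as the attributes above.
    return f"{datetime(1970, 1, 1) + timedelta(seconds=seconds):%Y-%m-%d:%H:%M:%SZ}"

stamp = epoch_to_attr(86461)  # one day and 61 seconds after the epoch
```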
+"""
+This module provides functions and metadata that can be used to convert the
+6-hourly Global Drifter Program (GDP) data to a ``clouddrift.RaggedArray``
+instance.
+"""
+
+import clouddrift.adapters.gdp as gdp
+from clouddrift.adapters.utils import download_with_progress
+from clouddrift.raggedarray import RaggedArray
+from datetime import datetime, timedelta
+import numpy as np
+import urllib.request
+import re
+import tempfile
+from typing import Optional
+import os
+import warnings
+import xarray as xr
+
+GDP_VERSION = "September 2023"
+
+GDP_DATA_URL = "https://www.aoml.noaa.gov/ftp/pub/phod/buoydata/6h/"
+GDP_TMP_PATH = os.path.join(tempfile.gettempdir(), "clouddrift", "gdp6h")
+GDP_DATA = [
+ "lon",
+ "lat",
+ "ve",
+ "vn",
+ "temp",
+ "err_lat",
+ "err_lon",
+ "err_temp",
+ "drogue_status",
+]
+
+
+
+def download(
+ drifter_ids: list = None,
+ n_random_id: int = None,
+ url: str = GDP_DATA_URL,
+ tmp_path: str = GDP_TMP_PATH,
+):
+"""Download individual NetCDF files from the AOML server.
+
+ Parameters
+ ----------
+ drifter_ids : list
+ List of drifters to retrieve (Default: all)
+ n_random_id : int
+ Randomly select n_random_id drifter IDs to download (Default: None)
+ url : str
+ URL from which to download the data (Default: GDP_DATA_URL)
+ tmp_path : str, optional
+ Path to the directory where the individual NetCDF files are stored
+ (default varies depending on operating system; /tmp/clouddrift/gdp6h on Linux)
+
+ Returns
+ -------
+ out : list
+ List of retrieved drifters
+ """
+
+ print(f"Downloading GDP 6-hourly data to {tmp_path}...")
+
+ # Create a temporary directory if it doesn't already exist.
+ os.makedirs(tmp_path, exist_ok=True)
+
+ pattern = "drifter_6h_[0-9]*.nc"
+ directory_list = [
+ "netcdf_1_5000",
+ "netcdf_5001_10000",
+ "netcdf_10001_15000",
+ "netcdf_15001_current",
+ ]
+
+ # retrieve all drifter ID numbers
+ if drifter_ids is None:
+ urlpath = urllib.request.urlopen(url)
+ string = urlpath.read().decode("utf-8")
+ drifter_urls = []
+ for dir in directory_list:
+ urlpath = urllib.request.urlopen(os.path.join(url, dir))
+ string = urlpath.read().decode("utf-8")
+ filelist = list(set(re.compile(pattern).findall(string)))
+ drifter_urls += [os.path.join(url, dir, f) for f in filelist]
+
+ # retrieve only a subset of n_random_id trajectories
+ if n_random_id:
+ if n_random_id > len(drifter_urls):
+ warnings.warn(
+ f"Retrieving all listed trajectories because {n_random_id} is larger than the {len(drifter_urls)} listed trajectories."
+ )
+ else:
+ rng = np.random.RandomState(42)
+ drifter_urls = rng.choice(drifter_urls, n_random_id, replace=False)
+
+ download_with_progress(
+ [(url, os.path.join(tmp_path, os.path.basename(url))) for url in drifter_urls]
+ )
+
+ # Download the metadata so we can order the drifter IDs by start date.
+ gdp_metadata = gdp.get_gdp_metadata()
+ drifter_ids = [
+ int(os.path.basename(f).split("_")[2].split(".")[0]) for f in drifter_urls
+ ]
+
+ return gdp.order_by_date(gdp_metadata, drifter_ids)
+
+
+
+
+def preprocess(index: int, **kwargs) -> xr.Dataset:
+"""Extract and preprocess the Lagrangian data and attributes.
+
+ This function takes an identification number that can be used to create a
+ file or url pattern or select data from a Dataframe. It then preprocesses
+ the data and returns a clean Xarray Dataset.
+
+ Parameters
+ ----------
+ index : int
+ Drifter's identification number
+
+ Returns
+ -------
+ ds : xr.Dataset
+ Xarray Dataset containing the data and attributes
+ """
+ ds = xr.load_dataset(
+ os.path.join(kwargs["tmp_path"], kwargs["filename_pattern"].format(id=index)),
+ decode_times=False,
+ decode_coords=False,
+ )
+
+ # parse the date with custom function
+ ds["deploy_date"].data = gdp.decode_date(np.array([ds.deploy_date.data[0]]))
+ ds["end_date"].data = gdp.decode_date(np.array([ds.end_date.data[0]]))
+ ds["drogue_lost_date"].data = gdp.decode_date(
+ np.array([ds.drogue_lost_date.data[0]])
+ )
+ ds["time"].data = gdp.decode_date(np.array([ds.time.data[0]]))
+
+ # convert fill values to nan
+ for var in [
+ "err_lon",
+ "err_lat",
+ "temp",
+ "err_temp",
+ ]:
+ try:
+ ds[var].data = gdp.fill_values(ds[var].data)
+ except KeyError:
+ warnings.warn(f"Variable {var} not found; skipping.")
+
+ # fix missing values stored as str
+ for var in [
+ "longitude",
+ "latitude",
+ "err_lat",
+ "err_lon",
+ "ve",
+ "vn",
+ "temp",
+ "err_temp",
+ ]:
+ try:
+ ds[var].encoding["missing value"] = -1e-34
+ except KeyError:
+ warnings.warn(f"Variable {var} not found in upstream data; skipping.")
+
+ # convert the types of some variables
+ target_dtype = {
+ "ID":"int64",
+ "WMO":"int32",
+ "expno":"int32",
+ "typedeath":"int8",
+ }
+
+ for var in target_dtype.keys():
+ if var in ds.keys():
+ ds[var].data = ds[var].data.astype(target_dtype[var])
+ else:
+ warnings.warn(f"Variable {var} not found in upstream data; skipping.")
+
+ # new variables
+ ds["ids"]=(["traj","obs"],[np.repeat(ds.ID.values,ds.sizes["obs"])])
+ ds["drogue_status"]=(
+ ["traj","obs"],
+ [gdp.drogue_presence(ds.drogue_lost_date.data,ds.time.data[0])],
+ )
+
+ # convert attributes to variable
+ ds["location_type"]=(
+ ("traj"),
+ [Falseifds.get("location_type")=="Argos"elseTrue],
+ )# 0 for Argos, 1 for GPS
+ ds["DeployingShip"]=(("traj"),gdp.cut_str(ds.DeployingShip,20))
+ ds["DeploymentStatus"]=(("traj"),gdp.cut_str(ds.DeploymentStatus,20))
+ ds["BuoyTypeManufacturer"]=(("traj"),gdp.cut_str(ds.BuoyTypeManufacturer,20))
+ ds["BuoyTypeSensorArray"]=(("traj"),gdp.cut_str(ds.BuoyTypeSensorArray,20))
+ ds["CurrentProgram"]=(
+ ("traj"),
+ np.int32([gdp.str_to_float(ds.CurrentProgram,-1)]),
+ )
+ ds["PurchaserFunding"]=(("traj"),gdp.cut_str(ds.PurchaserFunding,20))
+ ds["SensorUpgrade"]=(("traj"),gdp.cut_str(ds.SensorUpgrade,20))
+ ds["Transmissions"]=(("traj"),gdp.cut_str(ds.Transmissions,20))
+ ds["DeployingCountry"]=(("traj"),gdp.cut_str(ds.DeployingCountry,20))
+ ds["DeploymentComments"]=(
+ ("traj"),
+ gdp.cut_str(
+ ds.DeploymentComments.encode("ascii","ignore").decode("ascii"),20
+ ),
+ )# remove non ascii char
+ ds["ManufactureYear"]=(
+ ("traj"),
+ np.int16([gdp.str_to_float(ds.ManufactureYear,-1)]),
+ )
+ ds["ManufactureMonth"]=(
+ ("traj"),
+ np.int16([gdp.str_to_float(ds.ManufactureMonth,-1)]),
+ )
+ ds["ManufactureSensorType"]=(("traj"),gdp.cut_str(ds.ManufactureSensorType,20))
+ ds["ManufactureVoltage"]=(
+ ("traj"),
+ np.int16([gdp.str_to_float(ds.ManufactureVoltage[:-6],-1)]),
+ )# e.g. 56 V
+ ds["FloatDiameter"]=(
+ ("traj"),
+ [gdp.str_to_float(ds.FloatDiameter[:-3])],
+ )# e.g. 35.5 cm
+ ds["SubsfcFloatPresence"]=(
+ ("traj"),
+ np.array([gdp.str_to_float(ds.SubsfcFloatPresence)],dtype="bool"),
+ )
+ ds["DrogueType"]=(("traj"),gdp.cut_str(ds.DrogueType,7))
+ ds["DrogueLength"]=(
+ ("traj"),
+ [gdp.str_to_float(ds.DrogueLength[:-2])],
+ )# e.g. 4.8 m
+ ds["DrogueBallast"]=(
+ ("traj"),
+ [gdp.str_to_float(ds.DrogueBallast[:-3])],
+ )# e.g. 1.4 kg
+ ds["DragAreaAboveDrogue"]=(
+ ("traj"),
+ [gdp.str_to_float(ds.DragAreaAboveDrogue[:-4])],
+ )# 10.66 m^2
+ ds["DragAreaOfDrogue"]=(
+ ("traj"),
+ [gdp.str_to_float(ds.DragAreaOfDrogue[:-4])],
+ )# e.g. 416.6 m^2
+ ds["DragAreaRatio"]=(("traj"),[gdp.str_to_float(ds.DragAreaRatio)])# e.g. 39.08
+ ds["DrogueCenterDepth"]=(
+ ("traj"),
+ [gdp.str_to_float(ds.DrogueCenterDepth[:-2])],
+ )# e.g. 20.0 m
+ ds["DrogueDetectSensor"]=(("traj"),gdp.cut_str(ds.DrogueDetectSensor,20))
+
+ # vars attributes
+ vars_attrs={
+ "ID":{"long_name":"Global Drifter Program Buoy ID","units":"-"},
+ "longitude":{"long_name":"Longitude","units":"degrees_east"},
+ "latitude":{"long_name":"Latitude","units":"degrees_north"},
+ "time":{"long_name":"Time","units":"seconds since 1970-01-01 00:00:00"},
+ "ids":{
+ "long_name":"Global Drifter Program Buoy ID repeated along observations",
+ "units":"-",
+ },
+ "rowsize":{
+ "long_name":"Number of observations per trajectory",
+ "sample_dimension":"obs",
+ "units":"-",
+ },
+ "location_type":{
+ "long_name":"Satellite-based location system",
+ "units":"-",
+ "comments":"0 (Argos), 1 (GPS)",
+ },
+ "WMO":{
+ "long_name":"World Meteorological Organization buoy identification number",
+ "units":"-",
+ },
+ "expno":{"long_name":"Experiment number","units":"-"},
+ "deploy_date":{
+ "long_name":"Deployment date and time",
+ "units":"seconds since 1970-01-01 00:00:00",
+ },
+ "deploy_lon":{"long_name":"Deployment longitude","units":"degrees_east"},
+ "deploy_lat":{"long_name":"Deployment latitude","units":"degrees_north"},
+ "end_date":{
+ "long_name":"End date and time",
+ "units":"seconds since 1970-01-01 00:00:00",
+ },
+ "end_lon":{"long_name":"End latitude","units":"degrees_north"},
+ "end_lat":{"long_name":"End longitude","units":"degrees_east"},
+ "drogue_lost_date":{
+ "long_name":"Date and time of drogue loss",
+ "units":"seconds since 1970-01-01 00:00:00",
+ },
+ "typedeath":{
+ "long_name":"Type of death",
+ "units":"-",
+ "comments":"0 (buoy still alive), 1 (buoy ran aground), 2 (picked up by vessel), 3 (stop transmitting), 4 (sporadic transmissions), 5 (bad batteries), 6 (inactive status)",
+ },
+ "typebuoy":{
+ "long_name":"Buoy type (see https://www.aoml.noaa.gov/phod/dac/dirall.html)",
+ "units":"-",
+ },
+ "DeployingShip":{"long_name":"Name of deployment ship","units":"-"},
+ "DeploymentStatus":{"long_name":"Deployment status","units":"-"},
+ "BuoyTypeManufacturer":{"long_name":"Buoy type manufacturer","units":"-"},
+ "BuoyTypeSensorArray":{"long_name":"Buoy type sensor array","units":"-"},
+ "CurrentProgram":{
+ "long_name":"Current Program",
+ "units":"-",
+ "_FillValue":"-1",
+ },
+ "PurchaserFunding":{"long_name":"Purchaser funding","units":"-"},
+ "SensorUpgrade":{"long_name":"Sensor upgrade","units":"-"},
+ "Transmissions":{"long_name":"Transmissions","units":"-"},
+ "DeployingCountry":{"long_name":"Deploying country","units":"-"},
+ "DeploymentComments":{"long_name":"Deployment comments","units":"-"},
+ "ManufactureYear":{
+ "long_name":"Manufacture year",
+ "units":"-",
+ "_FillValue":"-1",
+ },
+ "ManufactureMonth":{
+ "long_name":"Manufacture month",
+ "units":"-",
+ "_FillValue":"-1",
+ },
+ "ManufactureSensorType":{"long_name":"Manufacture Sensor Type","units":"-"},
+ "ManufactureVoltage":{
+ "long_name":"Manufacture voltage",
+ "units":"V",
+ "_FillValue":"-1",
+ },
+ "FloatDiameter":{"long_name":"Diameter of surface floater","units":"cm"},
+ "SubsfcFloatPresence":{"long_name":"Subsurface Float Presence","units":"-"},
+ "DrogueType":{"drogue_type":"Drogue Type","units":"-"},
+ "DrogueLength":{"long_name":"Length of drogue.","units":"m"},
+ "DrogueBallast":{
+ "long_name":"Weight of the drogue's ballast.",
+ "units":"kg",
+ },
+ "DragAreaAboveDrogue":{"long_name":"Drag area above drogue.","units":"m^2"},
+ "DragAreaOfDrogue":{"long_name":"Drag area drogue.","units":"m^2"},
+ "DragAreaRatio":{"long_name":"Drag area ratio","units":"m"},
+ "DrogueCenterDepth":{
+ "long_name":"Average depth of the drogue.",
+ "units":"m",
+ },
+ "DrogueDetectSensor":{"long_name":"Drogue detection sensor","units":"-"},
+ "ve":{"long_name":"Eastward velocity","units":"m/s"},
+ "vn":{"long_name":"Northward velocity","units":"m/s"},
+ "err_lat":{
+ "long_name":"95% confidence interval in latitude",
+ "units":"degrees_north",
+ },
+ "err_lon":{
+ "long_name":"95% confidence interval in longitude",
+ "units":"degrees_east",
+ },
+ "drogue_status":{
+ "long_name":"Status indicating the presence of the drogue",
+ "units":"-",
+ "flag_values":"1,0",
+ "flag_meanings":"drogued, undrogued",
+ },
+ "temp":{
+ "long_name":"Fitted sea water temperature",
+ "units":"Kelvin",
+ "comments":"Estimated near-surface sea water temperature from drifting buoy measurements. It is the sum of the fitted near-surface non-diurnal sea water temperature and fitted diurnal sea water temperature anomaly. Discrepancies may occur because of rounding.",
+ },
+ "err_temp":{
+ "long_name":"Standard uncertainty of fitted sea water temperature",
+ "units":"Kelvin",
+ "comments":"Estimated one standard error of near-surface sea water temperature estimate from drifting buoy measurements",
+ },
+ }
+
+ # global attributes
+ attrs={
+ "title":"Global Drifter Program drifting buoy collection",
+ "history":f"version {GDP_VERSION}. Metadata from dirall.dat and deplog.dat",
+ "Conventions":"CF-1.6",
+ "time_coverage_start":"",
+ "time_coverage_end":"",
+ "date_created":datetime.now().isoformat(),
+ "publisher_name":"GDP Drifter DAC",
+ "publisher_email":"aoml.dftr@noaa.gov",
+ "publisher_url":"https://www.aoml.noaa.gov/phod/gdp",
+ "license":"freely available",
+ "processing_level":"Level 2 QC by GDP drifter DAC",
+ "metadata_link":"https://www.aoml.noaa.gov/phod/dac/dirall.html",
+ "contributor_name":"NOAA Global Drifter Program",
+ "contributor_role":"Data Acquisition Center",
+ "institution":"NOAA Atlantic Oceanographic and Meteorological Laboratory",
+ "acknowledgement":f"Lumpkin, Rick; Centurioni, Luca (2019). NOAA Global Drifter Program quality-controlled 6-hour interpolated data from ocean surface drifting buoys. [indicate subset used]. NOAA National Centers for Environmental Information. Dataset. https://doi.org/10.25921/7ntx-z961. Accessed {datetime.utcnow().strftime('%d %B %Y')}.",
+ "summary":"Global Drifter Program six-hourly data",
+ "doi":"10.25921/7ntx-z961",
+ }
+
+ # set attributes
+ for var in vars_attrs.keys():
+     if var in ds.keys():
+         ds[var].attrs = vars_attrs[var]
+     else:
+         warnings.warn(f"Variable {var} not found in upstream data; skipping.")
+ ds.attrs = attrs
+
+ # rename variables
+ ds=ds.rename_vars({"longitude":"lon","latitude":"lat"})
+
+ # Cast float64 variables to float32 to reduce memory footprint.
+ ds=gdp.cast_float64_variables_to_float32(ds)
+
+ return ds
+
+
+
+
+def to_raggedarray(
+ drifter_ids:Optional[list[int]]=None,
+ n_random_id:Optional[int]=None,
+ tmp_path:Optional[str]=GDP_TMP_PATH,
+)->RaggedArray:
+"""Download and process individual GDP 6-hourly files and return a
+ RaggedArray instance with the data.
+
+ Parameters
+ ----------
+ drifter_ids : list[int], optional
+ List of drifters to retrieve (Default: all)
+ n_random_id : int, optional
+ Randomly select n_random_id drifter NetCDF files
+ tmp_path : str, optional
+ Path to the directory where the individual NetCDF files are stored
+ (default varies depending on operating system; /tmp/clouddrift/gdp6h on Linux)
+
+ Returns
+ -------
+ out : RaggedArray
+ A RaggedArray instance of the requested dataset
+
+ Examples
+ --------
+
+ Invoke `to_raggedarray` without any arguments to download all drifter data
+ from the 6-hourly GDP feed:
+
+ >>> from clouddrift.adapters.gdp6h import to_raggedarray
+ >>> ra = to_raggedarray()
+
+ To download a random sample of 100 drifters, for example for development
+ or testing, use the `n_random_id` argument:
+
+ >>> ra = to_raggedarray(n_random_id=100)
+
+ To download a specific list of drifters, use the `drifter_ids` argument:
+
+ >>> ra = to_raggedarray(drifter_ids=[54375, 114956, 126934])
+
+ Finally, `to_raggedarray` returns a `RaggedArray` instance which provides
+ a convenience method to emit a `xarray.Dataset` instance:
+
+ >>> ds = ra.to_xarray()
+
+ To write the ragged array dataset to a NetCDF file on disk, do
+
+ >>> ds.to_netcdf("gdp6h.nc", format="NETCDF4")
+
+ Alternatively, to write the ragged array to a Parquet file, first create
+ it as an Awkward Array:
+
+ >>> arr = ra.to_awkward()
+ >>> arr.to_parquet("gdp6h.parquet")
+ """
+ ids=download(drifter_ids,n_random_id,GDP_DATA_URL,tmp_path)
+
+ ra=RaggedArray.from_files(
+ indices=ids,
+ preprocess_func=preprocess,
+ name_coords=gdp.GDP_COORDS,
+ name_meta=gdp.GDP_METADATA,
+ name_data=GDP_DATA,
+ rowsize_func=gdp.rowsize,
+ filename_pattern="drifter_6h_{id}.nc",
+ tmp_path=tmp_path,
+ )
+
+ # update dynamic global attributes
+ ra.attrs_global[
+ "time_coverage_start"
+ ]=f"{datetime(1970,1,1)+timedelta(seconds=int(np.min(ra.coords['time']))):%Y-%m-%d:%H:%M:%SZ}"
+ ra.attrs_global[
+ "time_coverage_end"
+ ]=f"{datetime(1970,1,1)+timedelta(seconds=int(np.max(ra.coords['time']))):%Y-%m-%d:%H:%M:%SZ}"
+
+ return ra
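The `time_coverage_start` and `time_coverage_end` attributes above are built by formatting an epoch offset in seconds into a timestamp string. A minimal sketch of that conversion, using an arbitrary example offset rather than a value from the dataset:

```python
from datetime import datetime, timedelta

# Convert an offset in seconds since 1970-01-01 into the attribute's
# timestamp format, mirroring the f-string used in to_raggedarray.
seconds = 287_000_000  # arbitrary example offset
stamp = f"{datetime(1970, 1, 1) + timedelta(seconds=seconds):%Y-%m-%d:%H:%M:%SZ}"
print(stamp)  # 1979-02-04:18:13:20Z
```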
+"""
+This module defines functions used to adapt the Grand LAgrangian Deployment
+(GLAD) dataset as a ragged-array Xarray Dataset.
+
+The dataset and its description are hosted at https://doi.org/10.7266/N7VD6WC8.
+
+Example
+-------
+>>> from clouddrift.adapters import glad
+>>> ds = glad.to_xarray()
+
+Reference
+---------
+Özgökmen, Tamay. 2013. GLAD experiment CODE-style drifter trajectories (low-pass filtered, 15 minute interval records), northern Gulf of Mexico near DeSoto Canyon, July-October 2012. Distributed by: Gulf of Mexico Research Initiative Information and Data Cooperative (GRIIDC), Harte Research Institute, Texas A&M University–Corpus Christi. doi:10.7266/N7VD6WC8
+"""
+from clouddrift.adapters.utils import download_with_progress
+from io import BytesIO
+import numpy as np
+import pandas as pd
+import xarray as xr
+
+
+
+def get_dataframe() -> pd.DataFrame:
+"""Get the GLAD dataset as a pandas DataFrame."""
+ url="https://data.gulfresearchinitiative.org/pelagos-symfony/api/file/download/169841"
+ # GRIIDC server doesn't provide Content-Length header, so we'll hardcode
+ # the expected data length here.
+ file_size=155330876
+ buf=BytesIO(b"")
+ download_with_progress([(url,buf)])
+ buf.seek(0)
+ column_names=[
+ "id",
+ "date",
+ "time",
+ "latitude",
+ "longitude",
+ "position_error",
+ "u",
+ "v",
+ "velocity_error",
+ ]
+ df=pd.read_csv(buf,delim_whitespace=True,skiprows=5,names=column_names)
+ df["obs"]=pd.to_datetime(df["date"]+" "+df["time"])
+ df.drop(["date","time"],axis=1,inplace=True)
+ return df
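The date handling in `get_dataframe` merges the separate date and time text columns into a single datetime column and drops the originals. A small illustration of the same pattern with made-up values:

```python
import pandas as pd

# Combine separate date and time string columns into one datetime column,
# then drop the originals, as done for the GLAD CSV columns above.
df = pd.DataFrame({"date": ["2012-07-20"], "time": ["01:15:00"], "u": [0.1]})
df["obs"] = pd.to_datetime(df["date"] + " " + df["time"])
df.drop(["date", "time"], axis=1, inplace=True)
```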
+
+
+
+
+def to_xarray() -> xr.Dataset:
+"""Return the GLAD data as a ragged-array Xarray Dataset."""
+ df=get_dataframe()
+ ds=df.to_xarray()
+
+ traj,rowsize=np.unique(ds.id,return_counts=True)
+
+ # Make the dataset compatible with clouddrift functions.
+ ds=(
+ ds.swap_dims({"index":"obs"})
+ .drop_vars(["id","index"])
+ .assign_coords(traj=traj)
+ .assign({"rowsize":("traj",rowsize)})
+ .rename_vars({"obs":"time","traj":"id"})
+ )
+
+ # Cast double floats to singles
+ for var in ds.variables:
+     if ds[var].dtype == "float64":
+         ds[var] = ds[var].astype("float32")
+
+ # Set variable attributes
+ ds["longitude"].attrs={
+ "long_name":"longitude",
+ "standard_name":"longitude",
+ "units":"degrees_east",
+ }
+
+ ds["latitude"].attrs={
+ "long_name":"latitude",
+ "standard_name":"latitude",
+ "units":"degrees_north",
+ }
+
+ ds["position_error"].attrs={
+ "long_name":"position_error",
+ "units":"m",
+ }
+
+ ds["u"].attrs={
+ "long_name":"eastward_sea_water_velocity",
+ "standard_name":"eastward_sea_water_velocity",
+ "units":"m s-1",
+ }
+
+ ds["v"].attrs={
+ "long_name":"northward_sea_water_velocity",
+ "standard_name":"northward_sea_water_velocity",
+ "units":"m s-1",
+ }
+
+ ds["velocity_error"].attrs={
+ "long_name":"velocity_error",
+ "units":"m s-1",
+ }
+
+ # Set global attributes
+ ds.attrs={
+ "title":"GLAD experiment CODE-style drifter trajectories (low-pass filtered, 15 minute interval records), northern Gulf of Mexico near DeSoto Canyon, July-October 2012",
+ "institution":"Consortium for Advanced Research on Transport of Hydrocarbon in the Environment (CARTHE)",
+ "source":"CODE-style drifters",
+ "history":"Downloaded from https://data.gulfresearchinitiative.org/data/R1.x134.073:0004 and post-processed into a ragged-array Xarray Dataset by CloudDrift",
+ "references":"Özgökmen, Tamay. 2013. GLAD experiment CODE-style drifter trajectories (low-pass filtered, 15 minute interval records), northern Gulf of Mexico near DeSoto Canyon, July-October 2012. Distributed by: Gulf of Mexico Research Initiative Information and Data Cooperative (GRIIDC), Harte Research Institute, Texas A&M University–Corpus Christi. doi:10.7266/N7VD6WC8",
+ }
+
+ return ds
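`to_xarray` derives the ragged-array structure from the per-observation `id` column: `np.unique` with `return_counts=True` yields one entry per trajectory plus the number of observations (`rowsize`) in each. A sketch with toy IDs:

```python
import numpy as np

# Per-observation drifter IDs; np.unique returns the sorted unique IDs
# (one per trajectory) and how many observations belong to each.
ids = np.array([7, 7, 7, 9, 9, 12, 12, 12, 12])
traj, rowsize = np.unique(ids, return_counts=True)
```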
+"""
+This module defines functions used to adapt the MOSAiC sea-ice drift dataset as
+a ragged-array dataset.
+
+The dataset is hosted at https://doi.org/10.18739/A2KP7TS83.
+
+Reference: Angela Bliss, Jennifer Hutchings, Philip Anderson, Philipp Anhaus,
+Hans Jakob Belter, Jørgen Berge, Vladimir Bessonov, Bin Cheng, Sylvia Cole,
+Dave Costa, Finlo Cottier, Christopher J Cox, Pedro R De La Torre, Dmitry V Divine,
+Gilbert Emzivat, Ying-Chih Fang, Steven Fons, Michael Gallagher, Maxime Geoffrey,
+Mats A Granskog, ... Guangyu Zuo. (2022). Sea ice drift tracks from the Distributed
+Network of autonomous buoys deployed during the Multidisciplinary drifting Observatory
+for the Study of Arctic Climate (MOSAiC) expedition 2019 - 2021. Arctic Data Center.
+doi:10.18739/A2KP7TS83.
+
+Example
+-------
+>>> from clouddrift.adapters import mosaic
+>>> ds = mosaic.to_xarray()
+"""
+from datetime import datetime
+from io import BytesIO
+import numpy as np
+import pandas as pd
+import requests
+from tqdm import tqdm
+import xarray as xr
+import xml.etree.ElementTree as ET
+
+from clouddrift.adapters.utils import download_with_progress
+
+MOSAIC_VERSION="2022"
+
+
+
+def get_dataframes() -> tuple[pd.DataFrame, pd.DataFrame]:
+"""Get the MOSAiC data (obs dimension in the target Dataset) and metadata
+ (traj dimension in the target Dataset) as pandas DataFrames."""
+ xml=get_repository_metadata()
+ filenames,urls=get_file_urls(xml)
+ exclude_patterns=["site_buoy_summary","buoy_list"]
+ data_filenames = [
+     f for f in filenames if not any([s in f for s in exclude_patterns])
+ ]
+ data_urls = [
+     f
+     for n, f in enumerate(urls)
+     if not any([s in filenames[n] for s in exclude_patterns])
+ ]
+ sensor_ids = [f.split("_")[-1].rstrip(".csv") for f in data_filenames]
+ sensor_list_url = urls[
+     filenames.index([f for f in filenames if "buoy_list" in f].pop())
+ ]
+ sensors=pd.read_csv(sensor_list_url)
+
+ # Sort the urls by the order of sensor IDs in the sensor list
+ order_index = {id: n for n, id in enumerate(sensors["Sensor ID"])}
+ sorted_indices = sorted(
+     range(len(sensor_ids)), key=lambda k: order_index[sensor_ids[k]]
+ )
+ sorted_data_urls = [data_urls[i] for i in sorted_indices]
+ buffers = [BytesIO(b"") for _ in range(len(sorted_data_urls))]
+
+ download_with_progress(zip(sorted_data_urls, buffers), desc="Downloading data")
+ for b in buffers:
+     b.seek(0)  # rewind each buffer before parsing
+ dfs = [pd.read_csv(b) for b in buffers]
+ obs_df=pd.concat(dfs)
+
+ # Use the index of the concatenated DataFrame to determine the count/rowsize
+ zero_indices = [n for n, val in enumerate(list(obs_df.index)) if val == 0]
+ sensors["rowsize"]=np.diff(zero_indices+[len(obs_df)])
+
+ # Make the time column the index of the DataFrame, which will make it a
+ # coordinate in the xarray Dataset.
+ obs_df.set_index("datetime",inplace=True)
+ sensors.set_index("Sensor ID",inplace=True)
+
+ return obs_df, sensors
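The rowsize computation in `get_dataframes` relies on `pd.concat` preserving each per-sensor CSV's 0-based index, so every 0 in the concatenated index marks the start of a new trajectory. A toy illustration of the trick:

```python
import numpy as np

# Concatenated index of three files with 3, 2, and 4 rows respectively;
# each 0 marks the first observation of a trajectory.
index = [0, 1, 2, 0, 1, 0, 1, 2, 3]
zero_indices = [n for n, val in enumerate(index) if val == 0]
rowsize = np.diff(zero_indices + [len(index)])
```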
+
+
+
+
+def get_file_urls(xml: str) -> tuple[list[str], list[str]]:
+"""Pass the MOSAiC XML string and return the lists of filenames and URLs."""
+ filenames = [
+     tag.text
+     for tag in ET.fromstring(xml).findall("./dataset/dataTable/physical/objectName")
+ ]
+ urls = [
+     tag.text
+     for tag in ET.fromstring(xml).findall(
+         "./dataset/dataTable/physical/distribution/online/url"
+     )
+ ]
+ return filenames, urls
+
+
+
+
+def get_repository_metadata() -> str:
+"""Get the MOSAiC repository metadata as an XML string.
+ Pass this string to other get_* functions to extract the data you need.
+ """
+ url="https://arcticdata.io/metacat/d1/mn/v2/object/doi:10.18739/A2KP7TS83"
+ r = requests.get(url)
+ return r.text
+
+
+
+
+def to_xarray() -> xr.Dataset:
+"""Return the MOSAiC data as a ragged-array Xarray Dataset."""
+
+ # Download the data and metadata as pandas DataFrames.
+ obs_df,traj_df=get_dataframes()
+
+ # Dates and datetimes are strings; convert them to datetime64 instances
+ # for compatibility with CloudDrift's analysis functions.
+ obs_df.index=pd.to_datetime(obs_df.index)
+ for col in [
+     "Deployment Date",
+     "Deployment Datetime",
+     "First Data Datetime",
+     "Last Data Datetime",
+ ]:
+     traj_df[col] = pd.to_datetime(traj_df[col])
+
+ # Merge into an Xarray Dataset and rename the dimensions and variables to
+ # follow the CloudDrift convention.
+ ds=xr.merge([obs_df.to_xarray(),traj_df.to_xarray()])
+ ds=ds.rename_dims({"datetime":"obs","Sensor ID":"traj"}).rename_vars(
+ {"datetime":"time","Sensor ID":"id"}
+ )
+
+ # Set variable attributes
+ ds["longitude"].attrs={
+ "long_name":"longitude",
+ "standard_name":"longitude",
+ "units":"degrees_east",
+ }
+
+ ds["latitude"].attrs={
+ "long_name":"latitude",
+ "standard_name":"latitude",
+ "units":"degrees_north",
+ }
+
+ # global attributes
+ ds.attrs={
+ "title":"Multidisciplinary drifting Observatory for the Study of Arctic Climate (MOSAiC) expedition 2019 - 2021",
+ "history":f"Dataset updated in {MOSAIC_VERSION}",
+ "date_created":datetime.now().isoformat(),
+ "publisher_name":"NSF Arctic Data Center",
+ "publisher_url":"https://arcticdata.io/catalog/view/doi:10.18739/A2KP7TS83",
+ "license":"Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/)",
+ }
+
+ return ds
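The dimension and variable renaming used in `to_xarray` can be seen on a tiny time-indexed DataFrame (values are made up): converting to xarray produces a `datetime` dimension and coordinate, which are then renamed to the CloudDrift `obs`/`time` convention.

```python
import pandas as pd

# A time-indexed DataFrame converts to a Dataset with a "datetime" dimension;
# rename the dimension to "obs" and the coordinate variable to "time".
df = pd.DataFrame(
    {"datetime": pd.to_datetime(["2020-01-01", "2020-01-02"]), "latitude": [80.0, 80.1]}
).set_index("datetime")
ds = df.to_xarray().rename_dims({"datetime": "obs"}).rename_vars({"datetime": "time"})
```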
+"""
+This module provides functions to easily access ragged array datasets. If the datasets are
+not accessed via cloud storage platforms or are not found on the local filesystem,
+they will be downloaded from their upstream repositories and stored for later access
+(~/.clouddrift for UNIX-based systems).
+"""
+from io import BufferedReader, BytesIO
+from clouddrift import adapters
+import os
+import platform
+import xarray as xr
+
+
+
+def gdp1h(decode_times: bool = True) -> xr.Dataset:
+"""Returns the latest version of the NOAA Global Drifter Program (GDP) hourly
+ dataset as a ragged array Xarray dataset.
+
+ The data is accessed from a zarr archive hosted on a public AWS S3 bucket accessible at
+ https://registry.opendata.aws/noaa-oar-hourly-gdp/. The original data source from NOAA
+ NCEI is https://doi.org/10.25921/x46c-3620.
+
+ Parameters
+ ----------
+ decode_times : bool, optional
+ If True, decode the time coordinate into a datetime object. If False, the time
+ coordinate will be an int64 or float64 array of increments since the origin
+ time indicated in the units attribute. Default is True.
+
+ Returns
+ -------
+ xarray.Dataset
+ Hourly GDP dataset as a ragged array
+
+ Examples
+ --------
+ >>> from clouddrift.datasets import gdp1h
+ >>> ds = gdp1h()
+ >>> ds
+ <xarray.Dataset>
+ Dimensions: (traj: 19396, obs: 197214787)
+ Coordinates:
+ id (traj) int64 ...
+ time (obs) datetime64[ns] ...
+ Dimensions without coordinates: traj, obs
+ Data variables: (12/60)
+ BuoyTypeManufacturer (traj) |S20 ...
+ BuoyTypeSensorArray (traj) |S20 ...
+ CurrentProgram (traj) float32 ...
+ DeployingCountry (traj) |S20 ...
+ DeployingShip (traj) |S20 ...
+ DeploymentComments (traj) |S20 ...
+ ... ...
+ start_lat (traj) float32 ...
+ start_lon (traj) float32 ...
+ typebuoy (traj) |S10 ...
+ typedeath (traj) int8 ...
+ ve (obs) float32 ...
+ vn (obs) float32 ...
+ Attributes: (12/16)
+ Conventions: CF-1.6
+ acknowledgement: Elipot, Shane; Sykulski, Adam; Lumpkin, Rick; Centurio...
+ contributor_name: NOAA Global Drifter Program
+ contributor_role: Data Acquisition Center
+ date_created: 2023-09-08T17:05:12.130123
+ doi: 10.25921/x46c-3620
+ ... ...
+ processing_level: Level 2 QC by GDP drifter DAC
+ publisher_email: aoml.dftr@noaa.gov
+ publisher_name: GDP Drifter DAC
+ publisher_url: https://www.aoml.noaa.gov/phod/gdp
+ summary: Global Drifter Program hourly data
+ title: Global Drifter Program hourly drifting buoy collection
+
+ See Also
+ --------
+ :func:`gdp6h`
+ """
+ url="https://noaa-oar-hourly-gdp-pds.s3.amazonaws.com/latest/gdp-v2.01.zarr"
+ ds=xr.open_dataset(url,engine="zarr",decode_times=decode_times)
+ ds=ds.rename_vars({"ID":"id"}).assign_coords({"id":ds.ID}).drop_vars(["ids"])
+ return ds
+
+
+
+
+def gdp6h(decode_times: bool = True) -> xr.Dataset:
+"""Returns the NOAA Global Drifter Program (GDP) 6-hourly dataset as a ragged array
+ Xarray dataset.
+
+ The data is accessed from a public HTTPS server at NOAA's Atlantic
+ Oceanographic and Meteorological Laboratory (AOML) accessible at
+ https://www.aoml.noaa.gov/phod/gdp/index.php. Note that the data loading method is
+ platform-dependent. Linux and Darwin (macOS) machines lazy-load the dataset using the
+ byte-range feature of the netCDF-C library (the dataset loading engine used by xarray),
+ while Windows machines download the entire dataset into a memory buffer which is then
+ passed to xarray.
+
+ Parameters
+ ----------
+ decode_times : bool, optional
+ If True, decode the time coordinate into a datetime object. If False, the time
+ coordinate will be an int64 or float64 array of increments since the origin
+ time indicated in the units attribute. Default is True.
+
+ Returns
+ -------
+ xarray.Dataset
+ 6-hourly GDP dataset as a ragged array
+
+ Examples
+ --------
+ >>> from clouddrift.datasets import gdp6h
+ >>> ds = gdp6h()
+ >>> ds
+ <xarray.Dataset>
+ Dimensions: (traj: 27647, obs: 46535470)
+ Coordinates:
+ ids (obs) int64 7702204 7702204 ... 300234061198840
+ time (obs) float64 2.879e+08 2.879e+08 ... 1.697e+09
+ Dimensions without coordinates: traj, obs
+ Data variables: (12/50)
+ ID (traj) int64 7702204 7702201 ... 300234061198840
+ rowsize (traj) int32 92 1747 1943 1385 1819 ... 54 53 51 28
+ WMO (traj) int32 0 0 0 0 ... 6203890 6203888 4101885
+ expno (traj) int32 40 40 40 40 ... 31412 21421 21421 31412
+ deploy_date (traj) float32 2.878e+08 2.878e+08 ... 1.696e+09 nan
+ deploy_lat (traj) float32 -7.798 -4.9 -3.18 ... 9.9 11.9 nan
+ ... ...
+ vn (obs) float32 nan 0.1056 0.04974 ... 0.7384 nan
+ temp (obs) float32 28.35 28.3 nan ... 29.08 28.97 28.92
+ err_lat (obs) float32 0.009737 0.007097 ... 0.001659 0.001687
+ err_lon (obs) float32 0.00614 0.004583 ... 0.002471 0.002545
+ err_temp (obs) float32 0.08666 0.08757 ... 0.03665 0.03665
+ drogue_status (obs) bool False False False False ... True True True
+ Attributes: (12/18)
+ title: Global Drifter Program drifting buoy collection
+ history: version September 2023. Metadata from dirall.dat an...
+ Conventions: CF-1.6
+ time_coverage_start: 1979-02-15:00:00:00Z
+ time_coverage_end: 2023-10-18:18:00:00Z
+ date_created: 2023-12-22T17:50:22.242943
+ ... ...
+ contributor_name: NOAA Global Drifter Program
+ contributor_role: Data Acquisition Center
+ institution: NOAA Atlantic Oceanographic and Meteorological Labo...
+ acknowledgement: Lumpkin, Rick; Centurioni, Luca (2019). NOAA Global...
+ summary: Global Drifter Program six-hourly data
+ doi: 10.25921/7ntx-z961
+
+ See Also
+ --------
+ :func:`gdp1h`
+ """
+ url="https://www.aoml.noaa.gov/ftp/pub/phod/buoydata/gdp6h_ragged_may23.nc#mode=bytes"
+
+ ifplatform.system()=="Windows":
+ buffer=BytesIO()
+ adapters.utils.download_with_progress([(f"{url}#mode=bytes",buffer)])
+ reader=BufferedReader(buffer)
+ ds=xr.open_dataset(reader,decode_times=decode_times)
+ else:
+ ds=xr.open_dataset(f"{url}",decode_times=decode_times)
+
+ ds=ds.rename_vars({"ID":"id"}).assign_coords({"id":ds.ID}).drop_vars(["ids"])
+ return ds
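The Windows branch above reads the whole file into an in-memory buffer before handing it to xarray. The buffer mechanics look like this, with dummy bytes standing in for the NetCDF payload:

```python
from io import BufferedReader, BytesIO

# Download target: an in-memory buffer. After writing, rewind before wrapping
# it in a BufferedReader so the reader starts at the beginning of the payload.
buffer = BytesIO()
buffer.write(b"netcdf-bytes")  # stand-in for the downloaded file contents
buffer.seek(0)
reader = BufferedReader(buffer)
```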
+
+
+
+
+def glad(decode_times: bool = True) -> xr.Dataset:
+"""Returns the Grand LAgrangian Deployment (GLAD) dataset as a ragged array
+ Xarray dataset.
+
+ The function will first look for the ragged-array dataset on the local
+ filesystem. If it is not found, the dataset will be downloaded using the
+ corresponding adapter function and stored for later access.
+
+ The upstream data is available at https://doi.org/10.7266/N7VD6WC8.
+
+ Parameters
+ ----------
+ decode_times : bool, optional
+ If True, decode the time coordinate into a datetime object. If False, the time
+ coordinate will be an int64 or float64 array of increments since the origin
+ time indicated in the units attribute. Default is True.
+
+ Returns
+ -------
+ xarray.Dataset
+ GLAD dataset as a ragged array
+
+ Examples
+ --------
+ >>> from clouddrift.datasets import glad
+ >>> ds = glad()
+ >>> ds
+ <xarray.Dataset>
+ Dimensions: (obs: 1602883, traj: 297)
+ Coordinates:
+ time (obs) datetime64[ns] ...
+ id (traj) object ...
+ Data variables:
+ latitude (obs) float32 ...
+ longitude (obs) float32 ...
+ position_error (obs) float32 ...
+ u (obs) float32 ...
+ v (obs) float32 ...
+ velocity_error (obs) float32 ...
+ rowsize (traj) int64 ...
+ Attributes:
+ title: GLAD experiment CODE-style drifter trajectories (low-pass f...
+ institution: Consortium for Advanced Research on Transport of Hydrocarbo...
+ source: CODE-style drifters
+ history: Downloaded from https://data.gulfresearchinitiative.org/dat...
+ references: Özgökmen, Tamay. 2013. GLAD experiment CODE-style drifter t...
+
+ Reference
+ ---------
+ Özgökmen, Tamay. 2013. GLAD experiment CODE-style drifter trajectories (low-pass filtered, 15 minute interval records), northern Gulf of Mexico near DeSoto Canyon, July-October 2012. Distributed by: Gulf of Mexico Research Initiative Information and Data Cooperative (GRIIDC), Harte Research Institute, Texas A&M University–Corpus Christi. doi:10.7266/N7VD6WC8
+ """
+ clouddrift_path = (
+     os.path.expanduser("~/.clouddrift")
+     if not os.getenv("CLOUDDRIFT_PATH")
+     else os.getenv("CLOUDDRIFT_PATH")
+ )
+ glad_path = f"{clouddrift_path}/data/glad.nc"
+ if not os.path.exists(glad_path):
+     print(f"{glad_path} not found; downloading from the upstream repository.")
+     ds = adapters.glad.to_xarray()
+     os.makedirs(os.path.dirname(glad_path), exist_ok=True)
+     ds.to_netcdf(glad_path)
+ else:
+     ds = xr.open_dataset(glad_path, decode_times=decode_times)
+ return ds
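The local-cache location used by `glad` (and the other locally cached datasets) honors the `CLOUDDRIFT_PATH` environment variable, falling back to `~/.clouddrift`. A sketch of the lookup logic, with a hypothetical override value:

```python
import os

# Default to ~/.clouddrift unless CLOUDDRIFT_PATH is set
# ("/tmp/my_clouddrift" below is a hypothetical example value).
os.environ["CLOUDDRIFT_PATH"] = "/tmp/my_clouddrift"
clouddrift_path = (
    os.path.expanduser("~/.clouddrift")
    if not os.getenv("CLOUDDRIFT_PATH")
    else os.getenv("CLOUDDRIFT_PATH")
)
```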
+
+
+
+
+def mosaic(decode_times: bool = True) -> xr.Dataset:
+"""Returns the MOSAiC sea-ice drift dataset as a ragged array Xarray dataset.
+
+ The function will first look for the ragged-array dataset on the local
+ filesystem. If it is not found, the dataset will be downloaded using the
+ corresponding adapter function and stored for later access.
+
+ The upstream data is available at https://arcticdata.io/catalog/view/doi:10.18739/A2KP7TS83.
+
+ Reference
+ ---------
+ Angela Bliss, Jennifer Hutchings, Philip Anderson, Philipp Anhaus,
+ Hans Jakob Belter, Jørgen Berge, Vladimir Bessonov, Bin Cheng, Sylvia Cole,
+ Dave Costa, Finlo Cottier, Christopher J Cox, Pedro R De La Torre, Dmitry V Divine,
+ Gilbert Emzivat, Ying-Chih Fang, Steven Fons, Michael Gallagher, Maxime Geoffrey,
+ Mats A Granskog, ... Guangyu Zuo. (2022). Sea ice drift tracks from the Distributed
+ Network of autonomous buoys deployed during the Multidisciplinary drifting Observatory
+ for the Study of Arctic Climate (MOSAiC) expedition 2019 - 2021. Arctic Data Center.
+ doi:10.18739/A2KP7TS83.
+
+ Parameters
+ ----------
+ decode_times : bool, optional
+ If True, decode the time coordinate into a datetime object. If False, the time
+ coordinate will be an int64 or float64 array of increments since the origin
+ time indicated in the units attribute. Default is True.
+
+ Returns
+ -------
+ xarray.Dataset
+ MOSAiC sea-ice drift dataset as a ragged array
+
+ Examples
+ --------
+ >>> from clouddrift.datasets import mosaic
+ >>> ds = mosaic()
+ >>> ds
+ <xarray.Dataset>
+ Dimensions: (obs: 1926226, traj: 216)
+ Coordinates:
+ time (obs) datetime64[ns] ...
+ id (traj) object ...
+ Dimensions without coordinates: obs, traj
+ Data variables: (12/19)
+ latitude (obs) float64 ...
+ longitude (obs) float64 ...
+ Deployment Leg (traj) int64 ...
+ DN Station ID (traj) object ...
+ IMEI (traj) object ...
+ Deployment Date (traj) datetime64[ns] ...
+ ... ...
+ Buoy Type (traj) object ...
+ Manufacturer (traj) object ...
+ Model (traj) object ...
+ PI (traj) object ...
+ Data Authors (traj) object ...
+ rowsize (traj) int64 ...
+ """
+ clouddrift_path = (
+     os.path.expanduser("~/.clouddrift")
+     if not os.getenv("CLOUDDRIFT_PATH")
+     else os.getenv("CLOUDDRIFT_PATH")
+ )
+ mosaic_path = f"{clouddrift_path}/data/mosaic.nc"
+ if not os.path.exists(mosaic_path):
+     print(f"{mosaic_path} not found; downloading from the upstream repository.")
+     ds = adapters.mosaic.to_xarray()
+     os.makedirs(os.path.dirname(mosaic_path), exist_ok=True)
+     ds.to_netcdf(mosaic_path)
+ else:
+     ds = xr.open_dataset(mosaic_path, decode_times=decode_times)
+ return ds
+
+
+
+
+def spotters(decode_times: bool = True) -> xr.Dataset:
+"""Returns the Sofar Ocean Spotter drifters ragged array dataset as an Xarray dataset.
+
+ The data is accessed from a zarr archive hosted on a public AWS S3 bucket accessible
+ at https://sofar-spotter-archive.s3.amazonaws.com/spotter_data_bulk_zarr.
+
+ Parameters
+ ----------
+ decode_times : bool, optional
+ If True, decode the time coordinate into a datetime object. If False, the time
+ coordinate will be an int64 or float64 array of increments since the origin
+ time indicated in the units attribute. Default is True.
+
+ Returns
+ -------
+ xarray.Dataset
+ Sofar ocean floats dataset as a ragged array
+
+ Examples
+ --------
+ >>> from clouddrift.datasets import spotters
+ >>> ds = spotters()
+ >>> ds
+ <xarray.Dataset>
+ Dimensions: (index: 6390651, trajectory: 871)
+ Coordinates:
+ time (index) datetime64[ns] ...
+ * trajectory (trajectory) object 'SPOT-010001' ... 'SPOT-1975'
+ Dimensions without coordinates: index
+ Data variables:
+ latitude (index) float64 ...
+ longitude (index) float64 ...
+ meanDirection (index) float64 ...
+ meanDirectionalSpread (index) float64 ...
+ meanPeriod (index) float64 ...
+ peakDirection (index) float64 ...
+ peakDirectionalSpread (index) float64 ...
+ peakPeriod (index) float64 ...
+ rowsize (trajectory) int64 ...
+ significantWaveHeight (index) float64 ...
+ Attributes:
+ author: Isabel A. Houghton
+ creation_date: 2023-10-18 00:43:55.333537
+ email: isabel.houghton@sofarocean.com
+ institution: Sofar Ocean
+ references: https://content.sofarocean.com/hubfs/Spotter%20product%20...
+ source: Spotter wave buoy
+ title: Sofar Spotter Data Archive - Bulk Wave Parameters
+ """
+ url="https://sofar-spotter-archive.s3.amazonaws.com/spotter_data_bulk_zarr"
+ return xr.open_dataset(url, engine="zarr", decode_times=decode_times)
+
+
+
+
+def subsurface_floats(decode_times: bool = True) -> xr.Dataset:
+"""Returns the subsurface floats dataset as a ragged array Xarray dataset.
+
+ The data is accessed from a public HTTPS server at NOAA's Atlantic
+ Oceanographic and Meteorological Laboratory (AOML) accessible at
+ https://www.aoml.noaa.gov/phod/gdp/index.php.
+
+ The upstream data is available at
+ https://www.aoml.noaa.gov/phod/float_traj/files/allFloats_12122017.mat.
+
+ This dataset of subsurface float observations was compiled by the WOCE Subsurface
+ Float Data Assembly Center (WFDAC) in Woods Hole maintained by Andree Ramsey and
+ Heather Furey and copied to NOAA/AOML in October 2014 (version 1) and in December
+ 2017 (version 2). Subsequent updates will be included as additional appropriate
+ float data, quality controlled by the appropriate principal investigators, is
+ submitted for inclusion.
+
+ Note that these observations are collected by ALACE/RAFOS/Eurofloat-style
+ acoustically-tracked, neutrally-buoyant subsurface floats which collect data while
+ drifting beneath the ocean surface. These data are the result of the effort and
+ resources of many individuals and institutions. You are encouraged to acknowledge
+ the work of the data originators and Data Centers in publications arising from use
+ of these data.
+
+ The float data were originally divided by project at the WFDAC. Here they have been
+ compiled into a single Matlab data set. See the WFDAC website listed under References
+ for more information on the variables contained in these files.
+
+ Parameters
+ ----------
+ decode_times : bool, optional
+ If True, decode the time coordinate into a datetime object. If False, the time
+ coordinate will be an int64 or float64 array of increments since the origin
+ time indicated in the units attribute. Default is True.
+
+ Returns
+ -------
+ xarray.Dataset
+ Subsurface floats dataset as a ragged array
+
+ Examples
+ --------
+ >>> from clouddrift.datasets import subsurface_floats
+ >>> ds = subsurface_floats()
+ >>> ds
+ <xarray.Dataset>
+ Dimensions: (traj: 2193, obs: 1402840)
+ Coordinates:
+ id (traj) uint16 ...
+ time (obs) datetime64[ns] ...
+ Dimensions without coordinates: traj, obs
+ Data variables: (12/13)
+ expList (traj) object ...
+ expName (traj) object ...
+ expOrg (traj) object ...
+ expPI (traj) object ...
+ indexExp (traj) uint8 ...
+ fltType (traj) object ...
+ ... ...
+ lon (obs) float64 ...
+ lat (obs) float64 ...
+ pres (obs) float64 ...
+ temp (obs) float64 ...
+ ve (obs) float64 ...
+ vn (obs) float64 ...
+ Attributes:
+ title: Subsurface float trajectories dataset
+ history: December 2017 (version 2)
+ date_created: 2023-11-14T22:30:38.831656
+ publisher_name: WOCE Subsurface Float Data Assembly Center and NOAA AOML
+ publisher_url: https://www.aoml.noaa.gov/phod/float_traj/data.php
+ license: freely available
+ acknowledgement: Maintained by Andree Ramsey and Heather Furey from the ...
+
+ References
+ ----------
+ WOCE Subsurface Float Data Assembly Center (WFDAC) https://www.aoml.noaa.gov/phod/float_traj/index.php
+ """
+
+ clouddrift_path = (
+     os.path.expanduser("~/.clouddrift")
+     if not os.getenv("CLOUDDRIFT_PATH")
+     else os.getenv("CLOUDDRIFT_PATH")
+ )
+
+ local_file = f"{clouddrift_path}/data/subsurface_floats.nc"
+ if not os.path.exists(local_file):
+     print(f"{local_file} not found; downloading from the upstream repository.")
+     ds = adapters.subsurface_floats.to_xarray()
+ else:
+     ds = xr.open_dataset(local_file, decode_times=decode_times)
+ return ds
+
+
+
+
+def yomaha(decode_times: bool = True) -> xr.Dataset:
+"""Returns the YoMaHa dataset as a ragged array Xarray dataset.
+
+ The function will first look for the ragged-array dataset on the local
+ filesystem. If it is not found, the dataset will be downloaded using the
+ corresponding adapter function and stored for later access. The upstream
+ data is available at http://apdrc.soest.hawaii.edu/projects/yomaha/.
+
+ Parameters
+ ----------
+ decode_times : bool, optional
+ If True, decode the time coordinate into a datetime object. If False, the time
+ coordinate will be an int64 or float64 array of increments since the origin
+ time indicated in the units attribute. Default is True.
+
+ Returns
+ -------
+ xarray.Dataset
+ YoMaHa'07 dataset as a ragged array
+
+ Examples
+ --------
+
+ >>> from clouddrift.datasets import yomaha
+ >>> ds = yomaha()
+ >>> ds
+ <xarray.Dataset>
+ Dimensions: (obs: 1926743, traj: 12196)
+ Coordinates:
+ time_d (obs) datetime64[ns] ...
+ time_s (obs) datetime64[ns] ...
+ time_lp (obs) datetime64[ns] ...
+ time_lc (obs) datetime64[ns] ...
+ id (traj) int64 ...
+ Dimensions without coordinates: obs, traj
+ Data variables: (12/27)
+ lon_d (obs) float64 ...
+ lat_d (obs) float64 ...
+ pres_d (obs) float32 ...
+ ve_d (obs) float32 ...
+ vn_d (obs) float32 ...
+ err_ve_d (obs) float32 ...
+ ... ...
+ cycle (obs) int64 ...
+ time_inv (obs) int64 ...
+ rowsize (traj) int64 ...
+ wmo_id (traj) int64 ...
+ dac_id (traj) int64 ...
+ float_type (traj) int64 ...
+ Attributes:
+ title: YoMaHa'07: Velocity data assessed from trajectories of A...
+ history: Dataset updated on Tue Jun 28 03:14:34 HST 2022
+ date_created: 2023-12-08T00:52:08.478075
+ publisher_name: Asia-Pacific Data Research Center
+ publisher_url: http://apdrc.soest.hawaii.edu/index.php
+ license: Creative Commons Attribution 4.0 International License..
+
+    References
+    ----------
+ Lebedev, K. V., Yoshinari, H., Maximenko, N. A., & Hacker, P. W. (2007). Velocity data
+ assessed from trajectories of Argo floats at parking level and at the sea
+ surface. IPRC Technical Note, 4(2), 1-16.
+ """
+    clouddrift_path = (
+        os.path.expanduser("~/.clouddrift")
+        if not os.getenv("CLOUDDRIFT_PATH")
+        else os.getenv("CLOUDDRIFT_PATH")
+    )
+    local_file = f"{clouddrift_path}/data/yomaha.nc"
+    if not os.path.exists(local_file):
+        print(f"{local_file} not found; download from upstream repository.")
+        ds = adapters.yomaha.to_xarray()
+        os.makedirs(os.path.dirname(local_file), exist_ok=True)
+        ds.to_netcdf(local_file)
+    else:
+        ds = xr.open_dataset(local_file, decode_times=decode_times)
+    return ds
+
+
+
+
+def andro(decode_times: bool = True) -> xr.Dataset:
+    """Returns the ANDRO dataset as a ragged array Xarray dataset.
+
+ The function will first look for the ragged-array dataset on the local
+ filesystem. If it is not found, the dataset will be downloaded using the
+ corresponding adapter function and stored for later access. The upstream
+ data is available at https://www.seanoe.org/data/00360/47077/.
+
+ Parameters
+ ----------
+ decode_times : bool, optional
+ If True, decode the time coordinate into a datetime object. If False, the time
+ coordinate will be an int64 or float64 array of increments since the origin
+ time indicated in the units attribute. Default is True.
+
+ Returns
+ -------
+ xarray.Dataset
+    ANDRO dataset as a ragged array
+
+    Examples
+ --------
+ >>> from clouddrift.datasets import andro
+ >>> ds = andro()
+ >>> ds
+ <xarray.Dataset>
+ Dimensions: (obs: 1360753, traj: 9996)
+ Coordinates:
+ time_d (obs) datetime64[ns] ...
+ time_s (obs) datetime64[ns] ...
+ time_lp (obs) datetime64[ns] ...
+ time_lc (obs) datetime64[ns] ...
+ id (traj) int64 ...
+ Dimensions without coordinates: obs, traj
+ Data variables: (12/33)
+ lon_d (obs) float64 ...
+ lat_d (obs) float64 ...
+ pres_d (obs) float32 ...
+ temp_d (obs) float32 ...
+ sal_d (obs) float32 ...
+ ve_d (obs) float32 ...
+ ... ...
+ lon_lc (obs) float64 ...
+ lat_lc (obs) float64 ...
+ surf_fix (obs) int64 ...
+ cycle (obs) int64 ...
+ profile_id (obs) float32 ...
+ rowsize (traj) int64 ...
+ Attributes:
+ title: ANDRO: An Argo-based deep displacement dataset
+ history: 2022-03-04
+ date_created: 2023-12-08T00:52:00.937120
+ publisher_name: SEANOE (SEA scieNtific Open data Edition)
+ publisher_url: https://www.seanoe.org/data/00360/47077/
+ license: freely available
+
+    References
+    ----------
+ Ollitrault Michel, Rannou Philippe, Brion Emilie, Cabanes Cecile, Piron Anne, Reverdin Gilles,
+ Kolodziejczyk Nicolas (2022). ANDRO: An Argo-based deep displacement dataset.
+ SEANOE. https://doi.org/10.17882/47077
+ """
+    clouddrift_path = (
+        os.path.expanduser("~/.clouddrift")
+        if not os.getenv("CLOUDDRIFT_PATH")
+        else os.getenv("CLOUDDRIFT_PATH")
+    )
+    local_file = f"{clouddrift_path}/data/andro.nc"
+    if not os.path.exists(local_file):
+        print(f"{local_file} not found; download from upstream repository.")
+        ds = adapters.andro.to_xarray()
+        os.makedirs(os.path.dirname(local_file), exist_ok=True)
+        ds.to_netcdf(local_file)
+    else:
+        ds = xr.open_dataset(local_file, decode_times=decode_times)
+    return ds
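The three dataset loaders above share the same cache-resolution pattern: an optional CLOUDDRIFT_PATH environment variable overrides the default ~/.clouddrift directory. A minimal sketch of that pattern (the helper name `clouddrift_path` is illustrative, not part of the library API):

```python
import os

def clouddrift_path() -> str:
    # An optional CLOUDDRIFT_PATH environment variable overrides the
    # default cache directory under the user's home.
    return os.getenv("CLOUDDRIFT_PATH") or os.path.expanduser("~/.clouddrift")

os.environ.pop("CLOUDDRIFT_PATH", None)
print(clouddrift_path().endswith(".clouddrift"))  # -> True
os.environ["CLOUDDRIFT_PATH"] = "/tmp/cache"
print(clouddrift_path())  # -> /tmp/cache
```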
+def kinetic_energy(
+    u: Union[float, list, np.ndarray, xr.DataArray, pd.Series],
+    v: Optional[Union[float, list, np.ndarray, xr.DataArray, pd.Series]] = None,
+) -> Union[float, np.ndarray, xr.DataArray]:
+    """Compute kinetic energy from zonal and meridional velocities.
+
+ Parameters
+ ----------
+ u : float or array-like
+ Zonal velocity.
+    v : float or array-like, optional
+ Meridional velocity. If not provided, the flow is assumed one-dimensional
+ in time and defined by ``u``.
+
+ Returns
+ -------
+ ke : float or array-like
+ Kinetic energy.
+
+ Examples
+ --------
+ >>> import numpy as np
+ >>> from clouddrift.kinematics import kinetic_energy
+ >>> u = np.array([1., 2., 3., 4.])
+ >>> v = np.array([1., 1., 1., 1.])
+ >>> kinetic_energy(u, v)
+ array([1. , 2.5, 5. , 8.5])
+
+ >>> u = np.reshape(np.tile([1., 2., 3., 4.], 2), (2, 4))
+ >>> v = np.reshape(np.tile([1., 1., 1., 1.], 2), (2, 4))
+ >>> kinetic_energy(u, v)
+ array([[1. , 2.5, 5. , 8.5],
+ [1. , 2.5, 5. , 8.5]])
+ """
+    if v is None:
+        v = np.zeros_like(u)
+    ke = (u**2 + v**2) / 2
+    return ke
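Since the computation reduces to three lines, its behavior is easy to check in isolation. A standalone sketch of the same formula (not the library function itself), showing that an omitted ``v`` is treated as a zero array:

```python
import numpy as np

def kinetic_energy(u, v=None):
    # Per-sample kinetic energy (u^2 + v^2) / 2; a missing v is treated
    # as a zero array, i.e. a one-dimensional flow.
    if v is None:
        v = np.zeros_like(u)
    return (u**2 + v**2) / 2

print(kinetic_energy(np.array([1.0, 2.0, 3.0])))  # -> [0.5 2.  4.5]
```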
+
+
+
+
+def inertial_oscillation_from_position(
+    longitude: np.ndarray,
+    latitude: np.ndarray,
+    relative_bandwidth: Optional[float] = None,
+    wavelet_duration: Optional[float] = None,
+    time_step: Optional[float] = 3600.0,
+    relative_vorticity: Optional[Union[float, np.ndarray]] = 0.0,
+) -> np.ndarray:
+"""Extract inertial oscillations from consecutive geographical positions.
+
+ This function acts by performing a time-frequency analysis of horizontal displacements
+ with analytic Morse wavelets. It extracts the portion of the wavelet transform signal
+ that follows the inertial frequency (opposite of Coriolis frequency) as a function of time,
+ potentially shifted in frequency by a measure of relative vorticity. The result is a pair
+ of zonal and meridional relative displacements in meters.
+
+ This function is equivalent to a bandpass filtering of the horizontal displacements. The characteristics
+ of the filter are defined by the relative bandwidth of the wavelet transform or by the duration of the wavelet,
+ see the parameters below.
+
+ Parameters
+ ----------
+ longitude : array-like
+ Longitude sequence. Unidimensional array input.
+ latitude : array-like
+ Latitude sequence. Unidimensional array input.
+ relative_bandwidth : float, optional
+ Bandwidth of the frequency-domain equivalent filter for the extraction of the inertial
+ oscillations; a number less or equal to one which is a fraction of the inertial frequency.
+ A value of 0.1 leads to a bandpass filter equivalent of +/- 10 percent of the inertial frequency.
+ wavelet_duration : float, optional
+ Duration of the wavelet, or inverse of the relative bandwidth, which can be passed instead of the
+ relative bandwidth.
+ time_step : float, optional
+ The constant time interval between data points in seconds. Default is 3600.
+    relative_vorticity : float or array-like, optional
+ Relative vorticity adding to the local Coriolis frequency. If "f" is the Coriolis
+ frequency then "f" + `relative_vorticity` will be the effective Coriolis frequency as defined by Kunze (1985).
+ Positive values correspond to cyclonic vorticity, irrespectively of the latitudes of the data
+ points.
+
+ Returns
+ -------
+ xhat : array-like
+ Zonal relative displacement in meters from inertial oscillations.
+ yhat : array-like
+ Meridional relative displacement in meters from inertial oscillations.
+
+ Examples
+ --------
+ To extract displacements from inertial oscillations from sequences of longitude
+ and latitude values, equivalent to bandpass around 20 percent of the local inertial frequency:
+
+ >>> xhat, yhat = inertial_oscillation_from_position(longitude, latitude, relative_bandwidth=0.2)
+
+ The same result can be obtained by specifying the wavelet duration instead of the relative bandwidth:
+
+ >>> xhat, yhat = inertial_oscillation_from_position(longitude, latitude, wavelet_duration=5)
+
+ Next, the residual positions from the inertial displacements can be obtained with another function:
+
+ >>> residual_longitudes, residual_latitudes = residual_position_from_displacement(longitude, latitude, xhat, yhat)
+
+ Raises
+ ------
+ ValueError
+ If longitude and latitude arrays do not have the same shape.
+ If relative_vorticity is an array and does not have the same shape as longitude and latitude.
+ If time_step is not a float.
+ If both relative_bandwidth and wavelet_duration are specified.
+ If neither relative_bandwidth nor wavelet_duration are specified.
+ If the absolute value of relative_bandwidth is not in the range (0,1].
+ If the wavelet duration is not greater than or equal to 1.
+
+ See Also
+ --------
+ :func:`residual_position_from_displacement`, `wavelet_transform`, `morse_wavelet`
+
+ """
+    if longitude.shape != latitude.shape:
+        raise ValueError("longitude and latitude arrays must have the same shape.")
+
+    if relative_bandwidth is not None and wavelet_duration is not None:
+        raise ValueError(
+            "Only one of 'relative_bandwidth' and 'wavelet_duration' can be specified"
+        )
+    elif relative_bandwidth is None and wavelet_duration is None:
+        raise ValueError(
+            "One of 'relative_bandwidth' and 'wavelet_duration' must be specified"
+        )
+
+    # length of data sequence
+    data_length = longitude.shape[0]
+
+    if isinstance(relative_vorticity, float):
+        relative_vorticity = np.full_like(longitude, relative_vorticity)
+    elif isinstance(relative_vorticity, np.ndarray):
+        if not relative_vorticity.shape == longitude.shape:
+            raise ValueError(
+                "relative_vorticity must be a float or the same shape as longitude and latitude."
+            )
+    if relative_bandwidth is not None:
+        if not 0 < np.abs(relative_bandwidth) <= 1:
+            raise ValueError("relative_bandwidth must be in the (0, 1] range")
+
+    if wavelet_duration is not None:
+        if not wavelet_duration >= 1:
+            raise ValueError("wavelet_duration must be greater than or equal to 1")
+
+    # wavelet parameters are gamma and beta
+    gamma = 3  # symmetric wavelet
+    density = 16  # results relatively insensitive to this parameter
+    # calculate beta from wavelet duration or from relative bandwidth
+    if relative_bandwidth is not None:
+        wavelet_duration = 1 / np.abs(relative_bandwidth)  # P parameter
+    beta = wavelet_duration**2 / gamma
+
+    if isinstance(latitude, xr.DataArray):
+        latitude = latitude.to_numpy()
+    if isinstance(longitude, xr.DataArray):
+        longitude = longitude.to_numpy()
+
+    # instantaneous absolute frequency of oscillations along trajectory in radians per second
+    cor_freq = np.abs(
+        coriolis_frequency(latitude) + relative_vorticity * np.sign(latitude)
+    )
+    cor_freq_max = np.max(cor_freq * 1.05)
+    cor_freq_min = np.max(
+        [np.min(cor_freq * 0.95), 2 * np.pi / (time_step * data_length)]
+    )
+
+    # logarithmically distributed frequencies for wavelet analysis
+    radian_frequency = morse_logspace_freq(
+        gamma,
+        beta,
+        data_length,
+        (0.05, cor_freq_max * time_step),
+        (5, cor_freq_min * time_step),
+        density,
+    )  # frequencies in radians per unit time
+
+    # wavelet transform on a sphere
+    # unwrap longitude recast in [0, 360)
+    longitude_unwrapped = np.unwrap(recast_lon360(longitude), period=360)
+
+    # convert lat/lon to Cartesian coordinates x, y, z
+    x, y, z = spherical_to_cartesian(longitude_unwrapped, latitude)
+
+    # wavelet transform of x, y, z
+    wavelet, _ = morse_wavelet(data_length, gamma, beta, radian_frequency)
+    wx = wavelet_transform(x, wavelet, boundary="mirror")
+    wy = wavelet_transform(y, wavelet, boundary="mirror")
+    wz = wavelet_transform(z, wavelet, boundary="mirror")
+
+    longitude_new, latitude_new = cartesian_to_spherical(
+        x - np.real(wx), y - np.real(wy), z - np.real(wz)
+    )
+
+    # convert transforms to horizontal displacements on tangent plane
+    wxh, wyh = cartesian_to_tangentplane(wx, wy, wz, longitude_new, latitude_new)
+
+    # rotary wavelet transforms to select inertial component; need to divide by sqrt(2)
+    wp = (wxh + 1j * wyh) / np.sqrt(2)
+    wn = (wxh - 1j * wyh) / np.sqrt(2)
+
+    # find the values of radian_frequency/dt that most closely match cor_freq
+    frequency_bins = [
+        np.argmin(np.abs(cor_freq[i] - radian_frequency / time_step))
+        for i in range(data_length)
+    ]
+
+    # get the transform at the inertial and "anti-inertial" frequencies
+    # extract the values of wp and wn at the calculated index as a function of time
+    # positive is anticyclonic (inertial) in the southern hemisphere
+    # negative is anticyclonic (inertial) in the northern hemisphere
+    wp = wp[frequency_bins, np.arange(0, data_length)]
+    wn = wn[frequency_bins, np.arange(0, data_length)]
+
+    # indices of northern latitude points
+    north = latitude >= 0
+
+    # initialize the zonal and meridional components of inertial displacements
+    wxhat = np.zeros_like(latitude, dtype=np.complex64)
+    wyhat = np.zeros_like(latitude, dtype=np.complex64)
+    # equations are x+ = 0.5*(z+ + z-) and y+ = -0.5*1j*(z+ - z-)
+    if any(north):
+        wxhat[north] = wn[north] / np.sqrt(2)
+        wyhat[north] = 1j * wn[north] / np.sqrt(2)
+    if any(~north):
+        wxhat[~north] = wp[~north] / np.sqrt(2)
+        wyhat[~north] = -1j * wp[~north] / np.sqrt(2)
+
+    # inertial displacement in meters
+    xhat = np.real(wxhat)
+    yhat = np.real(wyhat)
+
+    return xhat, yhat
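The ridge-following step above selects, at every time index, the wavelet frequency bin closest to the local Coriolis frequency, then samples the transform along that path. The lookup can be illustrated with toy arrays (all values below are made up for illustration, and the `time_step` division is omitted):

```python
import numpy as np

# Toy stand-in for the nearest-frequency lookup: transform has shape
# (n_freq, n_time); for each time index pick the frequency bin closest
# to the local absolute Coriolis frequency, then sample the transform
# along that (bin, time) path.
radian_frequency = np.array([0.1, 0.2, 0.4, 0.8])
cor_freq = np.array([0.12, 0.25, 0.45, 0.70, 0.78])
n_time = cor_freq.size
transform = np.arange(radian_frequency.size * n_time).reshape(
    radian_frequency.size, n_time
)

frequency_bins = [
    int(np.argmin(np.abs(cor_freq[i] - radian_frequency))) for i in range(n_time)
]
ridge = transform[frequency_bins, np.arange(n_time)]
print(frequency_bins)  # -> [0, 1, 2, 3, 3]
print(ridge)  # -> [ 0  6 12 18 19]
```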
+
+
+
+
+def residual_position_from_displacement(
+    longitude: Union[float, np.ndarray, xr.DataArray],
+    latitude: Union[float, np.ndarray, xr.DataArray],
+    x: Union[float, np.ndarray],
+    y: Union[float, np.ndarray],
+) -> Union[Tuple[float], Tuple[np.ndarray]]:
+"""
+ Return residual longitudes and latitudes along a trajectory on the spherical Earth
+ after correcting for zonal and meridional displacements x and y in meters.
+
+ This is applicable as an example when one seeks to correct a trajectory for
+ horizontal oscillations due to inertial motions, tides, etc.
+
+ Parameters
+ ----------
+ longitude : float or array-like
+ Longitude in degrees.
+ latitude : float or array-like
+ Latitude in degrees.
+ x : float or np.ndarray
+ Zonal displacement in meters.
+ y : float or np.ndarray
+ Meridional displacement in meters.
+
+ Returns
+ -------
+ residual_longitude : float or np.ndarray
+ Residual longitude after correcting for zonal displacement, in degrees.
+ residual_latitude : float or np.ndarray
+ Residual latitude after correcting for meridional displacement, in degrees.
+
+ Examples
+ --------
+ Obtain the new geographical position for a displacement of 1/360-th of the
+ circumference of the Earth from original position (longitude,latitude) = (1,0):
+
+    >>> import numpy as np
+    >>> from clouddrift.sphere import EARTH_RADIUS_METERS
+    >>> from clouddrift.kinematics import residual_position_from_displacement
+    >>> residual_position_from_displacement(1, 0, 2 * np.pi * EARTH_RADIUS_METERS / 360, 0)
+    (0.0, 0.0)
+ """
+    # convert to numpy arrays to ensure consistent outputs
+    if isinstance(longitude, xr.DataArray):
+        longitude = longitude.to_numpy()
+    if isinstance(latitude, xr.DataArray):
+        latitude = latitude.to_numpy()
+
+    latitudehat = 180 / np.pi * y / EARTH_RADIUS_METERS
+    longitudehat = (
+        180 / np.pi * x / (EARTH_RADIUS_METERS * np.cos(np.radians(latitude)))
+    )
+
+    residual_latitude = latitude - latitudehat
+    residual_longitude = recast_lon360(
+        np.degrees(np.angle(np.exp(1j * np.radians(longitude - longitudehat))))
+    )
+
+    return residual_longitude, residual_latitude
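The meters-to-degrees conversion above is simple enough to verify standalone. A minimal sketch of the same arithmetic, under the assumption of a spherical Earth (the radius value and the helper name `residual_position` are illustrative; clouddrift defines its own `EARTH_RADIUS_METERS` constant):

```python
import numpy as np

EARTH_RADIUS_METERS = 6.3781e6  # assumed value for illustration

def residual_position(longitude, latitude, x, y):
    # Convert meter displacements to degrees and subtract from the positions;
    # the complex-exponential trick wraps the longitude difference into (-180, 180].
    lat_hat = 180 / np.pi * y / EARTH_RADIUS_METERS
    lon_hat = 180 / np.pi * x / (EARTH_RADIUS_METERS * np.cos(np.radians(latitude)))
    residual_lat = latitude - lat_hat
    residual_lon = np.degrees(np.angle(np.exp(1j * np.radians(longitude - lon_hat))))
    return residual_lon, residual_lat

# A zonal displacement of 1/360th of the equatorial circumference undoes
# one degree of longitude.
lon, lat = residual_position(1.0, 0.0, 2 * np.pi * EARTH_RADIUS_METERS / 360, 0.0)
print(abs(lon) < 1e-9, lat)  # -> True 0.0
```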
+
+
+
+
+def position_from_velocity(
+    u: np.ndarray,
+    v: np.ndarray,
+    time: np.ndarray,
+    x_origin: float,
+    y_origin: float,
+    coord_system: Optional[str] = "spherical",
+    integration_scheme: Optional[str] = "forward",
+    time_axis: Optional[int] = -1,
+) -> Tuple[np.ndarray, np.ndarray]:
+"""Compute positions from arrays of velocities and time and a pair of origin
+ coordinates.
+
+ The units of the result are degrees if ``coord_system == "spherical"`` (default).
+ If ``coord_system == "cartesian"``, the units of the result are equal to the
+ units of the input velocities multiplied by the units of the input time.
+ For example, if the input velocities are in meters per second and the input
+ time is in seconds, the units of the result will be meters.
+
+ Integration scheme can take one of three values:
+
+ 1. "forward" (default): integration from x[i] to x[i+1] is performed
+ using the velocity at x[i].
+ 2. "backward": integration from x[i] to x[i+1] is performed using the
+ velocity at x[i+1].
+ 3. "centered": integration from x[i] to x[i+1] is performed using the
+ arithmetic average of the velocities at x[i] and x[i+1]. Note that
+ this method introduces some error due to the averaging.
+
+ u, v, and time can be multi-dimensional arrays. If the time axis, along
+ which the finite differencing is performed, is not the last one (i.e.
+ x.shape[-1]), use the ``time_axis`` optional argument to specify along which
+ axis should the differencing be done. ``x``, ``y``, and ``time`` must have
+ the same shape.
+
+ This function will not do any special handling of longitude ranges. If the
+ integrated trajectory crosses the antimeridian (dateline) in either direction, the
+ longitude values will not be adjusted to stay in any specific range such
+ as [-180, 180] or [0, 360]. If you need your longitudes to be in a specific
+ range, recast the resulting longitude from this function using the function
+ :func:`clouddrift.sphere.recast_lon`.
+
+ Parameters
+ ----------
+ u : np.ndarray
+ An array of eastward velocities.
+ v : np.ndarray
+ An array of northward velocities.
+ time : np.ndarray
+ An array of time values.
+ x_origin : float
+ Origin x-coordinate or origin longitude.
+ y_origin : float
+ Origin y-coordinate or origin latitude.
+ coord_system : str, optional
+ The coordinate system of the input. Can be "spherical" or "cartesian".
+ Default is "spherical".
+ integration_scheme : str, optional
+        The integration scheme to use for computing the positions. Can be
+        "forward", "backward", or "centered". Default is "forward".
+ time_axis : int, optional
+ The axis of the time array. Default is -1, which corresponds to the
+ last axis.
+
+ Returns
+ -------
+ x : np.ndarray
+ An array of zonal displacements or longitudes.
+ y : np.ndarray
+ An array of meridional displacements or latitudes.
+
+ Examples
+ --------
+
+ Simple integration on a plane, using the forward scheme by default:
+
+ >>> import numpy as np
+ >>> from clouddrift.analysis import position_from_velocity
+ >>> u = np.array([1., 2., 3., 4.])
+ >>> v = np.array([1., 1., 1., 1.])
+ >>> time = np.array([0., 1., 2., 3.])
+ >>> x, y = position_from_velocity(u, v, time, 0, 0, coord_system="cartesian")
+ >>> x
+ array([0., 1., 3., 6.])
+ >>> y
+ array([0., 1., 2., 3.])
+
+ As above, but using centered scheme:
+
+ >>> x, y = position_from_velocity(u, v, time, 0, 0, coord_system="cartesian", integration_scheme="centered")
+ >>> x
+ array([0., 1.5, 4., 7.5])
+ >>> y
+ array([0., 1., 2., 3.])
+
+ Simple integration on a sphere (default):
+
+ >>> u = np.array([1., 2., 3., 4.])
+ >>> v = np.array([1., 1., 1., 1.])
+ >>> time = np.array([0., 1., 2., 3.]) * 1e5
+ >>> x, y = position_from_velocity(u, v, time, 0, 0)
+ >>> x
+ array([0. , 0.89839411, 2.69584476, 5.39367518])
+ >>> y
+ array([0. , 0.89828369, 1.79601515, 2.69201609])
+
+ Integrating across the antimeridian (dateline) by default does not
+ recast the resulting longitude:
+
+ >>> u = np.array([1., 1.])
+ >>> v = np.array([0., 0.])
+ >>> time = np.array([0, 1e5])
+ >>> x, y = position_from_velocity(u, v, time, 179.5, 0)
+ >>> x
+ array([179.5 , 180.3983205])
+ >>> y
+ array([0., 0.])
+
+ Use the ``clouddrift.sphere.recast_lon`` function to recast the longitudes
+ to the desired range:
+
+ >>> from clouddrift.sphere import recast_lon
+ >>> recast_lon(x, -180)
+ array([ 179.5 , -179.6016795])
+
+ Raises
+ ------
+ ValueError
+ If u and v do not have the same shape.
+ If the time axis is outside of the valid range ([-1, N-1]).
+ If lengths of x, y, and time along time_axis are not equal.
+ If the input coordinate system is not "spherical" or "cartesian".
+        If the input integration scheme is not "forward", "backward", or "centered".
+
+ See Also
+ --------
+ :func:`velocity_from_position`
+ """
+ # Velocity arrays must have the same shape.
+ # Although the exception would be raised further down in the function,
+ # we do the check here for a clearer error message.
+    # Velocity arrays must have the same shape.
+    # Although the exception would be raised further down in the function,
+    # we do the check here for a clearer error message.
+    if not u.shape == v.shape:
+        raise ValueError("u and v must have the same shape.")
+
+    # time_axis must be in valid range
+    if time_axis < -1 or time_axis > len(u.shape) - 1:
+        raise ValueError(
+            f"time_axis ({time_axis}) is outside of the valid range ([-1,"
+            f" {len(u.shape) - 1}])."
+        )
+
+    # Input arrays must have the same length along the time axis.
+    if not u.shape[time_axis] == v.shape[time_axis] == time.shape[time_axis]:
+        raise ValueError(
+            f"u, v, and time must have the same length along the time axis "
+            f"({time_axis})."
+        )
+
+    # Swap axes so that we can differentiate along the last axis.
+    # This is a syntax convenience rather than memory access optimization:
+    # np.swapaxes returns a view of the array, not a copy, if the input is a
+    # NumPy array. Otherwise, it returns a copy. For readability, introduce new
+    # variable names so that we can more easily differentiate between the
+    # original arrays and those with swapped axes.
+    u_ = np.swapaxes(u, time_axis, -1)
+    v_ = np.swapaxes(v, time_axis, -1)
+    time_ = np.swapaxes(time, time_axis, -1)
+
+    x = np.zeros(u_.shape, dtype=u.dtype)
+    y = np.zeros(v_.shape, dtype=v.dtype)
+
+    dt = np.diff(time_)
+
+    if integration_scheme.lower() == "forward":
+        x[..., 1:] = np.cumsum(u_[..., :-1] * dt, axis=-1)
+        y[..., 1:] = np.cumsum(v_[..., :-1] * dt, axis=-1)
+    elif integration_scheme.lower() == "backward":
+        x[..., 1:] = np.cumsum(u_[..., 1:] * dt, axis=-1)
+        y[..., 1:] = np.cumsum(v_[..., 1:] * dt, axis=-1)
+    elif integration_scheme.lower() == "centered":
+        x[..., 1:] = np.cumsum(0.5 * (u_[..., :-1] + u_[..., 1:]) * dt, axis=-1)
+        y[..., 1:] = np.cumsum(0.5 * (v_[..., :-1] + v_[..., 1:]) * dt, axis=-1)
+    else:
+        raise ValueError(
+            'integration_scheme must be "forward", "backward", or "centered".'
+        )
+
+    if coord_system.lower() == "cartesian":
+        x += x_origin
+        y += y_origin
+    elif coord_system.lower() == "spherical":
+        dx = np.diff(x)
+        dy = np.diff(y)
+        distances = np.sqrt(dx**2 + dy**2)
+        bearings = np.arctan2(dy, dx)
+        x[..., 0], y[..., 0] = x_origin, y_origin
+        for n in range(distances.shape[-1]):
+            x[..., n + 1], y[..., n + 1] = position_from_distance_and_bearing(
+                x[..., n], y[..., n], distances[..., n], bearings[..., n]
+            )
+    else:
+        raise ValueError('coord_system must be "spherical" or "cartesian".')
+
+    return np.swapaxes(x, time_axis, -1), np.swapaxes(y, time_axis, -1)
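The forward branch above amounts to a cumulative sum of velocity times time step, shifted by one index so the origin stays at the first element. A minimal standalone check on a Cartesian plane, using the same numbers as the docstring example:

```python
import numpy as np

# Forward scheme: x[i+1] = x[i] + u[i] * dt[i], implemented as a shifted
# cumulative sum, with x[0] left at the origin (0 here).
u = np.array([1.0, 2.0, 3.0, 4.0])
time = np.array([0.0, 1.0, 2.0, 3.0])
dt = np.diff(time)

x = np.zeros_like(u)
x[1:] = np.cumsum(u[:-1] * dt)
print(x)  # -> [0. 1. 3. 6.]
```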
+
+
+
+
+def velocity_from_position(
+    x: np.ndarray,
+    y: np.ndarray,
+    time: np.ndarray,
+    coord_system: Optional[str] = "spherical",
+    difference_scheme: Optional[str] = "forward",
+    time_axis: Optional[int] = -1,
+) -> Tuple[xr.DataArray, xr.DataArray]:
+"""Compute velocity from arrays of positions and time.
+
+ x and y can be provided as longitude and latitude in degrees if
+ coord_system == "spherical" (default), or as easting and northing if
+ coord_system == "cartesian".
+
+ The units of the result are meters per unit of time if
+ coord_system == "spherical". For example, if the time is provided in the
+ units of seconds, the resulting velocity is in the units of meters per
+ second. Otherwise, if coord_system == "cartesian", the units of the
+ resulting velocity correspond to the units of the input. For example,
+ if zonal and meridional displacements are in the units of kilometers and
+ time is in the units of hours, the resulting velocity is in the units of
+ kilometers per hour.
+
+ x, y, and time can be multi-dimensional arrays. If the time axis, along
+ which the finite differencing is performed, is not the last one (i.e.
+ x.shape[-1]), use the time_axis optional argument to specify along which
+ axis should the differencing be done. x, y, and time must have the same
+ shape.
+
+ Difference scheme can take one of three values:
+
+ #. "forward" (default): finite difference is evaluated as ``dx[i] = dx[i+1] - dx[i]``;
+ #. "backward": finite difference is evaluated as ``dx[i] = dx[i] - dx[i-1]``;
+ #. "centered": finite difference is evaluated as ``dx[i] = (dx[i+1] - dx[i-1]) / 2``.
+
+ Forward and backward schemes are effectively the same except that the
+ position at which the velocity is evaluated is shifted one element down in
+ the backward scheme relative to the forward scheme. In the case of a
+ forward or backward difference scheme, the last or first element of the
+ velocity, respectively, is extrapolated from its neighboring point. In the
+ case of a centered difference scheme, the start and end boundary points are
+ evaluated using the forward and backward difference scheme, respectively.
+
+ Parameters
+ ----------
+ x : array_like
+ An N-d array of x-positions (longitude in degrees or zonal displacement in any unit)
+ y : array_like
+ An N-d array of y-positions (latitude in degrees or meridional displacement in any unit)
+ time : array_like
+ An N-d array of times as floating point values (in any unit)
+ coord_system : str, optional
+ Coordinate system that x and y arrays are in; possible values are "spherical" (default) or "cartesian".
+ difference_scheme : str, optional
+ Difference scheme to use; possible values are "forward", "backward", and "centered".
+ time_axis : int, optional
+ Axis along which to differentiate (default is -1)
+
+ Returns
+ -------
+ u : np.ndarray
+ Zonal velocity
+ v : np.ndarray
+ Meridional velocity
+
+ Raises
+ ------
+ ValueError
+ If x and y do not have the same shape.
+ If time_axis is outside of the valid range.
+ If lengths of x, y, and time along time_axis are not equal.
+ If coord_system is not "spherical" or "cartesian".
+ If difference_scheme is not "forward", "backward", or "centered".
+
+ Examples
+ --------
+ Simple integration on a sphere, using the forward scheme by default:
+
+ >>> import numpy as np
+ >>> from clouddrift.kinematics import velocity_from_position
+ >>> lon = np.array([0., 1., 3., 6.])
+ >>> lat = np.array([0., 1., 2., 3.])
+ >>> time = np.array([0., 1., 2., 3.]) * 1e5
+ >>> u, v = velocity_from_position(lon, lat, time)
+ >>> u
+ array([1.11307541, 2.22513331, 3.33515501, 3.33515501])
+ >>> v
+ array([1.11324496, 1.11409224, 1.1167442 , 1.1167442 ])
+
+ Integration on a Cartesian plane, using the forward scheme by default:
+
+ >>> x = np.array([0., 1., 3., 6.])
+ >>> y = np.array([0., 1., 2., 3.])
+ >>> time = np.array([0., 1., 2., 3.])
+ >>> u, v = velocity_from_position(x, y, time, coord_system="cartesian")
+ >>> u
+ array([1., 2., 3., 3.])
+ >>> v
+ array([1., 1., 1., 1.])
+
+ See Also
+ --------
+ :func:`position_from_velocity`
+ """
+
+ # Position arrays must have the same shape.
+ # Although the exception would be raised further down in the function,
+ # we do the check here for a clearer error message.
+    # Position arrays must have the same shape.
+    # Although the exception would be raised further down in the function,
+    # we do the check here for a clearer error message.
+    if not x.shape == y.shape:
+        raise ValueError("x and y arrays must have the same shape.")
+
+    # time_axis must be in valid range
+    if time_axis < -1 or time_axis > len(x.shape) - 1:
+        raise ValueError(
+            f"time_axis ({time_axis}) is outside of the valid range ([-1,"
+            f" {len(x.shape) - 1}])."
+        )
+
+    # Input arrays must have the same length along the time axis.
+    if not x.shape[time_axis] == y.shape[time_axis] == time.shape[time_axis]:
+        raise ValueError(
+            f"x, y, and time must have the same length along the time axis "
+            f"({time_axis})."
+        )
+
+    # Swap axes so that we can differentiate along the last axis.
+    # This is a syntax convenience rather than memory access optimization:
+    # np.swapaxes returns a view of the array, not a copy, if the input is a
+    # NumPy array. Otherwise, it returns a copy. For readability, introduce new
+    # variable names so that we can more easily differentiate between the
+    # original arrays and those with swapped axes.
+    x_ = np.swapaxes(x, time_axis, -1)
+    y_ = np.swapaxes(y, time_axis, -1)
+    time_ = np.swapaxes(time, time_axis, -1)
+
+    dx = np.empty(x_.shape)
+    dy = np.empty(y_.shape)
+    dt = np.empty(time_.shape)
+
+    # Compute dx, dy, and dt
+    if difference_scheme == "forward":
+        # All values except the ending boundary value are computed using the
+        # 1st order forward differencing. The ending boundary value is
+        # computed using the 1st order backward difference.
+
+        # Time
+        dt[..., :-1] = np.diff(time_)
+        dt[..., -1] = dt[..., -2]
+
+        # Space
+        if coord_system == "cartesian":
+            dx[..., :-1] = np.diff(x_)
+            dx[..., -1] = dx[..., -2]
+            dy[..., :-1] = np.diff(y_)
+            dy[..., -1] = dy[..., -2]
+
+        elif coord_system == "spherical":
+            distances = distance(x_[..., :-1], y_[..., :-1], x_[..., 1:], y_[..., 1:])
+            bearings = bearing(x_[..., :-1], y_[..., :-1], x_[..., 1:], y_[..., 1:])
+            dx[..., :-1] = distances * np.cos(bearings)
+            dx[..., -1] = dx[..., -2]
+            dy[..., :-1] = distances * np.sin(bearings)
+            dy[..., -1] = dy[..., -2]
+
+        else:
+            raise ValueError('coord_system must be "spherical" or "cartesian".')
+
+    elif difference_scheme == "backward":
+        # All values except the starting boundary value are computed using the
+        # 1st order backward differencing. The starting boundary value is
+        # computed using the 1st order forward difference.
+
+        # Time
+        dt[..., 1:] = np.diff(time_)
+        dt[..., 0] = dt[..., 1]
+
+        # Space
+        if coord_system == "cartesian":
+            dx[..., 1:] = np.diff(x_)
+            dx[..., 0] = dx[..., 1]
+            dy[..., 1:] = np.diff(y_)
+            dy[..., 0] = dy[..., 1]
+
+        elif coord_system == "spherical":
+            distances = distance(x_[..., :-1], y_[..., :-1], x_[..., 1:], y_[..., 1:])
+            bearings = bearing(x_[..., :-1], y_[..., :-1], x_[..., 1:], y_[..., 1:])
+            dx[..., 1:] = distances * np.cos(bearings)
+            dx[..., 0] = dx[..., 1]
+            dy[..., 1:] = distances * np.sin(bearings)
+            dy[..., 0] = dy[..., 1]
+
+        else:
+            raise ValueError('coord_system must be "spherical" or "cartesian".')
+
+    elif difference_scheme == "centered":
+        # Inner values are computed using the 2nd order centered differencing.
+        # The start and end boundary values are computed using the 1st order
+        # forward and backward differencing, respectively.
+
+        # Time
+        dt[..., 1:-1] = (time_[..., 2:] - time_[..., :-2]) / 2
+        dt[..., 0] = time_[..., 1] - time_[..., 0]
+        dt[..., -1] = time_[..., -1] - time_[..., -2]
+
+        # Space
+        if coord_system == "cartesian":
+            dx[..., 1:-1] = (x_[..., 2:] - x_[..., :-2]) / 2
+            dx[..., 0] = x_[..., 1] - x_[..., 0]
+            dx[..., -1] = x_[..., -1] - x_[..., -2]
+            dy[..., 1:-1] = (y_[..., 2:] - y_[..., :-2]) / 2
+            dy[..., 0] = y_[..., 1] - y_[..., 0]
+            dy[..., -1] = y_[..., -1] - y_[..., -2]
+
+        elif coord_system == "spherical":
+            # Inner values
+            y1 = (y_[..., :-2] + y_[..., 1:-1]) / 2
+            x1 = (x_[..., :-2] + x_[..., 1:-1]) / 2
+            y2 = (y_[..., 2:] + y_[..., 1:-1]) / 2
+            x2 = (x_[..., 2:] + x_[..., 1:-1]) / 2
+            distances = distance(x1, y1, x2, y2)
+            bearings = bearing(x1, y1, x2, y2)
+            dx[..., 1:-1] = distances * np.cos(bearings)
+            dy[..., 1:-1] = distances * np.sin(bearings)
+
+            # Boundary values
+            distance1 = distance(x_[..., 0], y_[..., 0], x_[..., 1], y_[..., 1])
+            bearing1 = bearing(x_[..., 0], y_[..., 0], x_[..., 1], y_[..., 1])
+            dx[..., 0] = distance1 * np.cos(bearing1)
+            dy[..., 0] = distance1 * np.sin(bearing1)
+            distance2 = distance(x_[..., -2], y_[..., -2], x_[..., -1], y_[..., -1])
+            bearing2 = bearing(x_[..., -2], y_[..., -2], x_[..., -1], y_[..., -1])
+            dx[..., -1] = distance2 * np.cos(bearing2)
+            dy[..., -1] = distance2 * np.sin(bearing2)
+
+        else:
+            raise ValueError('coord_system must be "spherical" or "cartesian".')
+
+    else:
+        raise ValueError(
+            'difference_scheme must be "forward", "backward", or "centered".'
+        )
+
+    # This should avoid an array copy when returning the result
+    dx /= dt
+    dy /= dt
+
+    return np.swapaxes(dx, time_axis, -1), np.swapaxes(dy, time_axis, -1)
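The centered branch above uses second-order differences on the interior and falls back to one-sided differences at the two boundary points. A minimal standalone sketch of that stencil on Cartesian positions:

```python
import numpy as np

# Centered differences on the interior, one-sided at the boundaries,
# mirroring the "centered" Cartesian branch above.
x = np.array([0.0, 1.0, 3.0, 6.0])
t = np.array([0.0, 1.0, 2.0, 3.0])

dx = np.empty_like(x)
dt = np.empty_like(t)
dx[1:-1] = (x[2:] - x[:-2]) / 2
dx[0], dx[-1] = x[1] - x[0], x[-1] - x[-2]
dt[1:-1] = (t[2:] - t[:-2]) / 2
dt[0], dt[-1] = t[1] - t[0], t[-1] - t[-2]

u = dx / dt
print(u)  # -> [1.  1.5 2.5 3. ]
```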
+
+
+
+
+def spin(
+    u: np.ndarray,
+    v: np.ndarray,
+    time: np.ndarray,
+    difference_scheme: Optional[str] = "forward",
+    time_axis: Optional[int] = -1,
+) -> Union[float, np.ndarray]:
+"""Compute spin continuously from velocities and times.
+
+ Spin is traditionally (Sawford, 1999; Veneziani et al., 2005) defined as
+ (<u'dv' - v'du'>) / (2 dt EKE) where u' and v' are eddy-perturbations of the
+ velocity field, EKE is eddy kinetic energy, dt is the time step, and du' and
+ dv' are velocity component increments during dt, and < > denotes ensemble
+ average.
+
+ To allow computing spin based on full velocity fields, this function does
+ not do any demeaning of the velocity fields. If you need the spin based on
+ velocity anomalies, ensure to demean the velocity fields before passing
+ them to this function. This function also returns instantaneous spin values,
+ so the rank of the result is not reduced relative to the input.
+
+ ``u``, ``v``, and ``time`` can be multi-dimensional arrays. If the time
+ axis, along which the finite differencing is performed, is not the last one
+ (i.e. ``u.shape[-1]``), use the time_axis optional argument to specify along
+ which the spin should be calculated. u, v, and time must either have the
+ same shape, or time must be a 1-d array with the same length as
+ ``u.shape[time_axis]``.
+
+ Difference scheme can be one of three values:
+
+ 1. "forward" (default): finite difference is evaluated as ``dx[i] = dx[i+1] - dx[i]``;
+ 2. "backward": finite difference is evaluated as ``dx[i] = dx[i] - dx[i-1]``;
+ 3. "centered": finite difference is evaluated as ``dx[i] = (dx[i+1] - dx[i-1]) / 2``.
+
+ Forward and backward schemes are effectively the same except that the
+ position at which the velocity is evaluated is shifted one element down in
+ the backward scheme relative to the forward scheme. In the case of a
+ forward or backward difference scheme, the last or first element of the
+ velocity, respectively, is extrapolated from its neighboring point. In the
+ case of a centered difference scheme, the start and end boundary points are
+ evaluated using the forward and backward difference scheme, respectively.
+
+ Parameters
+ ----------
+ u : np.ndarray
+ Zonal velocity
+ v : np.ndarray
+ Meridional velocity
+ time : array-like
+ Time
+ difference_scheme : str, optional
+ Difference scheme to use; possible values are "forward", "backward", and "centered".
+ time_axis : int, optional
+ Axis along which the time varies (default is -1)
+
+ Returns
+ -------
+ s : float or np.ndarray
+ Spin
+
+ Raises
+ ------
+ ValueError
+ If u and v do not have the same shape.
+ If the time axis is outside of the valid range ([-1, N-1]).
+ If lengths of u, v, and time along time_axis are not equal.
+ If difference_scheme is not "forward", "backward", or "centered".
+
+ Examples
+ --------
+ >>> from clouddrift.kinematics import spin
+ >>> import numpy as np
+ >>> u = np.array([1., 2., -1., 4.])
+ >>> v = np.array([1., 3., -2., 1.])
+ >>> time = np.array([0., 1., 2., 3.])
+ >>> spin(u, v, time)
+ array([ 0.5 , -0.07692308, 1.4 , 0.41176471])
+
+ Use ``difference_scheme`` to specify an alternative finite difference
+ scheme for the velocity differences:
+
+ >>> spin(u, v, time, difference_scheme="centered")
+ array([0.5 , 0. , 0.6 , 0.41176471])
+ >>> spin(u, v, time, difference_scheme="backward")
+ array([ 0.5 , 0.07692308, -0.2 , 0.41176471])
+
+ References
+ ----------
+ * Sawford, B.L., 1999. Rotation of trajectories in Lagrangian stochastic models of turbulent dispersion. Boundary-layer meteorology, 93, pp.411-424. https://doi.org/10.1023/A:1002114132715
+ * Veneziani, M., Griffa, A., Garraffo, Z.D. and Chassignet, E.P., 2005. Lagrangian spin parameter and coherent structures from trajectories released in a high-resolution ocean model. Journal of Marine Research, 63(4), pp.753-788. https://elischolar.library.yale.edu/journal_of_marine_research/100/
+ """
+ ifnotu.shape==v.shape:
+ raiseValueError("u and v arrays must have the same shape.")
+
+ ifnottime.shape==u.shape:
+ ifnottime.size==u.shape[time_axis]:
+ raiseValueError("time must have the same length as u along time_axis.")
+
+ # axis must be in valid range
+ iftime_axis<-1ortime_axis>len(u.shape)-1:
+ raiseValueError(
+ f"axis ({time_axis}) is outside of the valid range ([-1,"
+ f" {len(u.shape)-1}])."
+ )
+
+ # Swap axes so that we can differentiate along the last axis.
+ # This is a syntax convenience rather than memory access optimization:
+ # np.swapaxes returns a view of the array, not a copy, if the input is a
+ # NumPy array. Otherwise, it returns a copy.
+ u=np.swapaxes(u,time_axis,-1)
+ v=np.swapaxes(v,time_axis,-1)
+ time=np.swapaxes(time,time_axis,-1)
+
+ ifnottime.shape==u.shape:
+ # time is 1-d array; broadcast to u.shape.
+ time=np.broadcast_to(time,u.shape)
+
+ du=np.empty(u.shape)
+ dv=np.empty(v.shape)
+ dt=np.empty(time.shape)
+
+ ifdifference_scheme=="forward":
+ du[...,:-1]=np.diff(u)
+ du[...,-1]=du[...,-2]
+ dv[...,:-1]=np.diff(v)
+ dv[...,-1]=dv[...,-2]
+ dt[...,:-1]=np.diff(time)
+ dt[...,-1]=dt[...,-2]
+ elifdifference_scheme=="backward":
+ du[...,1:]=np.diff(u)
+ du[...,0]=du[...,1]
+ dv[...,1:]=np.diff(v)
+ dv[...,0]=dv[...,1]
+ dt[...,1:]=np.diff(time)
+ dt[...,0]=dt[...,1]
+ elifdifference_scheme=="centered":
+ du[...,1:-1]=(u[...,2:]-u[...,:-2])/2
+ du[...,0]=u[...,1]-u[...,0]
+ du[...,-1]=u[...,-1]-u[...,-2]
+ dv[...,1:-1]=(v[...,2:]-v[...,:-2])/2
+ dv[...,0]=v[...,1]-v[...,0]
+ dv[...,-1]=v[...,-1]-v[...,-2]
+ dt[...,1:-1]=(time[...,2:]-time[...,:-2])/2
+ dt[...,0]=time[...,1]-time[...,0]
+ dt[...,-1]=time[...,-1]-time[...,-2]
+ else:
+ raiseValueError(
+ 'difference_scheme must be "forward", "backward", or "centered".'
+ )
+
+ # Compute spin
+ s=(u*dv-v*du)/(2*dt*kinetic_energy(u,v))
+
+ returnnp.swapaxes(s,time_axis,-1)
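As a concreteness check, the forward-scheme spin formula can be reproduced in plain Python for the docstring's example values. It is assumed here that ``kinetic_energy(u, v)`` equals ``(u**2 + v**2) / 2``, matching the EKE definition in the docstring; this sketch is illustrative, not the library implementation:

```python
# Plain-Python sketch of the forward-difference spin computation:
# s[i] = (u[i]*dv[i] - v[i]*du[i]) / (2 * dt[i] * EKE[i]),
# with the last increment extrapolated from its neighbor.
def spin_forward(u, v, t):
    n = len(u)
    du = [u[i + 1] - u[i] for i in range(n - 1)]
    dv = [v[i + 1] - v[i] for i in range(n - 1)]
    dt = [t[i + 1] - t[i] for i in range(n - 1)]
    du.append(du[-1])  # extrapolate the last increment
    dv.append(dv[-1])
    dt.append(dt[-1])
    ke = [(u[i] ** 2 + v[i] ** 2) / 2 for i in range(n)]  # assumed EKE
    return [(u[i] * dv[i] - v[i] * du[i]) / (2 * dt[i] * ke[i]) for i in range(n)]

s = spin_forward([1.0, 2.0, -1.0, 4.0], [1.0, 3.0, -2.0, 1.0], [0.0, 1.0, 2.0, 3.0])
print([round(x, 8) for x in s])  # [0.5, -0.07692308, 1.4, 0.41176471]
```

The rounded values match the first example output in the docstring above.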
"""
Functions to analyze pairs of contiguous data segments.
"""
from clouddrift import ragged, sphere
from concurrent.futures import as_completed, ThreadPoolExecutor
import itertools
import numpy as np
import pandas as pd
import xarray as xr
from typing import List, Optional, Tuple, Union

array_like = Union[list[float], np.ndarray[float], pd.Series, xr.DataArray]


def chance_pair(
    lon1: array_like,
    lat1: array_like,
    lon2: array_like,
    lat2: array_like,
    time1: Optional[array_like] = None,
    time2: Optional[array_like] = None,
    space_distance: Optional[float] = 0,
    time_distance: Optional[float] = 0,
):
    """Given two sets of longitude, latitude, and time arrays, return the
    pairs of indices of collocated data points that are within prescribed
    distances in space and time. Such pairs are known as chance pairs.

    Parameters
    ----------
    lon1 : array_like
        First array of longitudes in degrees.
    lat1 : array_like
        First array of latitudes in degrees.
    lon2 : array_like
        Second array of longitudes in degrees.
    lat2 : array_like
        Second array of latitudes in degrees.
    time1 : array_like, optional
        First array of times.
    time2 : array_like, optional
        Second array of times.
    space_distance : float, optional
        Maximum allowable space distance in meters for a pair to qualify as a
        chance pair. If the separation is within this distance, the pair is
        considered to be a chance pair. Default is 0, or no distance, i.e. the
        positions must be exactly the same.
    time_distance : float, optional
        Maximum allowable time distance for a pair to qualify as a chance
        pair. If a separation is within this distance, and the space distance
        condition is satisfied, the pair is considered a chance pair. Default
        is 0, or no distance, i.e. the times must be exactly the same.

    Returns
    -------
    indices1 : np.ndarray[int]
        Indices within the first set of arrays that lead to a chance pair.
    indices2 : np.ndarray[int]
        Indices within the second set of arrays that lead to a chance pair.

    Examples
    --------
    In the following example, we load the GLAD dataset, extract the first
    two trajectories, and find between these the array indices that satisfy
    the chance pair criteria of 6 km separation distance and no time
    separation:

    >>> import numpy as np
    >>> from clouddrift import sphere
    >>> from clouddrift.datasets import glad
    >>> from clouddrift.pairs import chance_pair
    >>> from clouddrift.ragged import unpack
    >>> ds = glad()
    >>> lon1 = unpack(ds["longitude"], ds["rowsize"], rows=0).pop()
    >>> lat1 = unpack(ds["latitude"], ds["rowsize"], rows=0).pop()
    >>> time1 = unpack(ds["time"], ds["rowsize"], rows=0).pop()
    >>> lon2 = unpack(ds["longitude"], ds["rowsize"], rows=1).pop()
    >>> lat2 = unpack(ds["latitude"], ds["rowsize"], rows=1).pop()
    >>> time2 = unpack(ds["time"], ds["rowsize"], rows=1).pop()
    >>> i1, i2 = chance_pair(lon1, lat1, lon2, lat2, time1, time2, 6000, np.timedelta64(0))
    >>> i1, i2
    (array([177, 180, 183, 186, 189, 192]), array([166, 169, 172, 175, 178, 181]))

    Check that the collocation in space worked by calculating the distance
    between the identified pairs:

    >>> sphere.distance(lon1[i1], lat1[i1], lon2[i2], lat2[i2])
    array([5967.4844, 5403.253 , 5116.9136, 5185.715 , 5467.8555, 5958.4917],
          dtype=float32)

    Check the collocation in time:

    >>> time1[i1] - time2[i2]
    <xarray.DataArray 'time' (obs: 6)>
    array([0, 0, 0, 0, 0, 0], dtype='timedelta64[ns]')
    Coordinates:
        time     (obs) datetime64[ns] 2012-07-21T21:30:00.524160 ... 2012-07-22T0...
    Dimensions without coordinates: obs

    Raises
    ------
    ValueError
        If ``time1`` and ``time2`` are not both provided or both omitted.
    """
    if (time1 is None and time2 is not None) or (time1 is not None and time2 is None):
        raise ValueError(
            "Both time1 and time2 must be provided or both must be omitted."
        )

    time_present = time1 is not None and time2 is not None

    if time_present:
        # If time is provided, subset the trajectories to the overlapping times.
        overlap1, overlap2 = pair_time_overlap(time1, time2, time_distance)
    else:
        # Otherwise, initialize the overlap indices to the full length of the
        # trajectories.
        overlap1 = np.arange(lon1.size)
        overlap2 = np.arange(lon2.size)

    # The provided space distance is in meters, but here we convert it to
    # degrees for the bounding box overlap check.
    space_distance_degrees = np.degrees(space_distance / sphere.EARTH_RADIUS_METERS)

    # Compute the indices for each trajectory where the two trajectories'
    # bounding boxes overlap.
    bbox_overlap1, bbox_overlap2 = pair_bounding_box_overlap(
        lon1[overlap1],
        lat1[overlap1],
        lon2[overlap2],
        lat2[overlap2],
        space_distance_degrees,
    )

    # bbox_overlap1 and bbox_overlap2 subset the overlap1 and overlap2 indices.
    overlap1 = overlap1[bbox_overlap1]
    overlap2 = overlap2[bbox_overlap2]

    # If time is present, first search for collocation in time.
    if time_present:
        time_separation = pair_time_distance(time1[overlap1], time2[overlap2])
        time_match2, time_match1 = np.where(time_separation <= time_distance)
        overlap1 = overlap1[time_match1]
        overlap2 = overlap2[time_match2]

    # Now search for collocation in space.
    space_separation = pair_space_distance(
        lon1[overlap1], lat1[overlap1], lon2[overlap2], lat2[overlap2]
    )
    space_overlap = space_separation <= space_distance
    if time_present:
        time_separation = pair_time_distance(time1[overlap1], time2[overlap2])
        time_overlap = time_separation <= time_distance
        match2, match1 = np.where(space_overlap & time_overlap)
    else:
        match2, match1 = np.where(space_overlap)

    overlap1 = overlap1[match1]
    overlap2 = overlap2[match2]

    return overlap1, overlap2
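The narrowing strategy above — restrict each series to the overlapping time window first, then test pairwise separations — can be sketched in plain Python for 1-D positions. The helper name and the brute-force inner loop are illustrative, not part of the library API:

```python
# Illustrative sketch of the chance-pair search: clip both series to the
# padded common time window, then keep index pairs whose time and space
# separations are within the thresholds. Positions are 1-D here.
def chance_pairs_1d(x1, t1, x2, t2, space_dist, time_dist):
    start = max(min(t1), min(t2)) - time_dist
    end = min(max(t1), max(t2)) + time_dist
    cand1 = [i for i, t in enumerate(t1) if start <= t <= end]
    cand2 = [j for j, t in enumerate(t2) if start <= t <= end]
    pairs = []
    for i in cand1:
        for j in cand2:
            if abs(t1[i] - t2[j]) <= time_dist and abs(x1[i] - x2[j]) <= space_dist:
                pairs.append((i, j))
    return pairs

print(chance_pairs_1d([0, 1, 2, 3], [0, 1, 2, 3], [2.4, 3.4], [2, 3], 0.5, 0))
# [(2, 0), (3, 1)]
```

The library version replaces the brute-force loop with a vectorized all-pairs distance matrix and a spherical distance.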


def chance_pairs_from_ragged(
    lon: array_like,
    lat: array_like,
    rowsize: array_like,
    space_distance: Optional[float] = 0,
    time: Optional[array_like] = None,
    time_distance: Optional[float] = 0,
) -> List[Tuple[Tuple[int, int], Tuple[np.ndarray, np.ndarray]]]:
    """Return all chance pairs of contiguous trajectories in a ragged array,
    and their collocated points in space and (optionally) time, given input
    ragged arrays of longitude, latitude, and (optionally) time, and chance
    pair criteria as maximum allowable distances in space and time.

    If ``time`` and ``time_distance`` are omitted, the search will be done
    only on the spatial criteria, and the result will not include the time
    arrays.

    If ``time`` and ``time_distance`` are provided, the search will be done
    on both the spatial and temporal criteria, and the result will include the
    time arrays.

    Parameters
    ----------
    lon : array_like
        Array of longitudes in degrees.
    lat : array_like
        Array of latitudes in degrees.
    rowsize : array_like
        Array of rowsizes.
    space_distance : float, optional
        Maximum space distance in meters for a pair to qualify as a chance
        pair. If the separation is within this distance, the pair is
        considered to be a chance pair. Default is 0, or no distance, i.e. the
        positions must be exactly the same.
    time : array_like, optional
        Array of times.
    time_distance : float, optional
        Maximum time distance allowed for a pair to qualify as a chance pair.
        If the separation is within this distance, and the space distance
        condition is satisfied, the pair is considered a chance pair. Default
        is 0, or no distance, i.e. the times must be exactly the same.

    Returns
    -------
    pairs : List[Tuple[Tuple[int, int], Tuple[np.ndarray, np.ndarray]]]
        List of tuples, each tuple containing a Tuple of integer indices that
        corresponds to the trajectory rows in the ragged array, indicating the
        pair of trajectories that satisfy the chance pair criteria, and a Tuple
        of arrays containing the indices of the collocated points for each
        trajectory in the chance pair.

    Examples
    --------
    In the following example, we load the GLAD dataset as a ragged array
    dataset, subset the result to retain the first five trajectories, and
    finally find all trajectories that satisfy the chance pair criteria of
    12 km separation distance and no time separation, as well as the indices
    of the collocated points for each pair.

    >>> import numpy as np
    >>> from clouddrift.datasets import glad
    >>> from clouddrift.pairs import chance_pairs_from_ragged
    >>> from clouddrift.ragged import subset
    >>> ds = subset(glad(), {"id": ["CARTHE_001", "CARTHE_002", "CARTHE_003", "CARTHE_004", "CARTHE_005"]}, id_var_name="id")
    >>> chance_pairs_from_ragged(
    ...     ds["longitude"].values,
    ...     ds["latitude"].values,
    ...     ds["rowsize"].values,
    ...     space_distance=12000,
    ...     time=ds["time"].values,
    ...     time_distance=np.timedelta64(0),
    ... )
    [((0, 1),
      (array([153, 156, 159, 162, 165, 168, 171, 174, 177, 180, 183, 186, 189,
              192, 195, 198, 201, 204, 207, 210, 213, 216]),
       array([142, 145, 148, 151, 154, 157, 160, 163, 166, 169, 172, 175, 178,
              181, 184, 187, 190, 193, 196, 199, 202, 205]))),
     ((3, 4),
      (array([141, 144, 147, 150, 153, 156, 159, 162, 165, 168, 171, 174, 177,
              180, 183]),
       array([136, 139, 142, 145, 148, 151, 154, 157, 160, 163, 166, 169, 172,
              175, 178])))]

    The result above shows that 2 chance pairs were found.

    Raises
    ------
    ValueError
        If ``rowsize`` has fewer than two elements.
    """
    if len(rowsize) < 2:
        raise ValueError("rowsize must have at least two elements.")
    pairs = list(itertools.combinations(np.arange(rowsize.size), 2))
    i = ragged.rowsize_to_index(rowsize)
    results = []
    with ThreadPoolExecutor() as executor:
        if time is None:
            futures = [
                executor.submit(
                    chance_pair,
                    lon[i[j]:i[j + 1]],
                    lat[i[j]:i[j + 1]],
                    lon[i[k]:i[k + 1]],
                    lat[i[k]:i[k + 1]],
                    space_distance=space_distance,
                )
                for j, k in pairs
            ]
        else:
            futures = [
                executor.submit(
                    chance_pair,
                    lon[i[j]:i[j + 1]],
                    lat[i[j]:i[j + 1]],
                    lon[i[k]:i[k + 1]],
                    lat[i[k]:i[k + 1]],
                    time[i[j]:i[j + 1]],
                    time[i[k]:i[k + 1]],
                    space_distance,
                    time_distance,
                )
                for j, k in pairs
            ]
        for future in as_completed(futures):
            res = future.result()
            # chance_pair returns empty arrays if no chance criteria are
            # satisfied. We only want to keep pairs that do satisfy the
            # criteria. chance_pair returns a tuple of arrays that are always
            # the same size, so we only need to check the length of the first
            # array.
            if res[0].size > 0:
                results.append((pairs[futures.index(future)], res))
    return results
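Before the pairwise search, the ragged array is cut into rows via cumulative row sizes, and every pair of rows is enumerated. A plain-Python sketch of that bookkeeping, with an illustrative stand-in for ``ragged.rowsize_to_index``:

```python
# Sketch of how a ragged array is cut into rows before the pairwise search:
# cumulative row sizes give the slice boundaries, and itertools.combinations
# enumerates every pair of rows.
import itertools

def rowsize_to_index(rowsize):
    index = [0]
    for n in rowsize:
        index.append(index[-1] + n)
    return index

rowsize = [2, 3, 4]
data = list(range(9))
i = rowsize_to_index(rowsize)
rows = [data[i[j]:i[j + 1]] for j in range(len(rowsize))]
pairs = list(itertools.combinations(range(len(rowsize)), 2))
print(i)      # [0, 2, 5, 9]
print(rows)   # [[0, 1], [2, 3, 4], [5, 6, 7, 8]]
print(pairs)  # [(0, 1), (0, 2), (1, 2)]
```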


def pair_bounding_box_overlap(
    lon1: array_like,
    lat1: array_like,
    lon2: array_like,
    lat2: array_like,
    distance: Optional[float] = 0,
) -> Tuple[np.ndarray[int], np.ndarray[int]]:
    """Given two arrays of longitudes and latitudes, return the indices of
    the points of each set that fall within the overlap of the two sets'
    bounding boxes.

    Parameters
    ----------
    lon1 : array_like
        First array of longitudes in degrees.
    lat1 : array_like
        First array of latitudes in degrees.
    lon2 : array_like
        Second array of longitudes in degrees.
    lat2 : array_like
        Second array of latitudes in degrees.
    distance : float, optional
        Distance in degrees for the overlap. If the overlap is within this
        distance, the bounding boxes are considered to overlap. Default is 0.

    Returns
    -------
    overlap1 : np.ndarray[int]
        Indices of ``lon1`` and ``lat1`` where their bounding box overlaps
        with that of ``lon2`` and ``lat2``.
    overlap2 : np.ndarray[int]
        Indices of ``lon2`` and ``lat2`` where their bounding box overlaps
        with that of ``lon1`` and ``lat1``.

    Examples
    --------
    >>> lon1 = [0, 0, 1, 1]
    >>> lat1 = [0, 0, 1, 1]
    >>> lon2 = [1, 1, 2, 2]
    >>> lat2 = [1, 1, 2, 2]
    >>> pair_bounding_box_overlap(lon1, lat1, lon2, lat2, 0.5)
    (array([2, 3]), array([0, 1]))
    """
    # First get the bounding box of each trajectory.
    # We unwrap the longitudes before computing min/max because we want to
    # consider trajectories that cross the dateline.
    lon1_min, lon1_max = np.min(np.unwrap(lon1, period=360)), np.max(
        np.unwrap(lon1, period=360)
    )
    lat1_min, lat1_max = np.min(lat1), np.max(lat1)
    lon2_min, lon2_max = np.min(np.unwrap(lon2, period=360)), np.max(
        np.unwrap(lon2, period=360)
    )
    lat2_min, lat2_max = np.min(lat2), np.max(lat2)

    bounding_boxes_overlap = (
        (lon1_min <= lon2_max + distance)
        & (lon1_max >= lon2_min - distance)
        & (lat1_min <= lat2_max + distance)
        & (lat1_max >= lat2_min - distance)
    )

    # Now check which points fall within the overlap of the bounding boxes.
    if bounding_boxes_overlap:
        overlap_start = (
            max(lon1_min, lon2_min) - distance,  # West
            max(lat1_min, lat2_min) - distance,  # South
        )
        overlap_end = (
            min(lon1_max, lon2_max) + distance,  # East
            min(lat1_max, lat2_max) + distance,  # North
        )
        overlap1 = (
            (lon1 >= overlap_start[0])
            & (lon1 <= overlap_end[0])
            & (lat1 >= overlap_start[1])
            & (lat1 <= overlap_end[1])
        )
        overlap2 = (
            (lon2 >= overlap_start[0])
            & (lon2 <= overlap_end[0])
            & (lat2 >= overlap_start[1])
            & (lat2 <= overlap_end[1])
        )
        return np.where(overlap1)[0], np.where(overlap2)[0]
    else:
        return np.array([], dtype=int), np.array([], dtype=int)
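The bounding-box pre-filter can be sketched in plain Python (the longitude unwrapping is omitted here for brevity): intersect the two boxes, pad the intersection by ``distance``, and keep the indices of points inside it. The helper name is illustrative only:

```python
# Sketch of the bounding-box pre-filter: compute the padded intersection of
# the two bounding boxes, then keep the indices of points inside it.
def bbox_overlap(lon1, lat1, lon2, lat2, distance=0):
    west = max(min(lon1), min(lon2)) - distance
    east = min(max(lon1), max(lon2)) + distance
    south = max(min(lat1), min(lat2)) - distance
    north = min(max(lat1), max(lat2)) + distance
    idx1 = [i for i in range(len(lon1))
            if west <= lon1[i] <= east and south <= lat1[i] <= north]
    idx2 = [i for i in range(len(lon2))
            if west <= lon2[i] <= east and south <= lat2[i] <= north]
    return idx1, idx2

print(bbox_overlap([0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 2, 2], [1, 1, 2, 2], 0.5))
# ([2, 3], [0, 1])
```

This reproduces the docstring example above.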


def pair_space_distance(
    lon1: array_like,
    lat1: array_like,
    lon2: array_like,
    lat2: array_like,
) -> np.ndarray[float]:
    """Given two arrays of longitudes and latitudes, return the distance
    on a sphere between all pairs of points.

    Parameters
    ----------
    lon1 : array_like
        First array of longitudes in degrees.
    lat1 : array_like
        First array of latitudes in degrees.
    lon2 : array_like
        Second array of longitudes in degrees.
    lat2 : array_like
        Second array of latitudes in degrees.

    Returns
    -------
    distance : np.ndarray[float]
        Array of distances between all pairs of points.

    Examples
    --------
    >>> lon1 = [0, 0, 1, 1]
    >>> lat1 = [0, 0, 1, 1]
    >>> lon2 = [1, 1, 2, 2]
    >>> lat2 = [1, 1, 2, 2]
    >>> pair_space_distance(lon1, lat1, lon2, lat2)
    array([[157424.62387233, 157424.62387233,      0.        ,
                 0.        ],
           [157424.62387233, 157424.62387233,      0.        ,
                 0.        ],
           [314825.26360286, 314825.26360286, 157400.64794884,
            157400.64794884],
           [314825.26360286, 314825.26360286, 157400.64794884,
            157400.64794884]])
    """
    # Create longitude and latitude matrices from arrays to compute distance
    lon1_2d, lon2_2d = np.meshgrid(lon1, lon2, copy=False)
    lat1_2d, lat2_2d = np.meshgrid(lat1, lat2, copy=False)

    # Compute distance between all pairs of points
    distance = sphere.distance(lon1_2d, lat1_2d, lon2_2d, lat2_2d)

    return distance
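The all-pairs distance matrix can be sketched with a haversine formula in plain Python. An Earth radius of 6.371e6 m is assumed here; ``sphere.distance`` may use a slightly different formula and constant, so the values are approximate:

```python
import math

# Sketch of the all-pairs distance matrix: a haversine distance between
# every point of one set and every point of the other (assumed spherical
# Earth of radius 6.371e6 m; illustrative only).
def haversine(lon1, lat1, lon2, lat2, radius=6.371e6):
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))

lon1, lat1 = [0, 1], [0, 1]
lon2, lat2 = [1, 2], [1, 2]
# Row k holds distances from point k of the second set to every point of the first.
matrix = [[haversine(a, b, c, d) for a, b in zip(lon1, lat1)]
          for c, d in zip(lon2, lat2)]
```

Here ``matrix[0][1]`` is exactly 0 (same point), and ``matrix[0][0]`` is roughly 157 km, of the same order as the docstring example values.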


def pair_time_distance(
    time1: array_like,
    time2: array_like,
) -> np.ndarray[float]:
    """Given two arrays of times (or any other monotonically increasing
    quantity), return the temporal distance between all pairs of times.

    Parameters
    ----------
    time1 : array_like
        First array of times.
    time2 : array_like
        Second array of times.

    Returns
    -------
    distance : np.ndarray[float]
        Array of distances between all pairs of times.

    Examples
    --------
    >>> time1 = np.arange(4)
    >>> time2 = np.arange(2, 6)
    >>> pair_time_distance(time1, time2)
    array([[2, 1, 0, 1],
           [3, 2, 1, 0],
           [4, 3, 2, 1],
           [5, 4, 3, 2]])
    """
    # Create time matrices from arrays to compute distance
    time1_2d, time2_2d = np.meshgrid(time1, time2, copy=False)

    # Compute distance between all pairs of times
    distance = np.abs(time1_2d - time2_2d)

    return distance
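The meshgrid-and-subtract step above is equivalent to a nested comprehension in plain Python, reproducing the docstring example:

```python
# Sketch of the all-pairs time-distance matrix: row j holds
# |time1[i] - time2[j]| for every i.
time1 = list(range(4))     # [0, 1, 2, 3]
time2 = list(range(2, 6))  # [2, 3, 4, 5]
matrix = [[abs(a - b) for a in time1] for b in time2]
print(matrix)
# [[2, 1, 0, 1], [3, 2, 1, 0], [4, 3, 2, 1], [5, 4, 3, 2]]
```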


def pair_time_overlap(
    time1: array_like,
    time2: array_like,
    distance: Optional[float] = 0,
) -> Tuple[np.ndarray[int], np.ndarray[int]]:
    """Given two arrays of times (or any other monotonically increasing
    quantity), return indices where the times are within a prescribed
    distance.

    Although higher-level array containers like xarray and pandas are
    supported for input arrays, this function is an order of magnitude faster
    when passing in NumPy arrays.

    Parameters
    ----------
    time1 : array_like
        First array of times.
    time2 : array_like
        Second array of times.
    distance : float, optional
        Maximum distance within which the values of ``time1`` and ``time2``
        are considered to overlap. Default is 0, i.e. the values must be
        exactly the same.

    Returns
    -------
    overlap1 : np.ndarray[int]
        Indices of ``time1`` where its time overlaps with ``time2``.
    overlap2 : np.ndarray[int]
        Indices of ``time2`` where its time overlaps with ``time1``.

    Examples
    --------
    >>> time1 = np.arange(4)
    >>> time2 = np.arange(2, 6)
    >>> pair_time_overlap(time1, time2)
    (array([2, 3]), array([0, 1]))

    >>> pair_time_overlap(time1, time2, 1)
    (array([1, 2, 3]), array([0, 1, 2]))
    """
    time1_min, time1_max = np.min(time1), np.max(time1)
    time2_min, time2_max = np.min(time2), np.max(time2)
    overlap_start = max(time1_min, time2_min) - distance
    overlap_end = min(time1_max, time2_max) + distance
    overlap1 = np.where((time1 >= overlap_start) & (time1 <= overlap_end))[0]
    overlap2 = np.where((time2 >= overlap_start) & (time2 <= overlap_end))[0]
    return overlap1, overlap2
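The window logic is simple enough to restate in plain Python, matching both docstring examples. The helper name is illustrative, not the library API:

```python
# Sketch of the overlap-window logic: pad the common time window by
# `distance` and keep the indices falling inside it.
def time_overlap(time1, time2, distance=0):
    start = max(min(time1), min(time2)) - distance
    end = min(max(time1), max(time2)) + distance
    idx1 = [i for i, t in enumerate(time1) if start <= t <= end]
    idx2 = [i for i, t in enumerate(time2) if start <= t <= end]
    return idx1, idx2

print(time_overlap(list(range(4)), list(range(2, 6))))     # ([2, 3], [0, 1])
print(time_overlap(list(range(4)), list(range(2, 6)), 1))  # ([1, 2, 3], [0, 1, 2])
```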
"""
This module provides a function to easily and efficiently plot trajectories stored in a ragged array.
"""

from clouddrift.ragged import segment, rowsize_to_index
import numpy as np
import pandas as pd
from typing import Optional, Union
import xarray as xr


def plot_ragged(
    ax,
    longitude: Union[list, np.ndarray, pd.Series, xr.DataArray],
    latitude: Union[list, np.ndarray, pd.Series, xr.DataArray],
    rowsize: Union[list, np.ndarray, pd.Series, xr.DataArray],
    *args,
    colors: Optional[Union[list, np.ndarray, pd.Series, xr.DataArray]] = None,
    tolerance: Optional[Union[float, int]] = 180,
    **kwargs,
):
    """Plot trajectories from a ragged array dataset on a Matplotlib Axes
    or a Cartopy GeoAxes object ``ax``.

    This function wraps Matplotlib's ``plot`` function (``plt.plot``) and
    ``LineCollection`` (``matplotlib.collections``) to efficiently plot
    trajectories from a ragged array dataset.

    Parameters
    ----------
    ax : matplotlib.axes.Axes or cartopy.mpl.geoaxes.GeoAxes
        Axis to plot on.
    longitude : array-like
        Longitude sequence. One-dimensional array input.
    latitude : array-like
        Latitude sequence. One-dimensional array input.
    rowsize : array-like
        List of integers specifying the number of data points in each row.
    *args : tuple
        Additional arguments to pass to ``ax.plot``.
    colors : array-like
        Colors to use for plotting. If ``colors`` is the same shape as
        ``longitude`` and ``latitude``, the trajectories are split into
        segments and each segment is colored according to the corresponding
        color value. If ``colors`` is the same shape as ``rowsize``, each
        trajectory is uniformly colored according to the corresponding color
        value.
    tolerance : float
        Longitude tolerance gap between data points (in degrees) for
        segmenting trajectories. For periodic domains, the tolerance parameter
        should be set to the maximum allowed gap between data points. Defaults
        to 180.
    **kwargs : dict
        Additional keyword arguments to pass to ``ax.plot``.

    Returns
    -------
    list of matplotlib.lines.Line2D or matplotlib.collections.LineCollection
        The plotted lines or line collection. Can be used to set a colorbar
        after plotting or extract information from the lines.

    Examples
    --------

    Plot the first 100 trajectories from the gdp1h dataset, assigning
    a different color to each trajectory:

    >>> import numpy as np
    >>> import matplotlib.pyplot as plt
    >>> from clouddrift import datasets
    >>> from clouddrift.ragged import subset
    >>> ds = datasets.gdp1h()
    >>> ds = subset(ds, {"ID": ds.ID[:100].values}).load()
    >>> fig = plt.figure()
    >>> ax = fig.add_subplot(1, 1, 1)

    >>> plot_ragged(
    ...     ax,
    ...     ds.lon,
    ...     ds.lat,
    ...     ds.rowsize,
    ...     colors=np.arange(len(ds.rowsize))
    ... )

    To plot the same trajectories, but assigning a different color to each
    observation and specifying a colormap:

    >>> fig = plt.figure()
    >>> ax = fig.add_subplot(1, 1, 1)
    >>> time = [v.astype(np.int64) / 86400 / 1e9 for v in ds.time.values]
    >>> lc = plot_ragged(
    ...     ax,
    ...     ds.lon,
    ...     ds.lat,
    ...     ds.rowsize,
    ...     colors=np.floor(time),
    ...     cmap="inferno"
    ... )
    >>> fig.colorbar(lc[0])
    >>> ax.set_xlim([-180, 180])
    >>> ax.set_ylim([-90, 90])

    Finally, to plot the same trajectories, but using a Cartopy projection:

    >>> import cartopy.crs as ccrs
    >>> import cmocean
    >>> fig = plt.figure()
    >>> ax = fig.add_subplot(1, 1, 1, projection=ccrs.Mollweide())
    >>> time = [v.astype(np.int64) / 86400 / 1e9 for v in ds.time.values]
    >>> lc = plot_ragged(
    ...     ax,
    ...     ds.lon,
    ...     ds.lat,
    ...     ds.rowsize,
    ...     colors=np.arange(len(ds.rowsize)),
    ...     transform=ccrs.PlateCarree(),
    ...     cmap=cmocean.cm.ice,
    ... )

    Raises
    ------
    ValueError
        If longitude and latitude arrays do not have the same shape.
        If colors do not have the same shape as longitude and latitude arrays or rowsize.
        If ax is not a matplotlib Axes or GeoAxes object.
        If ax is a GeoAxes object and the transform keyword argument is not provided.

    ImportError
        If matplotlib is not installed.
        If the axis is a GeoAxes object and cartopy is not installed.
    """

    # matplotlib is an optional dependency
    try:
        import matplotlib.pyplot as plt
        import matplotlib.colors as mcolors
        from matplotlib.collections import LineCollection
        from matplotlib import cm
    except ImportError:
        raise ImportError("missing optional dependency 'matplotlib'")

    if hasattr(ax, "coastlines"):  # duck-type check for GeoAxes without importing cartopy
        try:
            from cartopy.mpl.geoaxes import GeoAxes

            if isinstance(ax, GeoAxes) and not kwargs.get("transform"):
                raise ValueError(
                    "For GeoAxes, the transform keyword argument must be provided."
                )
        except ImportError:
            raise ImportError("missing optional dependency 'cartopy'")
    elif not isinstance(ax, plt.Axes):
        raise ValueError("ax must be either: plt.Axes or GeoAxes.")

    if np.sum(rowsize) != len(longitude):
        raise ValueError("The sum of rowsize must equal the length of lon and lat.")

    if len(longitude) != len(latitude):
        raise ValueError("lon and lat must have the same length.")

    if colors is None:
        colors = np.arange(len(rowsize))
    elif len(colors) not in [len(longitude), len(rowsize)]:
        raise ValueError("shape colors must match the shape of lon/lat or rowsize.")

    # define a colormap
    cmap = kwargs.pop("cmap", cm.viridis)

    # define a normalization to obtain uniform colors
    # for the sequence of lines or LineCollection
    norm = kwargs.pop(
        "norm", mcolors.Normalize(vmin=np.nanmin(colors), vmax=np.nanmax(colors))
    )

    mpl_plot = colors is None or len(colors) == len(rowsize)
    traj_idx = rowsize_to_index(rowsize)

    lines = []
    for i in range(len(rowsize)):
        lon_i, lat_i = (
            longitude[traj_idx[i]:traj_idx[i + 1]],
            latitude[traj_idx[i]:traj_idx[i + 1]],
        )

        start = 0
        for length in segment(lon_i, tolerance, rowsize=segment(lon_i, -tolerance)):
            end = start + length

            if mpl_plot:
                line = ax.plot(
                    lon_i[start:end],
                    lat_i[start:end],
                    *args,
                    c=cmap(norm(colors[i])) if colors is not None else None,
                    **kwargs,
                )
            else:
                colors_i = colors[traj_idx[i]:traj_idx[i + 1]]
                segments = np.column_stack(
                    [
                        lon_i[start:end - 1],
                        lat_i[start:end - 1],
                        lon_i[start + 1:end],
                        lat_i[start + 1:end],
                    ]
                ).reshape(-1, 2, 2)
                line = LineCollection(segments, *args, cmap=cmap, norm=norm, **kwargs)
                line.set_array(
                    # the color of a segment is the average of its two data points
                    np.convolve(colors_i[start:end], [0.5, 0.5], mode="valid")
                )
                ax.add_collection(line)

            start = end
            lines.append(line)

    # set axis limits
    ax.set_xlim([np.min(longitude), np.max(longitude)])
    ax.set_ylim([np.min(latitude), np.max(latitude)])

    return lines
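The per-observation coloring path above pairs consecutive points into line segments and colors each segment by the average of its two endpoint values. A plain-Python sketch of that pairing, without Matplotlib:

```python
# Sketch of how consecutive points are paired into line segments for a
# LineCollection: each segment is ((x[k], y[k]), (x[k+1], y[k+1])), and its
# color is the average of the two endpoint values.
lon = [0.0, 1.0, 2.0]
lat = [0.0, 1.0, 0.0]
values = [10.0, 20.0, 30.0]
segments = [((lon[k], lat[k]), (lon[k + 1], lat[k + 1])) for k in range(len(lon) - 1)]
seg_colors = [(values[k] + values[k + 1]) / 2 for k in range(len(values) - 1)]
print(segments)    # [((0.0, 0.0), (1.0, 1.0)), ((1.0, 1.0), (2.0, 0.0))]
print(seg_colors)  # [15.0, 25.0]
```

In the library code, the same pairing is done in vectorized form with ``np.column_stack(...).reshape(-1, 2, 2)`` and ``np.convolve(..., [0.5, 0.5], mode="valid")``.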
"""
Transformational and inquiry functions for ragged arrays.
"""

import numpy as np
from typing import Tuple, Union, Iterable, Callable
import xarray as xr
import pandas as pd
from concurrent import futures
from datetime import timedelta
import warnings


+[docs]
+defapply_ragged(
+ func:callable,
+ arrays:Union[list[Union[np.ndarray,xr.DataArray]],np.ndarray,xr.DataArray],
+ rowsize:Union[list[int],np.ndarray[int],xr.DataArray],
+ *args:tuple,
+ rows:Union[int,Iterable[int]]=None,
+ axis:int=0,
+ executor:futures.Executor=futures.ThreadPoolExecutor(max_workers=None),
+ **kwargs:dict,
+)->Union[tuple[np.ndarray],np.ndarray]:
+"""Apply a function to a ragged array.
+
+ The function ``func`` will be applied to each contiguous row of ``arrays`` as
+ indicated by row sizes ``rowsize``. The output of ``func`` will be
+ concatenated into a single ragged array.
+
+ You can pass ``arrays`` as NumPy arrays or xarray DataArrays, however,
+ the result will always be a NumPy array. Passing ``rows`` as an integer or
+ a sequence of integers will make ``apply_ragged`` process and return only
+ those specific rows, and otherwise, all rows in the input ragged array will
+ be processed. Further, you can use the ``axis`` parameter to specify the
+ ragged axis of the input array(s) (default is 0).
+
+ By default this function uses ``concurrent.futures.ThreadPoolExecutor`` to
+ run ``func`` in multiple threads. The number of threads can be controlled by
+ passing the ``max_workers`` argument to the executor instance passed to
+ ``apply_ragged``. Alternatively, you can pass the ``concurrent.futures.ProcessPoolExecutor``
+ instance to use processes instead. Passing alternative (3rd party library)
+ concurrent executors may work if they follow the same executor interface as
+ that of ``concurrent.futures``, however this has not been tested yet.
+
+ Parameters
+ ----------
+ func : callable
+ Function to apply to each row of each ragged array in ``arrays``.
+ arrays : list[np.ndarray] or np.ndarray or xr.DataArray
+ An array or a list of arrays to apply ``func`` to.
+ rowsize : list[int] or np.ndarray[int] or xr.DataArray[int]
+ List of integers specifying the number of data points in each row.
+ *args : tuple
+ Additional arguments to pass to ``func``.
+ rows : int or Iterable[int], optional
+ The row(s) of the ragged array to apply ``func`` to. If ``rows`` is
+ ``None`` (default), then ``func`` will be applied to all rows.
+ axis : int, optional
+ The ragged axis of the input arrays. Default is 0.
+ executor : concurrent.futures.Executor, optional
+ Executor to use for concurrent execution. Default is ``ThreadPoolExecutor``
+ with the default number of ``max_workers``.
+ Another supported option is ``ProcessPoolExecutor``.
+ **kwargs : dict
+ Additional keyword arguments to pass to ``func``.
+
+ Returns
+ -------
+ out : tuple[np.ndarray] or np.ndarray
+ Output array(s) from ``func``.
+
+ Examples
+ --------
+
+ Using ``velocity_from_position`` with ``apply_ragged``, calculate the velocities of
+ multiple particles, the coordinates of which are found in the ragged arrays x, y, and t
+ that share row sizes 2, 3, and 4:
+
+ >>> rowsize = [2, 3, 4]
+ >>> x = np.array([1, 2, 10, 12, 14, 30, 33, 36, 39])
+ >>> y = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8])
+ >>> t = np.array([1, 2, 1, 2, 3, 1, 2, 3, 4])
+ >>> u1, v1 = apply_ragged(velocity_from_position, [x, y, t], rowsize, coord_system="cartesian")
+ (array([1., 1., 2., 2., 2., 3., 3., 3., 3.]),
+  array([1., 1., 1., 1., 1., 1., 1., 1., 1.]))
+
+ To apply ``func`` to only a subset of rows, use the ``rows`` argument:
+
+ >>> u1, v1 = apply_ragged(velocity_from_position, [x, y, t], rowsize, rows=0, coord_system="cartesian")
+ (array([1., 1.]),
+  array([1., 1.]))
+ >>> u1, v1 = apply_ragged(velocity_from_position, [x, y, t], rowsize, rows=[0, 1], coord_system="cartesian")
+ (array([1., 1., 2., 2., 2.]),
+  array([1., 1., 1., 1., 1.]))
+
+ Raises
+ ------
+ ValueError
+ If the sum of ``rowsize`` does not equal the length of ``arrays``.
+ IndexError
+ If ``arrays`` is empty.
+ """
+    # make sure arrays is iterable
+    if type(arrays) not in [list, tuple]:
+        arrays = [arrays]
+    # validate rowsize
+    for arr in arrays:
+        if not np.sum(rowsize) == arr.shape[axis]:
+            raise ValueError("The sum of rowsize must equal the length of arr.")
+
+    # split the array(s) into trajectories
+    arrays = [unpack(np.array(arr), rowsize, rows, axis) for arr in arrays]
+    iter = [[arrays[i][j] for i in range(len(arrays))] for j in range(len(arrays[0]))]
+
+    # parallel execution
+    res = [executor.submit(func, *x, *args, **kwargs) for x in iter]
+    res = [r.result() for r in res]
+
+    # Concatenate the outputs.
+
+    # The following wraps items in a list if they are not already iterable.
+    res = [item if isinstance(item, Iterable) else [item] for item in res]
+
+    # np.concatenate can concatenate along non-zero axis iff the length of
+    # arrays to be concatenated is > 1. If the length is 1, for example in the
+    # case of func that reduces over the non-ragged axis, we can only
+    # concatenate along axis 0.
+    if isinstance(res[0], tuple):  # more than 1 output variable
+        outputs = []
+        for i in range(len(res[0])):  # iterate over each result variable
+            # If we have multiple outputs and func is a reduction function,
+            # we now have a list of scalars. We need to wrap them in a
+            # list to concatenate them.
+            result = [r[i] if isinstance(r[i], Iterable) else [r[i]] for r in res]
+            if len(result[0]) > 1:
+                # Arrays to concatenate are longer than 1 element, so we can
+                # concatenate along the non-zero axis.
+                outputs.append(np.concatenate(result, axis=axis))
+            else:
+                # Arrays to concatenate are 1 element long, so we can only
+                # concatenate along axis 0.
+                outputs.append(np.concatenate(result))
+        return tuple(outputs)
+    else:
+        if len(res[0]) > 1:
+            # Arrays to concatenate are longer than 1 element, so we can
+            # concatenate along the non-zero axis.
+            return np.concatenate(res, axis=axis)
+        else:
+            # Arrays to concatenate are 1 element long, so we can only
+            # concatenate along axis 0.
+            return np.concatenate(res)
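The unpack-submit-collect pattern above can be sketched with plain NumPy and the standard library. This toy example (made-up data, ``np.mean`` standing in for a real per-row function) mirrors what ``apply_ragged`` does internally:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

# Toy ragged data: three rows of sizes 2, 3, and 4 stored contiguously.
x = np.array([1, 2, 10, 12, 14, 30, 33, 36, 39], dtype=float)
rowsize = [2, 3, 4]

# Split the flat array into rows at the cumulative row boundaries.
indices = np.cumsum(np.insert(np.array(rowsize), 0, 0))
rows = np.split(x, indices[1:-1])

# Submit one task per row to the executor, then gather the results.
with ThreadPoolExecutor() as executor:
    futures = [executor.submit(np.mean, row) for row in rows]
    means = [f.result() for f in futures]
# means: one per-row mean for each of the three rows
```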
+
+
+
+
+def chunk(
+    x: Union[list, np.ndarray, xr.DataArray, pd.Series],
+    length: int,
+    overlap: int = 0,
+    align: str = "start",
+) -> np.ndarray:
+"""Divide an array ``x`` into equal chunks of length ``length``. The result
+ is a 2-dimensional NumPy array of shape ``(num_chunks, length)``. The resulting
+ number of chunks is determined based on the length of ``x``, ``length``,
+ and ``overlap``.
+
+ ``chunk`` can be combined with :func:`apply_ragged` to chunk a ragged array.
+
+ Parameters
+ ----------
+ x : list or array-like
+ Array to divide into chunks.
+ length : int
+ The length of each chunk.
+ overlap : int, optional
+ The number of overlapping array elements across chunks. The default is 0.
+ Must be smaller than ``length``. For example, if ``length`` is 4 and
+ ``overlap`` is 2, the chunks of ``[0, 1, 2, 3, 4, 5]`` will be
+ ``np.array([[0, 1, 2, 3], [2, 3, 4, 5]])``. Negative overlap can be used
+ to offset chunks by some number of elements. For example, if ``length``
+ is 2 and ``overlap`` is -1, the chunks of ``[0, 1, 2, 3, 4, 5]`` will
+ be ``np.array([[0, 1], [3, 4]])``.
+ align : str, optional ["start", "middle", "end"]
+ If the remainder of the length of ``x`` divided by the chunk ``length`` is a number
+ N different from zero, this parameter controls which part of the array will be kept
+ in the chunks. If ``align="start"``, the elements at the beginning of the array
+ will be part of the chunks and N points are discarded at the end. If ``align="middle"``,
+ floor(N/2) and ceil(N/2) elements will be discarded from the beginning and the end
+ of the array, respectively. If ``align="end"``, the elements at the end of the array
+ will be kept, and the N first elements are discarded. The default is "start".
+
+ Returns
+ -------
+ np.ndarray
+ 2-dimensional array of shape ``(num_chunks, length)``.
+
+ Examples
+ --------
+
+ Chunk a simple list; this discards the end elements that exceed the last chunk:
+
+ >>> chunk([1, 2, 3, 4, 5], 2)
+ array([[1, 2],
+ [3, 4]])
+
+ To discard the starting elements of the array instead, use ``align="end"``:
+
+ >>> chunk([1, 2, 3, 4, 5], 2, align="end")
+ array([[2, 3],
+ [4, 5]])
+
+ To center the chunks by discarding both ends of the array, use ``align="middle"``:
+
+ >>> chunk([1, 2, 3, 4, 5, 6, 7, 8], 3, align="middle")
+ array([[2, 3, 4],
+ [5, 6, 7]])
+
+ Specify ``overlap`` to get overlapping chunks:
+
+ >>> chunk([1, 2, 3, 4, 5], 2, overlap=1)
+ array([[1, 2],
+ [2, 3],
+ [3, 4],
+ [4, 5]])
+
+ Use ``apply_ragged`` to chunk a ragged array by providing the row sizes;
+ notice that you must pass the array to chunk as an array-like, not a list:
+
+ >>> x = np.array([1, 2, 3, 4, 5])
+ >>> rowsize = [2, 1, 2]
+ >>> apply_ragged(chunk, x, rowsize, 2)
+ array([[1, 2],
+ [4, 5]])
+
+ Raises
+ ------
+ ValueError
+ If ``length < 0``.
+ ValueError
+ If ``align not in ["start", "middle", "end"]``.
+ ZeroDivisionError
+ if ``length == 0``.
+ """
+    num_chunks = (len(x) - length) // (length - overlap) + 1 if len(x) >= length else 0
+    remainder = len(x) - num_chunks * length + (num_chunks - 1) * overlap
+    res = np.empty((num_chunks, length), dtype=np.array(x).dtype)
+
+    if align == "start":
+        start = 0
+    elif align == "middle":
+        start = remainder // 2
+    elif align == "end":
+        start = remainder
+    else:
+        raise ValueError("align must be one of 'start', 'middle', or 'end'.")
+
+    for n in range(num_chunks):
+        end = start + length
+        res[n] = x[start:end]
+        start = end - overlap
+
+    return res
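The number of chunks computed above follows directly from the array length, the chunk length, and the overlap. A minimal standalone sketch of that arithmetic (the helper name ``num_chunks`` is hypothetical):

```python
def num_chunks(n: int, length: int, overlap: int = 0) -> int:
    """Number of chunks of `length` with `overlap` that fit in `n` elements."""
    # Each chunk after the first advances the window by (length - overlap).
    return (n - length) // (length - overlap) + 1 if n >= length else 0
```

For example, five elements yield two non-overlapping chunks of length 2, but four chunks when consecutive chunks overlap by one element.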
+
+
+
+
+def prune(
+    ragged: Union[list, np.ndarray, pd.Series, xr.DataArray],
+    rowsize: Union[list, np.ndarray, pd.Series, xr.DataArray],
+    min_rowsize: float,
+) -> Tuple[np.ndarray, np.ndarray]:
+"""Within a ragged array, removes arrays less than a specified row size.
+
+ Parameters
+ ----------
+ ragged : np.ndarray or pd.Series or xr.DataArray
+ A ragged array.
+ rowsize : list or np.ndarray[int] or pd.Series or xr.DataArray[int]
+ The size of each row in the input ragged array.
+ min_rowsize : float
+ The minimum row size that will be kept.
+
+ Returns
+ -------
+ tuple[np.ndarray, np.ndarray]
+ A tuple of ragged array and size of each row.
+
+ Examples
+ --------
+ >>> prune(np.array([1, 2, 3, 0, -1, -2]), np.array([3, 1, 2]), 2)
+ (array([1, 2, 3, -1, -2]), array([3, 2]))
+
+ Raises
+ ------
+ ValueError
+ If the sum of ``rowsize`` does not equal the length of ``ragged``.
+ IndexError
+ If ``ragged`` is empty.
+
+ See Also
+ --------
+ :func:`segment`, :func:`chunk`
+ """
+
+    ragged = apply_ragged(
+        lambda x, min_len: x if len(x) >= min_len else np.empty(0, dtype=x.dtype),
+        np.array(ragged),
+        rowsize,
+        min_len=min_rowsize,
+    )
+    rowsize = apply_ragged(
+        lambda x, min_len: x if x >= min_len else np.empty(0, dtype=x.dtype),
+        np.array(rowsize),
+        np.ones_like(rowsize),
+        min_len=min_rowsize,
+    )
+
+    return ragged, rowsize
+
+
+
+
+def ragged_to_regular(
+    ragged: Union[np.ndarray, pd.Series, xr.DataArray],
+    rowsize: Union[list, np.ndarray, pd.Series, xr.DataArray],
+    fill_value: float = np.nan,
+) -> np.ndarray:
+"""Convert a ragged array to a two-dimensional array such that each contiguous segment
+ of a ragged array is a row in the two-dimensional array. Each row of the two-dimensional
+ array is padded with NaNs as needed. The length of the first dimension of the output
+ array is the length of ``rowsize``. The length of the second dimension is the maximum
+ element of ``rowsize``.
+
+ Note: Although this function accepts parameters of type ``xarray.DataArray``,
+ passing NumPy arrays is recommended for performance reasons.
+
+ Parameters
+ ----------
+ ragged : np.ndarray or pd.Series or xr.DataArray
+ A ragged array.
+ rowsize : list or np.ndarray[int] or pd.Series or xr.DataArray[int]
+ The size of each row in the ragged array.
+ fill_value : float, optional
+ Fill value to use for the trailing elements of each row of the resulting
+ regular array.
+
+ Returns
+ -------
+ np.ndarray
+ A two-dimensional array.
+
+ Examples
+ --------
+ By default, the fill value used is NaN:
+
+ >>> ragged_to_regular(np.array([1, 2, 3, 4, 5]), np.array([2, 1, 2]))
+ array([[ 1., 2.],
+ [ 3., nan],
+ [ 4., 5.]])
+
+ You can specify an alternative fill value:
+
+ >>> ragged_to_regular(np.array([1, 2, 3, 4, 5]), np.array([2, 1, 2]), fill_value=-999)
+ array([[ 1., 2.],
+ [ 3., -999.],
+ [ 4., 5.]])
+
+ See Also
+ --------
+ :func:`regular_to_ragged`
+ """
+    res = fill_value * np.ones((len(rowsize), int(max(rowsize))), dtype=ragged.dtype)
+    unpacked = unpack(ragged, rowsize)
+    for n in range(len(rowsize)):
+        res[n, : int(rowsize[n])] = unpacked[n]
+    return res
+
+
+
+
+def regular_to_ragged(
+    array: np.ndarray, fill_value: float = np.nan
+) -> tuple[np.ndarray, np.ndarray]:
+"""Convert a two-dimensional array to a ragged array. Fill values in the input array are
+ excluded from the output ragged array.
+
+ Parameters
+ ----------
+ array : np.ndarray
+ A two-dimensional array.
+ fill_value : float, optional
+ Fill value used to determine the bounds of contiguous segments.
+
+ Returns
+ -------
+ tuple[np.ndarray, np.ndarray]
+ A tuple of the ragged array and the size of each row.
+
+ Examples
+ --------
+ By default, NaN values found in the input regular array are excluded from
+ the output ragged array:
+
+ >>> regular_to_ragged(np.array([[1, 2], [3, np.nan], [4, 5]]))
+ (array([1., 2., 3., 4., 5.]), array([2, 1, 2]))
+
+ Alternatively, a different fill value can be specified:
+
+ >>> regular_to_ragged(np.array([[1, 2], [3, -999], [4, 5]]), fill_value=-999)
+ (array([1., 2., 3., 4., 5.]), array([2, 1, 2]))
+
+ See Also
+ --------
+ :func:`ragged_to_regular`
+ """
+    if np.isnan(fill_value):
+        valid = ~np.isnan(array)
+    else:
+        valid = array != fill_value
+    return array[valid], np.sum(valid, axis=1)
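The masking logic of ``regular_to_ragged`` can be reproduced step by step with plain NumPy; this sketch traces the docstring example:

```python
import numpy as np

array = np.array([[1., 2.], [3., np.nan], [4., 5.]])
valid = ~np.isnan(array)         # True where a real value is stored
ragged = array[valid]            # boolean indexing flattens in row-major order
rowsize = np.sum(valid, axis=1)  # count of valid values per row
```

Because boolean indexing flattens row by row, the row order of the regular array is preserved in the output ragged array.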
+
+
+
+
+def rowsize_to_index(rowsize: Union[list, np.ndarray, xr.DataArray]) -> np.ndarray:
+"""Convert a list of row sizes to a list of indices.
+
+ This function is typically used to obtain the indices of data rows organized
+ in a ragged array.
+
+ Parameters
+ ----------
+ rowsize : list or np.ndarray or xr.DataArray
+ A list of row sizes.
+
+ Returns
+ -------
+ np.ndarray
+ A list of indices.
+
+ Examples
+ --------
+ To obtain the indices within a ragged array of three consecutive rows of sizes 100, 202, and 53:
+
+ >>> rowsize_to_index([100, 202, 53])
+ array([0, 100, 302, 355])
+ """
+    return np.cumsum(np.insert(np.array(rowsize), 0, 0))
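A plain-NumPy sketch of the same cumulative-sum logic, showing how consecutive index pairs delimit the rows of a flat ragged array (toy values):

```python
import numpy as np

rowsize = [2, 1, 2]
# Prepend 0, then take the cumulative sum of the row sizes.
indices = np.cumsum(np.insert(np.array(rowsize), 0, 0))

# Each pair (indices[i], indices[i + 1]) bounds row i of the flat array.
flat = np.array([10, 11, 20, 30, 31])
rows = [flat[indices[i]:indices[i + 1]] for i in range(len(rowsize))]
```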
+
+
+
+
+def segment(
+    x: np.ndarray,
+    tolerance: Union[float, np.timedelta64, timedelta, pd.Timedelta],
+    rowsize: np.ndarray[int] = None,
+) -> np.ndarray[int]:
+"""Divide an array into segments based on a tolerance value.
+
+ Parameters
+ ----------
+ x : list, np.ndarray, or xr.DataArray
+ An array to divide into segments.
+ tolerance : float, np.timedelta64, timedelta, pd.Timedelta
+ The maximum signed difference between consecutive points in a segment.
+ The array x will be segmented wherever differences exceed the tolerance.
+ rowsize : np.ndarray[int], optional
+ The size of rows if x is originally a ragged array. If present, x will be
+ divided both by gaps that exceed the tolerance, and by the original rows
+ of the ragged array.
+
+ Returns
+ -------
+ np.ndarray[int]
+ An array of row sizes that divides the input array into segments.
+
+ Examples
+ --------
+ The simplest use of ``segment`` is to provide a tolerance value that is
+ used to divide an array into segments:
+
+ >>> x = [0, 1, 1, 1, 2, 2, 3, 3, 3, 3, 4]
+ >>> segment(x, 0.5)
+ array([1, 3, 2, 4, 1])
+
+ If the array is already previously segmented (e.g. multiple rows in
+ a ragged array), then the ``rowsize`` argument can be used to preserve
+ the original segments:
+
+ >>> x = [0, 1, 1, 1, 2, 2, 3, 3, 3, 3, 4]
+ >>> rowsize = [3, 2, 6]
+ >>> segment(x, 0.5, rowsize)
+ array([1, 2, 1, 1, 1, 4, 1])
+
+ The tolerance can also be negative. In this case, the input array is
+ segmented where the negative difference exceeds the negative
+ value of the tolerance, i.e. where ``x[n+1] - x[n] < -tolerance``:
+
+ >>> x = [0, 1, 2, 0, 1, 2]
+ >>> segment(x, -0.5)
+ array([3, 3])
+
+ To segment an array for both positive and negative gaps, invoke the function
+ twice, once with a positive tolerance and once with a negative tolerance.
+ The result of the first invocation can be passed as the ``rowsize`` argument
+ to the second ``segment`` invocation:
+
+ >>> x = [1, 1, 2, 2, 1, 1, 2, 2]
+ >>> segment(x, 0.5, rowsize=segment(x, -0.5))
+ array([2, 2, 2, 2])
+
+ If the input array contains time objects, the tolerance must be a time interval:
+
+ >>> x = np.array([np.datetime64("2023-01-01"), np.datetime64("2023-01-02"),
+ np.datetime64("2023-01-03"), np.datetime64("2023-02-01"),
+ np.datetime64("2023-02-02")])
+ >>> segment(x, np.timedelta64(1, "D"))
+ array([3, 2])
+ """
+
+    # for compatibility with datetime list or np.timedelta64 arrays
+    if type(tolerance) in [np.timedelta64, timedelta]:
+        tolerance = pd.Timedelta(tolerance)
+
+    if type(tolerance) == pd.Timedelta:
+        positive_tol = tolerance >= pd.Timedelta("0 seconds")
+    else:
+        positive_tol = tolerance >= 0
+
+    if rowsize is None:
+        if positive_tol:
+            exceeds_tolerance = np.diff(x) > tolerance
+        else:
+            exceeds_tolerance = np.diff(x) < tolerance
+        segment_sizes = np.diff(np.insert(np.where(exceeds_tolerance)[0] + 1, 0, 0))
+        segment_sizes = np.append(segment_sizes, len(x) - np.sum(segment_sizes))
+        return segment_sizes
+    else:
+        if not np.sum(rowsize) == len(x):
+            raise ValueError("The sum of rowsize must equal the length of x.")
+        segment_sizes = []
+        start = 0
+        for r in rowsize:
+            end = start + int(r)
+            segment_sizes.append(segment(x[start:end], tolerance))
+            start = end
+        return np.concatenate(segment_sizes)
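The core diff-based segmentation can be traced with plain NumPy; this sketch reproduces the first docstring example for a positive tolerance:

```python
import numpy as np

x = np.array([0, 1, 1, 1, 2, 2, 3, 3, 3, 3, 4])
tolerance = 0.5

# Positions where the forward difference exceeds the tolerance mark new segments.
exceeds = np.diff(x) > tolerance

# Turn the break positions into segment sizes, then append the final segment.
sizes = np.diff(np.insert(np.where(exceeds)[0] + 1, 0, 0))
sizes = np.append(sizes, len(x) - np.sum(sizes))
```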
+
+
+
+
+def subset(
+    ds: xr.Dataset,
+    criteria: dict,
+    id_var_name: str = "id",
+    rowsize_var_name: str = "rowsize",
+    traj_dim_name: str = "traj",
+    obs_dim_name: str = "obs",
+    full_trajectories=False,
+) -> xr.Dataset:
+"""Subset a ragged array dataset as a function of one or more criteria.
+ The criteria are passed with a dictionary, where a dictionary key
+ is a variable to subset and the associated dictionary value is either a range
+ (valuemin, valuemax), a list [value1, value2, valueN], a single value, or a
+ masking function applied to every row of the ragged array using ``apply_ragged``.
+
+ This function needs to know the names of the dimensions of the ragged array dataset
+ (`traj_dim_name` and `obs_dim_name`), and the name of the rowsize variable (`rowsize_var_name`).
+ Default values are provided for these arguments (see below), but they can be changed if needed.
+
+ Parameters
+ ----------
+ ds : xr.Dataset
+ Dataset stored as ragged arrays
+ criteria : dict
+ dictionary containing the variables (as keys) and the ranges/values/functions (as values) to subset
+ id_var_name : str, optional
+ Name of the variable containing the ID of the trajectories (default is "id")
+ rowsize_var_name : str, optional
+ Name of the variable containing the number of observations per trajectory (default is "rowsize")
+ traj_dim_name : str, optional
+ Name of the trajectory dimension (default is "traj")
+ obs_dim_name : str, optional
+ Name of the observation dimension (default is "obs")
+ full_trajectories : bool, optional
+ If True, it returns the complete trajectories (rows) where at least one observation
+ matches the criteria, rather than just the segments where the criteria are satisfied.
+ Default is False.
+
+ Returns
+ -------
+ xr.Dataset
+ subset Dataset matching the criterion(a)
+
+ Examples
+ --------
+ Criteria are combined on any data or metadata variables part of the Dataset.
+ The following examples are based on NOAA GDP datasets which can be accessed with the
+ ``clouddrift.datasets`` module.
+
+ Retrieve a region, like the Gulf of Mexico, using ranges of latitude and longitude:
+
+ >>> subset(ds, {"lat": (21, 31), "lon": (-98, -78)})
+
+ The parameter `full_trajectories` can be used to retrieve trajectories passing through a region, for example all trajectories passing through the Gulf of Mexico:
+
+ >>> subset(ds, {"lat": (21, 31), "lon": (-98, -78)}, full_trajectories=True)
+
+ Retrieve drogued trajectory segments:
+
+ >>> subset(ds, {"drogue_status": True})
+
+ Retrieve trajectory segments with temperature higher than 25°C (303.15K):
+
+ >>> subset(ds, {"sst": (303.15, np.inf)})
+
+ You can use the same approach to return only the trajectories that are
+ shorter than some number of observations (similar to :func:`prune` but for
+ the entire dataset):
+
+ >>> subset(ds, {"rowsize": (0, 1000)})
+
+ Retrieve specific drifters from their IDs:
+
+ >>> subset(ds, {"id": [2578, 2582, 2583]})
+
+ Sometimes, you may want to retrieve specific rows of a ragged array.
+ You can do that by filtering along the trajectory dimension directly, since
+ this one corresponds to row numbers:
+
+ >>> rows = [5, 6, 7]
+ >>> subset(ds, {"traj": rows})
+
+ Retrieve a specific time period:
+
+ >>> subset(ds, {"time": (np.datetime64("2000-01-01"), np.datetime64("2020-01-31"))})
+
+ Note that to subset the time variable, the range has to be defined with the
+ same type as the variable. By default, ``xarray`` uses ``np.datetime64`` to
+ represent datetime data. If the datetime data is a ``datetime.datetime`` or
+ ``pd.Timestamp``, the range has to be defined accordingly.
+
+ Those criteria can also be combined:
+
+ >>> subset(ds, {"lat": (21, 31), "lon": (-98, -78), "drogue_status": True, "sst": (303.15, np.inf), "time": (np.datetime64("2000-01-01"), np.datetime64("2020-01-31"))})
+
+ You can also use a function to filter the data. For example, retrieve every other observation
+ of each trajectory (row):
+
+ >>> func = (lambda arr: ((arr - arr[0]) % 2) == 0)
+ >>> subset(ds, {"time": func})
+
+ Raises
+ ------
+ ValueError
+ If one of the variables in a criterion is not found in the Dataset
+ """
+    mask_traj = xr.DataArray(
+        data=np.ones(ds.sizes[traj_dim_name], dtype="bool"), dims=[traj_dim_name]
+    )
+    mask_obs = xr.DataArray(
+        data=np.ones(ds.sizes[obs_dim_name], dtype="bool"), dims=[obs_dim_name]
+    )
+
+    for key in criteria.keys():
+        if key in ds or key in ds.dims:
+            if ds[key].dims == (traj_dim_name,):
+                mask_traj = np.logical_and(
+                    mask_traj,
+                    _mask_var(
+                        ds[key], criteria[key], ds[rowsize_var_name], traj_dim_name
+                    ),
+                )
+            elif ds[key].dims == (obs_dim_name,):
+                mask_obs = np.logical_and(
+                    mask_obs,
+                    _mask_var(
+                        ds[key], criteria[key], ds[rowsize_var_name], obs_dim_name
+                    ),
+                )
+        else:
+            raise ValueError(f"Unknown variable '{key}'.")
+
+    # remove data when trajectories are filtered
+    traj_idx = rowsize_to_index(ds[rowsize_var_name].values)
+    for i in np.where(~mask_traj)[0]:
+        mask_obs[slice(traj_idx[i], traj_idx[i + 1])] = False
+
+    # remove trajectory completely filtered in mask_obs
+    ids_with_mask_obs = np.repeat(ds[id_var_name].values, ds[rowsize_var_name].values)[
+        mask_obs
+    ]
+    mask_traj = np.logical_and(
+        mask_traj, np.in1d(ds[id_var_name], np.unique(ids_with_mask_obs))
+    )
+
+    # reset mask_obs to True to keep complete trajectories
+    if full_trajectories:
+        for i in np.where(mask_traj)[0]:
+            mask_obs[slice(traj_idx[i], traj_idx[i + 1])] = True
+        ids_with_mask_obs = np.repeat(
+            ds[id_var_name].values, ds[rowsize_var_name].values
+        )[mask_obs]
+
+    if not any(mask_traj):
+        warnings.warn("No data matches the criteria; returning an empty dataset.")
+        return xr.Dataset()
+    else:
+        # apply the filtering for both dimensions
+        ds_sub = ds.isel({traj_dim_name: mask_traj, obs_dim_name: mask_obs})
+        _, unique_idx, sorted_rowsize = np.unique(
+            ids_with_mask_obs, return_index=True, return_counts=True
+        )
+        ds_sub[rowsize_var_name].values = sorted_rowsize[np.argsort(unique_idx)]
+        return ds_sub
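One of the steps above, expanding a trajectory-level mask to the observation dimension, can be sketched with plain NumPy (toy row sizes and mask):

```python
import numpy as np

rowsize = np.array([2, 1, 2])
mask_traj = np.array([True, False, True])  # keep rows 0 and 2, drop row 1

# Cumulative row boundaries, as rowsize_to_index computes them.
traj_idx = np.cumsum(np.insert(rowsize, 0, 0))

# Clear the observation mask over the span of every rejected trajectory.
mask_obs = np.ones(rowsize.sum(), dtype=bool)
for i in np.where(~mask_traj)[0]:
    mask_obs[traj_idx[i]:traj_idx[i + 1]] = False
```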
+
+
+
+
+def unpack(
+    ragged_array: np.ndarray,
+    rowsize: np.ndarray[int],
+    rows: Union[int, Iterable[int]] = None,
+    axis: int = 0,
+) -> list[np.ndarray]:
+"""Unpack a ragged array into a list of regular arrays.
+
+ Unpacking a ``np.ndarray`` ragged array is about 2 orders of magnitude
+ faster than unpacking an ``xr.DataArray`` ragged array, so unless you need a
+ ``DataArray`` as the result, we recommend passing ``np.ndarray`` as input.
+
+ Parameters
+ ----------
+ ragged_array : array-like
+ A ragged_array to unpack
+ rowsize : array-like
+ An array of integers whose values are the sizes of each row in the ragged
+ array
+ rows : int or Iterable[int], optional
+ A row or list of rows to unpack. Default is None, which unpacks all rows.
+ axis : int, optional
+ The axis along which to unpack the ragged array. Default is 0.
+
+ Returns
+ -------
+ list
+ A list of array-likes with sizes that correspond to the values in
+ rowsize, and types that correspond to the type of ragged_array
+
+ Examples
+ --------
+
+ Unpacking longitude arrays from a ragged Xarray Dataset:
+
+ .. code-block:: python
+
+ lon = unpack(ds.lon, ds["rowsize"]) # return a list[xr.DataArray] (slower)
+ lon = unpack(ds.lon.values, ds["rowsize"]) # return a list[np.ndarray] (faster)
+ first_lon = unpack(ds.lon.values, ds["rowsize"], rows=0) # return only the first row
+ first_two_lons = unpack(ds.lon.values, ds["rowsize"], rows=[0, 1]) # return first two rows
+
+ Looping over trajectories in a ragged Xarray Dataset to compute velocities
+ for each:
+
+ .. code-block:: python
+
+ for lon, lat, time in list(zip(
+ unpack(ds.lon.values, ds["rowsize"]),
+ unpack(ds.lat.values, ds["rowsize"]),
+ unpack(ds.time.values, ds["rowsize"])
+ )):
+ u, v = velocity_from_position(lon, lat, time)
+ """
+    indices = rowsize_to_index(rowsize)
+
+    if rows is None:
+        rows = range(indices.size - 1)
+    if isinstance(rows, (int, np.integer)):
+        rows = [rows]
+
+    unpacked = np.split(ragged_array, indices[1:-1], axis=axis)
+
+    return [unpacked[i] for i in rows]
+
+
+
+def _mask_var(
+    var: xr.DataArray,
+    criterion: Union[tuple, list, np.ndarray, xr.DataArray, bool, float, int, Callable],
+    rowsize: xr.DataArray = None,
+    dim_name: str = "dim_0",
+) -> xr.DataArray:
+"""Return the mask of a subset of the data matching a test criterion.
+
+ Parameters
+ ----------
+ var : xr.DataArray
+ DataArray to be subset by the criterion
+ criterion : array-like or scalar or Callable
+ The criterion can take four forms:
+ - tuple: (min, max) defining a range
+ - list, np.ndarray, or xr.DataArray: An array-like defining multiple values
+ - scalar: value defining a single value
+ - function: a function applied against each trajectory using ``apply_ragged`` and returning a mask
+ rowsize : xr.DataArray, optional
+ List of integers specifying the number of data points in each row
+ dim_name : str, optional
+ Name of the masked dimension (default is "dim_0")
+
+ Examples
+ --------
+ >>> x = xr.DataArray(data=np.arange(0, 5))
+ >>> _mask_var(x, (2, 4))
+ <xarray.DataArray (dim_0: 5)>
+ array([False, False, True, True, True])
+ Dimensions without coordinates: dim_0
+
+ >>> _mask_var(x, [0, 2, 4])
+ <xarray.DataArray (dim_0: 5)>
+ array([ True, False, True, False, True])
+ Dimensions without coordinates: dim_0
+
+ >>> _mask_var(x, 4)
+ <xarray.DataArray (dim_0: 5)>
+ array([False, False, False, True, False])
+ Dimensions without coordinates: dim_0
+
+ >>> rowsize = xr.DataArray(data=[2, 3])
+ >>> _mask_var(x, lambda arr: arr==arr[0]+1, rowsize, "dim_0")
+ <xarray.DataArray (dim_0: 5)>
+ array([False, True, False, True, False])
+ Dimensions without coordinates: dim_0
+
+ Returns
+ -------
+ mask : xr.DataArray
+ The mask of the subset of the data matching the criteria
+ """
+    if isinstance(criterion, tuple):  # min/max defining range
+        mask = np.logical_and(var >= criterion[0], var <= criterion[1])
+    elif isinstance(criterion, (list, np.ndarray, xr.DataArray)):
+        # select multiple values
+        mask = np.isin(var, criterion)
+    elif callable(criterion):
+        # mask directly created by applying `criterion` function
+        if len(var) == len(rowsize):
+            mask = criterion(var)
+        else:
+            mask = apply_ragged(criterion, var, rowsize)
+
+        mask = xr.DataArray(data=mask, dims=[dim_name]).astype(bool)
+
+        if not len(var) == len(mask):
+            raise ValueError(
+                "The `Callable` function must return a masked array that matches the length of the variable to filter."
+            )
+    else:  # select one specific value
+        mask = var == criterion
+    return mask
+
+"""
+This module defines the RaggedArray class, which is the intermediate data
+structure used by CloudDrift to process custom Lagrangian datasets to Xarray
+Datasets and Awkward Arrays.
+"""
+import awkward as ak
+from clouddrift.ragged import rowsize_to_index
+import xarray as xr
+import numpy as np
+from collections.abc import Callable
+from typing import Tuple, Optional
+from tqdm import tqdm
+import warnings
+
+
+
+    @classmethod
+    def from_awkward(
+        cls,
+        array: ak.Array,
+        name_coords: Optional[list] = ["time", "lon", "lat", "ids"],
+    ):
+"""Load a RaggedArray instance from an Awkward Array.
+
+ Parameters
+ ----------
+ array : ak.Array
+ Awkward Array instance to load the data from
+ name_coords : list, optional
+ Names of the coordinate variables in the ragged arrays
+
+ Returns
+ -------
+ RaggedArray
+ A RaggedArray instance
+ """
+        coords = {}
+        metadata = {}
+        data = {}
+        attrs_variables = {}
+
+        attrs_global = array.layout.parameters["attrs"]
+
+        for var in name_coords:
+            coords[var] = ak.flatten(array.obs[var]).to_numpy()
+            attrs_variables[var] = array.obs[var].layout.parameters["attrs"]
+
+        for var in [v for v in array.fields if v != "obs"]:
+            metadata[var] = array[var].to_numpy()
+            attrs_variables[var] = array[var].layout.parameters["attrs"]
+
+        for var in [v for v in array.obs.fields if v not in name_coords]:
+            data[var] = ak.flatten(array.obs[var]).to_numpy()
+            attrs_variables[var] = array.obs[var].layout.parameters["attrs"]
+
+        return cls(coords, metadata, data, attrs_global, attrs_variables)
+
+
+
+    @classmethod
+    def from_files(
+        cls,
+        indices: list,
+        preprocess_func: Callable[[int], xr.Dataset],
+        name_coords: list,
+        name_meta: Optional[list] = [],
+        name_data: Optional[list] = [],
+        rowsize_func: Optional[Callable[[int], int]] = None,
+        **kwargs,
+    ):
+"""Generate a ragged array archive from a list of trajectory files
+
+ Parameters
+ ----------
+ indices : list
+ Identification numbers list to iterate
+ preprocess_func : Callable[[int], xr.Dataset]
+ Returns a processed xarray Dataset from an identification number
+ name_coords : list
+ Name of the coordinate variables to include in the archive
+ name_meta : list, optional
+ Name of metadata variables to include in the archive (Defaults to [])
+ name_data : list, optional
+ Name of the data variables to include in the archive (Defaults to [])
+ rowsize_func : Optional[Callable[[int], int]], optional
+ Returns the number of observations from an identification number (to speed up processing) (Defaults to None)
+
+ Returns
+ -------
+ RaggedArray
+ A RaggedArray instance
+ """
+        # if no method is supplied, get the dimension from the preprocessing function
+        rowsize_func = (
+            rowsize_func
+            if rowsize_func
+            else lambda i, **kwargs: preprocess_func(i, **kwargs).sizes["obs"]
+        )
+        rowsize = cls.number_of_observations(rowsize_func, indices, **kwargs)
+        coords, metadata, data = cls.allocate(
+            preprocess_func,
+            indices,
+            rowsize,
+            name_coords,
+            name_meta,
+            name_data,
+            **kwargs,
+        )
+        attrs_global, attrs_variables = cls.attributes(
+            preprocess_func(indices[0], **kwargs),
+            name_coords,
+            name_meta,
+            name_data,
+        )
+
+        return cls(coords, metadata, data, attrs_global, attrs_variables)
+
+
+
+    @classmethod
+    def from_netcdf(cls, filename: str):
+"""Read a ragged arrays archive from a NetCDF file.
+
+ This is a thin wrapper around ``from_xarray()``.
+
+ Parameters
+ ----------
+ filename : str
+ File name of the NetCDF archive to read.
+
+ Returns
+ -------
+ RaggedArray
+ A ragged array instance
+ """
+        return cls.from_xarray(xr.open_dataset(filename))
+
+
+
+    @classmethod
+    def from_parquet(
+        cls, filename: str, name_coords: Optional[list] = ["time", "lon", "lat", "ids"]
+    ):
+"""Read a ragged array from a parquet file.
+
+ Parameters
+ ----------
+ filename : str
+ File name of the parquet archive to read.
+ name_coords : list, optional
+ Names of the coordinate variables in the ragged arrays
+
+ Returns
+ -------
+ RaggedArray
+ A ragged array instance
+ """
+        return cls.from_awkward(ak.from_parquet(filename), name_coords)
+
+
+
+    @classmethod
+    def from_xarray(cls, ds: xr.Dataset, dim_traj: str = "traj", dim_obs: str = "obs"):
+"""Populate a RaggedArray instance from an xarray Dataset instance.
+
+ Parameters
+ ----------
+ ds : xr.Dataset
+ Xarray Dataset from which to load the RaggedArray
+ dim_traj : str, optional
+ Name of the trajectories dimension in the xarray Dataset
+ dim_obs : str, optional
+ Name of the observations dimension in the xarray Dataset
+
+ Returns
+ -------
+ RaggedArray
+ A RaggedArray instance
+ """
+        coords = {}
+        metadata = {}
+        data = {}
+        attrs_global = {}
+        attrs_variables = {}
+
+        attrs_global = ds.attrs
+
+        for var in ds.coords.keys():
+            coords[var] = ds[var].data
+            attrs_variables[var] = ds[var].attrs
+
+        for var in ds.data_vars.keys():
+            if len(ds[var]) == ds.sizes[dim_traj]:
+                metadata[var] = ds[var].data
+            elif len(ds[var]) == ds.sizes[dim_obs]:
+                data[var] = ds[var].data
+            else:
+                warnings.warn(
+                    f"""
+                    Variable '{var}' has unknown dimension size of
+                    {len(ds[var])}, which is not traj={ds.sizes[dim_traj]} or
+                    obs={ds.sizes[dim_obs]}; skipping.
+                    """
+                )
+            attrs_variables[var] = ds[var].attrs
+
+        return cls(coords, metadata, data, attrs_global, attrs_variables)
+
+
+
+    @staticmethod
+    def number_of_observations(
+        rowsize_func: Callable[[int], int], indices: list, **kwargs
+    ) -> np.array:
+"""Iterate through the files and evaluate the number of observations.
+
+ Parameters
+ ----------
+ rowsize_func : Callable[[int], int]
+ Function that returns the number of observations of a trajectory from
+ its identification number
+ indices : list
+ Identification numbers list to iterate
+
+ Returns
+ -------
+ np.ndarray
+ Number of observations of each trajectory
+ """
+        rowsize = np.zeros(len(indices), dtype="int")
+
+        for i, index in tqdm(
+            enumerate(indices),
+            total=len(indices),
+            desc="Retrieving the number of obs",
+            ncols=80,
+        ):
+            rowsize[i] = rowsize_func(index, **kwargs)
+        return rowsize
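A minimal sketch of this loop with a hypothetical ``rowsize_func`` backed by a lookup table (the identification numbers and lengths are made up, and the ``tqdm`` progress bar is omitted):

```python
import numpy as np

# Hypothetical mapping from drifter id to number of observations.
lengths = {101: 3, 102: 5, 103: 2}
rowsize_func = lambda drifter_id: lengths[drifter_id]

indices = [101, 102, 103]
rowsize = np.zeros(len(indices), dtype="int")
for i, index in enumerate(indices):
    rowsize[i] = rowsize_func(index)
```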
+
+
+
+    @staticmethod
+    def attributes(
+        ds: xr.Dataset, name_coords: list, name_meta: list, name_data: list
+    ) -> Tuple[dict, dict]:
+"""Return global attributes and the attributes of all variables
+ (name_coords, name_meta, and name_data) from an Xarray Dataset.
+
+ Parameters
+ ----------
+ ds : xr.Dataset
+ Xarray Dataset from which to extract the attributes
+ name_coords : list
+ Name of the coordinate variables to include in the archive
+ name_meta : list, optional
+ Name of metadata variables to include in the archive (default is [])
+ name_data : list, optional
+ Name of the data variables to include in the archive (default is [])
+
+ Returns
+ -------
+ Tuple[dict, dict]
+ The global and variables attributes
+ """
+        attrs_global = ds.attrs
+
+        # coordinates, metadata, and data
+        attrs_variables = {}
+        for var in name_coords + name_meta + name_data:
+            if var in ds.keys():
+                attrs_variables[var] = ds[var].attrs
+            else:
+                warnings.warn(f"Variable {var} requested but not found; skipping.")
+
+        return attrs_global, attrs_variables
+
+
+
+    @staticmethod
+    def allocate(
+        preprocess_func: Callable[[int], xr.Dataset],
+        indices: list,
+        rowsize: list,
+        name_coords: list,
+        name_meta: list,
+        name_data: list,
+        **kwargs,
+    ) -> Tuple[dict, dict, dict]:
+"""
+ Iterate through the files and fill for the ragged array associated
+ with coordinates, and selected metadata and data variables.
+
+ Parameters
+ ----------
+ preprocess_func : Callable[[int], xr.Dataset]
+ Returns a processed xarray Dataset from an identification number.
+ indices : list
+ List of identification numbers to iterate over.
+ rowsize : list
+ List of the number of observations per trajectory.
+ name_coords : list
+ Name of the coordinate variables to include in the archive.
+ name_meta : list, optional
+ Name of metadata variables to include in the archive (Defaults to []).
+ name_data : list, optional
+ Name of the data variables to include in the archive (Defaults to []).
+
+ Returns
+ -------
+ Tuple[dict, dict, dict]
+ Dictionaries containing numerical data and attributes of coordinates, metadata and data variables.
+ """
+
+        # open one file to get dtype of variables
+        ds = preprocess_func(indices[0], **kwargs)
+        nb_traj = len(rowsize)
+        nb_obs = np.sum(rowsize).astype("int")
+        index_traj = rowsize_to_index(rowsize)
+
+        # allocate memory
+        coords = {}
+        for var in name_coords:
+            coords[var] = np.zeros(nb_obs, dtype=ds[var].dtype)
+
+        metadata = {}
+        for var in name_meta:
+            try:
+                metadata[var] = np.zeros(nb_traj, dtype=ds[var].dtype)
+            except KeyError:
+                warnings.warn(f"Variable {var} requested but not found; skipping.")
+
+        data = {}
+        for var in name_data:
+            if var in ds.keys():
+                data[var] = np.zeros(nb_obs, dtype=ds[var].dtype)
+            else:
+                warnings.warn(f"Variable {var} requested but not found; skipping.")
+        ds.close()
+
+        # loop and fill the ragged array
+        for i, index in tqdm(
+            enumerate(indices),
+            total=len(indices),
+            desc="Filling the Ragged Array",
+            ncols=80,
+        ):
+            with preprocess_func(index, **kwargs) as ds:
+                size = rowsize[i]
+                oid = index_traj[i]
+
+                for var in name_coords:
+                    coords[var][oid : oid + size] = ds[var].data
+
+                for var in name_meta:
+                    try:
+                        metadata[var][i] = ds[var][0].data
+                    except KeyError:
+                        warnings.warn(
+                            f"Variable {var} requested but not found; skipping."
+                        )
+
+                for var in name_data:
+                    if var in ds.keys():
+                        data[var][oid : oid + size] = ds[var].data
+                    else:
+                        warnings.warn(
+                            f"Variable {var} requested but not found; skipping."
+                        )
+
+        return coords, metadata, data
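The offset arithmetic in the filling loop above can be sketched standalone with NumPy. Here `index_traj` is assumed (based on its use above) to hold the flat-array start offset of each trajectory, i.e. a cumulative sum of `rowsize` with a leading zero:

```python
import numpy as np

# Observations per trajectory, as in the `rowsize` argument above.
rowsize = [3, 2, 4]

# Equivalent of rowsize_to_index (assumption based on its use above):
# start offset of each trajectory in the flattened "obs" dimension.
index_traj = np.insert(np.cumsum(rowsize), 0, 0)  # [0, 3, 5, 9]

# Fill a flat ragged array trajectory by trajectory, mirroring
# coords[var][oid : oid + size] = ds[var].data in `allocate`.
flat = np.zeros(index_traj[-1], dtype="int")
for i, size in enumerate(rowsize):
    oid = index_traj[i]
    flat[oid : oid + size] = i + 1  # stand-in for the trajectory's data
# each trajectory occupies a contiguous slice of `flat`
```

Because each trajectory maps to a contiguous slice, no per-observation bookkeeping is needed beyond the cumulative offsets.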
+
+
+
+    def validate_attributes(self):
+"""Validate that each variable has an assigned attribute tag."""
+        for key in (
+            list(self.coords.keys())
+            + list(self.metadata.keys())
+            + list(self.data.keys())
+        ):
+            if key not in self.attrs_variables:
+                self.attrs_variables[key] = {}
+
+
+
+    def to_xarray(self, cast_to_float32: bool = True):
+"""Convert ragged array object to a xarray Dataset.
+
+ Parameters
+ ----------
+ cast_to_float32 : bool, optional
+ Cast all float64 variables to float32 (default is True). This option aims at
+ minimizing the size of the xarray dataset.
+
+ Returns
+ -------
+ xr.Dataset
+ Xarray Dataset containing the ragged arrays and their attributes
+ """
+
+        xr_coords = {}
+        for var in self.coords.keys():
+            xr_coords[var] = (["obs"], self.coords[var], self.attrs_variables[var])
+
+        xr_data = {}
+        for var in self.metadata.keys():
+            xr_data[var] = (["traj"], self.metadata[var], self.attrs_variables[var])
+
+        for var in self.data.keys():
+            xr_data[var] = (["obs"], self.data[var], self.attrs_variables[var])
+
+        return xr.Dataset(coords=xr_coords, data_vars=xr_data, attrs=self.attrs_global)
+def analytic_signal(
+    x: Union[np.ndarray, xr.DataArray],
+    boundary: Optional[str] = "mirror",
+    time_axis: Optional[int] = -1,
+) -> Union[np.ndarray, Tuple[np.ndarray, np.ndarray]]:
+"""Return the analytic signal from a real-valued signal or the analytic and
+ conjugate analytic signals from a complex-valued signal.
+
+ If the input is a real-valued signal, the analytic signal is calculated as
+ the inverse Fourier transform of the positive-frequency part of the Fourier
+ transform. If the input is a complex-valued signal, the conjugate analytic signal
+ is additionally calculated as the inverse Fourier transform of the positive-frequency
+ part of the Fourier transform of the complex conjugate of the input signal.
+
+ For a complex-valued signal, the mean is evenly divided between the analytic and
+ conjugate analytic signal.
+
+ The calculation is performed along the last axis of the input array by default.
+ Alternatively, the user can specify the time axis of the input. The user can also
+ specify the boundary conditions to be applied to the input array (default is "mirror").
+
+ Parameters
+ ----------
+ x : array_like
+ Real- or complex-valued signal.
+ boundary : str, optional
+ The boundary condition to be imposed at the edges of the time series.
+ Allowed values are "mirror", "zeros", and "periodic".
+ Default is "mirror".
+ time_axis : int, optional
+ Axis on which the time is defined (default is -1).
+
+ Returns
+ -------
+ xa : np.ndarray
+ Analytic signal. It is a tuple if the input is a complex-valued signal
+ with the first element being the analytic signal and the second element
+ being the conjugate analytic signal.
+
+ Examples
+ --------
+
+ To obtain the analytic signal of a real-valued signal:
+
+ >>> x = np.random.rand(99)
+ >>> xa = analytic_signal(x)
+
+ To obtain the analytic and conjugate analytic signals of a complex-valued signal:
+
+ >>> w = np.random.rand(99)+1j*np.random.rand(99)
+ >>> wp, wn = analytic_signal(w)
+
+ To specify that a periodic boundary condition should be used:
+
+ >>> x = np.random.rand(99)
+ >>> xa = analytic_signal(x, boundary="periodic")
+
+ To specify that the time axis is along the first axis and apply
+ zero boundary conditions:
+
+ >>> x = np.random.rand(100, 99)
+ >>> xa = analytic_signal(x, time_axis=0, boundary="zeros")
+
+ Raises
+ ------
+ ValueError
+ If the time axis is outside of the valid range ([-1, N-1]).
+ If ``boundary not in ["mirror", "zeros", "periodic"]``.
+
+ References
+ ----------
+ [1] Gabor D. 1946 Theory of communication. Proc. IEE 93, 429–457. (10.1049/ji-1.1947.0015).
+
+ [2] Lilly JM, Olhede SC. 2010 Bivariate instantaneous frequency and bandwidth.
+ IEEE T. Signal Proces. 58, 591–603. (10.1109/TSP.2009.2031729).
+
+ See Also
+ --------
+ :func:`rotary_to_cartesian`, :func:`cartesian_to_rotary`
+ """
+    # time_axis must be in valid range
+    if time_axis < -1 or time_axis > len(x.shape) - 1:
+        raise ValueError(
+            f"time_axis ({time_axis}) is outside of the valid range ([-1,"
+            f" {len(x.shape) - 1}])."
+        )
+
+    # Swap the axis to make the time axis last (fast-varying).
+    # np.swapaxes returns a view to the input array, so no copy is made.
+    if time_axis != -1 and time_axis != len(x.shape) - 1:
+        x_ = np.swapaxes(x, time_axis, -1)
+    else:
+        x_ = x
+
+    # time dimension length
+    N = np.shape(x_)[-1]
+
+    # Subtract mean along time axis (-1); convert to np.array for compatibility
+    # with xarray.DataArray.
+    mx_ = np.array(np.mean(x_, axis=-1, keepdims=True))
+    xa = x_ - mx_
+
+    # apply boundary conditions
+    if boundary == "mirror":
+        xa = np.concatenate((np.flip(xa, axis=-1), xa, np.flip(xa, axis=-1)), axis=-1)
+    elif boundary == "zeros":
+        xa = np.concatenate((np.zeros_like(xa), xa, np.zeros_like(xa)), axis=-1)
+    elif boundary == "periodic":
+        xa = np.concatenate((xa, xa, xa), axis=-1)
+    else:
+        raise ValueError("boundary must be one of 'mirror', 'zeros', or 'periodic'.")
+
+    # analytic signal
+    xap = np.fft.fft(xa)
+    # conjugate analytic signal
+    xan = np.fft.fft(np.conj(xa))
+
+    # time dimension of extended time series
+    M = np.shape(xa)[-1]
+
+    # zero negative frequencies
+    if M % 2 == 0:
+        xap[..., int(M / 2 + 2) - 1 : int(M + 1) + 1] = 0
+        xan[..., int(M / 2 + 2) - 1 : int(M + 1) + 1] = 0
+        # divide Nyquist component by 2 in even case
+        xap[..., int(M / 2 + 1) - 1] = xap[..., int(M / 2 + 1) - 1] / 2
+        xan[..., int(M / 2 + 1) - 1] = xan[..., int(M / 2 + 1) - 1] / 2
+    else:
+        xap[..., int((M + 3) / 2) - 1 : int(M + 1) + 1] = 0
+        xan[..., int((M + 3) / 2) - 1 : int(M + 1) + 1] = 0
+
+    # inverse Fourier transform along last axis
+    xap = np.fft.ifft(xap)
+    xan = np.fft.ifft(xan)
+
+    # return central part plus half the mean
+    xap = xap[..., int(N + 1) - 1 : int(2 * N + 1) - 1] + 0.5 * mx_
+    xan = xan[..., int(N + 1) - 1 : int(2 * N + 1) - 1] + 0.5 * np.conj(mx_)
+
+    if np.isrealobj(x):
+        xa = xap + xan
+    else:
+        xa = (xap, xan)
+
+    # return after reorganizing the axes
+    if time_axis != -1 and time_axis != len(x.shape) - 1:
+        return np.swapaxes(xa, time_axis, -1)
+    else:
+        return xa
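The positive-frequency construction can be checked against a known pair: for a pure cosine with an integer number of cycles (so boundary effects vanish), the analytic signal is `cos + 1j*sin`. A minimal sketch of the same FFT recipe, without the boundary padding and not using clouddrift itself:

```python
import numpy as np

def analytic(x):
    # Keep positive frequencies and zero negative ones: DC stays as-is,
    # interior positive frequencies are doubled (standard Hilbert recipe).
    N = len(x)
    X = np.fft.fft(x)
    h = np.zeros(N)
    h[0] = 1
    if N % 2 == 0:
        h[N // 2] = 1          # Nyquist component kept once
        h[1 : N // 2] = 2
    else:
        h[1 : (N + 1) // 2] = 2
    return np.fft.ifft(X * h)

t = np.linspace(0, 2 * np.pi, 128, endpoint=False)
xa = analytic(np.cos(4 * t))
# real part recovers cos(4t); imaginary part is its Hilbert transform, sin(4t)
```

This is equivalent to `scipy.signal.hilbert` for periodic input; the clouddrift function adds mean handling and the boundary extensions described above.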
+
+
+
+
+def cartesian_to_rotary(
+    ua: Union[np.ndarray, xr.DataArray],
+    va: Union[np.ndarray, xr.DataArray],
+    time_axis: Optional[int] = -1,
+) -> Tuple[np.ndarray, np.ndarray]:
+"""Return rotary signals (wp,wn) from analytic Cartesian signals (ua,va).
+
+ If ua is the analytic signal from the real-valued signal u, and va the
+ analytic signal from the real-valued signal v, then the positive
+ (counterclockwise) and negative (clockwise) rotary signals are defined by
+ wp = 0.5*(ua+1j*va) and wn = 0.5*(ua-1j*va).
+
+ This function is the inverse of :func:`rotary_to_cartesian`.
+
+ Parameters
+ ----------
+ ua : array_like
+ Complex-valued analytic signal for first Cartesian component (zonal, east-west)
+ va : array_like
+ Complex-valued analytic signal for second Cartesian component (meridional, north-south)
+ time_axis : int, optional
+ The axis of the time array. Default is -1, which corresponds to the
+ last axis.
+
+ Returns
+ -------
+ wp : np.ndarray
+ Complex-valued positive (counterclockwise) rotary signal.
+ wn : np.ndarray
+ Complex-valued negative (clockwise) rotary signal.
+
+ Examples
+ --------
+ To obtain the rotary signals from a pair of real-valued signals:
+
+ >>> u = np.random.rand(99)
+ >>> v = np.random.rand(99)
+ >>> wp, wn = cartesian_to_rotary(analytic_signal(u), analytic_signal(v))
+
+ To specify that the time axis is along the first axis:
+
+ >>> u = np.random.rand(100, 99)
+ >>> v = np.random.rand(100, 99)
+ >>> wp, wn = cartesian_to_rotary(analytic_signal(u), analytic_signal(v), time_axis=0)
+
+ Raises
+ ------
+ ValueError
+ If the input arrays do not have the same shape.
+ If the time axis is outside of the valid range ([-1, N-1]).
+
+ References
+ ----------
+ Lilly JM, Olhede SC. 2010 Bivariate instantaneous frequency and bandwidth.
+ IEEE T. Signal Proces. 58, 591–603. (10.1109/TSP.2009.2031729)
+
+ See Also
+ --------
+ :func:`analytic_signal`, :func:`rotary_to_cartesian`
+ """
+    # ua and va arrays must have the same shape.
+    if not ua.shape == va.shape:
+        raise ValueError("ua and va must have the same shape.")
+
+    # time_axis must be in valid range
+    if time_axis < -1 or time_axis > len(ua.shape) - 1:
+        raise ValueError(
+            f"time_axis ({time_axis}) is outside of the valid range ([-1,"
+            f" {len(ua.shape) - 1}])."
+        )
+
+    wp = 0.5 * (ua + 1j * va)
+    wn = 0.5 * (ua - 1j * va)
+
+    return wp, wn
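As a quick sanity check of the wp/wn definition: purely counterclockwise circular motion (u = cos t, v = sin t, whose analytic signals are e^(it) and -i e^(it)) puts all the variance in wp and none in wn. A sketch, not using clouddrift itself:

```python
import numpy as np

t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
ua = np.exp(1j * t)         # analytic signal of cos(t)
va = -1j * np.exp(1j * t)   # analytic signal of sin(t)

wp = 0.5 * (ua + 1j * va)   # positive (counterclockwise) rotary signal
wn = 0.5 * (ua - 1j * va)   # negative (clockwise) rotary signal
# wp equals exp(1j*t) while wn vanishes: all energy rotates counterclockwise
```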
+
+
+
+
+def ellipse_parameters(
+    xa: Union[np.ndarray, xr.DataArray],
+    ya: Union[np.ndarray, xr.DataArray],
+) -> Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
+"""Return the instantaneous parameters of a modulated elliptical signal from its analytic Cartesian signals.
+
+ Parameters
+ ----------
+ xa : array_like
+ Complex-valued analytic signal for first Cartesian component (zonal, east-west).
+ ya : array_like
+ Complex-valued analytic signal for second Cartesian component (meridional, north-south).
+
+ Returns
+ -------
+ kappa : np.ndarray
+ Ellipse root-mean-square amplitude.
+ lambda_ : np.ndarray
+ Ellipse linearity between -1 and 1, or departure from circular motion (lambda_=0).
+ theta : np.ndarray
+ Ellipse orientation in radian.
+ phi : np.ndarray
+ Ellipse phase in radian.
+
+ Examples
+ --------
+
+ To obtain the ellipse parameters from a pair of real-valued signals (x, y):
+
+ >>> kappa, lambda_, theta, phi = ellipse_parameters(analytic_signal(x), analytic_signal(y))
+
+ Raises
+ ------
+ ValueError
+ If the input arrays do not have the same shape.
+
+ References
+ ----------
+ Lilly JM, Olhede SC. 2010 Bivariate instantaneous frequency and bandwidth.
+ IEEE T. Signal Proces. 58, 591–603. (10.1109/TSP.2009.2031729).
+
+ See Also
+ --------
+ :func:`modulated_ellipse_signal`, :func:`analytic_signal`, :func:`rotary_to_cartesian`, :func:`cartesian_to_rotary`
+
+ """
+
+    # xa and ya arrays must have the same shape.
+    if not xa.shape == ya.shape:
+        raise ValueError("xa and ya must have the same shape.")
+
+    X = np.abs(xa)
+    Y = np.abs(ya)
+    phix = np.angle(xa)
+    phiy = np.angle(ya)
+
+    phia = 0.5 * (phix + phiy + 0.5 * np.pi)
+    phid = 0.5 * (phix - phiy - 0.5 * np.pi)
+
+    P = 0.5 * np.sqrt(X**2 + Y**2 + 2 * X * Y * np.cos(2 * phid))
+    N = 0.5 * np.sqrt(X**2 + Y**2 - 2 * X * Y * np.cos(2 * phid))
+
+    phip = np.unwrap(
+        phia
+        + np.unwrap(np.imag(np.log(X * np.exp(1j * phid) + Y * np.exp(-1j * phid))))
+    )
+    phin = np.unwrap(
+        phia
+        + np.unwrap(np.imag(np.log(X * np.exp(1j * phid) - Y * np.exp(-1j * phid))))
+    )
+
+    kappa = np.sqrt(P**2 + N**2)
+    lambda_ = (2 * P * N * np.sign(P - N)) / (P**2 + N**2)
+
+    # For vanishing linearity, put in very small number to have sign information
+    lambda_[lambda_ == 0] = np.sign(P[lambda_ == 0] - N[lambda_ == 0]) * (1e-12)
+
+    theta = np.unwrap(0.5 * (phip - phin))
+    phi = np.unwrap(0.5 * (phip + phin))
+
+    lambda_ = np.real(lambda_)
+
+    return kappa, lambda_, theta, phi
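For a steadily rotating ellipse aligned with the coordinate axes (semi-axes a and b), the algebra above reduces to kappa = sqrt((a² + b²)/2) and lambda = (a² - b²)/(a² + b²). A numerical check of the P/N step, not using clouddrift itself:

```python
import numpy as np

a, b = 2.0, 1.0
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
xa = a * np.exp(1j * t)        # analytic x for an axis-aligned ellipse
ya = -1j * b * np.exp(1j * t)  # analytic y

X, Y = np.abs(xa), np.abs(ya)
phid = 0.5 * (np.angle(xa) - np.angle(ya) - 0.5 * np.pi)
P = 0.5 * np.sqrt(X**2 + Y**2 + 2 * X * Y * np.cos(2 * phid))
N = 0.5 * np.sqrt(X**2 + Y**2 - 2 * X * Y * np.cos(2 * phid))

kappa = np.sqrt(P**2 + N**2)                          # sqrt((a^2 + b^2) / 2)
lambda_ = 2 * P * N * np.sign(P - N) / (P**2 + N**2)  # (a^2 - b^2) / (a^2 + b^2)
```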
+
+
+
+
+def modulated_ellipse_signal(
+    kappa: Union[np.ndarray, xr.DataArray],
+    lambda_: Union[np.ndarray, xr.DataArray],
+    theta: Union[np.ndarray, xr.DataArray],
+    phi: Union[np.ndarray, xr.DataArray],
+) -> Tuple[np.ndarray, np.ndarray]:
+"""Return the analytic Cartesian signals (xa, ya) from the instantaneous parameters of a modulated elliptical signal.
+
+ This function is the inverse of :func:`ellipse_parameters`.
+
+ Parameters
+ ----------
+ kappa : array_like
+ Ellipse root-mean-square amplitude.
+ lambda_ : array_like
+ Ellipse linearity between -1 and 1, or departure from circular motion (lambda_=0).
+ theta : array_like
+ Ellipse orientation in radian.
+ phi : array_like
+ Ellipse phase in radian.
+
+ Returns
+ -------
+ xa : np.ndarray
+ Complex-valued analytic signal for first Cartesian component (zonal, east-west).
+ ya : np.ndarray
+ Complex-valued analytic signal for second Cartesian component (meridional, north-south).
+
+ Examples
+ --------
+
+ To obtain the analytic signals from the instantaneous parameters of a modulated elliptical signal:
+
+ >>> xa, ya = modulated_ellipse_signal(kappa, lambda_, theta, phi)
+
+ Raises
+ ------
+ ValueError
+ If the input arrays do not have the same shape.
+
+ References
+ ----------
+ Lilly JM, Olhede SC. 2010 Bivariate instantaneous frequency and bandwidth.
+ IEEE T. Signal Proces. 58, 591–603. (10.1109/TSP.2009.2031729).
+
+ See Also
+ --------
+ :func:`ellipse_parameters`, :func:`analytic_signal`, :func:`rotary_to_cartesian`, :func:`cartesian_to_rotary`
+
+ """
+
+    # make sure all input arrays have the same shape
+    if not kappa.shape == lambda_.shape == theta.shape == phi.shape:
+        raise ValueError("All input arrays must have the same shape.")
+
+    # calculate semi-major and semi-minor axes
+    a = kappa * np.sqrt(1 + np.abs(lambda_))
+    b = np.sign(lambda_) * kappa * np.sqrt(1 - np.abs(lambda_))
+
+    # define b to be positive for lambda exactly zero
+    b[lambda_ == 0] = kappa[lambda_ == 0]
+
+    xa = np.exp(1j * phi) * (a * np.cos(theta) + 1j * b * np.sin(theta))
+    ya = np.exp(1j * phi) * (a * np.sin(theta) - 1j * b * np.cos(theta))
+
+    mask = np.isinf(kappa * lambda_ * theta * phi)
+    xa[mask] = np.inf + 1j * np.inf
+    ya[mask] = np.inf + 1j * np.inf
+
+    return xa, ya
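Conversely, starting from kappa = sqrt(2.5) and lambda = 0.6 recovers semi-axes a = 2 and b = 1, and with theta = 0 and phi = t the construction above reproduces the analytic signals of the same axis-aligned ellipse. A sketch, not using clouddrift itself:

```python
import numpy as np

kappa, lam = np.sqrt(2.5), 0.6
a = kappa * np.sqrt(1 + abs(lam))                 # semi-major axis -> 2.0
b = np.sign(lam) * kappa * np.sqrt(1 - abs(lam))  # semi-minor axis -> 1.0

t = np.linspace(0, 2 * np.pi, 16, endpoint=False)
theta, phi = 0.0, t
xa = np.exp(1j * phi) * (a * np.cos(theta) + 1j * b * np.sin(theta))
ya = np.exp(1j * phi) * (a * np.sin(theta) - 1j * b * np.cos(theta))
# xa = 2*exp(1j*t) and ya = -1j*exp(1j*t): the ellipse from the previous check
```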
+
+
+
+
+def rotary_to_cartesian(
+    wp: Union[np.ndarray, xr.DataArray],
+    wn: Union[np.ndarray, xr.DataArray],
+    time_axis: Optional[int] = -1,
+) -> Tuple[np.ndarray, np.ndarray]:
+"""Return Cartesian analytic signals (ua, va) from rotary signals (wp, wn)
+ as ua = wp + wn and va = -1j * (wp - wn).
+
+ This function is the inverse of :func:`cartesian_to_rotary`.
+
+ Parameters
+ ----------
+ wp : array_like
+ Complex-valued positive (counterclockwise) rotary signal.
+ wn : array_like
+ Complex-valued negative (clockwise) rotary signal.
+ time_axis : int, optional
+ The axis of the time array. Default is -1, which corresponds to the
+ last axis.
+
+ Returns
+ -------
+ ua : array_like
+ Complex-valued analytic signal, first Cartesian component (zonal, east-west)
+ va : array_like
+ Complex-valued analytic signal, second Cartesian component (meridional, north-south)
+
+ Examples
+ --------
+
+ To obtain the Cartesian analytic signals from a pair of rotary signals (wp,wn):
+
+ >>> ua, va = rotary_to_cartesian(wp, wn)
+
+ To specify that the time axis is along the first axis:
+
+ >>> ua, va = rotary_to_cartesian(wp, wn, time_axis=0)
+
+ Raises
+ ------
+ ValueError
+ If the input arrays do not have the same shape.
+ If the time axis is outside of the valid range ([-1, N-1]).
+
+ References
+ ----------
+ Lilly JM, Olhede SC. 2010 Bivariate instantaneous frequency and bandwidth.
+ IEEE T. Signal Proces. 58, 591–603. (10.1109/TSP.2009.2031729)
+
+ See Also
+ --------
+ :func:`analytic_signal`, :func:`cartesian_to_rotary`
+ """
+
+    # wp and wn arrays must have the same shape.
+    if not wp.shape == wn.shape:
+        raise ValueError("wp and wn must have the same shape.")
+
+    # time_axis must be in valid range
+    if time_axis < -1 or time_axis > len(wp.shape) - 1:
+        raise ValueError(
+            f"time_axis ({time_axis}) is outside of the valid range ([-1,"
+            f" {len(wp.shape) - 1}])."
+        )
+
+    # If the inputs are xarray DataArrays, the arithmetic below preserves
+    # that type in the outputs.
+    ua = wp + wn
+    va = -1j * (wp - wn)
+
+    return ua, va
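A round trip through both definitions confirms that they invert each other for arbitrary complex inputs (a sketch, not using clouddrift itself):

```python
import numpy as np

rng = np.random.default_rng(42)
ua = rng.standard_normal(16) + 1j * rng.standard_normal(16)
va = rng.standard_normal(16) + 1j * rng.standard_normal(16)

# cartesian_to_rotary followed by rotary_to_cartesian
wp, wn = 0.5 * (ua + 1j * va), 0.5 * (ua - 1j * va)
ua2, va2 = wp + wn, -1j * (wp - wn)
# ua2 == ua and va2 == va up to floating-point error
```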
+def cumulative_distance(
+    longitude: Union[list, np.ndarray, xr.DataArray],
+    latitude: Union[list, np.ndarray, xr.DataArray],
+) -> np.ndarray:
+"""Return the cumulative great circle distance in meters along a sequence of geographical locations.
+
+ Parameters
+ ----------
+ latitude : array-like
+ Latitude sequence, in degrees.
+ longitude : array-like
+ Longitude sequence, in degrees.
+
+ Returns
+ -------
+ out : np.ndarray
+ Cumulative distance.
+
+ See Also
+ --------
+ :func:`distance`
+
+ Examples
+ --------
+ Calculate the cumulative distance in meters along a path of three points:
+
+ >>> cumulative_distance(np.array([0, 1, 2]), np.array([0, 1, 2]))
+ array([ 0. , 157424.62387233, 314825.27182116])
+ """
+    return np.cumsum(
+        np.concatenate(
+            (
+                [0],
+                distance(longitude[0:-1], latitude[0:-1], longitude[1:], latitude[1:]),
+            )
+        )
+    )
+
+
+
+
+def distance(
+    lon1: Union[float, list, np.ndarray, xr.DataArray],
+    lat1: Union[float, list, np.ndarray, xr.DataArray],
+    lon2: Union[float, list, np.ndarray, xr.DataArray],
+    lat2: Union[float, list, np.ndarray, xr.DataArray],
+) -> Union[float, np.ndarray]:
+"""Return elementwise great circle distance in meters between one or more
+ points from arrays of their latitudes and longitudes, using the Haversine
+ formula.
+
+ d = 2⋅r⋅asin √[sin²(Δφ/2) + cos φ1 ⋅ cos φ2 ⋅ sin²(Δλ/2)]
+
+ where (φ, λ) is (lat, lon) in radians and r is the radius of the sphere in
+ meters.
+
+ Parameters
+ ----------
+ lon1 : np.ndarray
+ Longitudes of the first set of points, in degrees
+ lat1 : np.ndarray
+ Latitudes of the first set of points, in degrees
+ lon2 : np.ndarray
+ Longitudes of the second set of points, in degrees
+ lat2 : np.ndarray
+ Latitudes of the second set of points, in degrees
+
+ Returns
+ -------
+ out : np.ndarray
+ Great circle distance
+
+ Examples
+ --------
+ Calculate the distance of one degree longitude on the equator:
+
+ >>> distance(0, 0, 0, 1)
+ 111318.84502145034
+
+ Calculate the distance of one degree longitude at 45-degrees North latitude:
+
+ >>> distance(0, 45, 1, 45)
+ 78713.81064540472
+
+ You can also pass array-like inputs to calculate an array of distances:
+
+ >>> distance([0, 0], [0, 45], [0, 1], [1, 45])
+ array([111318.84502145, 78713.8106454 ])
+ """
+
+    # Input coordinates are in degrees; convert to radians.
+    # If any of the input arrays are xr.DataArray, extract the values first
+    # because Xarray enforces alignment between coordinates.
+    if type(lat1) is xr.DataArray:
+        lat1_rad = np.deg2rad(lat1.values)
+    else:
+        lat1_rad = np.deg2rad(lat1)
+    if type(lon1) is xr.DataArray:
+        lon1_rad = np.deg2rad(lon1.values)
+    else:
+        lon1_rad = np.deg2rad(lon1)
+    if type(lat2) is xr.DataArray:
+        lat2_rad = np.deg2rad(lat2.values)
+    else:
+        lat2_rad = np.deg2rad(lat2)
+    if type(lon2) is xr.DataArray:
+        lon2_rad = np.deg2rad(lon2.values)
+    else:
+        lon2_rad = np.deg2rad(lon2)
+
+    dlat = lat2_rad - lat1_rad
+    dlon = lon2_rad - lon1_rad
+
+    h = (
+        np.sin(0.5 * dlat) ** 2
+        + np.cos(lat1_rad) * np.cos(lat2_rad) * np.sin(0.5 * dlon) ** 2
+    )
+
+    return 2 * np.arcsin(np.sqrt(h)) * EARTH_RADIUS_METERS
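The Haversine steps can be reproduced in a few lines; for one degree of longitude on the equator the arc reduces to R·pi/180, about 111.319 km (using EARTH_RADIUS_METERS = 6.3781e6 as stated in the docstrings of this module):

```python
import numpy as np

R = 6.3781e6  # EARTH_RADIUS_METERS per the docstrings in this module
lon1, lat1, lon2, lat2 = map(np.deg2rad, (0.0, 0.0, 1.0, 0.0))

dlat, dlon = lat2 - lat1, lon2 - lon1
h = np.sin(0.5 * dlat) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(0.5 * dlon) ** 2
d = 2 * np.arcsin(np.sqrt(h)) * R
# on the equator this equals R * pi / 180 (one degree of arc)
```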
+
+
+
+
+def bearing(
+    lon1: Union[float, list, np.ndarray, xr.DataArray],
+    lat1: Union[float, list, np.ndarray, xr.DataArray],
+    lon2: Union[float, list, np.ndarray, xr.DataArray],
+    lat2: Union[float, list, np.ndarray, xr.DataArray],
+) -> Union[float, np.ndarray]:
+"""Return elementwise initial (forward) bearing in radians from arrays of
+ latitude and longitude in degrees, based on the spherical law of cosines.
+
+ The formula is:
+
+ θ = atan2(cos φ1 ⋅ sin φ2 - sin φ1 ⋅ cos φ2 ⋅ cos Δλ, sin Δλ ⋅ cos φ2)
+
+ where (φ, λ) is (lat, lon) and θ is bearing, all in radians.
+ Bearing is defined as zero toward East and positive counterclockwise.
+
+ Parameters
+ ----------
+ lon1 : float or array-like
+ Longitudes of the first set of points, in degrees
+ lat1 : float or array-like
+ Latitudes of the first set of points, in degrees
+ lon2 : float or array-like
+ Longitudes of the second set of points, in degrees
+ lat2 : float or array-like
+ Latitudes of the second set of points, in degrees
+
+ Returns
+ -------
+ theta : float or np.ndarray
+ Bearing angles in radians
+
+ Examples
+ --------
+ Calculate the bearing of one degree longitude on the equator:
+
+ >>> bearing(0, 0, 1, 0)
+ 0.0
+
+ Calculate the bearing of 10 degrees longitude at 45-degrees North latitude:
+
+ >>> bearing(0, 45, 10, 45)
+ 0.06178508761798218
+ """
+    # Input coordinates are in degrees; convert to radians.
+    # If any of the input arrays are xr.DataArray, extract the values first
+    # because Xarray enforces alignment between coordinates.
+    if type(lat1) is xr.DataArray:
+        lat1_rad = np.deg2rad(lat1.values)
+    else:
+        lat1_rad = np.deg2rad(lat1)
+    if type(lon1) is xr.DataArray:
+        lon1_rad = np.deg2rad(lon1.values)
+    else:
+        lon1_rad = np.deg2rad(lon1)
+    if type(lat2) is xr.DataArray:
+        lat2_rad = np.deg2rad(lat2.values)
+    else:
+        lat2_rad = np.deg2rad(lat2)
+    if type(lon2) is xr.DataArray:
+        lon2_rad = np.deg2rad(lon2.values)
+    else:
+        lon2_rad = np.deg2rad(lon2)
+
+    dlon = lon2_rad - lon1_rad
+
+    theta = np.arctan2(
+        np.cos(lat1_rad) * np.sin(lat2_rad)
+        - np.sin(lat1_rad) * np.cos(lat2_rad) * np.cos(dlon),
+        np.sin(dlon) * np.cos(lat2_rad),
+    )
+
+    return theta
+
+
+
+
+def position_from_distance_and_bearing(
+    lon: float, lat: float, distance: float, bearing: float
+) -> Tuple[float, float]:
+"""Return elementwise new position in degrees from arrays of latitude and
+ longitude in degrees, distance in meters, and bearing in radians, based on
+ the spherical law of cosines.
+
+ The formula is:
+
+ φ2 = asin( sin φ1 ⋅ cos δ + cos φ1 ⋅ sin δ ⋅ sin θ )
+ λ2 = λ1 + atan2( cos θ ⋅ sin δ ⋅ cos φ1, cos δ − sin φ1 ⋅ sin φ2 )
+
+ where (φ, λ) is (lat, lon) and θ is bearing, all in radians.
+ Bearing is defined as zero toward East and positive counterclockwise.
+
+ Parameters
+ ----------
+ lon : float
+ Longitude of the first set of points, in degrees
+ lat : float
+ Latitude of the first set of points, in degrees
+ distance : array_like
+ Distance in meters
+ bearing : array_like
+ Bearing angles in radians
+
+ Returns
+ -------
+ lon2 : array_like
+ Longitudes of the second set of points, in degrees, in the range [-180, 180]
+ lat2 : array_like
+ Latitudes of the second set of points, in degrees, in the range [-90, 90]
+
+ Examples
+ --------
+ Calculate the position of one degree longitude distance on the equator:
+
+ >>> position_from_distance_and_bearing(0, 0, 111318.84502145034, 0)
+ (1.0, 0.0)
+
+ Calculate the position of one degree latitude distance from 45 degrees North latitude:
+
+ >>> position_from_distance_and_bearing(0, 45, 111318.84502145034, np.pi / 2)
+ (8.81429402840006e-17, 45.99999999999999)
+ """
+    lat_rad = np.deg2rad(lat)
+    lon_rad = np.deg2rad(lon)
+
+    distance_rad = distance / EARTH_RADIUS_METERS
+
+    lat2_rad = np.arcsin(
+        np.sin(lat_rad) * np.cos(distance_rad)
+        + np.cos(lat_rad) * np.sin(distance_rad) * np.sin(bearing)
+    )
+    lon2_rad = lon_rad + np.arctan2(
+        np.cos(bearing) * np.sin(distance_rad) * np.cos(lat_rad),
+        np.cos(distance_rad) - np.sin(lat_rad) * np.sin(lat2_rad),
+    )
+
+    return np.rad2deg(lon2_rad), np.rad2deg(lat2_rad)
+
+
+
+
+def recast_lon(lon: np.ndarray, lon0: Optional[float] = -180) -> np.ndarray:
+"""Recast (convert) longitude values to a selected range of 360 degrees
+ starting from ``lon0``.
+
+ Parameters
+ ----------
+ lon : np.ndarray or float
+ An N-d array of longitudes in degrees
+ lon0 : float, optional
+ Starting longitude of the recasted range (default -180).
+
+ Returns
+ -------
+ np.ndarray or float
+ Converted longitudes in the range `[lon0, lon0+360[`
+
+ Examples
+ --------
+ By default, ``recast_lon`` converts longitude values to the range
+ `[-180, 180[`:
+
+ >>> recast_lon(200)
+ -160
+
+ >>> recast_lon(180)
+ -180
+
+ The range of the output longitude is controlled by ``lon0``.
+ For example, with ``lon0 = 0``, the longitude values are converted to the
+ range `[0, 360[`.
+
+ >>> recast_lon(200, 0)
+ 200
+
+ With ``lon0 = 20``, longitude values are converted to the range `[20, 380[`,
+ which can be useful to avoid cutting the major ocean basins.
+
+ >>> recast_lon(10, 20)
+ 370
+
+ See Also
+ --------
+ :func:`recast_lon360`, :func:`recast_lon180`
+ """
+    return np.mod(lon - lon0, 360) + lon0
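The one-line modular formula above can be exercised standalone (a re-implementation for illustration, not the clouddrift function itself):

```python
import numpy as np

def recast(lon, lon0=-180.0):
    # shift into [0, 360[ relative to lon0, then shift back
    return np.mod(lon - lon0, 360) + lon0

r1 = recast(200)      # -160 in the default range [-180, 180[
r2 = recast(200, 0)   # 200 in the range [0, 360[
r3 = recast(10, 20)   # 370 in the range [20, 380[
```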
+
+
+
+
+def recast_lon360(lon: np.ndarray) -> np.ndarray:
+"""Recast (convert) longitude values to the range `[0, 360[`.
+ This is a convenience wrapper around :func:`recast_lon` with ``lon0 = 0``.
+
+ Parameters
+ ----------
+ lon : np.ndarray
+ An N-d array of longitudes in degrees
+
+ Returns
+ -------
+ np.ndarray
+ Converted longitudes in the range `[0, 360[`
+
+ Examples
+ --------
+ >>> recast_lon360(200)
+ 200
+
+ >>> recast_lon360(-200)
+ 160
+
+ See Also
+ --------
+ :func:`recast_lon`, :func:`recast_lon180`
+ """
+    return recast_lon(lon, 0)
+
+
+
+
+def recast_lon180(lon: np.ndarray) -> np.ndarray:
+"""Recast (convert) longitude values to the range `[-180, 180[`.
+ This is a convenience wrapper around :func:`recast_lon` with ``lon0 = -180``.
+
+ Parameters
+ ----------
+ lon : np.ndarray
+ An N-d array of longitudes in degrees
+
+ Returns
+ -------
+ np.ndarray
+ Converted longitudes in the range `[-180, 180[`
+
+ Examples
+ --------
+ >>> recast_lon180(200)
+ -160
+
+ >>> recast_lon180(-200)
+ 160
+
+ See Also
+ --------
+ :func:`recast_lon`, :func:`recast_lon360`
+ """
+    return recast_lon(lon, -180)
+
+
+
+
+def plane_to_sphere(
+    x: np.ndarray, y: np.ndarray, lon_origin: float = 0, lat_origin: float = 0
+) -> Tuple[np.ndarray, np.ndarray]:
+"""Convert Cartesian coordinates on a plane to spherical coordinates.
+
+ The arrays of input zonal and meridional displacements ``x`` and ``y`` are
+ assumed to follow a contiguous trajectory. The spherical coordinate of each
+ successive point is determined by following a great circle path from the
+ previous point. The spherical coordinate of the first point is determined by
+ following a great circle path from the origin, by default (0, 0).
+
+ The output arrays have the same floating-point type as the input.
+
+ If projecting multiple trajectories onto the same plane, use
+ :func:`apply_ragged` for highest accuracy.
+
+ Parameters
+ ----------
+ x : np.ndarray
+ An N-d array of zonal displacements in meters
+ y : np.ndarray
+ An N-d array of meridional displacements in meters
+ lon_origin : float, optional
+ Origin longitude of the tangent plane in degrees, default 0
+ lat_origin : float, optional
+ Origin latitude of the tangent plane in degrees, default 0
+
+ Returns
+ -------
+ lon : np.ndarray
+ Longitude in degrees
+ lat : np.ndarray
+ Latitude in degrees
+
+ Examples
+ --------
+ >>> plane_to_sphere(np.array([0., 0.]), np.array([0., 1000.]))
+ (array([0.00000000e+00, 5.50062664e-19]), array([0. , 0.0089832]))
+
+ You can also specify an origin longitude and latitude:
+
+ >>> plane_to_sphere(np.array([0., 0.]), np.array([0., 1000.]), lon_origin=1, lat_origin=0)
+ (array([1., 1.]), array([0. , 0.0089832]))
+
+ Raises
+ ------
+ AttributeError
+ If ``x`` and ``y`` are not NumPy arrays
+
+ See Also
+ --------
+ :func:`sphere_to_plane`
+ """
+    lon = np.empty_like(x)
+    lat = np.empty_like(y)
+
+    # Cartesian distances between each point
+    dx = np.diff(x, prepend=0)
+    dy = np.diff(y, prepend=0)
+
+    distances = np.sqrt(dx**2 + dy**2)
+    bearings = np.arctan2(dy, dx)
+
+    # Compute spherical coordinates following great circles between each
+    # successive point.
+    lon[..., 0], lat[..., 0] = position_from_distance_and_bearing(
+        lon_origin, lat_origin, distances[..., 0], bearings[..., 0]
+    )
+    for n in range(1, lon.shape[-1]):
+        lon[..., n], lat[..., n] = position_from_distance_and_bearing(
+            lon[..., n - 1], lat[..., n - 1], distances[..., n], bearings[..., n]
+        )
+
+    return lon, lat
+
+
+
+
+def sphere_to_plane(
+    lon: np.ndarray, lat: np.ndarray, lon_origin: float = 0, lat_origin: float = 0
+) -> Tuple[np.ndarray, np.ndarray]:
+"""Convert spherical coordinates to a tangent (Cartesian) plane.
+
+ The arrays of input longitudes and latitudes are assumed to be following
+ a contiguous trajectory. The Cartesian coordinate of each successive point
+ is determined by following a great circle path from the previous point.
+ The Cartesian coordinate of the first point is determined by following a
+ great circle path from the origin, by default (0, 0).
+
+ The output arrays have the same floating-point type as the input.
+
+ If projecting multiple trajectories onto the same plane, use
+ :func:`apply_ragged` for highest accuracy.
+
+ Parameters
+ ----------
+ lon : np.ndarray
+ An N-d array of longitudes in degrees
+ lat : np.ndarray
+ An N-d array of latitudes in degrees
+ lon_origin : float, optional
+ Origin longitude of the tangent plane in degrees, default 0
+ lat_origin : float, optional
+ Origin latitude of the tangent plane in degrees, default 0
+
+ Returns
+ -------
+ x : np.ndarray
+ x-coordinates on the tangent plane
+ y : np.ndarray
+ y-coordinates on the tangent plane
+
+ Examples
+ --------
+ >>> sphere_to_plane(np.array([0., 1.]), np.array([0., 0.]))
+ (array([ 0. , 111318.84502145]), array([0., 0.]))
+
+ You can also specify an origin longitude and latitude:
+
+ >>> sphere_to_plane(np.array([0., 1.]), np.array([0., 0.]), lon_origin=1, lat_origin=0)
+ (array([-111318.84502145, 0. ]),
+ array([1.36326267e-11, 1.36326267e-11]))
+
+ Raises
+ ------
+ AttributeError
+ If ``lon`` and ``lat`` are not NumPy arrays
+
+ See Also
+ --------
+ :func:`plane_to_sphere`
+ """
+    x = np.empty_like(lon)
+    y = np.empty_like(lat)
+
+    distances = np.empty_like(x)
+    bearings = np.empty_like(x)
+
+    # Distance and bearing of the starting point relative to the origin
+    distances[..., 0] = distance(lon_origin, lat_origin, lon[..., 0], lat[..., 0])
+    bearings[..., 0] = bearing(lon_origin, lat_origin, lon[..., 0], lat[..., 0])
+
+    # Distance and bearing of the remaining points
+    distances[..., 1:] = distance(lon[..., :-1], lat[..., :-1], lon[..., 1:], lat[..., 1:])
+    bearings[..., 1:] = bearing(lon[..., :-1], lat[..., :-1], lon[..., 1:], lat[..., 1:])
+
+    dx = distances * np.cos(bearings)
+    dy = distances * np.sin(bearings)
+
+    x[..., :] = np.cumsum(dx, axis=-1)
+    y[..., :] = np.cumsum(dy, axis=-1)
+
+    return x, y
+
+
+
+
+def spherical_to_cartesian(
+    lon: Union[float, list, np.ndarray, xr.DataArray],
+    lat: Union[float, list, np.ndarray, xr.DataArray],
+    radius: Optional[float] = EARTH_RADIUS_METERS,
+) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
+"""Converts latitude and longitude on a spherical body to
+ three-dimensional Cartesian coordinates.
+
+ The Cartesian coordinate system is a right-handed system whose
+ origin lies at the center of a sphere. It is oriented with the
+ Z-axis passing through the poles and the X-axis passing through
+ the point lon = 0, lat = 0. This function is inverted by
+ :func:`cartesian_to_spherical`.
+
+ Parameters
+ ----------
+ lon : array-like
+ An N-d array of longitudes in degrees.
+ lat : array-like
+ An N-d array of latitudes in degrees.
+ radius: float, optional
+ The radius of the spherical body in meters. The default assumes the Earth with
+ EARTH_RADIUS_METERS = 6.3781e6.
+
+ Returns
+ -------
+ x : float or array-like
+ x-coordinates in 3D in meters.
+ y : float or array-like
+ y-coordinates in 3D in meters.
+ z : float or array-like
+ z-coordinates in 3D in meters.
+
+ Examples
+ --------
+ >>> spherical_to_cartesian(np.array([0, 45]), np.array([0, 45]))
+ (array([6378100., 3189050.]),
+ array([ 0., 3189050.]),
+ array([ 0. , 4509997.76108592]))
+
+ >>> spherical_to_cartesian(np.array([0, 45, 90]), np.array([0, 90, 180]), radius=1)
+ (array([ 1.00000000e+00, 4.32978028e-17, -6.12323400e-17]),
+ array([ 0.00000000e+00, 4.32978028e-17, -1.00000000e+00]),
+ array([0.0000000e+00, 1.0000000e+00, 1.2246468e-16]))
+
+ >>> x, y, z = spherical_to_cartesian(np.array([0, 5]), np.array([0, 5]))
+
+ Raises
+ ------
+ AttributeError
+ If ``lon`` and ``lat`` are not NumPy arrays.
+
+ See Also
+ --------
+ :func:`cartesian_to_spherical`
+ """
+    lonr, latr = np.deg2rad(lon), np.deg2rad(lat)
+
+    x = radius * np.cos(latr) * np.cos(lonr)
+    y = radius * np.cos(latr) * np.sin(lonr)
+    z = radius * np.sin(latr)
+
+    return x, y, z
+
+
+
+
+def cartesian_to_spherical(
+    x: Union[float, np.ndarray, xr.DataArray],
+    y: Union[float, np.ndarray, xr.DataArray],
+    z: Union[float, np.ndarray, xr.DataArray],
+) -> Tuple[np.ndarray, np.ndarray]:
+"""Converts Cartesian three-dimensional coordinates to latitude and longitude on a
+ spherical body.
+
+ The Cartesian coordinate system is a right-handed system whose
+ origin lies at the center of the sphere. It is oriented with the
+ Z-axis passing through the poles and the X-axis passing through
+ the point lon = 0, lat = 0. This function is inverted by `spherical_to_cartesian`.
+
+ Parameters
+ ----------
+ x : float or array-like
+ x-coordinates in 3D.
+ y : float or array-like
+ y-coordinates in 3D.
+ z : float or array-like
+ z-coordinates in 3D.
+
+ Returns
+ -------
+ lon : float or array-like
+ An N-d array of longitudes in degrees in range [-180, 180].
+ lat : float or array-like
+ An N-d array of latitudes in degrees.
+
+ Examples
+ --------
+ >>> x = EARTH_RADIUS_METERS * np.cos(np.deg2rad(45))
+ >>> y = EARTH_RADIUS_METERS * np.cos(np.deg2rad(45))
+ >>> z = 0 * x
+ >>> cartesian_to_spherical(x, y, z)
+ (44.99999999999985, 0.0)
+
+ ``cartesian_to_spherical`` is inverted by ``spherical_to_cartesian``:
+
+ >>> x, y, z = spherical_to_cartesian(np.array([45]), np.array(0))
+ >>> cartesian_to_spherical(x, y, z)
+ (array([45.]), array([0.]))
+
+ Raises
+ ------
+ AttributeError
+ If ``x``, ``y``, and ``z`` are not NumPy arrays.
+
+ See Also
+ --------
+ :func:`spherical_to_cartesian`
+ """
+
+    R = np.sqrt(x**2 + y**2 + z**2)
+    x /= R
+    y /= R
+    z /= R
+
+    with np.errstate(divide="ignore"):
+        lon = np.where(
+            np.logical_and(x == 0, y == 0),
+            0,
+            recast_lon180(np.rad2deg(np.imag(np.log(x + 1j * y)))),
+        )
+    lat = np.rad2deg(np.arcsin(z))
+
+    return lon, lat
+
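As a quick sanity check, the two mappings invert each other. Below is a minimal self-contained sketch, not the module's API: the helper names `sph2cart` and `cart2sph` are hypothetical, the radius is an assumed value of `EARTH_RADIUS_METERS`, and the longitude is recovered with `np.arctan2` instead of the complex-logarithm trick used above (the two are equivalent):

```python
import numpy as np

EARTH_RADIUS_METERS = 6.3781e6  # assumed value of the module constant

def sph2cart(lon, lat, radius=EARTH_RADIUS_METERS):
    # forward map: degrees on the sphere -> 3D Cartesian coordinates in meters
    lonr, latr = np.deg2rad(lon), np.deg2rad(lat)
    x = radius * np.cos(latr) * np.cos(lonr)
    y = radius * np.cos(latr) * np.sin(lonr)
    z = radius * np.sin(latr)
    return x, y, z

def cart2sph(x, y, z):
    # inverse map: normalize to the unit sphere, then recover the angles
    r = np.sqrt(x**2 + y**2 + z**2)
    lat = np.rad2deg(np.arcsin(z / r))
    lon = np.rad2deg(np.arctan2(y, x))  # lands in [-180, 180] by construction
    return lon, lat

lon = np.array([10.0, -75.0])
lat = np.array([45.0, -30.0])
lon2, lat2 = cart2sph(*sph2cart(lon, lat))
# the round trip recovers the original coordinates
```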
+
+
+
+def cartesian_to_tangentplane(
+    u: Union[float, np.ndarray],
+    v: Union[float, np.ndarray],
+    w: Union[float, np.ndarray],
+    longitude: Union[float, np.ndarray],
+    latitude: Union[float, np.ndarray],
+) -> Union[Tuple[float], Tuple[np.ndarray]]:
+"""
+ Project a three-dimensional Cartesian vector on a plane tangent to
+ a spherical Earth.
+
+ The Cartesian coordinate system is a right-handed system whose
+ origin lies at the center of a sphere. It is oriented with the
+ Z-axis passing through the north pole at lat = 90, the X-axis passing through
+ the point lon = 0, lat = 0, and the Y-axis passing through the point lon = 90,
+ lat = 0.
+
+ Parameters
+ ----------
+ u : float or np.ndarray
+ First component of Cartesian vector.
+ v : float or np.ndarray
+ Second component of Cartesian vector.
+ w : float or np.ndarray
+ Third component of Cartesian vector.
+ longitude : float or np.ndarray
+ Longitude in degrees of tangent point of plane.
+ latitude : float or np.ndarray
+ Latitude in degrees of tangent point of plane.
+
+ Returns
+ -------
+ up: float or np.ndarray
+ First component of projected vector on tangent plane (positive eastward).
+ vp: float or np.ndarray
+ Second component of projected vector on tangent plane (positive northward).
+
+ Raises
+ ------
+ Warning
+ Raised if the input latitude is not in the expected range [-90, 90].
+
+ Examples
+ --------
+ >>> u, v = cartesian_to_tangentplane(1, 1, 1, 45, 90)
+
+ See Also
+ --------
+ :func:`tangentplane_to_cartesian`
+ """
+    if np.any(latitude < -90) or np.any(latitude > 90):
+        warnings.warn("Input latitude outside of range [-90,90].")
+
+    phi = np.radians(latitude)
+    theta = np.radians(longitude)
+    u_projected = v * np.cos(theta) - u * np.sin(theta)
+    v_projected = (
+        w * np.cos(phi)
+        - u * np.cos(theta) * np.sin(phi)
+        - v * np.sin(theta) * np.sin(phi)
+    )
+    # JML says vh = w.*cos(phi)-u.*cos(theta).*sin(phi)-v.*sin(theta).*sin(phi) but vh=w./cos(phi) is the same
+    return u_projected, v_projected
+
+
+
+
+def tangentplane_to_cartesian(
+    up: Union[float, np.ndarray],
+    vp: Union[float, np.ndarray],
+    longitude: Union[float, np.ndarray],
+    latitude: Union[float, np.ndarray],
+) -> Union[Tuple[float], Tuple[np.ndarray]]:
+"""
+ Return the three-dimensional Cartesian components of a vector contained in
+ a plane tangent to a spherical Earth.
+
+ The Cartesian coordinate system is a right-handed system whose
+ origin lies at the center of a sphere. It is oriented with the
+ Z-axis passing through the north pole at lat = 90, the X-axis passing through
+ the point lon = 0, lat = 0, and the Y-axis passing through the point lon = 90,
+ lat = 0.
+
+ Parameters
+ ----------
+ up: float or np.ndarray
+ First component of vector on tangent plane (positive eastward).
+ vp: float or np.ndarray
+ Second component of vector on tangent plane (positive northward).
+ longitude : float or np.ndarray
+ Longitude in degrees of tangent point of plane.
+ latitude : float or np.ndarray
+ Latitude in degrees of tangent point of plane.
+
+ Returns
+ -------
+ u : float or np.ndarray
+ First component of Cartesian vector.
+ v : float or np.ndarray
+ Second component of Cartesian vector.
+ w : float or np.ndarray
+ Third component of Cartesian vector.
+
+ Examples
+ --------
+ >>> u, v, w = tangentplane_to_cartesian(1, 1, 45, 90)
+
+ Notes
+ -----
+ This function is inverted by :func:`cartesian_to_tangentplane`.
+
+ See Also
+ --------
+ :func:`cartesian_to_tangentplane`
+ """
+    phi = np.radians(latitude)
+    theta = np.radians(longitude)
+    u = -up * np.sin(theta) - vp * np.sin(phi) * np.cos(theta)
+    v = up * np.cos(theta) - vp * np.sin(phi) * np.sin(theta)
+    w = vp * np.cos(phi)
+
+    return u, v, w
+
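The two tangent-plane functions are also mutual inverses: embedding a tangent-plane vector in 3D and projecting it back recovers the original components. A self-contained sketch of the same trigonometric formulas (the helper names `to_cartesian` and `to_tangentplane` are hypothetical, not part of the module):

```python
import numpy as np

def to_cartesian(up, vp, lon, lat):
    # tangent-plane (east, north) components -> 3D Cartesian components
    phi, theta = np.radians(lat), np.radians(lon)
    u = -up * np.sin(theta) - vp * np.sin(phi) * np.cos(theta)
    v = up * np.cos(theta) - vp * np.sin(phi) * np.sin(theta)
    w = vp * np.cos(phi)
    return u, v, w

def to_tangentplane(u, v, w, lon, lat):
    # project a 3D Cartesian vector back onto the tangent plane
    phi, theta = np.radians(lat), np.radians(lon)
    up = v * np.cos(theta) - u * np.sin(theta)
    vp = (
        w * np.cos(phi)
        - u * np.cos(theta) * np.sin(phi)
        - v * np.sin(theta) * np.sin(phi)
    )
    return up, vp

up, vp = 1.0, 2.0
u, v, w = to_cartesian(up, vp, lon=30.0, lat=45.0)
up2, vp2 = to_tangentplane(u, v, w, lon=30.0, lat=45.0)
# the projection recovers the original tangent-plane components
```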
+
+
+
+def coriolis_frequency(
+    latitude: Union[float, np.ndarray],
+) -> Union[float, np.ndarray]:
+"""
+ Return the Coriolis frequency, commonly known as the `f` parameter in geophysical fluid dynamics.
+
+ Parameters
+ ----------
+ latitude : float or np.ndarray
+ Latitude in degrees.
+
+ Returns
+ -------
+ f : float or np.ndarray
+ Signed Coriolis frequency in radians per second.
+
+ Examples
+ --------
+ >>> f = coriolis_frequency(np.array([0, 45, 90]))
+ """
+    f = 2 * EARTH_ROTATION_RATE * np.sin(np.radians(latitude))
+
+    return f
+"""
+This module provides functions for computing wavelet transforms and time-frequency analyses,
+notably using generalized Morse wavelets.
+
+The Python code in this module was translated from the MATLAB implementation
+by J. M. Lilly in the jWavelet module of jLab (http://jmlilly.net/code.html).
+
+Lilly, J. M. (2021), jLab: A data analysis package for Matlab, v.1.7.1,
+doi:10.5281/zenodo.4547006, http://www.jmlilly.net/software.
+
+jLab is licensed under the Creative Commons Attribution-Noncommercial-ShareAlike
+License (https://creativecommons.org/licenses/by-nc-sa/4.0/). The code that is
+directly translated from jLab/jWavelet is licensed under the same license.
+Any other code that is added to this module and that is specific to Python and
+not the MATLAB implementation is licensed under CloudDrift's MIT license.
+"""
+
+import numpy as np
+from typing import Optional, Tuple, Union
+from scipy.special import gamma as _gamma, gammaln as _lgamma
+
+
+
+def morse_wavelet_transform(
+    x: np.ndarray,
+    gamma: float,
+    beta: float,
+    radian_frequency: np.ndarray,
+    complex: Optional[bool] = False,
+    order: Optional[int] = 1,
+    normalization: Optional[str] = "bandpass",
+    boundary: Optional[str] = "mirror",
+    time_axis: Optional[int] = -1,
+) -> Union[Tuple[np.ndarray], np.ndarray]:
+"""
+ Apply a continuous wavelet transform to an input signal using the generalized Morse
+ wavelets of Olhede and Walden (2002). The wavelet transform is normalized differently
+ for complex-valued input than for real-valued input, and this in turn depends on whether the
+ optional argument ``normalization`` is set to ``"bandpass"`` or ``"energy"``.
+
+ Parameters
+ ----------
+ x : np.ndarray
+ Real- or complex-valued signals. The time axis is assumed to be the last. If not, specify optional
+ argument `time_axis`.
+ gamma : float
+ Gamma parameter of the Morse wavelets.
+ beta : float
+ Beta parameter of the Morse wavelets.
+ radian_frequency : np.ndarray
+ An array of radian frequencies at which the Fourier transform of the wavelets
+ reach their maximum amplitudes. ``radian_frequency`` is typically between 0 and 2 * np.pi * 0.5,
+ the normalized Nyquist radian frequency.
+ complex : boolean, optional
+ Specify explicitly whether the input signal ``x`` is complex. Default is False, which
+ means the input is assumed to be real, but this is not explicitly checked by the function.
+ This choice affects the normalization of the outputs and their interpretation.
+ See examples below.
+ time_axis : int, optional
+ Axis on which the time is defined for input ``x`` (default is last, or -1).
+ normalization : str, optional
+ Normalization for the wavelet transforms. By default it is assumed to be
+ ``"bandpass"`` which uses a bandpass normalization, meaning that the FFT
+ of the wavelets have peak value of 2 for all central frequencies
+ ``radian_frequency``. However, if the optional argument ``complex=True``
+ is specified, the wavelets will be divided by 2 so that the total
+ variance of the input complex signal is equal to the sum of the
+ variances of the returned analytic (positive) and conjugate analytic
+ (negative) parts. See examples below. The other option is ``"energy"``
+ which uses the unit energy normalization. In this last case, the
+ time-domain wavelet energies ``np.sum(np.abs(wave)**2)`` are always
+ unity.
+ boundary : str, optional
+ The boundary condition to be imposed at the edges of the input signal ``x``.
+ Allowed values are ``"mirror"``, ``"zeros"``, and ``"periodic"``. Default is ``"mirror"``.
+ order : int, optional
+ Order of Morse wavelets, default is 1.
+
+ Returns
+ -------
+ If the input signal is real as specified by ``complex=False``:
+
+ wtx : np.ndarray
+ Time-domain wavelet transform of input ``x`` with shape ((x shape without time_axis), orders, frequencies, time_axis)
+ but with dimensions of length 1 removed (squeezed).
+
+ If the input signal is complex as specified by ``complex=True``, a tuple is returned:
+
+ wtx_p : np.array
+ Time-domain positive wavelet transform of input ``x`` with shape ((x shape without time_axis), frequencies, orders),
+ but with dimensions of length 1 removed (squeezed).
+ wtx_n : np.array
+ Time-domain negative wavelet transform of input ``x`` with shape ((x shape without time_axis), frequencies, orders),
+ but with dimensions of length 1 removed (squeezed).
+
+ Examples
+ --------
+ Apply a wavelet transform with a Morse wavelet with gamma parameter 3, beta parameter 4,
+ at radian frequency 0.2 cycles per unit time:
+
+ >>> x = np.random.random(1024)
+ >>> wtx = morse_wavelet_transform(x, 3, 4, np.array([2*np.pi*0.2]))
+
+ Apply a wavelet transform with a Morse wavelet with gamma parameter 3, beta parameter 4,
+ for a complex input signal at radian frequency 0.2 cycles per unit time. This case returns the
+ analytic and conjugate analytic components:
+
+ >>> z = np.random.random(1024) + 1j*np.random.random(1024)
+ >>> wtz_p, wtz_n = morse_wavelet_transform(z, 3, 4, np.array([2*np.pi*0.2]), complex=True)
+
+ The same result as above can be obtained by applying the Morse transform to the real and
+ imaginary components of z and recombining the results as follows for the "bandpass" normalization:
+
+ >>> wtz_real = morse_wavelet_transform(np.real(z), 3, 4, np.array([2*np.pi*0.2]))
+ >>> wtz_imag = morse_wavelet_transform(np.imag(z), 3, 4, np.array([2*np.pi*0.2]))
+ >>> wtz_p, wtz_n = (wtz_real + 1j*wtz_imag) / 2, (wtz_real - 1j*wtz_imag) / 2
+
+ For the "energy" normalization, the analytic and conjugate analytic components are obtained as follows
+ with this alternative method:
+ >>> wtz_real = morse_wavelet_transform(np.real(z), 3, 4, np.array([2*np.pi*0.2]))
+ >>> wtz_imag = morse_wavelet_transform(np.imag(z), 3, 4, np.array([2*np.pi*0.2]))
+ >>> wtz_p, wtz_n = (wtz_real + 1j*wtz_imag) / np.sqrt(2), (wtz_real - 1j*wtz_imag) / np.sqrt(2)
+
+ The input signal can have an arbitrary number of dimensions but its ``time_axis`` must be
+ specified if it is not the last:
+
+ >>> x = np.random.random((1024,10,15))
+ >>> wtx = morse_wavelet_transform(x, 3, 4, np.array([2*np.pi*0.2]), time_axis=0)
+
+ The default way to handle the boundary conditions is to mirror the ends points
+ but this can be changed by specifying the chosen boundary method:
+
+ >>> x = np.random.random((10,15,1024))
+ >>> wtx = morse_wavelet_transform(x, 3, 4, np.array([2*np.pi*0.2]), boundary="periodic")
+
+ This function can be used to conduct a time-frequency analysis of the input signal by specifying
+ a range of radian frequencies using the ``morse_logspace_freq`` function as an example:
+
+ >>> x = np.random.random(1024)
+ >>> gamma = 3
+ >>> beta = 4
+ >>> radian_frequency = morse_logspace_freq(gamma, beta, np.shape(x)[0])
+ >>> wtx = morse_wavelet_transform(x, gamma, beta, radian_frequency)
+
+ Raises
+ ------
+ ValueError
+ If the time axis is outside of the valid range ([-1, np.ndim(x)-1]).
+ If the boundary optional argument is not in ["mirror", "zeros", "periodic"].
+ If the normalization optional argument is not in ["bandpass", "energy"].
+
+ See Also
+ --------
+ :func:`morse_wavelet`, :func:`wavelet_transform`, :func:`morse_logspace_freq`
+
+ """
+    # time_axis must be in valid range
+    if time_axis < -1 or time_axis > len(x.shape) - 1:
+        raise ValueError(
+            f"time_axis ({time_axis}) is outside of the valid range ([-1,"
+            f" {len(x.shape) - 1}])."
+        )
+    # generate the wavelet
+    wavelet, _ = morse_wavelet(
+        np.shape(x)[time_axis],
+        gamma,
+        beta,
+        radian_frequency,
+        normalization=normalization,
+        order=order,
+    )
+
+    # apply the wavelet transform, distinguish complex and real cases
+    if complex:
+        # complex case: divide the signal by 2 and return analytic and conjugate analytic parts
+        if normalization == "bandpass":
+            wtx_p = wavelet_transform(
+                0.5 * x, wavelet, boundary="mirror", time_axis=time_axis
+            )
+            wtx_n = wavelet_transform(
+                np.conj(0.5 * x), wavelet, boundary="mirror", time_axis=time_axis
+            )
+        elif normalization == "energy":
+            wtx_p = wavelet_transform(
+                x / np.sqrt(2), wavelet, boundary="mirror", time_axis=time_axis
+            )
+            wtx_n = wavelet_transform(
+                np.conj(x / np.sqrt(2)), wavelet, boundary="mirror", time_axis=time_axis
+            )
+        wtx = wtx_p, wtx_n
+
+    elif not complex:
+        # real case
+        wtx = wavelet_transform(x, wavelet, boundary=boundary, time_axis=time_axis)
+
+    else:
+        raise ValueError(
+            "`complex` optional argument must be boolean 'True' or 'False'"
+        )
+
+    return wtx
+
+
+
+
+def wavelet_transform(
+    x: np.ndarray,
+    wavelet: np.ndarray,
+    boundary: Optional[str] = "mirror",
+    time_axis: Optional[int] = -1,
+    freq_axis: Optional[int] = -2,
+    order_axis: Optional[int] = -3,
+) -> np.ndarray:
+"""
+ Apply a continuous wavelet transform to an input signal using an input wavelet
+ function. Such wavelet can be provided by the function ``morse_wavelet``.
+
+ Parameters
+ ----------
+ x : np.ndarray
+ Real- or complex-valued signals.
+ wavelet : np.ndarray
+ A suite of time-domain wavelets, typically returned by the function ``morse_wavelet``.
+ The time axis of the wavelets must be the last one, and its length must match the
+ length of the time axis of x. The other dimensions (axes) of the wavelets (such as orders and frequencies) are
+ typically organized as orders, frequencies, and time, unless specified by optional arguments freq_axis and order_axis.
+ The normalization of the wavelets is assumed to be "bandpass"; if not, generate them with normalization="energy" (see ``morse_wavelet``).
+ boundary : str, optional
+ The boundary condition to be imposed at the edges of the input signal ``x``.
+ Allowed values are ``"mirror"``, ``"zeros"``, and ``"periodic"``. Default is ``"mirror"``.
+ time_axis : int, optional
+ Axis on which the time is defined for input ``x`` (default is last, or -1). Note that the time axis of the
+ wavelets must be last.
+ freq_axis : int, optional
+ Axis of ``wavelet`` for the frequencies (default is second to last, or -2).
+ order_axis : int, optional
+ Axis of ``wavelet`` for the orders (default is third to last, or -3).
+
+ Returns
+ -------
+ wtx : np.ndarray
+ Time-domain wavelet transform of ``x`` with shape ((x shape without time_axis), orders, frequencies, time_axis)
+ but with dimensions of length 1 removed (squeezed).
+
+ Examples
+ --------
+ Apply a wavelet transform with a Morse wavelet with gamma parameter 3, beta
+ parameter 4, at radian frequency 0.2 cycles per unit time:
+
+ >>> x = np.random.random(1024)
+ >>> wavelet, _ = morse_wavelet(1024, 3, 4, np.array([2*np.pi*0.2]))
+ >>> wtx = wavelet_transform(x, wavelet)
+
+ The input signal can have an arbitrary number of dimensions but its
+ ``time_axis`` must be specified if it is not the last:
+
+ >>> x = np.random.random((1024,10,15))
+ >>> wavelet, _ = morse_wavelet(1024, 3, 4, np.array([2*np.pi*0.2]))
+ >>> wtx = wavelet_transform(x, wavelet, time_axis=0)
+
+ Raises
+ ------
+ ValueError
+ If the time axis is outside of the valid range ([-1, N-1]).
+ If the lengths of the time axes of the input signal and the wavelet differ.
+ If the boundary optional argument is not in ["mirror", "zeros", "periodic"].
+
+ See Also
+ --------
+ :func:`morse_wavelet`, :func:`morse_wavelet_transform`, :func:`morse_freq`
+ """
+    # time_axis must be in valid range
+    if time_axis < -1 or time_axis > len(x.shape) - 1:
+        raise ValueError(
+            f"time_axis ({time_axis}) is outside of the valid range ([-1,"
+            f" {len(x.shape) - 1}])."
+        )
+    # The input signal and the wavelets must have time axes of the same length.
+    if x.shape[time_axis] != wavelet.shape[-1]:
+        raise ValueError("x and wavelet time axes must have the same length.")
+
+    wavelet_ = np.moveaxis(wavelet, [freq_axis, order_axis], [-2, -3])
+
+    # if x is of dimension 1 we need to expand
+    # otherwise make sure time axis is last
+    if np.ndim(x) < 2:
+        x_ = np.expand_dims(x, axis=0)
+    else:
+        x_ = np.moveaxis(x, time_axis, -1)
+
+    # add detrending option eventually
+
+    # apply boundary conditions
+    if boundary == "mirror":
+        x_ = np.concatenate((np.flip(x_, axis=-1), x_, np.flip(x_, axis=-1)), axis=-1)
+    elif boundary == "zeros":
+        x_ = np.concatenate((np.zeros_like(x_), x_, np.zeros_like(x_)), axis=-1)
+    elif boundary == "periodic":
+        pass
+    else:
+        raise ValueError("boundary must be one of 'mirror', 'zeros', or 'periodic'.")
+
+    time_length = np.shape(x)[time_axis]
+    time_length_ = np.shape(x_)[-1]
+
+    # pad wavelet with zeros: JML ok
+    order_length, freq_length, _ = np.shape(wavelet)
+    _wavelet = np.zeros((order_length, freq_length, time_length_), dtype=np.cdouble)
+
+    index = slice(
+        int(np.floor(time_length_ - time_length) / 2),
+        int(time_length + np.floor(time_length_ - time_length) / 2),
+    )
+    _wavelet[:, :, index] = wavelet_
+
+    # take fft along axis = -1
+    _wavelet_fft = np.fft.fft(_wavelet)
+    om = 2 * np.pi * np.linspace(0, 1 - 1 / time_length_, time_length_)
+    if time_length_ % 2 == 0:
+        _wavelet_fft = (
+            _wavelet_fft
+            * np.exp(1j * -om * (time_length_ + 1) / 2)
+            * np.sign(np.pi - om)
+        )
+    else:
+        _wavelet_fft = _wavelet_fft * np.exp(1j * -om * (time_length_ + 1) / 2)
+
+    # here we should be able to automate the tiling without assuming extra dimensions of wave
+    X_ = np.tile(
+        np.expand_dims(np.fft.fft(x_), (-3, -2)),
+        (1, order_length, freq_length, 1),
+    )
+
+    # finally the transform; return precision of input `x`; central part only
+    complex_dtype = np.csingle if x.dtype == np.single else np.cdouble
+    wtx = np.fft.ifft(X_ * np.conj(_wavelet_fft)).astype(complex_dtype)
+    wtx = wtx[..., index]
+
+    # reposition the time axis if needed from axis -1
+    if time_axis != -1:
+        wtx = np.moveaxis(wtx, -1, time_axis)
+
+    # remove extra dimensions if needed
+    wtx = np.squeeze(wtx)
+
+    return wtx
+
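The boundary handling above simply triples the signal length before transforming, either by reflecting the signal at its endpoints ("mirror") or by zero-padding ("zeros"); the central third of the transform is then retained. A tiny standalone illustration of that padding step, using plain NumPy (not the module's API):

```python
import numpy as np

x = np.arange(5.0)

# "mirror": reflect the signal at both ends, tripling its length
mirrored = np.concatenate((np.flip(x), x, np.flip(x)))

# "zeros": pad with zeros on both sides instead
zero_padded = np.concatenate((np.zeros_like(x), x, np.zeros_like(x)))

# in both cases the central slice recovers the original signal
center = slice(len(x), 2 * len(x))
```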
+
+
+
+def morse_wavelet(
+    length: int,
+    gamma: float,
+    beta: float,
+    radian_frequency: np.ndarray,
+    order: Optional[int] = 1,
+    normalization: Optional[str] = "bandpass",
+) -> Tuple[np.ndarray, np.ndarray]:
+"""
+ Compute the generalized Morse wavelets of Olhede and Walden (2002), doi: 10.1109/TSP.2002.804066.
+
+ Parameters
+ ----------
+ length : int
+ Length of the wavelets.
+ gamma : float
+ Gamma parameter of the wavelets.
+ beta : float
+ Beta parameter of the wavelets.
+ radian_frequency : np.ndarray
+ The radian frequencies at which the Fourier transform of the wavelets
+ reach their maximum amplitudes. radian_frequency is between 0 and 2 * np.pi * 0.5,
+ the normalized Nyquist radian frequency.
+ order : int, optional
+ Order of wavelets, default is 1.
+ normalization : str, optional
+ Normalization for the ``wavelet`` output. By default it is assumed to be ``"bandpass"``
+ which uses a bandpass normalization, meaning that the FFT of the wavelets
+ have peak value of 2 for all central frequencies ``radian_frequency``. The other option is
+ ``"energy"``which uses the unit energy normalization. In this last case, the time-domain wavelet
+ energies ``np.sum(np.abs(wave)**2)`` are always unity.
+
+ Returns
+ -------
+ wavelet : np.ndarray
+ Time-domain wavelets with shape (order, radian_frequency, length).
+ wavelet_fft: np.ndarray
+ Frequency-domain wavelets with shape (order, radian_frequency, length).
+
+ Examples
+ --------
+ Compute a Morse wavelet with gamma parameter 3, beta parameter 4, at radian
+ frequency 0.2 cycles per unit time:
+
+ >>> wavelet, wavelet_fft = morse_wavelet(1024, 3, 4, np.array([2*np.pi*0.2]))
+ >>> np.shape(wavelet)
+ (1, 1, 1024)
+
+ Compute a suite of Morse wavelets with gamma parameter 3, beta parameter 4, up to order 3,
+ at radian frequencies 0.2 and 0.3 cycles per unit time:
+
+ >>> wavelet, wavelet_fft = morse_wavelet(1024, 3, 4, np.array([2*np.pi*0.2, 2*np.pi*0.3]), order=3)
+ >>> np.shape(wavelet)
+ (3, 2, 1024)
+
+ Compute a Morse wavelet specifying an energy normalization:
+
+ >>> wavelet, wavelet_fft = morse_wavelet(1024, 3, 4, np.array([2*np.pi*0.2]), normalization="energy")
+
+ Raises
+ ------
+ ValueError
+ If the normalization optional argument is not in ["bandpass", "energy"].
+
+ See Also
+ --------
+ :func:`wavelet_transform`, :func:`morse_wavelet_transform`, :func:`morse_freq`, :func:`morse_logspace_freq`, :func:`morse_amplitude`, :func:`morse_properties`
+ """
+    # add test for radian_frequency being a numpy array
+    # initialization
+    wavelet = np.zeros((length, order, len(radian_frequency)), dtype=np.cdouble)
+    waveletfft = np.zeros((length, order, len(radian_frequency)), dtype=np.cdouble)
+
+    # the call to morse_freq takes gamma and beta as floats, not arrays
+    fo, _, _ = morse_freq(gamma, beta)
+    for i in range(len(radian_frequency)):
+        wavelet_tmp = np.zeros((length, order), dtype=np.cdouble)
+        waveletfft_tmp = np.zeros((length, order), dtype=np.cdouble)
+
+        # wavelet frequencies
+        fact = np.abs(radian_frequency[i]) / fo
+        # norm_radian_frequency first dim is n points
+        norm_radian_frequency = (
+            2 * np.pi * np.linspace(0, 1 - 1 / length, length) / fact
+        )
+        if normalization == "energy":
+            with np.errstate(divide="ignore"):
+                waveletzero = np.exp(
+                    beta * np.log(norm_radian_frequency)
+                    - norm_radian_frequency**gamma
+                )
+        elif normalization == "bandpass":
+            if beta == 0:
+                waveletzero = 2 * np.exp(-(norm_radian_frequency**gamma))
+            else:
+                with np.errstate(divide="ignore"):
+                    waveletzero = 2 * np.exp(
+                        -beta * np.log(fo)
+                        + fo**gamma
+                        + beta * np.log(norm_radian_frequency)
+                        - norm_radian_frequency**gamma
+                    )
+        else:
+            raise ValueError(
+                "Normalization option (norm) must be one of 'energy' or 'bandpass'."
+            )
+        waveletzero[0] = 0.5 * waveletzero[0]
+        # Replace NaN with zeros in waveletzero
+        waveletzero = np.nan_to_num(waveletzero, copy=False, nan=0.0)
+        # second family is never used
+        waveletfft_tmp = _morse_wavelet_first_family(
+            fact,
+            gamma,
+            beta,
+            norm_radian_frequency,
+            waveletzero,
+            order=order,
+            normalization=normalization,
+        )
+        waveletfft_tmp = np.nan_to_num(waveletfft_tmp, posinf=0, neginf=0)
+        # shape of waveletfft_tmp is points, order
+        # center wavelet
+        norm_radian_frequency_mat = np.tile(
+            np.expand_dims(norm_radian_frequency, -1), (order)
+        )
+        waveletfft_tmp = waveletfft_tmp * np.exp(
+            1j * norm_radian_frequency_mat * (length + 1) / 2 * fact
+        )
+        # time-domain wavelet
+        wavelet_tmp = np.fft.ifft(waveletfft_tmp, axis=0)
+        if radian_frequency[i] < 0:
+            wavelet[:, :, i] = np.conj(wavelet_tmp)
+            waveletfft_tmp[1:-1, :] = np.flip(waveletfft_tmp[1:-1, :], axis=0)
+            waveletfft[:, :, i] = waveletfft_tmp
+        else:
+            waveletfft[:, :, i] = waveletfft_tmp
+            wavelet[:, :, i] = wavelet_tmp
+
+    # reorder dimensions to be (order, frequency, time steps)
+    # enforce length 1 for first axis if order=1 (no squeezing)
+    wavelet = np.moveaxis(wavelet, [0, 1, 2], [2, 0, 1])
+    waveletfft = np.moveaxis(waveletfft, [0, 1, 2], [2, 0, 1])
+
+    return wavelet, waveletfft
+def morse_freq(
+    gamma: Union[np.ndarray, float],
+    beta: Union[np.ndarray, float],
+) -> Union[Tuple[np.ndarray], Tuple[float]]:
+"""
+ Frequency measures for generalized Morse wavelets. This function calculates
+ three different measures fm, fe, and fi of the frequency of the lowest-order generalized Morse
+ wavelet specified by parameters ``gamma`` and ``beta``.
+
+ Note that all frequency quantities here are in *radians*, as in cos(f t), and not
+ cyclic, as in np.cos(2 np.pi f t).
+
+ For ``beta=0``, the corresponding wavelet becomes an analytic lowpass filter, and fm
+ is not defined in the usual way but as the point at which the filter has decayed
+ to one-half of its peak power.
+
+ For details see Lilly and Olhede (2009), doi: 10.1109/TSP.2008.2007607.
+
+ Parameters
+ ----------
+ gamma : np.ndarray or float
+ Gamma parameter of the wavelets.
+ beta : np.ndarray or float
+ Beta parameter of the wavelets.
+
+ Returns
+ -------
+ fm : np.ndarray
+ The modal or peak frequency.
+ fe : np.ndarray
+ The energy frequency.
+ fi : np.ndarray
+ The instantaneous frequency at the wavelets' centers.
+
+ Examples
+ --------
+ >>> fm, fe, fi = morse_freq(3, 4)
+
+ >>> morse_freq(3, 4)
+ (array(1.10064242), 1.1025129235952809, 1.1077321674324723)
+
+ >>> morse_freq(3, np.array([10, 20, 30]))
+ (array([1.49380158, 1.88207206, 2.15443469]),
+ array([1.49421505, 1.88220264, 2.15450116]),
+ array([1.49543843, 1.88259299, 2.15470024]))
+
+ >>> morse_freq(np.array([3, 4, 5]), np.array([10, 20, 30]))
+ (array([1.49380158, 1.49534878, 1.43096908]),
+ array([1.49421505, 1.49080278, 1.4262489 ]),
+ array([1.49543843, 1.48652036, 1.42163583]))
+
+ >>> morse_freq(np.array([3, 4, 5]), 10)
+ (array([1.49380158, 1.25743343, 1.14869835]),
+ array([1.49421505, 1.25000964, 1.13759731]),
+ array([1.49543843, 1.24350315, 1.12739747]))
+
+ See Also
+ --------
+ :func:`morse_wavelet`, :func:`morse_amplitude`
+ """
+ withnp.errstate(divide="ignore"):# ignore warning when beta=0
+ fm=np.where(
+ beta==0,
+ np.log(2)**(1/gamma),
+ np.exp((1/gamma)*(np.log(beta)-np.log(gamma))),
+ )
+
+ fe=(
+ 1
+ /(2**(1/gamma))
+ *_gamma((2*beta+2)/gamma)
+ /_gamma((2*beta+1)/gamma)
+ )
+
+ fi=_gamma((beta+2)/gamma)/_gamma((beta+1)/gamma)
+
+ returnfm,fe,fi
+
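The three frequency measures can be reproduced from their closed forms with only the standard library gamma function; for gamma=3, beta=4 they agree with the docstring values above (~1.10). A self-contained sketch, not calling the module:

```python
from math import gamma as G  # Euler gamma function from the standard library

g, b = 3.0, 4.0
fm = (b / g) ** (1 / g)                                        # modal (peak) frequency
fe = G((2 * b + 2) / g) / (2 ** (1 / g) * G((2 * b + 1) / g))  # energy frequency
fi = G((b + 2) / g) / G((b + 1) / g)                           # instantaneous frequency
```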
+
+
+
+def morse_logspace_freq(
+    gamma: float,
+    beta: float,
+    length: int,
+    highset: Optional[Tuple[float]] = (0.1, np.pi),
+    lowset: Optional[Tuple[float]] = (5, 0),
+    density: Optional[int] = 4,
+) -> np.ndarray:
+"""
+ Compute logarithmically-spaced frequencies for generalized Morse wavelets
+ with parameters gamma and beta. This is a useful function to obtain the frequencies
+ needed for time-frequency analyses using wavelets. If ``radian_frequencies`` is the
+ output, ``np.log(radian_frequencies)`` is uniformly spaced, following convention
+ for wavelet analysis. See Lilly (2017), doi: 10.1098/rspa.2016.0776.
+
+ Default settings to compute the frequencies can be changed by passing optional
+ arguments ``lowset``, ``highset``, and ``density``. See below.
+
+ Parameters
+ ----------
+ gamma : float
+ Gamma parameter of the Morse wavelets.
+ beta : float
+ Beta parameter of the Morse wavelets.
+ length : int
+ Length of the Morse wavelets and input signals.
+ highset : tuple of floats, optional.
+ Tuple of values (eta, high) used for the high-frequency cutoff calculation. The highest
+ frequency is set to the minimum of the specified value high and the largest frequency
+ for which the wavelet satisfies the threshold level eta, based on a Nyquist overlap
+ condition. Here eta is a number between zero and one specifying the ratio of a
+ frequency-domain wavelet at the Nyquist frequency to its peak value.
+ Default is (eta, high) = (0.1, np.pi).
+ lowset : tuple of floats, optional.
+ Tuple of values (P, low) used for the low-frequency cutoff calculation based on an
+ endpoint overlap condition. The lowest frequency is set such that the lowest-frequency
+ wavelet will reach some number P, called the packing number, times its central window
+ width at the ends of the time series. A choice of P=1 corresponds to roughly 95% of
+ the time-domain wavelet energy being contained within the time series endpoints for
+ a wavelet at the center of the domain. The second value of the tuple is the absolute
+ lowest frequency. Default is (P, low) = (5, 0).
+ density : int, optional
+ This optional argument controls the number of points in the returned frequency
+ array. Higher values of ``density`` mean more overlap in the frequency
+ domain between transforms. When ``density=1``, the peak of one wavelet is located at the
+ half-power points of the adjacent wavelet. The default ``density=4`` means
+ that four other wavelets will occur between the peak of one wavelet and
+ its half-power point.
+
+ Returns
+ -------
+ radian_frequency : np.ndarray
+ Logarithmically-spaced frequencies in radians per unit time,
+ sorted in descending order.
+
+ Examples
+ --------
+ Generate a frequency array for the generalized Morse wavelet
+ with parameters gamma=3 and beta=5 for a time series of length n=1024:
+
+ >>> radian_frequency = morse_logspace_freq(3, 5, 1024)
+ >>> radian_frequency = morse_logspace_freq(3, 5, 1024, highset=(0.2, np.pi), lowset=(5, 0))
+ >>> radian_frequency = morse_logspace_freq(3, 5, 1024, highset=(0.2, np.pi), lowset=(5, 0), density=10)
+
+ See Also
+ --------
+ :func:`morse_wavelet`, :func:`morse_freq`, :func:`morse_properties`
+ """
+    gamma_ = np.array([gamma])
+    beta_ = np.array([beta])
+    width, _, _ = morse_properties(gamma_, beta_)
+
+    _high = _morsehigh(gamma_, beta_, highset[0])
+    high_ = np.min(np.append(_high, highset[1]))
+
+    low = 2 * np.sqrt(2) * width * lowset[0] / length
+    low_ = np.max(np.append(low, lowset[1]))
+
+    r = 1 + 1 / (density * width)
+    m = np.floor(np.log10(high_ / low_) / np.log10(r)).astype(int)[0]
+    radian_frequency = high_ * np.ones(int(m + 1)) / r ** np.arange(0, m + 1)
+
+    return radian_frequency
+
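The construction above descends from the high cutoff by a constant ratio r, so the logarithms of the returned frequencies are uniformly spaced. A simplified self-contained sketch of that geometric spacing (it skips the eta-based Nyquist cutoff and uses the absolute high value np.pi directly, an assumption for illustration):

```python
import numpy as np

gamma, beta, length, density = 3.0, 5.0, 1024, 4
width = np.sqrt(gamma * beta)              # time-domain window width, as in morse_properties
high = np.pi                               # absolute high cutoff only (skips the eta condition)
low = 2 * np.sqrt(2) * width * 5 / length  # lowset = (5, 0) endpoint overlap condition

r = 1 + 1 / (density * width)              # constant ratio between adjacent frequencies
m = int(np.floor(np.log10(high / low) / np.log10(r)))
freqs = high / r ** np.arange(m + 1)       # descending geometric sequence

# np.log(freqs) is uniformly spaced: adjacent ratios all equal r
```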
+
+
+def _morsehigh(
+    gamma: np.ndarray,
+    beta: np.ndarray,
+    eta: float,
+) -> Union[np.ndarray, float]:
+    """High-frequency cutoff of the generalized Morse wavelets.
+    gamma and beta should be arrays of the same length. Internal use only.
+    """
+    m = 10000
+    omhigh = np.linspace(0, np.pi, m)
+    f = np.zeros_like(gamma, dtype="float")
+
+    for i in range(0, len(gamma)):
+        fm, _, _ = morse_freq(gamma[i], beta[i])
+        with np.errstate(all="ignore"):
+            om = fm * np.pi / omhigh
+            lnwave1 = beta[i] / gamma[i] * np.log(np.exp(1) * gamma[i] / beta[i])
+            lnwave2 = beta[i] * np.log(om) - om ** gamma[i]
+            lnwave = lnwave1 + lnwave2
+            index = np.nonzero(np.log(eta) - lnwave < 0)[0][0]
+            f[i] = omhigh[index]
+
+    return f
+
+
+
+def morse_properties(
+    gamma: Union[np.ndarray, float],
+    beta: Union[np.ndarray, float],
+) -> Union[Tuple[np.ndarray], Tuple[float]]:
    """
    Calculate the properties of the demodulated generalized Morse wavelets.
    See Lilly and Olhede (2009), doi: 10.1109/TSP.2008.2007607.

    Parameters
    ----------
    gamma : np.ndarray or float
        Gamma parameter of the wavelets.
    beta : np.ndarray or float
        Beta parameter of the wavelets.

    Returns
    -------
    width : np.ndarray or float
        Dimensionless time-domain window width of the wavelets.
    skew : np.ndarray or float
        Imaginary part of normalized third moment of the time-domain demodulate,
        or 'demodulate skewness'.
    kurt : np.ndarray or float
        Normalized fourth moment of the time-domain demodulate,
        or 'demodulate kurtosis'.

    Examples
    --------
    >>> width, skew, kurt = morse_properties(3, 5)

    See Also
    --------
    :func:`morse_wavelet`, :func:`morse_freq`, :func:`morse_amplitude`, :func:`morse_logspace_freq`.
    """
    # TODO: check that gamma and beta have a common size, or rely on broadcasting
    width = np.sqrt(gamma * beta)
    skew = (gamma - 3) / width
    kurt = 3 - skew**2 - 2 / width**2

    return width, skew, kurt


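As a quick numerical check of the closed forms above (with gamma = 3 and beta = 5 chosen purely for illustration), note that gamma = 3 yields a symmetric demodulate with zero skewness:

```python
import numpy as np

# Recompute the demodulate properties directly from the closed forms
# in morse_properties for gamma=3, beta=5 (illustrative values).
gamma, beta = 3.0, 5.0
width = np.sqrt(gamma * beta)  # sqrt(15), about 3.873
skew = (gamma - 3) / width     # gamma = 3 gives zero demodulate skewness
kurt = 3 - skew**2 - 2 / width**2

assert skew == 0.0
assert np.isclose(kurt, 3 - 2 / 15)
```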
def morse_amplitude(
    gamma: Union[np.ndarray, float],
    beta: Union[np.ndarray, float],
    order: Optional[np.int64] = 1,
    normalization: Optional[str] = "bandpass",
) -> float:
    """
    Calculate the amplitude coefficient of the generalized Morse wavelets.
    By default, the amplitude is calculated such that the maximum of the
    frequency-domain wavelet is equal to 2, which is the bandpass normalization.
    Optionally, specify ``normalization="energy"`` in order to return the
    coefficient giving the wavelets unit energies. See Lilly and Olhede (2009),
    doi: 10.1109/TSP.2008.2007607.

    Parameters
    ----------
    gamma : np.ndarray or float
        Gamma parameter of the wavelets.
    beta : np.ndarray or float
        Beta parameter of the wavelets.
    order : int, optional
        Order of wavelets, default is 1.
    normalization : str, optional
        Normalization for the wavelets. By default it is assumed to be ``"bandpass"``
        which uses a bandpass normalization, meaning that the FFT of the wavelets
        has a peak value of 2 for all central frequencies ``radian_frequency``. The
        other option is ``"energy"`` which uses the unit energy normalization. In
        this last case the time-domain wavelet energies ``np.sum(np.abs(wave)**2)``
        are always unity.

    Returns
    -------
    amp : np.ndarray or float
        The amplitude coefficient of the wavelets.

    Examples
    --------
    >>> amp = morse_amplitude(3, 5)

    See Also
    --------
    :func:`morse_wavelet`, :func:`morse_freq`, :func:`morse_properties`, :func:`morse_logspace_freq`.
    """
    # add test for type and shape in case of ndarray
    if normalization == "energy":
        r = (2 * beta + 1) / gamma
        amp = (
            2
            * np.pi
            * gamma
            * (2**r)
            * np.exp(_lgamma(order) - _lgamma(order + r - 1))
        ) ** 0.5
    elif normalization == "bandpass":
        fm, _, _ = morse_freq(gamma, beta)
        amp = np.where(beta == 0, 2, 2 / (np.exp(beta * np.log(fm) - fm**gamma)))
    else:
        raise ValueError(
            "Normalization option (normalization) must be one of 'energy' or 'bandpass'."
        )

    return amp
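The bandpass branch can be checked numerically in isolation: with the amplitude from the ``beta != 0`` case, the frequency-domain wavelet ``amp * om**beta * np.exp(-om**gamma)`` peaks at exactly 2. This is a standalone sketch with illustrative parameters; the peak-frequency expression below is the closed form ``(beta / gamma) ** (1 / gamma)`` rather than a call to ``morse_freq``:

```python
import numpy as np

# Standalone check of the bandpass normalization for gamma=3, beta=5.
gamma, beta = 3.0, 5.0
fm = (beta / gamma) ** (1 / gamma)  # peak (central) frequency
amp = 2 / np.exp(beta * np.log(fm) - fm**gamma)

om = np.linspace(1e-4, np.pi, 100_000)  # dense frequency grid
wavelet = amp * om**beta * np.exp(-(om**gamma))

# The analytic maximum is 2; the grid maximum matches it closely.
assert np.isclose(wavelet.max(), 2.0, rtol=1e-6)
```

The same construction explains the ``np.where(beta == 0, 2, ...)`` guard in the code: for ``beta = 0`` the frequency-domain wavelet is a pure decaying exponential whose maximum sits at zero frequency, so the amplitude reduces to the constant 2.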