monet.util.tools

Functions

calc_24hr_ave(df[, col])

Calculate 24-hour averages.

calc_3hr_ave(df[, col])

Calculate 3-hour averages.

calc_8hr_rolling_max(df[, col, window])

Calculate 8-hour rolling maximum values.

calc_annual_ave(df[, col])

Calculate annual averages.

findclosest(list, value)

Find the index and value of the closest element to a target value.

get_epa_region_bounds([index, acronym])

Get lat/lon boundaries for an EPA region.

get_epa_region_df(df)

Add EPA region information to DataFrame based on lat/lon.

get_giorgi_region_bounds([index, acronym])

Get lat/lon boundaries for a Giorgi region.

get_giorgi_region_df(df)

Add Giorgi region index and acronym to DataFrame based on lat/lon.

get_relhum(temp, press, vap)

Calculate relative humidity from temperature, pressure and vapor pressure.

kolmogorov_zurbenko_filter(df, col, window, ...)

Apply a Kolmogorov-Zurbenko filter to a specific column in a DataFrame.

linregress(x, y)

Perform a linear regression using statsmodels.

long_to_wide(df)

Convert a DataFrame from long (stacked) to wide format.

search_listinlist(array1, array2)

Find matching indices between two arrays.

wsdir2uv(ws, wdir)

Convert wind speed and direction to U and V components.

monet.util.tools.calc_24hr_ave(df, col=None)

Calculate 24-hour averages.

Parameters:
  • df (pandas.DataFrame) – Input data with ‘time_local’ and ‘siteid’ columns

  • col (str) – Column name to average

Returns:

DataFrame with added column containing daily averages

Return type:

pandas.DataFrame
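
As a rough illustration of a per-site daily average, the sketch below groups rows by ‘siteid’ and by the calendar day of ‘time_local’, then writes each group's mean back onto its rows. It uses plain pandas and does not call monet. The helper name daily_mean_by_site, the ‘_24hr_ave’ column suffix and the OZONE column are assumptions made for the example; the library's own grouping and output layout may differ.

```python
import pandas as pd


def daily_mean_by_site(df, col):
    """Hypothetical helper: per-site calendar-day mean of `col`."""
    # Truncate each local timestamp to midnight so rows group by calendar day.
    day = df["time_local"].dt.floor("D")
    out = df.copy()
    # transform() broadcasts each group's mean back onto the rows of that group.
    out[col + "_24hr_ave"] = df.groupby(["siteid", day])[col].transform("mean")
    return out


obs = pd.DataFrame(
    {
        "time_local": pd.to_datetime(
            ["2020-01-01 00:00", "2020-01-01 12:00", "2020-01-02 00:00"]
        ),
        "siteid": ["A", "A", "A"],
        "OZONE": [10.0, 30.0, 50.0],
    }
)
daily = daily_mean_by_site(obs, "OZONE")
```

The same pattern would cover other averaging periods by changing how the timestamp is truncated.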

monet.util.tools.calc_3hr_ave(df, col=None)

Calculate 3-hour averages.

Parameters:
  • df (pandas.DataFrame) – Input data with ‘time_local’ and ‘siteid’ columns

  • col (str) – Column name to average

Returns:

DataFrame with added column containing 3-hour averages

Return type:

pandas.DataFrame

monet.util.tools.calc_8hr_rolling_max(df, col=None, window=None)

Calculate 8-hour rolling maximum values.

Parameters:
  • df (pandas.DataFrame) – Input data with ‘time_local’ and ‘siteid’ columns

  • col (str) – Column name to calculate rolling max for

  • window (int) – Rolling window size in hours

Returns:

DataFrame with added column containing 8-hour maxima

Return type:

pandas.DataFrame
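
One way to read the description above is as a trailing-window maximum computed separately for each site. The sketch below follows that reading with plain pandas and assumes one row per hour with no gaps, so that a window of N rows spans N hours. The helper name rolling_max_by_site and the ‘_rolling_max’ suffix are hypothetical, and the library's function may define the statistic differently.

```python
import pandas as pd


def rolling_max_by_site(df, col, window=8):
    """Hypothetical helper: trailing `window`-row maximum of `col` per site."""
    # Sort so each site's rows are in time order before rolling.
    out = df.sort_values(["siteid", "time_local"]).copy()
    # min_periods=1 lets the first rows of each site use a shorter window.
    out[col + "_rolling_max"] = out.groupby("siteid")[col].transform(
        lambda s: s.rolling(window, min_periods=1).max()
    )
    return out


hourly = pd.DataFrame(
    {
        "time_local": pd.to_datetime(
            [
                "2020-01-01 00:00",
                "2020-01-01 01:00",
                "2020-01-01 02:00",
                "2020-01-01 03:00",
            ]
        ),
        "siteid": ["A", "A", "A", "A"],
        "OZONE": [1.0, 5.0, 2.0, 3.0],
    }
)
rolled = rolling_max_by_site(hourly, "OZONE", window=2)
```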

monet.util.tools.calc_annual_ave(df, col=None)

Calculate annual averages.

Parameters:
  • df (pandas.DataFrame) – Input data with ‘time_local’ and ‘siteid’ columns

  • col (str) – Column name to average

Returns:

DataFrame with added column containing annual averages

Return type:

pandas.DataFrame

monet.util.tools.findclosest(list, value)

Find the index and value of the closest element to a target value.

Parameters:
  • list (list-like) – Collection of values to search through.

  • value (float or int) – The target value to find the closest match to.

Returns:

(index, closest_value) where:
  • index is the position in the list of the closest value

  • closest_value is the value from the list that is closest to the target

Return type:

tuple
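
A minimal NumPy sketch of a nearest-value lookup is shown below. It illustrates the idea only; the name closest is a hypothetical stand-in and not the library's code. In this sketch, ties resolve to the first occurrence.

```python
import numpy as np


def closest(values, target):
    """Hypothetical stand-in: return (index, value) of the entry nearest `target`."""
    arr = np.asarray(values)
    # argmin of the absolute differences picks the first nearest entry.
    idx = int(np.abs(arr - target).argmin())
    return idx, arr[idx]


index, nearest = closest([1.0, 4.0, 9.0], 5.2)
```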

monet.util.tools.get_epa_region_bounds(index=None, acronym=None)

Get lat/lon boundaries for an EPA region.

Parameters:
  • index (int, optional) – EPA region number

  • acronym (str, optional) – EPA region acronym

Returns:

[latmin, lonmin, latmax, lonmax, acronym]

Return type:

list

monet.util.tools.get_epa_region_df(df)

Add EPA region information to DataFrame based on lat/lon.

Parameters:

df (pandas.DataFrame) – DataFrame containing ‘latitude’ and ‘longitude’ columns

Returns:

Input DataFrame with added EPA region columns

Return type:

pandas.DataFrame

monet.util.tools.get_giorgi_region_bounds(index=None, acronym=None)

Get lat/lon boundaries for a Giorgi region.

Giorgi regions are geographical regions defined for climate studies. Returns bounds for a region specified by index number or acronym.

Parameters:
  • index (int, optional) – Region index number (1-22)

  • acronym (str, optional) – Region acronym (e.g. ‘NAU’ or ‘SAU’)

Returns:

Array containing [latmin, lonmin, latmax, lonmax, acronym]

Return type:

numpy.ndarray

Notes

Either index or acronym must be provided. For region definitions see: https://web.northeastern.edu/sds/web/demsos/images_002/subregions.jpg

monet.util.tools.get_giorgi_region_df(df)

Add Giorgi region index and acronym to DataFrame based on lat/lon.

Parameters:

df (pandas.DataFrame) – DataFrame containing ‘latitude’ and ‘longitude’ columns

Returns:

Input DataFrame with added columns:
  • GIORGI_INDEX: region index number

  • GIORGI_ACRO: region acronym

Return type:

pandas.DataFrame
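
Tagging rows by region amounts to testing each point against a set of lat/lon boxes. The sketch below shows that pattern with plain pandas. The two boxes in HYPOTHETICAL_REGIONS are made-up numbers for illustration, not real Giorgi definitions, and tag_regions is a hypothetical helper; only the output column names come from the description above.

```python
import pandas as pd

# Illustrative boxes only: (index, acronym, latmin, lonmin, latmax, lonmax).
# These values are invented for the example and are not Giorgi region bounds.
HYPOTHETICAL_REGIONS = [
    (1, "R1", -10.0, 0.0, 10.0, 20.0),
    (2, "R2", 10.0, 0.0, 30.0, 20.0),
]


def tag_regions(df, regions=HYPOTHETICAL_REGIONS):
    """Hypothetical helper: label each row with the box that contains it."""
    out = df.copy()
    out["GIORGI_INDEX"] = None
    out["GIORGI_ACRO"] = None
    for idx, acro, latmin, lonmin, latmax, lonmax in regions:
        # Half-open intervals keep points on a shared edge in a single box.
        inside = (
            (out["latitude"] >= latmin)
            & (out["latitude"] < latmax)
            & (out["longitude"] >= lonmin)
            & (out["longitude"] < lonmax)
        )
        out.loc[inside, "GIORGI_INDEX"] = idx
        out.loc[inside, "GIORGI_ACRO"] = acro
    return out


sites = pd.DataFrame({"latitude": [0.0, 20.0, 50.0], "longitude": [10.0, 10.0, 10.0]})
tagged = tag_regions(sites)
```

Rows that fall outside every box keep a missing value in both columns in this sketch.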

monet.util.tools.get_relhum(temp, press, vap)

Calculate relative humidity from temperature, pressure and vapor pressure.

Parameters:
  • temp (array-like) – Temperature in Kelvin

  • press (array-like) – Pressure in hPa/mb

  • vap (array-like) – Vapor pressure in hPa/mb

Returns:

Relative humidity as a percentage (0-100)

Return type:

array-like
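
For orientation, relative humidity can be written as the ratio of vapor pressure to saturation vapor pressure. The sketch below uses Bolton's (1980) approximation for saturation vapor pressure over water; this is an assumption for illustration, and the library may use a different formulation. The helper name relhum_sketch is hypothetical. Pressure is accepted only to mirror the documented signature and is not used in this simplified form.

```python
import numpy as np


def relhum_sketch(temp, press, vap):
    """Hypothetical helper: RH (%) = 100 * e / e_s(T), temperatures in Kelvin."""
    temp = np.asarray(temp, dtype=float)
    vap = np.asarray(vap, dtype=float)
    temp_c = temp - 273.15
    # Bolton (1980) saturation vapor pressure over water, in hPa.
    es = 6.112 * np.exp(17.67 * temp_c / (temp_c + 243.5))
    # `press` is unused in this simplified form (assumption of the sketch).
    return 100.0 * vap / es


rh = relhum_sketch([273.15], [1000.0], [3.056])
```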

monet.util.tools.kolmogorov_zurbenko_filter(df, col, window, iterations)

Apply a Kolmogorov-Zurbenko filter to a specific column in a DataFrame.

A Kolmogorov-Zurbenko filter is a low-pass filter created by iteratively applying a moving average of specified window length. This implementation applies the filter to a DataFrame grouped by site ID.

Parameters:
  • df (pandas.DataFrame) – DataFrame containing the data to filter.

  • col (str) – Column name to apply the filter to.

  • window (int) – Size of the moving average window.

  • iterations (int) – Number of times to apply the moving average filter.

Returns:

DataFrame with original data and filtered values merged in.

Return type:

pandas.DataFrame
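
The iterated moving average described above can be sketched with plain pandas as follows. This is an illustration under stated assumptions, not the library's code: the helper name kz_by_site and the ‘_kz’ suffix are hypothetical, the window is centered, edge points use whatever neighbors are available, and rows are assumed to be in time order within each site.

```python
import pandas as pd


def kz_by_site(df, col, window, iterations):
    """Hypothetical helper: repeat a centered moving average per site."""
    out = df.copy()

    def smooth(series):
        # Each pass smooths the output of the previous pass.
        for _ in range(iterations):
            series = series.rolling(window, center=True, min_periods=1).mean()
        return series

    out[col + "_kz"] = out.groupby("siteid")[col].transform(smooth)
    return out


spike = pd.DataFrame(
    {"siteid": ["A"] * 5, "OZONE": [0.0, 0.0, 3.0, 0.0, 0.0]}
)
filtered = kz_by_site(spike, "OZONE", window=3, iterations=1)
```

With a single pass, the isolated spike of 3 is spread evenly over its three-point neighborhood; further passes flatten it more.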

monet.util.tools.linregress(x, y)

Perform a linear regression using statsmodels.

Parameters:
  • x (array-like) – Independent variable values.

  • y (array-like) – Dependent variable values.

Returns:

(slope, intercept, r_squared, standard_error) where:
  • slope is the regression line slope

  • intercept is the regression line y-intercept

  • r_squared is the coefficient of determination

  • standard_error is the standard error of the residuals

Return type:

tuple
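
To make the four returned quantities concrete, the sketch below computes them with NumPy alone rather than statsmodels. The helper name simple_linregress is hypothetical. The standard error is taken here as the residual standard error with n - 2 degrees of freedom, which is an assumption about what the description means.

```python
import numpy as np


def simple_linregress(x, y):
    """Hypothetical helper: least-squares line fit with summary statistics."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = float(np.sum(residuals**2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    # Coefficient of determination: share of variance explained by the line.
    r_squared = 1.0 - ss_res / ss_tot
    # Residual standard error, two fitted parameters (assumed definition).
    std_err = float(np.sqrt(ss_res / (len(x) - 2)))
    return float(slope), float(intercept), r_squared, std_err


fit = simple_linregress([0, 1, 2, 3], [1, 3, 5, 7])
```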

monet.util.tools.long_to_wide(df)

Convert a DataFrame from long (stacked) to wide format.

Parameters:

df (pandas.DataFrame) – DataFrame in long format with ‘time’, ‘siteid’, ‘variable’, ‘obs’, and ‘units’ columns

Returns:

DataFrame in wide format with variables as columns

Return type:

pandas.DataFrame
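
A long-to-wide reshape of this kind can be sketched with pandas pivot_table, as below. The sample values and variable names are invented for the example. The sketch drops the ‘units’ column, which the library's function may handle differently.

```python
import pandas as pd

long_df = pd.DataFrame(
    {
        "time": pd.to_datetime(
            ["2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02"]
        ),
        "siteid": ["A", "A", "A", "A"],
        "variable": ["OZONE", "PM2.5", "OZONE", "PM2.5"],
        "obs": [40.0, 8.0, 42.0, 9.0],
        "units": ["ppb", "ug/m3", "ppb", "ug/m3"],
    }
)

# One row per (time, siteid); each distinct `variable` becomes a column of `obs`.
wide_df = long_df.pivot_table(
    index=["time", "siteid"], columns="variable", values="obs"
).reset_index()
# Drop the leftover "variable" label on the column axis.
wide_df.columns.name = None
```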

monet.util.tools.search_listinlist(array1, array2)

Find matching indices between two arrays.

Parameters:
  • array1 (numpy.ndarray) – First array to search for matches

  • array2 (numpy.ndarray) – Second array to search for matches

Returns:

(index1, index2) containing:
  • index1: sorted array of indices in array1 where matches were found

  • index2: sorted array of indices in array2 where matches were found

Return type:

tuple
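
The matching described above can be sketched with numpy.isin, which flags the elements of one array that appear in another. The helper name matching_indices is hypothetical, and the library's handling of duplicates may differ from this sketch.

```python
import numpy as np


def matching_indices(array1, array2):
    """Hypothetical helper: positions in each array whose value is in the other."""
    array1 = np.asarray(array1)
    array2 = np.asarray(array2)
    # nonzero() on a boolean mask returns positions in ascending order.
    index1 = np.nonzero(np.isin(array1, array2))[0]
    index2 = np.nonzero(np.isin(array2, array1))[0]
    return index1, index2


idx1, idx2 = matching_indices(np.array([5, 1, 9, 3]), np.array([3, 7, 5]))
```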

monet.util.tools.wsdir2uv(ws, wdir)

Convert wind speed and direction to U and V components.

Parameters:
  • ws (array-like) – Wind speed values.

  • wdir (array-like) – Wind direction values in degrees (meteorological convention: 0=North, 90=East).

Returns:

(u, v) where:
  • u is the zonal wind component (positive for eastward wind)

  • v is the meridional wind component (positive for northward wind)

Return type:

tuple
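
Under the meteorological convention, the direction is where the wind blows from, so both components carry a minus sign. The sketch below shows that standard conversion with NumPy; speed_dir_to_uv is a hypothetical name, and the sketch assumes the library follows the same sign convention.

```python
import numpy as np


def speed_dir_to_uv(ws, wdir):
    """Hypothetical helper: meteorological speed/direction to (u, v)."""
    ws = np.asarray(ws, dtype=float)
    rad = np.deg2rad(np.asarray(wdir, dtype=float))
    # A wind from the west (270 degrees) blows toward the east, so u > 0.
    u = -ws * np.sin(rad)
    v = -ws * np.cos(rad)
    return u, v


u, v = speed_dir_to_uv([10.0, 10.0], [270.0, 180.0])
```

In this example a 10-unit wind from the west gives u close to 10 and v close to 0, while a wind from the south gives u close to 0 and v close to 10.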