{ "cells": [ { "cell_type": "markdown", "id": "experienced-shoot", "metadata": {}, "source": [ "# Example Python workflow for shuffling the climate data, with tests to check whether the shuffling worked" ] }, { "cell_type": "markdown", "id": "guilty-basics", "metadata": {}, "source": [ "This notebook shows how to shuffle the climate data (for simplicity, on a yearly basis). It also produces test files of shuffled climate data for two glaciers (using the grid point nearest to each glacier).\n", "\n", "We chose these glaciers for testing:\n", "- **RGI60-11.00897**: Hintereisferner (lon: 10.758, lat: 46.800)\n", "  - nearest gridpoint from isimip3b: (10.75, 46.75)\n", "- **RGI60-16.02207**: Shallap Glacier (lon: -77.334, lat: -9.486)\n", "  - nearest gridpoint from isimip3b: (-77.25, -9.25)\n", "\n", "**Please check that the shuffling in your workflow works by testing whether you get the same annual time series of shuffled climate data for the two glaciers. Note: the annual mean is not weighted by month length!**\n", "\n", "(We only check temperature shuffling with the ssp585 scenario and the ipsl-cm6a-lr GCM.)\n", "\n", "Test files for shuffled climate:\n", "- `test_shuffling/test_RGI60-11.00897_ipsl-cm6a-lr_ssp585_tasAdjust_shuffled.csv`\n", "- `test_shuffling/test_RGI60-16.02207_ipsl-cm6a-lr_ssp585_tasAdjust_shuffled.csv`\n", "---" ] }, { "cell_type": "code", "execution_count": 1, "id": "floating-disposition", "metadata": {}, "outputs": [], "source": [ "# import these packages \n", "import xarray as xr\n", "import numpy as np\n", "import pandas as pd" ] }, { "cell_type": "markdown", "id": "talented-hearing", "metadata": {}, "source": [ "Let's take the *ipsl-cm6a-lr* GCM, the *ssp585* scenario and temperature as an example!" 
] }, { "cell_type": "code", "execution_count": 2, "id": "eleven-photographer", "metadata": {}, "outputs": [], "source": [ "gcm = 'ipsl-cm6a-lr'\n", "scenario = 'ssp585'\n", "typ = 'tasAdjust'\n", "glacier = 'RGI60-11.00897' \n", "# we also run the workflow for the Shallap glacier\n", "# just run the notebook instead with:\n", "# glacier = 'RGI60-16.02207'" ] }, { "cell_type": "code", "execution_count": 3, "id": "banned-hospital", "metadata": {}, "outputs": [], "source": [ "# take the right gridpoint!\n", "if glacier == 'RGI60-11.00897':\n", " # Hintereisferner\n", " lon, lat = (10.758, 46.800)\n", "elif glacier == 'RGI60-16.02207':\n", " # Shallap glacier\n", " lon, lat = (-77.334, -9.486)" ] }, { "cell_type": "code", "execution_count": 4, "id": "concerned-management", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>1851-1870</th><th>1901-1920</th><th>1951-1970</th><th>1995-2014</th><th>2021-2040</th><th>2041-2060</th><th>2061-2080</th><th>2081-2100</th></tr>\n",
"    <tr><th>simulation_years</th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1861</td><td>1909</td><td>1957</td><td>2000</td><td>2034</td><td>2049</td><td>2073</td><td>2081</td></tr>\n",
"    <tr><th>1</th><td>1858</td><td>1918</td><td>1959</td><td>2013</td><td>2036</td><td>2045</td><td>2072</td><td>2094</td></tr>\n",
"    <tr><th>2</th><td>1852</td><td>1903</td><td>1969</td><td>2005</td><td>2032</td><td>2054</td><td>2065</td><td>2098</td></tr>\n",
"    <tr><th>3</th><td>1862</td><td>1906</td><td>1963</td><td>1996</td><td>2039</td><td>2053</td><td>2061</td><td>2084</td></tr>\n",
"    <tr><th>4</th><td>1868</td><td>1904</td><td>1970</td><td>2010</td><td>2021</td><td>2043</td><td>2063</td><td>2090</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>4995</th><td>1867</td><td>1905</td><td>1962</td><td>2008</td><td>2024</td><td>2053</td><td>2078</td><td>2088</td></tr>\n",
"    <tr><th>4996</th><td>1852</td><td>1907</td><td>1968</td><td>2005</td><td>2027</td><td>2052</td><td>2073</td><td>2081</td></tr>\n",
"    <tr><th>4997</th><td>1866</td><td>1913</td><td>1967</td><td>2012</td><td>2026</td><td>2049</td><td>2067</td><td>2093</td></tr>\n",
"    <tr><th>4998</th><td>1863</td><td>1902</td><td>1957</td><td>1998</td><td>2040</td><td>2057</td><td>2065</td><td>2090</td></tr>\n",
"    <tr><th>4999</th><td>1870</td><td>1919</td><td>1965</td><td>1995</td><td>2030</td><td>2060</td><td>2064</td><td>2082</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>5000 rows × 8 columns</p>\n",
"</div>
" ], "text/plain": [ " 1851-1870 1901-1920 1951-1970 1995-2014 2021-2040 \\\n", "simulation_years \n", "0 1861 1909 1957 2000 2034 \n", "1 1858 1918 1959 2013 2036 \n", "2 1852 1903 1969 2005 2032 \n", "3 1862 1906 1963 1996 2039 \n", "4 1868 1904 1970 2010 2021 \n", "... ... ... ... ... ... \n", "4995 1867 1905 1962 2008 2024 \n", "4996 1852 1907 1968 2005 2027 \n", "4997 1866 1913 1967 2012 2026 \n", "4998 1863 1902 1957 1998 2040 \n", "4999 1870 1919 1965 1995 2030 \n", "\n", " 2041-2060 2061-2080 2081-2100 \n", "simulation_years \n", "0 2049 2073 2081 \n", "1 2045 2072 2094 \n", "2 2054 2065 2098 \n", "3 2053 2061 2084 \n", "4 2043 2063 2090 \n", "... ... ... ... \n", "4995 2053 2078 2088 \n", "4996 2052 2073 2081 \n", "4997 2049 2067 2093 \n", "4998 2057 2065 2090 \n", "4999 2060 2064 2082 \n", "\n", "[5000 rows x 8 columns]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# get the shuffled year key:\n", "pd_shuffled_yrs = pd.read_csv('shuffled_years_GlacierMIP3.csv', index_col=0)\n", "pd_shuffled_yrs" ] }, { "cell_type": "code", "execution_count": 5, "id": "under-singer", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>1851-1870</th><th>1901-1920</th><th>1951-1970</th><th>1995-2014</th><th>2021-2040</th><th>2041-2060</th><th>2061-2080</th><th>2081-2100</th></tr>\n",
"    <tr><th>simulation_years</th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>1</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>2</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>3</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>4</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>4995</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>4996</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>4997</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>4998</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>4999</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>5000 rows × 8 columns</p>\n",
"</div>
" ], "text/plain": [ " 1851-1870 1901-1920 1951-1970 1995-2014 2021-2040 \\\n", "simulation_years \n", "0 NaN NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN NaN \n", "... ... ... ... ... ... \n", "4995 NaN NaN NaN NaN NaN \n", "4996 NaN NaN NaN NaN NaN \n", "4997 NaN NaN NaN NaN NaN \n", "4998 NaN NaN NaN NaN NaN \n", "4999 NaN NaN NaN NaN NaN \n", "\n", " 2041-2060 2061-2080 2081-2100 \n", "simulation_years \n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "... ... ... ... \n", "4995 NaN NaN NaN \n", "4996 NaN NaN NaN \n", "4997 NaN NaN NaN \n", "4998 NaN NaN NaN \n", "4999 NaN NaN NaN \n", "\n", "[5000 rows x 8 columns]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# template of shuffled climate values (that will be filled afterwards)\n", "periods = pd_shuffled_yrs.columns\n", "simulation_years = pd_shuffled_yrs.index # from 0 to 4999\n", "# use np.nan: the np.NaN alias was removed in NumPy 2.0\n", "pd_empty_clim_template = pd.DataFrame(np.nan, columns=periods, index=simulation_years)\n", "pd_empty_clim_template" ] }, { "cell_type": "code", "execution_count": 6, "id": "french-remark", "metadata": {}, "outputs": [], "source": [ "# open the right climate file\n", "if gcm in ['gfdl-esm4', 'ipsl-cm6a-lr', 'mpi-esm1-2-hr', 'mri-esm2-0']:\n", "    ensemble = 'r1i1p1f1'\n", "elif gcm == 'ukesm1-0-ll':\n", "    ensemble = 'r1i1p1f2'\n", "\n", "folder_output = f'isimip3b_{typ}_monthly'\n", "\n", "# Here you have to change the path to the isimip3b data folder\n", "isimip_folder = '/path/to/folder'\n", "\n", "# historical dataset \n", "path_output_tas_hist = f'{isimip_folder}/{folder_output}/{gcm}_{ensemble}_w5e5_historical_{typ}_global_monthly_1850_2014.nc'\n", "ds_tas_monthly_hist = xr.open_dataset(path_output_tas_hist)\n", "\n", "# ssp dataset (you have to change the path to isimip3b)\n", "path_output_tas_ssp = 
f'{isimip_folder}/{folder_output}/{gcm}_{ensemble}_w5e5_{scenario}_{typ}_global_monthly_2015_2100.nc'\n", "ds_tas_monthly_ssp = xr.open_dataset(path_output_tas_ssp)" ] }, { "cell_type": "code", "execution_count": 7, "id": "alpine-continuity", "metadata": {}, "outputs": [], "source": [ "# select the nearest grid point and get the annual means\n", "# (note: no weighting per month duration is performed for the annual mean)\n", "ds_yearly_hist = ds_tas_monthly_hist.sel(lon=lon, lat=lat, method='nearest').tasAdjust.groupby('time.year').mean()\n", "ds_yearly_ssp = ds_tas_monthly_ssp.sel(lon=lon, lat=lat, method='nearest').tasAdjust.groupby('time.year').mean()\n", "# concat historical with ssp file\n", "ds_yearly_clim = xr.concat([ds_yearly_hist, ds_yearly_ssp], dim='year')" ] }, { "cell_type": "markdown", "id": "dressed-classics", "metadata": {}, "source": [ "now we do the shuffling:" ] }, { "cell_type": "code", "execution_count": 8, "id": "ultimate-provision", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>1851-1870</th><th>1901-1920</th><th>1951-1970</th><th>1995-2014</th><th>2021-2040</th><th>2041-2060</th><th>2061-2080</th><th>2081-2100</th></tr>\n",
"    <tr><th>simulation_years</th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>270.509644</td><td>271.286499</td><td>271.965332</td><td>272.543823</td><td>273.897675</td><td>275.207062</td><td>278.033691</td><td>278.575348</td></tr>\n",
"    <tr><th>1</th><td>271.433563</td><td>270.225555</td><td>272.337372</td><td>273.612549</td><td>274.926422</td><td>275.436096</td><td>278.030304</td><td>279.327179</td></tr>\n",
"    <tr><th>2</th><td>271.823822</td><td>269.993378</td><td>271.553131</td><td>273.208374</td><td>274.084625</td><td>275.777252</td><td>277.347626</td><td>281.086548</td></tr>\n",
"    <tr><th>3</th><td>271.572571</td><td>271.005493</td><td>271.319458</td><td>272.492340</td><td>274.388916</td><td>274.980591</td><td>277.015839</td><td>278.625946</td></tr>\n",
"    <tr><th>4</th><td>271.465179</td><td>270.070770</td><td>270.637848</td><td>272.844513</td><td>273.681488</td><td>275.692261</td><td>276.637665</td><td>280.027466</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>4995</th><td>270.676910</td><td>269.957367</td><td>272.018311</td><td>272.476440</td><td>272.553009</td><td>274.980591</td><td>278.677673</td><td>279.783203</td></tr>\n",
"    <tr><th>4996</th><td>271.823822</td><td>269.597382</td><td>270.914856</td><td>273.208374</td><td>273.274567</td><td>275.753815</td><td>278.033691</td><td>278.575348</td></tr>\n",
"    <tr><th>4997</th><td>269.981049</td><td>270.118652</td><td>272.242859</td><td>273.874298</td><td>273.564636</td><td>275.207062</td><td>276.350616</td><td>280.937592</td></tr>\n",
"    <tr><th>4998</th><td>270.749786</td><td>270.742859</td><td>271.965332</td><td>272.742950</td><td>274.395111</td><td>277.058197</td><td>277.347626</td><td>280.027466</td></tr>\n",
"    <tr><th>4999</th><td>272.592438</td><td>270.868896</td><td>272.335754</td><td>272.216888</td><td>273.728546</td><td>276.563049</td><td>276.505585</td><td>278.055542</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>5000 rows × 8 columns</p>\n",
"</div>
" ], "text/plain": [ " 1851-1870 1901-1920 1951-1970 1995-2014 2021-2040 \\\n", "simulation_years \n", "0 270.509644 271.286499 271.965332 272.543823 273.897675 \n", "1 271.433563 270.225555 272.337372 273.612549 274.926422 \n", "2 271.823822 269.993378 271.553131 273.208374 274.084625 \n", "3 271.572571 271.005493 271.319458 272.492340 274.388916 \n", "4 271.465179 270.070770 270.637848 272.844513 273.681488 \n", "... ... ... ... ... ... \n", "4995 270.676910 269.957367 272.018311 272.476440 272.553009 \n", "4996 271.823822 269.597382 270.914856 273.208374 273.274567 \n", "4997 269.981049 270.118652 272.242859 273.874298 273.564636 \n", "4998 270.749786 270.742859 271.965332 272.742950 274.395111 \n", "4999 272.592438 270.868896 272.335754 272.216888 273.728546 \n", "\n", " 2041-2060 2061-2080 2081-2100 \n", "simulation_years \n", "0 275.207062 278.033691 278.575348 \n", "1 275.436096 278.030304 279.327179 \n", "2 275.777252 277.347626 281.086548 \n", "3 274.980591 277.015839 278.625946 \n", "4 275.692261 276.637665 280.027466 \n", "... ... ... ... 
\n", "4995 274.980591 278.677673 279.783203 \n", "4996 275.753815 278.033691 278.575348 \n", "4997 275.207062 276.350616 280.937592 \n", "4998 277.058197 277.347626 280.027466 \n", "4999 276.563049 276.505585 278.055542 \n", "\n", "[5000 rows x 8 columns]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd_shuffle_clim = pd_empty_clim_template.copy()\n", "# get the shuffled climate data for each experiment (time period)\n", "for p in periods:\n", " pd_shuffle_clim[p] = ds_yearly_clim.sel(year=pd_shuffled_yrs[p].values).values\n", "\n", "# test file to check for your workflow\n", "pd_shuffle_clim.to_csv(f'test_shuffling/test_{glacier}_{gcm}_{scenario}_{typ}_shuffled.csv')\n", "pd_shuffle_clim" ] }, { "cell_type": "code", "execution_count": null, "id": "ea54114d-57d0-40d5-9879-c537b1fb840f", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 5 }