Forecasting the change of sea ice extent globally

Phuong Chau
6 min read · Dec 9, 2020

Background: What is Sea Ice? Why is it important to study?

Photo by Tapio Haaja on Unsplash

According to the National Snow and Ice Data Center, sea ice is frozen ocean water that grows during the winter months and melts during the summer months. Some sea ice persists all year in parts of both the Arctic and the Antarctic. In the Northern Hemisphere, it can occur as far south as Bohai Bay, China; in the Southern Hemisphere, sea ice develops only around Antarctica.

Sea ice’s bright surface reflects much of the incoming solar energy instead of absorbing it, which keeps the polar regions relatively cold. As global warming progresses, sea ice melts faster and its bright, reflective surface shrinks. The newly exposed ocean absorbs more solar energy, so the water becomes warmer and melts still more ice. This cycle of warming and melting can lead to significant warming in the polar regions, even with a small increase in atmospheric temperature. However, too much sea ice can also be an obstacle for travel, for hunting, and for the animals and people living in the polar regions.

In addition, when sea ice forms, the sea water below it becomes saltier and denser, which causes that water to sink. This cold, dense water moves along the ocean bottom toward the equator, while warm water travels from the equator toward the poles. Changes in the amount of sea ice can disrupt this normal ocean circulation, which can gradually lead to changes in global climate (Sea Ice and Global Climate).

Motivation

The project is motivated by our group’s interest in the effects of global warming caused by the global increase in carbon dioxide and other greenhouse gas emissions. Specifically, our project uses time-series forecasting models to predict future sea ice extent.

Forecasting models make predictions about future values based on time-series data, that is, a sequence of data points collected through repeated measurements over a period of time.

We used data from the Sea Ice Index of the National Snow and Ice Data Center. The data comprise 492 observations of the monthly average sea ice extent, in millions of square kilometers, from January 1979 to December 2019 for the Northern Hemisphere (NH).

Data overview

For this project, we did our analysis in the R statistical software, in particular with the fpp3 package. We therefore converted the original dataset from the National Snow and Ice Data Center into a format that the package can read and process.

Figure 1: Sea ice extent (measured in millions of square kilometers) from December 1979 to December 2020

We first visualized the general pattern of the sea ice extent, measured in millions of square kilometers, over time. In Figure 1, the sea ice extent rises and falls in a regular pattern every year, which means that our dataset has yearly seasonality. Overall, the sea ice extent also seems to decrease over the years. The remaining variation could be caused by a trend or by random error.

Methodology and Results

To select the best forecasting model for our dataset, we assessed the different possible exponential smoothing models described in Forecasting: Principles and Practice and compared their performance.

In exponential smoothing models, predictions are a weighted sum of past observations, in which the weights decrease exponentially as the observations get older. In other words, the more recent an observation, the more it counts in predicting the future value.
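To make the weighting concrete, here is a minimal Python sketch of simple exponential smoothing, the most basic member of this model family. It is separate from the R/fpp3 analysis in this post: the function names, the smoothing parameter `alpha = 0.5`, and the sample values are all made up for illustration.

```python
def ses_weights(alpha, n):
    """Weights placed on the n most recent observations, newest first."""
    return [alpha * (1 - alpha) ** j for j in range(n)]


def ses_forecast(series, alpha):
    """One-step-ahead forecast using the recursive form of the same weighted sum."""
    level = series[0]  # assumption: start the level at the first observation
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


# Made-up values in millions of square kilometers (not the real dataset).
extent = [14.2, 14.0, 13.9, 13.5, 13.6]
weights = ses_weights(alpha=0.5, n=4)
forecast = ses_forecast(extent, alpha=0.5)

print(weights)   # each weight is half of the one before it
print(forecast)
```

With `alpha = 0.5`, each observation counts half as much as the one that follows it, so the forecast leans mostly on the latest few values. In practice the parameter is estimated from the data instead of being fixed by hand.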

Any exponential smoothing model can be described in terms of three main components (error, trend, and seasonality):

  • If the dataset only has a trend, the data points consistently increase or decrease over time.
  • If the dataset only has seasonality, it has a pattern of variation that repeats over a fixed cycle (for example, every year).
  • If the dataset only has errors (remainders), the variations of the data points over time are caused by randomness.

To understand the patterns in our time-series data, we can split the series into three components, each representing an underlying pattern category. In Figure 2, the top graph shows the monthly sea ice extent; the three graphs below it show its decomposition into trend (“trend”), seasonality (“season_year”), and error (“remainder”).

Figure 2: Monthly sea ice extent (top) and its three additive components.

The trend and the remainder both vary over a smaller range than the seasonal component. Figure 2 therefore confirms the importance of the yearly seasonality that we first observed in Figure 1.
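The idea of an additive decomposition can be sketched in a few lines of Python. This is a classical moving-average decomposition on a synthetic series, shown only as an illustration: it is not necessarily the method behind Figure 2, the data are not the NSIDC values, and all function names are hypothetical.

```python
import math


def centered_moving_average(series, period):
    """2 x period centred moving average (even period); None where undefined."""
    half = period // 2
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half : t + half + 1]
        # Half weight on the two end points keeps the window centred on t.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    return trend


def additive_decompose(series, period=12):
    """Split a series into trend + seasonal + remainder."""
    trend = centered_moving_average(series, period)
    # Average the detrended values for each position in the yearly cycle.
    buckets = [[] for _ in range(period)]
    for t, (y, tr) in enumerate(zip(series, trend)):
        if tr is not None:
            buckets[t % period].append(y - tr)
    raw = [sum(b) / len(b) for b in buckets]
    mean = sum(raw) / period
    pattern = [s - mean for s in raw]  # seasonal terms are centred on zero
    seasonal = [pattern[t % period] for t in range(len(series))]
    remainder = [
        None if tr is None else y - tr - s
        for y, tr, s in zip(series, trend, seasonal)
    ]
    return trend, seasonal, remainder


def value_range(values):
    """Spread (max minus min) of the defined values."""
    present = [v for v in values if v is not None]
    return max(present) - min(present)


# Synthetic monthly series: slow decline plus a strong yearly cycle.
series = [12 - 0.005 * t + 3 * math.cos(2 * math.pi * t / 12) for t in range(120)]
trend, seasonal, remainder = additive_decompose(series)

print(value_range(trend), value_range(seasonal), value_range(remainder))
```

Comparing the printed ranges is the same check made by eye in Figure 2: the component with the widest range accounts for most of the variation in the series.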

We used the sea ice extent values from November 1979 to November 2006 to predict the sea ice extent from November 2006 to November 2010, so that we could compare our predictions with the actual values in our original dataset and calculate the prediction error. After assessing all the possible exponential smoothing methods, we selected the model with the smallest prediction error for our time-series data: the exponential smoothing method with seasonality and error components.
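The train-and-compare procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not the fpp3 code behind our results: the series is synthetic, the smoothing parameters `alpha` and `gamma` are fixed by hand instead of being estimated, the initialisation is deliberately simple, and root mean squared error (RMSE) stands in for whichever error measure is used in practice.

```python
import math


def seasonal_forecast(train, horizon, period=12, alpha=0.3, gamma=0.1):
    """Additive seasonal exponential smoothing with no trend term."""
    # Assumed initialisation: level = mean of the first cycle,
    # seasonal terms = deviations of the first cycle from that mean.
    level = sum(train[:period]) / period
    seasonal = [y - level for y in train[:period]]
    for t in range(period, len(train)):
        s_old = seasonal[t % period]
        new_level = alpha * (train[t] - s_old) + (1 - alpha) * level
        seasonal[t % period] = gamma * (train[t] - level) + (1 - gamma) * s_old
        level = new_level
    n = len(train)
    return [level + seasonal[(n + h) % period] for h in range(horizon)]


def flat_forecast(train, horizon, alpha=0.3):
    """Simple exponential smoothing: no seasonality, so the forecast is flat."""
    level = train[0]
    for y in train[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Synthetic stand-in for the monthly series (NOT the NSIDC data).
series = [
    12 - 0.004 * t + 3 * math.cos(2 * math.pi * t / 12) + 0.2 * math.sin(t)
    for t in range(372)
]
train, test = series[:324], series[324:]  # 27 years to fit, 4 years held out

rmse_seasonal = rmse(test, seasonal_forecast(train, len(test)))
rmse_flat = rmse(test, flat_forecast(train, len(test)))
print(rmse_seasonal, rmse_flat)
```

The candidate with the smaller error on the held-out months is preferred. On a strongly seasonal series like this one, a model without a seasonal component cannot follow the yearly cycle, so its error is much larger.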

Figure 3: Forecast of sea ice extent using the exponential smoothing method with error and seasonality, along with 80% and 95% prediction intervals (shown as dark and light blue shaded regions).

Figure 3 shows the actual monthly sea ice extent from 1979 to 2010 as a black line and the forecasted values from 2006 to 2010 as a blue line, using our exponential smoothing model. The prediction intervals are shown as shaded regions, with the strength of the color indicating the probability associated with the interval: the 80% prediction interval is the darker blue shade and the 95% prediction interval is the lighter blue shade. In Figure 3, the blue and black lines nearly overlap, and the actual values fall within the shaded regions. This means that our model captures the most significant feature of the original dataset: the seasonal pattern, with peaks observed in February each year, corresponding to the sea ice that grows over the winter months.
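For intuition about the two shaded bands, here is a small Python sketch of how a prediction interval can be built around a single forecast. It assumes normally distributed forecast errors with a known standard deviation; the forecast value and standard deviation below are hypothetical, and a full model would also widen the interval as the forecast horizon grows.

```python
from statistics import NormalDist


def prediction_interval(point_forecast, residual_sd, level):
    """Symmetric interval around a forecast, assuming normal forecast errors."""
    # level=95 leaves 2.5% in each tail, so we need the 97.5th percentile.
    z = NormalDist().inv_cdf(0.5 + level / 200)
    return point_forecast - z * residual_sd, point_forecast + z * residual_sd


forecast, sd = 14.0, 0.3  # hypothetical values, millions of square kilometers
interval_80 = prediction_interval(forecast, sd, 80)
interval_95 = prediction_interval(forecast, sd, 95)

print(interval_80)
print(interval_95)
```

The 95% interval is always wider than the 80% interval, because it has to cover the actual value more often.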

Key takeaways:

I would like to thank Prof. Albert Kim for his amazing lectures and feedback and for being very supportive! I learned a lot and benefited greatly from the course. Thank you!

Here are a few skills, lessons, and tips that I gained from this course:

  • I learned a lot about different models for time-series forecasting, how to engineer the data to fit the model, and how to use the GitHub features covered in class, including branches, forks, pull requests, and many more.
  • I came to understand the importance of a Minimum Viable Product (MVP): “Done is better than perfect” (quoted from Prof. Kim). This statement has helped me overcome my perfectionism by getting things done early even if they are not perfect yet. It also helped me overcome my fear and be more willing to ask for feedback even on a “very bad” first draft.
  • I learned to identify and acknowledge the obstacles that are “holding you back”, to stop being afraid of them, and then to overcome them.
  • Collaborating from three different places in three different time zones could have been very difficult, and it was very different from anything I had done before. However, thanks to technology (Zoom, Slack, GitHub, etc.), there were no problems in our group. Adaptability and understanding of changing circumstances, especially during this pandemic, helped our team find solutions and accommodations for any changes faster and more efficiently.
