Predicting the Indian monsoon has always been a challenge in climate studies because of India’s humid tropical climate and the immense variation in its local geography. The influence of global climate change must also be factored in every year. To complicate matters, the Indian monsoon is broadly divided into two phases, an early phase (June-July) and a late phase (August-September), each influenced by different factors.

Typically, monsoon forecast models use sea surface temperatures (SST) and sea level pressures (SLP) as predictors because they are essential contributors to the water cycle. The India Meteorological Department (IMD) uses 6 such predictors in its model, drawn from specific regions in the Atlantic and Indian Oceans. Because these predictors are confined to a few regions, global climate changes or aberrations that affect the water cycle are often missed.

To overcome this handicap, atmospheric scientists and computer scientists joined forces in a collaborative research effort. Scientists from the Indian Institute of Science, Bangalore and the Indian Institute of Technology – Kharagpur collated SST and SLP values from all across the world for the period 1948-2000. They looked for a deep learning algorithm that could process such massive amounts of data and find better predictors for forecast models.

One machine learning algorithm excelled at this task – a stacked autoencoder.

“We tried [implementing] a couple of neural networks,” recalls Moumita Saha, a computer science Ph.D. student from IIT-Kharagpur and the first author of the published study. “But compared to them, this [algorithm] performed way better in terms of accuracy and catching the extremes,” she concludes. Their study is the first report of a stacked autoencoder being used to calculate predictors for the Indian monsoon.

A stacked autoencoder algorithm has multiple layers of calculations. Every layer discovers patterns in its input data by combining all the data points non-linearly. These patterns are calculated and stored in the hidden nodes of each layer. The hidden nodes of the first layer serve as input to the second layer, and so the process continues. The greater the number of layers in the stack, the more complex the patterns the algorithm reveals.
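
The layer-by-layer idea can be illustrated with a small sketch. This is not the team's implementation: the layer sizes, the tanh activation, the plain gradient-descent training and the random stand-in data are all assumptions made purely for illustration. It only shows how the hidden nodes of one trained layer become the input of the next.

```python
import numpy as np

def train_autoencoder(X, n_hidden, epochs=200, lr=0.1, seed=0):
    """Train one autoencoder layer by gradient descent on reconstruction error.

    Returns only the encoder weights and bias; the decoder is discarded.
    All hyperparameters here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_in = X.shape
    W_enc = rng.normal(0.0, 0.1, size=(n_in, n_hidden))
    b_enc = np.zeros(n_hidden)
    W_dec = rng.normal(0.0, 0.1, size=(n_hidden, n_in))
    b_dec = np.zeros(n_in)
    for _ in range(epochs):
        H = np.tanh(X @ W_enc + b_enc)   # non-linear combination of all inputs
        X_hat = H @ W_dec + b_dec        # linear reconstruction of the input
        err = (X_hat - X) / n_samples    # gradient of mean squared error / 2
        dH = (err @ W_dec.T) * (1.0 - H**2)  # backpropagate through tanh
        W_dec -= lr * (H.T @ err)
        b_dec -= lr * err.sum(axis=0)
        W_enc -= lr * (X.T @ dH)
        b_enc -= lr * dH.sum(axis=0)
    return W_enc, b_enc

def stacked_encode(X, layer_sizes):
    """Greedy layer-wise stacking: each layer's hidden nodes feed the next."""
    features = X
    for i, n_hidden in enumerate(layer_sizes):
        W, b = train_autoencoder(features, n_hidden, seed=i)
        features = np.tanh(features @ W + b)
    return features

# Random stand-in data: 53 "years" (1948-2000) by 20 made-up climate inputs.
rng = np.random.default_rng(42)
X = rng.normal(size=(53, 20))
features = stacked_encode(X, layer_sizes=[10, 6, 3])  # three stacked layers
print(features.shape)  # (53, 3)
```

Each column of `features` plays the role of one hidden node in the last layer, that is, one candidate pattern built from all the inputs.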

The team’s 3-layer autoencoder algorithm had multiple hidden nodes that represented complex relationships between global SSTs and SLPs. The nodes that exhibited the highest correlation with the Indian monsoon over the same period (1948-2000) were then chosen as the final set of predictors. The team tested the performance of these predictors for the period 2001-2014 using a regression-tree-based monsoon forecast model for the early and late monsoon phases.
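
The selection step described above, keeping the hidden nodes most correlated with monsoon rainfall, might look roughly like the following sketch. The function name `top_correlated_nodes`, the use of the absolute Pearson correlation, and the synthetic data are assumptions for illustration; the study's exact selection criterion is not spelled out here.

```python
import numpy as np

def top_correlated_nodes(hidden, rainfall, k):
    """Rank hidden nodes by |Pearson correlation| with a rainfall series.

    hidden:   array of shape (n_years, n_nodes), one column per hidden node
    rainfall: array of shape (n_years,)
    Returns the indices of the k best nodes and all correlations.
    Assumes no column of `hidden` is constant.
    """
    corr = np.array([
        np.corrcoef(hidden[:, j], rainfall)[0, 1]
        for j in range(hidden.shape[1])
    ])
    order = np.argsort(-np.abs(corr))  # strongest correlation first
    return order[:k], corr

# Synthetic example: 53 years, 8 hidden nodes, rainfall driven by node 3.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(53, 8))
rainfall = 3.0 * hidden[:, 3] + 0.3 * rng.normal(size=53)

selected, corr = top_correlated_nodes(hidden, rainfall, k=2)
print(selected[0])  # 3, the node the synthetic rainfall was built from
```

The selected columns would then be passed as predictors to whatever forecast model is used downstream.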

It turned out that their model could predict the early monsoon with a mean absolute error of 6.8% in January, and the late monsoon with a mean absolute error of 4.9% in March. Overall, their predicted rainfall was closer to the real long period average (LPA) of rainfall than the IMD predictions were. In the years 2011 and 2012, the IMD had predicted a deficit, but actual rainfall was in excess of the LPA. In contrast, predictions by this new model matched the deficit and excess trend of actual rainfall in every year of the test period.
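
For readers unfamiliar with these measures, here is a rough sketch of how such scores can be computed. It assumes the error is expressed as a percentage of the LPA, which is a common convention but an assumption here, and the numbers are invented for illustration; they are not data from the study.

```python
import numpy as np

def mae_percent_of_lpa(predicted, observed, lpa):
    """Mean absolute error, expressed as a percentage of the LPA."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.mean(np.abs(predicted - observed)) / lpa * 100.0)

def trend_matches(predicted, observed, lpa):
    """For each year, True if forecast and observation fall on the same side of the LPA."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return np.sign(predicted - lpa) == np.sign(observed - lpa)

# Invented rainfall figures for four hypothetical years (arbitrary units).
lpa = 100.0
observed = [95.0, 104.0, 90.0, 102.0]
predicted = [97.0, 101.0, 94.0, 106.0]

mae = mae_percent_of_lpa(predicted, observed, lpa)
matches = trend_matches(predicted, observed, lpa)
print(mae)            # roughly 3.25 for these made-up numbers
print(matches.all())  # True: every year lands on the correct side of the LPA
```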

The team went a few steps further. They created separate autoencoder stacks, one fed only global sea level pressures and the other only sea surface temperatures. From the patterns obtained through these stacks, they derived distinct sets of monsoon predictions. The SLP-based model improved early monsoon prediction, while the SST-based model was most accurate in predicting all the extreme cases, both drought and excess.

All these results point to various prediction possibilities using the stacked autoencoder model. So how will this work evolve in the future?

“We are computer scientists,” states Saha on behalf of the IIT-Kharagpur team, “and we see this process in terms of data – we have huge amounts of data and are trying to find patterns in it.” This is a vastly different approach from that of the IMD, which typically works with finite physics-based models because its meteorologists know the basic physical processes behind the monsoon. Saha believes there is much to gain from working together: “We would like to collaborate with meteorologists and try to implement a hybridization of the two approaches.”

Hybrid models can improve overall prediction accuracy and also give a peek into the ever-evolving global influences on the Indian Monsoon. The team also hopes to look at spatial and temporal variations of monsoon across the country by adding local climatic factors to these models. In a rain-fed economy like India, these possibilities could be empowering.


About the scientists:

Moumita Saha is a Ph.D. research scholar at the Department of Computer Science and Engineering, Indian Institute of Technology – Kharagpur. You can write to her at moumita.saha2012@gmail.com

Dr. Pabitra Mitra is an Associate Professor at the Department of Computer Science and Engineering, Indian Institute of Technology – Kharagpur.

Prof. Ravi Nanjundiah is a Professor and Chairman of the Centre for Atmospheric & Oceanic Sciences, Indian Institute of Science, Bangalore. He can be reached at ravi@caos.iisc.ernet.in

About the research paper:

This work was published last year in Procedia Computer Science as a part of the International Conference on Computational Science.



This piece was developed as a press release for the Science Media Center at the Indian Institute of Science, Bangalore, India.

Posted by servingscienceblog

Hi! I'm Rajashree. Serving Science contains my weekly articles & musings on scientific news, concepts, research and pedagogy. If you'd like me to create scientific content for your organization or team, drop me an email.
