This discussion topic is meant to collect information and analyses for the known correlation between the Quasi-Biennial Oscillation (QBO) of stratospheric wind speeds and ENSO, see [1,2,3,4,5] refs at the bottom of this post.

What I find intriguing about this correlation is that the QBO periodicity is much stronger than the erratic ENSO periodicity. One would think this has implications for the predictability of ENSO: a stronger periodicity is more predictable than a weaker one, and since QBO is thought to drive ENSO in some way, we may be able to isolate a component of the time series. And if this forcing is strongly periodic, it may be used to project into the future.

The reason QBO is thought to be a driver is that the QBO winds downwell over the Pacific Ocean as they cycle. One can see this in the speed-versus-altitude plots, where a higher atmospheric pressure corresponds to a lower altitude.

This tends to push on the Pacific Ocean surface with the same cycle, causing water to pile up in the windward direction periodically.

The QBO has an average period of about 28 months since data collection started in 1952 (data link), which explains the quasi-biennial aspect. However, the cycles show a measurable amount of jitter, that is, a fluctuation in a given cycle's period. One question to consider is whether this jitter is random or shows an extra periodicity, which may be due to tidal beating (?); see http://contextearth.com/2014/06/17/the-qbom/ for some evidence that I collected.
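The mean-period and jitter estimates above can be illustrated with a small sketch (my own code, not tied to the actual wind data): measure the intervals between upward zero crossings of an oscillating series. A synthetic 28-month cycle with deliberate phase wander stands in for the real QBO record.

```python
import math
import random

random.seed(0)

def cycle_periods(series):
    """Intervals (in samples) between successive upward zero crossings."""
    crossings = [i for i in range(1, len(series))
                 if series[i - 1] < 0 <= series[i]]
    return [b - a for a, b in zip(crossings, crossings[1:])]

# 50 years of monthly samples of a 28-month cycle with random phase jitter.
series = []
phase = 0.0
for _ in range(600):
    phase += 2 * math.pi / 28 + random.gauss(0, 0.02)  # jittered phase step
    series.append(math.sin(phase))

periods = cycle_periods(series)
mean_period = sum(periods) / len(periods)
jitter = math.sqrt(sum((p - mean_period) ** 2 for p in periods) / len(periods))
print(round(mean_period, 1), round(jitter, 1))
```

On the real monthly wind series the same zero-crossing bookkeeping would give the ~28-month average and a cycle-by-cycle jitter series that could then be tested for extra periodicity.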

The recent claim in [5] is that the ENSO and QBO time series have almost aligned over a recent 5-year interval. See the figure below.

That near-correlation could be just happenstance, as in general the waveforms don't align, with ENSO being much more erratic and QBO showing a stricter periodicity. Another characteristic of QBO is that the peaks are generally broader and more flat-topped than the valleys. But now that I look at what others have plotted, the asymmetry is not quite as apparent (from here).

Yet, this asymmetry is not generally seen in ENSO.

I have been analyzing the use of the fundamental frequency of QBO as a driver in a nonlinear differential equation model of ENSO, and the results are intriguing enough that I have pursued this over several ENSO data sets, such as SOI and ENSO proxies. Start here and follow the links backwards. The general finding is that periods close to the QBO period are strong candidates for a forcing function, but the nonlinear function causes a transformation such that the response is much more erratic.

That's the general idea and I will post more analyses related to QBO below this entry.

Paul Pukite

[1] N. Calvo, M. A. Giorgetta, R. Garcia‐Herrera, and E. Manzini, “Nonlinearity of the combined warm ENSO and QBO effects on the Northern Hemisphere polar vortex in MAECHAM5 simulations,” Journal of Geophysical Research: Atmospheres (1984–2012), vol. 114, no. D13, 2009.

[2] M. Geller and W. Yuan, “QBO-ENSO Connections and Influence on the Tropical Cold Point Tropopause,” presented at the AGU Fall Meeting Abstracts, 2011, vol. 1, p. 03.

[3] S. Liess and M. A. Geller, “On the relationship between QBO and distribution of tropical deep convection,” Journal of Geophysical Research: Atmospheres (1984–2012), vol. 117, no. D3, 2012.

[4] W. M. Gray, J. D. Sheaffer, and J. A. Knaff, “Influence of the stratospheric QBO on ENSO variability,” J. Meteor. Soc. Japan, vol. 70, pp. 975–995, 1992.

[5] J. L. Neu, T. Flury, G. L. Manney, M. L. Santee, N. J. Livesey, and J. Worden, “Tropospheric ozone variations governed by changes in stratospheric circulation,” Nature Geoscience, vol. 7, no. 5, pp. 340–344, 2014.

P.S. I almost finished writing this a few days ago, but I inadvertently hit the back button. I had to wait a few days to get my motivation back.

The work done in the past had centered around developing 'toy' climate models for purposes of understanding and intellectual experiment. For example, a JavaScript-based interactive program for a model of stochastic resonance, which could have application to understanding glaciation cycles.

Times have changed, and the current focus revolves around applied category theory. But, interestingly, as we are seeing through the buzz of activity around the MIT 2020 Programming with Categories course, one of the really exciting, tangible and pragmatic applications of category theory is to the realm of functional programming.

So it occurs to me that there may be potential for the code project to awaken in a new form. I am starting this discussion as a placeholder for people to share any ideas they might have on the topic.

Time to take inventory!

The model essentially emulates a 2nd-order wave equation as described by Allan Clarke [1]. The characteristic frequency $\omega$ has a period of approximately 4.25 years. This is modulated by variations due to TSI, and forced by QBO wind variations, angular forcing variations characterized by the Chandler wobble effect, and by TSI itself. The result is a differential formulation with the characteristics of a Mathieu or Hill equation. This formulation is well known in the hydrodynamics literature describing sloshing of liquid volumes [2][3].

$ f''(t) + (\omega^2 + k TSI(t)) f(t) = QBO(t) + TSI(t) + CW(t) $

To solve this equation, I tried to create closed-form expressions for the known forcings and then applied the expressions to solve the differential equation using Mathematica. The number above each chart is the correlation coefficient multiplied by 100. These formulations are pseudo-periodic so that the tweaking I applied is to best emulate the experimentally measured oscillations, without going overboard in fidelity. The curve labeled SOIM is the model of the ENSO SOI data.
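For readers without Mathematica, the structure of the calculation can be reproduced numerically. The sketch below (my own illustration, not the fitted model) integrates the equation above with a fixed-step RK4 integrator; the forcings QBO(t), TSI(t) and CW(t) are stand-in sinusoids with roughly the right periods (28 months, 11 years, 6.4 years), whereas the real model uses fitted pseudo-periodic expressions.

```python
import math

def qbo(t):  return 0.3 * math.sin(2 * math.pi * t / (28 / 12))  # t in years
def tsi(t):  return 0.1 * math.sin(2 * math.pi * t / 11.0)
def cw(t):   return 0.2 * math.sin(2 * math.pi * t / 6.4)

OMEGA = 2 * math.pi / 4.25   # characteristic frequency, 4.25-year period
K = 0.5                      # Mathieu-type modulation strength (illustrative)

def rhs(t, y):
    # f'' = QBO + TSI + CW - (omega^2 + k*TSI) f  (the equation above)
    f, fp = y
    fpp = qbo(t) + tsi(t) + cw(t) - (OMEGA**2 + K * tsi(t)) * f
    return (fp, fpp)

def rk4(t0, y0, t1, n):
    """Classic fixed-step Runge-Kutta 4 over n steps; returns (t, f) pairs."""
    h = (t1 - t0) / n
    t, y = t0, list(y0)
    out = [(t, y[0])]
    for _ in range(n):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, [y[i] + h / 2 * k1[i] for i in range(2)])
        k3 = rhs(t + h / 2, [y[i] + h / 2 * k2[i] for i in range(2)])
        k4 = rhs(t + h, [y[i] + h * k3[i] for i in range(2)])
        y = [y[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
             for i in range(2)]
        t += h
        out.append((t, y[0]))
    return out

solution = rk4(0.0, (0.0, 0.0), 70.0, 70 * 120)  # 70 years, ~3-day steps
print(len(solution), max(abs(f) for _, f in solution))
```

The Mathieu character enters through the `K * tsi(t)` modulation of the stiffness term; with fitted rather than stand-in forcings, the output would be the SOIM curve described above.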

As an example of a tweak, the TSI curve was broken into two pieces at 1980. Before that point, the 22-year cycle was more evident, and after it the 11-year cycle was stronger. The CW correlation coefficient seems low because the period appears to exhibit a phase reversal before 1930. Whether or not this is a real physical transition I am not sure, so I left it alone.

The million-dollar question is whether such an overall fit is possible just by chance selection of these factors. Each one of the factors, QBO [4], CW [5], and TSI [6], is suggested in the literature as important to ENSO behavior.

I wrote up the paper on arXiv. I submitted it to a journal but it got rejected without review.

[1] A. J. Clarke, S. Van Gorder, and G. Colantuono, “Wind stress curl and ENSO discharge/recharge in the equatorial Pacific,” Journal of physical oceanography, vol. 37, no. 4, pp. 1077–1091, 2007.

[2] O. M. Faltinsen and A. N. Timokha, Sloshing. Cambridge University Press, 2009.

[3] J. B. Frandsen, “Sloshing motions in excited tanks,” Journal of Computational Physics, vol. 196, no. 1, pp. 53–87, 2004.

[4] W. M. Gray, J. D. Sheaffer, and J. A. Knaff, “Influence of the stratospheric QBO on ENSO variability,” J. Meteor. Soc. Japan, vol. 70, pp. 975–995, 1992.

[5] R. S. Gross, “The excitation of the Chandler wobble,” Geophysical Research Letters, vol. 27, no. 15, pp. 2329–2332, 2000.

[6] L. Kuai, R.-L. Shia, X. Jiang, K. K. Tung, and Y. L. Yung, “Modulation of the period of the quasi-biennial oscillation by the solar cycle,” Journal of the Atmospheric Sciences, vol. 66, no. 8, pp. 2418–2428, 2009.

This is aptly described by NASA oceanographer Josh Willis as the Earth's Heat Bucket.

After combining all the data, Willis found that between mid-1993 and mid-2003, the heat content of the upper 750 meters of Earth’s global ocean increased at an average rate of 0.86 watts (plus or minus 0.12 watts) per square meter. Just 0.86 watts per square meter may not sound like much until you consider that we are talking about an area of about 337 trillion square meters (the 93 percent of the world ocean that Willis studied).

James Hansen ran five climate simulations covering the years 1880 to 2003 to estimate change in Earth’s energy budget.

Taking the average of the five model runs, the team found that over the last decade, heat content in the top 750 meters of the ocean increased by 6.0 plus or minus 0.6 watt-years per square meter.
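The two sets of numbers quoted above can be checked against each other with back-of-envelope arithmetic (my calculation, not from the articles): Willis's 0.86 W/m^2 over the studied ocean area gives the total heating power, and Hansen's accumulated 6.0 watt-years/m^2 over a decade converts to an average flux.

```python
area = 337e12               # m^2, the 93% of the world ocean Willis studied
flux = 0.86                 # W/m^2, Willis's mid-1993 to mid-2003 rate
total_power = flux * area   # watts absorbed by the upper 750 m

per_year_flux = 6.0 / 10    # Hansen: 6.0 watt-years/m^2 over the last decade

print(f"total heating power ~ {total_power:.2e} W")
print(f"Hansen decade-average flux ~ {per_year_flux:.2f} W/m^2")
```

The Willis rate corresponds to roughly 3×10^14 W of continuous heating, and Hansen's decade average of 0.6 W/m^2 is the same order of magnitude as Willis's 0.86, so the two estimates are broadly consistent.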

The model described in Gerald North's book, A Simple Climate Model starts with a linear deterministic model of a single slab of the ocean (a 1-box model).

In this code the model will go down to a depth of 80 m.

The NOAA's Office of Climate Observation (OCO) concisely describes some of the physical aspects of oceans and climate:

Water vapor, evaporated from the ocean surface, provides latent heat energy to the atmosphere during the precipitation process. In units of 1,000 km^3 per year, evaporation E over the oceans (436) exceeds precipitation P (399), leaving a net of 37 units of moisture transported onto land as water vapor. On average, this flow must be balanced by a return flow over and beneath the ground through river and stream flows, and subsurface ground water flow. The average precipitation rate over the oceans exceeds that over land by 72% (allowing for the differences in areas), and precipitation exceeds evapotranspiration over land by this same amount (37) (Dai and Trenberth 2002). This flow into the oceans occurs mainly in river mouths and is a substantial factor in the salinity of the oceans, thus affecting ocean density and currents. A simple calculation of the volume of the oceans of about 1330×10^6 km^3 and the through-flow fluxes of E and P implies an average residence time of water in the ocean of over 3,000 years.
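The residence-time estimate at the end of that quote can be reproduced directly from the numbers it gives (ocean volume in km^3, evaporation flux in km^3 per year):

```python
ocean_volume = 1330e6      # km^3, volume of the oceans
evaporation = 436e3        # km^3 per year evaporated over the oceans
residence_time = ocean_volume / evaporation
print(round(residence_time))   # -> 3050, i.e. "over 3,000 years"
```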

Changes in phase of water, from ice to liquid to water vapor, affect the storage of heat. However, even ignoring these complexities, many facets of the climate can be deduced simply by considering the heat capacity of the different components of the climate system. The total heat capacity depends on the mass of the substance involved as well as its capacity for holding heat, as measured by the specific heat of sea-water...

and

The atmosphere does not have much capability to store heat. The heat capacity of the global atmosphere corresponds to that of only a 3.2 m layer of the ocean. However, the depth of ocean actively involved in climate is much greater than that. The specific heat of dry land is roughly a factor of 4.5 less than that of seawater (for moist land the factor is probably closer to 2). Moreover, heat penetration into land is limited by the low thermal conductivity (the degree to which a substance transmits heat) of the land surface; as a result only the top two meters or so of the land typically play an active role in heat storage and release (e.g., as the depth for most of the variations over annual time scales). Accordingly, land plays a much smaller role than the ocean in the storage of heat and in providing a memory for the climate system. Major ice sheets, like those over Antarctica and Greenland, have a large mass but, like land, the penetration of heat occurs primarily through conduction (molecular transfer of energy due to a temperature gradient), so that the mass experiencing temperature changes from year to year is small. Hence, ice sheets and glaciers do not play a strong role in heat capacity, while sea ice is important where it forms.
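The "3.2 m layer" claim can be roughly sanity-checked (my own numbers, using dry-air values): the column mass of the atmosphere is surface pressure divided by g, and equating column heat capacities gives the equivalent ocean depth.

```python
g = 9.81                  # m/s^2
p_surface = 1.013e5       # Pa, mean surface pressure
cp_air = 1004.0           # J/(kg K), specific heat of dry air
rho_sea = 1025.0          # kg/m^3, seawater density
cp_sea = 3985.0           # J/(kg K), specific heat of seawater

air_column_mass = p_surface / g                 # ~10^4 kg per m^2 of surface
air_heat_capacity = air_column_mass * cp_air    # J/(m^2 K)
equivalent_depth = air_heat_capacity / (rho_sea * cp_sea)   # meters of ocean
print(round(equivalent_depth, 1))
```

The dry-air estimate comes out around 2.5 m, the same order as the quoted 3.2 m; the difference is presumably due to moisture and the exact constants used.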

This link is the most recent revisit to the model, providing a mechanistic view of how the model is constructed.

http://ContextEarth.com/2015/01/30/csalt-re-analysis/

This is the first post in 2013. From that point onward, I only used data up to Oct 2013 for training and fitting.

http://ContextEarth.com/2013/10/26/csalt-model/

This is an index to most of the CSALT posts

http://ContextEarth.com/context_salt_model/

So it's now 2016, and I figured I would check how well the CSALT model did in capturing the temperature variation of 2014 and 2015, relying only on training from 1880 to 2013.

This is the extrapolation with updated CO2 + SOI + Aero + LOD + TSI data. Extra periodic factors, capturing mainly long periods associated with lunisolar cycles (which tended to improve the fit for 1880-2013), were left as is and simply projected forward.

The figure below is a zoomed version where you can see the plateau, and then the numbers snapping back up. The combination of factors worked to compensate for the plateau, and when they came into a constructive phase, the modeled trend got back in line with the temperature upswing. That all happened in 2014 and 2015, which the model did a good job of projecting.

There's nothing in this model that contradicts mainstream climate science findings. It finds a Transient Climate Response of over 2 C per doubling of CO2, which is in line with the Equilibrium Climate Sensitivity of 3 C per doubling after the oceans equilibrate.

The reason I became interested in an ENSO model is that being able to predict ENSO dynamics should help in predicting the temperature movement.

- Mike Martin, Glycolytic oscillations: the Higgins–Selkov model.

I'd like to see more such things, since I'm looking at a lot of chemical reaction networks these days.

This evening Dara Shayda has - in just a couple hours! - created a demo that relies on the Wolfram Cloud:

- Dara Shayda, Glycolytic oscillations: the Higgins–Selkov model.

I believe Mike Martin's website relies on Mathematica too.

It's great that Dara can create such a website so fast. I could add text to this and turn it into a nice little introduction to this model of glycolytic oscillations. How could we make it even nicer?

Both these websites require that you type in some numbers. It would be more intuitive to use sliders. Dara had created a website that uses sliders, but the latency makes it very frustrating to use: you can try to slide the slider, but it just sits there for a while.

Using Javascript, some of the Azimuth gang made a webpage that works on my website, uses sliders to take inputs, and uses your browser to run a simple climate model in real time:

- Michael Knap and Taylor Baldwin, A simple stochastic energy balance model.

It looks and feels great. I get the feeling that the main reason people don't do more of these is that people hate doing math programming in Javascript.

So, there seems to be some tradeoff between what's quick and easy to program (which is very important) and what feels nice to the user.

Here I am only interested in things that end users can run, using a web browser or maybe even a mobile phone, without downloading any special software.

I wrote this up on my blog and will elaborate more if there is some interest: http://contextearth.com/2015/05/25/changes-in-the-angular-momentum-of-the-earth/

I still haven't seen much in the way of research characterizing ENSO as a forced sloshing model, using the angular momentum modulation factors as input. (Sometimes I have to wonder who decides what physics goes into climate models.)

Daniel's post on sloppy models has got me thinking of how to push these simplified models.

https://aws.amazon.com/datasets/Climate

High-resolution climate data to help assess the impacts of climate change, primarily on agriculture. These open-access datasets of climate projections will help researchers make climate change impact assessments.

Three NASA NEX datasets are now available, including climate projections and satellite images of Earth.

A collection of daily weather measurements (temperature, wind speed, humidity, pressure, &c.) from 9000+ weather stations around the world.

The overall trend is scaled log(CO2). The multidecadal wobble is deviations of the Length-of-Day (-dLOD).

The fine structure is made up of the ENSO Southern Oscillation Index (-SOI) which is the atmospheric pressure at Tahiti minus the pressure at Darwin.

Any place the fine structure is not represented well by SOI, cooling spikes caused by significant volcanic eruptions are more than likely to blame. Thus, the model includes volcanic eruptions with a volcanic explosivity index (VEI) of at least 5.

Those are the main ingredients so far. Because the model works so well, I have a few more factors that I experiment with. One is a correction that I apply for the WWII years where the temperature calibration appears to be very poor.
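To make the kind of fit involved concrete, here is a sketch (my own illustration, not the actual CSALT code; the data are synthetic with known coefficients) of an ordinary-least-squares regression of temperature on log(CO2) plus an oscillatory stand-in factor, solved with the normal equations in pure stdlib Python.

```python
import math
import random

random.seed(1)

def ols(X, y):
    """Solve the normal equations X^T X b = X^T y by Gaussian elimination."""
    k = len(X[0])
    A = [[sum(X[r][i] * X[r][j] for r in range(len(X))) for j in range(k)]
         for i in range(k)]
    b = [sum(X[r][i] * y[r] for r in range(len(X))) for i in range(k)]
    for i in range(k):                       # elimination with partial pivoting
        p = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        b[i], b[p] = b[p], b[i]
        for r in range(i + 1, k):
            m = A[r][i] / A[i][i]
            for c in range(i, k):
                A[r][c] -= m * A[i][c]
            b[r] -= m * b[i]
    coef = [0.0] * k
    for i in reversed(range(k)):             # back substitution
        coef[i] = (b[i] - sum(A[i][c] * coef[c]
                              for c in range(i + 1, k))) / A[i][i]
    return coef

rows, temps = [], []
for t in range(134):                         # 1880-2013, one row per year
    log_co2 = math.log(290 + 0.6 * t)        # synthetic CO2 ramp, ppm
    soi = math.sin(2 * math.pi * t / 3.7)    # stand-in SOI-like wiggle
    rows.append([1.0, log_co2, soi])
    temps.append(-17.0 + 3.0 * log_co2 - 0.1 * soi + random.gauss(0, 0.02))

a, b_co2, b_soi = ols(rows, temps)
print(round(b_co2, 2), round(b_soi, 2))      # should recover ~3.0 and ~-0.1
```

Multiplying the recovered log(CO2) coefficient by ln 2 then gives the corresponding per-doubling sensitivity, which is how a TCR-like number falls out of this kind of regression.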

I have shown this around on various blogs, and the most common criticism I get concerns over-fitting. Fitting curves and manifolds to data is what science is all about when it comes down to it, so I guess that criticism comes with the territory.

This is the code that reads the new GPM satellite(s) data, part Python and part Mathematica:

I have not fully ported my Mathematica scripts, but will very soon.

The data is massive, and even with a 16-CPU server (16-way I/O) we find reading large regions of data quite CPU- and memory-intensive.

But once the data is read, we convert it into CDF files and video flip-books for general consumption on desktop machines.

GPL 2.0 license for now.

Our hope is to access all the data from all these satellites and process it in real time for public consumption.

This is the satellite data download sample for the new orbital satellites' recordings:

Friday Dec 12, start=12hr, end=16hr

The algorithm applies a Gaussian filter to smooth out the holes.

Dara

My idea is that resonances cascade, starting with the strongest biennial and annual signals. These strong forcings couple first to the most compliant medium -- the low-density fluid of the stratosphere -- and create the first resonance, that of the QBO. This resonance then couples to the higher-inertia, less compliant medium of the ocean, creating the ENSO resonance.

The figure below shows a simple model of the semiannual oscillation, found in the very thin upper stratosphere, and how it transforms into the quasi-biennial oscillation found in the lower stratosphere.

At the highest altitude the response function follows the forcing, but as it moves lower, the resonance gradually takes over as the fluid response.
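A toy calculation conveys the sense in which the resonance "takes over" (my own illustration, with an assumed damping value): the gain of a lightly damped resonator with a 28-month natural period is small at the semiannual forcing period but very large near resonance.

```python
import math

def gain(period_months, natural_months=28.0, damping=0.05):
    """|H| of f'' + 2*z*w0*f' + w0^2 f = forcing, at the given forcing period."""
    w0 = 2 * math.pi / natural_months
    w = 2 * math.pi / period_months
    return 1.0 / math.sqrt((w0**2 - w**2)**2 + (2 * damping * w0 * w)**2)

g_semiannual = gain(6.0)      # response at the direct semiannual forcing period
g_resonant = gain(28.0)       # response near the QBO-like natural period
print(round(g_resonant / g_semiannual))
```

With this (assumed) damping, the gain near resonance exceeds the gain at the semiannual period by a couple of orders of magnitude, which is the cascade mechanism in miniature.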

The contour is slanted because I added a linear phase shift as the waveform moves down the atmosphere.

That's the conceptual model that is trying to capture the following observed behavior.

This is a nice short review of clustering algorithms, a more modern treatment and more applicable to your applications:

Community detection by graph Voronoi diagrams

Basically, we should be able to take the atmospheric volumetric data and cluster it this way, turning it into time-varying clusters.

I took rather sketchy notes because they said their slides would be made public. Here are my notes:

We don't know how climate change will affect the tails of the temperature distribution - the probability of extreme events.

World Climate Research Programme 2013 grand challenge: understanding and improving predictions of extreme weather events.

Climate models are a very interesting playground for latent variables - in a climate model you can measure variables that can't be seen otherwise.

There's a lot of low-hanging fruit for machine learning techniques.

There's a workshop called Climate Informatics, which first met in 2011. The next meeting will be on September 25-26 in Boulder, Colorado.

Li, Nychka and Ammann, *JASA*, 2010 - Bayesian hierarchical model used to study "hockey stick".

Paleoclimate data can be reconstructed using "sparse matrix completion techniques" - if we can discover latent structure.

Also "data fusion" - combining data from different sources - is important.

Chatterjee et al, *SDM* 2012 studied the influence of ocean temperatures on land temperature and precipitation. They used the "sparse group lasso" for high-dimensional regression. The idea: only a few ocean locations are relevant. This reminds me of Daniel Mahler's attempt to locate the most significant regions for El Niño prediction.

Climate downscaling: LatticeKrig is a method for "spatial downscaling" to generate good *local* climate predictions from global ones. Benestad et al *NCC* 2012.

Why use model ensembles?

"Ensembles of opportunity" - when there just happen to be lots of modeling groups making models.

"Initial condition ensembles" - change initial conditions to increase robustness of forecasts

"Perturbed physics ensembles" - change parameters in the model to increase robustness.

The Coupled Model Intercomparison Project or CMIP tried to improve the results of the IPCC ensemble. Average prediction over all models is better than any one. But what's the *best* way to combine model predictions? It's not simply taking the average.

Tracking Climate Models is a method of finding the best way to combine forecasts - a method which *changes with time* as conditions change.

Jia DelSole and Tippett, *J. Climate* 2013 - discovered a low intrinsic dimensionality of world climate models, due to El Niño and a few other features! This looks interesting for our El Niño project.

Kawale et al, *SDM* 2011, Steinback et al, *KDD* 2003 did automated discovery of pressure dipoles. This also looks very interesting, combining machine learning ideas with Paul Pukite's fondness for dipoles!

Ebert-Uphoff et al, A new type of climate network based on probabilistic graphical models: results of boreal winter versus summer *J. Clim.* 2012 used Bayesian network ideas to infer causal relationships between the 4 biggest teleconnections!

Deng et al, *GRL* 2014 - in climate models, information flow in the weather diminishes as we move forwards in time in global warming scenarios: the biggest northern-latitude "hubs" in climate networks disappear; remaining hubs move poleward.

Some of the top names in algorithm design do not know how to code! I worked with them on a number of occasions over a number of years. So John, you do not need to be a fancy programmer to design algorithms.

Specific to this field of climate, we need a designer of algorithms with mathematical prowess; the coding is easy enough.

Here is Random Forest, it is Mathematica v10 built in:

Random Forest Forecast for El Nino 3.4

I added the standard ANOVA-like error analysis. I dislike this kind of antique error analysis, but remember, this is off-the-shelf.

If you need more details or changes to the training of the algorithm, I need to call TechSupport, which I do not mind doing.

Dara

Check this out:

Parallel Clustering with applications to climatology

This could be used to cluster the massive data on global scales.

The C version is here.

Here are some of the basic facts:

John is looking for material for an upcoming NIPS talk

We studied the Ludescher et al. paper

Graham reproduced their results (essentially)

We decided to focus on predicting a continuous El Niño index

We are focusing our attention on machine-learning-based inferences from temperature grid data

John has blogged about all of the above

He now wants to blog about climate networks

- Info: http://www.esrl.noaa.gov/psd/people/cathy.smith/best/
- Data: http://www.esrl.noaa.gov/psd/people/cathy.smith/best/enso.ts.1mn.txt

This is a good one for doing machine learning on because it is relatively free from noise and shows little by way of a trend. It is all oscillations.

The machine learning finds the usual QBO forcing period of around 28 months, and a Mathieu-like modulation of 9 to 12 year periods. It also finds a characteristic period of a little over 4 years, spanning an interval running back to 1880.

The top chart has a fit with double the complexity of the second chart. Both correlation coefficients are above 0.85.
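The kind of spectral check behind these period estimates can be sketched with a naive discrete-Fourier periodogram (my own illustration on synthetic data, not the actual index): a series containing a 28-month and a ~51-month (4.25-year) component stands in for the real data, and the two dominant periods are read off the spectrum.

```python
import cmath
import math

N = 1320                       # 110 years of monthly samples
series = [math.sin(2 * math.pi * t / 28)
          + 0.7 * math.sin(2 * math.pi * t / 51)
          for t in range(N)]

def periodogram(x):
    """Naive DFT power spectrum; returns (power, period-in-months) pairs."""
    n = len(x)
    power = []
    for k in range(1, n // 2):
        s = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append((abs(s) ** 2, n / k))
    return power

top = sorted(periodogram(series), reverse=True)[:2]
periods = sorted(p for _, p in top)
print([round(p, 1) for p in periods])   # -> [28.1, 50.8]
```

On real data one would see leakage and noise around these peaks, but the same bookkeeping recovers a ~28-month QBO-like line and a ~4-year characteristic period when both are present.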

This is the simplest algorithm for machine-learning forecasts. In particular, it looks for signal self-similarity and averages the outcomes from each similar case; this is called nearest-neighbour forecasting.

An error of about 9% is found, similar to SVR and NN.

All known and popular distance functions were tried, but the most accurate forecasts were obtained by non-Euclidean metrics.
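A minimal sketch of analog (nearest-neighbour) forecasting as described above (my own code, illustrated on a synthetic quasi-periodic series; for simplicity it uses squared-Euclidean distance, whereas the post found non-Euclidean metrics more accurate):

```python
import math

def knn_forecast(series, window=12, horizon=6, k=3):
    """Average the `horizon` values that followed the k nearest past windows."""
    query = series[-window:]
    candidates = []
    for start in range(len(series) - window - horizon):
        w = series[start:start + window]
        dist = sum((a - b) ** 2 for a, b in zip(w, query))
        candidates.append((dist, start))
    candidates.sort()
    forecast = [0.0] * horizon
    for _, start in candidates[:k]:
        follow = series[start + window:start + window + horizon]
        for i, v in enumerate(follow):
            forecast[i] += v / k
    return forecast

series = [math.sin(2 * math.pi * t / 28) for t in range(400)]
fc = knn_forecast(series)
truth = [math.sin(2 * math.pi * t / 28) for t in range(400, 406)]
err = max(abs(a - b) for a, b in zip(fc, truth))
print(round(err, 3))   # -> 0.0 (a pure sine has exact past analogs)
```

On a noisy real index the past analogs are only approximate, which is where the ~9% error level comes from.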

http://twitter.com/WSI_Energy/status/534405282420768769/photo/1

Note the white spot north of the Himalayas

These don't happen too often. Read more here http://en.wikipedia.org/wiki/Sudden_stratospheric_warming

In this post, I’ll discuss why the analog approach to forecasting often delivers disappointing results. Basically, it doesn’t work well because there are usually very few, if any, past cases on record that mimic the current situation sufficiently closely. The scarcity of analogs is important because dissimilarities between the past and the present, even if seemingly minor, amplify quickly so that the two cases end up going their separate ways.

What they are going by:

Van den Dool (1994)’s "Searching for analogs, how long must we wait?" calculates that we would have to wait about 10^30 years to find 2 observed atmospheric flow patterns that match to within observational error over the Northern Hemisphere. While the ocean is not as changeable as the atmospheric flow, it is clear that finding close matching analogs would also require a very long historical dataset.
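A back-of-envelope version of that estimate (my assumptions, not Van den Dool's actual calculation): if a flow pattern has M effectively independent degrees of freedom and each matches a new pattern within observational error with probability p, the expected wait for a full match is (1/p)^M samples.

```python
M = 100          # assumed independent degrees of freedom in the hemispheric flow
p = 0.5          # assumed per-component match probability
wait = (1 / p) ** M
print(f"{wait:.1e}")   # -> 1.3e+30, the right order of magnitude for 10^30
```

The point is how brutally the wait time scales with M, which is why exact-analog searching is hopeless even though each individual component matches often.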

This is counter to what I am finding. It is well known that periodic forcing when applied to a near-chaotic system can re-align the behavior to a more deterministic regime.

Osipov, Grigory V, Jürgen Kurths, and Changsong Zhou. Synchronization in Oscillatory Networks. Springer, 2007.

I think that the reason the underlying ENSO pattern has not been found yet is that they may not have looked hard enough. The 10^30 years is a red herring.

They do say that there is some hope, using alternative methods:

Although this uncertainty in outcomes is somewhat smaller than what we would have if we selected years completely randomly from the history, it is larger than that from our most advanced dynamical and statistical models. This is one reason analog forecasting systems have been largely abandoned over the last two decades as more modern prediction systems have proven to provide better accuracy.

Theorem (information non-increase): for computable functions $f$,

$ K(f(x)) \leq K(x) + K(f) $

If you run a computable function, e.g. a software algorithm, on data, you cannot increase its information content by more than the length of the program itself.

Therefore, if you are making a computational forecast based upon finite data, your forecast has a certain accuracy, which is related to the total amount of information encapsulated in the data. If you run computational functions on that data, e.g. link strength or averaging, you cannot increase the amount of information in that data by much, i.e. almost nothing.

Therefore, manipulating the signal with computational functions will not increase its forecast accuracy by much, since it will not increase its amount of information.

What you could do is add new data, which adds new information; with the amount of information increased, you could possibly then make a more accurate forecast.
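A crude empirical illustration of the theorem (my own demo, using compressed length as a computable stand-in for Kolmogorov complexity, which is itself uncomputable): running a computable transform such as a moving average over the data does not increase its compressed size by more than roughly the size of the transform itself.

```python
import random
import zlib

random.seed(42)
data = bytes(random.randrange(256) for _ in range(4096))  # near-incompressible

def moving_average(xs, w=8):
    """A computable transform f: windowed average of the byte stream."""
    return bytes(sum(xs[i:i + w]) // w for i in range(len(xs) - w))

c_data = len(zlib.compress(data, 9))
c_out = len(zlib.compress(moving_average(data), 9))
print(c_data, c_out)
```

The averaged stream is smoother, so it typically compresses to no more (usually less) than the raw data, consistent with $K(f(x)) \leq K(x) + K(f)$.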

I have seen this short-term memory in stock market signals as well.

But my main wonder is: if there is such a memory, where is the structure that maintains it? Materials with shape memory, digital memory systems, and human neuronal memory all have a structure that processes and maintains the memory. So where is the atmospheric memory system, and what is it comprised of?

Dara

PS. Paul, would this qualify as crackpot science?

A bit of an improvement on the error compared to SVR: the 9% is reduced to 7.9% for NN, but the max deviation range increased a bit.

**IMPORTANT: error for each of the 6 months was at the same level, not increasing as in the case of SVR**

3 layers, with input and hidden layers each of length 37; in other words, the training samples are vectors of length 37.

Output layer of length 6, for next 6 months forecasts.

300 sweeps of the entire data, each step repeated 30 times, for a total of 9000 learning sessions.

Dara

SVR delta forecast of El Nino 3.4 Anomalies

Could you check and see if I took the right data? I copied it from the github address you had issued earlier:

https://raw.githubusercontent.com/johncarlosbaez/el-nino/master/R/nino3.4-anoms.txt

I could easily switch to another data.

I included all the algebra and math for SVR; I used someone else's professional mathematical writing rather than my own.

Dara
