How to calculate air-sea gas fluxes using R

A geeky post. If you haven’t a clue what I’m talking about, then apologies. If you’d like to know more, let me know and I’ll start from scratch some time.

A few months back I published a paper in the Open Access journal, Ocean Science, which described how to calculate the air-sea transfer velocity for any gas. It’s long and pretty tedious but brings together a whole bunch of stuff to enable other people to use the scheme to get the numbers they need without having to spend months trawling the literature for solubility data, diffusion coefficients, the equation of state for seawater etc. So all in all I hope it’s a useful contribution to science.

Along with the paper was an implementation of the scheme in R which is free for anybody to use. Since then I’ve put a copy up on github under an open source license. AS well as the core scheme from the paper there is a set of R functions which allow the reproduction of all the figures in the paper. I will also add extensions to the scheme into the github repository as I deem them useable by the community and/or publish them in future papers. Coming soon: chemical enhancement in the liquid and gas phases. Very exciting.

A few people have recently taken an interest in using the program to calculate air-sea trace gas fluxes for their gases of interest so I thought it would be useful to document how you might do that, starting with a spreadsheet containing temperature, salinity, wind speed and concentrations in both phases of the gas in question on each row, when one row equates to one flux calculation.

Here’s an example of some input data. It’s not very realistic a) because all the data is generated by picking random numbers from a gaussian distribution – most environmental data is skewed, especially concentration data and b) I don’t have time to check what a realistic concentration range for CH3I in atmosphere or ocean is. If someone would like to populate that spreadsheet with more realistic data in the future that’d be grand.

Anyway, step one is to generate a comma spaced value (.csv) file from your spreadsheet. In the case of the above you can get a csv version here, or in the case of your own data in (god forbid) Excel or some other spreadsheet, there’ll be a save as csv option.

So you’ve got your CSV file. Here I’ll assume that you have a working install of R pointed at your working directory which already contains the files from the github repository which make up the transfer velocity scheme. To test if the latter is the case, type (at the R console)


If nothing happens (i.e. R doesn’t complain that it has no idea what you’re talking about) then you’re good to go. Put your CSV file in the same directory (or somewhere else of your choosing if your happy to manipulate paths etc).

Now, to load that file in:


read.csv assumes that the first row is headings for each of the columns so this has worked out of the box. If your data is formatted differently you might need to use read.table, or specify header=FALSE in the command. For more info type


Anyway, the data is in and all is good at my end so we’ll press on.

In an R data frame you can get to each column of data in a number of ways. e.g.

> my_data[,3]
[1] 35.76926 35.36087 35.61887 35.35569 35.78936 35.47348 35.67157 35.45667
[9] 35.38169 35.77596 35.13184 35.57471 35.74122 35.60888 35.26568 34.96617
[17] 35.24797 35.19071 35.32195 35.34880
> my_data[,"Salinity"]
[1] 35.76926 35.36087 35.61887 35.35569 35.78936 35.47348 35.67157 35.45667
[9] 35.38169 35.77596 35.13184 35.57471 35.74122 35.60888 35.26568 34.96617
[17] 35.24797 35.19071 35.32195 35.34880

for convenience, let’s define some variables which are easy to remember:

> sal<-my_data[,"Salinity"]
> temp<-my_data[,"Temperature"]
> wind<-my_data[,"Windspeed"]
> sw_concn<-my_data[,6]
> atm_concn<-my_data[,5]

Simples. Next we need to use the K_calcs… program to calculate the transfer velocity for each compound.

make sure it’s loaded:


The function Kw calculates total transfer velocity for a gas with respect to a water phase concentration gradient.

You can see what a function does in R by calling it with no parentheses:

> Kw
#calculate total transfer velocity (Kw) after Liss and Slater

You can see from the above that the transfer velocity scheme is a series of nested function with the key parameters [compound, temperature, wind speed etc] all passed around between them. you can trace these through by looking at each function in turn in the same way as we did above for Kw. In fact you can call any function directly so if, e.g. you need the diffusivity of ammonia in air at 30 degrees you can call:

> D_air("NH3",30)
[1] 0.2111959

(diffusivities are in cm2/s)

Back to Kw. To calculate a single Kw value (e.g. for CH3I at 20 celcius, 5m/s windspeed and S=35) you’d do this:

> Kw("CH3I",20,5,35)
[1] 1.550430e-05

Which is great, but we want to calulate Kw for all the rows in our data at once. This is where R is particularly powerful. We can define a new column in my_data and use values in the other columns to calculate the value for each row of this new column, all at once!:

> my_data[,"Kw"]<-Kw("CH3I",temp,wind,sal)
[1] 7.708174e-06 5.370894e-05 8.074108e-05 2.681450e-05 1.489535e-05
[6] 3.510663e-05 3.820262e-05 5.879458e-05 4.813323e-05 1.059725e-04
[11] 4.354995e-05 2.856864e-05 1.078323e-05 1.036024e-05 4.048401e-06
[16] 3.977783e-05 6.687053e-05 2.779482e-05 3.426026e-05 3.584965e-07

Sweet huh?

So next let’s add a column which is the dimensionless henry’s law constant for CH3I at the temperature in each row:

> my_data[,"H"]<-KH("CH3I",temp,sal)
> my_data[,"H"]
[1] 0.1660876 0.1644287 0.1858428 0.1836445 0.1487224 0.1621694 0.1746626
[8] 0.2077991 0.1671034 0.1882404 0.2030620 0.1269990 0.1507726 0.1790700
[15] 0.1742638 0.1406145 0.1847995 0.1601485 0.2361064 0.1669096

Finally we can add a column which is the flux calculation, according to the flux equation:

F = Kw(Ca/H – Cw)

where F is the flux, Ca is the air concentration, Cw the water concentration, H the Henry’s law constant. [make sure all your units are correct -easy to go wrong here. The way I’m laying it out here is one correct way, but there are many others. Feel free to get in touch if you get stuck…]

In the case below I’ve multiplied by minus one to make a positive value of F represent a flux from sea to air – the way I like it ;-):

> my_data[,"F"]<-(my_data[,"Kw"]*((atm_concn/my_data[,"H"])-sw_concn))*(-1)

[1] 3.415365e-05 2.715205e-04 2.514285e-04 2.717888e-04 7.330746e-05
[6] 4.749550e-05 3.759730e-06 5.432819e-04 2.955133e-04 8.373667e-04
[11] 1.502132e-04 9.293719e-05 7.623237e-05 5.816068e-05 2.731569e-05
[16] 2.445745e-04 2.239382e-04 -1.252216e-05 3.216200e-04 6.304133e-07

* disclaimer: this has been done in a rush in the half hour before going away on holiday for a week – please check carefully before you rely on this. I’ll check it in detail when I get back.

You can then tell R to write you a nice CSV of all the data:


Download it here

Sorry about the ridiculous number of ‘significant’ figures throughout. R can be tamed in that respect but I haven’t had time. Example here.

Bob’s your uncle as they say.


4 thoughts on “How to calculate air-sea gas fluxes using R

  1. Ben Godfrey

    This is awesome! Great to see code on GitHub, but which license are you using? I strongly suggest you consider the GPLv3.

    Copyleft licenses (i.e. the GPL licenses) place more requirements on people who use your code, which IMHO is generally better for science. In the commercial arena, there is something to be said for permissive licenses (e.g. BSD) which allow software retailers to use open source code without participating very actively. In open science, you want maximum openness.

    See Wikipedia’s entry on permissive and copyleft open source licenses for more info on this debate.

    How to use the GPL. GPLv3 places a few more restrictions on people who use your code which ensures it doesn’t get used in bad faith. These were added after new ways to get around the earlier GPLs were revealed.

    1. martwine Post author

      Ben, that’s a really good point, and something I had completely forgotten to do. I think you’re right that the GPL license or similar copyleft license is probably best, it had always been my intention to release all my stuff in that way.

      However, you could make the argument that a permissive licence is better for science code – ultimately as a researcher I want maximum uptake of my ideas and attribution thereof. In terms of making funding bodies happy the more uptake by commercial organisations the better, so a more restrictive license may have some downsides. Also, I consider the intellectual property involved in the transfer velcoity code to be the formulation of the numerical scheme which underlies it – the code itsself is just one implementation which I threw together to test and use the scheme I developed. So I’m not terribly precious about it I guess.

      Nonethless, as a staunch supporter of open source software I will get on the case with GPLing it. Just need to figure out how I actually go about doing that…

      1. Ben Godfrey

        Interesting point. That’s the same debate that goes on in the commercial sector. It would be interesting to get RMS or someone to comment.

        I think the most important thing here is releasing the code and making it available. Just having an informed debate about licenses is pretty awesome. All important steps for creating a more open science.

  2. martwine Post author

    Had a question regarding this by email today. Posting here in case it’s useful to anybody else:

    In the input file: Temperature Salinity Windspeed CH3I.g….nmol.m3

    Observation no. is obvious.
    But Temp, is that air or surface ( I would assume surface ).
    As for windspeed, is 10m windspeed good enough, or do you need to try
    and adjust it (Im using ECMWF reanalysis data).
    As for concentrations… Whats g? Gas? In which case, is it at
    standard T and P? or adjusted using pV = nRT
    I assume sw is seawater and therefore my observed concentration…

    B (who measures oxygen concentrations)

    1. Temp – this scheme (and most other estimates of gas exchange) assume that the temperature adjacent to the interface on either side is equal. The surface ocean is a pretty good buffer of heat due to it’s huge specific heat capacity, so assuming surface seawater temperature throughout is probably a small uncertainty, especially compared to the effects of temperature gradients away from the interface leading to stability / instability, or the effects of a cool skin relative to your bulk surface measurement due to evaporation or condensation or… If your worried about these things, read this:

    2. The parameterisations used for kw and ka are both ‘tuned’ to u10, so that’s ideal. Bless the souls who go to the effort of determining the wind speed 1cm above the sea surface and then realise there’s no parameterisation for it! Important to consider your wind speed timescale and averaging – both because you need a representative average, and because averaging is tricky with non-linear relationships between transfer velocity and wind – the square-of-the-mean is not equal to the mean-of-the-squares. This is discussed in the above-linked article too.

    3. the gas (g) and seawater (sw) concentrations are both assumed to be in (n)moles per unit volume, as the Henry’s law constant used is the dimensionless one. This is fine for the kind of gases I normally deal with where we actually measure volumes of air and the gases in them, or where assuming standard T and P makes little difference to the resulting concentration relative to the concentration difference between atmosphere and ocean; but much less convenient for a major atmospheric constituent like oxygen (which is the subject of this question…). You have two choices: either use pV=nRT and your partial pressure of O2 and the measured temperature and pressure to convert your gas phase concentration into the same moles per volume as the seawater concentration and use the above code as is. Alternatively you could use the Henry’s law constant in units of M/atm, which means you can compare a partial pressure with a seawater concentration directly. Fortunately, the software can do this – instead of KH(“O2”,,) you can do KH_Molar_per_atmosphere(“O2”,,). Remember that this is liquid/gas and the one in the example above is gas/liquid so you’ll need to do a bit of inverting.

    Hope this helps!


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s