#### April 27, 2015

One of the things that C2HS is lacking is a good tutorial. So I’m going to write one (or try to, anyway).

To make this as useful as possible, I’d like to base a large part of the tutorial on a realistic case study of producing Haskell bindings to a C library. My current plan is to break the tutorial into three parts: the basics, the case study and “everything else”, for C2HS features that don’t get covered in the first two parts. To make this *even more* useful, I’d like to base the case study on a C library that someone actually cares about and wants Haskell bindings for.

The requirements for the case study C library are:

1. There shouldn’t already be Haskell bindings for it – I don’t want to duplicate work.

2. The C library should be “medium-sized”: big enough to be realistic, not so big that it takes forever to write bindings.

3. The C library should be of medium complexity. By this, I mean that it should have a range of different kinds of C functions, structures and things that need to be made accessible from Haskell. It shouldn’t be completely trivial, and it should require a little thought to come up with good bindings. On the other hand, it shouldn’t be so unusual that the normal ways of using C2HS don’t work.

4. Ideally it should be something that more than one person might want to use.

5. It needs to be a library that’s available for Linux. I don’t have a Mac and I’m not that keen on doing something that’s Windows-only.

Requirements #2 and #3 are kind of squishy, but it should be fairly clear what’s appropriate and what’s not: any C library for which you think development of Haskell bindings would make a good C2HS tutorial case study is fair game.

If you have a library you think would be a good fit for this, drop me an email, leave a comment here or give me a shout on IRC (I’m usually on `#haskell` as `iross` or `iross_` or something like that).

#### April 5, 2015

I’ve started doing a new thing this year to try to help with “getting things done”. I normally have a daily to-do list and a list of weekly goals from which I derive my daily tasks, but I’ve also now started having a list of quarterly goals to add another layer of structure. Three months is a good timespan for medium-term planning, and it’s very handy to have that list of quarterly goals in front of you (I printed it out and stuck it to the front of my computer so it’s there whenever I’m working). Whatever you’re doing, you can think “Is this contributing to fulfilling one of my goals?” and if the answer is “No, watching funny cat videos is not among my goals for this quarter”, it can be a bit of a boost to get you back to work.

So, how did I do? Not all that badly, although there were a couple of things that fell by the wayside.

#### April 3, 2015

OK, so we’re done with this epic of climate data analysis. I’ve prepared an index of the articles in this series, on the off chance that it might be useful for someone.

The goal of this exercise was mostly to try doing some “basic” climate data analysis tasks in Haskell, things that I might normally do using R or NCL or some cobbled-together C++ programs. Once you can read NetCDF files, a lot of the data manipulation is pretty easy, mostly making use of standard things from the `hmatrix` package. It’s really not any harder than doing these things using “conventional” tools. One downside is that most of the code that you need to write to do this stuff in Haskell already exists in those “conventional” tools. A bigger disadvantage is that data visualisation tools for Haskell are pretty thin on the ground – `diagrams` and `Chart` are good for simpler two-dimensional plots, but maps and geophysical data plotting aren’t really supported at all. I did all of the map and contour plots here using UCAR’s NCL language, which, although it’s not a very nice language from a theoretical point of view, has built-in capabilities for generating more or less all the plot types you’d ever need for climate data.

I think that this has been a reasonably useful exercise. It helped me to fix a couple of problems with my `hnetcdf` package and it turned up a bug in `hmatrix`. But it went on a little long – my notes are up to 90 pages. (Again: the same thing happened on the FFT stuff.) That’s too long to maintain interest in a problem you’re just using as a finger exercise. The next thing I have lined up should be quite a bit shorter. It’s a problem using satellite remote sensing data, which is always fun.

#### April 2, 2015

This is going to be the last substantive post of this series (which is probably as much of a relief to you as it is to me…). In this article, we’re going to look at phase space partitioning for our dimension-reduced $Z_{500}$ PCA data and we’re going to calculate Markov transition matrices for our partitions to try to pick out consistent non-diffusive transitions in atmospheric flow regimes.

#### March 30, 2015

I took over the day-to-day support for C2HS about 18 months ago and have now finally cleaned up all the issues on the GitHub issue tracker. It took a *lot* longer than I was expecting, mostly due to pesky “real work” getting in the way. Now seems like a good time to announce the 0.25.1 “Snowmelt” release of C2HS and to summarise some of the more interesting new C2HS features.

#### March 8, 2015

This is going to be the oldest of old hat for the cool Haskell kids who invent existential higher-kinded polymorphic whatsits before breakfast, but it amused me, and it’s the first time I’ve used some of these more interesting language extensions for something “real”.

#### February 12, 2015

(There’s no code in this post, just some examples to explain what we’re going to do next.)

Suppose we define the state of the system whose evolution we want to study by a probability vector $\mathbf{p}(t)$ – at any moment in time, we have a probability distribution over a finite partition of the state space of the system (so that if we partition the state space into $N$ components, then $\mathbf{p}(t) \in \mathbb{R}^N$). Evolution of the system as a Markov chain is then defined by the evolution rule

$\mathbf{p}(t + \Delta{}t) = \mathbf{M} \mathbf{p}(t), \qquad (1)$

where $\mathbf{M} \in \mathbb{R}^{N \times N}$ is a *Markov matrix*. This approach to modelling the evolution of probability densities has the benefit both of being simple to understand and to implement (in terms of estimating the matrix $\mathbf{M}$ from data) and, as we’ll see, of allowing us to distinguish between random “diffusive” evolution and conservative “non-diffusive” dynamics.

We’ll see how this works by examining a very simple example.

#### February 9, 2015

The analysis of preferred flow regimes in the previous article is all very well, and in its way quite illuminating, but it was an entirely *static* analysis – we didn’t make any use of the fact that the original $Z_{500}$ data we used was a time series, so we couldn’t gain any information about transitions between different states of atmospheric flow. We’ll attempt to remedy that situation now.

What sort of approach can we use to look at the dynamics of changes in patterns of $Z_{500}$? Our $(\theta, \phi)$ parameterisation of flow patterns seems like a good start, but we need some way to model transitions between different flow states, i.e. between different points on the $(\theta, \phi)$ sphere. Each of our original $Z_{500}$ maps corresponds to a point on this sphere, so we might hope that we can come up with a way of looking at trajectories of points in $(\theta, \phi)$ space that will give us some insight into the dynamics of atmospheric flow.

#### February 6, 2015

A quick post today to round off the “static” part of our atmospheric flow analysis.

Now that we’ve satisfied ourselves that the bumps in the spherical PDF in article 8 of this series are significant (in the narrowly defined sense of the word “significant” that we’ve discussed), we might ask what sort of atmospheric flow regimes these bumps correspond to. Since each point on our unit sphere is really a point in the three-dimensional space spanned by the first three $Z_{500}$ PCA eigenpatterns that we calculated earlier, we can construct composite maps to look at the spatial patterns of flow for each bump just by combining the first three PCA eigenpatterns in proportions given by the “$(x, y, z)$” coordinates of points on the unit sphere.
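The construction is just a pointwise weighted sum. Here’s a sketch, with tiny hypothetical eigenpatterns standing in for the real $Z_{500}$ ones (which live on a full latitude–longitude grid):

```haskell
-- Composite map from the first three PCA eigenpatterns: a point
-- (x, y, z) on the unit sphere gives the mixing weights for
-- x*e1 + y*e2 + z*e3.  Patterns are flattened grids of values.

type Pattern = [Double]

-- Spherical coordinates (theta, phi) of a bump to Cartesian weights.
sphToCart :: Double -> Double -> (Double, Double, Double)
sphToCart theta phi =
  (sin theta * cos phi, sin theta * sin phi, cos theta)

-- Composite map: pointwise weighted sum of the three eigenpatterns.
composite :: (Double, Double, Double) -> Pattern -> Pattern -> Pattern
          -> Pattern
composite (x, y, z) e1 e2 e3 =
  zipWith3 (\a b c -> x * a + y * b + z * c) e1 e2 e3

main :: IO ()
main = do
  let e1 = [1, 0, 0, 1]     -- hypothetical eigenpatterns on a 2x2 grid
      e2 = [0, 1, 1, 0]
      e3 = [1, 1, -1, -1]
  -- theta = 0 is the "north pole" of the sphere, so the composite
  -- is just the third eigenpattern:
  print (composite (sphToCart 0 0) e1 e2 e3)  -- [1.0,1.0,-1.0,-1.0]
```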

#### February 2, 2015

The spherical PDF we constructed by kernel density estimation in the article before last appeared to have “bumps”, i.e. it’s not uniform in $\theta$ and $\phi$. We’d like to interpret these bumps as preferred regimes of atmospheric flow, but before we do that, we need to decide whether these bumps are *significant*. There is a huge amount of confusion that surrounds this idea of significance, mostly caused by blind use of “standard recipes” in common data analysis cases. Here, we have some data analysis that’s anything but standard, and that will rather paradoxically make it much easier to understand what we really mean by significance.
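To give the null-hypothesis idea some concrete shape, here’s a back-of-envelope sketch (deliberately simpler than the KDE-based test in the article): if flow states were uniformly distributed on the sphere, a spherical cap of half-angle $\alpha$ covers an area fraction $(1 - \cos\alpha)/2$, so a sample of $n$ points should put roughly that fraction of them in any given cap, and a “bump” only matters if its observed count sits far out in the tail of that expectation:

```haskell
-- Under the null hypothesis of a uniform distribution on the sphere,
-- the fraction of the sphere's area inside a cap of half-angle alpha
-- is (1 - cos alpha) / 2; point counts in the cap are then binomial.

capFraction :: Double -> Double
capFraction alpha = (1 - cos alpha) / 2

-- Expected count of uniform points in the cap...
expectedCount :: Int -> Double -> Double
expectedCount n alpha = fromIntegral n * capFraction alpha

-- ...and its binomial standard deviation, for judging how surprising
-- an observed count is.
stdDev :: Int -> Double -> Double
stdDev n alpha = sqrt (fromIntegral n * f * (1 - f))
  where f = capFraction alpha

main :: IO ()
main = do
  -- 1000 uniform points, a cap of half-angle pi/6 (about 67 expected):
  print (expectedCount 1000 (pi / 6))
  print (stdDev 1000 (pi / 6))
```

The real test has to account for the kernel density estimation step, but the logic is the same: compare what you see against what uniformity alone would produce.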
