Blog

A blog of Python-related topics and code.

Determining mean bond lengths from crystallographic data

The Cambridge Crystallographic Data Centre (CCDC) is a non-profit organisation devoted to small-molecule crystallography data. It curates, validates and distributes the Cambridge Structural Database (CSD) of over 800,000 organic and metal-organic crystal structures. The CSD has an excellent Python API which can be used to analyse these structures. Unfortunately, access to most of the CCDC data requires a paid-for licence or an institutional subscription. In the short project below I obtained the necessary crystal structures using my UCL credentials. Installation and configuration of the database and software are documented on the CCDC website.

Visualizing the gradient descent method

In the gradient descent method of optimization, a hypothesis function, $h_\boldsymbol{\theta}(x)$, is fitted to a data set, $(x^{(i)}, y^{(i)})$ ($i=1,2,\cdots,m$), by minimizing an associated cost function, $J(\boldsymbol{\theta})$, with respect to the parameters $\boldsymbol\theta = (\theta_0, \theta_1, \cdots)$. The cost function describes how closely the hypothesis fits the data for a given choice of $\boldsymbol \theta$.
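As a rough illustration of the idea, here is a minimal sketch using only the standard library. It assumes a straight-line hypothesis, $h_\boldsymbol{\theta}(x) = \theta_0 + \theta_1 x$, and a mean-squared-error cost, $J(\boldsymbol\theta) = \frac{1}{2m}\sum_i (h_\boldsymbol{\theta}(x^{(i)}) - y^{(i)})^2$. Neither is fixed by the description above, and the names `hypothesis`, `cost` and `gradient_descent`, the learning rate `alpha` and the synthetic data are all chosen here purely for illustration.

```python
def hypothesis(theta, x):
    """Assumed straight-line hypothesis: h_theta(x) = theta_0 + theta_1 * x."""
    return theta[0] + theta[1] * x


def cost(theta, xs, ys):
    """Assumed mean-squared-error cost J(theta) over the m data points."""
    m = len(xs)
    return sum((hypothesis(theta, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha=0.5, n_steps=2000):
    """Step repeatedly down the gradient of J, recording each theta visited."""
    m = len(xs)
    theta = (0.0, 0.0)
    history = [theta]
    for _ in range(n_steps):
        residuals = [hypothesis(theta, x) - y for x, y in zip(xs, ys)]
        # Partial derivatives of J with respect to theta_0 and theta_1.
        grad0 = sum(residuals) / m
        grad1 = sum(r * x for r, x in zip(residuals, xs)) / m
        theta = (theta[0] - alpha * grad0, theta[1] - alpha * grad1)
        history.append(theta)
    return theta, history


# Synthetic, noise-free data lying exactly on y = 2 + 3x (illustrative only).
xs = [i / 19 for i in range(20)]
ys = [2 + 3 * x for x in xs]
theta, history = gradient_descent(xs, ys)
print(theta, cost(theta, xs, ys))
```

The `history` list holds every $\boldsymbol\theta$ visited, which is the natural thing to plot over contours of $J$ when visualizing the path taken to the minimum.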

The Earth-Venus "dance"

If you join the positions of Earth and Venus with a straight line and follow them as they orbit the Sun, you get a nice picture that has been popular on Facebook. Here's the Python code for generating a similar diagram. The figure has the pleasing shape it does because the ratio of the planets' orbital periods (Earth to Venus) is close to 13:8.
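The construction can be sketched as follows. This is not necessarily the code from the original post: it assumes circular, coplanar orbits, approximate radii of 1 and 0.723 AU, and sidereal periods of 365.256 and 224.701 days. The function names `position`, `dance_segments` and `plot_dance` are made up for this example, and plotting requires Matplotlib.

```python
import math

# Assumed values: mean orbital radii in AU and sidereal periods in days.
R_EARTH, T_EARTH = 1.0, 365.256
R_VENUS, T_VENUS = 0.723, 224.701


def position(radius, period, t):
    """(x, y) of a planet on a circular orbit at time t (days)."""
    angle = 2 * math.pi * t / period
    return radius * math.cos(angle), radius * math.sin(angle)


def dance_segments(n_years=8, n_lines=1000):
    """Earth-Venus line segments at evenly spaced times over n_years Earth years."""
    t_max = n_years * T_EARTH
    segments = []
    for i in range(n_lines):
        t = t_max * i / n_lines
        segments.append((position(R_EARTH, T_EARTH, t),
                         position(R_VENUS, T_VENUS, t)))
    return segments


def plot_dance(segments):
    """Draw each Earth-Venus segment as a thin line (needs Matplotlib)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for (xe, ye), (xv, yv) in segments:
        ax.plot([xe, xv], [ye, yv], lw=0.3, color="k")
    ax.set_aspect("equal")
    ax.axis("off")
    plt.show()


segments = dance_segments()
print(T_EARTH / T_VENUS, 13 / 8)
```

Calling `plot_dance(segments)` draws the figure. Eight Earth years are used because Venus completes very nearly 13 orbits in that time, so the pattern almost closes on itself. Over the same interval Venus laps Earth 13 − 8 = 5 times, which gives the figure its five-fold symmetry.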

Has the end of April 2016 been unusually cold?

It's certainly been colder than usual in Britain over the last couple of weeks. How unusual is it to have these temperatures at the end of April?

Which is the cleanest coffee shop chain?

The UK government's Food Standards Agency (FSA) collates food hygiene data from inspections carried out by local authorities in England, Wales and Northern Ireland that participate in the national Food Hygiene Rating Scheme (FHRS).