A blog of Python-related topics and code.

Visualizing the bivariate Gaussian distribution

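The multivariate Gaussian distribution of an $n$-dimensional vector $\boldsymbol{x}=(x_1, x_2, \cdots, x_n)$ with mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$ may be written

$$p(\boldsymbol{x}; \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{\sqrt{(2\pi)^n |\boldsymbol{\Sigma}|}} \exp\left(-\frac{1}{2}(\boldsymbol{x}-\boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\boldsymbol{x}-\boldsymbol{\mu})\right)$$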
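
For the bivariate case ($n=2$) the density can be written out explicitly in terms of the two means, the two standard deviations and a correlation coefficient $\rho$. The following is a minimal standard-library sketch of evaluating it on a square grid, ready to be handed to a contour or surface plot. The function names `bivariate_gaussian_pdf` and `density_grid` and all default parameter values are illustrative assumptions, not taken from the original post.

```python
import math


def bivariate_gaussian_pdf(x, y, mu_x=0.0, mu_y=0.0,
                           sigma_x=1.0, sigma_y=1.0, rho=0.0):
    """Return the bivariate Gaussian density at the point (x, y).

    sigma_x and sigma_y are standard deviations; rho is the correlation
    coefficient and must lie strictly between -1 and 1.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    # Standardised coordinates.
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    one_minus_rho2 = 1.0 - rho * rho
    exponent = -(zx * zx - 2.0 * rho * zx * zy + zy * zy) / (2.0 * one_minus_rho2)
    norm = 2.0 * math.pi * sigma_x * sigma_y * math.sqrt(one_minus_rho2)
    return math.exp(exponent) / norm


def density_grid(n=41, half_width=4.0, **params):
    """Evaluate the density on an n x n grid spanning [-half_width, half_width].

    Returns (axis, grid), where grid[j][i] is the density at
    (axis[i], axis[j]). Extra keyword arguments are passed on to
    bivariate_gaussian_pdf.
    """
    step = 2.0 * half_width / (n - 1)
    axis = [-half_width + i * step for i in range(n)]
    grid = [[bivariate_gaussian_pdf(x, y, **params) for x in axis] for y in axis]
    return axis, grid


axis, grid = density_grid(rho=0.5)
```

The nested list `grid` has the row-per-$y$ layout that Matplotlib's `contourf(axis, axis, grid)` expects; a NumPy version would vectorise the same expression over a `meshgrid`.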

Changes in real wages since 2007

The OECD provides a tool for studying the change in average real wages (compared in US dollars adjusted for purchasing power parity, using a 2012 base year). Here is how real wages have changed in 15 countries, visualised as a bar chart with Matplotlib.

Determining mean bond lengths from crystallographic data

The Cambridge Crystallographic Data Centre (CCDC) is a non-profit organisation devoted to small-molecule crystallography data. It curates, validates and distributes the Cambridge Structural Database (CSD) of over 800,000 organic and metal-organic crystal structures. The CSD has an excellent Python API which can be used to analyse these structures. Unfortunately, access to most of the CCDC data requires a paid-for licence or an institutional subscription. In the short project below I obtained the necessary crystal structures using my UCL credentials. Installation and configuration of the database and software are documented on the CCDC website.

Visualizing the gradient descent method

In the gradient descent method of optimization, a hypothesis function, $h_\boldsymbol{\theta}(x)$, is fitted to a data set, $(x^{(i)}, y^{(i)})$ ($i=1,2,\cdots,m$), by minimizing an associated cost function, $J(\boldsymbol{\theta})$, with respect to the parameters $\boldsymbol\theta = (\theta_0, \theta_1, \cdots)$. The cost function describes how closely the hypothesis fits the data for a given choice of $\boldsymbol \theta$.
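
As a concrete illustration, here is a minimal sketch of batch gradient descent in plain Python. It assumes a straight-line hypothesis, $h_\boldsymbol{\theta}(x) = \theta_0 + \theta_1 x$, and a mean-squared-error cost, $J(\boldsymbol{\theta}) = \frac{1}{2m}\sum_{i=1}^m \left(h_\boldsymbol{\theta}(x^{(i)}) - y^{(i)}\right)^2$; neither choice is stated in the text above. The learning rate `alpha`, the step count and the synthetic data set are likewise illustrative assumptions.

```python
def cost(theta0, theta1, xs, ys):
    """Mean-squared-error cost J(theta) for the line theta0 + theta1 * x."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha=0.1, n_steps=2000, theta0=0.0, theta1=0.0):
    """Run batch gradient descent and return every (theta0, theta1) visited."""
    m = len(xs)
    history = [(theta0, theta1)]
    for _ in range(n_steps):
        # Residuals h_theta(x_i) - y_i at the current parameters.
        residuals = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of J with respect to theta0 and theta1.
        grad0 = sum(residuals) / m
        grad1 = sum(r * x for r, x in zip(residuals, xs)) / m
        # Update both parameters simultaneously.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
        history.append((theta0, theta1))
    return history


# Synthetic, noise-free data lying exactly on y = 1 + 2x (made up for this sketch).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x for x in xs]

history = gradient_descent(xs, ys)
theta0, theta1 = history[-1]
```

Keeping the full `history` rather than only the final parameters is deliberate: the sequence of points is what one would overlay on a contour plot of $J(\boldsymbol{\theta})$ to visualize the descent path.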

The Earth-Venus "dance"

If you join the positions of Earth and Venus with a straight line and follow them as they orbit the sun, you get a nice picture that has been popular on Facebook. Here's the Python code for generating a similar diagram. The figure has the pleasing shape it does because the ratio of the planets' orbital periods is close to 13:8.
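
A minimal sketch of the geometry, which is not necessarily the approach taken in the full post: it assumes circular, coplanar orbits with both planets starting on the positive $x$-axis, and uses approximate sidereal periods (365.256 and 224.701 days) and orbital radii (1 and 0.723 AU). The function names and the number of lines drawn are illustrative choices.

```python
import math

# Approximate sidereal orbital periods (days) and orbital radii (AU).
EARTH_PERIOD, VENUS_PERIOD = 365.256, 224.701
EARTH_RADIUS, VENUS_RADIUS = 1.0, 0.723


def position(radius, period, t):
    """(x, y) position at time t (days) on a circular orbit centred on the Sun."""
    angle = 2.0 * math.pi * t / period
    return radius * math.cos(angle), radius * math.sin(angle)


def dance_segments(n_lines=1000, n_earth_years=8):
    """Return n_lines (earth_xy, venus_xy) pairs, evenly spaced in time.

    Eight Earth years is very nearly thirteen Venus years, so the pattern
    almost closes on itself over this interval.
    """
    total_time = n_earth_years * EARTH_PERIOD
    segments = []
    for i in range(n_lines):
        t = total_time * i / n_lines
        segments.append((position(EARTH_RADIUS, EARTH_PERIOD, t),
                         position(VENUS_RADIUS, VENUS_PERIOD, t)))
    return segments


segments = dance_segments()
period_ratio = EARTH_PERIOD / VENUS_PERIOD  # compare with 13 / 8 = 1.625
```

Each pair in `segments` can be drawn as a straight line, for example with Matplotlib's `plt.plot([xe, xv], [ye, yv])`, to build up the figure.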