A blog of Python-related topics and code.

The multivariate Gaussian distribution of an $n$-dimensional vector $\boldsymbol{x}=(x_1, x_2, \cdots, x_n)$ may be written

The OECD provides a tool for studying the change in average real wages (compared in purchase power parity adjusted US dollars using a 2012 base year). Here is how real wages have changed in 15 countries, visualised on a bar chart visualized with Matplotlib.

The Cambridge Crystallographic Data Centre is a non-profit organisation devoted to small-molecule crystallography data. It curates, validates and distributes the Cambridge Structural Database (CSD) of over 800,000 organic and metal-organic crystal structures. The CSD has an excellent Python API which can be used to analyse these structures. Unfortunately, access to most of the CCDC data requires a paid-for licence or an institutional subscription. In the short project below I obtained the necessary crystal structures using my UCL credentials. Installation and configuration of the database and software is documented on the CCDC website.

In the *gradient descent method* of optimization, a *hypothesis* function, $h_\boldsymbol{\theta}(x)$, is fitted to a data set, $(x^{(i)}, y^{(i)})$ ($i=1,2,\cdots,m$) by minimizing an associated *cost function*, $J(\boldsymbol{\theta})$ in terms of the parameters $\boldsymbol\theta = \theta_0, \theta_1, \cdots$. The cost function describes how closely the hypothesis fits the data for a given choice of $\boldsymbol \theta$.

If you join the positions of Earth and Venus with a straight line and follow them as they orbit the sun, you get a nice picture that has been popular on Facebook. Here's the Python code for generating a similar diagram. The figure has the pleasing shape it does because the ratio of the planets' orbital periods is close to 13:8.