Balancing a Chemical Reaction

Simple chemical reactions can be balanced by inspection. For example, for the following reaction:

$$ \mathrm{C_2H_5OH} + \mathrm{O_2} \rightarrow \mathrm{CO_2} + \mathrm{H_2O} \quad \mathrm{[unbalanced]} $$

since the number of each type of atom must be the same on each side of the reaction, one molecule of ethanol (\( \mathrm{C_2H_5OH} \)) must produce two molecules of \(\mathrm{CO_2}\) and three of \(\mathrm{H_2O}\). This leaves seven oxygen atoms on the product side of the reaction. Since there is one in the ethanol reactant itself, the remaining six must come from the \(\mathrm{O_2}\):

$$ \mathrm{C_2H_5OH} + 3\mathrm{O_2} \rightarrow 2\mathrm{CO_2} + 3\mathrm{H_2O} \quad \mathrm{[balanced]} $$

Balancing reactions using Python

To turn this process into an algorithm that can be implemented in Python, the stoichiometric numbers can be written as unknown variables to be entered into set of simultaneous equations determined by the atom conservation requirements:

$$ a\mathrm{C_2H_5OH} + b\mathrm{O_2} \rightarrow c\mathrm{CO_2} + d\mathrm{H_2O}, $$

where $$ \begin{array}[ccclccc] \mathrm{C}: & 2a & & = & c\\ \mathrm{H}: & 6a & & = & && 2d\\ \mathrm{O}: & a & + & 2b = & 2c & + & d \end{array} $$

Although there are three equations for the four unknown coefficients, we can impose a further constraint on the total stoichiometry: the reaction remains balanced if we multiply them all by the same factor of our choosing; equivalently, we can just choose one of them to be fixed to some value, say \(a=1\) (the choice made in the reasoning above). The equations can now be expressed in matrix form, after rearranging to place zero on the right hand side as follows:

$$ \begin{pmatrix} 2 & 0 & -1 & 0 \\ 6 & 0 & 0 & -2 \\ 1 & 2 & -2 & -1 \\ 1 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}. $$

This matrix equation has the form \( \mathbf{A}\mathbf{x} = \mathbf{b} \), where the column vector \(\mathbf{x}\) holds the stoichiometric numbers to be found. It is possible to find \(\mathbf{x}\) using the matrix inverse of \( \mathbf{A}\), since multiplying both sides by \(\mathbf{A}^{-1}\) gives \(\mathbf{x} = \mathbf{A}^{-1} \mathbf{b} \):

import numpy as np
A = np.array([[2, 0, -1,  0],
              [6, 0,  0, -2],
              [1, 2, -2, -1],
              [1, 0,  0,  0]])
b = np.array([[0, 0, 0, 1]]).T
x = np.linalg.inv(A) @ b
print(x)
[[1.]
 [3.]
 [2.]
 [3.]]

However, in general it is better to solve systems of linear equations using the more numerically-stable np.linalg.solve method:

np.linalg.solve(A, b)
array([[1.],
       [3.],
       [2.],
       [3.]])

Note that it isn't actually necessary to include the signs of the entries in the matrix \(\mathbf{A}\): they could just as easily be included in the coefficients vector, \(\mathbf{x}\):

$$ \begin{pmatrix} 2 & 0 & 1 & 0 \\ 6 & 0 & 0 & 2 \\ 1 & 2 & 2 & 1 \\ 1 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}. $$

A = np.array([[2, 0, 1, 0],
              [6, 0, 0, 2],
              [1, 2, 2, 1],
              [1, 0, 0, 0]])
x = np.linalg.inv(A) @ b
print(x)
[[ 1.]
 [ 3.]
 [-2.]
 [-3.]]

The difference in signs in the entries of the vector \(\mathbf{x}\) indicates that they apply to the stoichiometric coefficients of species on different sides of the reaction, but which is the left hand side and which the right hand side is not something that the balancing algorithm can tell you (the direction of the reaction is a matter for thermodynamics not linear algebra!)

Redox reactions

Redox reactions can be balanced in a similar way, but in addition to the atom stoichiometry, the charge must be conserved as well. For example, dichromate ions oxidise Fe(II) in the presence of acid:

$$ a\mathrm{Fe^{2+}} + b\mathrm{Cr_2O_7^{2-}} + c\mathrm{H^+} \rightarrow d\mathrm{Cr^{3+}} + e\mathrm{H_2O} + f\mathrm{Fe^{3+}} $$

We have: $$ \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 1 \\ 0 & 2 & 0 & 1 & 0 & 0 \\ 0 & 7 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 2 & 0 \\ 2 & -2 & 1 & 3 & 0 & 3 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ \end{pmatrix} \begin{pmatrix} a \\ b \\ c \\ d \\ e \\ f \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\0 \\ 0 \\ 1 \end{pmatrix}, $$ where the rows of the matrix correspond to the stoichiometries of Fe, Cr, O, and H, the charge balance, and the choice \(b=1\).

A = np.array([[1,  0, 0, 0, 0, 1],
              [0,  2, 0, 1, 0, 0],
              [0,  7, 0, 0, 1, 0],
              [0,  0, 1, 0, 2, 0],
              [2, -2, 1, 3, 0, 3],
              [0,  1, 0, 0, 0, 0]
             ])

b = np.array([[0, 0, 0, 0, 0, 1]]).T

np.linalg.solve(A, b)
array([[ 6.],
       [ 1.],
       [14.],
       [-2.],
       [-7.],
       [-6.]])

We can therefore balance the reaction as:

$$ 6\mathrm{Fe^{2+}} + \mathrm{Cr_2O_7^{2-}} + 14\mathrm{H^+} \rightarrow 2\mathrm{Cr^{3+}} + 7\mathrm{H_2O} + 6\mathrm{Fe^{3+}}. $$

Unbalanceable reactions

Not all reactions can be balanced. For a simple example, consider the hypothetical reaction between water and nitrogen dioxide to form nitric acid:

$$ a\mathrm{H_2O} + b\mathrm{NO_2} \rightarrow c\mathrm{HNO_3} $$

Following our previous approach, we might hope to construct the matrix equation $$ \begin{pmatrix} 2 & 0 & 1 \\ 1 & 2 & 3 \\ 0 & 1 & 1 \\ 1 & 0 & 0 \\ \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}. $$

Immediately, we can see a potential problem in that there are more equations (4) than unknowns (3). If this system of equations were consistent, we should be able to solve for any three of them, say, the first three:

\begin{align*} 2a + c &= 0,\\ a + 2b + 3c & = 0,\\ b + c &= 0. \end{align*}

However, these three equations are inconsistent: taking, for example, twice the third equation added to the first yields \(2a + 2b + 3c = 0\), which, in comparison with the second equation leads to the trivial solution \(a = b = c = 0\):

M = np.array([[2, 0, 1],
              [1, 2, 3],
              [0, 1, 1]
             ])
b = np.array([[0, 0, 0]]).T
np.linalg.solve(M, b)
array([[ 0.],
       [ 0.],
       [-0.]])

In terms of linear algebra, this can be analysed by considering the nullity of the matrix \(\mathbf{M}\): the dimension of its so-called null space (also called its kernel). This is the set of vectors, \(\mathbf{x}\) for which \(\mathbf{M}\mathbf{x} = \mathbf{0}\). For a unique solution, the nullity should be 1: in terms of the above discussion, this is the remaining degree of freedom that we have to scale the reaction by multiplying the stoichiometric numbers by the same arbitrary factor and keep it balanced. In practical terms, the nullity can be calculated as the difference between the order of the matrix (here, the number of columns, \(n\), and its rank (\(r\), the number of linearly-independent rows, which NumPy can determine using the function np.linalg.matrix_rank:

# The number of columns of M
n = M.shape[1]
r = np.linalg.matrix_rank(M)
print(n - r)
0

In this case, therefore, the only solution to the set of linear equations is the trivial one (all coefficients are zero), which has no physical meaning.

Compare this with the combustion of ethanol considered above:

M = np.array([[2, 0, -1,  0],
              [6, 0,  0, -2],
              [1, 2, -2, -1]
             ])
n = M.shape[1]
r = np.linalg.matrix_rank(M)
print(n - r)
1

Therefore, as we have seen, there should be a (nontrivial) solution to the equations corresponding to the conditions of stoichiometric balance.

Reactions with no unique balance

At the other extreme, there are reactions with more than one way to balance them. In this case, the nullity of the chemical composition matrix is greater than 1. A famous example is the combustion of gunpowder (a mixture of potassium nitrate, sulfur and charcoal), which follows a complex mechanism but is sometimes written:

$$ a\mathrm{KNO_3} + b\mathrm{S} + c\mathrm{C} \rightarrow d\mathrm{K_2CO_3} + e\mathrm{K_2SO_4} + f\mathrm{CO_2} + g\mathrm{N_2} $$

The composition matrix can be constructed in Python as before:

M = np.array([[1, 0, 0, 2, 2, 0, 0],
              [1, 0, 0, 0, 0, 0, 2],
              [3, 0, 0, 3, 4, 2, 0],
              [0, 1, 0, 0, 1, 0, 0],
              [0, 0, 1, 1, 0, 1, 0]])
n = M.shape[1]
r = np.linalg.matrix_rank(M)
print(n - r)
2

Thus, even after scaling the overall reaction, there is still a remaining degree of freedom (for example, the relative amounts of carbon and sulfur). Choosing a C:S ratio of 3:8 gives one reaction:

# Add rows to M and b corresponding to fixing coefficients b and c.
A = np.vstack([M, [0, 1, 0, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0, 0]])
b = np.array([[0, 0, 0, 0, 0, 3, 8]]).T
np.linalg.solve(A, b)
array([[10.],
       [ 3.],
       [ 8.],
       [-2.],
       [-3.],
       [-6.],
       [-5.]])

That is,

$$ 10\mathrm{KNO_3} + 3\mathrm{S} + 8\mathrm{C} \rightarrow 2\mathrm{K_2CO_3} + 3\mathrm{K_2SO_4} + 6\mathrm{CO_2} + 5\mathrm{N_2}. $$

But, for example, taking the C:S ratio to be 1:2 gives a different balancing:

b = np.array([[0, 0, 0, 0, 0, 1, 2]]).T
np.linalg.solve(A, b)
array([[ 2.8],
       [ 1. ],
       [ 2. ],
       [-0.4],
       [-1. ],
       [-1.6],
       [-1.4]])

Applying a global scaling factor of 5 to ensure the stoichiometric coefficients are integers gives the following balanced reaction:

$$ 14\mathrm{KNO_3} + 5\mathrm{S} + 10\mathrm{C} \rightarrow 2\mathrm{K_2CO_3} + 5\mathrm{K_2SO_4} + 8\mathrm{CO_2} + 7\mathrm{N_2}. $$

This comes about because, in the complex combustion mechanism, the oxidation of carbon and sulfur is nearly independent. But, as already mentioned, the reaction is complex and other products not included here are also possible, for example \(\mathrm{K_2S}\), \(\mathrm{H_2S}\) and \(\mathrm{KSCN}\).