Explain the following behaviour:
In [x]: a, b = np.zeros((3,)), np.ones((3,))
In [x]: a.dtype = 'int'
In [x]: a
Out[x]: array([0, 0, 0])
In [x]: b.dtype = 'int'
In [x]: b
Out[x]: array([4607182418800017408, 4607182418800017408, 4607182418800017408])
What is the correct way to convert an array of one data type to an array of another?
Changing an array's type by setting dtype directly does not alter the data at the byte level, only how those bytes are interpreted (as a number, a string, etc.). As it happens, zero has the same byte representation (all bits cleared) as an integer (int64) and as a float (float64), so the result of setting dtype on a is as expected. However, the 8 bytes representing 1.0 read as the integer 4607182418800017408 (0x3FF0000000000000) when interpreted as int64, which is exactly the value shown for b.
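The reinterpretation described above can be reproduced without touching the dtype attribute. The following is a minimal sketch, assuming 64-bit floats and using view() as the explicit way to reread the same bytes under another type. It uses np.int64 rather than 'int' because the width of 'int' can depend on the platform and NumPy version.

```python
import numpy as np

zeros = np.zeros(3, dtype=np.float64)
ones = np.ones(3, dtype=np.float64)

# view() reinterprets the existing bytes; no values are converted.
zero_bits = zeros.view(np.int64)
one_bits = ones.view(np.int64)

print(zero_bits)               # [0 0 0]
print(int(one_bits[0]))        # 4607182418800017408
print(hex(int(one_bits[0])))   # 0x3ff0000000000000

# 1.0 has sign 0, biased exponent 1023 and a zero mantissa,
# so its bit pattern is 1023 shifted left by the 52 mantissa bits.
print(int(one_bits[0]) == 1023 << 52)   # True
```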
To convert the data type properly, use astype(), which returns a new array (with its own data):
In [x]: a = np.ones((3,))
In [x]: a
Out[x]: array([ 1., 1., 1.])
In [x]: a.astype('int')
Out[x]: array([1, 1, 1])
In [x]: a
Out[x]: array([ 1., 1., 1.])
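A short sketch of what "a new array with its own data" means in practice: the converted array holds real integer values, the original keeps its dtype, and the two do not share memory. The variable name converted is only illustrative, and np.int64 is used in place of 'int' to keep the width explicit.

```python
import numpy as np

a = np.ones(3, dtype=np.float64)

# astype() converts the values and returns a separate array.
converted = a.astype(np.int64)

print(converted)                        # [1 1 1]
print(a.dtype, converted.dtype)         # float64 int64
print(np.shares_memory(a, converted))   # False

# Writing to the converted array does not affect the original.
converted[0] = 7
print(a[0])                             # 1.0
```

To keep the converted values under the original name, assign the result back, as in a = a.astype(np.int64).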