Scale Numpy array to certain range

Question 1

As I've described in a Stack Overflow question, I'm trying to rescale a NumPy array so that it fits into a given range.

Here is the solution I currently use:

import numpy as np


def scale_array(dat, out_range=(-1, 1)):
    domain = [np.min(dat, axis=0), np.max(dat, axis=0)]

    def interp(x):
        return out_range[0] * (1.0 - x) + out_range[1] * x

    def uninterp(x):
        b = 0
        if (domain[1] - domain[0]) != 0:
            b = domain[1] - domain[0]
        else:
            b = 1.0 / domain[1]
        return (x - domain[0]) / b

    return interp(uninterp(dat))


# dtype=float replaces np.float, which was removed in NumPy 1.24
print(scale_array(np.array([-2, 0, 2], dtype=float)))
# Gives: [-1., 0., 1.]
print(scale_array(np.array([-3, -2, -1], dtype=float)))
# Gives: [-1., 0., 1.]

Is there a way to make this code cleaner? Is there a built-in function in NumPy or scikit-learn? This feels like a really common data pre-processing step and it feels weird that I keep re-implementing it.

Answer 1

NumPy provides numpy.interp for 1-dimensional linear interpolation. In this case, where you want to map the minimum element of the array to −1 and the maximum to +1, and other elements linearly in-between, you can write:

np.interp(a, (a.min(), a.max()), (-1, +1))

For more advanced kinds of interpolation, there's scipy.interpolate.
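
As a quick check, the call above can be wrapped in a small helper and run on the two arrays from the question. This is only a sketch: the name rescale_interp is made up for illustration, and it assumes the input is not constant (minimum equal to maximum), a case it does not handle.

```python
import numpy as np


def rescale_interp(a, out_range=(-1.0, 1.0)):
    """Linearly map a.min() to out_range[0] and a.max() to out_range[1].

    Hypothetical helper for illustration; assumes `a` is not constant.
    """
    a = np.asarray(a, dtype=float)
    return np.interp(a, (a.min(), a.max()), out_range)


# Both arrays from the question map to the values -1, 0, 1.
print(rescale_interp([-2, 0, 2]))
print(rescale_interp([-3, -2, -1]))
```

The out_range parameter mirrors the one in the question's scale_array, so a different target interval such as (0, 1) only changes the last argument.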

Answer 2

What you need here are basically two rescalings. The first centers the data on 0 and normalizes its width to 1, so the values lie in [-0.5, 0.5]. The second stretches and shifts the result to out_range. Both can be written down directly, so there is no need for your inner functions and their special cases.

def scale(x, out_range=(-1, 1)):
    domain = np.min(x), np.max(x)
    y = (x - (domain[1] + domain[0]) / 2) / (domain[1] - domain[0])
    return y * (out_range[1] - out_range[0]) + (out_range[1] + out_range[0]) / 2
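
Written out, the two steps look as follows. The symbols are my own shorthand rather than names from the code: m and M are the minimum and maximum of x, and (a, b) is out_range.

```latex
\begin{align*}
y &= \frac{x - \tfrac{1}{2}(M + m)}{M - m} = \frac{x - m}{M - m} - \frac{1}{2}, \\
z &= y\,(b - a) + \frac{a + b}{2} = a + (b - a)\,\frac{x - m}{M - m}.
\end{align*}
```

The last expression is the usual min-max formula. It equals a(1 - t) + b t with t = (x - m)/(M - m), which is what interp(uninterp(x)) computes in the question whenever M differs from m.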

Note that I removed the axis=0 arguments to np.min and np.max. By default they reduce over the whole array. If you want to rescale along only one axis instead, I would make the axis a parameter of the scale function to give the user full control:

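def scale(x, out_range=(-1, 1), axis=None):
    # keepdims=True keeps the reduced axis with length 1, so the minimum and
    # maximum broadcast against x for any axis (without it, axis=1 fails)
    domain = np.min(x, axis=axis, keepdims=True), np.max(x, axis=axis, keepdims=True)
    y = (x - (domain[1] + domain[0]) / 2) / (domain[1] - domain[0])
    return y * (out_range[1] - out_range[0]) + (out_range[1] + out_range[0]) / 2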

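For non-constant input this function gives the same results as yours, even with a degenerate out_range such as (-1, -1). The one difference is constant input, where the maximum equals the minimum: without your special case the division is 0/0 and the result is NaN.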
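
To illustrate what the axis parameter changes, here is a small sketch on a 2-D array. The sample data is made up, and the function is restated so the snippet runs on its own; it passes keepdims=True to np.min and np.max so that the per-axis minimum and maximum broadcast against x whichever axis is chosen.

```python
import numpy as np


def scale(x, out_range=(-1, 1), axis=None):
    # keepdims=True leaves the reduced axis in place with length 1,
    # so lo and hi broadcast against x for axis=None, 0 or 1.
    lo = np.min(x, axis=axis, keepdims=True)
    hi = np.max(x, axis=axis, keepdims=True)
    y = (x - (hi + lo) / 2) / (hi - lo)
    return y * (out_range[1] - out_range[0]) + (out_range[1] + out_range[0]) / 2


# Made-up sample data: three rows, two columns with different ranges.
data = np.array([[0.0, 10.0],
                 [5.0, 20.0],
                 [10.0, 40.0]])

whole = scale(data)               # one min/max for the entire array
per_column = scale(data, axis=0)  # each column rescaled independently
per_row = scale(data, axis=1)     # each row rescaled independently

print(whole)
print(per_column)
print(per_row)
```

With axis=0 every column spans the full output range on its own, whereas with axis=None only the global minimum and maximum reach -1 and +1.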

Accepted answer: the numpy.interp answer above, by Gareth Rees (2018-01-23 16:53:30Z).
