I would like to create a numpy array without creating a list first.
At the moment I've got this:
import pandas as pd
import numpy as np
dfa = pd.read_csv('csva.csv')
dfb = pd.read_csv('csvb.csv')
pa = np.array(dfa['location'])
pb = np.array(dfb['location'])
ra = [(pa[i+1] - pa[i]) / float(pa[i]) for i in range(9999)]
rb = [(pb[i+1] - pb[i]) / float(pb[i]) for i in range(9999)]
ra = np.array(ra)
rb = np.array(rb)
Is there any elegant way to do in one step the last fill in of this np array without creating the list first ?
Thanks
3 Answers 3
You can calculate with vectors in numpy, without the need of lists:
ra = (pa[1:] - pa[:-1]) / pa[:-1]
rb = (pb[1:] - pb[:-1]) / pb[:-1]
Comments
The title of your question and what you need to do in your specific case are actually two slighly different things.
To create a numpy array without "casting" a list (or other iterable) you can use one of the several methods defined by numpy itself that returns array:
np.empty,np.zeros,np.ones,np.fullto create arrays of given size with fixed valuesnp.random.*(where*can be various distributions, like normal, uniform, exponential ...), to create arrays of given size with random values
In general, read this: Array creation routines
In your case, you already have numpy arrays (pa and pb) and you don't have to create lists to calculate the new arrays (ra and rb), you can directly operate on the numpy arrays (which is the entire point of numpy: you can do operations on arrays way faster that would be iterating over each element!). Copied from @Daniel's answer:
ra = (pa[1:] - pa[:-1]) / pa[:-1]
rb = (pb[1:] - pb[:-1]) / pb[:-1]
This will be much faster than you're current implementation, not only because you avoid converting a list to ndarray, but because numpy arrays are order of magnuitude faster for mathematical and batch operations than iteration
Comments
Return a new array of given shape and type, filled with zeros.
or
Return a new array of given shape and type, filled with ones.
or
Return a new array of given shape and type, without initializing entries.
dfa['location'].valuesin palce ofnp.array(dfa['location'])