NumPy¶

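NumPy features and import¶

A third-party Python package for multidimensional arrays
Closer to the low level and the hardware (efficient)
Focused on scientific computing (convenient)
Import the package: import numpy as np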

Converting a list to an array¶

a = np.array([0,1,2,3])
Output: [0 1 2 3]
Type: <class 'numpy.ndarray'> (printed as <type 'numpy.ndarray'> under Python 2)

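One-dimensional arrays¶

a = np.array([1,2,3,4])
Attributes:
a.ndim --> number of dimensions, 1
a.shape --> shape, returns (4,)
len(a) --> length, 4
Slicing: a[1:5:2] takes the elements from index 1 up to (not including) index 5 in steps of 2
Reverse: a[::-1]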
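
A minimal sketch of the attributes and slices above, run on the same four-element array. The variable names are only for illustration. Note that a slice bound past the end of the array is clipped to its length.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

ndim = a.ndim            # number of dimensions
shape = a.shape          # tuple with the size of each dimension
length = len(a)          # size of the first dimension

every_other = a[1:5:2]   # indices 1 and 3; the bound 5 is clipped to len(a)
reversed_a = a[::-1]     # the same elements in reverse order

print(ndim, shape, length)     # 1 (4,) 4
print(every_other.tolist())    # [2, 4]
print(reversed_a.tolist())     # [4, 3, 2, 1]
```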

Multidimensional arrays¶

2-D: a = np.array([[0,1,2,3],[1,2,3,4]]) Output:

[[0 1 2 3]
 [1 2 3 4]]

a.ndim --> 2
a.shape --> (2,4), i.e. (rows, columns)
len(a) --> 2, the size of the first dimension

3-D: a = np.array([[[0],[1]],[[2],[4]]]) a.shape --> (2,2,1)
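
The shapes above can be checked with a short sketch like the following; the names a2 and a3 are made up here to keep the two arrays apart.

```python
import numpy as np

a2 = np.array([[0, 1, 2, 3], [1, 2, 3, 4]])   # 2 rows, 4 columns
a3 = np.array([[[0], [1]], [[2], [4]]])       # 2 blocks, each 2 rows of 1 element

print(a2.ndim, a2.shape, len(a2))   # 2 (2, 4) 2
print(a3.ndim, a3.shape, len(a3))   # 3 (2, 2, 1) 2
print(a2[1, 2])                     # 3: row 1, column 2
```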

Creating arrays with functions¶

np.arange()

a = np.arange(0, 10)
b = np.arange(10)
c = np.arange(0,10,2)

Output:

[0 1 2 3 4 5 6 7 8 9]
[0 1 2 3 4 5 6 7 8 9]
[0 2 4 6 8]

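np.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None): generates num evenly spaced numbers between start and stop
np.logspace(start, stop, num=50, endpoint=True, base=10.0, dtype=None): generates num numbers evenly spaced on a logarithmic scale, from base**start to base**stop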
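
A short sketch of how the main parameters of these two functions behave. The argument values are arbitrary and chosen so that the results are easy to check by hand.

```python
import numpy as np

# 5 evenly spaced points from 0 to 1, both endpoints included
lin = np.linspace(0, 1, num=5)              # 0, 0.25, 0.5, 0.75, 1

# endpoint=False leaves out the stop value
lin_open = np.linspace(0, 1, num=5, endpoint=False)   # 0, 0.2, 0.4, 0.6, 0.8

# retstep=True also returns the spacing between samples
samples, step = np.linspace(0, 1, num=5, retstep=True)   # step is 0.25

# 4 points from 10**0 to 10**3, evenly spaced in the exponent
log = np.logspace(0, 3, num=4)              # 1, 10, 100, 1000

print(lin, lin_open, step, log)
```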

Common arrays¶

a = np.ones((3,3)) Output:
```
[[ 1. 1. 1.]
[ 1. 1. 1.]
[ 1. 1. 1.]]
```
np.zeros((3,3)): array filled with zeros
np.eye(2): identity matrix
np.diag([1,2,3],k=0): diagonal matrix; k is the offset of the diagonal
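
A brief sketch of these constructors side by side, including a non-zero k for np.diag. The variable names are only for illustration.

```python
import numpy as np

ones = np.ones((3, 3))              # 3x3 array of 1.0
zeros = np.zeros((3, 3))            # 3x3 array of 0.0
identity = np.eye(2)                # 2x2 identity matrix
diag_main = np.diag([1, 2, 3])      # values on the main diagonal (k=0)
diag_up = np.diag([1, 2, 3], k=1)   # values one step above the main diagonal

print(diag_main.tolist())   # [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
print(diag_up.shape)        # (4, 4): the array grows by |k| to fit the offset
```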

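Random arrays¶

a = np.random.rand(4) Output: [ 0.99890402 0.41171695 0.40725671 0.42501804], uniformly distributed in [0, 1)
a = np.random.randn(4) samples from the standard normal (Gaussian) distribution
Generate 100 random integers in the range 0 to m-1: [np.random.randint(0, m) for x in range(100)], or in one call np.random.randint(0, m, 100). Alternatively, if the values must not repeat (requires m >= 100):
```
m_arr = np.arange(0,m) # integers 0 to m-1
np.random.shuffle(m_arr) # shuffle m_arr in place
```
Then take the first 100 elements, m_arr[:100]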
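
A runnable sketch of both approaches. The seed and the value of m are assumptions made for illustration only; the seed just makes repeated runs produce the same numbers.

```python
import numpy as np

np.random.seed(0)   # arbitrary seed, only for repeatability
m = 1000            # hypothetical upper bound

uniform = np.random.rand(4)    # uniform on [0, 1)
normal = np.random.randn(4)    # standard normal: mean 0, variance 1

# 100 integers in [0, m); repeated values are possible
with_repeats = np.random.randint(0, m, size=100)

# 100 distinct integers in [0, m): shuffle 0..m-1 and keep the first 100
m_arr = np.arange(0, m)
np.random.shuffle(m_arr)
distinct = m_arr[:100]

print(len(set(distinct.tolist())))   # 100
```

The shuffle variant costs O(m) memory, so for a large m with few samples drawing with np.random.randint is the lighter choice when repeats are acceptable.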

Checking the data type¶

a.dtype
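
A small sketch of how dtype follows from the input values and how astype converts it. The variable names are made up for this example.

```python
import numpy as np

ints = np.array([1, 2, 3])
floats = np.array([1.0, 2, 3])        # one float makes the whole array float64
as_float = ints.astype(np.float32)    # astype returns a converted copy

print(ints.dtype)       # a platform-dependent integer type, commonly int64
print(floats.dtype)     # float64
print(as_float.dtype)   # float32
```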

Copying arrays¶

Shared memory
```
a = np.array([1,2,3,4,5])
b = a
print(np.may_share_memory(a,b))
```
Output: True. Both names refer to the same memory, so modifying one array also modifies the other
No shared memory
```
b = a.copy()
```
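
The difference only becomes visible when one of the arrays is modified. A minimal sketch, which also shows that a slice is a view on the original data:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

b = a            # another name for the same array
b[0] = 100       # also changes a

c = a.copy()     # independent copy with its own memory
c[1] = -1        # does not change a

d = a[1:3]       # a slice is a view, so it shares memory too
d[0] = 7         # changes a[1]

print(a.tolist())                   # [100, 7, 3, 4, 5]
print(np.may_share_memory(a, b))    # True
print(np.may_share_memory(a, c))    # False
print(np.may_share_memory(a, d))    # True
```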

Boolean masks¶

a = np.random.randint(0,21,5)  # 5 random integers from 0 to 20 inclusive
print(a)
print(a%3==0)
print(a[a % 3 == 0])

Output:

[14 3 6 15 4]
[False True True True False]
[ 3 6 15]
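
The same steps on a fixed array, so the result does not depend on the random draw. The array holds the sample values from the output above; the assignment at the end is an extra illustration of using a mask on the left-hand side.

```python
import numpy as np

a = np.array([14, 3, 6, 15, 4])

mask = a % 3 == 0      # boolean array, True where the value is divisible by 3
selected = a[mask]     # keeps only the elements where mask is True

capped = a.copy()
capped[capped > 10] = 10   # a mask can also select the elements to assign to

print(mask.tolist())       # [False, True, True, True, False]
print(selected.tolist())   # [3, 6, 15]
print(capped.tolist())     # [10, 3, 6, 10, 4]
```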

Median and mean¶

Median: np.median(a)
Mean: np.mean(a). For a matrix, omitting axis gives the mean of all elements; axis=0 gives the mean of each column; axis=1 gives the mean of each row
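
A small worked example of the axis argument, with values chosen so the results can be verified by hand:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 9]])

print(np.median(a))          # 3.5: midpoint of the sorted values 1 2 3 4 5 9
print(np.mean(a))            # 4.0: mean of all six elements
print(np.mean(a, axis=0))    # column means: 2.5, 3.5, 6.0
print(np.mean(a, axis=1))    # row means: 2.0, 6.0
```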

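Matrix operations¶

Matrix product: np.dot(a,b)

a = np.array([[1,2,3],[2,3,4]])
b = np.array([[1,2],[2,3],[2,2]])
print(np.dot(a,b))

Alternatively, create matrices with np.matrix(). Either way, the shapes must satisfy the rule for matrix multiplication: the number of columns of a equals the number of rows of b

Inner product: np.inner(a,b), which multiplies row by row (sum of products over the last axis)
Inverse: np.linalg.inv(a)
Maximum of a column: np.max(a[:,0]) --> returns the maximum of the first column
Sum of each column: np.sum(a,0)
Mean of each row: np.mean(a,1)
Intersection: np.intersect1d(a,b), returns a sorted 1-D array
Transpose: np.transpose(a)
Element-wise multiplication of two arrays of the same shape: a*b (for np.matrix objects, a*b is the matrix product instead)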
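
A sketch that runs the operations listed above on small arrays. The square matrix s is an assumption added here, because np.linalg.inv needs a square, non-singular input and the 2x3 array a does not qualify.

```python
import numpy as np

a = np.array([[1, 2, 3], [2, 3, 4]])      # shape (2, 3)
b = np.array([[1, 2], [2, 3], [2, 2]])    # shape (3, 2)

product = np.dot(a, b)       # (2, 3) x (3, 2) -> (2, 2); same as a @ b
elementwise = a * b.T        # shapes must match, so b is transposed first
inner = np.inner(a, a)       # inner[i, j] = sum(a[i] * a[j])

s = np.array([[1.0, 2.0], [3.0, 4.0]])    # hypothetical invertible matrix
s_inv = np.linalg.inv(s)

col_max = np.max(a[:, 0])       # maximum of the first column
col_sums = np.sum(a, 0)         # one sum per column
row_means = np.mean(a, 1)       # one mean per row
common = np.intersect1d(a, b)   # sorted unique values present in both

print(product.tolist())       # [[11, 14], [16, 21]]
print(elementwise.tolist())   # [[1, 4, 6], [4, 9, 8]]
print(inner.tolist())         # [[14, 20], [20, 29]]
print(common.tolist())        # [1, 2, 3]
```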

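File operations¶

Save: tofile()

a = np.arange(10, dtype=np.int32)  # fix the dtype so it matches the read below
a.shape=2,5
a.tofile("test.bin")

Load: (the dtype must match the one that was saved; the shape is not stored, so the result is 1-D)

a = np.fromfile("test.bin",dtype=np.int32)
print(a)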

Save: np.save("test",a) --> writes the file test.npy
Load: a = np.load("test.npy")
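
A round-trip sketch comparing the two formats. It writes into a temporary directory (an assumption made here so the example leaves no files behind) and sets the dtype explicitly, because the default integer type of np.arange differs between platforms.

```python
import os
import tempfile

import numpy as np

a = np.arange(10, dtype=np.int32).reshape(2, 5)

with tempfile.TemporaryDirectory() as tmp:   # scratch directory for the demo
    bin_path = os.path.join(tmp, "test.bin")
    npy_path = os.path.join(tmp, "test.npy")

    # tofile() writes raw bytes only: dtype and shape must be supplied again
    a.tofile(bin_path)
    raw = np.fromfile(bin_path, dtype=np.int32).reshape(2, 5)

    # np.save() records dtype and shape in the .npy file itself
    np.save(npy_path, a)
    loaded = np.load(npy_path)

print(np.array_equal(raw, a), np.array_equal(loaded, a))   # True True
```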

Combining two arrays¶

Vertical stacking

a = np.array([1,2,3])
b = np.array([[1,2,3],[4,5,6]])
c = np.vstack((b,a))

Horizontal stacking

a = np.array([[1,2],[3,4]])
b = np.array([[1,2,3],[4,5,6]])
c = np.hstack((a,b))
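
The results of the two calls, with the shape requirement of each spelled out. The variable names differ from the snippets above only to keep the inputs apart.

```python
import numpy as np

row = np.array([1, 2, 3])
two_rows = np.array([[1, 2, 3], [4, 5, 6]])
v = np.vstack((two_rows, row))     # column counts must match -> shape (3, 3)

left = np.array([[1, 2], [3, 4]])
h = np.hstack((left, two_rows))    # row counts must match -> shape (2, 5)

print(v.tolist())   # [[1, 2, 3], [4, 5, 6], [1, 2, 3]]
print(h.tolist())   # [[1, 2, 1, 2, 3], [3, 4, 4, 5, 6]]
```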

Reading a WAV audio file¶

Using the standard-library wave module:

import wave
from matplotlib import pyplot as plt
import numpy as np
# open the WAV file
f = wave.open(r"c:\WINDOWS\Media\ding.wav", "rb")
# read the format information
# (nchannels, sampwidth, framerate, nframes, comptype, compname)
params = f.getparams()
nchannels, sampwidth, framerate, nframes = params[:4]
# read the waveform data
str_data = f.readframes(nframes)
f.close()
# convert the waveform bytes to an array (assumes 16-bit samples)
wave_data = np.frombuffer(str_data, dtype=np.short)
# one row per channel (assumes a stereo file)
wave_data = wave_data.reshape(-1, 2).T
time = np.arange(0, nframes) * (1.0 / framerate)
# plot the waveform
plt.subplot(211) 
plt.plot(time, wave_data[0])
plt.subplot(212) 
plt.plot(time, wave_data[1], c="g")
plt.xlabel("time (seconds)")
plt.show()
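
The script above depends on a Windows system sound. As a portable sketch of the same byte-to-array conversion, the following writes a short synthetic stereo file and reads it back. The file name, tone frequencies, and sample rate are all assumptions chosen for illustration, and plotting is left out.

```python
import os
import tempfile
import wave

import numpy as np

framerate = 8000                        # samples per second (arbitrary)
t = np.arange(framerate) / framerate    # one second of sample times
left = (10000 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)
right = (10000 * np.sin(2 * np.pi * 880 * t)).astype(np.int16)
stereo = np.column_stack((left, right))  # rows are frames: L R, L R, ...

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tone.wav")  # hypothetical file name

    out = wave.open(path, "wb")
    out.setnchannels(2)
    out.setsampwidth(2)                   # 2 bytes = 16-bit samples
    out.setframerate(framerate)
    out.writeframes(stereo.tobytes())
    out.close()

    f = wave.open(path, "rb")
    nchannels, sampwidth, rate, nframes = f.getparams()[:4]
    data = f.readframes(nframes)
    f.close()

# de-interleave: one row per channel, as in the plotting code above
wave_data = np.frombuffer(data, dtype=np.int16).reshape(-1, nchannels).T
print(wave_data.shape)   # (2, 8000)
```

Reshaping with nchannels instead of a fixed 2 lets the same line handle mono files; the dtype still has to match sampwidth.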

where¶

Find the positions in array y where the value equals 1: np.where(y==1)
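
A sketch of what np.where returns: a tuple with one index array per dimension. The sample arrays are made up, and the three-argument form at the end is a related use that selects values instead of positions.

```python
import numpy as np

y = np.array([0, 1, 1, 0, 1])
idx = np.where(y == 1)          # tuple of length 1 for a 1-D array
print(idx[0].tolist())          # [1, 2, 4]

y2 = np.array([[1, 0], [0, 1]])
rows, cols = np.where(y2 == 1)  # row indices and column indices
print(rows.tolist(), cols.tolist())   # [0, 1] [0, 1]

# three-argument form: take 1 where the condition holds, -1 elsewhere
relabeled = np.where(y == 1, 1, -1)
print(relabeled.tolist())       # [-1, 1, 1, -1, 1]
```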

np.ravel(y)¶

Flattens a 2-D array into a 1-D array, e.g. (5000,1) --> (5000,)
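
A quick check of the shape change, using a made-up column vector of the size mentioned above:

```python
import numpy as np

y = np.arange(5000).reshape(5000, 1)   # column vector, shape (5000, 1)
flat = np.ravel(y)                     # shape (5000,)

print(y.shape, flat.shape)             # (5000, 1) (5000,)
print(np.may_share_memory(y, flat))    # True: ravel returns a view when it can
```

Because the result may be a view, writing to it can change y; use y.flatten() when an independent copy is needed.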

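ndarray.flat¶

A 1-D iterator over the array; it lets you read and assign elements by their position in the flattened array

Application: one-hot (0/1) encoding

def dense_to_one_hot(labels_dense, num_classes):
    # labels_dense: integer class labels, one per sample
    num_labels = labels_dense.shape[0]
    # flat position where each row starts
    index_offset = np.arange(num_labels) * num_classes
    labels_one_hot = np.zeros((num_labels, num_classes))
    # set the entry (row i, column labels_dense[i]) to 1
    labels_one_hot.flat[index_offset + labels_dense.ravel()] = 1
    return labels_one_hot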
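
The index arithmetic in the function above can be traced step by step on a small input. The labels and the number of classes below are hypothetical values picked for illustration.

```python
import numpy as np

labels = np.array([0, 2, 1, 2])   # hypothetical class labels
num_classes = 3

one_hot = np.zeros((labels.shape[0], num_classes))

# row i starts at flat position i * num_classes; adding the label
# gives the flat position of the 1 in that row
flat_index = np.arange(labels.shape[0]) * num_classes + labels
one_hot.flat[flat_index] = 1

print(flat_index.tolist())   # [0, 5, 7, 11]
print(one_hot.tolist())
# [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```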

Array indexing¶

X = np.array([[1,2],[3,4]])

X[0:1] and X[0:1,:] are equivalent; both return the first row
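
A sketch contrasting slice indexing with integer indexing on the same array: a slice keeps the dimension, while an integer index drops it.

```python
import numpy as np

X = np.array([[1, 2], [3, 4]])

print(X[0:1].tolist(), X[0:1].shape)   # [[1, 2]] (1, 2)
print(X[0].tolist(), X[0].shape)       # [1, 2] (2,)
print(X[:, 1].tolist())                # [2, 4]: second column
print(X[1, 0])                         # 3: row 1, column 0
```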

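`np.c_`¶

Concatenates along the second dimension, i.e. joins the inputs as columns. np.c_ is used with square brackets, not called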

np.c_[np.array([[1,2,3]]), 0, 0, np.array([[4,5,6]])]

Output: array([[1, 2, 3, 0, 0, 4, 5, 6]])

Joining two lists as columns; they must have the same length

np.c_[[1,2,3],[2,3,4]]

np.c_[range(1,5),range(2,6)]
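
A short sketch of the resulting shapes. The comparison with np.column_stack is added here as an equivalent spelling for the 1-D case; the variable names are made up.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([2, 3, 4])

cols = np.c_[a, b]                 # each 1-D input becomes a column
same = np.column_stack((a, b))     # equivalent result for 1-D inputs

# 2-D inputs are joined side by side; scalars fill one column each
wide = np.c_[np.array([[1, 2, 3]]), 0, 0, np.array([[4, 5, 6]])]

print(cols.tolist())   # [[1, 2], [2, 3], [3, 4]]
print(wide.tolist())   # [[1, 2, 3, 0, 0, 4, 5, 6]]
```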