Problem in defining multidimensional array matrix and regression

Peter Otten __peter__ at web.de
Sun Nov 19 12:01:19 EST 2017


shalu.ashu50 at gmail.com wrote:
> Hi, All,
>
> I have 6 variables in CSV file. One is rainfall (dependent, at y-axis) and
> others are predictors (at x). I want to do multiple regression and create
> a correlation matrix between rainfall (y) and predictors (x; n1=5). Thus I
> want to read rainfall as a separate variable and others in separate
> columns, so I can apply the algo. However, I am not able to make a proper
> matrix for them.
>
> Here are my data and codes?
> Please suggest me for the same.
> I am new to Python.
>
> RF	P1	P2	P3	P4	P5
> 120.235	0.234	-0.012	0.145	21.023	0.233
> 200.14	0.512	-0.021	0.214	22.21	0.332
> 185.362	0.147	-0.32	0.136	24.65	0.423
> 201.895	0.002	-0.12	0.217	30.25	0.325
> 165.235	0.256	0.001	0.22	31.245	0.552
> 198.236	0.012	-0.362	0.215	32.25	0.333
> 350.263	0.98	-0.85	0.321	38.412	0.411
> 145.25	0.046	-0.36	0.147	39.256	0.872
> 198.654	0.65	-0.45	0.224	40.235	0.652
> 245.214	0.47	-0.325	0.311	26.356	0.632
> 214.02	0.18	-0.012	0.242	22.01	0.745
> 147.256	0.652	-0.785	0.311	18.256	0.924
>
> import numpy as np
> import statsmodels as sm
> import statsmodels.formula as smf
> import csv
>
> with open("pcp1.csv", "r") as csvfile:
>     readCSV=csv.reader(csvfile)
>
>     rainfall = []
>     csvFileList = []
>
>     for row in readCSV:
>         Rain = row[0]
>         rainfall.append(Rain)
>
>         if len (row) !=0:
>             csvFileList = csvFileList + [row]
>
> print(csvFileList)
> print(rainfall)

You are not the first to read tabular data from a file, so numpy (and
pandas) offer high-level functions to do just that. Once you have the
complete table, extracting a specific column is easy. For instance:

$ cat rainfall.txt 
RF P1 P2 P3 P4 P5
120.235 0.234 -0.012 0.145 21.023 0.233
200.14 0.512 -0.021 0.214 22.21 0.332
185.362 0.147 -0.32 0.136 24.65 0.423
201.895 0.002 -0.12 0.217 30.25 0.325
165.235 0.256 0.001 0.22 31.245 0.552
198.236 0.012 -0.362 0.215 32.25 0.333
350.263 0.98 -0.85 0.321 38.412 0.411
145.25 0.046 -0.36 0.147 39.256 0.872
198.654 0.65 -0.45 0.224 40.235 0.652
245.214 0.47 -0.325 0.311 26.356 0.632
214.02 0.18 -0.012 0.242 22.01 0.745
147.256 0.652 -0.785 0.311 18.256 0.924
$ python3
Python 3.4.3 (default, Nov 17 2016, 01:08:31) 
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> rf = numpy.genfromtxt("rainfall.txt", names=True)
>>> rf["RF"]
array([ 120.235, 200.14 , 185.362, 201.895, 165.235, 198.236,
 350.263, 145.25 , 198.654, 245.214, 214.02 , 147.256])
>>> rf["P3"]
array([ 0.145, 0.214, 0.136, 0.217, 0.22 , 0.215, 0.321, 0.147,
 0.224, 0.311, 0.242, 0.311])
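
From those columns it is a short step to the matrix, the correlations and
the regression asked about in the question. The following is only a sketch
under a few assumptions: the table is inlined through io.StringIO so the
example runs on its own (with a real file you would pass the filename as
above), the names y, X, A, corr and coef are made up for illustration, and
plain numpy (numpy.corrcoef and numpy.linalg.lstsq) stands in for
statsmodels, which may well be the more convenient tool for a full
regression summary.

```python
import io
import numpy as np

# Same table as rainfall.txt, inlined so the sketch is self-contained.
DATA = """\
RF P1 P2 P3 P4 P5
120.235 0.234 -0.012 0.145 21.023 0.233
200.14 0.512 -0.021 0.214 22.21 0.332
185.362 0.147 -0.32 0.136 24.65 0.423
201.895 0.002 -0.12 0.217 30.25 0.325
165.235 0.256 0.001 0.22 31.245 0.552
198.236 0.012 -0.362 0.215 32.25 0.333
350.263 0.98 -0.85 0.321 38.412 0.411
145.25 0.046 -0.36 0.147 39.256 0.872
198.654 0.65 -0.45 0.224 40.235 0.652
245.214 0.47 -0.325 0.311 26.356 0.632
214.02 0.18 -0.012 0.242 22.01 0.745
147.256 0.652 -0.785 0.311 18.256 0.924
"""

# names=True takes the column names from the header line.
table = np.genfromtxt(io.StringIO(DATA), names=True)

# Dependent variable and a plain 2-D matrix of the predictors.
predictors = [name for name in table.dtype.names if name != "RF"]
y = table["RF"]                                             # shape (12,)
X = np.column_stack([table[name] for name in predictors])  # shape (12, 5)

# Correlation matrix of all six variables; row/column 0 is RF.
corr = np.corrcoef(np.column_stack([y, X]), rowvar=False)   # shape (6, 6)

# Ordinary least squares for RF = b0 + b1*P1 + ... + b5*P5.
# The column of ones provides the intercept b0.
A = np.column_stack([np.ones(len(y)), X])
coef, _, rank, _ = np.linalg.lstsq(A, y, rcond=None)

print("correlation of RF with", predictors)
print(corr[0, 1:])
print("intercept and slopes")
print(coef)
```

Here corr[0, 1:] holds the correlation of RF with each predictor, and coef
holds the intercept followed by one slope per predictor. With only 12 rows
and 5 predictors the fit has few degrees of freedom, so treat the numbers
with care.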

