I've written a program that finds the difference between data and gives output. Here's the code:
import json
import numpy as np
print("This is a basic machine learning thing.")
baseData = {"collecting":True,"x":[],"y":[]}
while baseData["collecting"]:
    baseData["x"].append(float(input("X:")))
    baseData["y"].append(float(input("Y:")))
    if input("Do you want to keep feeding data? Press enter for yes, or type anything for no.") != "":
        baseData["collecting"] = False
if len(baseData["x"]) == len(baseData["y"]):
    xdata = baseData["x"]
    ydata = baseData["y"]
    nums = []
    for i in range(len(xdata)):
        nums.append(xdata[i] - ydata[i])
    median = np.median(nums)
else:
    print("malformed data")
def getY(x):
    pass
while True:
    data = input("X/Data:")
    print(int(data)-median)
To use the program, enter pairs of X and Y values; then enter an X value and it will predict the corresponding Y value.
- Do you have a particular question or concern with your code? – KaPy3141, Mar 13, 2021 at 22:16
- @KaPy3141 I want to know some ways I can improve or minify this code. – UCYT5040, Mar 13, 2021 at 22:43
2 Answers
Maybe you should check the validity of the input and ask again if it is invalid?
while baseData["collecting"]:
    baseData["x"].append(float(input("X:")))
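One way to re-prompt on bad input is sketched below. The helper name `read_float` and its `read` parameter are assumptions made for illustration (the parameter only exists so the function can be exercised without a keyboard); they are not part of the original program.

```python
def read_float(prompt, read=input):
    """Keep asking until the reply parses as a float (hypothetical helper)."""
    while True:
        reply = read(prompt)
        try:
            return float(reply)
        except ValueError:
            # float() raises ValueError for text such as "abc" or an empty line.
            print(f"{reply!r} is not a number, please try again.")
```

With this in place, the collection loop could call `read_float("X:")` and `read_float("Y:")` instead of `float(input(...))`.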
This condition is always True, because X and Y values are only ever appended in pairs, so just discard this part:
if len(baseData["x"]) == len(baseData["y"]):
Maybe you should give an option to exit?
while True:
    data = input("X/Data:")
    print(int(data)-median)
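A possible shape for a prediction loop with an exit option is sketched here. The function name `predict_loop`, the choice of an empty line or `q` as the quit signal, and the `read`/`write` parameters are all assumptions for illustration. It also parses with `float` rather than `int`, which is a deliberate change from the original so that non-integer X values are accepted.

```python
def predict_loop(median, read=input, write=print):
    """Predict Y for each X entered; an empty line or 'q' ends the loop."""
    while True:
        data = read("X/Data (press enter or type q to quit):").strip()
        if data in ("", "q"):
            break
        try:
            x = float(data)
        except ValueError:
            write("Please enter a number.")
            continue
        # Same prediction rule as the original: Y = X - median(X - Y).
        write(x - median)
```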
And in general, calling this a "machine learning thing" is quite a stretch of the imagination, no? Maybe you should have a look at how to fit data, e.g. a basic linear model fit.
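For reference, a basic linear model fit could look roughly like the sketch below, which implements ordinary least squares in plain Python. The function name `fit_line` is made up for this example; with NumPy already imported, `np.polyfit(xdata, ydata, 1)` should give the same slope and intercept.

```python
def fit_line(xdata, ydata):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xdata)
    if n != len(ydata) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xdata) / n
    mean_y = sum(ydata) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xdata)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xdata, ydata))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(slope * 10 + intercept)  # predicted Y for X = 10
```

Unlike the median-of-differences approach, this lets the prediction scale with X instead of only shifting it by a constant.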
baseData should be split into 3 separate variables (collecting, xdata, ydata); there is no reason for it to be a dict.
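As a rough illustration of that split, the collection step might be written as follows. Wrapping it in a function called `collect_pairs` with a `read` parameter is an assumption made here so the loop can be tested; the three variables themselves are the ones named above.

```python
def collect_pairs(read=input):
    """Collect X/Y pairs using three plain variables instead of a dict."""
    xdata, ydata = [], []
    collecting = True
    while collecting:
        xdata.append(float(read("X:")))
        ydata.append(float(read("Y:")))
        # Same rule as the original: enter continues, anything else stops.
        collecting = read("Do you want to keep feeding data? Press enter for yes, or type anything for no.") == ""
    return xdata, ydata
```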
nums = []
for i in range(len(xdata)):
    nums.append(xdata[i] - ydata[i])
can be written more Pythonically as:
nums = []
for x, y in zip(xdata, ydata):
    nums.append(x - y)
or even just:
nums = [x - y for x, y in zip(xdata, ydata)]
You don't need to import numpy only for the median; stdlib statistics.median should work just fine.
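Putting the last two suggestions together, the median step could be reduced to something like this sketch. The function name `median_offset` and the sample numbers are made up for illustration.

```python
from statistics import median


def median_offset(xdata, ydata):
    """Median of the pairwise differences x - y, using only the stdlib."""
    return median([x - y for x, y in zip(xdata, ydata)])


offset = median_offset([1.0, 2.0, 3.0], [0.5, 1.0, 2.5])
print(10 - offset)  # predicted Y for X = 10, same rule as the original
```

One small difference from `np.median` is that `statistics.median` returns a plain Python number rather than a NumPy scalar.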