Terms of interest:

sklearn is the import name for scikit-learn.

X and y are separate things: y is the target variable/column, and X is the set of columns (features) used to predict y.

Given any pandas df, call .to_numpy() to convert it to a NumPy array first.
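A minimal sketch of the X / y split and the .to_numpy() conversion. The DataFrame and its column names (height, weight, label) are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data; the column names are assumptions for this sketch.
df = pd.DataFrame({
    "height": [1.2, 1.5, 1.7, 1.9],
    "weight": [30.0, 45.0, 60.0, 80.0],
    "label": [0, 0, 1, 1],
})

# X: the feature columns, as a 2-D array of shape (n_samples, n_features)
X = df[["height", "weight"]].to_numpy()

# y: the target column, as a 1-D array of shape (n_samples,)
y = df["label"].to_numpy()

print(X.shape, y.shape)
```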

classifier score?

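-0.018 bad, 0.72 good (higher is better). A classifier's .score is mean accuracy, between 0 and 1; a negative value like -0.018 can only come from a regressor's .score, which is R^2.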
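A rough sketch of where such scores come from, assuming they are return values of an estimator's .score method. The synthetic datasets and the choice of LogisticRegression / LinearRegression are assumptions for illustration only.

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import train_test_split

# Classifier: .score returns mean accuracy on the given data (0 to 1).
Xc, yc = make_classification(n_samples=200, n_features=5, random_state=0)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(Xc, yc, random_state=0)
clf = LogisticRegression().fit(Xc_train, yc_train)
acc = clf.score(Xc_test, yc_test)

# Regressor: .score returns R^2 (at most 1, can be negative for a poor fit).
Xr, yr = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(Xr, yr, random_state=0)
reg = LinearRegression().fit(Xr_train, yr_train)
r2 = reg.score(Xr_test, yr_test)

print("accuracy:", acc)
print("R^2:", r2)
```

Scoring on held-out test data, as here, gives a more honest number than scoring on the training data.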

importdata_cleaning (puts all values between -1 and 1)

sklearn.pipeline allows you to combine steps.
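A minimal sketch of combining a preprocessing step and a model with sklearn.pipeline.Pipeline. The step names ("scale", "model"), the StandardScaler / LogisticRegression choice, and the synthetic data are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for a real X and y.
X, y = make_classification(n_samples=100, n_features=4, random_state=0)

# Each step is a (name, estimator) pair; steps run in order.
pipe = Pipeline([
    ("scale", StandardScaler()),      # preprocessing step
    ("model", LogisticRegression()),  # final estimator
])

pipe.fit(X, y)           # fits the scaler, then the model on the scaled data
preds = pipe.predict(X)  # applies the same scaling before predicting

print(preds[:5])
```

The fitted pipeline behaves like a single estimator, so it can be scored or saved as one object.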

To save a model, use pickle:

import pickle

# out_file is the path to write to; pipe is the fitted pipeline/model
with open(out_file, "wb") as out:
    pickle.dump(pipe, out)
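A sketch of the full round trip (save, then load back), assuming pipe is a fitted sklearn pipeline. The temporary file path, the synthetic data, and the pipeline steps are assumptions for illustration.

```python
import os
import pickle
import tempfile

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Fit a small pipeline on synthetic data so there is something to save.
X, y = make_classification(n_samples=100, n_features=4, random_state=0)
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression()),
]).fit(X, y)

# Hypothetical output path inside a fresh temporary directory.
out_file = os.path.join(tempfile.mkdtemp(), "pipe.pkl")

# Save the fitted pipeline.
with open(out_file, "wb") as out:
    pickle.dump(pipe, out)

# Load it back and check that it predicts the same thing.
with open(out_file, "rb") as f:
    loaded = pickle.load(f)

same = bool((loaded.predict(X) == pipe.predict(X)).all())
print(same)
```

Only unpickle files from sources you trust, and load with the same scikit-learn version that saved the model where possible.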

p-values in linear regression in sklearn