How to get a regression summary in scikit-learn like R does

0 votes

As an R user, I also want to get up to speed on scikit-learn.

Creating linear regression models is fine, but I can't seem to find a reasonable way to get a standard summary of the regression output.

Code example:

# Linear Regression
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LinearRegression

# Load the diabetes datasets
dataset = datasets.load_diabetes()

# Fit a linear regression model to the data
model = LinearRegression()
model.fit(dataset.data, dataset.target)
print(model)

# Make predictions
expected = dataset.target
predicted = model.predict(dataset.data)

# Summarize the fit of the model
mse = np.mean((predicted-expected)**2)
print(model.intercept_, model.coef_, mse)
print(model.score(dataset.data, dataset.target))

Issues:

  • It seems the intercept and coefficients are stored on the fitted model, and I just print them (second-to-last line) to see them.
  • What about all the other standard regression output such as R^2, adjusted R^2, p-values, etc.? If I read the examples correctly, it seems you have to write a function or equation for each of these and then print it.
  • So, is there no standard summary output for linear regression models?
  • Also, my printed array of coefficients has no variable names associated with the values; I just get the numeric array. Is there a way to print the coefficients together with the variables they belong to?

My printed output:

LinearRegression(copy_X=True, fit_intercept=True, normalize=False)
152.133484163 [ -10.01219782 -239.81908937  519.83978679  324.39042769 -792.18416163
  476.74583782  101.04457032  177.06417623  751.27932109   67.62538639] 2859.69039877
0.517749425413

Notes: I started off with Linear, Ridge and Lasso, and I have gone through the examples. The code above is for basic OLS.

Mar 14, 2022 in Machine Learning by Nandini
• 5,480 points
3,763 views

1 answer to this question.

0 votes

scikit-learn has no R-style regression summary report. The main reason is that scikit-learn is built for predictive modeling and machine learning, where a model is assessed by its performance on previously unseen data (for example, the predictive R^2 for regression).
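To illustrate what "performance on unseen data" can look like in practice, here is a minimal sketch that holds out part of the diabetes data, scores the model on it, and pairs each coefficient with its feature name. The 25% split and the random_state value are arbitrary choices for illustration, not something from the question.

```python
from sklearn import datasets
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

dataset = datasets.load_diabetes()

# Hold out 25% of the rows (an arbitrary choice) as "unseen" data
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
predicted = model.predict(X_test)

# Predictive scores computed on the held-out rows only
test_r2 = r2_score(y_test, predicted)
test_mse = mean_squared_error(y_test, predicted)
print("intercept:", model.intercept_)
print("test R^2:", test_r2)
print("test MSE:", test_mse)

# Pair each coefficient with the name of the variable it belongs to
for name, coef in zip(dataset.feature_names, model.coef_):
    print(f"{name:>4}  {coef: .3f}")
```

The loop at the end also covers the question about variable names: the loaded dataset exposes them as feature_names, in the same order as coef_.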

For classification there is a summary function, sklearn.metrics.classification_report, which computes several types of (predictive) scores for a classification model.
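For comparison, a small sketch of classification_report in use. The iris dataset, the logistic regression model, and the split settings are assumptions picked only to have something to score; any fitted classifier would do.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

iris = load_iris()

# Stratified hold-out split so every class appears in the test set
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0, stratify=iris.target
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Per-class precision, recall, F1 and support on the held-out rows
report = classification_report(
    y_test, clf.predict(X_test), target_names=iris.target_names
)
print(report)
```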

Check out statsmodels for a more traditional statistical approach; it provides an R-style summary with R^2, adjusted R^2, and p-values.

answered Mar 15, 2022 by Dev
• 6,000 points