Is there a way to force the coefficients of certain independent variables to be positive in a linear regression model in R?

0 votes
In lm(y ~ x1 + x2 + x3 + ... + xn), not all independent variables should have positive coefficients. For example, we know that x1 to x5 must have positive coefficients and x6 to x10 must have negative coefficients. However, when lm(y ~ x1 + x2 + x3 + ... + x10) is run in R, some of x1 to x5 get negative coefficients and some of x6 to x10 get positive ones. I want to control the signs while still using a linear regression method. Is there a good way to do this?
Mar 4, 2022 in Machine Learning by Nandini
• 5,480 points
2,386 views

1 answer to this question.

0 votes

A Few Constraints

This may be a case of Simpson's paradox, in which the sign of an association between two variables changes depending on whether another variable is included in the model.

nls with algorithm = "port" lets you specify lower and upper bounds on the parameters.
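A minimal sketch of the nls route, using simulated data. The variable names, true coefficients, and start values are assumptions made up for illustration; only the sign bounds matter here.

```r
# Simulated data (hypothetical): x1 should enter positively, x2 negatively.
set.seed(1)
n   <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 3 * dat$x2 + rnorm(n)

# Write the linear model with explicit parameters so nls() can bound them.
# Bounds follow the order of `start`: b0 is free, b1 >= 0, b2 <= 0.
fit <- nls(y ~ b0 + b1 * x1 + b2 * x2,
           data      = dat,
           start     = list(b0 = 0, b1 = 1, b2 = -1),
           algorithm = "port",
           lower     = c(-Inf, 0, -Inf),
           upper     = c( Inf, Inf, 0))

coef(fit)
```

The start values have to lie inside the bounds, which is why b1 starts positive and b2 negative.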

nnnpls in the nnls package supports lower and upper bounds of 0, so each coefficient can be constrained to be either non-negative or non-positive. If all coefficients should be non-negative, use nnls in the same package.
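As a rough illustration of nnnpls with mixed signs, on simulated data. The data, the `signs` vector, and the centering step are assumptions of this sketch: because every column passed in is sign-constrained, the intercept is removed by centering and recovered afterwards.

```r
library(nnls)

set.seed(42)
n <- 100
X <- matrix(rnorm(n * 4), n, 4, dimnames = list(NULL, paste0("x", 1:4)))
y <- drop(2 + X %*% c(1.5, 0.5, -1, -2) + rnorm(n))

# Center predictors and response so the intercept drops out of the fit.
Xc <- sweep(X, 2, colMeans(X))
yc <- y - mean(y)

# Negative entries in `con` request a non-positive coefficient,
# non-negative entries a non-negative one.
signs <- c(1, 1, -1, -1)
fit   <- nnnpls(Xc, yc, con = signs)

beta      <- setNames(fit$x, colnames(X))
intercept <- mean(y) - sum(colMeans(X) * beta)
beta
```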

bvls (bounded-variable least squares) in the bvls package lets you specify the bounds directly.
The CVXR package's vignette includes an example of non-negative least squares.

Reformulate it as a quadratic programming problem (see Wikipedia for the formulation) and solve it with the quadprog package.
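One possible way to set up the quadratic program, assuming simulated data and a full-rank design matrix (solve.QP needs a positive definite Dmat). The constraint layout below is an illustration, not the only formulation.

```r
library(quadprog)

set.seed(7)
n <- 100
X <- cbind(1, matrix(rnorm(n * 4), n, 4))   # first column is the intercept
y <- drop(X %*% c(2, 1.5, 0.5, -1, -2) + rnorm(n))

# Least squares as a QP: minimise 1/2 b'Db - d'b with D = X'X and d = X'y.
Dmat <- crossprod(X)
dvec <- drop(crossprod(X, y))

# solve.QP() enforces t(Amat) %*% b >= bvec. Each column of Amat is one
# constraint, either +b_j >= 0 or -b_j >= 0. The zero first row leaves
# the intercept unconstrained.
signs <- c(1, 1, -1, -1)
Amat  <- rbind(0, diag(signs))
bvec  <- rep(0, length(signs))

qp   <- solve.QP(Dmat, dvec, Amat, bvec)
beta <- qp$solution   # intercept first, then the four slopes
beta
```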

The limSolve package also contains an nnls function. Negate the columns that should have negative coefficients to turn the problem into a non-negative least squares problem.
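The column-negation idea can be sketched as follows: flip the sign of each column whose coefficient must be non-positive, solve a non-negative problem, then flip the solution back. The data and the centering used to free the intercept are assumptions of this example.

```r
library(limSolve)

set.seed(3)
n <- 100
X <- matrix(rnorm(n * 4), n, 4)
y <- drop(2 + X %*% c(1.5, 0.5, -1, -2) + rnorm(n))

# Center to take the (unconstrained) intercept out of the problem.
Xc <- sweep(X, 2, colMeans(X))
yc <- y - mean(y)

# Flip the sign of columns 3 and 4, whose coefficients must be <= 0.
signs <- c(1, 1, -1, -1)
Xflip <- sweep(Xc, 2, signs, `*`)

# Namespace prefix used because the nnls package exports a function
# with the same name.
sol <- limSolve::nnls(Xflip, yc)

# Undo the flip to get coefficients for the original columns.
beta      <- signs * sol$X
intercept <- mean(y) - sum(colMeans(X) * beta)
beta
```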

Most of these packages do not offer a formula interface; instead, they require the model matrix and the dependent variable to be passed as separate arguments. If df is a data frame containing the data, with the dependent variable in the first column, the model matrix can be computed with:

B <- model.matrix(~., df[-1])

and the dependent variable is

df[[1]]
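To see what these two expressions produce, here is a small sketch on a made-up data frame (the column names and values are hypothetical). Note that model.matrix adds an intercept column, which solvers that constrain every column would constrain as well.

```r
# Hypothetical data frame: response first, then the predictors.
df <- data.frame(
  y  = c(3.1, 4.9, 7.2, 8.8),
  x1 = c(1, 2, 3, 4),
  g  = factor(c("a", "b", "a", "b"))
)

B <- model.matrix(~ ., df[-1])   # design matrix built from every predictor
y <- df[[1]]                     # dependent variable as a plain vector

colnames(B)   # "(Intercept)" "x1" "gb"

# Drop the intercept column if the solver should not constrain it
# (for example, after centering the data instead).
B_noint <- B[, -1, drop = FALSE]
```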

Certain Penalties


Another option is to add a penalty to the least squares objective function, so that the objective becomes the sum of the squared residuals plus one or more additional terms that are functions of the coefficients and tuning parameters. Although this imposes no hard constraints that guarantee the desired signs, it may produce them anyway. It is especially helpful when the problem is ill-conditioned or there are more predictors than observations.

The ridge package's linearRidge function minimizes the sum of the squared residuals plus a penalty equal to lambda times the sum of squares of the coefficients. lambda is a scalar tuning parameter that the software can determine automatically. When lambda is zero, the method reduces to least squares. The function has a formula method, which, along with the automatic tuning, makes it quite simple to use.
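The penalized objective can be illustrated in base R with the closed-form ridge solution. This is a hand-rolled sketch on simulated data, not the ridge package's implementation: `ridge_coef` is a hypothetical helper, lambda is fixed by hand, and the package's own handling of scaling and of choosing lambda is not reproduced.

```r
set.seed(11)
n <- 60
X <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, c("x1", "x2", "x3")))
y <- drop(1 + X %*% c(2, 0.5, -1) + rnorm(n))

# Center so the intercept is not penalised.
Xc <- sweep(X, 2, colMeans(X))
yc <- y - mean(y)

# Minimise sum((yc - Xc %*% b)^2) + lambda * sum(b^2); closed-form solution.
ridge_coef <- function(lambda) {
  p <- ncol(Xc)
  drop(solve(crossprod(Xc) + lambda * diag(p), crossprod(Xc, yc)))
}

b_ols   <- ridge_coef(0)    # lambda = 0 reproduces ordinary least squares
b_ridge <- ridge_coef(25)   # lambda > 0 shrinks the coefficients toward zero

# Compare the lambda = 0 solution with the slopes from lm().
all.equal(unname(b_ols), unname(coef(lm(y ~ X))[-1]))
```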

glmnet adds penalty terms with two tuning parameters. Least squares and ridge regression are included as special cases. It also supports bounds on the coefficients. The two tuning parameters can be set automatically, although there is no formula method and the procedure is not as straightforward as in the ridge package. More information can be found in the vignettes that come with it.
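A short sketch of coefficient bounds in glmnet, on simulated data. The data and the choice alpha = 0.5 are assumptions for illustration; in this example only lambda is chosen by cross-validation, while alpha is fixed by hand.

```r
library(glmnet)

set.seed(5)
n <- 200
X <- matrix(rnorm(n * 4), n, 4, dimnames = list(NULL, paste0("x", 1:4)))
y <- drop(2 + X %*% c(1.5, 0.5, -1, -2) + rnorm(n))

# x1 and x2 must be >= 0; x3 and x4 must be <= 0.
# The intercept is fitted separately and is not bounded.
lower <- c(0, 0, -Inf, -Inf)
upper <- c(Inf, Inf, 0, 0)

# alpha mixes the ridge (0) and lasso (1) penalties. cv.glmnet() picks
# lambda by cross-validation and passes the bounds through to glmnet().
cvfit <- cv.glmnet(X, y, alpha = 0.5,
                   lower.limits = lower, upper.limits = upper)

# Coefficients at the cross-validated lambda, as a named numeric vector.
beta <- as.matrix(coef(cvfit, s = "lambda.min"))[, 1]
beta
```

glmnet takes a numeric matrix rather than a formula, so a data frame would first go through model.matrix with the intercept column dropped.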

answered Mar 7, 2022 by Dev
• 6,000 points
