All Levels of a Factor in a Model Matrix in R

0 votes
I have a data.frame that includes factor and numeric variables, as may be seen below.

testFrame <- data.frame(First=sample(1:10, 20, replace=TRUE), Second=sample(1:20, 20, replace=TRUE), Third=sample(1:10, 20, replace=TRUE), Fourth=rep(c("Alice","Bob","Charlie","David"), 5), Fifth=rep(c("Edward","Frank","Georgia","Hank","Isaac"), 4))

I want to construct a matrix that assigns dummy variables to the factors and leaves the numeric variables alone.

model.matrix(~ First + Second + Third + Fourth + Fifth, data=testFrame)
This drops one level (the reference level) of each factor, which is what you would expect when fitting a model with lm. But I want to create a matrix that includes a dummy or indicator variable for every level of every factor. I am not concerned about multicollinearity because I am building this matrix for glmnet.

Is there a way to have model.matrix create the dummy for every level of the factor?
Jul 9, 2022 in Data Analytics by avinash
• 1,840 points
942 views

1 answer to this question.

0 votes

Yes. You do not need to modify model.matrix() itself; its contrasts.arg argument handles this. By default, model.matrix() uses treatment contrasts, which create a dummy variable for every level of a factor except the first one (the reference level). To get a dummy variable for every level, pass contrasts.arg a named list that holds, for each factor, a contrast matrix built with contrasts = FALSE. Here's an example:

# 20 rows of numeric data; size and replace belong inside sample(), not data.frame()
testFrame <- data.frame(First = sample(1:10, 20, replace = TRUE), Second = sample(1:20, 20, replace = TRUE), Third = sample(1:10, 20, replace = TRUE))
Fourth <- rep(c("Alice", "Bob", "Charlie", "David"), 5)
Fifth <- rep(c("Edward", "Frank", "Georgia", "Hank", "Isaac"), 4)
testFrame$Fourth <- as.factor(Fourth)
testFrame$Fifth <- as.factor(Fifth)

# drop = FALSE keeps a data.frame even if only one column is a factor
dummyMatrix <- model.matrix(~ ., data = testFrame, contrasts.arg = lapply(testFrame[, sapply(testFrame, is.factor), drop = FALSE], contrasts, contrasts = FALSE))

In this example, we convert the Fourth and Fifth variables to factors and then pass the testFrame data.frame to model.matrix(). The contrasts.arg argument receives a named list, built with lapply(), that holds the result of contrasts(x, contrasts = FALSE) for each factor column x in testFrame. Each of these is an identity matrix with one column per level, so no reference level is dropped.
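
To see what contrasts = FALSE changes, here is a minimal sketch on a toy factor. The factor f and the variable names are made up for illustration and are not part of testFrame; the sketch assumes R's default contrasts option (contr.treatment for unordered factors).

```r
# A toy factor with three levels (hypothetical, for illustration only)
f <- factor(c("Alice", "Bob", "Charlie"))

# Default treatment coding: a 3 x 2 matrix; the first level, "Alice",
# is the reference and gets no column of its own
treatment <- contrasts(f)

# contrasts = FALSE: a 3 x 3 identity matrix, one indicator column per level
full <- contrasts(f, contrasts = FALSE)

print(treatment)
print(full)
```

Passing a matrix like full for each factor is what makes model.matrix() emit one column per level.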

The resulting dummyMatrix will include dummy variables for every level of each factor variable, while leaving the numeric variables unchanged.
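
As a rough check, the sketch below compares the default coding with the full coding on a small stand-in data frame. The names df, x, g and h are hypothetical. Note that with the formula ~ . the result still has an "(Intercept)" column; since glmnet fits its own intercept by default, one reasonable option (an assumption about your use case, not a requirement) is to drop that column before fitting.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Stand-in data: one numeric column and two factors (hypothetical names)
df <- data.frame(
  x = rnorm(20),
  g = factor(rep(c("Alice", "Bob", "Charlie", "David"), 5)),
  h = factor(rep(c("Edward", "Frank", "Georgia", "Hank", "Isaac"), 4))
)

# One identity contrast matrix per factor column
isFactor <- sapply(df, is.factor)
fullContrasts <- lapply(df[, isFactor, drop = FALSE], contrasts, contrasts = FALSE)

defaultMatrix <- model.matrix(~ ., data = df)                             # reference levels dropped
fullMatrix <- model.matrix(~ ., data = df, contrasts.arg = fullContrasts) # every level kept

# Remove the intercept column before handing the matrix to glmnet
xMatrix <- fullMatrix[, colnames(fullMatrix) != "(Intercept)"]

print(colnames(defaultMatrix))
print(colnames(xMatrix))
```

In the default matrix the columns for "Alice" and "Edward" are missing, while xMatrix has an indicator column for every level of g and h.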


answered Jun 22, 2023 by anonymous
• 1,380 points
