Manage categorical variables with NA in R


I am using a national survey to run a regression. The data frame is based on demographic and economic variables, and sometimes there are missing values, which R represents as NA.

I have categorical variables and sometimes run into problems. For example, I have a variable q that takes the value 1 if the individual is an employee, 2 if he/she is a worker but not an employee, and 3 if the person doesn't work.

I know employees can work in the private or the public sector; the problem is that for some employees I don't know which sector they work in (I have NA).

I want to construct a categorical variable that takes into account whether the employee is in the private or the public sector:

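    # 1 = not working, 2 = worker but not employee,
    # 3 = private employee, 4 = public employee, 0 = employee, sector unknown
    d.d$q2 <- ifelse(d.d$q == "3", 1,
              ifelse(d.d$q == "2", 2,
              ifelse(d.d$q == "1" & d.d$priv == "1", 3,
              ifelse(d.d$q == "1" & d.d$pubbl == "1", 4, 0))))

    d.d$q2 <- as.factor(d.d$q2)
    levels(d.d$q2)
    # "0" "1" "2" "3" "4"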

Level 0 is supposed to refer to employees whose working sector (private or public) I don't know.

My desired output has levels 1, 2, 3, 4 and drops level 0. I searched the web for a solution but only found ways to drop observations.
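One possible way to get a factor with only levels 1 to 4 is to code the unknown-sector employees as NA instead of 0, so no level 0 is ever created. The following is a minimal sketch on made-up toy data, not a definitive answer: the column names q, priv and pubbl follow the question, but the values and the 0/1/NA coding of priv and pubbl are assumptions. It uses %in% instead of == for the sector checks because %in% returns FALSE rather than NA when the value is missing.

```r
# Toy data: values are invented for illustration only.
d.d <- data.frame(
  q     = c(1, 1, 1, 2, 3),
  priv  = c(1, 0, NA, NA, NA),
  pubbl = c(0, 1, NA, NA, NA)
)

# Employees with an unknown sector fall through to NA instead of 0.
d.d$q2 <- ifelse(d.d$q == 3, 1,
          ifelse(d.d$q == 2, 2,
          ifelse(d.d$q == 1 & d.d$priv %in% 1, 3,
          ifelse(d.d$q == 1 & d.d$pubbl %in% 1, 4, NA))))

# Declare the four levels explicitly; NA is not a level.
d.d$q2 <- factor(d.d$q2, levels = 1:4)

levels(d.d$q2)
table(d.d$q2, useNA = "ifany")
```

If q2 has already been built with a level 0, setting those entries to NA and then calling droplevels() on the factor should give the same result. Note that the rows are kept in the data frame either way; with R's default na.action (na.omit), lm() leaves them out of any model that uses q2.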

Just one more question: if I create four dummy variables from q2:

    d.d$not_worker          <- ifelse(d.d$q2 == "1", 1, 0)
    d.d$public_employee     <- ifelse(d.d$q2 == "4", 1, 0)
    d.d$private_employee    <- ifelse(d.d$q2 == "3", 1, 0)
    d.d$worker_not_employee <- ifelse(d.d$q2 == "2", 1, 0)

and convert each of them to a factor with as.factor(), is running the regression while omitting the variable d.d$not_worker a solution?

    d.d$not_worker          <- as.factor(d.d$not_worker)
    d.d$public_employee     <- as.factor(d.d$public_employee)
    d.d$private_employee    <- as.factor(d.d$private_employee)
    d.d$worker_not_employee <- as.factor(d.d$worker_not_employee)

    eq1 <- lm(pip ~ public_employee + private_employee + worker_not_employee, data = d.d)
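As a point of comparison, hand-made dummies are usually not needed: lm() expands a factor into dummy variables itself and omits the reference level. The sketch below assumes q2 is already a factor with levels 1 to 4 and NA for unknown-sector employees; the pip values are random numbers invented for illustration, not survey data.

```r
set.seed(1)

# Toy data: 10 observations per category plus 5 employees with unknown sector.
d.d <- data.frame(
  pip = rnorm(45),
  q2  = factor(c(rep(1:4, each = 10), rep(NA, 5)), levels = 1:4)
)

# Make "1" (not working) the reference category explicitly.
d.d$q2 <- relevel(d.d$q2, ref = "1")

# lm() creates one dummy per non-reference level of q2.
eq1 <- lm(pip ~ q2, data = d.d)

names(coef(eq1))  # intercept plus one coefficient for each of levels 2, 3, 4
nobs(eq1)         # rows with NA in q2 are left out under the default na.action
```

Each coefficient is then read as the difference from the reference category, which is the same interpretation as omitting not_worker from a set of manual dummies.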

Thanks in advance.

