Stata - Identify unique levels of a categorical variable
I have a list of person IDs and the types of medicine each person received on specific dates.
I would like to create a variable count
that gives an indicator of 1 for the first drug that occurs, 2 for the second unique drug, and 3 for the third unique drug. When the first drug occurs again after the second and third, I want it to still have the indicator 1. Likewise, unique drug 2 should keep the value 2 throughout the person's whole medication history, and the same goes for drug 3.
     +-------------------------------------+
     | p_id        date   agent_~e   count |
     |-------------------------------------|
 38. | 1001   13dec2001   thiazide       1 |
 39. | 1001   12apr2002   thiazide       1 |
 40. | 1001   15jul2002   thiazide       1 |
 41. | 1001   28aug2002        arb       2 |
 42. | 1001   26sep2002        ccb       3 |
     |-------------------------------------|
 43. | 1001   26sep2002        arb       2 |
 44. | 1001   10oct2002        ccb       3 |
 45. | 1001   10oct2002   thiazide       1 |
 46. | 1001   10oct2002        arb       2 |
 47. | 1001   10dec2002        ccb       3 |
     |-------------------------------------|
 48. | 1001   10dec2002        arb       2 |
     +-------------------------------------+
Because each person has a different set of drugs, I think I need a fairly general solution, as opposed to something like

    gen count = 1 if agent_type == "thiazide"

For example, person 2 below has a different drug history from person 1 above.
     +-------------------------------+
     | p_id        date    agent_t~e |
     |-------------------------------|
207. | 2001   08jul1999    ace_inhib |
208. | 2001   02aug1999    ace_inhib |
209. | 2001   25aug1999    ace_inhib |
210. | 2001   22oct1999    ace_inhib |
211. | 2001   18nov1999          ccb |
     |-------------------------------|
212. | 2001   18nov1999    ace_inhib |
213. | 2001   14dec1999          ccb |
214. | 2001   12jan2000          ccb |
215. | 2001   03feb2000          ccb |
216. | 2001   03feb2000          arb |
     |-------------------------------|
217. | 2001   02mar2000          ccb |
     +-------------------------------+

"Unique" is a common misnomer here; strictly, it means occurring once only, which is not what you mean at all. "Distinct" is a better word: for a discussion in a Stata context, see here.
Please find out about dataex from SSC, which lets you show data examples that can be copied and pasted directly. Yours required some engineering to be made easy to use.
Your problem is a Stata FAQ, found here. It is a good idea to look through the FAQs before posting.
* Example generated by -dataex-. To install: ssc install dataex
clear
input float p_id str8 agent_type float(wanted date)
1001 "thiazide" 1 15322
1001 "thiazide" 1 15442
1001 "thiazide" 1 15536
1001 "arb"      2 15580
1001 "ccb"      3 15609
1001 "arb"      2 15609
1001 "ccb"      3 15623
1001 "thiazide" 1 15623
1001 "arb"      2 15623
1001 "ccb"      3 15684
1001 "arb"      2 15684
2001 "ace_inhi" 1 14433
2001 "ace_inhi" 1 14458
2001 "ace_inhi" 1 14481
2001 "ace_inhi" 1 14539
2001 "ccb"      2 14566
2001 "ace_inhi" 1 14566
2001 "ccb"      2 14592
2001 "ccb"      2 14621
2001 "ccb"      2 14643
2001 "arb"      3 14643
2001 "ccb"      2 14671
end
format date %td

* First date on which each person received each drug
bysort p_id agent_type (date) : gen firstdate = date[1]

* One group per person and drug, numbered in order of first date within person
egen group = group(p_id firstdate agent_type)

* Within each person, count the distinct groups seen so far
bysort p_id (group date agent_type): gen count = sum(group != group[_n-1])

* Check against the values the question asked for
assert count == wanted
Note that the code takes care of the possibility that two or more drugs are first used on the same day by the same person.
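To illustrate the same-day case, here is a minimal sketch using a made-up person 3001 who is not in the original data and whose first two drugs start on the same date. It assumes only the three commands from the answer. Because agent_type is the last variable passed to group(), a tie on first date should be broken by the sort order of the drug name, so each drug still gets its own value.

```stata
* Hypothetical data: person 3001 starts ccb and arb on the same day
clear
input float p_id str8 agent_type float date
3001 "ccb" 100
3001 "arb" 100
3001 "arb" 130
3001 "ccb" 130
end
format date %td

* Same three steps as in the answer
bysort p_id agent_type (date) : gen firstdate = date[1]
egen group = group(p_id firstdate agent_type)
bysort p_id (group date agent_type) : gen count = sum(group != group[_n-1])

* The tie on firstdate is broken by agent_type, so arb sorts before ccb
assert count == 1 if agent_type == "arb"
assert count == 2 if agent_type == "ccb"

list, sepby(p_id)
```

If a different tie-breaking rule is wanted, such as the order in which rows appear in the original data, that ordering would have to be stored in a variable first and used in place of agent_type.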