stata - Identify unique levels of categorical variable -


I have a list of person IDs and the types of medicine each person received on specific dates.

I want to create a variable, count, that gives an indicator of 1 for the first drug that occurs, 2 for the second unique drug, and 3 for the third unique drug. When the first drug occurs again after the second and third, I still want it to have indicator 1. Likewise, unique drug 2 should keep the value 2 throughout the person's whole medication history, and the same goes for drug 3.

     +-------------------------------------+
     | p_id      date      agent_~e  count |
     |-------------------------------------|
 38. |  1001   13dec2001   thiazide       1|
 39. |  1001   12apr2002   thiazide       1|
 40. |  1001   15jul2002   thiazide       1|
 41. |  1001   28aug2002        arb       2|
 42. |  1001   26sep2002        ccb       3|
     |-------------------------------------|
 43. |  1001   26sep2002        arb       2|
 44. |  1001   10oct2002        ccb       3|
 45. |  1001   10oct2002   thiazide       1|
 46. |  1001   10oct2002        arb       2|
 47. |  1001   10dec2002        ccb       3|
     |-------------------------------------|
 48. |  1001   10dec2002        arb       2|
     +-------------------------------------+

Because each person has a different set of drugs, I think I need a fairly general solution, as opposed to something like

    gen count = 1 if agent_type == "thiazide"

For example, person 2 below has a different drug history from person 1 above.

     +-------------------------------+
     | p_id    date        agent_t~e |
     |-------------------------------|
207. |  2001   08jul1999   ace_inhib |
208. |  2001   02aug1999   ace_inhib |
209. |  2001   25aug1999   ace_inhib |
210. |  2001   22oct1999   ace_inhib |
211. |  2001   18nov1999         ccb |
     |-------------------------------|
212. |  2001   18nov1999   ace_inhib |
213. |  2001   14dec1999         ccb |
214. |  2001   12jan2000         ccb |
215. |  2001   03feb2000         ccb |
216. |  2001   03feb2000         arb |
     |-------------------------------|
217. |  2001   02mar2000         ccb |
     +-------------------------------+

"Unique" is a common misnomer here; strictly, it means occurring once only, which is not what you mean at all. "Distinct" is a better word: for a discussion in a Stata context, see here.
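To make the "distinct" versus "unique" point concrete, here is a minimal sketch that counts the distinct drugs per person. The toy data and the variable names tag and ndistinct are made up for illustration; egen's tag() and total() functions do the work.

```stata
* Toy data (hypothetical): person 1001 has 3 distinct drugs, person 2001 has 1
clear
input float p_id str8 agent_type
1001 "thiazide"
1001 "arb"
1001 "thiazide"
1001 "ccb"
2001 "ccb"
2001 "ccb"
end

* tag() marks exactly one observation in each p_id-agent_type combination
egen tag = tag(p_id agent_type)

* summing the tags within person gives the number of distinct drugs
egen ndistinct = total(tag), by(p_id)

list, sepby(p_id)
```

Note that thiazide is not "unique" for person 1001 in the strict sense, as it occurs twice, but it is still one of three distinct drugs.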

Please find out about dataex from SSC so that you can show data examples that can be copied and pasted directly. Yours required some engineering to be made easy to use.

Your problem is covered by a Stata FAQ, found here. It is a good idea to look through the FAQs before posting.
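As a rough sketch of what that looks like in practice, assuming a small made-up dataset is in memory: dataex writes out an input block that others can paste straight into Stata. The install line is only needed if dataex is not already available on your system.

```stata
* Install once if needed (requires an internet connection):
* ssc install dataex

* Toy data (hypothetical), standing in for your own dataset
clear
input float p_id str8 agent_type float date
1001 "thiazide" 15322
1001 "arb"      15580
1001 "ccb"      15609
end
format date %td

* Display a copy-and-paste data example for the named variables
dataex p_id agent_type date
```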

    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float p_id str8 agent_type float(wanted date)
    1001 "thiazide" 1 15322
    1001 "thiazide" 1 15442
    1001 "thiazide" 1 15536
    1001 "arb"      2 15580
    1001 "ccb"      3 15609
    1001 "arb"      2 15609
    1001 "ccb"      3 15623
    1001 "thiazide" 1 15623
    1001 "arb"      2 15623
    1001 "ccb"      3 15684
    1001 "arb"      2 15684
    2001 "ace_inhi" 1 14433
    2001 "ace_inhi" 1 14458
    2001 "ace_inhi" 1 14481
    2001 "ace_inhi" 1 14539
    2001 "ccb"      2 14566
    2001 "ace_inhi" 1 14566
    2001 "ccb"      2 14592
    2001 "ccb"      2 14621
    2001 "ccb"      2 14643
    2001 "arb"      3 14643
    2001 "ccb"      2 14671
    end
    format date %td

    * first date on which each person received each drug
    bysort p_id agent_type (date) : gen firstdate = date[1]

    * one group per person and drug, ordered by first date within person
    egen group = group(p_id firstdate agent_type)

    * running count of distinct drugs within person, in order of first use
    bysort p_id (group date agent_type) : gen count = sum(group != group[_n-1])

    * check against the values the question asked for
    assert count == wanted

Note that the code takes care of the possibility that two or more drugs are first used on the same day by the same person.
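To illustrate the tie case, here is a small sketch using a made-up person 3001 (not in the original data) who starts arb and ccb on the same day. Because egen's group() numbers groups in sorted order of p_id, firstdate and agent_type, drugs tied on first date are ordered alphabetically by agent_type, so each still gets its own stable number.

```stata
* Hypothetical person 3001: arb and ccb are both first used on the same day
clear
input float p_id str8 agent_type float date
3001 "ccb"      15000
3001 "arb"      15000
3001 "ccb"      15030
3001 "thiazide" 15030
end
format date %td

* same three steps as in the answer's code
bysort p_id agent_type (date) : gen firstdate = date[1]
egen group = group(p_id firstdate agent_type)
bysort p_id (group date agent_type) : gen count = sum(group != group[_n-1])

* arb and ccb tie on first date; the tie is broken alphabetically
list p_id date agent_type count, sepby(p_id)
```

Here arb gets 1, ccb gets 2 and thiazide gets 3. If a different tie-breaking rule were wanted, another variable would have to be placed between firstdate and agent_type in the group() call.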

