python - How to convert series to dataframe in Pandas -
i have 2 csvs need compare them based on 1 column. , need put matched rows in 1 csv , unmatched rows in other. so, created index on column in second csv , looped through first.
df1 = pd.read_csv(file1,nrows=100) df2 = pd.read_csv(file2,nrows=100) df2.set_index('crc', inplace = true) matched_list = [] non_matched_list = [] _, row in df1.iterrows(): try: x = df2.loc[row['crc']] matched_list.append(x) except keyerror: non_matched_list.append(row)
the x here series in following format
policyid 448094 statecode fl county clay county eq_site_limit 1322376.3 hu_site_limit 1322376.3 fl_site_limit 1322376.3 fr_site_limit 1322376.3 tiv_2011 1322376.3 tiv_2012 1438163.57 eq_site_deductible 0 hu_site_deductible 0.0 fl_site_deductible 0 fr_site_deductible 0 point_latitude 30.063936 point_longitude -81.707664 line residential construction masonry point_granularity 3 name: 448094,fl,clay county,1322376.3,1322376.3,1322376.3,1322376.3,1322376.3,0,0.0, dtype: object
my output csv should in following format
policyid,statecode,county,eq_site_limit,hu_site_limit,fl_site_limit,fr_site_limit,tiv_2011,tiv_2012,eq_site_deductible,hu_site_deductible,fl_site_deductible,fr_site_deductible,point_latitude,point_longitude,line,construction,point_granularity 114455,fl,clay county,498960,498960,498960,498960,498960,792148.9,0,9979.2,0,0,30.102261,-81.711777,residential,masonry,1
for series in matched , unmatched. how do it? can not rid off index in second csv performance in important.
following content of 2 csv files. file1:
policyid,statecode,county,crc,hu_site_limit,fl_site_limit,fr_site_limit,tiv_2011,tiv_2012,eq_site_deductible,hu_site_deductible,fl_site_deductible,fr_site_deductible,point_latitude,point_longitude,line,construction,point_granularity 114455,fl,clay county,589658,498960,498960,498960,498960,792148.9,0,9979.2,0,0,30.102261,-81.711777,residential,masonry,1 448094,fl,clay county,1322376.3,1322376.3,1322376.3,1322376.3,1322376.3,1438163.57,0,0,0,0,30.063936,-81.707664,residential,masonry,3 206893,fl,clay county,745689.4,190724.4,190724.4,190724.4,190724.4,192476.78,0,0,0,0,30.089579,-81.700455,residential,wood,1 333743,fl,clay county,0,12563.76,0,0,79520.76,86854.48,0,0,0,0,30.063236,-81.707703,residential,wood,3 172534,fl,clay county,0,254281.5,0,254281.5,254281.5,246144.49,0,0,0,0,30.060614,-81.702675,residential,wood,1 785275,fl,clay county,0,515035.62,0,0,515035.62,884419.17,0,0,0,0,30.063236,-81.707703,residential,masonry,3 995932,fl,clay county,0,19260000,0,0,19260000,20610000,0,0,0,0,30.102226,-81.713882,commercial,reinforced concrete,1 223488,fl,clay county,328500,328500,328500,328500,328500,348374.25,0,16425,0,0,30.102217,-81.707146,residential,wood,1 433512,fl,clay county,315000,315000,315000,315000,315000,265821.57,0,15750,0,0,30.118774,-81.704613,residential,wood,1 142071,fl,clay county,705600,705600,705600,705600,705600,1010842.56,14112,35280,0,0,30.100628,-81.703751,residential,masonry,1
file2:
policyid,statecode,county,crc,hu_site_limit,fl_site_limit,fr_site_limit,tiv_2011,tiv_2012,eq_site_deductible,hu_site_deductible,fl_site_deductible,fr_site_deductible,point_latitude,point_longitude,line,construction,point_granularity 119736,fl,clay county,498960,498960,498960,498960,498960,792148.9,0,9979.2,0,0,30.102261,-81.711777,residential,masonry,1 448094,fl,clay county,1322376.3,1322376.3,1322376.3,1322376.3,1322376.3,1438163.57,0,0,0,0,30.063936,-81.707664,residential,masonry,3 206893,fl,clay county,190724.4,190724.4,190724.4,190724.4,190724.4,192476.78,0,0,0,0,30.089579,-81.700455,residential,wood,1 333743,fl,clay county,0,79520.76,0,0,79520.76,86854.48,0,0,0,0,30.063236,-81.707703,residential,wood,3 172534,fl,clay county,0,254281.5,0,254281.5,254281.5,246144.49,0,0,0,0,30.060614,-81.702675,residential,wood,1 785275,fl,clay county,0,51564.9,0,0,515035.62,884419.17,0,0,0,0,30.063236,-81.707703,residential,masonry,3 995932,fl,clay county,0,457962,0,0,19260000,20610000,0,0,0,0,30.102226,-81.713882,commercial,reinforced concrete,1 223488,fl,clay county,328500,328500,328500,328500,328500,348374.25,0,16425,0,0,30.102217,-81.707146,residential,wood,1 433512,fl,clay county,315000,315000,315000,315000,315000,265821.57,0,15750,0,0,30.118774,-81.704613,residential,wood,1 142071,fl,clay county,705600,705600,705600,705600,705600,1010842.56,14112,35280,0,0,30.100628,-81.703751,residential,masonry,1 253816,fl,clay county,831498.3,831498.3,831498.3,831498.3,831498.3,1117791.48,0,0,0,0,30.10216,-81.719444,residential,masonry,1 894922,fl,clay county,0,24059.09,0,0,24059.09,33952.19,0,0,0,0,30.095957,-81.695099,residential,wood,1
edit: added sample csv
i think can way:
df1.loc[df1.crc.isin(df2.index)].to_csv('/path/to/matched.csv', index=false) df1.loc[~df1.crc.isin(df2.index)].to_csv('/path/to/unmatched.csv', index=false)
instead of looping...
demo:
in [62]: df1.loc[df1.crc.isin(df2.index)].to_csv(r'c:/temp/matched.csv', index=false) in [63]: df1.loc[~df1.crc.isin(df2.index)].to_csv(r'c:/temp/unmatched.csv', index=false)
results:
matched.csv:
policyid,statecode,county,crc,hu_site_limit,fl_site_limit,fr_site_limit,tiv_2011,tiv_2012,eq_site_deductible,hu_site_deductible,fl_site_deductible,fr_site_deductible,point_latitude,point_longitude,line,construction,point_granularity 448094,fl,clay county,1322376.3,1322376.3,1322376.3,1322376.3,1322376.3,1438163.57,0,0.0,0,0,30.063935999999998,-81.70766400000001,residential,masonry,3 333743,fl,clay county,0.0,12563.76,0.0,0.0,79520.76,86854.48,0,0.0,0,0,30.063236,-81.70770300000001,residential,wood,3 172534,fl,clay county,0.0,254281.5,0.0,254281.5,254281.5,246144.49,0,0.0,0,0,30.060614,-81.702675,residential,wood,1 785275,fl,clay county,0.0,515035.62,0.0,0.0,515035.62,884419.17,0,0.0,0,0,30.063236,-81.70770300000001,residential,masonry,3 995932,fl,clay county,0.0,19260000.0,0.0,0.0,19260000.0,20610000.0,0,0.0,0,0,30.102226,-81.713882,commercial,reinforced concrete,1 223488,fl,clay county,328500.0,328500.0,328500.0,328500.0,328500.0,348374.25,0,16425.0,0,0,30.102217,-81.707146,residential,wood,1 433512,fl,clay county,315000.0,315000.0,315000.0,315000.0,315000.0,265821.57,0,15750.0,0,0,30.118774,-81.704613,residential,wood,1 142071,fl,clay county,705600.0,705600.0,705600.0,705600.0,705600.0,1010842.56,14112,35280.0,0,0,30.100628000000004,-81.703751,residential,masonry,1
unmatched.csv:
policyid,statecode,county,crc,hu_site_limit,fl_site_limit,fr_site_limit,tiv_2011,tiv_2012,eq_site_deductible,hu_site_deductible,fl_site_deductible,fr_site_deductible,point_latitude,point_longitude,line,construction,point_granularity 114455,fl,clay county,589658.0,498960.0,498960.0,498960.0,498960.0,792148.9,0,9979.2,0,0,30.102261,-81.711777,residential,masonry,1 206893,fl,clay county,745689.4,190724.4,190724.4,190724.4,190724.4,192476.78,0,0.0,0,0,30.089578999999997,-81.700455,residential,wood,1
Comments
Post a Comment