r - Create multiple dataframes by subsetting one dataframe based on the condition of another dataframe


Suppose I have a dataframe df1:

   home.id timeframe_start timeframe_end
2    58960      1476748800    1477353600
4    56862      1474329600    1474934400
6    40482      1454284800    1454889600
8    52105      1476748800    1477353600
10   37244      1476748800    1477353600
12   58213      1476748800    1477353600
14   17734      1458000000    1458604800
16   39786      1458000000    1458604800
18   42613      1458000000    1458604800

Then I have a second dataframe df2, which includes the same home_ids but has many different instances of each (only part of it is displayed here):

     home_id             property_name timestamp_millis     value
1      58960        inside_temperature     1.475849e+12 18.510000
2      58960        inside_temperature     1.475850e+12 19.810000
3      58960        inside_temperature     1.475850e+12 19.630000
4      58960        inside_temperature     1.475850e+12 19.470000
5      58960        inside_temperature     1.475850e+12 19.300000
6      58960        inside_temperature     1.475851e+12 19.470000
2482   58960 boiler_output_temperature     1.476755e+12 55.000000
2483   58960 boiler_output_temperature     1.476755e+12 53.000000
2484   58960 boiler_output_temperature     1.476755e+12 51.000000
2485   58960 boiler_output_temperature     1.476755e+12 47.000000
2486   58960 boiler_output_temperature     1.476755e+12 46.000000
2487   58960 boiler_output_temperature     1.476756e+12 55.000000
2488   58960 boiler_output_temperature     1.476756e+12 58.000000
2489   58960 boiler_output_temperature     1.476756e+12 61.000000

Now I want to create, for every row of df1, a dataframe containing the instances of df2 that have the same id and fulfill two conditions: property_name == 'inside_temperature', and a timestamp that falls between the df1 columns timeframe_start and timeframe_end.

So as a result, I would have created 18 different dataframes, one for each instance in df1, that include only 'inside_temperature' rows with timestamps inside the window defined in df1.

  home_id      property_name timestamp_millis     value
1   58960 inside_temperature     1.475849e+12 18.510000
2   58960 inside_temperature     1.475850e+12 19.810000
3   58960 inside_temperature     1.475850e+12 19.630000
4   58960 inside_temperature     1.475850e+12 19.470000
5   58960 inside_temperature     1.475850e+12 19.300000
6   58960 inside_temperature     1.475851e+12 19.470000

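Since I don't have your dataframes to reproduce the code, I'd give a general suggestion: avoid for-loops and keep the data in one place.

You can use the tidyr and purrr packages (together with dplyr for the pipe, group_by, filter and mutate).

For example: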

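library(dplyr)
library(tidyr)
library(purrr)

# group by home.id and nest the remaining columns into a list-column named "data"
df1 <- df1 %>%
  group_by(home.id) %>%
  nest()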

Then write a function that takes a home.id and the rest of the data for that id, applies the conditions you want to filter df2 by, and gives back a df with the desired rows.

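getdetails <- function(id, data) {
  # keep the df2 rows for this home, this property and this time window;
  # timeframe_* look like epoch seconds while timestamp_millis is in
  # milliseconds, so the bounds are scaled by 1000 (drop this if both
  # columns already share a unit)
  df2 %>%
    filter(home_id == id &
           property_name == 'inside_temperature' &
           timestamp_millis > data$timeframe_start * 1000 &
           timestamp_millis < data$timeframe_end * 1000)
}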
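One thing worth checking before relying on the filter is the unit of each time column. Judging only from the printed values, timeframe_start and timeframe_end look like epoch seconds (ten digits), while timestamp_millis is in milliseconds (around 1.47e+12). That is an assumption read off the sample output, not something the question states. A minimal base R sketch of the comparison for one df1 row and one df2 reading (the variable names are made up for illustration):

```r
# Window for home.id 58960 as printed in df1 (assumed to be epoch seconds)
window_start_s <- 1476748800
window_end_s   <- 1477353600

# Scale to milliseconds so the bounds are comparable with timestamp_millis
window_start_ms <- window_start_s * 1000
window_end_ms   <- window_end_s * 1000

# One boiler reading from df2 as printed (the display rounds the real value)
ts_ms <- 1.476755e12

# Comparing against the raw second-based bounds can never match
naive_match  <- ts_ms > window_start_s & ts_ms < window_end_s

# Comparing against the millisecond bounds does
scaled_match <- ts_ms > window_start_ms & ts_ms < window_end_ms
```

If the two columns turn out to share a unit in the real data, the scaling step is unnecessary.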

Then add a list-column to the nested df, where each element holds the resulting df from the previous step:

df1 <- df1 %>%
  mutate(all_data = map2(home.id, data, getdetails))

It might need some modifications, but something like this should work and give you a single df with one row per home.id (18 in your case), each holding all of its info.
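To make the idea concrete, here is a small end-to-end sketch on made-up data. The toy values, the get_details and per_home names, and the assumption that the window bounds are in seconds while the readings are in milliseconds are all illustrative; the column names are taken from the question. It assumes each home.id appears once in df1.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical toy data: windows in seconds, readings in milliseconds
df1 <- tibble(
  home.id         = c(1, 2),
  timeframe_start = c(100, 200),
  timeframe_end   = c(110, 210)
)

df2 <- tibble(
  home_id          = c(1, 1, 1, 1, 2, 2),
  property_name    = c("inside_temperature", "inside_temperature",
                       "boiler_output_temperature", "inside_temperature",
                       "inside_temperature", "inside_temperature"),
  timestamp_millis = c(105000, 120000, 105000, 101000, 205000, 150000),
  value            = c(18.5, 19.8, 55, 19.6, 20.1, 21.0)
)

# Rows of df2 for one home, one property, inside that home's window
get_details <- function(id, window) {
  df2 %>%
    filter(home_id == id,
           property_name == "inside_temperature",
           timestamp_millis > window$timeframe_start * 1000,
           timestamp_millis < window$timeframe_end * 1000)
}

nested <- df1 %>%
  group_by(home.id) %>%
  nest() %>%
  ungroup() %>%
  mutate(all_data = map2(home.id, data, get_details))

# If separate dataframes are really needed, pull them out as a named list
per_home <- setNames(nested$all_data, as.character(nested$home.id))
```

Keeping the results in a list-column (or a named list such as per_home) tends to be easier to work with than many free-standing dataframe variables, since the whole set can still be processed with map or lapply.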

