In R, I have a dataframe (`mydf`) that looks like:
weight | Q1 | Q2 |
---|---|---|
100 | 1 | 0 |
200 | 0 | 1 |
250 | NA | 1 |
387 | 1 | 0 |
(But with 100 columns, all binary: Q3, Q4, etc., and 2000 rows.)
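For anyone who wants to reproduce this, the four sample rows above can be built as follows (only the rows shown here; the real data is much larger):

```r
# Sample rows from the table above; the real data has ~100 Q columns and 2000 rows
mydf <- data.frame(
  weight = c(100, 200, 250, 387),
  Q1 = c(1, 0, NA, 1),
  Q2 = c(0, 1, 1, 0)
)
```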
I am trying to write a loop so I can get the sum of the weights for each binary value in each column.
So for example I want an output like:

| Q1 | Q1.weight | Q2 | Q2.weight |
|:------|:-----:|:-----:|:-----|
| 1 | 487 | 1 | 450 |
| 0 | 200 | 0 | 487 |

I have written a loop that does this and saves the result to an empty dataframe (`empty_df`). It reads:

    questions <- c("Q1", "Q2")
    for (i in questions) {
      empty_df[[paste0(i, "")]] <- aggregate(weight ~ mydf[[paste0(i, "")]], data = mydf, FUN = sum)
    }

This is generating the correct values but it’s giving weird column names. It’s giving me the output:
Q1$mydf[[paste0(i, "")]] | Q1$weight | Q2$mydf[[paste0(i, "")]] | Q2$weight |
---|---|---|---|
1 | 487 | 1 | 450 |
0 | 200 | 0 | 487 |
I have tried a bunch of different things to name the columns correctly, including using `gsub` to remove the `$` and the `mydf[[paste0(i, "")]]`, but nothing seems to be working.
Any ideas on how to create the output with the correct columns would be much appreciated - thanks in advance for your help!
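
One direction that might work is sketched below. This is a sketch on the four sample rows only, not verified on the full data. The odd names seem to come from `aggregate()` using the formula term verbatim as the column label, so the idea is to build the formula from the column name with `reformulate()` and rename afterwards. The names `results` and `wide` are just placeholders for this sketch, and the final `cbind` assumes every question ends up with the same number of groups.

```r
# Sample rows from the question (the real data is larger)
mydf <- data.frame(
  weight = c(100, 200, 250, 387),
  Q1 = c(1, 0, NA, 1),
  Q2 = c(0, 1, 1, 0)
)

questions <- c("Q1", "Q2")

# One small data frame per question, with columns <Q> and <Q>.weight
results <- lapply(questions, function(q) {
  # reformulate("Q1", response = "weight") builds the formula weight ~ Q1,
  # so aggregate() labels the grouping column with the question name
  out <- aggregate(reformulate(q, response = "weight"), data = mydf, FUN = sum)
  # aggregate() returns groups in ascending order (0, then 1); list 1 first
  out <- out[order(out[[q]], decreasing = TRUE), ]
  names(out) <- c(q, paste0(q, ".weight"))
  out
})

# Assumption: every question has the same number of groups, so the
# per-question results can be placed side by side
wide <- do.call(cbind, results)
rownames(wide) <- NULL
wide
```

On the sample rows this should give the four columns `Q1`, `Q1.weight`, `Q2`, `Q2.weight`. Rows where a question is `NA` are dropped for that question only, because the formula interface of `aggregate()` omits missing values by default.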