In R, I have a dataframe (mydf) that looks like:
| weight | Q1 | Q2 |
|---|---|---|
| 100 | 1 | 0 |
| 200 | 0 | 1 |
| 250 | NA | 1 |
| 387 | 1 | 0 |
(The real data has 100 columns, all binary (Q3, Q4, etc.), and 2000 rows.)
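For anyone who wants to reproduce this, the four sample rows above can be built roughly as follows. This is a minimal sketch of the sample only, not the full data, and it assumes the columns are plain numeric:

```r
# Sample rows copied from the table above; the real data has many more
# columns and rows. Column types are assumed to be numeric.
mydf <- data.frame(
  weight = c(100, 200, 250, 387),
  Q1 = c(1, 0, NA, 1),
  Q2 = c(0, 1, 1, 0)
)
```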
I am trying to write a loop so I can get the sum of the weights for each binary value in each column.
So for example I want an output like:

| Q1 | Q1.weight | Q2 | Q2.weight |
|:------|:-----:|:-----:|:-----|
| 1 | 487 | 1 | 450 |
| 0 | 200 | 0 | 487 |

I have written a loop that does this which saves the result to an empty dataframe (empty_df). It reads:

    questions <- c("Q1", "Q2")
    for (i in questions) {
      empty_df[[paste0(i, "")]] <- aggregate(weight ~ mydf[[paste0(i, "")]], data = mydf, FUN = sum)
    }

This is generating the correct values but it’s giving weird column names. It’s giving me the output:
| Q1$mydf[[paste0(i, "")]] | Q1$weight | Q2$mydf[[paste0(i, "")]] | Q2$weight |
|---|---|---|---|
| 1 | 487 | 1 | 450 |
| 0 | 200 | 0 | 487 |
I have tried a bunch of different things to name the columns correctly, including using `gsub` to remove the `$` and the `mydf[[paste0(i, "")]]` part, but nothing seems to be working.
Any ideas on how to create the output with the correct columns would be much appreciated - thanks in advance for your help!