I have a large Polars DataFrame with many groups. I want to partition it by group and save each sub-DataFrame to its own file. I can easily do this as follows:
for d in df.partition_by(["group1", "group2"]):
    d.write_csv(f"~/{d[0, 'group1']}_{d[0, 'group2']}.csv")
However, the approach above is sequential and slow when the DataFrame is very large and has many partitions.
Is there a Polars-native way to parallelize the loop above?
If not, how can I do it in a Python-native way instead?
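For reference, the kind of Python-native approach I have in mind is a thread pool around the per-partition write. The following is only a rough sketch, not a tested solution for large data: `write_partitions`, `out_dir`, and `max_workers` are names I made up for illustration, and it assumes that Polars releases the GIL while writing, so that threads can actually overlap.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def write_partitions(partitions, out_dir, max_workers=8):
    """Write each partition to its own CSV file using a thread pool.

    `partitions` maps a key tuple such as ("a", "x") to any object with a
    `write_csv(path)` method, for example a polars.DataFrame.
    (Hypothetical helper, not part of Polars.)
    """
    out_dir = Path(out_dir).expanduser()
    out_dir.mkdir(parents=True, exist_ok=True)

    def write_one(item):
        key, frame = item
        # Build a file name like "a_x.csv" from the group key.
        path = out_dir / ("_".join(str(k) for k in key) + ".csv")
        frame.write_csv(path)
        return path

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() waits for all writes and re-raises any worker exception.
        return list(pool.map(write_one, partitions.items()))


# Intended usage with Polars (as_dict=True returns {key_tuple: DataFrame}):
# import polars as pl
# df = pl.DataFrame({"group1": ["a", "a", "b"],
#                    "group2": ["x", "y", "x"],
#                    "value": [1, 2, 3]})
# parts = df.partition_by(["group1", "group2"], as_dict=True)
# write_partitions(parts, "~/out")
```

I do not know whether this beats the sequential loop in practice, since Polars may already use several threads inside each `write_csv` call.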