Quantcast
Channel: Recent Questions - Stack Overflow
Viewing all articles
Browse latest Browse all 15641

Straightforward C code takes a long time to run, any methods of optimising it?

$
0
0

I wrote a basic code that utilises a leapfrog integrator (aka kick-drift-kick) to simulate gravitational systems. So far it has worked successfully at simulating orbits in any potential you input.

However, for a proper test-particle simulation I want to increase the particle count to ~1 million, which is where the main issue manifests - it is remarkably slow, approximately 35 minutes for 1 Myr (the timestep at which the simulation progresses is 0.001 Myr, so 1000 steps in a for-loop).

Most of my coding experience is in Python, so my concern is that I missed something key in my C code that slows down the code so significantly. I tested different optimisation methods, like parallelising my time loop so I can integrate more particles at a single time, but that either marginally improved the runtime or non at all.

Below I am pasting the code with all the necessary comments, but I cut down on the code size for this question as some of the potentials I use are very bulky and take up space.I would be thankful for guidance and suggestions, I want to truly understand how C operates and not limit myself to half-baked code, I might have missed a key C coding practice and added artificial barriers for the code to run fast etc.

Some burning questions are:

  • Are while-loops are slower than for-loops when performing orbit integration?
  • Are the if-conditions I use to save particle information are slowing down my code?
  • Is there a way to create a buffer array to dump particle coordinates/velocities into to save at a later date? I am worried that a 20x1,000,000x7 double array would be too massive to be computational efficient.
  • If there are no particle-particle interactions to compute, would using OpenMP be considered superfluous or can it still speed up the computations?

Edit: I was unaware how compiling can affect the performance of my code, so I will add this information here: I just compile it with gcc-13 with no other options, unless I was testing OpenMP in which case I added -fopenmp.

Here is the code

#include <stdlib.h>#include <math.h>// #include <omp.h> //previous experiments, touched on at the end//Model Parameters//Haloconst double Mh=2e12; //Mass of haloconst double ah=16e3; //Halo paramterconst double ch=0.53980675319058735; //Halo paramter c=15.3, simplified for formula value//t_force parametersconst double Mt_f=2.5*3.43*1e10; //Maximum massconst double R_t=(15*2.5)*1000; // Fixed paramater determining the scale of the potentialconst double mdt=2000; //Mass gain rate (units = Myr)// Units#define pi 3.141592653589793#define G 0.00449857598201285// Sim Setupconst double theta=75*pi/180.; //Inclination of disc, explained further downconst double CCD=1; // Period of file saving, explained further downconst double dt=0.001; //Fixed time step (units = Myr)const double runtime=4e3; //Total runtime (units = Myr)const int timelim=runtime/dt; //Integer limit in for-loopchar name[999]="SIMNAME"; //Simulation snapshot name format//Number of particles#define pno 1000000//Norms of vectors norm()double norm(double x[3])    {      return sqrt(x[0]*x[0]+x[1]*x[1]+x[2]*x[2]);    }// Function that rotates particles around x-axisvoid rotate(double x[3],double v[3],double theta){    double yd,zd,vy,vz;    yd=x[1]*cos(theta)-x[2]*sin(theta);//    zd=x[1]*sin(theta)+x[2]*cos(theta);//    x[1]=yd;//    x[2]=zd;//    vy=v[1]*cos(theta)-v[2]*sin(theta);    vz=v[1]*sin(theta)+v[2]*cos(theta);    v[1]=vy;    v[2]=vz;}//Mass function of t_force - a time dependant potential (only increasing mass, other parameters stay constant)double Mt(double t)    {        double mass;        if (t<mdt) {mass=Mt_f*t/mdt;}        else {mass=Mt_f;}        return mass;    }//Calculating force for every particle (only potentials, no particle-particle interactions). h_force is a static potential representing an NFW halovoid calculate_force(double x[3], double a[3], double mt_c){    //Normss    double md=norm(x);    //Forces    double t_force[3]={0.,0.,0.};    double h_force;    t_force[0]=...; // ... is indicative of long stretches of code. These lines use the x[3] array and constants defined earlier    t_force[1]=...;    t_force[2]=...;    for (int ind=0;ind<3;ind++)    {        h_force=...;        a[ind]=t_force[ind]+h_force;    }}//Calculating force for every particle, for setting up initial conditionsvoid force(double x[3], double a[3]){    //Norms    double md=norm(x);    //Forces    double h_force;    for (int ind=0;ind<3;ind++)    {        h_force=...;        a[ind]=h_force;    }}// Assigning velocities to particles void initial_cond(double x[3],double v[3], double a[3]){    //Norms    double md=norm(x);    double phi=atan2(x[1],x[0]);    //Forces    force(x,a);    v[0]=sqrt(norm(a)*md)*cos(phi+pi/2);    v[1]=sqrt(norm(a)*md)*sin(phi+pi/2);    v[2]=0;}//'Kick' part of a Leapfrog integratorvoid kick(double x[3], double v[3], double a[3], double dt, double t){     calculate_force(x, a, t);    for (int k=0;k<3;k++)    {        v[k]+=a[k]*dt/2.;    }}//'Drift' part of Leapfrog integratorvoid drift(double x[3], double v[3], double dt){    for (int k=0;k<3;k++)    {         x[k]+=v[k]*dt;    }}// Function to save particles positions/velocities/energy/etc to a filevoid snapshot( char nom[999], int snapshot,char *n1, char *n2, char *n3, double x[pno][3], double v[pno][3],double mt_c){    FILE *fill;    char NAM[999];    sprintf(NAM, "%s_%.12s_%.12s_%.12s.%06d.dat", nom,n1,n2,n3,snapshot);    fill = fopen(NAM,"w");        fprintf(fill,"%.16f\t%.16f\t%.16f\t%.16f\t%.16f\t%.16f\t%.16f\n", 0.0,0.0,0.0,0.0,0.0,0.0,0.0);    for (int j=0;j<pno;j++)            {            fprintf(fill,"%.16f\t%.16f\t%.16f\t%.16f\t%.16f\t%.16f\t%.16f\n", x[j][0],x[j][1],x[j][2],v[j][0],v[j][1],v[j][2],                calculate_erg(x[j],v[j],mt_c));            }    fclose(fill);}/////////////////////Finished defining functions///////////////////////Core of codeint main(int argc, char **argv){    //Setting coordinates and velocities, prior to launch    double t, mt_c;    double mt_c;    int step=1;    // Coordinates, velocities, and accelerations of toy particles    static double x[pno][3];    static double v[pno][3];    static double a[pno][3];    // Loading particle coordinates from .IC file, getting the initial velocities and writing them into the aformentioned arrays    FILE *f1;    char IC[999];    sprintf(IC,"NFW.IC");    char output_H[9];    char output_TM[9];    char output_TR[9];    f1 = fopen(IC,"r");        for (int j=0;j<pno;j++)    {        fscanf(f1,"%lf\t%lf\t%lf\n", &x[j][0],&x[j][1],&x[j][2]);        initial_cond(x[j],v[j],a[j]);        // Rotating the coordiantes and velocities, my test requires a tilted disc of toy particles        rotate(x[j],v[j],pi/2-theta);    }    // Writing down simulation parameters for filenames    snprintf(output_H,9,"%f",log10(Mh));    snprintf(output_TM,9,"%f",log10(Mt_f));    snprintf(output_TR,9,"%f",log10(R_t));    // Setting initial mass of t_force potential to 0, saving the particles as an "initial conditions" file    mt_c=0;    snapshot(name,0,output_H,output_TM,output_TR,x,v,mt_c);    // Main orbit integration    // Prior to this it was a while-loop but from previous work in Python    // I was cautious it would slow computation    for (int k=0;k<timelim;k+=1)    {        // Get current simulation time and t_force mass to use in force calculations        t=k*dt;        mt_c=Mt(t);        // Leapfrog integrator loop        // Prior to this version, I added a "#pragma omp parallel for" but it only slowed the code down        for (int j=0;j<pno;j++)        {            kick(x[j],v[j],a[j],dt,mt_c);            drift(x[j],v[j],dt);            kick(x[j],v[j],a[j],dt,mt_c);        }        // When time meets the condition below - save snapshot.         // I am cautious that this is a bottleneck, but many runs were inconclusive        if (t>CCD*step)        {               snapshot(name,step,output_H,output_TM,output_TR,x,v,mt_c);            step+=1;        }        t+=dt;    }}

Viewing all articles
Browse latest Browse all 15641

Trending Articles