I wrote a basic code that utilises a leapfrog integrator (aka kick-drift-kick) to simulate gravitational systems. So far it has worked successfully at simulating orbits in any potential you input.
However, for a proper test-particle simulation I want to increase the particle count to ~1 million, which is where the main issue manifests - it is remarkably slow, approximately 35 minutes for 1 Myr (the timestep at which the simulation progresses is 0.001 Myr, so 1000 steps in a for-loop).
Most of my coding experience is in Python, so my concern is that I missed something key in my C code that slows down the code so significantly. I tested different optimisation methods, like parallelising my time loop so I can integrate more particles at a single time, but that either marginally improved the runtime or non at all.
Below I am pasting the code with all the necessary comments, but I cut down on the code size for this question as some of the potentials I use are very bulky and take up space.I would be thankful for guidance and suggestions, I want to truly understand how C operates and not limit myself to half-baked code, I might have missed a key C coding practice and added artificial barriers for the code to run fast etc.
Some burning questions are:
- Are while-loops are slower than for-loops when performing orbit integration?
- Are the if-conditions I use to save particle information are slowing down my code?
- Is there a way to create a buffer array to dump particle coordinates/velocities into to save at a later date? I am worried that a 20x1,000,000x7 double array would be too massive to be computational efficient.
- If there are no particle-particle interactions to compute, would using OpenMP be considered superfluous or can it still speed up the computations?
Edit: I was unaware how compiling can affect the performance of my code, so I will add this information here: I just compile it with gcc-13 with no other options, unless I was testing OpenMP in which case I added -fopenmp.
Here is the code
#include <stdlib.h>#include <math.h>// #include <omp.h> //previous experiments, touched on at the end//Model Parameters//Haloconst double Mh=2e12; //Mass of haloconst double ah=16e3; //Halo paramterconst double ch=0.53980675319058735; //Halo paramter c=15.3, simplified for formula value//t_force parametersconst double Mt_f=2.5*3.43*1e10; //Maximum massconst double R_t=(15*2.5)*1000; // Fixed paramater determining the scale of the potentialconst double mdt=2000; //Mass gain rate (units = Myr)// Units#define pi 3.141592653589793#define G 0.00449857598201285// Sim Setupconst double theta=75*pi/180.; //Inclination of disc, explained further downconst double CCD=1; // Period of file saving, explained further downconst double dt=0.001; //Fixed time step (units = Myr)const double runtime=4e3; //Total runtime (units = Myr)const int timelim=runtime/dt; //Integer limit in for-loopchar name[999]="SIMNAME"; //Simulation snapshot name format//Number of particles#define pno 1000000//Norms of vectors norm()double norm(double x[3]) { return sqrt(x[0]*x[0]+x[1]*x[1]+x[2]*x[2]); }// Function that rotates particles around x-axisvoid rotate(double x[3],double v[3],double theta){ double yd,zd,vy,vz; yd=x[1]*cos(theta)-x[2]*sin(theta);// zd=x[1]*sin(theta)+x[2]*cos(theta);// x[1]=yd;// x[2]=zd;// vy=v[1]*cos(theta)-v[2]*sin(theta); vz=v[1]*sin(theta)+v[2]*cos(theta); v[1]=vy; v[2]=vz;}//Mass function of t_force - a time dependant potential (only increasing mass, other parameters stay constant)double Mt(double t) { double mass; if (t<mdt) {mass=Mt_f*t/mdt;} else {mass=Mt_f;} return mass; }//Calculating force for every particle (only potentials, no particle-particle interactions). h_force is a static potential representing an NFW halovoid calculate_force(double x[3], double a[3], double mt_c){ //Normss double md=norm(x); //Forces double t_force[3]={0.,0.,0.}; double h_force; t_force[0]=...; // ... is indicative of long stretches of code. These lines use the x[3] array and constants defined earlier t_force[1]=...; t_force[2]=...; for (int ind=0;ind<3;ind++) { h_force=...; a[ind]=t_force[ind]+h_force; }}//Calculating force for every particle, for setting up initial conditionsvoid force(double x[3], double a[3]){ //Norms double md=norm(x); //Forces double h_force; for (int ind=0;ind<3;ind++) { h_force=...; a[ind]=h_force; }}// Assigning velocities to particles void initial_cond(double x[3],double v[3], double a[3]){ //Norms double md=norm(x); double phi=atan2(x[1],x[0]); //Forces force(x,a); v[0]=sqrt(norm(a)*md)*cos(phi+pi/2); v[1]=sqrt(norm(a)*md)*sin(phi+pi/2); v[2]=0;}//'Kick' part of a Leapfrog integratorvoid kick(double x[3], double v[3], double a[3], double dt, double t){ calculate_force(x, a, t); for (int k=0;k<3;k++) { v[k]+=a[k]*dt/2.; }}//'Drift' part of Leapfrog integratorvoid drift(double x[3], double v[3], double dt){ for (int k=0;k<3;k++) { x[k]+=v[k]*dt; }}// Function to save particles positions/velocities/energy/etc to a filevoid snapshot( char nom[999], int snapshot,char *n1, char *n2, char *n3, double x[pno][3], double v[pno][3],double mt_c){ FILE *fill; char NAM[999]; sprintf(NAM, "%s_%.12s_%.12s_%.12s.%06d.dat", nom,n1,n2,n3,snapshot); fill = fopen(NAM,"w"); fprintf(fill,"%.16f\t%.16f\t%.16f\t%.16f\t%.16f\t%.16f\t%.16f\n", 0.0,0.0,0.0,0.0,0.0,0.0,0.0); for (int j=0;j<pno;j++) { fprintf(fill,"%.16f\t%.16f\t%.16f\t%.16f\t%.16f\t%.16f\t%.16f\n", x[j][0],x[j][1],x[j][2],v[j][0],v[j][1],v[j][2], calculate_erg(x[j],v[j],mt_c)); } fclose(fill);}/////////////////////Finished defining functions///////////////////////Core of codeint main(int argc, char **argv){ //Setting coordinates and velocities, prior to launch double t, mt_c; double mt_c; int step=1; // Coordinates, velocities, and accelerations of toy particles static double x[pno][3]; static double v[pno][3]; static double a[pno][3]; // Loading particle coordinates from .IC file, getting the initial velocities and writing them into the aformentioned arrays FILE *f1; char IC[999]; sprintf(IC,"NFW.IC"); char output_H[9]; char output_TM[9]; char output_TR[9]; f1 = fopen(IC,"r"); for (int j=0;j<pno;j++) { fscanf(f1,"%lf\t%lf\t%lf\n", &x[j][0],&x[j][1],&x[j][2]); initial_cond(x[j],v[j],a[j]); // Rotating the coordiantes and velocities, my test requires a tilted disc of toy particles rotate(x[j],v[j],pi/2-theta); } // Writing down simulation parameters for filenames snprintf(output_H,9,"%f",log10(Mh)); snprintf(output_TM,9,"%f",log10(Mt_f)); snprintf(output_TR,9,"%f",log10(R_t)); // Setting initial mass of t_force potential to 0, saving the particles as an "initial conditions" file mt_c=0; snapshot(name,0,output_H,output_TM,output_TR,x,v,mt_c); // Main orbit integration // Prior to this it was a while-loop but from previous work in Python // I was cautious it would slow computation for (int k=0;k<timelim;k+=1) { // Get current simulation time and t_force mass to use in force calculations t=k*dt; mt_c=Mt(t); // Leapfrog integrator loop // Prior to this version, I added a "#pragma omp parallel for" but it only slowed the code down for (int j=0;j<pno;j++) { kick(x[j],v[j],a[j],dt,mt_c); drift(x[j],v[j],dt); kick(x[j],v[j],a[j],dt,mt_c); } // When time meets the condition below - save snapshot. // I am cautious that this is a bottleneck, but many runs were inconclusive if (t>CCD*step) { snapshot(name,step,output_H,output_TM,output_TR,x,v,mt_c); step+=1; } t+=dt; }}