Parallelization

MPI

All of the CT-HYB impurity solvers in the iQIST software package are parallelized with MPI. The strategy is simple. At the start, we launch $n$ processes simultaneously. The master process reads the input data and configuration parameters and broadcasts them to the other processes. Each process then performs Monte Carlo sampling and measures physical observables independently. After all processes have finished, the master process collects the measured quantities from all of them and averages them to obtain the final results. No other inter-process communication is needed. We can therefore expect very good parallel efficiency, with near-linear speedups, as long as the number of thermalization steps is small compared to the total number of Monte Carlo steps.
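The broadcast, independent sampling, and final averaging described above can be illustrated with a minimal sketch in plain Python. This is not the iQIST implementation: the processes are emulated by a serial loop rather than real MPI ranks, the observable is a toy hit-or-miss estimate of $\pi$ standing in for a CT-HYB measurement, and all names (`sample_observable`, `run_parallel`, `params`) are hypothetical. In a real MPI code, the broadcast and the collection steps would be calls such as `MPI_Bcast` and `MPI_Reduce`.

```python
import random


def sample_observable(rank, params):
    """One emulated process: thermalize, then measure independently."""
    # Each process uses its own random stream (assumption: seed offset by rank).
    rng = random.Random(params["seed"] + rank)

    # Thermalization: steps are performed but nothing is measured.
    for _ in range(params["n_therm"]):
        rng.random()
        rng.random()

    # Measurement: toy observable, the fraction of points inside a quarter circle.
    hits = 0
    for _ in range(params["n_sweep"]):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return 4.0 * hits / params["n_sweep"]


def run_parallel(n_proc, params):
    """Emulate the master/worker pattern with a serial loop over ranks."""
    # Step 1: the master reads the parameters and "broadcasts" a copy to every process.
    received = [dict(params) for _ in range(n_proc)]

    # Step 2: every process samples on its own; no communication takes place here.
    local = [sample_observable(rank, received[rank]) for rank in range(n_proc)]

    # Step 3: the master collects the local results and averages them.
    return sum(local) / n_proc, local


params = {"seed": 2024, "n_therm": 100, "n_sweep": 20000}
estimate, local = run_parallel(8, params)
print("per-process estimates:", local)
print("averaged estimate:", estimate)
```

Because step 2 involves no communication at all, the only serial parts of the pattern are the initial broadcast and the final averaging, which is why the strategy scales so well.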

In practical calculations, we always fix the number of Monte Carlo steps $N_{\text{sweep}}$ performed by each process and launch as many processes as possible. If the number of processes is $N_{\text{proc}}$, the total number of Monte Carlo samples is $N_{\text{proc}}N_{\text{sweep}}$. Naturally, the more processes we use, the more accurate the results: for statistically independent samples, the statistical error decreases as $1/\sqrt{N_{\text{proc}}N_{\text{sweep}}}$.
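To make the scaling argument concrete, the following sketch evaluates a simple cost model. It rests on assumptions that are not part of iQIST itself: every Monte Carlo step costs the same, communication overhead is neglected, each process must run its own $N_{\text{therm}}$ thermalization steps, and the samples are statistically independent. The function names and the symbol `n_therm` are hypothetical. Under these assumptions a serial run producing the same number of samples takes $N_{\text{therm}} + N_{\text{proc}}N_{\text{sweep}}$ steps, whereas the parallel run takes $N_{\text{therm}} + N_{\text{sweep}}$ steps of wall time.

```python
import math


def speedup(n_proc, n_sweep, n_therm):
    """Serial cost divided by parallel wall-time cost, in units of MC steps."""
    serial = n_therm + n_proc * n_sweep   # one process generates all samples
    parallel = n_therm + n_sweep          # every process thermalizes on its own
    return serial / parallel


def relative_error(n_proc, n_sweep):
    """Statistical error up to a constant prefactor (independent samples assumed)."""
    return 1.0 / math.sqrt(n_proc * n_sweep)


n_sweep, n_therm = 1_000_000, 10_000
for n_proc in (1, 4, 16, 64):
    s = speedup(n_proc, n_sweep, n_therm)
    e = relative_error(n_proc, n_sweep) / relative_error(1, n_sweep)
    print(f"N_proc={n_proc:3d}  speedup={s:7.3f}  error relative to 1 process={e:.3f}")
```

In this model the speedup approaches $N_{\text{proc}}$ when $N_{\text{therm}} \ll N_{\text{sweep}}$, and quadrupling the number of processes halves the statistical error.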

OpenMP