Computing an integral in parallel

How to compute an integral in parallel

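I have a piece of code here: a function that computes the integral of another function. In the code, function() is defined as the function to be integrated.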

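I am learning parallel programming, so I need to write this code in parallel. The original program is effectively sequential, because each iteration performs a single send to one other processor and then waits for its result. To make it parallel, I want each loop iteration to perform 3 sends, one to each of the other 3 available processors. Imagine 1 processor dividing up the work (rank = 0) while the other 3 processors do the actual computation.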

Note that this is a fairly long piece of code, but I have added some comments to make it clearer:

Sequential code:

    if (myRank == 0)
    {
        // I am the controller, distribute the work
        for (step = 0; step < maxSteps; step++)
        {
            x[0] = x_start + stepSize*step;
            x[1] = x_start + stepSize*(step+1);
            nextRank = step % (numProcs-1) + 1;
            // Send the work
            MPI_Send(x,2,MPI_DOUBLE,nextRank,TAG_WORK,MPI_COMM_WORLD);
            // Receive the result
            MPI_Recv(y,2,MPI_DOUBLE,nextRank,TAG_WORK,MPI_COMM_WORLD,MPI_STATUS_IGNORE);
            sum += stepSize*0.5*(y[0]+y[1]);
        }
        // Signal workers to stop by sending empty messages with tag TAG_END
        for (nextRank = 1; nextRank < numProcs; nextRank++)
            MPI_Send(&nextRank,0,MPI_INT,nextRank,TAG_END,MPI_COMM_WORLD);
    }
    else
    {
        while (1)
        {
            // I am a worker, wait for work

            // Receive the left and right points of the trapezoid and compute
            // the corresponding function values. If the tag is TAG_END, don't
            // compute but exit.
            MPI_Recv(x,2,MPI_DOUBLE,0,MPI_ANY_TAG,MPI_COMM_WORLD,&status);
            if (status.MPI_TAG == TAG_END) break;
            y[0] = f(x[0]);
            y[1] = f(x[1]);
            // Send back the computed result
            MPI_Send(y,2,MPI_DOUBLE,0,TAG_WORK,MPI_COMM_WORLD);
        }
    }
    return sum;
}

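To parallelize it, I hard-coded everything so that it is explicit what I am doing. I made the loop advance in steps of 3. I added new arrays to store the x and y values. What I do is first collect the x values into separate arrays. Then I send each array of x values to a different processor. Each processor then evaluates the function to get the y values. Finally, the workers send them back to the controller (rank = 0), which adds up all the "integration slices".

Attempted parallel code: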

    if (myRank == 0)
    {
        // I am the controller, distribute the work
        for (step = 0; step < maxSteps; step+3)
        {
            x1[0] = x_start + stepSize*step;
            x1[1] = x_start + stepSize*(step+1);
            x2[0] = x_start + stepSize*(step+1);
            x2[1] = x_start + stepSize*((step+1)+1);
            x3[0] = x_start + stepSize*(step+2);
            x3[1] = x_start + stepSize*((step+1)+2);
            nextRank = step % (numProcs-1) + 1;
            // Send the work
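            MPI_Send(x1,2,MPI_DOUBLE,1,TAG_WORK,MPI_COMM_WORLD);
            MPI_Send(x2,2,MPI_DOUBLE,2,TAG_WORK,MPI_COMM_WORLD);
            MPI_Send(x3,2,MPI_DOUBLE,3,TAG_WORK,MPI_COMM_WORLD);
            // Receive the result
            MPI_Recv(y1,2,MPI_DOUBLE,1,TAG_WORK,MPI_COMM_WORLD,MPI_STATUS_IGNORE);
            sum += stepSize*0.5*(y1[0]+y1[1]);
            MPI_Recv(y2,2,MPI_DOUBLE,2,TAG_WORK,MPI_COMM_WORLD,MPI_STATUS_IGNORE);
            sum += stepSize*0.5*(y2[0]+y2[1]);
            MPI_Recv(y3,2,MPI_DOUBLE,3,TAG_WORK,MPI_COMM_WORLD,MPI_STATUS_IGNORE);
            sum += stepSize*0.5*(y3[0]+y3[1]);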
        }
        // Signal workers to stop by sending empty messages with tag TAG_END
        for (nextRank = 1; nextRank < numProcs; nextRank++)
            MPI_Send(&nextRank,0,MPI_INT,nextRank,TAG_END,MPI_COMM_WORLD);
    }
    else if (myRank = 1)
    {
        while (1)
        {
            MPI_Recv(x1,2,MPI_DOUBLE,0,MPI_ANY_TAG,MPI_COMM_WORLD,&status);
            if (status.MPI_TAG == TAG_END) break;
            y1[0] = func(x1[0]);
            y1[1] = func(x1[1]);
            // Send back the computed result
            MPI_Send(y1,2,MPI_DOUBLE,0,TAG_WORK,MPI_COMM_WORLD);
        }
    }
    
    else if (myRank = 2)
    {
        while (1)
        {
            MPI_Recv(x2,2,MPI_DOUBLE,0,MPI_ANY_TAG,MPI_COMM_WORLD,&status);
            if (status.MPI_TAG == TAG_END) break;
            y2[0] = func(x2[0]);
            y2[1] = func(x2[1]);
            // Send back the computed result
            MPI_Send(y2,2,MPI_DOUBLE,0,TAG_WORK,MPI_COMM_WORLD);
        }
    }
    
    else if (myRank = 3)
    {
        while (1)
        {
            MPI_Recv(x3,2,MPI_DOUBLE,0,MPI_ANY_TAG,MPI_COMM_WORLD,&status);
            if (status.MPI_TAG == TAG_END) break;
            y3[0] = func(x3[0]);
            y3[1] = func(x3[1]);
            // Send back the computed result
            MPI_Send(y3,2,MPI_DOUBLE,0,TAG_WORK,MPI_COMM_WORLD);
        }
    }
    return sum;
}

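The problem is that I no longer get any output. I am afraid I have caused a deadlock, but I cannot find where. Could I get some feedback on this approach?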

Source: https://doc.itc.rwth-aachen.de/display/VE/PPCES+2012
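Two details in the attempt above are worth checking before suspecting MPI itself. The loop header uses `step+3`, which computes a value and discards it, so `step` never advances; and `myRank = 1` assigns rather than compares. The first of these alone keeps the controller looping whenever `maxSteps` is positive, so no result is ever printed. The sketch below is a serial stand-in with no MPI calls, meant only to check the batching arithmetic: three trapezoids per iteration, with a last batch that may be short when `maxSteps` is not a multiple of 3. The names `integrate_batched` and `workers`, and the integrand `func(x) = x*x`, are assumptions made for illustration and are not part of the original code.

```c
#include <assert.h>

/* Hypothetical integrand, chosen only for this sketch. */
double func(double x) { return x * x; }

/* Serial stand-in for the controller loop: steps are handed out in
 * batches of `workers`, one trapezoid per (simulated) worker. */
double integrate_batched(double x_start, double stepSize,
                         int maxSteps, int workers)
{
    assert(workers > 0);
    double sum = 0.0;

    /* `step += workers` actually advances the counter;
     * `step + workers` alone would leave it unchanged. */
    for (int step = 0; step < maxSteps; step += workers) {
        /* The second condition guards the final, possibly short, batch. */
        for (int w = 0; w < workers && step + w < maxSteps; w++) {
            /* Worker w+1 would receive these two end points. */
            double x0 = x_start + stepSize * (step + w);
            double x1 = x_start + stepSize * (step + w + 1);
            sum += stepSize * 0.5 * (func(x0) + func(x1));
        }
    }
    return sum;
}
```

With `x_start = 0`, `stepSize = 0.1` and `maxSteps = 10`, batches of 3 and batches of 1 visit the same ten slices in the same order, so both calls should return the same trapezoid estimate of the integral of x*x over [0, 1].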

Solution

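If you want to take advantage of having, say, 8 cores (this is just an example), the best and simplest thing you can do is divide the integration interval into eight parts (how you partition it so that each part carries the same amount of work is up to you) and then compute each partial integral separately in its own thread, using the same loop you would use for a single thread.

This approach does not change your original computation, and it makes the computations completely independent of one another (so there is no resource contention at all).

At the end, you only need to add up the eight partial integrals to get the result you want.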
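A minimal sketch of this partitioning, assuming shared-memory threads via an OpenMP pragma rather than the MPI calls used in the question. The helper names (`trapezoid`, `integrate_in_parts`, `square`) and the `MAX_PARTS` limit are hypothetical. With GCC or Clang the loop runs in parallel when compiled with `-fopenmp`; without that flag the pragma is ignored and the same loop runs serially with the same result.

```c
#include <assert.h>

#define MAX_PARTS 64  /* hypothetical upper bound on the number of parts */

/* Trapezoidal rule on [a, b] with `steps` equal slices (serial). */
double trapezoid(double (*f)(double), double a, double b, int steps)
{
    double h = (b - a) / steps;
    double sum = 0.0;
    for (int i = 0; i < steps; i++) {
        double left = a + h * i;
        sum += 0.5 * h * (f(left) + f(left + h));
    }
    return sum;
}

/* Split [a, b] into `parts` equal sub-intervals, integrate each one
 * independently, then add the partial results. */
double integrate_in_parts(double (*f)(double), double a, double b,
                          int parts, int steps_per_part)
{
    assert(parts > 0 && parts <= MAX_PARTS && steps_per_part > 0);
    double partial[MAX_PARTS];
    double width = (b - a) / parts;

    /* Each iteration writes only its own slot, so iterations share no state. */
    #pragma omp parallel for
    for (int p = 0; p < parts; p++) {
        double lo = a + width * p;
        partial[p] = trapezoid(f, lo, lo + width, steps_per_part);
    }

    /* Final step: add the partial integrals. */
    double total = 0.0;
    for (int p = 0; p < parts; p++)
        total += partial[p];
    return total;
}

/* Example integrand: the integral of x*x over [0, 1] is 1/3. */
double square(double x) { return x * x; }
```

Calling `integrate_in_parts(square, 0.0, 1.0, 8, 1000)` should give a value close to 1/3. Storing each partial result in its own array slot and summing afterwards keeps the parts independent, which is the property the answer relies on.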

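If you are thinking about things like unrolling the loop to increase parallelism, you are better off trusting your compiler: its optimizer can exploit the 32 or more registers that an ordinary CPU has today, and you most likely will not do better by hand.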

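The approach suggested here turns your integral into 8 separate integral computations, each with different parameters and different values, and the calculation in one thread does not depend on the calculations in the others. So even on pipeline-based multithreaded cores you do not have to reorder or complicate the instructions, because instructions from another thread can easily be fed into the pipeline to avoid bubbles. If you have 8 cores, running more than 8 threads on the computation gives you no real benefit.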
