loveislonely Posted August 7, 2008

Hi there,

I am using Gaussian for a calculation that calls the DGEMM subroutine to perform a matrix multiplication, and I am sure that DGEMM has been parallelized. When I ran a test job on a node with 4 processors, the speedup was very good: about 3.6 times faster than the serial run. I then expected that going to 8 processors (the node has 8 in total) would make it much faster still. However, the result confused me: with 8 processors, the job runs at about the same speed as the serial run.

Does anyone know how to solve this? Thank you so much.
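To put the numbers in the post side by side, here is a minimal sketch that computes parallel efficiency (speedup divided by processor count) for the two runs. It also asks what Amdahl's law would predict for 8 processors if the 4-processor result were explained purely by a fixed serial fraction. Amdahl's law is brought in here only as a simple reference model, not something the post itself uses, and the function names are hypothetical.

```python
def efficiency(speedup: float, procs: int) -> float:
    """Parallel efficiency: fraction of ideal linear speedup achieved."""
    return speedup / procs


def serial_fraction(speedup: float, procs: int) -> float:
    """Serial fraction f implied by Amdahl's law, S = 1 / (f + (1 - f) / p)."""
    return (procs / speedup - 1.0) / (procs - 1.0)


def amdahl_speedup(f: float, procs: int) -> float:
    """Speedup predicted by Amdahl's law for serial fraction f on procs processors."""
    return 1.0 / (f + (1.0 - f) / procs)


if __name__ == "__main__":
    # Figures reported in the post: 3.6x on 4 processors, about 1x on 8.
    s4, s8_observed = 3.6, 1.0

    f = serial_fraction(s4, 4)
    print(f"efficiency on 4 processors: {efficiency(s4, 4):.2f}")
    print(f"efficiency on 8 processors: {efficiency(s8_observed, 8):.3f}")
    print(f"serial fraction implied by the 4-processor run: {f:.3f}")
    print(f"Amdahl prediction for 8 processors: {amdahl_speedup(f, 8):.2f}x")
```

Under this model the 8-processor run would still be expected to speed up substantially, so the observed result is not what a fixed serial fraction alone would predict.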