[Smeagol-discuss] problem about smeagol parallel efficiency

Barraza-lopez, Salvador sbl3 at mail.gatech.edu
Tue Jan 5 13:52:41 GMT 2010


Hi Guangping. 

The bottom line is these input lines: 

NEnergReal 1000 
NEnergImCircle 200 
NEnergImLine 50 
NPoles 20 

Neither 6 nor 8 divides these counts evenly. You have NEnergImCircle + NEnergImLine + 2*NPoles = 290 integration points for the equilibrium part and 1000 integration points for the non-equilibrium part of the energy integration. Hence, try to make your numbers of integration points DIVISIBLE by the number of processors you are using (240 integration points in the equilibrium part and 960 in the non-equilibrium part will work fine, and will scale properly, on 6 and 8 processors). Alternatively, run on 5 processors with the input file as-is and you will also see scalable performance. 
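The arithmetic behind this can be sketched quickly. The point counts below come from the input file above; the sketch assumes points are split as evenly as possible across ranks, so the time per sweep is set by the most-loaded rank, ceil(points / nprocs), and any remainder leaves some ranks idle:

```python
import math

# Point counts from the input file above.
NEnergReal = 1000                    # non-equilibrium contour points
NEnergImCircle, NEnergImLine, NPoles = 200, 50, 20
equilibrium = NEnergImCircle + NEnergImLine + 2 * NPoles   # = 290

for nprocs in (5, 6, 8):
    for label, pts in (("equilibrium", equilibrium),
                       ("non-equilibrium", NEnergReal)):
        per_rank = math.ceil(pts / nprocs)       # load on the busiest rank
        idle = per_rank * nprocs - pts           # wasted rank-slots per sweep
        print(f"{nprocs} ranks, {label}: {pts} points -> "
              f"{per_rank}/rank, {idle} idle slots")
```

Running this shows that 5 ranks divide both 290 and 1000 exactly, while 6 and 8 ranks leave a remainder in the equilibrium part (and 6 ranks in the non-equilibrium part too), which is why the suggested 240/960 counts scale cleanly.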

Best regards, 
-Salvador. 

----- Original Message ----- 
From: "张广平" <284107217 at qq.com> 
To: "smeagol-discuss" <smeagol-discuss at lists.tchpc.tcd.ie> 
Sent: Tuesday, January 5, 2010 7:50:13 AM GMT -05:00 US/Canada Eastern 
Subject: [Smeagol-discuss] problem about smeagol parallel efficiency 


Hi, every SMEAGOL user, 
I have encountered a problem: a task takes 35 minutes on one core, but 49 minutes on 8 cores. The more cores, the more time? Another example: one core takes 42 minutes while 6 cores take 26 minutes; the efficiency is very poor. 
Our OS is as follows: 
------------------------------------------------------- 
[test at localhost LIB]$ uname -a 
Linux localhost 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux 
---------------------------------------------------------- 
There are 8 cores per node, and for now we run the parallel job within a single node. The information for one core is: 
----------------------------------------------------------- 
processor : 0 
vendor_id : GenuineIntel 
cpu family : 6 
model : 23 
model name : Intel(R) Xeon(R) CPU E5430 @ 2.66GHz 
stepping : 6 
cpu MHz : 2666.844 
cache size : 6144 KB 
physical id : 0 
siblings : 4 
core id : 0 
cpu cores : 4 
apicid : 0 
fpu : yes 
fpu_exception : yes 
cpuid level : 10 
wp : yes 
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm 
bogomips : 5333.68 
clflush size : 64 
cache_alignment : 64 
address sizes : 38 bits physical, 48 bits virtual 
power management: 
------------------------------------------------------------------ 
All the 8 cores are the same in one node. 
When I compile the code for parallel mode, I use mkl-10.0.2.018 and take all the math libraries from it. The Fortran compiler is pgi7.0, and for MPI I use mpich-1.2.7. Could any of the software I use be the cause of the bad efficiency? 
My arch.make for parallel is: 
----------------------------------------------------------------- 
SIESTA_ARCH=pgf90 
FC=mpif90 
FC_ASIS=$(FC) 
FFLAGS= -tp p7-64 -OPT:Ofast -O2 
LDFLAGS= -tp p7-64 -OPT:Ofast -O2 
COMP_LIBS = 
FFLAGS_DEBUG= 
TRANSPORTFLAGS = -tp p7-64 -OPT:Ofast -O2 -c 

SOURCE_DIR=/home/test/software/smeagol-1.3.7 
EXEC = smeagolpara 
#NETCDF_LIBS=/usr/local/netcdf-3.5/lib/pgi/libnetcdf.a 
#NETCDF_INTERFACE=libnetcdf_f90.a 
#DEFS_CDF=-DCDF 
MPI_INTERFACE=libmpi_f90.a 
MPI_INCLUDE=/home/test/software/mpich-1.2.7/include 
DEFS_MPI=-DMPI 

BLAS_LIBS= -L/home/test/intel/mkl/10.0.2.018/lib/em64t -lmkl_solver -lmkl_em64t -lguide -lpthread 
LAPACK_LIBS= -L/home/test/intel/mkl/10.0.2.018/lib/em64t -lmkl_lapack -lmkl_core 
BLACS_LIBS= -L/home/test/intel/mkl/10.0.2.018/lib/em64t -lmkl_blacs_lp64 
SCALAPACK_LIBS= -L/home/test/intel/mkl/10.0.2.018/lib/em64t -lmkl_scalapack_lp64 
LIBS= $(SCALAPACK_LIBS) $(BLACS_LIBS) $(LAPACK_LIBS) $(BLAS_LIBS) 

RANLIB=echo 
SYS=bsd 
DEFS= $(DEFS_CDF) $(DEFS_MPI) 
# 
.F.o: 
$(FC) -c $(FFLAGS) $(DEFS) $< 
.f.o: 
$(FC) -c $(FFLAGS) $< 
.F90.o: 
$(FC) -c $(FFLAGS) $(DEFS) $< 
.f90.o: 
$(FC) -c $(FFLAGS) $< 

# 
------------------------------------------------------------------ 
The full contents of the MKL lib directory are: 
------------------------------------------------------------------ 
libguide.a libmkl_intel_lp64.a 
libguide.so libmkl_intel_lp64.so 
libiomp5.a libmkl_intel_sp2dp.a 
libiomp5.so libmkl_intel_sp2dp.so 
libmkl_blacs_ilp64.a libmkl_intel_thread.a 
libmkl_blacs_intelmpi20_ilp64.a libmkl_intel_thread.so 
libmkl_blacs_intelmpi20_lp64.a libmkl_lapack.a 
libmkl_blacs_intelmpi_ilp64.a libmkl_lapack.so 
libmkl_blacs_intelmpi_lp64.a libmkl_mc.so 
libmkl_blacs_lp64.a libmkl_p4n.so 
libmkl_blacs_openmpi_ilp64.a libmkl_scalapack.a 
libmkl_blacs_openmpi_lp64.a libmkl_scalapack_ilp64.a 
libmkl_cdft.a libmkl_scalapack_lp64.a 
libmkl_cdft_core.a libmkl_sequential.a 
libmkl_core.a libmkl_sequential.so 
libmkl_core.so libmkl.so 
libmkl_def.so libmkl_solver.a 
libmkl_em64t.a libmkl_solver_ilp64.a 
libmkl_gf_ilp64.a libmkl_solver_ilp64_sequential.a 
libmkl_gf_ilp64.so libmkl_solver_lp64.a 
libmkl_gf_lp64.a libmkl_solver_lp64_sequential.a 
libmkl_gf_lp64.so libmkl_vml_def.so 
libmkl_gnu_thread.a libmkl_vml_mc2.so 
libmkl_gnu_thread.so libmkl_vml_mc.so 
libmkl_intel_ilp64.a libmkl_vml_p4n.so 
libmkl_intel_ilp64.so 
--------------------------------------------------------------- 
When I run a task, I first copy the compiled executable smeagolpara from the /Src directory to my work directory; then, after the lead calculation, I use the command: mpirun -np 8 smeagolpara < Auwire.fdf > mx.log & 
By the way, can the lead calculation itself run in parallel? It seems it cannot for me. 
I put the input files in attachment. 
Any advice is welcome! 
BEST REGARDS! 
YOURS 
Guangping Zhang 
_______________________________________________ 
Smeagol-discuss mailing list 
Smeagol-discuss at lists.tchpc.tcd.ie 
http://lists.tchpc.tcd.ie/listinfo/smeagol-discuss 

-- 
Salvador Barraza-Lopez 
Postdoctoral Fellow 
School of Physics 
The Georgia Institute of Technology 

Office N205 
837 State Street Atlanta, Georgia 30332-0430 U.S.A 
Tel: (404) 894-0892 Fax: (404) 894-9958 

