Compare commits
4 Commits
Author | SHA1 | Date
---|---|---
 | 80f1f66052 |
 | 45293c6962 |
 | 115bcdb4c8 |
 | 436d13ab14 |
Binary file not shown.
BIN
Gabriel-Possenti_19123_Tugas3HPC.pdf
Normal file
Binary file not shown.
BIN
Gabriel-Possenti_19123_Tugas3HPC.zip
Normal file
Binary file not shown.
Binary file not shown.
61
README.md
Normal file
@ -0,0 +1,61 @@
|
||||
<p>
|
||||
Gabriel Possenti Kheisa Drianasta<br>
|
||||
19/442374/PA/19123
|
||||
</p>
|
||||
|
||||
|
||||
<h1>OpenMP Assignment Results</h1>
|
||||
<h3>Tests with 1, 2, 4, 8, 16, 32, and 64 threads</h3>
|
||||
<h2>Results</h2>
|
||||
<p>Matrix size: 4096</p>
|
||||
<ul>
|
||||
<li>1 Thread: 227.454 sec</li>
|
||||
<li>2 Thread: 115.605 sec</li>
|
||||
<li>4 Thread: 60.744 sec</li>
|
||||
<li>8 Thread: 30.498 sec</li>
|
||||
<li>16 Thread: 26.556 sec</li>
|
||||
<li>32 Thread: 24.960 sec</li>
|
||||
<li>64 Thread: 24.707 sec</li>
|
||||
</ul>
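
The timings listed above can be turned into speedup and parallel efficiency figures. The following is a minimal sketch, not part of the original submission; the names `threads`, `seconds`, and `measuredSpeedup` are chosen here for illustration, and the numbers are copied from the list above.

```javascript
// Thread counts and wall-clock times (seconds) copied from the results list above.
const threads = [1, 2, 4, 8, 16, 32, 64];
const seconds = [227.454, 115.605, 60.744, 30.498, 26.556, 24.960, 24.707];

// Speedup S(n) = T(1) / T(n), relative to the single-thread run.
function measuredSpeedup(times) {
  return times.map((t) => times[0] / t);
}

const speedups = measuredSpeedup(seconds);

// Parallel efficiency E(n) = S(n) / n (1.0 would be perfect scaling).
const efficiencies = speedups.map((s, i) => s / threads[i]);

threads.forEach((n, i) => {
  const s = speedups[i].toFixed(2);
  const e = (100 * efficiencies[i]).toFixed(1);
  console.log(`${n} threads: speedup ${s}, efficiency ${e}%`);
});
console.log(`largest measured speedup: ${Math.max(...speedups).toFixed(2)}`);
```

With these numbers the largest measured speedup is reached at 64 threads, at roughly 9.2 times the single-thread run.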
|
||||
<br>
|
||||
<h3>Time (seconds)</h3>
|
||||
<img src="https://media.discordapp.net/attachments/1003173519879847966/1041940259719618560/image.png">
|
||||
|
||||
<h3>Time difference relative to the previous core count (seconds)</h3>
|
||||
<img src="https://cdn.discordapp.com/attachments/1003173519879847966/1042339831071657994/image.png">
|
||||
|
||||
<br><br>
|
||||
<h1>Questions</h1>
|
||||
<h3>1. What is the maximum speed up compared to single thread
|
||||
execution?</h3>
|
||||
<p>Taking the ratio of the execution times measured with 1 thread and with 2 threads, ( 227 / 115 ) seconds, gives a speedup of <b>1.97</b> times when going from 1 to 2 threads. Applying the following speedup formula:</p>
|
||||
|
||||
<img src="https://cdn.discordapp.com/attachments/1003173519879847966/1042413697986994176/image.png">
|
||||
|
||||
<p>The program is therefore <b>98.4%</b> parallel. With that percentage, its speedup compares with that of a perfectly parallel program as follows:</p>
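
The parallel fraction can be reproduced by inverting Amdahl's law, assuming that is the formula pictured above (the text names Amdahl's law under question 3). The symbols S(n) for speedup on n threads and p for the parallel fraction are notation chosen here, not taken from the original.

```latex
S(n) = \frac{1}{(1 - p) + \dfrac{p}{n}}
\quad\Longrightarrow\quad
p = \frac{1 - \dfrac{1}{S(n)}}{1 - \dfrac{1}{n}}

% With n = 2 and S(2) = 1.97:
p = \frac{1 - \dfrac{1}{1.97}}{1 - \dfrac{1}{2}} = 2\,(1 - 0.5076) \approx 0.985
```

Using the unrounded times (227.454 / 115.605) instead of 1.97 gives p of about 0.983, so the quoted 98.4% sits between the two roundings.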
|
||||
|
||||
<pre>
|
||||
Parallel 100% = [1, 2, 4, 8, 16, 32, 64]
|
||||
Parallel 98.4% = [1, 1.97, 3.8, 7.19, 12.9, 21.4, 31.8]
|
||||
</pre>
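
The two rows above can be regenerated from Amdahl's law. This is a minimal sketch under the assumption that the listed values were computed with a parallel fraction of exactly 0.984; `amdahlSpeedup` and the variable names are illustrative, not from the original.

```javascript
// Amdahl's law: predicted speedup on n threads when a fraction p of the work is parallel.
function amdahlSpeedup(p, n) {
  return 1 / ((1 - p) + p / n);
}

const threadCounts = [1, 2, 4, 8, 16, 32, 64];
const parallelFraction = 0.984; // value quoted in the text (assumed exact here)

// p = 1 reproduces the perfectly parallel row; p = 0.984 reproduces the second row.
const ideal = threadCounts.map((n) => amdahlSpeedup(1.0, n));
const predicted = threadCounts.map((n) => amdahlSpeedup(parallelFraction, n));

threadCounts.forEach((n, i) => {
  console.log(`${n} threads: ideal ${ideal[i].toFixed(2)}, Amdahl ${predicted[i].toFixed(2)}`);
});
```

Under that assumption the 64-thread entry comes out at about 31.87, which the list above shows as 31.8.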
|
||||
|
||||
<img src="https://media.discordapp.net/attachments/1003173519879847966/1042418071593304117/image.png">
|
||||
|
||||
<p>The largest gap in speedup appears between <b>8 and 16</b> threads, matching the experimental results above, so increasing the core count from 8 to 16, to 32, and beyond is relatively less effective.</p>
|
||||
|
||||
<h3>2. As we increase the number of threads, the execution time starts to
|
||||
saturate at one point (cannot get any faster). Explain why this
|
||||
happens.</h3>
|
||||
|
||||
<p>Because the program is not perfectly (100%) parallel. It is 98.4% parallel, so <b>1.6%</b> of it is serial. This serial part cannot be executed in parallel, for various reasons. Although that fraction is small, the speedup gap relative to a perfectly parallel program becomes more visible as the number of cores or threads grows.</p>
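
The effect of the serial part can be illustrated numerically. The sketch below assumes Amdahl's law with the 98.4% parallel fraction quoted above; `amdahlSpeedup`, `amdahlLimit`, and the chosen thread counts are illustrative, not from the original.

```javascript
// Amdahl's law: speedup on n threads for parallel fraction p.
function amdahlSpeedup(p, n) {
  return 1 / ((1 - p) + p / n);
}

// Upper bound as n grows without limit: only the serial part (1 - p) remains.
function amdahlLimit(p) {
  return 1 / (1 - p);
}

const p = 0.984; // parallel fraction quoted in the text

[64, 256, 1024, 4096].forEach((n) => {
  const s = amdahlSpeedup(p, n);
  // The gap to ideal scaling (n) widens as n grows, while s approaches the limit.
  console.log(`${n} threads: Amdahl ${s.toFixed(2)}, ideal ${n}, gap ${(n - s).toFixed(2)}`);
});
console.log(`limit for p = ${p}: ${amdahlLimit(p).toFixed(1)}`);
```

For a 1.6% serial part the bound is 62.5 times, however many threads are added. The measured curve flattens at roughly 9 times, well below that bound, so the serial fraction alone may not explain the whole saturation.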
|
||||
|
||||
<img src="https://media.discordapp.net/attachments/1003173519879847966/1042420674385416202/image.png">
|
||||
<img src="https://media.discordapp.net/attachments/1003173519879847966/1042421381591203871/image.png">
|
||||
|
||||
<h3>3. Find the exact number of threads where saturation begins. What is its relationship with the number of physical cores of the system?
|
||||
Explain.</h3>
|
||||
|
||||
<p>Based on the attached results, both the experiments and the Amdahl's law calculation, <b>8 threads</b> is the best core count: beyond it the difference in execution time (experimental results) is small, while the difference in speedup (calculated with Amdahl's law) is very large. In other words, going from 8 to 16 threads and beyond is relatively ineffective. For a program that is 100% parallel, the relationship between the program's thread count and the number of cores is directly proportional in terms of speedup and inversely proportional in terms of execution time.</p>
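
One way to locate the saturation point from the measurements is to look at the gain obtained each time the thread count doubles. The sketch below is an illustration only: the function names and the 1.5 cutoff are assumptions made here, and the times are copied from the results section.

```javascript
const threads = [1, 2, 4, 8, 16, 32, 64];
const seconds = [227.454, 115.605, 60.744, 30.498, 26.556, 24.960, 24.707];

// Gain per doubling: T(n) / T(2n). About 2 means doubling still pays off,
// about 1 means the run time no longer improves.
function doublingGains(times) {
  return times.slice(1).map((t, i) => times[i] / t);
}

// Returns the last thread count whose doubling still reached minGain,
// or null if every doubling did. minGain is a hypothetical cutoff.
function saturationStart(threadCounts, times, minGain) {
  const gains = doublingGains(times);
  const idx = gains.findIndex((g) => g < minGain);
  return idx === -1 ? null : threadCounts[idx];
}

doublingGains(seconds).forEach((g, i) => {
  console.log(`${threads[i]} -> ${threads[i + 1]} threads: gain ${g.toFixed(2)}`);
});
console.log(`saturation begins after ${saturationStart(threads, seconds, 1.5)} threads`);
```

With the measured times, every doubling up to 8 threads yields a gain close to 2, while 8 to 16 yields only about 1.15, which agrees with the choice of 8 threads above.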
|
||||
|
||||
<br>
|
||||
<p>Program and results attachment: https://repo.gabrielkheisa.xyz/gabrielkheisa/tugas3-openMP</p>
|
46
graph.html
Normal file
@ -0,0 +1,46 @@
|
||||
<!DOCTYPE html>
|
||||
<html>
|
||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/Chart.js/2.5.0/Chart.min.js"></script>
|
||||
<body>
|
||||
<canvas id="myChart" style="width:100%;max-width:600px"></canvas>
|
||||
|
||||
<script>
|
||||
var xValues = [1,2,4,8,16,32,64];
|
||||
var data = [227.454, 115.605, 60.744, 30.498, 26.556, 24.960, 24.707];
|
||||
var d_data = [];
|
||||
|
||||
var speedup = [1, 1.97, 3.8, 7.19, 12.9, 21.4, 31.8];
|
||||
|
||||
var i = 1;
|
||||
|
||||
while(i < data.length){
|
||||
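// Difference relative to the previous thread count; data[i+1] would read past the end of the array.
d_data[i] = data[i-1] - data[i];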
|
||||
i++;
|
||||
}
|
||||
|
||||
/*
|
||||
1 Thread: 227.454 sec
|
||||
2 Thread: 115.605 sec
|
||||
4 Thread: 60.744 sec
|
||||
8 Thread: 30.498 sec
|
||||
16 Thread: 26.556 sec
|
||||
32 Thread: 24.960 sec
|
||||
64 Thread: 24.707 sec
|
||||
*/
|
||||
|
||||
|
||||
new Chart("myChart", {
|
||||
type: "line",
|
||||
data: {
|
||||
labels: xValues,
|
||||
datasets: [{
|
||||
data: data,
|
||||
borderColor: "black",
|
||||
fill: false
|
||||
}]
|
||||
},
|
||||
options: {
|
||||
legend: {display: false}
|
||||
}
|
||||
});
|
||||
</script>
|
@ -1,7 +0,0 @@
|
||||
mahasiswa2
|
||||
komputasi06
|
||||
mimax= 129 mjmax= 65 mkmax= 65
|
||||
imax= 128 jmax= 64 kmax= 64
|
||||
Start rehearsal measurement process.
|
||||
Measure the performance in 10000 times.
|
||||
MFLOPS: 6017.17725 time(s): 27.3678112 8.79942896E-10
|
@ -1,7 +0,0 @@
|
||||
mahasiswa2
|
||||
komputasi06
|
||||
mimax= 129 mjmax= 65 mkmax= 65
|
||||
imax= 128 jmax= 64 kmax= 64
|
||||
Start rehearsal measurement process.
|
||||
Measure the performance in 10000 times.
|
||||
MFLOPS: 724.292114 time(s): 227.362640 8.79942896E-10
|
15
no_reduction.f90
Normal file
@ -0,0 +1,15 @@
|
||||
program summ
|
||||
implicit none
|
||||
integer :: sum = 0
|
||||
integer :: n
|
||||
!$OMP parallel do
|
||||
do n = 0, 1000
|
||||
sum = sum + n
|
||||
print*, n, " ", sum
|
||||
end do
|
||||
!$OMP end parallel do
|
||||
print*, " "
|
||||
print*, " "
|
||||
print*, " "
|
||||
print*, "hasilnya adalah ", sum
|
||||
end program summ
|
12
no_reduction.sh
Normal file
@ -0,0 +1,12 @@
|
||||
#!/bin/bash
|
||||
|
||||
#SBATCH --nodes=1
|
||||
#SBATCH --time=00:01:10
|
||||
#SBATCH --job-name=Gabriel
|
||||
|
||||
whoami
|
||||
hostname
|
||||
|
||||
gfortran -fopenmp no_reduction.f90 -o no_reduction.x
|
||||
export OMP_NUM_THREADS=4
|
||||
./no_reduction.x
|
BIN
no_reduction.x
Executable file
Binary file not shown.
46
readme.md
@ -1,46 +0,0 @@
|
||||
|
||||
<p>
|
||||
Gabriel Possenti Kheisa Drianasta<br>
|
||||
19/442374/PA/19123
|
||||
</p>
|
||||
|
||||
<h1>Performance Tuning Results</h1>
|
||||
|
||||
<h3>Before parallelization</h3>
|
||||
<pre>
|
||||
mahasiswa2
|
||||
komputasi06
|
||||
mimax= 129 mjmax= 65 mkmax= 65
|
||||
imax= 128 jmax= 64 kmax= 64
|
||||
Start rehearsal measurement process.
|
||||
Measure the performance in 10000 times.
|
||||
MFLOPS: 724.292114 time(s): 227.362640 8.79942896E-10
|
||||
</pre>
|
||||
|
||||
<h3>After parallelization</h3>
|
||||
<pre>
|
||||
mahasiswa2
|
||||
komputasi06
|
||||
mimax= 129 mjmax= 65 mkmax= 65
|
||||
imax= 128 jmax= 64 kmax= 64
|
||||
Start rehearsal measurement process.
|
||||
Measure the performance in 10000 times.
|
||||
MFLOPS: 6017.17725 time(s): 27.3678112 8.79942896E-10
|
||||
</pre>
|
||||
|
||||
|
||||
|
||||
<h1>Discussion</h1>
|
||||
|
||||
<h3>The OMP do / end do directives were inserted at the following places in the <a href="https://elok.ugm.ac.id/pluginfile.php/2173252/mod_resource/content/1/sample4.f">himeno</a> program:</h3>
|
||||
<p>https://repo.gabrielkheisa.xyz/gabrielkheisa/tugas3-openMP/commit/1f11c6330793ba22afb953d114078c87e65d522c</p>
|
||||
<img src="https://cdn.discordapp.com/attachments/1003173519879847966/1046321839607136297/image.png">
|
||||
<img src="https://media.discordapp.net/attachments/1003173519879847966/1046321891406786580/image.png">
|
||||
|
||||
<h3>Next, $OMP parallel private() and do reduction were added to declare the parallel variables:</h3>
|
||||
<p>https://repo.gabrielkheisa.xyz/gabrielkheisa/tugas3-openMP/commit/8babfcc27e3b0dc524cf99a00cbd47645b9d0273</p>
|
||||
<img src="https://media.discordapp.net/attachments/1003173519879847966/1046322875797352518/image.png">
|
||||
|
||||
<p>The result is a performance increase from <b>724 MFLOPS</b> to <b>6017 MFLOPS</b>, a factor of <b>8.3</b>, with <b>20 threads</b>.</p>
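
The MFLOPS figures and the 8.3 factor can be cross-checked from the printed run times. This sketch assumes the operation count used in the himeno source of this changeset (34 floating-point operations per interior grid point per iteration); the variable and function names are illustrative.

```javascript
// Grid sizes and iteration count as printed in the benchmark output above.
const imax = 128, jmax = 64, kmax = 64;
const iterations = 10000;

// Operation count as in the benchmark source: 34 flops per interior point per iteration.
const flop = (kmax - 2) * (jmax - 2) * (imax - 2) * 34 * iterations;

function mflops(seconds) {
  return (flop / seconds) * 1e-6;
}

const before = mflops(227.362640); // serial run time in seconds
const after = mflops(27.3678112);  // parallel run time in seconds
const speedup = after / before;
const efficiency = speedup / 20;   // 20 threads, as stated in the text

console.log(`before: ${before.toFixed(2)} MFLOPS, after: ${after.toFixed(2)} MFLOPS`);
console.log(`speedup: ${speedup.toFixed(2)}, efficiency: ${(100 * efficiency).toFixed(1)}%`);
```

The benchmark itself computes in single precision, so its printed values can differ from this double-precision check in the last digits.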
|
||||
<br>
|
||||
<p>Source code and documentation attachment: https://repo.gabrielkheisa.xyz/gabrielkheisa/tugas3-openMP/src/branch/master2</p>
|
4
slurm-16206.out
Normal file
@ -0,0 +1,4 @@
|
||||
mahasiswa2
|
||||
komputasi06
|
||||
Matrix Size = 4096
|
||||
Execution Time = 227.454 sec A(n,n) = 0.102092747142180D+04
|
4
slurm-16207.out
Normal file
@ -0,0 +1,4 @@
|
||||
mahasiswa2
|
||||
komputasi06
|
||||
Matrix Size = 4096
|
||||
Execution Time = 115.605 sec A(n,n) = 0.102437460360345D+04
|
4
slurm-16208.out
Normal file
@ -0,0 +1,4 @@
|
||||
mahasiswa2
|
||||
komputasi06
|
||||
Matrix Size = 4096
|
||||
Execution Time = 60.744 sec A(n,n) = 0.102843459928075D+04
|
4
slurm-16209.out
Normal file
@ -0,0 +1,4 @@
|
||||
mahasiswa2
|
||||
komputasi06
|
||||
Matrix Size = 4096
|
||||
Execution Time = 30.498 sec A(n,n) = 0.100725125084073D+04
|
4
slurm-16210.out
Normal file
@ -0,0 +1,4 @@
|
||||
mahasiswa2
|
||||
komputasi06
|
||||
Matrix Size = 4096
|
||||
Execution Time = 26.556 sec A(n,n) = 0.100253761446337D+04
|
4
slurm-16212.out
Normal file
@ -0,0 +1,4 @@
|
||||
mahasiswa2
|
||||
komputasi03
|
||||
Matrix Size = 4096
|
||||
Execution Time = 24.960 sec A(n,n) = 0.101898724293035D+04
|
4
slurm-16213.out
Normal file
@ -0,0 +1,4 @@
|
||||
mahasiswa2
|
||||
komputasi03
|
||||
Matrix Size = 4096
|
||||
Execution Time = 24.707 sec A(n,n) = 0.101964571498883D+04
|
253
tugas-parallel.f
@ -1,253 +0,0 @@
|
||||
C*********************************************************************
|
||||
C
|
||||
C This benchmark test program is measuring a cpu performance
|
||||
C of floating point operation by a Poisson equation solver.
|
||||
CC
|
||||
C If you have any question, please ask me via email.
|
||||
C written by Ryutaro HIMENO, November 26, 2001.
|
||||
C Version 3.0
|
||||
C ----------------------------------------------
|
||||
C Ryutaro Himeno, Dr. of Eng.
|
||||
C Head of Computer Information Division,
|
||||
C RIKEN (The Institute of Physical and Chemical Research)
|
||||
C Email : himeno@postman.riken.go.jp
|
||||
C ---------------------------------------------------------------
|
||||
C You can adjust the size of this benchmark code to fit your target
|
||||
C computer. In that case, please choose one of the following sets of
|
||||
C (mimax,mjmax,mkmax):
|
||||
C small : 65,33,33
|
||||
C small : 129,65,65
|
||||
C medium: 257,129,129
|
||||
C large : 513,257,257
|
||||
C ext.large: 1025,513,513
|
||||
C This program is to measure a computer performance in MFLOPS
|
||||
C by using a kernel which appears in a linear solver of pressure
|
||||
C Poisson eq. which appears in an incompressible Navier-Stokes solver.
|
||||
C A point-Jacobi method is employed in this solver as this method can
|
||||
C be easily vectorized and parallelized.
|
||||
C ------------------
|
||||
C Finite-difference method, curvilinear coordinate system
|
||||
C Vectorizable and parallelizable on each grid point
|
||||
C No. of grid points : imax x jmax x kmax including boundaries
|
||||
C ------------------
|
||||
C A,B,C:coefficient matrix, wrk1: source term of Poisson equation
|
||||
C wrk2 : working area, OMEGA : relaxation parameter
|
||||
C BND:control variable for boundaries and objects ( = 0 or 1)
|
||||
C P: pressure
|
||||
C -------------------
|
||||
C -------------------
|
||||
C "use portlib" statement on the next line is for Visual fortran
|
||||
C to use UNIX libraries. Please remove it if your system is UNIX.
|
||||
C -------------------
|
||||
! use portlib
|
||||
use omp_lib
|
||||
IMPLICIT REAL*4(a-h,o-z)
|
||||
real*8 t1,t2
|
||||
C
|
||||
C PARAMETER (mimax=513,mjmax=257,mkmax=257)
|
||||
C PARAMETER (mimax=257,mjmax=129,mkmax=129)
|
||||
PARAMETER (mimax=129,mjmax=65,mkmax=65)
|
||||
C PARAMETER (mimax=65,mjmax=33,mkmax=33)
|
||||
C
|
||||
C ttarget specifies the measuring period in sec
|
||||
PARAMETER (ttarget=60.0)
|
||||
CC Array
|
||||
common /pres/ p(mimax,mjmax,mkmax)
|
||||
common /mtrx/ a(mimax,mjmax,mkmax,4),
|
||||
+ b(mimax,mjmax,mkmax,3),c(mimax,mjmax,mkmax,3)
|
||||
common /bound/ bnd(mimax,mjmax,mkmax)
|
||||
common /work/ wrk1(mimax,mjmax,mkmax),wrk2(mimax,mjmax,mkmax)
|
||||
CC Other constants
|
||||
common /others/ imax,jmax,kmax,omega
|
||||
C
|
||||
dimension time0(2),time1(2)
|
||||
C
|
||||
omega=0.8
|
||||
imax=mimax-1
|
||||
jmax=mjmax-1
|
||||
kmax=mkmax-1
|
||||
CC Initializing matrixes
|
||||
call initmt
|
||||
write(*,*) ' mimax=',mimax,' mjmax=',mjmax,' mkmax=',mkmax
|
||||
write(*,*) ' imax=',imax,' jmax=',jmax,' kmax=',kmax
|
||||
CC Start measuring
|
||||
C
|
||||
nn=10000
|
||||
write(*,*) ' Start rehearsal measurement process.'
|
||||
write(*,*) ' Measure the performance in 10000 times.'
|
||||
C
|
||||
! cpu0=dtime(time0)
|
||||
t1 = omp_get_wtime()
|
||||
C
|
||||
C Jacobi iteration
|
||||
call jacobi(nn,gosa)
|
||||
C
|
||||
! cpu1= dtime(time1)
|
||||
t2 = omp_get_wtime()
|
||||
! cpu = cpu1
|
||||
cpu = t2-t1
|
||||
flop=real(kmax-2)*real(jmax-2)*real(imax-2)*34.0*real(nn)
|
||||
xmflops2=flop/cpu*1.0e-6
|
||||
write(*,*) ' MFLOPS:',xmflops2,' time(s):',cpu,gosa
|
||||
C
|
||||
C end the test loop
|
||||
! nn=ifix(ttarget/(cpu/3.0))
|
||||
! write(*,*) 'Now, start the actual measurement process.'
|
||||
! write(*,*) 'The loop will be excuted in',nn,' times.'
|
||||
! write(*,*) 'This will take about one minute.'
|
||||
! write(*,*) 'Wait for a while.'
|
||||
C
|
||||
C Jacobi iteration
|
||||
! cpu0=dtime(time0)
|
||||
! call jacobi(nn,gosa)
|
||||
C
|
||||
! cpu1= dtime(time1)
|
||||
! cpu = cpu1
|
||||
! flop=real(kmax-2)*real(jmax-2)*real(imax-2)*34.0*real(nn)
|
||||
! xmflops2=flop*1.0e-6/cpu
|
||||
C
|
||||
CCC xmflops2=nflop/cpu*1.0e-6*float(nn)
|
||||
C
|
||||
! write(*,*) ' Loop executed for ',nn,' times'
|
||||
! write(*,*) ' Gosa :',gosa
|
||||
! write(*,*) ' MFLOPS:',xmflops2, ' time(s):',cpu
|
||||
! score=xmflops2/82.84
|
||||
! write(*,*) ' Score based on Pentium III 600MHz :',score
|
||||
C
|
||||
! pause
|
||||
stop
|
||||
END
|
||||
C
|
||||
C
|
||||
C**************************************************************
|
||||
subroutine initmt
|
||||
C**************************************************************
|
||||
IMPLICIT REAL*4(a-h,o-z)
|
||||
C
|
||||
C PARAMETER (mimax=513,mjmax=257,mkmax=257)
|
||||
C PARAMETER (mimax=257,mjmax=129,mkmax=129)
|
||||
PARAMETER (mimax=129,mjmax=65,mkmax=65)
|
||||
C PARAMETER (mimax=65,mjmax=33,mkmax=33)
|
||||
C
|
||||
|
||||
|
||||
CC Array
|
||||
common /pres/ p(mimax,mjmax,mkmax)
|
||||
common /mtrx/ a(mimax,mjmax,mkmax,4),
|
||||
+ b(mimax,mjmax,mkmax,3),c(mimax,mjmax,mkmax,3)
|
||||
common /bound/ bnd(mimax,mjmax,mkmax)
|
||||
common /work/ wrk1(mimax,mjmax,mkmax),wrk2(mimax,mjmax,mkmax)
|
||||
CC other constants
|
||||
common /others/ imax,jmax,kmax,omega
|
||||
C
|
||||
!$OMP parallel private(k,j,i)
|
||||
!$OMP do
|
||||
do k=1,mkmax
|
||||
do j=1,mjmax
|
||||
do i=1,mimax
|
||||
a(i,j,k,1)=0.0
|
||||
a(i,j,k,2)=0.0
|
||||
a(i,j,k,3)=0.0
|
||||
a(i,j,k,4)=0.0
|
||||
b(i,j,k,1)=0.0
|
||||
b(i,j,k,2)=0.0
|
||||
b(i,j,k,3)=0.0
|
||||
c(i,j,k,1)=0.0
|
||||
c(i,j,k,2)=0.0
|
||||
c(i,j,k,3)=0.0
|
||||
p(i,j,k) =0.0
|
||||
wrk1(i,j,k)=0.0
|
||||
bnd(i,j,k)=0.0
|
||||
enddo
|
||||
enddo
|
||||
enddo
|
||||
!$OMP end do
|
||||
C
|
||||
|
||||
!$OMP do
|
||||
do k=1,kmax
|
||||
do j=1,jmax
|
||||
do i=1,imax
|
||||
a(i,j,k,1)=1.0
|
||||
a(i,j,k,2)=1.0
|
||||
a(i,j,k,3)=1.0
|
||||
a(i,j,k,4)=1.0/6.0
|
||||
b(i,j,k,1)=0.0
|
||||
b(i,j,k,2)=0.0
|
||||
b(i,j,k,3)=0.0
|
||||
c(i,j,k,1)=1.0
|
||||
c(i,j,k,2)=1.0
|
||||
c(i,j,k,3)=1.0
|
||||
p(i,j,k) =float((k-1)*(k-1))/float((kmax-1)*(kmax-1))
|
||||
wrk1(i,j,k)=0.0
|
||||
bnd(i,j,k)=1.0
|
||||
enddo
|
||||
enddo
|
||||
enddo
|
||||
!$OMP end do
|
||||
!$OMP end parallel
|
||||
C
|
||||
return
|
||||
end
|
||||
C
|
||||
C*************************************************************
|
||||
subroutine jacobi(nn,gosa)
|
||||
C*************************************************************
|
||||
IMPLICIT REAL*4(a-h,o-z)
|
||||
C
|
||||
C PARAMETER (mimax=513,mjmax=257,mkmax=257)
|
||||
C PARAMETER (mimax=257,mjmax=129,mkmax=129)
|
||||
PARAMETER (mimax=129,mjmax=65,mkmax=65)
|
||||
C PARAMETER (mimax=65,mjmax=33,mkmax=33)
|
||||
C
|
||||
|
||||
CC Array
|
||||
common /pres/ p(mimax,mjmax,mkmax)
|
||||
common /mtrx/ a(mimax,mjmax,mkmax,4),
|
||||
+ b(mimax,mjmax,mkmax,3),c(mimax,mjmax,mkmax,3)
|
||||
common /bound/ bnd(mimax,mjmax,mkmax)
|
||||
common /work/ wrk1(mimax,mjmax,mkmax),wrk2(mimax,mjmax,mkmax)
|
||||
CC other constants
|
||||
common /others/ imax,jmax,kmax,omega
|
||||
C
|
||||
C
|
||||
DO loop=1,nn
|
||||
gosa=0.0
|
||||
!$OMP parallel private(K,J,I,S0,SS,wrk2)
|
||||
!$OMP do reduction(+:GOSA)
|
||||
DO K=2,kmax-1
|
||||
DO J=2,jmax-1
|
||||
DO I=2,imax-1
|
||||
S0=a(I,J,K,1)*p(I+1,J,K)+a(I,J,K,2)*p(I,J+1,K)
|
||||
1 +a(I,J,K,3)*p(I,J,K+1)
|
||||
2 +b(I,J,K,1)*(p(I+1,J+1,K)-p(I+1,J-1,K)
|
||||
3 -p(I-1,J+1,K)+p(I-1,J-1,K))
|
||||
4 +b(I,J,K,2)*(p(I,J+1,K+1)-p(I,J-1,K+1)
|
||||
5 -p(I,J+1,K-1)+p(I,J-1,K-1))
|
||||
6 +b(I,J,K,3)*(p(I+1,J,K+1)-p(I-1,J,K+1)
|
||||
7 -p(I+1,J,K-1)+p(I-1,J,K-1))
|
||||
8 +c(I,J,K,1)*p(I-1,J,K)+c(I,J,K,2)*p(I,J-1,K)
|
||||
9 +c(I,J,K,3)*p(I,J,K-1)+wrk1(I,J,K)
|
||||
SS=(S0*a(I,J,K,4)-p(I,J,K))*bnd(I,J,K)
|
||||
GOSA=GOSA+SS*SS
|
||||
wrk2(I,J,K)=p(I,J,K)+OMEGA *SS
|
||||
enddo
|
||||
enddo
|
||||
enddo
|
||||
!$OMP end do
|
||||
C
|
||||
!$OMP do
|
||||
DO K=2,kmax-1
|
||||
DO J=2,jmax-1
|
||||
DO I=2,imax-1
|
||||
p(I,J,K)=wrk2(I,J,K)
|
||||
enddo
|
||||
enddo
|
||||
enddo
|
||||
!$OMP end do
|
||||
!$OMP end parallel
|
||||
C
|
||||
enddo
|
||||
CC End of iteration
|
||||
return
|
||||
end
|
@ -78,14 +78,16 @@ C
|
||||
C
|
||||
! cpu0=dtime(time0)
|
||||
t1 = omp_get_wtime()
|
||||
!$OMP PARALLEL DO
|
||||
C
|
||||
C Jacobi iteration
|
||||
call jacobi(nn,gosa)
|
||||
C
|
||||
! cpu1= dtime(time1)
|
||||
t2 = omp_get_wtime()
|
||||
!$OMP END PARALLEL DO
|
||||
t2 = omp_get_wtime()
|
||||
! cpu = cpu1
|
||||
cpu = t2-t1
|
||||
cpu = t2-t1
|
||||
flop=real(kmax-2)*real(jmax-2)*real(imax-2)*34.0*real(nn)
|
||||
xmflops2=flop/cpu*1.0e-6
|
||||
write(*,*) ' MFLOPS:',xmflops2,' time(s):',cpu,gosa
|
||||
@ -140,7 +142,6 @@ CC Arrey
|
||||
CC other constants
|
||||
common /others/ imax,jmax,kmax,omega
|
||||
C
|
||||
!$OMP do
|
||||
do k=1,mkmax
|
||||
do j=1,mjmax
|
||||
do i=1,mimax
|
||||
@ -160,10 +161,7 @@ C
|
||||
enddo
|
||||
enddo
|
||||
enddo
|
||||
!$OMP end do
|
||||
C
|
||||
|
||||
!$OMP do
|
||||
do k=1,kmax
|
||||
do j=1,jmax
|
||||
do i=1,imax
|
||||
@ -183,7 +181,6 @@ C
|
||||
enddo
|
||||
enddo
|
||||
enddo
|
||||
!$OMP end do
|
||||
C
|
||||
return
|
||||
end
|
||||
@ -211,7 +208,6 @@ C
|
||||
C
|
||||
DO loop=1,nn
|
||||
gosa=0.0
|
||||
!$OMP do reduction(+:GOSA)
|
||||
DO K=2,kmax-1
|
||||
DO J=2,jmax-1
|
||||
DO I=2,imax-1
|
||||
@ -231,9 +227,7 @@ C
|
||||
enddo
|
||||
enddo
|
||||
enddo
|
||||
!$OMP end do
|
||||
C
|
||||
!$OMP do
|
||||
DO K=2,kmax-1
|
||||
DO J=2,jmax-1
|
||||
DO I=2,imax-1
|
||||
@ -241,7 +235,6 @@ C
|
||||
enddo
|
||||
enddo
|
||||
enddo
|
||||
!$OMP end do
|
||||
C
|
||||
enddo
|
||||
CC End of iteration
|
29
tugas3.f90
Normal file
@ -0,0 +1,29 @@
|
||||
program sample3
|
||||
use omp_lib
|
||||
|
||||
implicit real(8)(a-h,o-z)
|
||||
parameter (n=4096)
|
||||
real(8) a(n,n), c(n,n)
|
||||
real(4) b(n,n)
|
||||
real*8 t1, t2
|
||||
a=0.0d0
|
||||
call random_number(b)
|
||||
call random_number(c)
|
||||
write(6,50) ' Matrix Size = ', n
|
||||
50 format(1x,a,i5)
|
||||
t1 = omp_get_wtime()
|
||||
!$OMP PARALLEL DO
|
||||
do j=1,n
|
||||
do k=1,n
|
||||
do i=1,n
|
||||
a(i,j)=a(i,j)+b(i,k)*c(k,j)
|
||||
end do
|
||||
end do
|
||||
end do
|
||||
!$OMP END PARALLEL DO
|
||||
t2 = omp_get_wtime()
|
||||
write(6, 60) ' Execution Time = ',t2-t1,' sec',' A(n,n) = ',a(n,n)
|
||||
60 format(1x,a,f10.3,a,1x,a,d24.15)
|
||||
stop
|
||||
|
||||
end
|
13
tugas3.sh
Normal file
@ -0,0 +1,13 @@
|
||||
#!/bin/bash
|
||||
|
||||
#SBATCH --nodes=1
|
||||
#SBATCH --time=00:01:10
|
||||
#SBATCH --job-name=Gabriel
|
||||
|
||||
whoami
|
||||
hostname
|
||||
|
||||
gfortran -fopenmp tugas3.f90
|
||||
export OMP_NUM_THREADS=512
|
||||
ulimit -s unlimited
|
||||
./a.out
|
@ -1,15 +1,13 @@
|
||||
#!/bin/bash
|
||||
|
||||
#SBATCH --nodes=1
|
||||
#SBATCH --nodelist=komputasi06
|
||||
#SBATCH --time=00:05:00
|
||||
#SBATCH --job-name=Gabriel
|
||||
|
||||
whoami
|
||||
hostname
|
||||
|
||||
gfortran -fopenmp tugas-serial.f
|
||||
export OMP_NUM_THREADS=20
|
||||
export OMP_STACKSIZE=32m
|
||||
gfortran -fopenmp tugas3.f90
|
||||
export OMP_NUM_THREADS=16
|
||||
ulimit -s unlimited
|
||||
./a.out
|
@ -1,15 +1,13 @@
|
||||
#!/bin/bash
|
||||
|
||||
#SBATCH --nodes=1
|
||||
#SBATCH --nodelist=komputasi06
|
||||
#SBATCH --time=00:05:00
|
||||
#SBATCH --job-name=Gabriel
|
||||
|
||||
whoami
|
||||
hostname
|
||||
|
||||
gfortran -fopenmp tugas-parallel.f
|
||||
export OMP_NUM_THREADS=20
|
||||
export OMP_STACKSIZE=32m
|
||||
gfortran -fopenmp tugas3.f90
|
||||
export OMP_NUM_THREADS=1
|
||||
ulimit -s unlimited
|
||||
./a.out
|
13
tugas3_2thread.sh
Normal file
@ -0,0 +1,13 @@
|
||||
#!/bin/bash
|
||||
|
||||
#SBATCH --nodes=1
|
||||
#SBATCH --time=00:05:00
|
||||
#SBATCH --job-name=Gabriel
|
||||
|
||||
whoami
|
||||
hostname
|
||||
|
||||
gfortran -fopenmp tugas3.f90
|
||||
export OMP_NUM_THREADS=2
|
||||
ulimit -s unlimited
|
||||
./a.out
|
13
tugas3_32thread.sh
Normal file
@ -0,0 +1,13 @@
|
||||
#!/bin/bash
|
||||
|
||||
#SBATCH --nodes=1
|
||||
#SBATCH --time=00:05:00
|
||||
#SBATCH --job-name=Gabriel
|
||||
|
||||
whoami
|
||||
hostname
|
||||
|
||||
gfortran -fopenmp tugas3.f90
|
||||
export OMP_NUM_THREADS=32
|
||||
ulimit -s unlimited
|
||||
./a.out
|
13
tugas3_4thread.sh
Normal file
@ -0,0 +1,13 @@
|
||||
#!/bin/bash
|
||||
|
||||
#SBATCH --nodes=1
|
||||
#SBATCH --time=00:05:00
|
||||
#SBATCH --job-name=Gabriel
|
||||
|
||||
whoami
|
||||
hostname
|
||||
|
||||
gfortran -fopenmp tugas3.f90
|
||||
export OMP_NUM_THREADS=4
|
||||
ulimit -s unlimited
|
||||
./a.out
|
13
tugas3_64thread.sh
Normal file
@ -0,0 +1,13 @@
|
||||
#!/bin/bash
|
||||
|
||||
#SBATCH --nodes=1
|
||||
#SBATCH --time=00:05:00
|
||||
#SBATCH --job-name=Gabriel
|
||||
|
||||
whoami
|
||||
hostname
|
||||
|
||||
gfortran -fopenmp tugas3.f90
|
||||
export OMP_NUM_THREADS=64
|
||||
ulimit -s unlimited
|
||||
./a.out
|
13
tugas3_8thread.sh
Normal file
@ -0,0 +1,13 @@
|
||||
#!/bin/bash
|
||||
|
||||
#SBATCH --nodes=1
|
||||
#SBATCH --time=00:05:00
|
||||
#SBATCH --job-name=Gabriel
|
||||
|
||||
whoami
|
||||
hostname
|
||||
|
||||
gfortran -fopenmp tugas3.f90
|
||||
export OMP_NUM_THREADS=8
|
||||
ulimit -s unlimited
|
||||
./a.out
|
15
with_reduction.f90
Normal file
@ -0,0 +1,15 @@
|
||||
program summ
|
||||
implicit none
|
||||
integer :: sum = 0
|
||||
integer :: n
|
||||
!$OMP parallel do reduction(+:sum)
|
||||
do n = 0, 1000
|
||||
sum = sum + n
|
||||
print*, n, " ", sum
|
||||
end do
|
||||
!$OMP end parallel do
|
||||
print*, " "
|
||||
print*, " "
|
||||
print*, " "
|
||||
print*, "hasilnya adalah ", sum
|
||||
end program summ
|
12
with_reduction.sh
Normal file
@ -0,0 +1,12 @@
|
||||
#!/bin/bash
|
||||
|
||||
#SBATCH --nodes=1
|
||||
#SBATCH --time=00:01:10
|
||||
#SBATCH --job-name=Gabriel
|
||||
|
||||
whoami
|
||||
hostname
|
||||
|
||||
gfortran -fopenmp with_reduction.f90 -o with_reduction.x
|
||||
export OMP_NUM_THREADS=4
|
||||
./with_reduction.x
|
BIN
with_reduction.x
Executable file
Binary file not shown.