PB3D [2.47]
Ideal linear high-n MHD stability in 3-D
MPI_utilities.f90
1!------------------------------------------------------------------------------!
2!> Numerical utilities related to MPI.
3!!
4!! This includes a lock system, which can be either BL (blocking) or NB
5!! (non-blocking). It is based on the implementation of an MPI-IO atomic mode
6!! without file support, described in \cite RossAtomicIO.
7!!
8!! \see mpi_vars.
9!!
10!! The reason for this is that using a simple lock file can lead to
11!! crashes.
12!!
13!! \note A downside of this method is that in some rare cases a deadlock may
14!! occur when the master process, which holds the shared variable with a window
15!! that other processes may use, is idle and waiting, whereas the others are
16!! still performing lock operations. As the master is idle and waiting, its MPI
17!! asynchronous communication does not progress. To remedy this, call
18!! wait_MPI() after procedures in which lock operations are performed.
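!!
!! A minimal usage sketch, with a hypothetical lock variable \c my_lock of
!! type \c lock_type from mpi_vars, assumed to have been initialized there:
!! \code
!! ierr = lock_req_acc(my_lock)        ! request access (blocking by default)
!! chckerr('')
!! ! ... critical work ...
!! ierr = lock_return_acc(my_lock)     ! return access
!! chckerr('')
!! ierr = wait_MPI()                   ! see the note above
!! chckerr('')
!! \endcode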
19!------------------------------------------------------------------------------!
21#include <PB3D_macros.h>
23 use messages
24 use mpi
25 use num_vars, only: dp, max_str_ln, pi
26 use mpi_vars, only: lock_type
27
28 implicit none
29 private
32#if ldebug
35#endif
36
37#if ldebug
38 ! global variables
39 ! Note: use "sort -nk1 [output file]" to get output sorted by time.
40 logical :: debug_lock = .false. !< print debug information about lock operations \ldebug
41 integer :: n_waits = 0 !< number of waits \ldebug
42#endif
43
44 ! interfaces
45
46 !> \public Gather parallel variable in serial version on group master.
47 !!
48 !! Optionally, all the processes receive the parallel variable using \c
49 !! scatter.
50 !!
51 !! \note The serial variable has to be allocatable and if unallocated, it
52 !! will be allocated.
53 !!
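!! A minimal sketch, gathering a hypothetical local vector \c loc_var into
!! \c tot_var on every process:
!! \code
!! real(dp), allocatable :: tot_var(:)
!! ierr = get_ser_var(loc_var,tot_var,scatter=.true.)
!! chckerr('')
!! \endcode
!!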
54 !! \return ierr
55 interface get_ser_var
56 !> \public
57 module procedure get_ser_var_complex
58 !> \public
59 module procedure get_ser_var_real
60 !> \public
61 module procedure get_ser_var_int
62 end interface
63
64 !> \public Fill the ghost regions in an array.
65 !!
66 !! This is done by sending the first normal point of a process to the left
67 !! process.
68 !!
69 !! Every MPI message is identified by its sending process. The array should
70 !! have the extended size, including ghost regions.
71 !!
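!! A minimal sketch for a hypothetical 3-D real array \c X that was set up
!! with one extra (ghost) point in its last dimension:
!! \code
!! ierr = get_ghost_arr(X,1)
!! chckerr('')
!! \endcode
!!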
72 !! \return ierr
73 interface get_ghost_arr
74 !> \public
75 module procedure get_ghost_arr_3d_complex
76 !> \public
77 module procedure get_ghost_arr_3d_real
78 !> \public
79 module procedure get_ghost_arr_2d_complex
80 !> \public
81 module procedure get_ghost_arr_1d_real
82 end interface
83
84 !> \public Wrapper function to broadcast a single variable using MPI.
85 !!
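!! For example, to broadcast a hypothetical integer \c n_x from the master to
!! all other processes:
!! \code
!! ierr = broadcast_var(n_x)           ! source defaults to rank 0
!! chckerr('')
!! \endcode
!!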
86 !! \return ierr
87 interface broadcast_var
88 !> \public
89 module procedure broadcast_var_real
90 !> \public
91 module procedure broadcast_var_int
92 !> \public
93 module procedure broadcast_var_log
94 !> \public
95 module procedure broadcast_var_complex_arr
96 !> \public
97 module procedure broadcast_var_real_arr
98 !> \public
99 module procedure broadcast_var_int_arr
100 !> \public
101 module procedure broadcast_var_log_arr
102 end interface
103
104contains
105 !> \private complex version
106 integer function get_ser_var_complex(var,ser_var,scatter) result(ierr)
107 use num_vars, only: rank, n_procs
108
109 character(*), parameter :: rout_name = 'get_ser_var_complex'
110
111 ! input / output
112 complex(dp), intent(in) :: var(:) !< parallel vector
113 complex(dp), allocatable, intent(inout) :: ser_var(:) !< serial vector
114 logical, intent(in), optional :: scatter !< optionally scatter the result to all the processes
115
116 ! local variables
117 character(len=max_str_ln) :: err_msg ! error message
118 integer, allocatable :: recvcounts(:) ! counts of nr. of elements received from each processor
119 integer, allocatable :: displs(:) ! displacements of elements received from each processor
120 integer :: id ! counter
121 logical :: scatter_loc ! local copy of scatter
122
123 ! initialize ierr
124 ierr = 0
125
126 ! set local copy of scatter
127 scatter_loc = .false.
128 if (present(scatter)) scatter_loc = scatter
129
130 ! gather local size of var of all groups onto main processor, to serve
131 ! as receive counts on group master
132 if (rank.eq.0 .or. scatter_loc) then
133 allocate(recvcounts(n_procs))
134 allocate(displs(n_procs))
135 else
136 allocate(recvcounts(0))
137 allocate(displs(0))
138 end if
139 if (scatter_loc) then
140 call mpi_allgather(size(var),1,mpi_integer,recvcounts,1,&
141 &mpi_integer,mpi_comm_world,ierr)
142 else
143 call mpi_gather(size(var),1,mpi_integer,recvcounts,1,&
144 &mpi_integer,0,mpi_comm_world,ierr)
145 end if
146 err_msg = 'Failed to gather size of parallel variable'
147 chckerr(err_msg)
148
149 ! allocate serial variable
150 if (allocated(ser_var)) then
151 if (size(ser_var).ne.sum(recvcounts)) then
152 ierr = 1
153 write(*,*) size(ser_var), sum(recvcounts)
154 err_msg = 'ser_var has wrong dimensions'
155 chckerr(err_msg)
156 end if
157 else
158 allocate(ser_var(sum(recvcounts)))
159 end if
160
161 ! deduce displacements by summing recvcounts
162 if (rank.eq.0 .or. scatter_loc) then
163 displs(1) = 0
164 do id = 2,n_procs
165 displs(id) = displs(id-1) + recvcounts(id-1)
166 end do
167 end if
168
169 if (scatter_loc) then
170 call mpi_allgatherv(var,size(var),mpi_double_complex,ser_var,&
171 &recvcounts,displs,mpi_double_complex,mpi_comm_world,ierr)
172 else
173 call mpi_gatherv(var,size(var),mpi_double_complex,ser_var,&
174 &recvcounts,displs,mpi_double_complex,0,mpi_comm_world,ierr)
175 end if
176 err_msg = 'Failed to gather parallel variable'
177 chckerr(err_msg)
178 end function get_ser_var_complex
179 !> \private real version
180 integer function get_ser_var_real(var,ser_var,scatter) result(ierr)
181 use num_vars, only: rank, n_procs
182
183 character(*), parameter :: rout_name = 'get_ser_var_real'
184
185 ! input / output
186 real(dp), intent(in) :: var(:) !< parallel vector
187 real(dp), allocatable, intent(inout) :: ser_var(:) !< serial vector
188 logical, intent(in), optional :: scatter !< optionally scatter the result to all the processes
189
190 ! local variables
191 character(len=max_str_ln) :: err_msg ! error message
192 integer, allocatable :: recvcounts(:) ! counts of nr. of elements received from each processor
193 integer, allocatable :: displs(:) ! displacements of elements received from each processor
194 integer :: id ! counter
195 logical :: scatter_loc ! local copy of scatter
196
197 ! initialize ierr
198 ierr = 0
199
200 ! set local copy of scatter
201 scatter_loc = .false.
202 if (present(scatter)) scatter_loc = scatter
203
204 ! gather local size of var of all groups onto main processor, to serve
205 ! as receive counts on group master
206 if (rank.eq.0 .or. scatter_loc) then
207 allocate(recvcounts(n_procs))
208 allocate(displs(n_procs))
209 else
210 allocate(recvcounts(0))
211 allocate(displs(0))
212 end if
213 if (scatter_loc) then
214 call mpi_allgather(size(var),1,mpi_integer,recvcounts,1,&
215 &mpi_integer,mpi_comm_world,ierr)
216 else
217 call mpi_gather(size(var),1,mpi_integer,recvcounts,1,&
218 &mpi_integer,0,mpi_comm_world,ierr)
219 end if
220 err_msg = 'Failed to gather size of parallel variable'
221 chckerr(err_msg)
222
223 ! allocate serial variable
224 if (allocated(ser_var)) then
225 if (size(ser_var).ne.sum(recvcounts)) then
226 ierr = 1
227 write(*,*) size(ser_var), sum(recvcounts)
228 err_msg = 'ser_var has wrong dimensions'
229 chckerr(err_msg)
230 end if
231 else
232 allocate(ser_var(sum(recvcounts)))
233 end if
234
235 ! deduce displacements by summing recvcounts
236 if (rank.eq.0 .or. scatter_loc) then
237 displs(1) = 0
238 do id = 2,n_procs
239 displs(id) = displs(id-1) + recvcounts(id-1)
240 end do
241 end if
242
243 if (scatter_loc) then
244 call mpi_allgatherv(var,size(var),mpi_double_precision,ser_var,&
245 &recvcounts,displs,mpi_double_precision,mpi_comm_world,ierr)
246 else
247 call mpi_gatherv(var,size(var),mpi_double_precision,ser_var,&
248 &recvcounts,displs,mpi_double_precision,0,mpi_comm_world,ierr)
249 end if
250 err_msg = 'Failed to gather parallel variable'
251 chckerr(err_msg)
252 end function get_ser_var_real
253 !> \private integer version
254 integer function get_ser_var_int(var,ser_var,scatter) result(ierr)
255 use num_vars, only: rank, n_procs
256
257 character(*), parameter :: rout_name = 'get_ser_var_int'
258
259 ! input / output
260 integer, intent(in) :: var(:) !< parallel vector
261 integer, allocatable, intent(inout) :: ser_var(:) !< serial vector
262 logical, intent(in), optional :: scatter !< optionally scatter the result to all the processes
263
264 ! local variables
265 character(len=max_str_ln) :: err_msg ! error message
266 integer, allocatable :: recvcounts(:) ! counts of nr. of elements received from each processor
267 integer, allocatable :: displs(:) ! displacements of elements received from each processor
268 integer :: id ! counter
269 logical :: scatter_loc ! local copy of scatter
270
271 ! initialize ierr
272 ierr = 0
273
274 ! set local copy of scatter
275 scatter_loc = .false.
276 if (present(scatter)) scatter_loc = scatter
277
278 ! gather local size of var of all groups onto main processor, to serve
279 ! as receive counts on group master
280 if (rank.eq.0 .or. scatter_loc) then
281 allocate(recvcounts(n_procs))
282 allocate(displs(n_procs))
283 else
284 allocate(recvcounts(0))
285 allocate(displs(0))
286 end if
287 if (scatter_loc) then
288 call mpi_allgather(size(var),1,mpi_integer,recvcounts,1,&
289 &mpi_integer,mpi_comm_world,ierr)
290 else
291 call mpi_gather(size(var),1,mpi_integer,recvcounts,1,&
292 &mpi_integer,0,mpi_comm_world,ierr)
293 end if
294 err_msg = 'Failed to gather size of parallel variable'
295 chckerr(err_msg)
296
297 ! allocate serial variable
298 if (allocated(ser_var)) then
299 if (size(ser_var).ne.sum(recvcounts)) then
300 ierr = 1
301 write(*,*) size(ser_var), sum(recvcounts)
302 err_msg = 'ser_var has wrong dimensions'
303 chckerr(err_msg)
304 end if
305 else
306 allocate(ser_var(sum(recvcounts)))
307 end if
308
309 ! deduce displacements by summing recvcounts
310 if (rank.eq.0 .or. scatter_loc) then
311 displs(1) = 0
312 do id = 2,n_procs
313 displs(id) = displs(id-1) + recvcounts(id-1)
314 end do
315 end if
316
317 if (scatter_loc) then
318 call mpi_allgatherv(var,size(var),mpi_integer,ser_var,&
319 &recvcounts,displs,mpi_integer,mpi_comm_world,ierr)
320 else
321 call mpi_gatherv(var,size(var),mpi_integer,ser_var,&
322 &recvcounts,displs,mpi_integer,0,mpi_comm_world,ierr)
323 end if
324 err_msg = 'Failed to gather parallel variable'
325 chckerr(err_msg)
326 end function get_ser_var_int
327
328 !> Redistribute variables according to new limits.
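!!
!! As a hypothetical example, take a global vector of size 10, divided over
!! two processes as 1:6 and 7:10, that has to be redistributed as 1:5 and
!! 6:10. Each process then calls, with its own limits,
!! \code
!! ierr = redistribute_var(var,dis_var,[1,6],[1,5])      ! on rank 0
!! ierr = redistribute_var(var,dis_var,[7,10],[6,10])    ! on rank 1
!! \endcode
!! where \c dis_var has to be allocated to the redistributed size beforehand.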
329 integer function redistribute_var(var,dis_var,lims,lims_dis) result(ierr)
330 use num_vars, only: n_procs, rank
331
332 character(*), parameter :: rout_name = 'redistribute_var'
333
334 ! input / output
335 real(dp), intent(in) :: var(:) !< parallel vector
336 real(dp), intent(inout) :: dis_var(:) !< redistributed vector
337 integer, intent(in) :: lims(2) !< indices of parallel vector
338 integer, intent(in) :: lims_dis(2) !< indices of redistributed parallel vector
339
340 ! local variables
341 integer :: id, jd ! counters
342 integer, allocatable :: lims_tot(:,:) ! limits of all processes
343 integer, allocatable :: lims_dis_tot(:,:) ! redistributed limits of all processes
344 integer, allocatable :: temp_lim(:) ! temporary variable
345 integer, allocatable :: n_vars(:,:) ! number of values to be received and sent from each process (row: to, column: from)
346 integer, allocatable :: nr_sen(:) ! number of values to be sent to each process from this process
347 integer, allocatable :: nr_rec(:) ! number of values to be received from each process to this process
348 integer, allocatable :: id_rec(:) ! starting position of variable to be received in local memory (starting at 0)
349 integer, allocatable :: id_sen(:) ! starting position of variable to be sent in local memory (starting at 0)
350 integer :: max_loc ! local maximum
351
352 ! initialize ierr
353 ierr = 0
354
355 ! get limits and redistributed limits of all procs
356 allocate(lims_tot(2,n_procs))
357 allocate(lims_dis_tot(2,n_procs))
358 do id = 1,2
359 ierr = get_ser_var(lims(id:id),temp_lim,scatter=.true.)
360 chckerr('')
361 lims_tot(id,:) = temp_lim
362 ierr = get_ser_var(lims_dis(id:id),temp_lim,scatter=.true.)
363 chckerr('')
364 lims_dis_tot(id,:) = temp_lim
365 end do
366
367 ! find out how many variables to send and receive. A row indicates a
368 ! receive and a column a send. E.g.:
369 ! (a b)
370 ! (c d)
371 ! means that process 0 gets a from process 0 and b from process 1.
372 ! Process 1 gets c from process 0 and d from process 1.
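! As a hypothetical illustration, for two processes with limits 1:6 and
! 7:10 that are redistributed to 1:5 and 6:10, this gives
! (5 0)
! (1 4)
! i.e. process 0 keeps 5 of its own values, and process 1 receives 1 value
! from process 0 and keeps 4 of its own.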
373 allocate(n_vars(n_procs,n_procs))
374 n_vars = 0
375 do jd = 1,n_procs
376 max_loc = lims_dis_tot(1,jd)
377 do id = 1,n_procs
378 if (lims_tot(2,id).ge.max_loc) then
379 n_vars(jd,id) = min(lims_tot(2,id),lims_dis_tot(2,jd)) - &
380 &max_loc + 1
381 max_loc = max_loc + n_vars(jd,id)
382 end if
383 end do
384 end do
385
386 ! set up how many are to be sent by each process and which memory index
387 ! - nr_rec is the row of n_vars corresponding to this process,
388 ! - nr_sen is the column of n_vars corresponding to this process,
389 ! - id_rec counts the number of variables gotten from previous
390 ! processes,
391 ! - id_sen counts the number of variables sent to previous processes,
392 ! taking into account the difference betweeen the lower limit of the
393 ! process sending to and the current process.
394 allocate(nr_rec(n_procs))
395 allocate(nr_sen(n_procs))
396 allocate(id_rec(n_procs))
397 allocate(id_sen(n_procs))
398 do id = 1,n_procs
399 nr_rec(id) = n_vars(rank+1,id)
400 nr_sen(id) = n_vars(id,rank+1)
401 id_rec(id) = sum(n_vars(rank+1,1:id-1))
402 id_sen(id) = sum(n_vars(id,1:rank)) + lims_dis_tot(1,id) - lims(1)
403 end do
404
405 call mpi_alltoallv(var,nr_sen,id_sen,mpi_double_precision,&
406 &dis_var,nr_rec,id_rec,mpi_double_precision,mpi_comm_world,ierr)
407 chckerr('')
408
409 !call sleep(rank)
410 !write(*,*) 'rank', rank
411 !write(*,*) ' nr_rec', nr_rec
412 !write(*,*) ' id_rec', id_rec
413 !write(*,*) ' nr_sen', nr_sen
414 !write(*,*) ' id_sen', id_sen
415 !do id = 1,n_procs
416 !write(*,*) ' ', n_vars(id,:)
417 !end do
418 end function redistribute_var
419
420 !> \private 3-D complex version
421 integer function get_ghost_arr_3d_complex(arr,size_ghost) result(ierr)
422 use num_vars, only: rank, n_procs
423
424 character(*), parameter :: rout_name = 'get_ghost_arr_3D_complex'
425
426 ! input / output
427 complex(dp), intent(inout) :: arr(:,:,:) !< divided array
428 integer, intent(in) :: size_ghost !< width of ghost region
429
430 ! local variables
431 integer :: n_modes(2) ! number of modes
432 integer :: tot_size ! total size (including ghost region)
433 integer :: istat(mpi_status_size) ! status of send-receive
434
435 ! initialize ierr
436 ierr = 0
437
438 ! initialize number of modes and total size
439 n_modes = [size(arr,1),size(arr,2)]
440 tot_size = size(arr,3)
441
442 ! ghost regions only make sense if there is more than 1 process
443 if (n_procs.gt.1) then
444 if (rank.eq.0) then ! first rank only receives
445 call mpi_recv(arr(:,:,tot_size-size_ghost+1:tot_size),&
446 &size_ghost*product(n_modes),mpi_double_complex,rank+1,&
447 &rank+1,mpi_comm_world,istat,ierr)
448 chckerr('Failed to receive')
449 else if (rank+1.eq.n_procs) then ! last rank only sends
450 call mpi_send(arr(:,:,1:size_ghost),&
451 &size_ghost*product(n_modes),mpi_double_complex,rank-1,&
452 &rank,mpi_comm_world,ierr)
453 chckerr('Failed to send')
454 else ! middle ranks send and receive
455 call mpi_sendrecv(arr(:,:,1:size_ghost),&
456 &size_ghost*product(n_modes),mpi_double_complex,rank-1,&
457 &rank,arr(:,:,tot_size-size_ghost+1:tot_size),&
458 &size_ghost*product(n_modes),mpi_double_complex,&
459 &rank+1,rank+1,mpi_comm_world,istat,ierr)
460 chckerr('Failed to send and receive')
461 end if
462 end if
463 end function get_ghost_arr_3d_complex
464 !> \private 3-D real version
465 integer function get_ghost_arr_3d_real(arr,size_ghost) result(ierr)
466 use num_vars, only: rank, n_procs
467
468 character(*), parameter :: rout_name = 'get_ghost_arr_3D_real'
469
470 ! input / output
471 real(dp), intent(inout) :: arr(:,:,:) !< divided array
472 integer, intent(in) :: size_ghost !< width of ghost region
473
474 ! local variables
475 integer :: n_modes(2) ! number of modes
476 integer :: tot_size ! total size (including ghost region)
477 integer :: istat(mpi_status_size) ! status of send-receive
478
479 ! initialize ierr
480 ierr = 0
481
482 ! initialize number of modes and total size
483 n_modes = [size(arr,1),size(arr,2)]
484 tot_size = size(arr,3)
485
486 ! ghost regions only make sense if there is more than 1 process
487 if (n_procs.gt.1) then
488 if (rank.eq.0) then ! first rank only receives
489 call mpi_recv(arr(:,:,tot_size-size_ghost+1:tot_size),&
490 &size_ghost*product(n_modes),mpi_double_precision,&
491 &rank+1,rank+1,mpi_comm_world,istat,ierr)
492 chckerr('Failed to receive')
493 else if (rank+1.eq.n_procs) then ! last rank only sends
494 call mpi_send(arr(:,:,1:size_ghost),&
495 &size_ghost*product(n_modes),mpi_double_precision,&
496 &rank-1,rank,mpi_comm_world,ierr)
497 chckerr('Failed to send')
498 else ! middle ranks send and receive
499 call mpi_sendrecv(arr(:,:,1:size_ghost),&
500 &size_ghost*product(n_modes),mpi_double_precision,&
501 &rank-1,&
502 &rank,arr(:,:,tot_size-size_ghost+1:tot_size),&
503 &size_ghost*product(n_modes),mpi_double_precision,&
504 &rank+1,rank+1,mpi_comm_world,istat,ierr)
505 chckerr('Failed to send and receive')
506 end if
507 end if
508 end function get_ghost_arr_3d_real
509 !> \private 2-D complex version
510 integer function get_ghost_arr_2d_complex(arr,size_ghost) result(ierr)
511 use num_vars, only: rank, n_procs
512
513 character(*), parameter :: rout_name = 'get_ghost_arr_2D_complex'
514
515 ! input / output
516 complex(dp), intent(inout) :: arr(:,:) !< divided array
517 integer, intent(in) :: size_ghost !< width of ghost region
518
519 ! local variables
520 integer :: n_modes ! number of modes
521 integer :: tot_size ! total size (including ghost region)
522 integer :: istat(mpi_status_size) ! status of send-receive
523
524 ! initialize ierr
525 ierr = 0
526
527 ! initialize number of modes and total size
528 n_modes = size(arr,1)
529 tot_size = size(arr,2)
530
531 ! ghost regions only make sense if there is more than 1 process
532 if (n_procs.gt.1) then
533 if (rank.eq.0) then ! first rank only receives
534 call mpi_recv(arr(:,tot_size-size_ghost+1:tot_size),&
535 &size_ghost*n_modes,mpi_double_complex,rank+1,&
536 &rank+1,mpi_comm_world,istat,ierr)
537 chckerr('Failed to receive')
538 else if (rank+1.eq.n_procs) then ! last rank only sends
539 call mpi_send(arr(:,1:size_ghost),size_ghost*n_modes,&
540 &mpi_double_complex,rank-1,rank,mpi_comm_world,&
541 &ierr)
542 chckerr('Failed to send')
543 else ! middle ranks send and receive
544 call mpi_sendrecv(arr(:,1:size_ghost),size_ghost*n_modes,&
545 &mpi_double_complex,rank-1,rank,&
546 &arr(:,tot_size-size_ghost+1:tot_size),size_ghost*n_modes,&
547 &mpi_double_complex,rank+1,rank+1,mpi_comm_world,&
548 &istat,ierr)
549 chckerr('Failed to send and receive')
550 end if
551 end if
552 end function get_ghost_arr_2d_complex
553 !> \private 1-D real version
554 integer function get_ghost_arr_1d_real(arr,size_ghost) result(ierr)
555 use num_vars, only: rank, n_procs
556
557 character(*), parameter :: rout_name = 'get_ghost_arr_1D_real'
558
559 ! input / output
560 real(dp), intent(inout) :: arr(:) !< divided array
561 integer, intent(in) :: size_ghost !< width of ghost region
562
563 ! local variables
564 integer :: tot_size ! total size (including ghost region)
565 integer :: istat(mpi_status_size) ! status of send-receive
566
567 ! initialize ierr
568 ierr = 0
569
570 ! initialize total size
571 tot_size = size(arr)
572
573 ! ghost regions only make sense if there is more than 1 process
574 if (n_procs.gt.1) then
575 if (rank.eq.0) then ! first rank only receives
576 call mpi_recv(arr(tot_size-size_ghost+1:tot_size),&
577 &size_ghost,mpi_double_precision,rank+1,&
578 &rank+1,mpi_comm_world,istat,ierr)
579 chckerr('Failed to receive')
580 else if (rank+1.eq.n_procs) then ! last rank only sends
581 call mpi_send(arr(1:size_ghost),size_ghost,&
582 &mpi_double_precision,rank-1,rank,mpi_comm_world,&
583 &ierr)
584 chckerr('Failed to send')
585 else ! middle ranks send and receive
586 call mpi_sendrecv(arr(1:size_ghost),size_ghost,&
587 &mpi_double_precision,rank-1,rank,&
588 &arr(tot_size-size_ghost+1:tot_size),size_ghost,&
589 &mpi_double_precision,rank+1,rank+1,&
590 &mpi_comm_world,istat,ierr)
591 chckerr('Failed to send and receive')
592 end if
593 end if
594 end function get_ghost_arr_1d_real
595
596 !> \private real version
597 integer function broadcast_var_real(var,source) result(ierr)
598 character(*), parameter :: rout_name = 'broadcast_var_real'
599
600 ! input / output
601 real(dp), intent(in) :: var !< variable to be broadcast
602 integer, intent(in), optional :: source !< process that sends
603
604 ! local variables
605 integer :: source_loc = 0 ! local value for source
606
607 ! initialize ierr
608 ierr = 0
609
610 ! set local source if given
611 if (present(source)) source_loc = source
612
613 call mpi_bcast(var,1,mpi_double_precision,source_loc,mpi_comm_world,&
614 &ierr)
615 chckerr('MPI broadcast failed')
616 end function broadcast_var_real
617 !> \private integer version
618 integer function broadcast_var_int(var,source) result(ierr)
619 character(*), parameter :: rout_name = 'broadcast_var_int'
620
621 ! input / output
622 integer, intent(in) :: var !< variable to be broadcast
623 integer, intent(in), optional :: source !< process that sends
624
625 ! local variables
626 integer :: source_loc = 0 ! local value for source
627
628 ! initialize ierr
629 ierr = 0
630
631 ! set local source if given
632 if (present(source)) source_loc = source
633
634 call mpi_bcast(var,1,mpi_integer,source_loc,mpi_comm_world,ierr)
635 chckerr('MPI broadcast failed')
636 end function broadcast_var_int
637 !> \private logical version
638 integer function broadcast_var_log(var,source) result(ierr)
639 character(*), parameter :: rout_name = 'broadcast_var_log'
640
641 ! input / output
642 logical, intent(in) :: var !< variable to be broadcast
643 integer, intent(in), optional :: source !< process that sends
644
645 ! local variables
646 integer :: source_loc = 0 ! local value for source
647
648 ! initialize ierr
649 ierr = 0
650
651 ! set local source if given
652 if (present(source)) source_loc = source
653
654 call mpi_bcast(var,1,mpi_logical,source_loc,mpi_comm_world,ierr)
655 chckerr('MPI broadcast failed')
656 end function broadcast_var_log
657 !> \private complex array version
658 integer function broadcast_var_complex_arr(var,source) result(ierr)
659 character(*), parameter :: rout_name = 'broadcast_var_complex_arr'
660
661 ! input / output
662 complex(dp), intent(in) :: var(:) !< variable to be broadcast
663 integer, intent(in), optional :: source !< process that sends
664
665 ! local variables
666 integer :: source_loc = 0 ! local value for source
667
668 ! initialize ierr
669 ierr = 0
670
671 ! set local source if given
672 if (present(source)) source_loc = source
673
674 call mpi_bcast(var,size(var),mpi_double_complex,source_loc,&
675 &mpi_comm_world,ierr)
676 chckerr('MPI broadcast failed')
677 end function broadcast_var_complex_arr
678 !> \private real array version
679 integer function broadcast_var_real_arr(var,source) result(ierr)
680 character(*), parameter :: rout_name = 'broadcast_var_real_arr'
681
682 ! input / output
683 real(dp), intent(in) :: var(:) !< variable to be broadcast
684 integer, intent(in), optional :: source !< process that sends
685
686 ! local variables
687 integer :: source_loc = 0 ! local value for source
688
689 ! initialize ierr
690 ierr = 0
691
692 ! set local source if given
693 if (present(source)) source_loc = source
694
695 call mpi_bcast(var,size(var),mpi_double_precision,source_loc,&
696 &mpi_comm_world,ierr)
697 chckerr('MPI broadcast failed')
698 end function broadcast_var_real_arr
699 !> \private integer array version
700 integer function broadcast_var_int_arr(var,source) result(ierr)
701 character(*), parameter :: rout_name = 'broadcast_var_int_arr'
702
703 ! input / output
704 integer, intent(in) :: var(:) !< variable to be broadcast
705 integer, intent(in), optional :: source !< process that sends
706
707 ! local variables
708 integer :: source_loc = 0 ! local value for source
709
710 ! initialize ierr
711 ierr = 0
712
713 ! set local source if given
714 if (present(source)) source_loc = source
715
716 call mpi_bcast(var,size(var),mpi_integer,source_loc,mpi_comm_world,ierr)
717 chckerr('MPI broadcast failed')
718 end function broadcast_var_int_arr
719 !> \private logical array version
720 integer function broadcast_var_log_arr(var,source) result(ierr)
721 character(*), parameter :: rout_name = 'broadcast_var_log_arr'
722
723 ! input / output
724 logical, intent(in) :: var(:) !< variable to be broadcast
725 integer, intent(in), optional :: source !< process that sends
726
727 ! local variables
728 integer :: source_loc = 0 ! local value for source
729
730 ! initialize ierr
731 ierr = 0
732
733 ! set local source if given
734 if (present(source)) source_loc = source
735
736 call mpi_bcast(var,size(var),mpi_logical,source_loc,mpi_comm_world,ierr)
737 chckerr('MPI broadcast failed')
738 end function broadcast_var_log_arr
739
740 !> Wait for all processes, wrapper to MPI barrier.
741 !!
742 !! \return ierr
743 integer function wait_mpi() result(ierr)
744 character(*), parameter :: rout_name = 'wait_MPI'
745
746 ! initialize ierr
747 ierr = 0
748
749 ! set barrier
750 call mpi_barrier(mpi_comm_world,ierr)
751 chckerr('MPI Barrier failed')
752
753#if ldebug
754 n_waits = n_waits + 1
755#endif
756 end function wait_mpi
757
758 !> Request access to lock of a BL (blocking) or optionally a NB
759 !! (non-blocking) type.
760 !!
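!! A minimal sketch of a non-blocking request for a hypothetical lock
!! \c my_lock:
!! \code
!! ierr = lock_req_acc(my_lock,blocking=.false.)
!! chckerr('')
!! \endcode
!!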
761 !! \note Based on \cite RossAtomicIO.
762 !!
763 !! \return ierr
764 integer function lock_req_acc(lock,blocking) result(ierr)
765 use num_vars, only: rank
766 use num_utilities, only: bubble_sort
767
768 character(*), parameter :: rout_name = 'lock_req_acc'
769
770 ! input / output
771 type(lock_type), intent(inout) :: lock !< lock
772 logical, intent(in), optional :: blocking !< is a blocking process
773
774 ! local variables
775 integer :: id ! counter
776 integer, allocatable :: wl_loc(:) ! local copy of waiting list
777 integer, allocatable :: next_nb_procs(:) ! next NB process(es)
778 integer, allocatable :: ranks_to_activate(:) ! ranks to be set active
779 logical :: next_nb_proc_exists ! a next NB process exists
780 logical :: direct_receipt ! receipt was direct
781#if ldebug
782 integer :: istat ! status
783#endif
784
785 ! initialize ierr
786 ierr = 0
787
788#if ldebug
789 ! check for initialized lock
790 if (.not.allocated(lock%wl)) then
791 ierr = 1
792 chckerr('lock not initialized')
793 end if
794#endif
795
796 ! set blocking property
797 lock%blocking = .true.
798 if (present(blocking)) lock%blocking = blocking
799
800#if ldebug
801 if(debug_lock) write(*,*,iostat=istat) trim(lock_header(lock)), &
802 &'requesting access'
803#endif
804
805 ! add current process to waiting list
806 ierr = lock_wl_change(1,lock%blocking,lock,wl_loc)
807 chckerr('')
808
809 ! set direct_receipt when wait list empty
810 direct_receipt = wl_empty(wl_loc,[-2,-1,1,2])
811
812 ! wait for notification if other processes
813 if (direct_receipt) then
814#if ldebug
815 if(debug_lock) write(*,*,iostat=istat) trim(lock_header(lock)), &
816 &'and got it right away'
817#endif
818
819 ! find next NB if this one is NB also
820 next_nb_proc_exists = .false.
821 if (.not.lock%blocking) then
822 ! find all NB procs
823 next_nb_proc_exists = &
824 &.not.wl_empty(wl_loc,[-1],next_procs=next_nb_procs)
825#if ldebug
826 if (debug_lock) write(*,*,iostat=istat) &
827 &trim(lock_header(lock)),&
828 &' NB proc, next NB procs found:', next_nb_procs
829#endif
830 end if
831
832 ! set ranks to activate
833 if (.not.lock%blocking .and. next_nb_proc_exists) then ! more processes
834 allocate(ranks_to_activate(size(next_nb_procs)+1))
835 ranks_to_activate = [rank,next_nb_procs]
836 else ! only this process
837 allocate(ranks_to_activate(1))
838 ranks_to_activate = rank
839 end if
840 call bubble_sort(ranks_to_activate) ! sort
841
842 ! activate
843#if ldebug
844 if(debug_lock) write(*,*,iostat=istat) trim(lock_header(lock)), &
845 &' setting status to activate:', ranks_to_activate
846#endif
847 ierr = lock_wl_change(2,.false.,lock,wl_loc,ranks=ranks_to_activate)
848 chckerr('')
849
850 ! notify other ranks to activate (NB)
851 if (next_nb_proc_exists) then
852 do id = 1,size(next_nb_procs)
853 ierr = lock_notify(lock,next_nb_procs(id))
854 chckerr('')
855 end do
856 end if
857 else
858 ! wait to get notified
859 ierr = lock_get_notified(lock)
860 chckerr('')
861 end if
862 end function lock_req_acc
863
864 !> Returns access to a lock.
865 !!
866 !! The blocking property has been set when requesting the lock.
867 !!
868 !! \note Based on \cite RossAtomicIO.
869 !!
870 !! \return ierr
871 integer function lock_return_acc(lock) result(ierr)
872 use num_utilities, only: bubble_sort
873
874 character(*), parameter :: rout_name = 'lock_return_acc'
875
876 ! input / output
877 type(lock_type), intent(inout) :: lock !< lock
878
879 ! local variables
880 integer :: id ! counter
881 integer, allocatable :: wl_loc(:) ! local copy of waiting list
882 integer, allocatable :: next_procs(:) ! next process(es)
883 integer, allocatable :: ranks_to_activate(:) ! ranks to be set active
884 logical :: next_proc_exists ! a next process exists
885 logical :: next_proc_bl ! next process is BL
886#if ldebug
887 integer :: istat ! status
888#endif
889
890 ! initialize ierr
891 ierr = 0
892
893#if ldebug
894 if (.not.allocated(lock%wl)) then
895 ierr = 1
896 chckerr('lock not initialized')
897 end if
898
899 if(debug_lock) write(*,*,iostat=istat) trim(lock_header(lock)), &
900 &'returning access'
901#endif
902
903 ! remove current process from the lock waiting list
904 ierr = lock_wl_change(0,lock%blocking,lock,wl_loc)
905 chckerr('')
906
907 ! find next processes if:
908 ! - BL: always
909 ! - NB: was last NB running
910 next_proc_bl = .false.
911 next_proc_exists = .false.
912 if (lock%blocking .or. wl_empty(wl_loc,[-2])) then
913 ! find all BL procs
914 next_proc_exists = .not.wl_empty(wl_loc,[1],next_procs=next_procs)
915 if (next_proc_exists) next_proc_bl = .true.
916#if ldebug
917 if (debug_lock) write(*,*,iostat=istat) trim(lock_header(lock)),&
918 &' next BL procs found:', next_procs
919#endif
920
921 ! if none found, try NB procs
922 if (.not. next_proc_exists) then
923 next_proc_exists = .not.wl_empty(wl_loc,[-1],&
924 &next_procs=next_procs)
925 if (next_proc_exists) next_proc_bl = .false.
926#if ldebug
927 if (debug_lock) write(*,*,iostat=istat) &
928 &trim(lock_header(lock)),&
929 &' next NB procs found:', next_procs
930#endif
931 end if
932#if ldebug
933 else
934 if(debug_lock) write(*,*,iostat=istat) trim(lock_header(lock)), &
935 &'but has no notification rights'
936#endif
937 end if
938
939 if (next_proc_exists) then
940 ! set ranks to activate
941 if (next_proc_bl) then ! only the BL process
942 allocate(ranks_to_activate(1))
943 ranks_to_activate = next_procs(1)
944 else ! multiple NB processes
945 allocate(ranks_to_activate(size(next_procs)))
946 ranks_to_activate = next_procs
947 end if
948 call bubble_sort(ranks_to_activate) ! sort
949
950 ! activate
951#if ldebug
952 if(debug_lock) write(*,*,iostat=istat) trim(lock_header(lock)), &
953 &' setting status to activate:', ranks_to_activate
954#endif
955 ierr = lock_wl_change(2,next_proc_bl,lock,wl_loc,&
956 &ranks=ranks_to_activate)
957 chckerr('')
958
959 ! notify ranks to activate (NB or BL)
960 if (next_proc_exists) then
961 do id = 1,size(ranks_to_activate)
962 ierr = lock_notify(lock,ranks_to_activate(id))
963 chckerr('')
964 end do
965 end if
966 end if
967 end function lock_return_acc
968
969 !> Decides whether a waiting list is empty.
970 !!
971 !! The type of process to find is indicated by an array of possible values.
972 !!
973 !! \see See lock_wl_change() for an explanation of the process type.
974 !!
975 !! Additionally, for NB processes, the negatives of these values are
976 !! used.
977 !!
978 !! If the waiting list is not empty, the next process(es) can optionally be
979 !! returned.
980 !!
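!! For example, to check whether any other process is waiting for or holds
!! the lock, in any of the BL or NB states (a sketch mirroring the call in
!! lock_req_acc()):
!! \code
!! if (wl_empty(wl_loc,[-2,-1,1,2])) then
!!     ! no other process is waiting or active
!! end if
!! \endcode
!!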
981 !! \note Based on \cite RossAtomicIO.
982 !!
983 !! \return whether the waiting list is empty
984 logical function wl_empty(wl,proc_type,next_procs)
985 use num_vars, only: n_procs, rank
986
987 ! input / output
988 integer, intent(in) :: wl(:) !< waiting list
989 integer, intent(in) :: proc_type(:) !< types of processes accepted
990 integer, intent(inout), optional, allocatable :: next_procs(:) !< next process(es) if not empty
991
992 ! local variables
993 integer :: id, jd ! counters
994 integer :: nr_np ! nr. of next processes found
995 integer, allocatable :: next_procs_loc(:) ! local next process
996
997 ! initialize
998 nr_np = 0
999 wl_empty = .true.
1000 allocate(next_procs_loc(n_procs))
1001
1002 ! find the next rank
1003 do id = 1,n_procs-1
1004 ! wrap around n_procs-1 to zero
1005 next_procs_loc(nr_np+1) = mod(rank+id,n_procs)
1006
1007 proc_types: do jd = 1,size(proc_type)
1008 if (wl(next_procs_loc(nr_np+1)+1).eq.proc_type(jd)) then ! arrays start at 1, ranks at 0
1009 wl_empty = .false.
1010 nr_np = nr_np + 1
1011 exit proc_types
1012 end if
1013 end do proc_types
1014 end do
1015
1016 if (.not.wl_empty .and. present(next_procs)) then
1017 if (allocated(next_procs)) deallocate(next_procs)
1018 allocate(next_procs(nr_np))
1019 next_procs = next_procs_loc(1:nr_np)
1020 end if
1021 end function wl_empty
1022
1023 !> Notifies a rank that they can get the lock.
1024 !!
1025 !! The signal sent is the rank + 1.
1026 !!
1027 !! \note Based on \cite RossAtomicIO.
1028 !!
1029 !! \return ierr
1030 integer function lock_notify(lock_loc,rec_rank) result(ierr)
1031 use num_vars, only: rank
1032
1033 character(*), parameter :: rout_name = 'lock_notify'
1034
1035 ! input / output
1036 type(lock_type), intent(in) :: lock_loc !< lock
1037 integer, intent(in) :: rec_rank !< receiving rank
1038
1039 ! local variables
1040#if ldebug
1041 integer :: istat ! status
1042#endif
1043
1044 ! initialize ierr
1045 ierr = 0
1046
1047 call mpi_send(rank+1,1,mpi_integer,rec_rank,lock_loc%wu_tag,&
1048 &mpi_comm_world,ierr)
1049 chckerr('Failed to send notification')
1050
1051#if ldebug
1052 if (debug_lock) write(*,*,iostat=istat) trim(lock_header(lock_loc)), &
1053 &' notified', rec_rank
1054#endif
1055 end function lock_notify
1056
1057 !> Get notified that the rank can get the lock.
1058 !!
1059 !! \note Based on \cite RossAtomicIO.
1060 !!
1061 !! \return ierr
1062 integer function lock_get_notified(lock_loc) result(ierr)
1063 character(*), parameter :: rout_name = 'lock_get_notified'
1064
1065 ! input / output
1066 type(lock_type), intent(in) :: lock_loc !< lock
1067
1068 ! local variables
1069 integer :: dum_buf ! dummy buffer
1070#if ldebug
1071 integer :: istat ! status
1072#endif
1073
1074 ! initialize ierr
1075 ierr = 0
1076
1077#if ldebug
1078 if (debug_lock) write(*,*,iostat=istat) trim(lock_header(lock_loc)), &
1079 &' but needs to wait for lock'
1080#endif
1081 call mpi_recv(dum_buf,1,mpi_integer,mpi_any_source,&
1082 &lock_loc%wu_tag,mpi_comm_world,mpi_status_ignore,ierr)
1083 chckerr('Failed to receive notification')
1084#if ldebug
1085 if (debug_lock) write(*,*,iostat=istat) trim(lock_header(lock_loc)), &
1086 &' got notified by ',dum_buf-1
1087#endif
1088 end function lock_get_notified
1089
1090 !> Adds, removes or sets to active a rank from the waiting list for a lock
1091 !! and returns the lock waiting list:
1092 !!
1093 !! Actions:
1094 !! - \c wl_action = 0: remove
1095 !! - \c wl_action = 1: add
1096 !! - \c wl_action = 2: active
1097 !!
1098 !! Or negative equivalents for non-blocking (NB) procs.
1099 !!
1100 !! Optionally, the rank(s) of the process for which to perform this action
1101 !! can be passed. This is useful for doing the same action on multiple
1102 !! processes.
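!!
!! For example, to add the current process as a blocking process to the
!! waiting list of a hypothetical lock \c my_lock, receiving a local copy of
!! the waiting list in \c wl_loc:
!! \code
!! ierr = lock_wl_change(1,.true.,my_lock,wl_loc)
!! chckerr('')
!! \endcode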
1103 !!
1104 !! \note Based on \cite RossAtomicIO.
1105 !!
1106 !! \ldebug
1107 !!
1108 !! \return ierr
1109 integer function lock_wl_change(wl_action,blocking,lock_loc,wl,ranks) &
1110 &result(ierr)
1111 use num_vars, only: n_procs, rank
1112
1113 character(*), parameter :: rout_name = 'lock_wl_change'
1114
1115 ! input / output
1116 integer, intent(in) :: wl_action !< action to perform
1117 logical, intent(in) :: blocking !< the ranks to be changed are blocking
1118 type(lock_type), intent(inout) :: lock_loc !< lock
1119 integer, intent(inout), allocatable :: wl(:) !< waiting list
1120 integer, intent(in), optional :: ranks(:) !< rank(s) for which to perform option
1121
1122 ! local variables
1123 integer :: id ! counter
1124 integer :: n_ranks ! number of ranks to set
1125 integer :: put_val ! value to put
1126 integer, allocatable :: ranks_loc(:) ! local ranks
1127 integer(kind=MPI_ADDRESS_KIND) :: one = 1 ! one
1128 integer(kind=MPI_ADDRESS_KIND) :: disp ! local displacement
1129 integer :: ln ! local length
1130#if ldebug
1131 integer :: istat ! status
1132 integer(kind=8) :: window_time(2) ! time that a window is locked
1133#endif
1134
1135 ! initialize ierr
1136 ierr = 0
1137
1138 ! set value to put
1139 put_val = wl_action
1140 if (.not.blocking) put_val = -put_val ! NB indicated in waiting list using a negative value
1141 !allocate(wl_excl(n_procs-1))
1142 if (allocated(wl)) deallocate(wl)
1143 allocate(wl(n_procs))
1144
1145 ! set local ranks if not present
1146 if (.not.present(ranks)) then
1147 allocate(ranks_loc(1))
1148 ranks_loc(1) = rank
1149 else
1150 allocate(ranks_loc(size(ranks)))
1151 ranks_loc = ranks
1152 end if
1153 n_ranks = size(ranks_loc)
1154
1155 ! get window to lock
1156#if ldebug
1157 if (debug_lock) call system_clock(window_time(1))
1158#endif
1159 call mpi_win_lock(mpi_lock_exclusive,0,0,lock_loc%wl_win,ierr)
1160 chckerr('Failed to lock window')
1161
1162 ! get previous list and put value for current ranks
1163 ! Do this by looking at every rank that is to be put, whether there are
1164 ! values to be gotten between the previous put and the current put.
1165 ! Finally, do one more get for the ranks between the last put and the
1166 ! end.
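! As a hypothetical illustration, for n_procs = 4 and ranks_loc = [1,3]:
! iteration 1 gets the entry of rank 0 and puts at rank 1, iteration 2
! gets the entry of rank 2 and puts at rank 3, and iteration 3 gets
! nothing, as rank 3 is the last one.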
1167 do id = 1,n_ranks+1
1168 ! set local displacement and length
1169 if (id.eq.1) then
1170 disp = 0
1171 ln = ranks_loc(1)
1172 else
1173 disp = disp + ln + 1 ! previous step used ln gets and 1 put
1174 if (id.lt.n_ranks+1) then
1175 ln = ranks_loc(id)-ranks_loc(id-1)-1
1176 else
1177 ln = n_procs-ranks_loc(id-1)-1
1178 end if
1179 end if
1180
1181 ! perform the gets between this rank and previous rank
1182 if (ln.gt.0) then ! if there is something to get between last put and this one
1183 call mpi_get(wl(int(disp)+1:int(disp)+ln),ln,mpi_integer,0,&
1184 &disp,ln,mpi_integer,lock_loc%wl_win,ierr)
1185 chckerr('Failed to get waiting list')
1186 end if
1187
1188 ! perform the single put up to the current rank if not last piece
1189 if (id.lt.n_ranks+1) then
1190 call mpi_put(put_val,1,mpi_integer,0,ranks_loc(id)*one,1,&
1191 &mpi_integer,lock_loc%wl_win,ierr)
1192 chckerr('Failed to add to waiting list')
1193 wl(ranks_loc(id)+1) = put_val
1194 end if
1195 end do
1196
1197 ! unlock window
1198 call mpi_win_unlock(0,lock_loc%wl_win,ierr)
1199 chckerr('Failed to unlock window')
1200#if ldebug
1201 if (debug_lock) call system_clock(window_time(2))
1202#endif
1203
1204#if ldebug
1205 if(debug_lock) write(*,*,iostat=istat) trim(lock_header(lock_loc)),&
1206 &' time: '//&
1207 &trim(r2strt(1.e-9_dp*(window_time(2)-window_time(1)))),&
1208 &' waiting list:', wl
1209#endif
1210 end function lock_wl_change
1211
1212#if ldebug
1213 !> Returns the header for lock debug messages.
1214 !!
1215 !! \ldebug
1216 character(len=max_str_ln) function lock_header(lock_loc) result(header)
1217 use num_vars, only: rank
1218
1219 ! input / output
1220 type(lock_type), intent(in) :: lock_loc !< lock
1221
1222 ! local variables
1223 integer(kind=8) :: clock ! current clock
1224 character(len=2) :: block_char ! BL for blocking and NB for non-blocking processes
1225
1226 ! get clock
1227 call system_clock(clock)
1228
1229 ! set block_char
1230 if (lock_loc%blocking) then
1231 block_char = 'BL'
1232 else
1233 block_char = 'NB'
1234 end if
1235
1236 header = trim(ii2str(clock))//' '//trim(i2str(lock_loc%wu_tag))//' '//&
1237 &block_char//' '//trim(i2str(rank))//'-'
1238 end function lock_header
1239#endif
1240end module mpi_utilities