Defect: no assumption can be made about MPI_Win opaque handler #801

Open
@ggouaillardet

Description

! syncall test
!
! Copyright (c) 2012-2014, Sourcery, Inc.
! All rights reserved.
!
! Redistribution and use in source and binary forms, with or without
! modification, are permitted provided that the following conditions are met:
! * Redistributions of source code must retain the above copyright
! notice, this list of conditions and the following disclaimer.
! * Redistributions in binary form must reproduce the above copyright
! notice, this list of conditions and the following disclaimer in the
! documentation and/or other materials provided with the distribution.
! * Neither the name of the Sourcery, Inc., nor the
! names of its contributors may be used to endorse or promote products
! derived from this software without specific prior written permission.
!
! THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
! ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
! WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
! DISCLAIMED. IN NO EVENT SHALL SOURCERY, INC., BE LIABLE FOR ANY
! DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
! (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
! LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
! ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
! (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
!
program syncall
  implicit none
  integer :: me,np,i
  me = this_image()
  np = num_images()
  call mysyncall()
end program syncall
subroutine mysyncall()
  use mpi_f08
  implicit none
  integer :: me,np,i
  integer, allocatable, dimension(:), codimension[:] :: scalar2
  logical :: success = .true.
  integer(c_int), allocatable :: tally(:)
  integer :: rank, size
  integer :: base
  integer(kind=mpi_address_kind) :: sz
  type(MPI_Win) :: win = mpi_win_null
  call mpi_comm_rank(mpi_comm_world, rank)
  call mpi_comm_size(mpi_comm_world, size)
  print *,"hello ", rank, " / ", size
  base = 100
  sz = 4096
  ! comment the line below to hide the issue
  if (rank.eq.1) call mpi_win_create(base, sz, 4, mpi_info_null, mpi_comm_self, win)
  me = this_image()
  np = num_images()
  allocate(scalar2(1)[*])
  scalar2(1) = -1
  if(me /= 1) call sleep(1)
  scalar2(1) = 1
  sync all
  if(me == 1) then
    do i=1,np
      if(scalar2(1)[i] /= 1) then
        success = .false.
      endif
    end do
  end if
  if(me == 1) then
    if (success) then
      print *,'Test passed.'
    else
      print *,'Test failed.'
    endif
  endif
  if (win.ne.mpi_win_null) call mpi_win_free(win)
end

The program above fails when run on two nodes with gfortran 15, the latest OpenCoarrays, and MPICH.
The root cause is that MPICH does not guarantee that all ranks sharing a window get the same MPI_Win opaque handle value, even if the values happen to match most of the time.

Refs #800

The title of the issue should start with Defect: followed by a
succinct title.

  • [x] I am reporting a bug others will be able to reproduce and not asking a question or requesting a new feature.

System information including:

  • OpenCoarrays Version: 2.10.2-32-g3d0fa68
  • Fortran Compiler: gfortran 15.1.0
  • C compiler used for building lib: gcc 15.1.0
  • Installation method: cmake && make && make install
  • All flags & options passed to the installer
  • Output of uname -a: 4.18.0-553.22.1.el8_10.aarch64
  • MPI library being used: MPICH 4.3.1
  • Machine architecture and number of physical cores: A64fx 48+2 cores
  • Version of CMake: 3.26.5

To help us debug your issue please explain:

What you were trying to do (and why)

What happened (include command output, screenshots, logs, etc.)

$ cafrun -n 2 ./a.out
 Test failed.

What you expected to happen

$ cafrun -n 2 ./a.out
 Test passed.

Step-by-step reproduction instructions to reproduce the error/bug

$ caf -I/opt/mpich-4.3.1/include ns.f90
$ cafrun -n 2 ./a.out
