The GO_BOARD array will have its rows distributed cyclically over a
one-dimensional arrangement of abstract processors.
The "*" specifies that GO_BOARD is not to be distributed along its
second axis; thus an entire row is to be distributed as one object.
This is sometimes called "on-processor" distribution.
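As a minimal sketch of the distribution just described: the board size (19 x 19), the number of abstract processors (4) and the arrangement name procs are assumptions chosen for illustration, not taken from the original example. A compiler without HPF support treats the !HPF$ lines as comments, so the program simply prints the processor each row would be dealt to under a CYCLIC mapping.

```fortran
program cyclic_rows
  implicit none
  ! Assumed sizes, for illustration only.
  integer, parameter :: n = 19, np = 4
  integer :: go_board(n, n)
  integer :: i
!HPF$ PROCESSORS procs(np)
!HPF$ DISTRIBUTE go_board(CYCLIC, *) ONTO procs

  go_board = 0

  ! Under CYCLIC, row i is dealt to processor mod(i-1, np) + 1;
  ! the "*" keeps every element of a row on that same processor.
  do i = 1, n
     print '(a,i2,a,i1)', 'row ', i, ' -> processor ', mod(i - 1, np) + 1
  end do
end program cyclic_rows
```

Rows 1, 5, 9, ... land on the first processor, rows 2, 6, 10, ... on the second, and so on.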
Distribution
Example distributions
- The ALIGN directive is used to specify that certain data objects
are mapped in the same way as certain other data objects.
- Operations between aligned data objects are likely to be more efficient
than operations between data objects that are not known to be aligned,
because two objects that are aligned are intended to be mapped to the
same abstract processor.
- Objects can be aligned by matching DISTRIBUTE statements; ALIGN, however,
is more general.
- Common implementations of the ALIGN directive for non-conformable
arrays actually create two variables of the same size. That is, the
smaller array takes up as much memory as the larger array.
Examples of ALIGN
- ALIGNing a smaller array inside a larger array:
DIMENSION A(10,10), B(8,8)
!HPF$ ALIGN B(I,J) WITH A(I+1,J+1)
- ALIGNing a two dimensional array with a one dimensional array. Note that
the : signifies which dimensions are aligned and the * indicates positions
not used; the * has the same effect as using a dummy variable which is not
used elsewhere in the statement.
INTEGER Y (N)
REAL, DIMENSION (N,N) :: X
!HPF$ ALIGN X(:,*) WITH Y (:)
- Example of transposing two axes:
!HPF$ ALIGN X(J,K) WITH Y(K,J)
- Example of reversing both axes:
!HPF$ ALIGN X(J,K) WITH Y (M-J+1,N-K+1)
- ALIGNing arrays that match in the distributed ("parallel") dimension but may
differ in the collapsed ("on-processor") dimension:
REAL A(3,N), B(4,N), C(43,N), Q(N)
!HPF$ DISTRIBUTE Q(BLOCK)
!HPF$ ALIGN (*,:) WITH Q :: A,B,C
- Optional directive used to provide additional information
useful in distributing data to a specific geometry
- Statically defines the number of processors used
- Used to define an array of Abstract Processors
- Defines a linear array and two matrices
- Hopefully these will map to Actual Processors
- Computer need not have this geometry
- Computer may not have this many processors
- Processor arrays can have rank up to 7
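The bullets above appear to describe the HPF PROCESSORS directive. A minimal sketch of "a linear array and two matrices" might look like the following; the arrangement names (line, wide, tall), the total of 8 abstract processors and the arrays v and m are assumptions made for illustration. Without an HPF compiler the directives are ignored as comments and the program runs serially.

```fortran
program processors_sketch
  implicit none
  integer, parameter :: n = 16
  real :: v(n), m(n, n)
  ! Hypothetical arrangements; each one holds 8 abstract processors.
!HPF$ PROCESSORS line(8)
!HPF$ PROCESSORS wide(2, 4), tall(4, 2)
  ! Map a vector onto the linear arrangement and a matrix onto a 2 x 4 grid.
!HPF$ DISTRIBUTE v(BLOCK) ONTO line
!HPF$ DISTRIBUTE m(BLOCK, BLOCK) ONTO wide

  v = 1.0
  m = 2.0
  print *, sum(v) + sum(m)
end program processors_sketch
```

How the abstract arrangements map onto the physical machine is left to the compiler and run-time system.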
- An abstract space of indexed positions
- Useful in aligning arrays.
- Specifically useful for aligning partially overlapping arrays when there is no
need to declare an array of the entire size.
!HPF$ TEMPLATE, DISTRIBUTE (BLOCK,BLOCK) :: overlap (30,30)
real, dimension (20,20) :: a,b
!HPF$ ALIGN a(i,j) with overlap (i,j)
!HPF$ ALIGN b(i,j) with overlap (i+10,j+10)
- Similar to ALIGN and DISTRIBUTE
- An array can be moved in two ways
- Realigning or redistributing itself
- Redistributing the array to which it is aligned
- Can cause massive amounts of traffic
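These bullets appear to describe the executable REALIGN and REDISTRIBUTE directives. The sketch below shows both ways of moving an array; the array names and sizes are assumptions, and it relies on the HPF rule that an array must be declared DYNAMIC before it can be remapped. A compiler without HPF support ignores the directives and runs the code serially.

```fortran
program remap_sketch
  implicit none
  integer, parameter :: n = 12
  real :: a(n), b(n)
  integer :: i
!HPF$ DYNAMIC a, b
!HPF$ DISTRIBUTE a(BLOCK)
!HPF$ ALIGN b(i) WITH a(i)

  a = (/ (real(i), i = 1, n) /)
  b = 2.0 * a

  ! Redistributing a also moves b, because b is aligned to a.
!HPF$ REDISTRIBUTE a(CYCLIC)

  ! Realigning b moves only b (here reversed relative to a).
!HPF$ REALIGN b(i) WITH a(n - i + 1)

  print *, sum(a * b)
end program remap_sketch
```

Each remapping may move every element of the affected arrays, which is why it can generate so much traffic.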
- The !HPF$ INDEPENDENT directive can precede an indexed DO loop
- It asserts to the compiler that the operations in a DO loop may be
executed independently - that is, in any order, or interleaved, or
concurrently - without changing the semantics of the program
- This directive is useful since it is often the case that the
compiler cannot detect parallelism. The INDEPENDENT directive
allows the compiler to parallelize the loop without concern for
dependencies.
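A typical case where the assertion helps is an indirectly indexed assignment. In the sketch below (array names and sizes are assumptions), the compiler cannot know at compile time that perm holds no repeated values, so it cannot prove the iterations independent; the directive states it on the programmer's behalf.

```fortran
program independent_sketch
  implicit none
  integer, parameter :: n = 8
  integer :: perm(n), i
  real :: a(n), b(n)

  ! perm is a permutation (here simply the reversed index), so no two
  ! iterations write to the same element of a.
  perm = (/ (n - i + 1, i = 1, n) /)
  b = (/ (real(i), i = 1, n) /)

!HPF$ INDEPENDENT
  do i = 1, n
     a(perm(i)) = 2.0 * b(i)
  end do

  print *, a
end program independent_sketch
```

If perm did contain duplicates, the assertion would be false and the result of a parallel execution would be unpredictable.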
- Parallel calculations increase the performance of the code
- The communication occurring when calculations use data found in
other processors' local memory degrades performance
- HPF assumes all data is stored in a global name space
- Messages are passed when data elements used in the same
calculation do not reside on the same processor
- All communication is performed by message passing done by the run-time
system and is invisible to the programmer
- Performance is gained by operating on whole arrays in parallel
Array processing is one of the most attractive features of F90 / HPF.
It is particularly important to numerically intensive high performance
scientific computation.
A whole array is now an object. Operations can be performed on
a whole array rather than one element at a time.
Processing with arrays - definitions
- rank - The rank of an array is the number of dimensions
- extent - The extent of an array dimension is the number of elements
in a dimension
- shape - The shape of an array is a vector of its extents
- size - The size of an array is the product of its extents
- conformance - Arrays are said to be conformable if they have the same
shape
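These definitions can be checked with the Fortran 90 inquiry intrinsics SHAPE and SIZE. The arrays in this small sketch are arbitrary examples; the rank is obtained as the size of the shape vector.

```fortran
program array_terms
  implicit none
  real :: a(0:6), b(3, 4), c(3, 4)

  print *, 'rank of b   :', size(shape(b))          ! number of dimensions
  print *, 'extents of b:', size(b, 1), size(b, 2)  ! elements per dimension
  print *, 'shape of b  :', shape(b)                ! vector of the extents
  print *, 'size of b   :', size(b)                 ! product of the extents
  print *, 'size of a   :', size(a)                 ! bounds 0:6 give extent 7
  print *, 'b, c conformable:', all(shape(b) == shape(c))
end program array_terms
```

Note that conformance depends only on shape, not on the declared lower and upper bounds.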
Array Specifications
type [[,DIMENSION (extent-list)] [,attribute] ... ::] entity-list
- type - can be an intrinsic (integer, real, complex ...) or derived type
- dimension - used to define the extents of each dimension
- (extent-list) if an explicit shape array, defines upper and lower bounds
in each dimension; the values are provided by integer expressions
that can be evaluated at compile-time
- attribute - provides information (allocatable, dimension, intrinsic ...)
Array Operations
- Calculations can be performed on whole arrays or sections of arrays as
long as they are conformable
Array Sections
- A subset of an array may be specified by referencing a range
- As a subscript - a(1,2,3)
- As a subscript triplet [lower bound]:[upper bound][:stride] - a(2:6:2)
- As a vector subscript - a((/2,4,6/))
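The three kinds of reference can be tried on a small one-dimensional array. This is an illustrative sketch only; the array contents and the index vector idx are made up for the example.

```fortran
program sections
  implicit none
  integer :: a(10), i
  integer :: idx(3) = (/ 2, 4, 6 /)

  a = (/ (10 * i, i = 1, 10) /)

  print *, a(3)            ! single subscript: one element
  print *, a(2:6:2)        ! subscript triplet: elements 2, 4, 6
  print *, a(:4)           ! omitted lower bound defaults to 1
  print *, a(8:)           ! omitted upper bound defaults to 10
  print *, a(idx)          ! vector subscript: elements 2, 4, 6
  print *, a((/ 9, 1 /))   ! vector subscript written inline
end program sections
```

The triplet a(2:6:2) and the vector subscript a(idx) select the same elements here, but a vector subscript may list elements in any order.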
Example of array constructor and operations
program Arrays
real :: a(0:6), b(3,3), c(-1:1)
a = (/ (sqrt(real(i)), i=1,7) /)
b = reshape( source = (/a, 2.5, 2.6/), &
shape = (/3,3/) )
c = 10.0
c = b(1,:) + b(:,3) + c
print *, "This is the 2nd element of array c:", c(0)
print *, "This is array c:", c
end program Arrays
Types of Arrays
- Automatic arrays are those arrays whose extents can only be determined on
entry to a subprogram -- that is, the bound expressions depend on variables that
are dummy arguments, in common, or in a module
real,dimension(N,N) :: arg
- Assumed-shape arrays are dummy arguments that take
the shape of the actual arguments passed.
real,dimension(0:, :) :: arg
real a(:), b(-1:)
- Assumed-size arrays are dummy arguments whose sizes are assumed
from that of the associated actual argument. Only the extent in
the last dimension is free to match.
real s(3,3,*)
- Deferred-shape arrays are either array pointers
or allocatable arrays. They are dynamically allocatable.
real,pointer :: d(:,:), p(:)
real,allocatable :: e(:)
Dynamic Memory Allocation of Arrays
Dynamic memory allocation of arrays allows the array size to be defined
during execution.
An allocatable array is one of the dynamic data objects provided
by the Fortran 90 language. An array pointer is similar to an allocatable
array in functionality, although scalars may also have the POINTER
attribute.
! Example 2: array_ex2.f90
!***************************
!
program ArrayEx2
integer n
real, allocatable :: a(:)
print *, "Please type an integer number(>10):"
read *, n ! The size of the array will be
! defined at runtime.
allocate(a(n))
a = (/ (1.0/real(i), i = 1,n) /) ! To get the reciprocals
! of the integer numbers.
print *, "First ten elements"
print *, a(1:5) ! array section
print *, a(6:10) !
end program ArrayEx2
The WHERE construct allows for array assignments and calculations based on a
conditional mask array. All arrays must be conformable.
There are two types of WHERE.
- WHERE statement
WHERE ( logical-expression ) array-intrinsic-assignment-statement
- WHERE construct
WHERE ( logical-expression )
[ array-intrinsic-assignment-statement ] ...
ELSE WHERE
[ array-intrinsic-assignment-statement ]
END WHERE
The WHERE construct may not be nested.
An example of a WHERE construct:
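subroutine example (a,b,c,above,below,n)
integer, intent(in) :: n ! extent of each dimension
real, dimension (n,n)::a,b,c,above,below
b = 0
c = 0
where (a > 0.50) ! elements of a above the threshold
above = 1
b = a
else where ! all remaining elements
below = 1
c = a
end where
end subroutine example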
- The purpose of the FORALL statement and construct is to provide a
convenient syntax for simultaneous assignments to large groups of array
elements.
- Simultaneous calculations lie at the heart of the data parallel
computations that HPF is designed to express.
- The FORALL construct allows for array assignments and calculations on
non-conformable arrays.
- An optional conditional mask array is supported.
- Function calls are supported within FORALLs.
There are two forms of FORALL
- The FORALL statement
FORALL (forall-triplet-spec-list [, scalar-mask-expr]) forall-assignment
- The FORALL construct
FORALL (forall-triplet-spec-list [, scalar-mask-expr])
forall-body-statement
[forall-body-statement]
END FORALL
Examples of the FORALL construct
- do x(i,j)=1/y(i) for i=1 to n, j=1 to m and y(i) /= 0.0
- Test prevents divide by zero
- Test may prevent/cause communication?
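The example described above can be written as a single FORALL statement with a scalar mask. The sketch below picks small assumed sizes (n = 4, m = 3) and sample values for y; elements whose mask is false keep their previous value.

```fortran
program forall_mask
  implicit none
  integer, parameter :: n = 4, m = 3
  real :: x(n, m), y(n)
  integer :: i, j

  y = (/ 2.0, 0.0, 4.0, 0.5 /)
  x = -1.0                      ! marks elements the FORALL does not assign

  ! The mask y(i) /= 0.0 suppresses the assignment where it would divide by zero.
  forall (i = 1:n, j = 1:m, y(i) /= 0.0) x(i, j) = 1.0 / y(i)

  do i = 1, n
     print '(3f8.3)', x(i, :)
  end do
end program forall_mask
```

The second row stays at -1.0 because y(2) is zero.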
- A function containing no statements that could cause side effects and
changing none of its arguments, or a subroutine that has no side effects except
through its arguments
- A procedure with:
- No SAVE attribute or statement
- No DATA initialization
- No use of variables in COMMON or in modules
- No reference to nonPURE procedures
- No input/output
- No STOP statement
- A function that can be used in a FORALL assignment statement
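The list above appears to describe PURE procedures. A minimal sketch of a PURE function called from a FORALL follows; the module name pure_demo and the function cube are hypothetical. Placing the function in a module gives the caller an explicit interface, so the compiler can see that it is PURE.

```fortran
module pure_demo
  implicit none
contains
  ! PURE: the result depends only on the argument and nothing else is modified.
  pure function cube(v) result(r)
    real, intent(in) :: v
    real :: r
    r = v * v * v
  end function cube
end module pure_demo

program use_pure
  use pure_demo
  implicit none
  integer, parameter :: n = 5
  real :: a(n)
  integer :: i

  ! Each element is computed independently, so any evaluation order is valid.
  forall (i = 1:n) a(i) = cube(real(i))
  print *, a
end program use_pure
```

Removing the PURE prefix from cube should cause the compiler to reject the FORALL.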
The !HPF$ INDEPENDENT command
- Gives the compiler extra information used for optimization.
- Asserts that various active index values of the forall do not
interfere with each other.
- The result will not vary if the order is changed.
Note: Some compilers do this automatically.
The INDEPENDENT statement provides:
- Execute statements independently
- Any order
- Interleaved
- Concurrently
- Effects
- Precedence
- Communication patterns
- Temporary Storage Requirements
- INDEPENDENT & Precedence
- Data transfer can be delayed until necessary for continuation
- INDEPENDENT effect on Communication & Storage
- Without INDEPENDENT
- All RHSA must be calculated before proceeding
- Requires temporary storage
- One large broadcast can be done for all RHSA to processor for LHSA
- Requires a barrier to be done for synchronization
at each step of the evaluation (after all LHSA computations, all RHSA computations,
...)
- With INDEPENDENT
- No Temporary Storage
- Only one Block at the end of the calculations
There are a total of 17 new intrinsic array functions defined in Fortran 90
- ALL
- ANY
- COUNT
- CSHIFT
- EOSHIFT
- MAXLOC
- MAXVAL
- MERGE
- MINLOC
- MINVAL
- PACK
- PRODUCT
- RESHAPE
- SPREAD
- SUM
- TRANSPOSE
- UNPACK
ALL (MASK,DIM) - Determine whether all values are true in MASK along
dimension DIM.
ANY (MASK, DIM) - Determine whether any value is true in MASK along
dimension DIM.
COUNT (MASK,DIM) - Count the number of true elements of MASK along dimension
DIM.
CSHIFT (ARRAY,SHIFT,DIM) - Perform a circular shift on an array expression of
rank one or perform circular shifts on all the complete rank one sections along
a given dimension of an array expression of rank two or greater. Elements
shifted out at one end of a section are shifted in at the other end.
EOSHIFT (ARRAY,SHIFT,BOUNDARY,DIM) - Perform an end-off shift on an array
expression of rank one or perform end-off shifts on all complete rank-one
sections along a given dimension of an array expression of rank two or greater.
Elements are shifted off at one end of a section and copies of a boundary
value are shifted in at the other end. Different sections may have different
boundary values and may be shifted by different amounts and in different directions.
MAXVAL (ARRAY,DIM,MASK) - Maximum value of the elements of ARRAY along
dimension DIM corresponding to the true elements of MASK.
MERGE (TSOURCE,FSOURCE,MASK) - Choose alternative value according to the value
of a mask.
MINVAL(ARRAY,DIM,MASK) - Minimum value of all the elements of ARRAY along
dimension DIM corresponding to true elements of MASK.
PACK(ARRAY,MASK,VECTOR) - Pack an array into an array of rank one under the
control of a mask.
PRODUCT (ARRAY,DIM,MASK) - Multiply all of the elements of ARRAY along
dimension DIM corresponding to the true elements of MASK.
RESHAPE (SOURCE,SHAPE,PAD,ORDER) - Construct an array of a specified
shape from the elements of a given array.
SPREAD (SOURCE,DIM,NCOPIES) - Replicate an array by adding a dimension.
Broadcasts several copies of SOURCE along a specified dimension and thus
forms an array of rank one greater than that of SOURCE.
SUM (ARRAY,DIM,MASK) - Add all the elements of ARRAY along dimension DIM
corresponding to the true elements of MASK.
TRANSPOSE (MATRIX) - Transpose an array of rank two.
UNPACK (VECTOR,MASK,FIELD) - Unpack an array of rank one into an array of shape
MASK under control of MASK.
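A short sketch exercising several of these intrinsics on small integer arrays may help. The data values are arbitrary and only a subset of the 17 functions is shown.

```fortran
program intrinsics_demo
  implicit none
  integer :: v(6), m(2, 3)

  v = (/ 3, -1, 4, -1, 5, -9 /)
  m = reshape((/ 1, 2, 3, 4, 5, 6 /), (/ 2, 3 /))   ! columns (1,2), (3,4), (5,6)

  print *, 'any negative   :', any(v < 0)
  print *, 'count negative :', count(v < 0)
  print *, 'sum of positive:', sum(v, mask = v > 0)
  print *, 'maxval, maxloc :', maxval(v), maxloc(v)
  print *, 'cshift by 1    :', cshift(v, 1)     ! first element wraps to the end
  print *, 'eoshift by 1   :', eoshift(v, 1)    ! default boundary value is 0
  print *, 'merge          :', merge(v, 0, v > 0)
  print *, 'pack positives :', pack(v, v > 0)
  print *, 'row sums of m  :', sum(m, dim = 2)
  print *, 'shape(transpose(m)):', shape(transpose(m))
  print *, 'spread         :', spread((/ 1, 2 /), dim = 1, ncopies = 2)
end program intrinsics_demo
```

Comparing the cshift and eoshift lines shows the difference between a circular and an end-off shift: only the last element differs.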
HPF provides the ability to call routines written in other programming
paradigms and languages. Since these procedures are outside of
HPF, they are called extrinsic.
- Called like a normal procedure
- Declared extrinsic in an "Interface"
- When called a copy is started on each processor
- Each "sees" only a portion of the array passed
- Can use any locally defined libraries
- Can be written in any language and style
- Need not be "PURE", can have side effects
This section describes the statements in Fortran 90 that control
program execution.
- The familiar DO loop of FORTRAN 77, for example,
DO 100 I = 1, 10
A(I) = 3.0*B(I)
100 CONTINUE
has been enhanced by several notable additions.
- First, a label is no longer necessary: termination of
the DO block is denoted by the END DO statement.
- Secondly, no iteration loop control is necessary with the
addition of the CYCLE and EXIT statements, although they
may be used together with iteration loop control. A DO construct
without iteration control is called a Simple DO Loop. An example
of such a construct is:
DO
IDX = INDEX(J)
IF (IDX < 0) CYCLE ! GET NEXT PAGE
IF (IDX == 0) EXIT ! DONE
A(IDX) = PAGE(IDX) ! PROCESS PAGE FOR POSITIVE IDX
END DO
- Another example illustrating the EXIT statement is
NEWTON_LOOP: DO
X1 = X0 - F(X0)/F(X0,DERIVATIVE=.TRUE.)
IF (ABS(X1-X0) <= TOL) THEN
EXIT NEWTON_LOOP ! CONVERGED
ELSE
ITER = ITER + 1 ! CONTINUE ITERATIONS
X0 = X1
ENDIF
END DO NEWTON_LOOP
Note that the loop construct has been given a name NEWTON_LOOP
as is possible for all block control constructs.
- Both CYCLE and EXIT statements may optionally specify a
construct name. If DO constructs are nested and a construct
name is specified then a CYCLE or EXIT statement applies to the
named DO construct. If a construct name is not specified, then
a CYCLE or EXIT statement applies to the innermost DO construct
in which it appears.
- Finally, the WHILE loop control can be specified
as an alternative to the usual iteration loop control. In this case,
the loop continues until the specified logical condition becomes
false. A simple example is
COUNT = 0
INIT_LOOP: DO WHILE (COUNT <= MAX)
X(COUNT) = COUNT
COUNT = COUNT + 1
END DO INIT_LOOP
- Fortran 90 has added a new control construct, the SELECT CASE statement.
The CASE construct selects for execution at most one of its
constituent blocks. A case index is evaluated and compared with
all case selectors. If there is a match, the block following the
matched case is executed. A default case may be specified that is
executed in the event that no match occurs; the default case selector
statement does not have to appear last.
- The CASE construct may be given a name; the case selector statements
may also be given the same name.
- A case index is limited to an expression of type integer,
character, or logical, and all case selectors must be of this type.
Character strings may have different lengths but must be of the same
kind.
- A case selector of type integer or character, but not logical,
can be a value of one of these types or a range of values, or a
combination of values and ranges. A case selector of type logical is
restricted to a list of values, not ranges. In all cases, the values and
range endpoints must be constants or constant expressions.
- A case index matches a case selector if it is found to match one
of the values or is contained in a range of values in a case
selector statement. Ranges of character strings match all
character strings that collate between the specified range.
- There must not be more than one case selector that matches the case
index. Therefore overlapping case values and ranges are prohibited.
- Example:
SELECT CASE (INDX)
CASE (1); X = 1.0
CASE (2); X = 10.0
CASE (3); X = 100.0
CASE (4); X = 1000.0
END SELECT
- Example:
CHARACTER, PARAMETER :: SEMI_COLON=';', COMMA=',', PERIOD='.'
CHARACTER, PARAMETER :: EXCLAM='!', DOLLAR='$', AMPERSAND='&'
DO I = 1, LEN_TRIM(LINE)
IDENTIFY_DELIMITER: SELECT CASE (LINE(I:I))
CASE (SEMI_COLON, COMMA, PERIOD)
FOUND = .TRUE.
LEVEL = 0
CASE (EXCLAM, DOLLAR, AMPERSAND)
FOUND = .TRUE.
LEVEL = 1
CASE DEFAULT
FOUND = .FALSE.
END SELECT IDENTIFY_DELIMITER
IF (FOUND) THEN
CALL PROCESS_LINE (LINE)
EXIT
ENDIF
END DO
- Example:
SELECT CASE (N)
CASE (5:8, 10)
CALL SUB_1
CASE (:4, 9)
CALL SUB_2
CASE (11:)
CALL SUB_3
END SELECT
- The FORTRAN 77 statements PAUSE, STOP, CONTINUE, and RETURN
remain available in Fortran 90 without modification. The PAUSE statement
is obsolescent.
Program Units
A Fortran program must contain one main program unit and may
contain any number of the following other kinds of program units.
- External subprogram (subroutine or function)
- Module
- Block data with a name for the block data program
The END statement is used to terminate all program units.
Module and block data units are not executable, while main program
and subprogram units are.
- New features in Fortran 90 related to program units
- Modules
- Internal procedure
- Interface blocks
- Recursive procedures
- All variables belong to their program units.
- Association
- Use association (modules)
- Storage association (common blocks)
- Argument association (external procedures)
- Host association (internal procedures)
- Note:
- Variables in a common block are not global.
- No declarations are needed for internal procedures.
- Internal Procedures
- The introduction of internal procedures in Fortran 90 changes
the traditional block structure in the FORTRAN language.
- There is limited capability for encapsulation.
- Information hiding is possible on a large scale.
- Example 1: addfunc.f90
program AddFunc
print *, add2f(23, 46)
contains
integer function add2f(n1, n2)
add2f = n1 + n2
end function add2f
end program AddFunc
- Example 2: addsub.f90
program AddSub
integer total
call add2s(23, 46)
print *, total
contains
subroutine add2s(n1, n2)
total = n1 + n2
end subroutine add2s
end program AddSub
- Modules
- Modules, which are reusable program units, have been introduced to allow
more flexibility and clarity when reusing previously written procedures, as well
as to provide global data areas.
- The introduction of modules in Fortran 90 represents a first step
towards object-oriented programming.
- Furthers code reuse at a subprogram level.
- Limited capability of object hiding
(rename-list, USE...ONLY access-list, PRIVATE and PUBLIC attributes and
statements)
- Example: print_box.f90(module), use_module.f90
! print_box.f90
!*******************************
module print_box
character*6 , private :: label = "MY BOX"
contains
subroutine printbox
write(6,10)
10 format(10X,"+----------+")
write(6,11) label
11 format(10X,"|",2X,A6,2X,"|")
write(6,10)
end subroutine printbox
end module print_box
! use_module.f90
!*******************************
program use_module
use print_box
print *, "This is my first box!"
call printbox
print *, "This is my second box!"
call printbox
! print *, label ! Try this!
! If "label" is private,
! you can't see it!
end program use_module
- Notice that this example uses fixed source form.
Procedure Interface Blocks
- The procedure interface block in Fortran 90 allows using a generic name
for user-defined procedures.
- It provides a limited form of polymorphism, namely the ability to
overload Fortran 90 names and operators, as is done
in Fortran 77 with intrinsic operators and function names.
- Example: summ.f90
! summ.f90
!***********************
program TestSumm
interface summ ! "summ" will be the
! generic function name.
function summ_r(x,y)
real :: summ_r
real,intent(in) :: x,y
end function summ_r
function summ_i(x,y)
integer :: summ_i
integer,intent(in) :: x,y
end function summ_i
function summ_d(x,y)
double precision summ_d
double precision,intent(in) :: x,y
end function summ_d
end interface
real :: a=0.1,b=0.25
integer :: c=12,d=13
double precision :: e=1.11111111d0,f=2.33333333d0
print *, "c + d =", summ(c,d)
print *, "a + b =", summ(a,b)
print *, "e + f =", summ(e,f)
end program TestSumm
function summ_r(x,y)
real :: summ_r
real,intent(in) :: x,y
summ_r = x + y
end function summ_r
function summ_i(x,y) ! result type is declared on the next line
integer :: summ_i
integer,intent(in) :: x,y
summ_i = x + y
end function summ_i
function summ_d(x,y)
double precision :: summ_d
double precision,intent(in) :: x,y
summ_d = x + y
end function summ_d
Additional Information on the WWW
- Maui High Performance Computing Center's HPF Home Page