
SP Parallel Programming Workshop
Parallel I/O
Although no prerequisites are required, it is a good idea to familiarize
yourself with the concepts covered in the following talks:
- Input and Output are the acts of reading and writing data to some media.
- Media can be anything from a video screen to a magnetic/optical disk
or tape.
- The data can be formatted or unformatted (binary), depending on the
application.
- Serial I/O performance depends on a number of factors, most of them
completely outside the programmer's control:
- CPU design - size, bandwidth and efficiency of I/O operations,
cache size, memory size
- Memory to device interface design - bandwidth and efficiency
- Machine to storage device adapter/connection (SCSI, HIPPI, etc.)
- Data storage device performance characteristics (seek time, type
of media, etc.)
- Network performance characteristics (for networked filesystems)
- Operating system - efficiency of I/O operations
- Compiler - efficiency and optimization of I/O instructions
- Algorithm - how the program conducts I/O and whether or not the
programmer considers architecture issues/optimizations
- Serial I/O (in terms of parallel programming) generally refers to
a condition where:
- I/O is performed by a single task because all tasks cannot safely
perform I/O simultaneously. For example, write operations by
different tasks to the same file "overwrite" each other.
- Simultaneous requests by different tasks to access the same
filesystem encounter bottlenecks because I/O requests are still
handled sequentially by the network and/or file server. Example:
multiple users have their home directories on the same disk and
try to perform I/O at the same time - but the file server (I/O node)
can only handle one I/O request to the same disk at a time.
- Multiple processors must use the same physical disk/server
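The overwrite problem described above can be reproduced in a few lines. This is a minimal sketch, not taken from the workshop: the "tasks" are simulated one after another in a single process, and the function name `naive_writes` is hypothetical. Each simulated task opens the shared file and writes at the default position (offset 0), so later writes land on top of earlier ones.

```python
import os
import tempfile


def naive_writes(path, payloads):
    """Simulate tasks that each write to the same file at offset 0."""
    # Create an empty file so every simulated task can open it for update.
    open(path, "wb").close()
    for data in payloads:
        # "r+b" opens without truncating; the file position starts at 0,
        # so every task writes over the same bytes.
        with open(path, "r+b") as f:
            f.write(data)
    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        shared = os.path.join(tmp, "shared.dat")
        # Only the last task's data survives in the file.
        print(naive_writes(shared, [b"task0---", b"task1---", b"task2---"]))
```

With real concurrent tasks the surviving bytes would also depend on timing, which is why such programs usually route all output through one task.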
- Advantages to Serial I/O
- Ease of use
- Serial I/O is "built into" most programming languages
and usually requires nothing more than using well known
read/write subroutines.
- Single disk paradigm - as a user, it's much easier to
conceptualize a single file residing in one location rather than
pieces of a file residing in multiple locations.
- Learning curve - serial I/O requires a man page; parallel I/O
requires a user's guide.
- There are programming, operating system and hardware standards
which are universally recognized.
- Hardware vendors are continually improving serial I/O performance
- Disadvantages to Serial I/O
- Limited File Size
- Under AIX 3.2.5, the maximum file size is 2 gigabytes.
- Under AIX 4.1, it is 4 gigabytes.
- Under AIX 4.2, it is 64 gigabytes.
- If NFS is used, it is 2 gigabytes regardless of AIX version.
- Bottlenecks
- I/O operations are usually orders of magnitude slower
than CPU operations. Many applications find that I/O is the
single largest performance bottleneck.
- Concurrent I/O
- As a parallel programmer, it is often desirable to
perform I/O from multiple nodes to a single file. With serial I/O,
this causes a severe slowdown.
- Poor Scalability
- No matter how powerful the computer, I/O takes the same
amount of time. Using serial I/O for "Grand Challenge"
problems is not feasible.
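To put the file size limits listed above in perspective, the following sketch computes how much data fits in a single file. It assumes 8-byte (double precision) elements and treats a gigabyte as 2**30 bytes; the helper names `max_elements` and `max_cubic_grid` are hypothetical.

```python
GIB = 2**30  # bytes per gigabyte (binary definition, an assumption here)


def max_elements(limit_bytes, element_bytes=8):
    """Number of whole elements that fit in a file of limit_bytes."""
    return limit_bytes // element_bytes


def max_cubic_grid(limit_bytes, element_bytes=8):
    """Largest n such that an n x n x n grid fits in one file."""
    count = max_elements(limit_bytes, element_bytes)
    n = round(count ** (1.0 / 3.0))
    # Correct for floating point error in the cube root.
    while n**3 > count:
        n -= 1
    while (n + 1) ** 3 <= count:
        n += 1
    return n


if __name__ == "__main__":
    for label, limit in [("2 GB", 2 * GIB), ("4 GB", 4 * GIB), ("64 GB", 64 * GIB)]:
        print(label, max_elements(limit), max_cubic_grid(limit))
```

Under these assumptions a 2 GB limit holds a single 645 x 645 x 645 array of doubles, which is small for a "Grand Challenge" sized problem.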
- From a user's perspective, parallel I/O might be described as the
ability for:
- Multiple tasks to read/write the same data element or portion of
a file simultaneously
- Multiple tasks to read/write the same file (different
portions) simultaneously
- Multiple users to read/write to the same disk space simultaneously
- The programmer to accomplish all of the above by using an easy,
high level interface.
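The second case above, multiple tasks writing different portions of the same file, can be sketched with positioned writes. This is an illustration only: threads stand in for parallel tasks, a POSIX system is assumed (for `os.pwrite`), and `BLOCK`, `write_block` and `parallel_write` are hypothetical names. Because each task writes at an explicit offset that it alone owns, no task disturbs another's data.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK = 16  # bytes owned by each task (hypothetical size)


def write_block(fd, rank):
    """Write this task's block at its own offset in the shared file."""
    data = bytes([ord("A") + rank]) * BLOCK
    # os.pwrite writes at an explicit offset and does not use or move
    # the shared file position, so tasks do not interfere.
    # (Small writes to a regular file are assumed to complete in one call.)
    os.pwrite(fd, data, rank * BLOCK)


def parallel_write(path, ntasks):
    """Have ntasks threads fill disjoint regions of one file."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        with ThreadPoolExecutor(max_workers=ntasks) as pool:
            list(pool.map(lambda rank: write_block(fd, rank), range(ntasks)))
    finally:
        os.close(fd)
    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(parallel_write(os.path.join(tmp, "shared.dat"), 4))
```

On a single local disk this gives correctness but not necessarily speed; the performance benefit depends on a filesystem that can service the regions in parallel.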
- Implementing parallel I/O is a very complex task which involves
independently complex issues. It can involve some/all of the following
factors:
- Multiprocessors with parallel access to multiple disks
- Hardware architecture (ex: disk striping)
- Operating system support (ex: pre-fetching, cache coherency)
- Parallel databases
- Networking (ex: SP2 I/O over switch versus ethernet)
- Algorithms
- Software & language support, programmer interfaces (ex: MPI-I/O)
- Although I/O is a very important part of computing, many would
agree that it has been overshadowed by the emphasis placed on
ever-increasing computing (CPU) power. Great advances have been made in
improving the computation and communication performance of parallel
machines, but their I/O performance is lagging far behind.
- Research into parallel I/O is growing.
Example: Parallel I/O Archive
(http://www.cs.dartmouth.edu/pario) at Dartmouth.
- There are currently no "standards" for parallel I/O.
- Advantages & Disadvantages
- Basically, parallel I/O addresses the shortcomings of serial I/O by
allowing large file sizes, scalability and speedup.
- However, most of the advantages to serial I/O, like ease of use
and standardization, are lost.
- Several different parallel I/O implementations and prototypes are
available (or will soon be). A few which are applicable to the
IBM SP are mentioned below.
- PIOFS
- Stands for Parallel I/O File System. Developed by International
Business Machines (IBM) for the RS/6000 SP.
- Allows for the creation of files as large as 128 Terabytes.
- Files are stored on disks connected to server nodes. Files can
span multiple server nodes to permit simultaneous I/O by multiple
client nodes

- Provides user-defined parallel views of files for data partitioning.
Users can partition a file into subfiles which physically reside
on different servers. If parallel tasks each process their own
subfile, then I/O is parallel.
- Allows scalability up to 512 nodes.
- Network can be the High Performance Switch
- Designed to run with any parallel application.
- PIOUS
- A library designed specifically for PVM 3.
- Stands for Parallel Input/OUtput System. It is a public domain
package designed by Steven Moyer and Vaidy Sunderam from Emory
University.
- Files consist of file segments which are partitioned (declustered)
across separate data servers. If more segments exist than data
servers, then file segments are declustered in a round-robin
fashion.

- Access modes support globally shared and independent file
pointers, and file per node accesses.
- Provides scalability in a heterogeneous environment.
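The round-robin declustering described above can be sketched as a simple mapping from segment number to data server. This is a generic illustration of the idea, not the PIOUS API; the function name `decluster` is hypothetical.

```python
def decluster(num_segments, num_servers):
    """Assign file segments to data servers in round-robin order.

    Returns a list with one entry per server; entry s holds the segment
    numbers stored on server s.
    """
    placement = [[] for _ in range(num_servers)]
    for segment in range(num_segments):
        # Segment i goes to server i mod num_servers.
        placement[segment % num_servers].append(segment)
    return placement


if __name__ == "__main__":
    # Seven segments over three servers: the segments wrap around.
    for server, segments in enumerate(decluster(7, 3)):
        print("server", server, "holds segments", segments)
```

When there are at least as many servers as segments, each segment simply lands on its own server.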
- MPI-IO
- As part of the ongoing MPI-2 standardization process, work is
under way to define a parallel I/O interface for MPI.
- Designed to specify the MPI programmer's interface only. Leaves
implementation of the actual parallel filesystem up to vendors or
other developers.
- Version 5.0 draft of the MPI-2 MPI-I/O committee standard is
available at:
http://lovelace.nas.nasa.gov/MPI-IO/mpi-io-report/mpi-io-report.html
- PASSION
- Stands for: Parallel And Scalable Software for Input-Output.
- Primarily a compiler and runtime support system.
- Supports two models for storing and accessing data
- Local Placement Model - local array of each processor is
stored in a separate file - for each array, there are as many
files as the number of processors. A processor cannot
directly access data from the local array files of other processors.
- Global Placement Model - entire array is stored in a single file.
Each processor can directly access any portion of the file.
- Runtime library (version 1.1 as of 12/96) is supported on SP2
using PIOFS or AFS with MPI.
- Goal of compiler (not currently available) is to take an HPF program
as an input and generate a program with calls to the PASSION
Runtime Library.
- This work and related research/development is being done at
Syracuse University.
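The difference between the two PASSION placement models comes down to where a processor finds its data. The sketch below is not the PASSION runtime interface; it assumes a one-dimensional array of 8-byte elements distributed in contiguous blocks, and the file naming scheme (`name.rank`) and function names are hypothetical.

```python
ELEM = 8  # bytes per element, assuming 64-bit floating point values


def block_range(n, nprocs, rank):
    """Half-open index range [lo, hi) of rank's block of an n element array."""
    base, extra = divmod(n, nprocs)
    # The first 'extra' ranks each hold one additional element.
    lo = rank * base + min(rank, extra)
    hi = lo + base + (1 if rank < extra else 0)
    return lo, hi


def local_placement(name, n, nprocs, rank):
    """Local model: one file per processor, data starts at offset 0."""
    lo, hi = block_range(n, nprocs, rank)
    return f"{name}.{rank}", 0, (hi - lo) * ELEM


def global_placement(name, n, nprocs, rank):
    """Global model: one shared file, each block at its own byte offset."""
    lo, hi = block_range(n, nprocs, rank)
    return name, lo * ELEM, (hi - lo) * ELEM


if __name__ == "__main__":
    for rank in range(4):
        print(local_placement("A", 10, 4, rank), global_placement("A", 10, 4, rank))
```

Each function returns a (file name, byte offset, byte count) triple. In the local model a processor only ever opens its own file; in the global model any processor could compute the offset of any other block.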
If a system does not offer a parallel I/O library, the user must rely
on I/O techniques such as the following to achieve acceptable performance:
- Multiple files
- Instead of returning values from multiple nodes to a
single node to write, have each node write out a file.
- This avoids the classic I/O bottleneck of funneling all data through
one node. However, be careful: managing N files (N = number of nodes)
can be tricky.
- Reads would work the same with multiple input files on multiple nodes.
- Keeping writes local
- Unformatted (binary) writes tend to be smaller and quicker than
formatted writes.
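The techniques above can be combined: each task writes its own file, and writes it in unformatted (binary) form. The following is a minimal sketch under stated assumptions: the tasks are simulated by a loop in one process, the file naming scheme `out.NNNN.bin` and all function names are hypothetical, and values are 64-bit floats. The last helper estimates the size of the same data written as formatted text with enough digits to reproduce each value.

```python
import os
import struct
import tempfile


def write_task_file(directory, rank, values):
    """Write one task's values to its own binary file; return the path."""
    path = os.path.join(directory, f"out.{rank:04d}.bin")
    with open(path, "wb") as f:
        # "<Nd" packs N little-endian 8-byte floats with no padding.
        f.write(struct.pack(f"<{len(values)}d", *values))
    return path


def read_task_file(path):
    """Read back a file produced by write_task_file."""
    with open(path, "rb") as f:
        raw = f.read()
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


def formatted_size(values):
    """Bytes needed to write the values as text, one per line."""
    return sum(len(f"{v:.17e}\n") for v in values)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for rank in range(4):  # stand-in for four parallel tasks
            data = [rank + 0.1 * i for i in range(1000)]
            path = write_task_file(tmp, rank, data)
            print(path, os.path.getsize(path), "bytes binary,",
                  formatted_size(data), "bytes formatted")
```

In this sketch the formatted form is roughly three times larger than the binary form. The price of one file per task is bookkeeping: a later run with a different number of tasks must know how the data was split.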
- Currently, the MHPCC has installed PIOFS on the production SP system
for testing purposes.
References and Acknowledgments
- IBM AIX Parallel I/O File System: Installation, Administration, and Use,
IBM, Kingston, NY
© Copyright 1995
Maui High Performance Computing Center. All rights reserved.
Documents located on the Maui High Performance Computing Center's WWW server
are copyrighted by the MHPCC. Educational institutions are encouraged to
reproduce and distribute these materials for educational use as long as
credit and notification are provided. Please retain this copyright notice
and include this statement with any copies that you make. Also, the MHPCC
requests that you send notification of their use to help@mail.mhpcc.edu.
Commercial use of these materials is prohibited without prior written
permission.
Written: 13 December 1995 marc@mhpcc.edu (Marc Friedman)
Revised: 17 December 1996 blaise@mhpcc.edu