Reports focussing on GAMESS-UK

With a focus on the GAMESS-UK code, the work described below includes:

  • Initial parallel SCF implementations of GAMESS-UK included those for the Intel iPSC/2 and iPSC/860 hypercubes. Development of the parallel code between 1993 and 1995 was undertaken within the Esprit-funded EUROPORT2 IMMP project. EUROPORT's principal aim was to provide exemplars of commercial codes that demonstrated the potential of parallel processing to industry; the primary goal was to establish a broad spectrum of standards-based medium-scale parallel application software rather than targeting the highest levels of scalability and performance.
  • Developments of the parallel code following EUROPORT are described in a number of reports and presentations. These include:
    1. "Molecular Modelling on High-End and Commodity-Type Computers" describes progress in implementing and benchmarking a number of the QUASI core codes on a variety of high-end and commodity-based systems. This presentation, part of the recent QUASI workshop at Mulheim (25-27 September, 2000), provides performance comparisons between Beowulf systems and both high-end MPP (the Cray T3E/1200E) and ASCI-style SMP platforms (typified by the Compaq AlphaServer SC series and IBM SP/WH2-375). Applications considered include those from both computational chemistry (NWChem, GAMESS-UK, Turbomole, DLPOLY and CHARMM) and computational materials (CRYSTAL and the Car Parrinello codes, CPMD and CASTEP).
    2. Ongoing developments to the parallel version of GAMESS-UK, along with a number of other computational chemistry codes, were undertaken within the QUASI project (Jan 1998 to Dec 2000), funded by the European Union. The project aimed to extend and implement techniques for combined QM/MM (quantum mechanics/molecular mechanics) simulations on a variety of High Performance Computing (HPC) platforms, and to apply the techniques to industrial catalytic chemistry applications. The project addressed many of the weaknesses in current molecular modelling practice, and resulted in a user-friendly modelling environment capable, through exploitation of HPC, of realising an order of magnitude enhancement in the computational power available to industry. The reduced time to solution on current problems and the ability to tackle more realistic and complex models will extend the role of modelling to the rational design of catalysts with specific properties. The workplan incorporated extensions to the QM/MM methods, porting to and provision of HPC resources, and industrial projects to validate, demonstrate and exploit the power of the environment in advancing the predictive simulation of a wide range of industrial catalytic systems.
    3. "Computational Chemistry on High-End and Commodity-Type Computers: Status and Perspectives" presents a variety of performance data on the Cray T3E/1200E, and describes initial work in the implementation and optimisation of GAMESS-UK on Beowulf systems. This was presented at the Maxwell Institute Workshop, "Towards Petaflops", held in May 1999 at the University of Edinburgh.
    4. Detailed timings on both MPP (the IBM SP and Cray T3E) and SMP hardware are given in Part 16 of the GAMESS-UK User Manual and Reference Guide.
    5. GAMESS-UK Parallel Benchmarks are presented in the report "Massive Parallelism: The Hardware for Computational Chemistry?". This paper is based on an invited presentation at the meeting "Supercomputing, Collision Processes and Applications" that was held at The Queen's University of Belfast from the 14th to 16th September 1998 to mark the retirement of Professor P.G. Burke CBE FRS. The Proceedings have been published by Plenum Publishing Corporation.

General Benchmarking Reports

A more general series of reports covering benchmarking and performance analysis of both parallel kernels and communication primitives, together with application codes from chemistry, materials and engineering, is available in the following presentations (in reverse chronological order):

  • Application Performance on High-End and Commodity-class Computers, presented at the "5th Annual Workshop on Linux Clusters for Supercomputing", LCSC'2004 (Linkoping, Sweden, October 18-20, 2004). Commodity-based clusters now provide an established, viable, cost-effective alternative for the provision of High Performance Computing. In this presentation we compare the performance of a variety of clusters in the support of major research and production codes with current high-end hardware, such as the IBM p690+ series and the SGI Altix 3700, together with the older Compaq AlphaServer SC and SGI Origin 3800. Our focus lies in applications and looks to address the differing demands from the fields of Capability and Capacity computing. The results concentrate on the areas of computational chemistry, computational materials and computational engineering. Based on simple metrics, we consider the performance of a variety of codes, including NWChem and GAMESS-UK, CPMD, DLPOLY and CHARMM, plus ANGUS and PCHAN, and in each case identify the associated bottlenecks. We overview performance data from some twenty commodity-based systems (CS1-CS20), featuring Intel IA32 and IA64 plus AMD Athlon and Opteron architectures, coupled to traditional Beowulf interconnects, such as Myrinet and Gbit Ethernet, plus the SCALI/SCI, Infiniband and Quadrics QSNet interconnect technologies. This presentation is also available in PDF format.
  • Communication Benchmarks are available for the commonly used MPI operations in computational chemistry codes. Using the benchmarks written by Pallas, we provide data for MPI functions that describe point-to-point message-passing and global data movement. Also included are the Bandwidth benchmarks (B_EFF) from Pallas and a set of benchmarks showing the performance of the Global Array (GA) tools from PNNL. This presentation (October 2004) is also available in PDF format.
  • Application Performance on High-End and Commodity-class Computers contrasts the performance of Beowulf clusters built from commodity "off the shelf" components in the support of major research and production codes, with current high-end hardware such as the IBM p690+, SGI Altix 3700, and the older Compaq AlphaServer SC and SGI Origin 3800. The results concentrate on the application areas of computational chemistry, computational materials and computational engineering. Using simple metrics, we consider the performance of a variety of computational chemistry codes, including the electronic structure codes, GAMESS-UK and NWChem, and the molecular simulation codes, DLPOLY, DLMULTI and CHARMM. Computational materials and engineering codes include the Car Parrinello code, CPMD, and the computational fluid dynamics codes, ANGUS and SBLI. We overview performance data from some twenty commodity-based systems (CS1-CS20), featuring Intel IA32 and IA64 plus AMD Athlon and Opteron architectures, coupled to traditional Beowulf interconnects, such as Myrinet and Gbit Ethernet, plus the SCALI/SCI, Infiniband and Quadrics QSNet interconnect technologies. This presentation (November 2004) gives a more detailed comparison than the LCSC'2004 presentation, and extends the comparisons below through the inclusion of seven new clusters (CS14-CS20). It is also available in PDF format.
  • Chemistry Applications. Development and Performance on the Bradford Xeon Cluster, presented at Bradford University's Institute of Pharmaceutical Innovation on the 26th August, 2004. This presentation considers application performance of both molecular electronic structure and molecular simulation codes on a variety of cluster-based systems, comparing the performance of the Myrinet-based Xeon cluster from ClusterVision at the Bradford Institute of Pharmaceutical Innovation with a number of other clusters and high-end systems. Consideration is given to the CPMD, GAMESS-UK and NWChem electronic structure codes and to the DLPOLY and CHARMM simulation codes.
  • Application Performance on High-End and Commodity-class Computers contrasts the performance of Beowulf clusters built from commodity "off the shelf" components in the support of major research and production codes, with current high-end hardware such as the IBM SP/p690, Compaq AlphaServer SC and SGI Origin 3800. The results concentrate on the application area of computational chemistry. In addition to GAMESS-UK and NWChem, applications considered include those from computational chemistry (DLPOLY, DLMULTI and CHARMM), computational materials (CRYSTAL and the Car Parrinello codes, CPMD and CASTEP), and computational engineering (ANGUS and SBLI). Benchmark data on thirteen commodity-based systems (CS1-CS13) featuring Intel, AMD Athlon and Alpha CPU architectures coupled to traditional Beowulf interconnects, such as Myrinet and Ethernet, are presented. We include performance data on systems utilising the Quadrics QSNet and SCALI/SCI interconnect technologies. This extends the comparisons below through the inclusion of four new clusters (CS10-CS13) and additional applications (SBLI and DLMULTI). This presentation is also available in PDF format.
  • Communication Benchmarks are available for the commonly used MPI operations in computational chemistry codes (August 2003). Using the benchmarks written by Pallas, we provide data for MPI functions that describe point-to-point message-passing and global data movement. This presentation is an earlier version of that cited above, and is also available in PDF format.
  • The Promise and Challenge of High-performance Computing; presented at The Department of Computer Science, University of Cardiff (July 15, 2002). During the next decade, advances in computing technologies will increase the speed and capacity of computers, storage and networks by several orders of magnitude. At the same time, advances in theoretical and computational science will result in computational models of ever increasing complexity. In this talk we discuss the challenges that must be addressed if computational scientists and engineers are to harness the power offered by present and future high-performance computers to solve the most critical problems in science and engineering. Such a task will require close collaboration between disciplinary computational scientists, computer scientists and applied mathematicians. To illustrate issues central to this discussion, we provide an overview of the current status and trends in HPC. Our focus lies in applications and looks to address the differing demands from the fields of Capability and Capacity computing. An analysis is presented of current performance in the areas of molecular simulation and electronic structure, materials simulation, computational engineering and environmental modelling. This analysis (i) provides compelling evidence for the potential of commodity-based clusters in providing a cost-effective alternative for the provision of Capacity computing, and (ii) highlights the challenges in delivering on the promise of Capability computing, typified by the HPCx national computing service. This presentation is also available in PDF format.

  • Computational Chemistry Applications: Performance on High-End and Commodity-class Computers; presented at HPCS'2002, the 16th Annual International Symposium on High Performance Computing Systems and Applications (June 16-19, 2002) at Moncton, Canada. In this presentation we compare the performance of Beowulf clusters built from commodity "off the shelf" components in the support of major research and production codes, with current high-end hardware such as the IBM SP, Compaq AlphaServer SC and SGI Origin 3800. The results concentrate on the application area of computational chemistry. Benchmark data on nine commodity-based systems (CS1-CS9) featuring Intel, AMD Athlon and Alpha CPU architectures coupled to traditional Beowulf interconnects, such as Myrinet and Ethernet, are presented. Furthermore, we provide performance data on systems utilising the Quadrics QSNet and SCALI/SCI interconnect technologies, and initial results from a prototype of the Cray Supercluster and the IBM SP/Regatta-H nodes. This presentation is also available in PDF format.
  • Computational Chemistry Applications: Performance on High-End and Commodity-class Computers; presented at the fifth SCICOMP Meeting, SCICOMP 5 (May 7, 2002) at Daresbury Laboratory. This presentation compares the performance of a variety of clusters built from commodity "off the shelf" components in the support of major research and production codes, with current high-end hardware such as the IBM SP, Compaq AlphaServer SC and SGI Origin 3800. The results concentrate on the application area of computational chemistry. Benchmark data on nine commodity-based systems (CS1-CS9) featuring Intel IA32 and IA64, AMD Athlon and Alpha CPU architectures coupled to traditional Beowulf interconnects, such as Myrinet and Ethernet, are presented. Furthermore, we provide performance data on systems utilising both the Quadrics QSNet and SCALI/SCI interconnect technologies, together with initial results from the IBM SP/Regatta-H. This presentation is also available in PDF format.
  • Application Performance on High-End and Commodity-class Computers describes recent developments on Beowulf systems. This presentation (October 2001) outlines a variety of Pentium-, Athlon- and Alpha-based Beowulf systems (CS-1 to CS-6) under investigation at the Daresbury Laboratory and elsewhere. The major focus of this work is on applications, with the presentation providing performance comparisons between commodity-based systems and both MPP (the Cray T3E/1200E) and high-end SMP platforms. The latter include the SGI Origin 3800 (with both R14k-500 and R12k-400 CPUs), the Compaq AlphaServer SC (with 833 and 667 MHz CPUs), a prototype of the Cray SuperCluster (833 MHz EV67 CPUs) and the SP/WH2-375. In addition to GAMESS-UK, applications considered include those from computational chemistry (DLPOLY and CHARMM), computational materials (CRYSTAL and the Car Parrinello codes, CPMD and CASTEP), and computational engineering (ANGUS and FLITE3D). Copies of the associated report are in preparation; a previous description of this work is still available in both HTML and PDF format.