11th September 1995
Lecture room G22 (also known as the Pearson Lecture Theatre)
Pearson Building
University College London
Gower Street
London WC1E 6BT
Registration: 76.  Attendance: 70 (so 6 either did not show or arrived too late for our registration desk).
09:50 Introduction to the Day (Professor Peter Welch, University of Kent)
10:00 High performance compute + interconnect is not enough (Professor David May, University of Bristol)
10:40 Experiences with the Cray T3D, PowerGC, ... (Chris Jones, British Aerospace, Warton)
11:05 More experiences with the Cray T3D, ... (ABSTRACT) (Ian Turton, Centre for Computational Geography, University of Leeds)
11:30 Coffee
11:50 Experiences with the Meiko CS2, ... (ABSTRACT) (Chris Booth, Parallel Processing Section, DRA Malvern)
12:15 Problems of Parallelisation - why the pain? (ABSTRACT) (Dr. Steve Johnson, University of Greenwich)
13:00 Working Lunch (provided) [Separate discussion groups]
14:20 Language Problems and High Performance Computing (ABSTRACT) (Nick Maclaren, University of Cambridge Computer Laboratory)
14:50 Parallel software and parallel hardware - bridging the gap (ABSTRACT) (Professor Peter Welch, University of Kent)
15:30 Work sessions and Tea [Separate discussion groups]
16:30 Plenary discussion session
16:55 Summary
17:00 Close
These conclusions are summarised below through the answers reached in the final plenary session to the questions posed at the start of the workshop.
Yes. In some quarters this was severe enough to be causing serious economic embarrassment and a recommendation to think hard before moving into HPC (especially for non-traditional users - like geographers). Some felt that this disappointment partly results from over-selling by vendors, funders and local enthusiasts (e.g. "When the 40 Gflops (40 billion arithmetic operations per second), 256-processor Cray T3D is commissioned this Spring by the University's Computing Services it will be amongst the ten fastest computers in the world ...", from the "Supercomputer procurement - press release (3rd Feb 1994)"), which led to grossly raised expectations.
Yes. See below.
Yes. Supercomputers are a scarce resource (about two and a half machines in the UK are currently available to academics) and user queues are inevitable. Lower efficiency levels mean longer waiting times to get a job turned around - this is the real killer for users, over and above the actual execution time achieved.
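To make the arithmetic concrete, here is a minimal sketch (in Python; the peak rate and job size are assumptions chosen purely for illustration, not figures from the workshop) of how achieved efficiency feeds through into the machine-hours a job occupies - and hence into the queue behind it:

    # Illustrative only: the numbers are assumptions, not measurements from the workshop.
    peak_gflops = 40.0      # notional machine peak (e.g. a 256-processor MPP)
    work_gflop = 1.0e6      # total floating-point work in one job (a million GFLOP)

    def hours_to_run(efficiency):
        """Execution time in hours if the job sustains this fraction of peak."""
        sustained_gflops = peak_gflops * efficiency       # GFLOP/s actually delivered
        return work_gflop / sustained_gflops / 3600.0     # seconds converted to hours

    for eff in (0.40, 0.10):
        print(f"efficiency {eff:.0%}: {hours_to_run(eff):6.1f} machine-hours")

    # Dropping from 40% to 10% efficiency quadruples the machine-hours the job
    # occupies; on a saturated, queued machine the wait for a slot grows in
    # roughly the same proportion - the turn-around cost comes on top of the
    # longer execution time itself.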
Comparison against efficiency levels for PCs is irrelevant. PCs sit on your desk. We can afford to over-resource them so we have access to them immediately (a classic real-time constraint). If supercomputers were similarly available, no one would worry about efficiencies ... (except those wanting interactive response). There is one other difference. If you have a problem that is too large for a workstation on your desk, you can use a supercomputer. If you have a problem that is too large for a supercomputer, you can either wait for the next generation of supercomputer or improve your program's efficiency.
Efficiency also seems to be an easily measured benchmark against which funding agencies (EPSRC and Industry) are judging projects.
Having said this, there certainly exists a range of problems that can only be solved on the current MPP machines - for them, there is no choice but to live with the long turn-arounds and accept low efficiencies.
Parallel architecture had not picked up strongly enough on known problems (and their solutions) from the 1980s and had concentrated instead on peak MFLOP/s and MBYTE/s figures that users cannot attain in practice. Considerable frustration was expressed at those wasted opportunities - we should be doing much better than we are. This has to change.
Note that these conclusions are the reverse of the normally received wisdom (which says that parallel hardware is brilliant, but the parallel software infrastructure to support it lags behind and users' abilities to exploit what is on offer are weak). This workshop suggests that users have a natural affinity for the parallelism inherent in their applications, that sound and scalable models for expressing that parallelism exist, but that current parallel hardware lacks some crucial technical parameters that are necessary for the execution of those expressions at a worthwhile level of efficiency.
Nevertheless, it may be possible to influence such architecture (hardware and software) - in particular, latency and context switch times must move in line with computational performance and communications bandwidth. Machines designed from scratch - using the fastest commodity micro-cores for processor, memory, link and routing components and maintaining the correct balance as an overriding design constraint - would be easier to program, would yield high efficiencies and would come closer to being general purpose machines. Such machines would obtain huge leverage from well-behaved models of parallelism - not least through the automatic control of cache coherency, without the need for hardware or software run-time checks and remedies. It will be necessary to re-cast our application software to conform to those disciplines - for some, it will be necessary to re-write it. Failure to make such changes will leave HPC increasingly ineffective, which will be serious since the need for HPC looks set to increase. The technical knowledge to avoid this largely exists and can be developed considerably - if it is used, the future looks exciting and we can be optimistic.
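As a rough illustration of the balance point, a minimal sketch (Python; the 50 microsecond start-up cost and 100 MB/s bandwidth are assumed figures for illustration only) of the standard linear message-cost model t(n) = t0 + n/B shows why start-up latency, rather than raw bandwidth, dominates fine-grained communication:

    # Simple linear model of point-to-point message cost: t(n) = t0 + n / B.
    # All figures are illustrative assumptions, not measurements of any machine.
    t0_us = 50.0            # start-up latency per message, in microseconds
    bandwidth_mb_s = 100.0  # sustainable link bandwidth, in MB/s

    def effective_mb_per_s(n_bytes):
        """Bandwidth actually delivered to a message of n_bytes."""
        transfer_us = n_bytes / (bandwidth_mb_s * 1e6) * 1e6   # time on the wire
        total_us = t0_us + transfer_us
        return (n_bytes / 1e6) / (total_us / 1e6)

    for n in (1000, 10000, 100000, 1000000):
        print(f"{n:>8} byte message: {effective_mb_per_s(n):6.1f} MB/s delivered")

    # With a 50 us start-up cost, a 1 kB message sees about 17 MB/s of the
    # 100 MB/s link; only megabyte-sized messages approach the headline figure.
    # Doubling raw bandwidth without reducing the start-up cost does almost
    # nothing for fine-grained communication.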
On the political front, there is no immediate crisis but there is disappointment. Access to HPC facilities still exists, although at a lower scale than many had hoped.
At the engineering level, there is a crisis. There has been little or no progress in MPP architecture over the past 5 years as manufacturers and their clients have pursued obvious goals (MFLOP/s and MBYTE/s) and not emphasised the twiddly bits (low startup latencies and context switches, portable and scalable models of parallelism, prevention of cache incoherency, ...) that are necessary to make them work properly.
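The context-switch point can be illustrated in the same spirit: a minimal sketch (Python; the grain and switch times are assumed figures, not measurements of any machine) of the fraction of processor time lost when a process is descheduled at each communication point:

    # Illustrative assumptions only: no figures here are measurements of any machine.
    def overhead_fraction(grain_us, switch_us):
        """Fraction of processor time lost if every grain of useful work
        of length grain_us is followed by a context switch of length switch_us."""
        return switch_us / (grain_us + switch_us)

    for switch_us in (100.0, 1.0):              # heavyweight vs. lightweight scheduling
        for grain_us in (10.0, 100.0, 1000.0):  # fine to coarse computation grains
            lost = overhead_fraction(grain_us, switch_us)
            print(f"grain {grain_us:6.0f} us, switch {switch_us:5.1f} us: "
                  f"{lost:6.1%} of processor time lost")

    # A 100 us switch makes 10 us grains hopeless (over 90% of time lost), forcing
    # coarse-grained, hand-balanced code; a 1 us switch keeps the loss to about 9%
    # even at that grain, leaving the choice of grain to the application.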
The result of this lack of progress is real difficulty even for experienced users, the scale of whose applications gives them no choice but to accept the machines and live with the long turn-arounds. New users are discouraged from entering the fray, especially those from non-traditional HPC fields of application.
Herein lies the basis for a real political crisis that may soon be upon us. If the engineering problems are not resolved in the near future, pressure will build to close down (or, at least, not upgrade) existing HPC facilities - pressure that may be difficult to resist. Such pressure is already being felt in the USA, and it is not a comfortable feeling.
Educate, research, develop, publish and influence.
Teach high-level models of parallelism, independent of target architecture. Teach and research good models that scale, are efficient and can extract much more of the parallelism in the users' applications. Priorities are: correctness, efficiency and scalability, portability, reliability and clarity of expression. Maintenance of existing vector/serial codes is not relevant for the long term.
Be fundamental - don't be afraid to question the existing consensus, whether this be HPF, MPI, FP, CSP, BSP or whatever. Do not set up a single `centre of excellence' for the provision and dissemination of training and education in HPC.
Listen (and get manufacturers and funding organisations to listen) to real users. Don't go for raw performance (e.g. 1.8 TFLOP/s). Demand to know what will really be sustained and publish the answers and the reasons behind them. Do some real computer science on the performance and programmability of `grand challenge' machines (which may be difficult in the UK as the funding bodies for HPC and computer science seem to be entirely separate).
Don't necessarily expect to provide efficient HPC solutions for all problems that need them - some badly behaved ones may need to wait (these need to be characterised). Look to the embedded and consumer market for the base technologies of the future (e.g. video-on-demand servers and their supporting communications and switching) - influence and modify them to the special needs of HPC applications.
Don't just accept what is on offer from today's HPC - the hardware may have to be accepted, but the software access to it could be considerably improved.
Don't do nothing!
Review progress in 12 months - another workshop? Meanwhile, work through suitable Internet newsgroups.
Disseminate results and concerns through newsgroups and archives (e.g. the Internet Parallel Computing Archive at <URL:/parallel/groups/selhpc/crisis/>).
Move the discussion beyond the UK.