Message-Id: <377E38AE.3127@startrekmail.com>
Date: Sat, 03 Jul 1999 11:22:06 -0500
From: Steven Merritt
Reply-To: smerr612@startrekmail.com
Organization: KETR
Mime-Version: 1.0
Newsgroups: comp.parallel.mpi
Subject: Is anyone working on a hybrid of farm/MPI topologies?
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Xref: ukc comp.parallel.mpi:5277

Ok, I may be using the wrong terminology, so feel free to correct it at any point. I may also have an incorrect grasp of the concepts, but I'll explain them the best I can as I understand them.

Farm style clustering. Each node is assigned a discrete task which does not rely on other processors/nodes keeping up, because it will finish its task and be ready to begin another task for that job or for another job. Once all the discrete tasks of any given job are completed, there is a final task of putting it all together. This topology is useful for jobs which require a lot of throughput instead of raw power. If you were using a cluster as a web server or a LAN server this topology would be ideal: each node could handle requests from a single source at any given time and not have to wait until the process is divided up and then watch all the other nodes to monitor them for completeness of its job.

MPI clustering. Every task is divided up and a certain amount of processing allotted to each node. Each node monitors the progress of each other node and (ideally) they all finish at the same time and put it back together. This is more useful for solving one HUGE problem which requires a lot of processing and isn't really as demanding of throughput.

An analogy I use is comparing batteries in either series or parallel. MPI would be connecting the batteries in series because you need their voltages to be cumulative to help you clear that particularly nasty patch of resistance. Farm topologies are like connecting batteries in parallel.
Each task the group will be asked to drive doesn't really overtax the potential of each individual part, but there are thousands, tens of thousands, millions of tasks. So each node adds extra resources (a separate bus, separate I/O channels, etc.) which allow these small tasks to be completed faster without competing for time on a system.

So what advantages would a hybrid have, assuming I have a decent grasp of the technologies? A hybrid would allow more diverse programs to be run on clusters. You could have a program which performs complex operations in one section _while_ it handles other features such as user interaction/network administration on the other nodes.

So we take a farm of about six units and connect it to a cluster of about ten other units for raw processing power. Now we either rewrite MPI or write our own administration program to dissect any jobs, send the less processor-intensive tasks to the single units, and send any complex tasks to the cluster.

How best to do this? I would try to write a new programming environment with specific tools to send directions on the complexity of each task to the administrative program. The administrative program would then funnel the tasks appropriately.

Problem I see with that: it would be outdated very fast. If I wrote a program which would be considered intensive for today's PCs and directed that part of it to the cluster, and technology advanced, as it always does, then I'd basically be wasting my time in the future, as the "intensive" part of my application would be using time on the cluster when it could easily be handled by one of the farm machines.

Another option would be to do some pre-processing on each program and have a program decide where each task goes by holding it against a set of standards determined by the system administrator (a person, not an administration program) based on the capabilities of the system.
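To make the pre-processing idea a bit more concrete, here is a minimal sketch of it in Python. Everything in it is an assumption made up for illustration: the names Task, route and dispatch, the "estimated_cost" figure, and the single numeric threshold standing in for the administrator's standards. None of it comes from MPI or any existing scheduler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A unit of work with a rough cost estimate (hypothetical, arbitrary units)."""
    name: str
    estimated_cost: float


def route(task, cluster_threshold):
    """Send a task to the cluster if it meets the admin-set threshold, else to the farm."""
    return "cluster" if task.estimated_cost >= cluster_threshold else "farm"


def dispatch(tasks, cluster_threshold):
    """Sort task names into a farm queue and a cluster queue."""
    queues = {"farm": [], "cluster": []}
    for task in tasks:
        queues[route(task, cluster_threshold)].append(task.name)
    return queues


if __name__ == "__main__":
    tasks = [
        Task("user_input", 1.0),
        Task("matrix_solve", 500.0),
        Task("log_write", 0.5),
    ]
    # Threshold as the administrator might set it for current hardware.
    print(dispatch(tasks, cluster_threshold=100.0))
    # After a hardware upgrade the administrator raises the threshold,
    # and the same unmodified program now runs entirely on the farm.
    print(dispatch(tasks, cluster_threshold=1000.0))
```

The point of keeping the threshold outside the program is that the program itself never says "this part is intensive"; only the administrator's setting does, so it can change as the machines change.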
This approach might be better in the long run because Object Oriented programming is really looking like the wave of the future, and each object could be evaluated for an estimated level of processing power needed and assigned to an individual node or the cluster as appropriate.

Problems with this approach: it is administrator intensive. It would require frequent updates depending on each system's capabilities, and special training so the admin can set it up for maximum efficiency.

So, is the flexibility to write programs which can handle a large variety of operations within a program designed for parallel processing worth the work? That's the question, and if someone is already working on doing this I'd be interested in knowing.

I'm going to suggest to our Beowulf project team at my university that we do something like this. It'll take a lot of work and may not work at all.

Steven