White Paper Title: Comprehensive Modeling of Large Scale Computation Systems Authors Names: JoAnn M. Paul and Craig S. Sims Titles: Visiting Assistant Professor and Professor Department: Electrical and Computer Engineering Affiliation: West Virginia University Postal Address: P.O. Box 6104, Morgantown, WV 26506 email addresses: {jpaul,sims}@ece.wvu.edu phone numbers: 304/293-6371 ext. 519 and ext. 523 fax number: 304/293-8602 Large-scale complexity due to multiple forms of parallelism and rapid proliferation of computation systems rivals that of more naturally occurring systems such as physical, biological, and economic. Adequate understanding for the purposes of prediction, design, and optimization of parallel computation systems (including computer networking) requires investigation of the novel application of existing modeling techniques and the identification of novel methods in large-scale comprehensive systems modeling. Consider the paradigms most fundamental and critical to adequately model the behavior and usage of computation systems. The paradigms include (but are not limited to): hierarchy, heterogeneity, dynamics, complexity, and large scale. These paradigms, or microtheories, have mathematical representation in systems theory. The challenge is in the parameterization and integration of the paradigms into a unified paradigm. Among the most critical challenges in verification of existing modeling techniques against existing performance modeling data is the lack of an adequate collection of performance data required to completely characterize a system by all parameters of interest. Existing performance models tend to be highly platform-specific and do not readily fit into a unified paradigm. But in the absence of a comprehensive, unified paradigm, how do we know what types of performance data are most appropriate? A comprehensive model will provide the necessary mathematical basis or abstraction to prompt additional investigations into the gathering of the most appropriate performance data. Consider compiler abstractions as models of SISD computation. In many instances, compiler interfaces cannot replace assembly code development. In most cases however, models of computation as provided by compiler abstractions have been the driving force in allowing computation to become useful on a widespread basis. The compiler abstraction is likely more important than the underlying hardware mechanisms. Compilers were initially provided to allow more effective use of given ISAs (Instruction Set Architectures) and ease of programming to the general user. At present, new ISA designs must be considered with respect to effects on the compiler. The development of block-structured and goto-less programming has made pipelined instruction sets possible and useful. Without the abstraction of the compiler, the interface to the underlying hardware mechanism would be cumbersome and would result in rudimentary forms of disjointed analytical techniques. And yet, parallel computing continues to be analyzed in that way. As computer systems grow in size and complexity, the most useful abstractions under some circumstances may not be in software (sequential or language-based). One analogy is drawn from physics. Is light a particle or a wave? Both models are appropriate. Is a network of computers most fundamentally represented as interconnected digital programs and digital communications, or is the mathematical behavior more typical of large-scale analog systems? They are appropriately modeled both ways. The most comprehensive models may not be language-based. Parallel programs are not simply an extension of sequential paradigms to multiple execution units. Language by its very nature is sequential. The addition of parallel constructs to an otherwise sequential computer language facilitates simultaneous description of events with efficacy similar to that with which simultaneous events can be expressed by human language. While humans be may be able to see and hear simultaneously, humans cannot express more than one thought at a time in language. True forms of parallelism have no inherent order, and multiple forms of description are required for the multiple forms of parallelism that can occur. Mathematical models can inherently represent parallelism. Among the most revealing characterization of high performance computers are the numerous descriptions of computer "systems". An implicit recognition is made in the language that computers have evolved in complexity into systems and merit study and description similar to other forms of systems. Systems theory is rich in abstractions for parallel systems. Parallelism and dynamics are inherent in systems theory. For optimization of computation systems, natural questions that arise include; What existing formulations in systems theory, if any, apply? What are the parameters of interest? How should the parameters be represented in a mathematical model or models? How are multiple models inter-related? How well do the models predict performance? Can the models be optimized? What are the performance metrics for optimization? How is parallel programmability (mapping) possible by configuring the resultant model of the parallel computation system? The application of existing techniques in optimization to parallel computation requires an understanding of {1} the mathematical abstractions, {2} the application of those abstractions to parallel algorithms and architectures. Computer Architects and Software Engineers have a general understanding of the "parameters" or "factors" which contribute to ultimate performance of a parallel architecture. A partial list includes; processor speed, communications bandwidth, number of processors, number of communications links per CPU, size of problem mapped to each processor (granularity), memory speed, memory hierarchy (including cache), cache size, cache hit rate, inter-task overhead, inter-connection or communications strategy (a.k.a. topology) which can include; shared memory (bus), I/O (bus), point-to-point, or network. Secondary effects of each of these fundamental parameters or factors include hierarchical methodologies (processor and memory), inter-task vs. inter-processor overhead, total networked capacity, cache update policy, shared memory access policy, routing policy, operating system, compiler, languages and problem mapping. The characterization of any parallel process requires simultaneous representation of {1} independent production capability, {2} interaction of each parallel process in the formation of a system, and {3} net effects on each otherwise independent process of the required interaction with all other processes in the formation of a system. A natural representation of such a representation is in the form of a matrix-based input/output model. A matrix-based model inherently provides a mathematical abstraction of simultaneity, while the abstraction of simulteneity is difficult to provide in language-based models. This research is based upon: {1} the collaboration between researchers with an understanding of the computer architecture factors and researchers who are experts in large-scale systems theory and {2} the integration of multiple mathematical models of parallelism into a single abstraction. While computer science grew as a discipline out of mathematics and while systems theorists as engineers also grew out of the application of mathematics, there is a surprisingly small amount of true collaboration between systems theorists and computer architects. REFERENCES Cheng, Gang and Geoffrey C. Fox. "Integrating Multiple Parallel Programming Paradigms in a Dataflow-Based Software Environment," Concurrency: Practice and Experience. Vol. 8, No. 9, pp. 667-684, November, 1989. Grossglauser, M. and J-C. Bolot. "On the Relevance of Long-Range Dependence in Network Traffic," Proceedings of ACM SIGCOMM'96 Conference, Applications, Technologies, Architectures, and Protocols for Computer Communications, (ACM) pp. 15-24. 1996. Hennessy, J.L and D.A. Patterson. Computer Architecture: A Quantitative Approach 2nd Ed. Morgan Kaufmann. 1996. Hewitt, Carl. "Offices Are Open Systems," ACM Transactions on Office Information Systems. Vol. 4, No. 3, July 1986, pp. 271-287. Inverardi, Paola and Corrado Priami. "Automatic Verification of Distributed Systems: The Process Algebra Approach," Formal Methods in System Design. Vol. 8, No. 1, pp. 7-38, January, 1996. Jacob, Bruce L, Peter M. Chen, Seth R. Silverman, and Trevor N. Mudge. "An Analytical Model for Designing Memory Hierarchies," IEEE Transactions on Computers. Vol. 45, No. 10, pp. 1180-1194, October, 1996. Kuo, Benjamin C. Digital Control Systems. Orlando: Harcourt, Brace, Jovanovich. 1992. Lee, Soo-Young and Kyung Geun Lee. "Synchronous and Asynchronous Parallel Simulated Annealing with Multiple Markov Chains," IEEE Transactions on Parallel and Distributed Systems. Vol. 7, No. 10, pp. 993-1008, October, 1996. Leontief, WW. The Structure of the American Economy 1919-1939, London: Oxford University Press, 1951. Mickle, MH and JM Paul. "Heterogeneous Granularity Distribution of Continuum Problems," Journal of Parallel and Distributed Computing, Vol. 39. pp. 66-73, November 1996. Mickle, Marlin H. and William G. Vogt, "On the Dynamics of a Class of Optimum Routing Problems," Journal of the Franklin Institute, Vol. 300, No. 1, 1975. Paul, JoAnn M. and Marlin H. Mickle. "A Matrix-Based Scalability Metric for Homogeneous and Heterogeneous Computer Architecture," accepted for publication in the journal of Computers in Education. Paul, JoAnn M. and Marlin H. Mickle. "The Dynamics of Multitasking-Multiprocessing Computing," Journal of the Franklin Institute, Vol. 332B, No. 2, pp. 237-245, March 1995. Paul, J.M. and M.H. Mickle. "Multiprocessor shared memory access and rewards," accepted for publication in the Journal of the Franklin Institute. Patsis, Nikos T., Chun-Hung Chen, and Michael E. Larson. "SIMD Parallel Discrete-Event Dynamic System Simulation," IEEE Transactions on Control Systems Technology, Vol. 5, No. 1, pp. 30-41, January, 1997. Sims, C.S., "Combined Trajectory Shaping and Optimization for Switched Systems," Proceedings of the Seventh International Conference on Systems Engineering, Las Vegas, NV, July 18-20, 1990. Sims, C.S., "Optimal Tracking of a Markov Jump Process," Proceedings of the 11th Asilomar Conference on Circuits, Systems, and Computers, November 7-9, 1977. Stoten, D.P. and M. Di Bernardo. "Application of the minimal control synthesis algorithm to the control and synchronization of chaotic systems," International Journal of Control, Vol. 65, No. 6, pp. 925-938, 1996. Torrie, Evan, Margaret Martonosi, Chau-Wen Tseng, and Mary W. Hall. "Characterizing the Memory Behavior of Compiler-Parallelized Applications," IEEE Transactions on Parallel and Distributed Systems, Vol. 7, No. 12, pp. 1224-1237, December, 1996. Vogt, WG, MH Mickle, and H Aldermeshian. "A Dynamic Leontief Model for a Productive System", Proceedings of the IEEE. Vol 63, No. 3, pp. 438-443, March, 1975.