Joel McCormack

Joel McCormack is an American computer scientist who designed the NCR Corporation version of the p-code machine, which is a kind of stack machine popular in the 1970s as the preferred way to implement new computing architectures and languages such as Pascal and BCPL. The NCR design shares no common architecture with the Pascal MicroEngine designed by Western Digital but both were meant to execute the UCSD p-System.[1,2]

P-machine theory

Urs Ammann, a student of Niklaus Wirth, originally presented p-code in his PhD thesis (see Urs Ammann, On Code Generation in a Pascal Compiler, Software: Practice and Experience, Vol. 7, No. 3, 1977, pp. 391–423). The central idea is that a complex software system is coded for a fictitious, minimal computer, or virtual machine, and that this computer is realized on specific real hardware by an interpreter that is typically small, simple, and quickly developed. Because Pascal implementations otherwise had to be rewritten for every new computer being acquired, Ammann proposed writing the system once for a virtual architecture. The successful academic implementation of Pascal was the UCSD p-System, developed by Kenneth Bowles, a professor at UCSD, who set out to build a universal Pascal programming environment on the P-machine architecture for the many different computing platforms in use at that time. McCormack was part of a team of undergraduates working on the project.[3] He took this familiarity and experience with him to NCR.

P-machine design

NCR hired McCormack directly out of college. The company had previously developed a bit-sliced hardware implementation of a p-code machine using AMD's AM2900 chipset. Numerous timing and performance problems plagued that machine, so McCormack proposed a redesign of the processor with a microsequencer based on programmable logic. When McCormack left NCR to start Volition Systems, he continued his work on the processor as a contractor.

This new CPU used horizontal microcode, which radically enhanced parallelism within the microarchitecture. The wide, 80-bit microwords allowed the CPU to perform many operations in a single microcycle: the processor could do an arithmetic operation while also performing a memory read into the internal stack, or transfer the contents of a register while at the same time reading new data into the ALU. As a result, many of the simpler p-code operations took only one or two microinstructions; some operations were constructed with tight, single-microword loops.

Two bits per clock selected one of four cycle times for each instruction; cycle times of 130, 150, and 175 nanoseconds were generated with a delay line. Faster parts from AMD would also have allowed a 98 ns cycle time, but there was no correspondingly faster branch control unit. A separate prefetch and instruction-formatting unit also used delay lines to generate asynchronous timing signals. This unit had a 32-bit buffer and could decode the next data in several formats: signed byte; unsigned byte; word; and a compressed "big" format, which encoded small numbers in 0..127 in a single byte and larger numbers in 128..32767 in two.
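The "big" operand format can be illustrated with a short decoder. This is a minimal sketch in Python, not a description of the hardware logic. It assumes that the high bit of the first byte marks the two-byte form and that the remaining seven bits are the high-order part of the value; that layout fits the ranges given above but is not spelled out in the text. The names decode_big and encode_big are hypothetical.

```python
def decode_big(stream, pos=0):
    """Decode one "big" operand from a bytes-like object.

    Returns (value, next_pos). Assumed layout (not confirmed by the text):
      first byte 0..127   -> one-byte form, the value is the byte itself
      first byte 128..255 -> two-byte form, the low 7 bits of the first byte
                             are the high part, the second byte the low part
    """
    first = stream[pos]
    if first < 128:
        return first, pos + 1
    second = stream[pos + 1]
    return ((first & 0x7F) << 8) | second, pos + 2


def encode_big(value):
    """Inverse of decode_big under the same assumed layout."""
    if not 0 <= value <= 32767:
        raise ValueError("big operands cover 0..32767 only")
    if value < 128:
        return bytes([value])
    return bytes([0x80 | (value >> 8), value & 0xFF])
```

Under this assumption a small constant such as 5 costs one byte in the instruction stream, while 300 costs two, which is the space saving the format was designed for.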

An on-board stack of 1024 16-bit words held temporary values (scalars as well as sets). The stack addresses ran downwards, with the stack pointer decrementing before a write and incrementing after a read. A register in the AMD 2901's internal file held the top-of-stack value to accelerate simple operations. Integer addition took only a single microinstruction cycle: since one operand was always in the register file, only one fetch from stack memory was needed.
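The effect of caching the top of the stack in a register can be shown with a small model that counts stack-memory accesses. This is an illustrative sketch only: the class name CachedStack and its counters are hypothetical, and the model ignores timing and the parallelism of the real microword.

```python
class CachedStack:
    """Downward-growing stack whose top word lives in a register (tos)."""

    def __init__(self, size=1024):
        self.mem = [0] * size   # on-board stack memory, 16-bit words
        self.sp = size          # empty stack: one past the highest word
        self.tos = 0            # top-of-stack register
        self.reads = 0          # stack-memory reads performed
        self.writes = 0         # stack-memory writes performed

    def _write(self, value):
        self.sp -= 1            # decrement before a write
        self.mem[self.sp] = value & 0xFFFF
        self.writes += 1

    def _read(self):
        value = self.mem[self.sp]
        self.sp += 1            # increment after a read
        self.reads += 1
        return value

    def push(self, value):
        # Spill the old top to memory, then load the register.
        # The very first push spills the register's initial 0.
        self._write(self.tos)
        self.tos = value & 0xFFFF

    def add(self):
        # One operand is already in tos, so a single memory read suffices.
        self.tos = (self.tos + self._read()) & 0xFFFF


s = CachedStack()
s.push(2)
s.push(3)
s.reads = s.writes = 0          # count only the addition itself
s.add()
```

After the addition, s.tos holds 5 and the counters show one read and no writes. A stack kept entirely in memory would need two reads and one write for the same operation.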

Each wide control word could either hold the address of the next microinstruction or cause the next p-machine instruction to be fetched. Thus, the microsequencer could jump almost arbitrarily within the control code. The first 256 microinstructions in memory corresponded to the p-machine instructions, so the microassembler placed the first control word of each instruction in its corresponding location. P-code instructions that took multiple microinstructions to execute could not start with a branch, as that field was already used to jump to the rest of the microprogram for the instruction.[citation needed]
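The placement rule can be sketched as a toy layout pass. This is a simplified illustration, not the actual NCR microassembler: it assumes every microword carries exactly one next-address field and no conditional branches, and the names layout and FETCH are hypothetical.

```python
FETCH = "fetch"   # hypothetical marker: "fetch the next p-code instruction"


def layout(routines):
    """Lay out micro-routines in a control store.

    routines maps a p-code opcode (0..255) to a list of micro-operations.
    Returns a list of (micro_op, next_address) pairs. The first word of
    each routine lands at the address equal to its opcode; any further
    words are appended after location 255 and chained together through
    the next-address field.
    """
    store = [None] * 256
    for opcode, ops in routines.items():
        if not 0 <= opcode < 256:
            raise ValueError("opcode must fit in one byte")
        addr = opcode
        for i, op in enumerate(ops):
            if i == len(ops) - 1:
                store[addr] = (op, FETCH)      # last word returns to fetch
            else:
                nxt = len(store)               # next free location
                store.append(None)
                store[addr] = (op, nxt)        # this field is now taken
                addr = nxt
    return store


store = layout({5: ["a", "b", "c"], 7: ["x"]})
```

Here the three-word routine for opcode 5 occupies locations 5, 256, and 257, and the next-address field of its first word is consumed by the jump to location 256. That is why a multi-word routine cannot specify its own branch in its first microinstruction.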

P-machine architecture

The CPU kept the top word of the stack in one of the AMD 2901 registers, which often saved a microinstruction. The examples that follow show a few p-codes as they were finally implemented. In them, tos and q are registers, and "|" separates activities that run in parallel in a single cycle. (The stack doesn't operate quite as written: the stack pointer is decremented before data is written and incremented after data is read.)

Since the next-address control and the next microcode location were in each wide microword, there was no penalty for executing the microcode in any order. There was a table of 256 labels, and the microcode compiler moved the first instruction at each of those labels to the first 256 locations of microcode memory. The only restriction this placed on the microcode was that if a p-code required more than one microinstruction, the first microinstruction couldn't specify any flow control (as that field would be filled in with a "goto <rest of microcode for p-code>").

fetch   % Fetch and save in an AMD register the next byte opcode from
        % the prefetch unit, and go to that location in the microcode.
        q := ubyte | goto ubyte

SLDCI   % Short load constant integer (push opcode byte)
        % Push top-of-stack AMD register onto real stack, load
        % the top-of-stack register with the fetched opcode that got us here
        dec(sp) | stack := tos | tos := q | goto fetch

LDCI    % Load constant integer (push opcode word)
        % A lot like SLDCI, except fetch 2-byte word and "push" on stack
        dec(sp) | stack := tos | tos := word | goto fetch

SLDL1   % Short load local variable at offset 1
        % mpd0 is a pointer to local data at offset 0. Write appropriate
        % data address into the byte-addressed memory-address-register
        mar := mpd0+2
        % Push tos, load new tos from memory
SLDX    dec(sp) | stack := tos | tos := memword | goto fetch

LDL     % Load local variable at offset specified by "big" operand
        r0 := big
        mar := mpd0 + r0 | goto sldx

INCR    % Increment top-of-stack by big operand
        tos := tos + big | goto fetch

ADI     % Add two words on top of stack
        tos := tos + stack | inc(sp) | goto fetch

EQUI    % Top two words of stack equal?
        test tos - stack | inc(sp)
        tos := 0 | if ~zero goto fetch
        tos := 1 | goto fetch
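The behaviour of these micro-routines can be mimicked in software. This is a rough sketch, not a cycle-accurate model. It assumes that opcodes 0..127 are the SLDCI forms (the opcode byte itself is the constant pushed), that the LDCI word is stored low byte first, and that the "big" operand uses the high bit of its first byte to mark the two-byte form. The opcode numbers for LDCI, INCR, ADI, and EQUI, the HALT opcode, and the function name run are placeholders chosen for this sketch.

```python
# Placeholder opcode numbers for this sketch only.
LDCI, INCR, ADI, EQUI, HALT = 199, 162, 130, 195, 255


def run(code):
    """Interpret a byte string of p-code and return the final tos value."""
    stack = [0] * 1024          # on-board stack memory
    sp = len(stack)             # stack grows downwards
    tos = 0                     # top-of-stack register
    pc = 0                      # position in the prefetch stream

    def push_mem(value):
        nonlocal sp
        sp -= 1                 # dec(sp) | stack := value
        stack[sp] = value

    def pop_mem():
        nonlocal sp
        value = stack[sp]       # read stack | inc(sp)
        sp += 1
        return value

    while True:
        q = code[pc]            # fetch: q := ubyte | goto ubyte
        pc += 1
        if q < 128:             # SLDCI: push the opcode byte itself
            push_mem(tos)
            tos = q
        elif q == LDCI:         # push a 2-byte word (low byte first, assumed)
            word = code[pc] | (code[pc + 1] << 8)
            pc += 2
            push_mem(tos)
            tos = word
        elif q == INCR:         # tos := tos + big
            big = code[pc]
            pc += 1
            if big >= 128:      # two-byte "big" form (assumed layout)
                big = ((big & 0x7F) << 8) | code[pc]
                pc += 1
            tos = (tos + big) & 0xFFFF
        elif q == ADI:          # tos := tos + stack | inc(sp)
            tos = (tos + pop_mem()) & 0xFFFF
        elif q == EQUI:         # 1 if the top two words are equal, else 0
            tos = 1 if tos == pop_mem() else 0
        elif q == HALT:
            return tos
        else:
            raise ValueError(f"opcode {q} is not modelled in this sketch")


result = run(bytes([2, 3, ADI, 5, EQUI, HALT]))
```

The sample program pushes 2 and 3, adds them, pushes 5, and compares, leaving 1 in tos. Each branch of the loop corresponds to one routine in the listing, with the parallel "|" activities written out sequentially.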

This architecture should be compared to the original P-code machine specification as proposed by Niklaus Wirth.

P-machine performance

The end result was a 9"x11" CPU board that ran the UCSD p-System faster than anything else, by a wide margin. It was 35 to 50 times faster than the LSI-11 interpreter, and 7 to 9 times faster than the Western Digital Pascal MicroEngine, which had been built by replacing the LSI-11 microcode with p-code microcode. It also ran faster than Niklaus Wirth's Lilith machine, though it lacked that machine's bit-mapped graphics capabilities, and ran at around the same speed as a VAX-11/750 running native code. (But the VAX was hampered by the poor code coming out of the Berkeley Pascal compiler, and was also a 32-bit machine.)

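Education

  • University of California, San Diego, BA, 1978
  • University of California, San Diego, MS, 1979

Later employment

  • Digital Equipment Corporation
  • Compaq Computer Corporation
  • Hewlett-Packard
  • Nvidia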

Publications

  • Joel McCormack, Robert McNamara. Efficient and Tiled Polygon Traversal Using Half-Plane Edge Functions, to appear as Research Report 2000/4, Compaq Western Research Laboratory, August 2000. [Superset of Workshop paper listed immediately below.]
  • Joel McCormack, Robert McNamara. Tiled Polygon Traversal Using Half-Plane Edge Functions, Proceedings of the 2000 EUROGRAPHICS/SIGGRAPH Workshop on Graphics Hardware, ACM Press, New York, August 2000, pp. 15–21.
  • Robert McNamara, Joel McCormack, Norman P. Jouppi. Prefiltered Antialiased Lines Using Half-Plane Distance Functions, Research Report 98/2, Compaq Western Research Laboratory, August 2000. [Superset of Workshop paper listed immediately below.]
  • Robert McNamara, Joel McCormack, Norman P. Jouppi. Prefiltered Antialiased Lines Using Half-Plane Distance Functions, Proceedings of the 2000 EUROGRAPHICS/SIGGRAPH Workshop on Graphics Hardware, ACM Press, New York, August 2000, pp. 77–85.
  • Joel McCormack, Keith I. Farkas, Ronald Perry, Norman P. Jouppi. Simple and Table Feline: Fast Elliptical Lines for Anisotropic Texture Mapping, Research Report 99/1, Compaq Western Research Laboratory, October 1999. [Superset of SIGGRAPH paper listed immediately below.]
  • Joel McCormack, Ronald Perry, Keith I. Farkas, Norman P. Jouppi. Feline: Fast Elliptical Lines for Anisotropic Texture Mapping, SIGGRAPH 99 Conference Proceedings, ACM Press, New York, August 1999, pp. 243–250.
  • Joel McCormack, Robert McNamara, Christopher Gianos, Larry Seiler, Norman P. Jouppi, Ken Correll, Todd Dutton, John Zurawski. Neon: A (Big) (Fast) Single-Chip 3D Workstation Graphics Accelerator, Research Report 98/1, Compaq Western Research Laboratory, Revised July 1999. [Superset of Workshop and IEEE Neon papers listed immediately below.]
  • Joel McCormack, Robert McNamara, Christopher Gianos, Larry Seiler, Norman P. Jouppi, Ken Correll, Todd Dutton, John Zurawski. Implementing Neon: A 256-bit Graphics Accelerator, IEEE Micro, Vol. 19, No. 2, March/April 1999, pp. 58–69.
  • Joel McCormack, Robert McNamara, Christopher Gianos, Larry Seiler, Norman P. Jouppi, Ken Correll. Neon: A Single-Chip 3D Workstation Graphics Accelerator, Proceedings of the 1998 EUROGRAPHICS/SIGGRAPH Workshop on Graphics Hardware, ACM Press, New York, August 1998, pp. 123–132. [Voted Best Paper/Presentation.]
  • Joel McCormack, Robert McNamara. A Smart Frame Buffer, Research Report 93/1, Digital Equipment Corporation, Western Research Laboratory, January 1993. [Superset of USENIX paper listed immediately below.]
  • Joel McCormack, Robert McNamara. A Sketch of the Smart Frame Buffer, Proceedings of the 1993 Winter USENIX Conference, USENIX Association, Berkeley, January 1993, pp. 169–179.
  • Joel McCormack. Writing Fast X Servers for Dumb Color Frame Buffers, Research Report 91/1, Digital Equipment Corporation, Western Research Laboratory, February 1991. [Superset of the Software: Practice and Experience paper listed immediately below.]
  • Joel McCormack. Writing Fast X Servers for Dumb Color Frame Buffers, Software: Practice and Experience, Vol 20(S2), John Wiley & Sons, Ltd., West Sussex, England, October 1990, pp. 83–108. [Translated and reprinted in the Japanese edition of UNIX Magazine, ASCII Corp., October 1991, pp. 76–96.]
  • Hania Gajewska, Mark S. Manasse, Joel McCormack. Why X is Not Our Ideal Window System, Software: Practice and Experience, Vol 20(S2), John Wiley & Sons, Ltd., West Sussex, England, October 1990, pp. 137–171.
  • Paul J. Asente and Ralph R. Swick, with Joel McCormack. X Window System Toolkit: The Complete Programmer's Guide and Specification, X Version 11, Release 4, Digital Press, Maynard, Massachusetts, 1990.
  • Joel McCormack, Paul Asente. An Overview of the X Toolkit, Proceedings of the ACM SIGGRAPH Symposium on User Interface Software, ACM Press, New York, October 1988, pp. 46–55.
  • Joel McCormack, Paul Asente. Using the X Toolkit, or, How to Write a Widget. Proceedings of the Summer 1988 USENIX Conference, USENIX Association, Berkeley, June 1988, pp. 1–14.
  • Joel McCormack. The Right Language for the Job. UNIX Review, REVIEW Publications Co., Renton, Washington, Vol. 3, No. 9, September 1985, pp. 22–32.
  • Joel McCormack, Richard Gleaves. Modula-2: A Worthy Successor to Pascal, BYTE, Byte Publications, Peterborough, New Hampshire, Vol. 8, No. 4, April 1983, pp. 385–395.

See also

  • UCSD p-System
  • p-code machine
  • Pascal MicroEngine

References

  1. The Pascal Users' Group Newsletter Archive
  2. The UCSD P-system Museum
  3. The UCSD Pascal Reunion website
