The INMOS transputer (the all-lowercase "transputer" was the official written form) was a pioneering concurrent computing microprocessor design of the 1980s from INMOS, a British semiconductor company based in Bristol. For some time in the late 1980s many considered the transputer to be the next great design for the future of computing.
Today, this interesting chip is largely forgotten. Whilst ultimately a commercial failure, the transputer architecture was highly influential in provoking new ideas in computer architecture, several of which have re-emerged in different forms in modern systems.
In the early 1980s, it appeared that conventional CPUs were reaching their performance limits. Until then, designers had been limited primarily by the amount of circuitry that manufacturing allowed them to place on a chip. But as the fabrication ("fabbing") process continued to improve, the problem became that chips could hold more circuitry than the designers knew how to use. Traditional CISC designs were reaching a performance plateau, and it wasn't clear that it could be surpassed.
It seemed that the only way forward was to increase the use of parallelism: several CPUs working together to solve several tasks at the same time. This depended on the machines in question being able to run several tasks at once, a process known as multitasking. Multitasking had generally been too difficult for previous CPU designs to handle, but more recent designs were able to support it effectively. It was clear that in the future this would be a feature of all operating systems.
A side effect of most multitasking designs is that they often also allow the processes to be run on physically different CPUs, in which case it is known as multiprocessing. A low-cost CPU built with multiprocessing in mind could allow the speed of a machine to be increased by adding more CPUs, potentially for far less money than moving to a single faster CPU design.
The transputer (the name combines "transistor" and "computer") was the first general-purpose microprocessor designed specifically to be used in parallel computing systems. The goal was to produce a family of chips ranging in power and cost that would then be wired together to form a complete computer. The name was selected to indicate the role the individual transputers would play: numbers of them would be used as basic building blocks, just as transistors had been earlier.
Originally the plan was to make the transputer cost only a few dollars per unit. INMOS saw them being used for practically everything, from operating as the main CPU for a computer, to acting as a channel controller for disk drives in the same machine. Spare cycles on any of these transputers could be used for other tasks, greatly increasing the overall performance of the machines.
Even a single transputer would have all the circuitry needed to work by itself, a feature more commonly associated with microcontrollers. The idea in this case was to allow the transputers to be connected together as easily as possible, without the requirement for a complex bus (or motherboard). Instead you simply supplied power and a simple clock signal. You did not have to provide RAM, a RAM controller, bus support or even an RTOS – these were all built in.
There were limits to the size of a system that could be built in this fashion. Since each transputer was linked to its neighbours in a fixed point-to-point layout, sending a message to a more distant transputer required the message to be forwarded by each chip along the way. This introduced a delay with every "hop" over a link, leading to long delays on large networks. To solve this problem INMOS also provided a zero-delay switch that connected up to 32 transputers (or switches) into even larger networks.
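The cost of forwarding can be illustrated with a minimal sketch. It assumes a hypothetical 8 x 8 mesh in which every node is linked only to its immediate north, south, east and west neighbours; the mesh shape and the function names `grid_links` and `hop_count` are illustrative and not taken from any INMOS system.

```python
from collections import deque

def grid_links(width, height):
    """Build a hypothetical width x height mesh: each node is linked
    point-to-point to its north/south/east/west neighbours only."""
    links = {}
    for x in range(width):
        for y in range(height):
            neighbours = []
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and 0 <= ny < height:
                    neighbours.append((nx, ny))
            links[(x, y)] = neighbours
    return links

def hop_count(links, source, target):
    """Breadth-first search: the fewest links a message must cross."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == target:
            return hops
        for nxt in links[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    raise ValueError("target not reachable")

mesh = grid_links(8, 8)
corner_to_corner = hop_count(mesh, (0, 0), (7, 7))
```

In this toy mesh a message between opposite corners crosses 14 links, and every intermediate node has to spend time relaying it, which is the per-hop delay described above.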
In order to include all this functionality on a single chip, the transputer's core logic was simpler than most CPUs. It used a RISC-based design, but unlike the more common register-heavy load-store RISC CPUs, the transputer was a stack-based system with only a few registers. This allowed for very fast context switching by simply changing the stack pointer to the memory used by another program (a technique used in a number of contemporary designs). The transputer also included three "normal" registers, but they were in fact mirrors of the top three stack positions, used to allow for zero-address instructions.
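The following is a toy model of that arrangement, not a description of the real hardware. It assumes three stack registers (called `a`, `b` and `c` here), a workspace pointer `wptr` into a flat memory, and a handful of operations named after mnemonics in the instruction tables; a real context switch involves more state than this sketch shows.

```python
class TinyStackMachine:
    """Toy model: a three-deep evaluation stack (a, b, c) plus a
    workspace pointer into memory holding each process's locals."""

    def __init__(self, memory_words=64):
        self.memory = [0] * memory_words
        self.a = self.b = self.c = 0   # top three stack positions
        self.wptr = 0                  # workspace pointer

    def push(self, value):
        # New value enters a; old a and b move down, old c is lost.
        self.a, self.b, self.c = value, self.a, self.b

    def ldc(self, constant):           # load constant
        self.push(constant)

    def ldl(self, offset):             # load local from the workspace
        self.push(self.memory[self.wptr + offset])

    def stl(self, offset):             # store local, popping a
        self.memory[self.wptr + offset] = self.a
        self.a, self.b = self.b, self.c

    def add(self):                     # zero-address: operands are implied
        self.a, self.b = self.b + self.a, self.c

    def switch_to(self, new_wptr):
        # Context switch in this toy: just repoint the workspace.
        self.wptr = new_wptr

m = TinyStackMachine()
m.switch_to(0)                         # "process 1" workspace
m.ldc(2); m.ldc(3); m.add(); m.stl(1)  # local 1 := 2 + 3
m.switch_to(16)                        # "process 2" workspace
m.ldc(40); m.stl(1)                    # its own local 1 := 40
m.switch_to(0)                         # back to process 1
m.ldl(1)                               # a now holds process 1's local
```

Because `add` names no registers or addresses, it needs no operand fields, and because each process keeps its locals in its own workspace, switching between them only changes one pointer in this model.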
The first 16 'primary' instructions were:
| Mnemonic | Description |
| --- | --- |
| J | Jump |
| LDLP | Load local pointer - load an address offset from the workspace |
| PFIX | Prefix - general way to extend the operand beyond the lower nibble |
| LDNL | Load non-local - load a value at an offset from the address at top of stack |
| LDC | Load constant |
| LDNLP | Load non-local pointer - load an address offset from top of stack |
| NFIX | Negative prefix - general way to negate (and possibly extend) the operand |
| LDL | Load local - load a value at an offset from the workspace |
| ADC | Add constant |
| CALL | Subroutine call |
| CJ | Conditional jump - depending on the value at top of stack |
| AJW | Adjust workspace |
| EQC | Equals constant |
| STL | Store local - store at an offset from the workspace |
| STNL | Store non-local - store at an address offset from top of stack |
| OPR | Operate - general way to extend the instruction set |
All of these instructions take a constant, representing an offset or an arithmetic constant. If the constant is between 0 and 15, the instruction is encoded in a single byte.
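How PFIX and NFIX build larger or negative constants can be sketched as follows. This is a minimal illustration that assumes each instruction byte holds a 4-bit function code in the upper nibble and 4 bits of operand in the lower nibble, and that the table above is listed in function-code order (J = 0x0 through OPR = 0xF). The helper names `encode` and `decode` are invented for this example.

```python
# Function codes, ASSUMING the table above is in opcode order.
PRIMARY = ["J", "LDLP", "PFIX", "LDNL", "LDC", "LDNLP", "NFIX", "LDL",
           "ADC", "CALL", "CJ", "AJW", "EQC", "STL", "STNL", "OPR"]
CODE = {name: i for i, name in enumerate(PRIMARY)}

def encode(mnemonic, operand):
    """Return the bytes for one instruction, adding PFIX/NFIX as needed."""
    if 0 <= operand < 16:
        prefix = b""                              # fits in one byte
    elif operand >= 16:
        prefix = encode("PFIX", operand >> 4)     # high bits go first
    else:
        prefix = encode("NFIX", (~operand) >> 4)  # complemented high bits
    return prefix + bytes([(CODE[mnemonic] << 4) | (operand & 0xF)])

def decode(code):
    """Yield (mnemonic, operand) pairs from a byte string."""
    operand = 0
    for byte in code:
        name = PRIMARY[byte >> 4]
        operand |= byte & 0xF                     # merge in the low nibble
        if name == "PFIX":
            operand <<= 4                         # make room for the next nibble
        elif name == "NFIX":
            operand = (~operand) << 4             # complement, then shift
        else:
            yield name, operand
            operand = 0                           # operand resets after use

small = encode("LDC", 5)        # one byte
large = encode("LDC", 0x754)    # two PFIX bytes, then LDC
negative = encode("LDC", -31)   # one NFIX byte, then LDC
```

Under these assumptions a small constant costs one byte, and each additional four bits of operand costs one more prefix byte, so frequently used small offsets stay compact.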
The first 16 'secondary' instructions (using the OPR primary instruction) were:
| Mnemonic | Description |
| --- | --- |
| REV | Reverse - swap the top two items of the stack |
| LB | Load byte |
| BSUB | Byte subscript |
| ENDP | End process |
| DIFF | Difference |
| ADD | Add |
| GCALL | General call - swap top of stack and instruction pointer |
| IN | Input |
| PROD | Product |
| GT | Greater than - the only comparison instruction |
| WSUB | Word subscript |
| OUT | Output |
| SUB | Subtract |
| STARTP | Start process |
| OUTBYTE | Output byte |
| OUTWORD | Output word |
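As a rough sketch of how OPR extends the instruction set: the operand of OPR selects the operation to perform. The example assumes that the secondary table is listed in operation-number order (REV = 0 through OUTWORD = 15), that OPR and PFIX have function codes 0xF and 0x2 from their positions in the primary table, and that operation numbers above 15 are reached with PFIX bytes. `encode_secondary` is a name invented here.

```python
# ASSUMPTION: the secondary table is listed in operation-number order.
SECONDARY = ["REV", "LB", "BSUB", "ENDP", "DIFF", "ADD", "GCALL", "IN",
             "PROD", "GT", "WSUB", "OUT", "SUB", "STARTP", "OUTBYTE",
             "OUTWORD"]

# ASSUMPTION: function codes taken from position in the primary table.
OPR, PFIX = 0xF, 0x2

def encode_secondary(operation_number):
    """Encode 'OPR n'; numbers above 15 get PFIX bytes in front."""
    if operation_number < 0:
        raise ValueError("operation numbers are non-negative")
    out = [(OPR << 4) | (operation_number & 0xF)]
    rest = operation_number >> 4
    while rest:
        # Each PFIX byte carries four more bits, most significant first.
        out.insert(0, (PFIX << 4) | (rest & 0xF))
        rest >>= 4
    return bytes(out)

add_bytes = encode_secondary(SECONDARY.index("ADD"))
later_op = encode_secondary(0x21)   # a hypothetical operation beyond the first 16
```

Under these assumptions each of the sixteen operations in the table still fits in a single byte, while less common operations cost an extra prefix byte.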
The initial occam development environment for the transputer was the INMOS D700 Transputer Development System (TDS). This was an unorthodox integrated development environment incorporating an editor, compiler, linker and (post-mortem) debugger. The TDS was itself a transputer application written in occam. The TDS text editor was notable in that it was a folding editor, allowing blocks of code to be hidden and revealed, to make the structure of the code more apparent. Unfortunately, the combination of an unfamiliar programming language and equally unfamiliar development environment did nothing for the early popularity of the transputer. Later, INMOS would release more conventional occam cross-compilers, the occam 2 Toolsets.
Implementations of more mainstream programming languages, such as C, FORTRAN and Pascal, were also later released by both INMOS and third-party vendors. These usually included language extensions or libraries providing, in a less elegant way, occam-like concurrency and channel-based communication.
The transputer's lack of support for virtual memory inhibited the porting of mainstream variants of the UNIX operating system, though ports of UNIX-like operating systems (such as Minix and Idris from Whitesmiths) were produced. An advanced UNIX-like distributed operating system, Helios, was also designed specifically for multi-transputer systems by Perihelion Software.
The first transputers were announced in 1983 and released in 1984.
In keeping with their role as microcontroller-like devices, they included on-board RAM and a built-in RAM controller which allowed more memory to be added without any additional hardware. Unlike other designs, the transputers did not include I/O lines; I/O was to be added with hardware attached to the existing serial links. There was one 'Event' line, similar to a conventional processor's interrupt line. The line was treated as a channel: a program could 'input' from the event channel, and would proceed only after the event line was asserted.
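The idea of treating an interrupt-like line as a channel can be mimicked in ordinary threaded code. The sketch below is only an analogy, using Python's `threading.Event` in place of the hardware line; the class `EventChannel` and its method names are hypothetical and do not correspond to any INMOS API.

```python
import threading

class EventChannel:
    """Toy stand-in for an event line treated as an input-only channel."""

    def __init__(self):
        self._asserted = threading.Event()

    def assert_line(self):
        # Hypothetical hardware side: the external line goes active.
        self._asserted.set()

    def input(self):
        # Software side: block until the line is asserted, then re-arm.
        self._asserted.wait()
        self._asserted.clear()

log = []
event = EventChannel()

def handler():
    log.append("waiting")
    event.input()              # proceeds only once the line is asserted
    log.append("event handled")

worker = threading.Thread(target=handler)
worker.start()
event.assert_line()
worker.join()
```

The handler reads like straight-line code that simply waits for its input, rather than a separate interrupt routine, which is the programming style the event channel was meant to allow.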
All transputers ran from an external 5 MHz clock input; this was multiplied to provide the processor clock.
The transputer did not include an MMU or a virtual memory system.
Transputer variants (excepting the cancelled T9000) can be categorised into three groups: the 16-bit T2 series, the 32-bit T4 series and the 32-bit T8 series with 64-bit IEEE 754 floating-point support.
An enhanced T810 was planned, which would have had more RAM, more, faster links, extra instructions and improved microcode, but this was cancelled around 1990.
INMOS also produced a variety of support chips for the transputer processors, such as the C004 32-way link switch and the C012 "link adapter" which allowed transputer links to be interfaced to an 8-bit data bus.
In the desktop/workstation world the transputer was fairly fast, operating at about 10 MIPS at 20 MHz. This was excellent performance for the early 1980s, but by the time the FPU-equipped T800 was shipping, other RISC designs had already surpassed it. This could have been mitigated to a large extent if machines used multiple transputers, but T800s cost about $400 each when introduced, so the price/performance ratio was not competitive. Few transputer-based workstation systems were designed, the most notable probably being the Atari Transputer Workstation.
The transputer was more successful in the field of massively parallel computing, where several vendors produced transputer-based systems in the late 1980s. These included Meiko (founded by ex-INMOS employees), Floating Point Systems, Parsytec and Parsys.
The T9000 used a five-stage pipeline for added speed. An interesting addition was the grouper, which would collect instructions out of the cache and group them into larger packages of 4 bytes to feed the pipeline faster. Groups then completed in a single cycle, as if they were single larger instructions working on a faster CPU.
The link system was upgraded to a new 100 MHz mode, but unlike the previous systems the links were no longer backward compatible. This new packet-based link protocol was called DS-Link and later formed the basis of the IEEE 1355 serial interconnect standard. The T9000 also added link routing hardware called the VCP (Virtual Channel Processor), which changed the links from point-to-point to a true network, allowing for the creation of any number of virtual channels on the links. This meant programs no longer had to be aware of the physical layout of the connections. A range of DS-Link support chips was also developed, including the C104 32-way crossbar switch and the C101 link adapter.
Long delays in the T9000's development meant that the faster load-store designs were already outperforming it by the time it was to be released. In fact it consistently failed to reach its own performance goal of beating the T800 by ten times; when the project was finally cancelled it was still achieving only about 36 MIPS at 50 MHz. The production delays gave rise to the quip that the best host architecture for a T9000 was an overhead projector.
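The size of the shortfall can be worked out from the figures quoted in this article. The calculation below assumes that the "about 10 MIPS at 20 MHz" figure given earlier is a fair baseline for the T800; that pairing is an assumption made for illustration, not a published benchmark.

```python
# Figures quoted in the article; using 10 MIPS as the T800 baseline
# is an ASSUMPTION made for this illustration.
baseline_mips, baseline_mhz = 10, 20
t9000_mips, t9000_mhz = 36, 50

target_mips = 10 * baseline_mips                  # "ten times" goal
achieved_speedup = t9000_mips / baseline_mips     # speed-up actually reached
per_clock_baseline = baseline_mips / baseline_mhz # instructions per cycle
per_clock_t9000 = t9000_mips / t9000_mhz

print(f"target {target_mips} MIPS, achieved {t9000_mips} MIPS "
      f"({achieved_speedup:.1f}x)")
print(f"instructions per clock: {per_clock_baseline:.2f} -> "
      f"{per_clock_t9000:.2f}")
```

On these numbers the T9000 reached roughly a third of its target, and most of that gain came from the higher clock rate rather than from more work per cycle.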
This was too much for INMOS, who didn't have the funding needed to continue development. By this time, the company had been sold to SGS-Thomson (now STMicroelectronics). SGS-Thomson's focus was the embedded systems market, and eventually the T9000 project was abandoned. However, a comprehensively redesigned 32-bit transputer intended for embedded applications, the ST20 series, was later produced, utilising some technology developed for the T9000. The ST20 core was incorporated into chipsets for set-top box and GPS applications.
Ironically, it was largely through additional internal parallelism that conventional CPU designs got faster. Instead of using a heavyweight explicit system like the transputer, modern CPU designs are parallel only at the instruction level: they examine the code being run and distribute the instructions they can be sure are independent across a number of internal arithmetic and storage units within the CPU core. This form of parallelism, known as superscalar execution, appears to be much more suitable for general-purpose computing.
Nevertheless, the model of multiple cooperating processors can be found in modern cluster computing systems and supercomputers. Unlike in the proposed transputer architecture, the processing units in these systems are similar to conventional computer servers, using CPUs with an internal superscalar architecture, access to substantial amounts of memory and often disk storage, and conventional operating systems and network interfaces. The software architecture used to marshal the cooperating software processes across the loosely-coupled processors in these systems is typically far more heavyweight than that proposed in the transputer architecture.
The nearest modern equivalent to the transputer link technology is the HyperTransport processor interconnection fabric designed by AMD. Although it is capable of message-passing, HyperTransport is, unlike the transputer link, generally used to implement a shared memory system that supports a traditional symmetric multiprocessing software architecture.
A recent intriguing development is the Cell processor architecture designed by Sony, Toshiba and IBM. Some of Sony's patent applications seem to show it as being designed to run distributed processes at a low level, in a similar way to that proposed in the transputer architecture. However, this aspect of the Cell design does not seem to have been used in the first implementation of the system, which appears to be more dedicated to using its abilities as a set of parallel DSP engines connected by DMA pipelines, under the control of a conventional core processor.
This article is licensed under the GNU Free Documentation License.
It uses material from the article "INMOS transputer".