9.1.8.1 Nick Carriero - Yale Computer Science Department

Interviewee: Nicholas Carriero, PhD
Job: Research Scientist
Organisation: Computer Science, Yale University
Interviewer: Thomas Haigh
Date: Wednesday 14th September 1994
Location: Yale University
Revision: 1

Nicholas Carriero's work is in the design, implementation and use of parallel systems. He has been most closely associated with the 'Linda' project, a 'software system that creates the illusion of a shared memory resource' for process communication. The group works on a range of 'follow-on' projects, one of which is 'adaptive parallelism'. This involves the harnessing of the largely idle processors in networked workstations by providing 'software resources' so that a large computation can be spread over the network to idle processors, which will be 'relinquished' when their primary users need them. This project is of great interest to 'virtually all large research centres' with an investment in workstations.

'Typically any organisation, whether it's a university or a government research lab or a commercial research lab, enjoys certain economies of scale from buying large numbers of pretty much the same kind of workstation and they have to pick workstations that more or less meet the peak demands of the few high-end users and so otherwise just aren't being used to their capacity.'
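The shared-memory illusion and the idea of handing work to whichever processors happen to be free can be sketched together. The following is a minimal illustration only, not the group's software: the class `TupleSpace`, its methods `out`, `in_` and `rd`, and the helper `run_bag_of_tasks` are hypothetical names chosen for this example, and threads in one process stand in for workstations on a network. Workers pull tasks from a shared 'bag', so a worker that is withdrawn simply stops pulling and the remaining tasks stay available to the others.

```python
import threading


class TupleSpace:
    """Toy in-process tuple space (illustrative only).

    A None in a pattern acts as a wildcard for that position.
    """

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, *tup):
        """Deposit a tuple into the shared space."""
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def _find(self, pattern):
        for tup in self._tuples:
            if len(tup) == len(pattern) and all(
                p is None or p == v for p, v in zip(pattern, tup)
            ):
                return tup
        return None

    def in_(self, *pattern):
        """Block until a matching tuple exists, then remove and return it."""
        with self._cond:
            while True:
                tup = self._find(pattern)
                if tup is not None:
                    self._tuples.remove(tup)
                    return tup
                self._cond.wait()

    def rd(self, *pattern):
        """Block until a matching tuple exists, then return it without removing it."""
        with self._cond:
            while True:
                tup = self._find(pattern)
                if tup is not None:
                    return tup
                self._cond.wait()


def worker(space):
    """Repeatedly take a task from the bag and deposit its result."""
    while True:
        _, n = space.in_("task", None)
        if n is None:  # stop marker (an assumption of this sketch)
            return
        space.out("result", n, n * n)


def run_bag_of_tasks(numbers, n_workers=3):
    """Square each number using workers that coordinate only through the space."""
    space = TupleSpace()
    threads = [threading.Thread(target=worker, args=(space,)) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for n in numbers:
        space.out("task", n)
    results = {}
    for _ in numbers:
        _, n, squared = space.in_("result", None, None)
        results[n] = squared
    for _ in threads:
        space.out("task", None)  # one stop marker per worker
    for t in threads:
        t.join()
    return results
```

The producer and the workers never refer to one another directly; they only deposit and withdraw tuples. That decoupling is what makes it plausible, at least in principle, to add or remove workers while a computation is running.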

Another area of research has been the 'packaging' of distributed computation so as to be invisible to the end-user. For example, information extraction from a continual data flow is crucial for a Wall Street dealer or an intensive care monitoring system - to provide an 'informed snapshot' of the status of the world. This is most easily intellectually structured as a sequence of independent tasks, such as the 'cleaning' of raw data followed by various stages of filtering, analysis and high level trend analysis and diagnosis. It is thus natural to represent these tasks in terms of a hierarchical arrangement of 'boxes' with input and output streams. What is needed is a system which takes definitions of these boxes, prepared by a domain expert and 'figures out how to actually execute them on a bunch of machines so that as data pumps into the system answers pump out' in real time. This specialised parallel model is called 'Trellis' due to its lattice structure.
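The 'boxes with input and output streams' idea can be sketched as a small dataflow graph. This is an illustration under stated assumptions, not the Trellis system itself: the `Box` class, the `run_lattice` function and the example boxes (`clean`, `mean`, `peak`, `alarm`) are hypothetical, and the sketch evaluates one batch of readings sequentially on a single machine rather than continuously across several. It requires Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Box:
    """One processing stage: a name, a function, and the names of its inputs."""

    def __init__(self, name, func, inputs):
        self.name = name
        self.func = func
        self.inputs = list(inputs)


def run_lattice(boxes, reading):
    """Evaluate every box once, lower levels first, for one batch of raw data."""
    by_name = {box.name: box for box in boxes}
    # Map each box to the set of boxes (or the raw feed) it depends on.
    graph = {box.name: set(box.inputs) for box in boxes}
    values = {"raw": reading}
    for name in TopologicalSorter(graph).static_order():
        if name in by_name:  # "raw" is the external feed, not a box
            box = by_name[name]
            values[name] = box.func(*(values[i] for i in box.inputs))
    return values


# Hypothetical monitoring example: clean the data, derive two summaries,
# then combine them into a higher-level judgement.
boxes = [
    Box("clean", lambda raw: [x for x in raw if x is not None], ["raw"]),
    Box("mean", lambda xs: sum(xs) / len(xs), ["clean"]),
    Box("peak", lambda xs: max(xs), ["clean"]),
    Box("alarm", lambda mean, peak: peak > 1.5 * mean, ["mean", "peak"]),
]

values = run_lattice(boxes, [60, None, 62, 120, 61])
```

The domain expert writes only the box definitions. Deciding where and when each box runs is left to `run_lattice`, which is the part a real system would replace with a scheduler over many machines.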

FGP ('Fetch, Generalise, Project') is a system which 'takes a lot of data records and treats them almost as diagnostic cases'. A medical system might represent patient information as an n-tuple which forms a single point in a space. By examining the clustering of these points a possible diagnosis for a new case can be produced without the need to derive explicit rules. The technique has been applied to areas as diverse as the analysis of mammograms and folk dances; one obvious commercial use is the analysis of credit card transactions to organise marketing efforts around common buying patterns. This application is even further from explicit parallelism than Trellis: the user 'would be absolutely oblivious to any issue of parallelism'. However, parallelism 'underneath' allows the very heavy computational demands of the technique to be met.
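The idea of treating stored records as points and reading a suggestion off the neighbouring cases can be illustrated with a plain k-nearest-neighbour vote. This is a simple stand-in, not the FGP algorithm, whose details are not given here; the function `suggest_label`, the two-dimensional example cases and the labels "A" and "B" are all hypothetical.

```python
import math
from collections import Counter


def suggest_label(cases, query, k=3):
    """Return the most common label among the k stored cases nearest to query.

    cases: list of (point, label) pairs, where point is a tuple of numbers.
    """
    # Fetch: rank stored cases by Euclidean distance to the new case.
    ranked = sorted(cases, key=lambda case: math.dist(case[0], query))
    # Generalise: let the k nearest cases vote; no explicit rules are derived.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Two well-separated clusters of hypothetical cases.
cases = [
    ((1.0, 1.0), "A"), ((1.2, 0.9), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.8), "B"), ((4.9, 5.1), "B"),
]

near_first_cluster = suggest_label(cases, (1.1, 1.0))
near_second_cluster = suggest_label(cases, (5.1, 5.0))
```

Every query compares the new case against all stored cases, so the cost grows with the size of the database. The distance computations are independent of one another, which is one reason a technique of this kind lends itself to parallelism hidden from the user.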

'The trajectory of these projects is to start out with exclusive concern for parallel systems, build explicit parallel applications, and then begin to build software frameworks which rise higher and higher in the level of presentation to the end-user and hide away more and more of these details, but as they rise higher focus more and more narrowly on a particular style of application.'

Are basic research, applied research and technology inseparable in this kind of work?

'From our point of view they are... there are people who pursue strongly conceptual research with no real concern for bridging that gap between the conceptually interesting problems and how they might be applied in any kind of practical domain. I have to say, though, that the field we chose to work in doesn't allow us that luxury - if you're working in parallel computing.... by and large most people working in the field are very concerned with the practical issues - how do I get things done faster? Once you come into that kind of field you don't just want to do things in the abstract, you want to see practical gains achieved from this practical concern. Much of what has driven parallel computing has been the demands of a bunch of different people in industry and in the government....'

The government's involvement was originally prompted, in the days of national labs with huge budgets, by the rate at which Crays were being consumed. 'They were just building the machines and shipping them down to the lab' and would be saturated within a week. The demand for better ways of achieving very high performance led to a revival of interest in parallelism. As time has gone by, attention has shifted from these 'very exotic performance at any cost large scale machines' aimed at ever larger problems towards industrial demand for machines to solve large problems in more cost-effective ways - making new applications of computing economically viable. Thus, instead of machines being designed with the aim of 'selling at least three to the national labs' they are being aimed at corporates such as Boeing and sold on grounds of cost performance and economies of scale in their components.

Yale's research is an extreme example of this, which 'uncovered, as it were, parallel machines that people already owned, in the shape of their local area networks'. 'For the cost of a software system they could realise performance improvements that were significant, and made a difference to their production cycle of computing.' This is reflected in a shift in the group's interactions from basic work with massive government and industrial laboratories, such as AT&T Bell Labs, towards more applied work with corporate entities.

They now have a co-operative arrangement with a local software house for the distribution of their software. This company has 'opened the doors to a number of different environments' - providing a point of contact with the financial, petroleum and microelectronics simulation communities and developing finished code for industrial use. 'This company for the most part is the one that takes on the responsibility of doing the serious, commercial grade research and development of the software'. This arrangement is very beneficial, as the group aims to produce software 'which people will use' but cannot practically aim to produce and support finished products itself. They have the flexibility to 'pursue interesting ideas, get the software to a point where we have a prototype that demonstrates feasibility' and then transfer the work to the company for debugging, further implementation, porting, support and maintenance. SBIR (Small Business Innovation Research Program) grants from the government are designed to support companies engaged in this kind of work. They are important because an idea which is too close to commercialisation to attract academic grants will still require further work before saleable products are ready.

As far as he understood the terms of the agreement, Yale gained royalties on sales of the software and usage of the code produced. Ideas on the division of royalties are 'so much shifting sand over time' but are currently paid to the university, which shares a portion with the department - some of which goes to the individual researcher. Royalties have not been a major motivating factor in the agreement - sales from the product meet the costs of commercialisation and continuing development, but Yale's own costs have already been funded by grants and do not need to be recouped.

Another advantage of the arrangement is that companies are 'less generally interested' in funding general research and 'much more interested, both in their in-house research activities and in their dealings with others, in research projects for which there is not some mysterious leap of faith between that research project and something of practical interest to the company'. If the company deals with another company, rather than directly with the university, they have the feeling that 'they are dealing with a serious organisation that knows and understands' the way companies function, corporate culture and so forth. An external example is the transfer of the Mosaic software to a company for support and maintenance.

'As a research institution you quickly begin to feel the pressures on your normal activities from dealing with software distribution and support in dealing with something that has become popular in the outside world'. Most major departments have produced spin-offs of one kind or another. Another group at Yale set up a small company around the 'wavelets' compression technique, and historically a company called 'Cognitive Systems' was set up around AI researchers. Yale has an 'incubator site' in the shape of its Science Park.

'The ideas are there, the interest's there, the enthusiasm's there - the difficult part is the set of legal entanglements.... it's just a problem that the universities have got to come to terms with. In the recent past their model of the product of intellectual activity was the book... That model doesn't really apply any longer when you talk about pursuing interesting high-tech ideas.'

Different interests might be claimed by funding agencies, graduate students, universities and faculty members in the rights to inventions, and academic freedom must be preserved, but companies look for exclusive licences. So far the system has allowed a 'roll with the blows' approach which has been workable in practice, but 'one would much prefer.... to know what the bottom looks like before you jump'.

The group has a good working relationship with IBM, which supports the approach of cost effective, scalable systems with its SP1 and SP2 lines and switching technology. IBM provides Yale with hardware, equipment and joint study arrangements. This contrasts with the group's far less product oriented work a decade ago in conjunction with AT&T's basic researchers. From IBM's point of view, a major gain is 'creating the ability to demonstrate the technology to their customers'.

'The focus has shifted dramatically from exploratory research to practical development, which from IBM's point of view allows them to say "Oh, you want to speed up your application? Well, let's try to work with Yale and we can talk with them...", and there's a much tighter relationship with business activities than we would have had with Bell Labs in the past.'

This development of example case studies allows a 'portfolio of technology demonstrations' and data to be built up for IBM's use. This relationship is symbiotic, as Yale welcomes the provision of equipment and real situations for its work. 'We don't just want to build parallel systems and let them sit and say "Well, we've done it so let's move on to something else". We want to use them, make sure they work, achieve a certain level of efficiency, understand where the weaknesses are and improve them - and in the course of doing that work we will naturally develop the kinds of case studies that IBM are interested in.' This willingness to be involved with real 'industrial codes' is a great strength of the group in its relations with industry. Yale provides 'reasonable access' to IBM, its sales people and its potential customers - most demonstrations are internal within IBM. IBM's contribution is a 'loan of equipment' valued at over $1 million - machines, switches and resources to support them. As far as he knew, no cash was involved. IBM gets the results of the group's efforts for a substantially lower price than they would cost to duplicate internally, and the group are doing 'work that they wanted to do' anyway. An agreement which provided resources for work which was not of direct interest to the group would be counter-productive.

In the past it has sometimes been possible to obtain grants which match donations of equipment from industry with donations of cash, but in 'corporations themselves, at least recently to my mind, there is some willingness to give cash, but they prefer to give as equipment, or possibly even researcher time on their end.' They have 'talked endlessly with company representatives.... we see a lot of them in the hope that one of them pays off'. Alumni relations provide another means of making contact - again low probability and requiring careful nurturing.

How are industrial links likely to develop in the future? 'The particular combination of sponsored research mixed with focused industrial collaboration will become more prominent.... Everyone seems to have bought into the "let's get a little more practical" style of funding.' The development of royalties is much harder to predict and is not the motivating factor. 'We'd love to spawn two or three companies with significant revenue streams, but I think realistically that can't be relied upon.' About 10 years ago a company, Multiflow, was set up to build computers in a Silicon Valley style operation, but this was not successful in the long term.

Will this trend favour certain sorts of institutions? Well, Yale suffers to some extent from its 'non-technical image' as opposed to MIT, Stanford or Berkeley. Smaller institutions will tend to have smaller expectations - although sometimes deliberate efforts have been made to change the balance through the backing of 'centres of excellence' by government and the wealthier states. One example of this is the funding of Syracuse University by New York state - and with its proximity to important manufacturing companies it enjoys an environment conducive to this kind of research. With projects such as 'North Eastern Parallel Architecture' and their 'Infomall' initiative they are very focused on technology transfer and compare interestingly with nearby Cornell.

'There was a time when high tech companies, to some extent, helped fund research, but I don't believe that it was ever as generous as the perception was, and we got to rather a difficult time a few years back (it hasn't really changed now but people are more honest about it) when the government was saying "Well, we're not financing that" and industry was saying "That's what government grants are all for" and they were just pointing at each other'.

The poor fortunes and reorganisations of organisations such as IBM, DEC, AT&T and Xerox and the 'lean, mean economic machine' approaches they have adopted limit their willingness and ability to invest in basic research to create ideas for products in five or ten years' time without any provision for competitive advantage.

Would restrictions on publication and increased exclusivity for research sponsors make such corporations more willing to fund universities? Possibly, but the large issue is 'do universities need to undergo the same kind of restructuring that industrial concerns are going through? Can the universities continue to be pure bastions of abstract reflection, or do they need to get their hands dirty and begin to worry more seriously about what's really important to current economic environments?' When a Yale education costs a student around $100,000, questions of 'value for money' cannot be ignored. 'What did I get from that? What am I able to do that I couldn't have done some other way?' Such questions require a reappraisal of mission given the standing commitment of Yale and other distinguished universities to diversity in their intakes and to 'bring to the university the best young minds, regardless of their background and training.'

Such considerations make the search for practicality and relevance in research part of a 'wider Zeitgeist'. The quest for intellectual freedom may have led to 'excesses in the academic environment' in some areas, but there is merit 'in maintaining at least some notion of an ideal canon - stuff that should be pursued because it is intellectually important, not because it will help your bottom line in the next three months.... Finding the balance there is going to be the trick.'