From Aaron Sloman Sun Apr 7 13:02:26 BST 2002
To: Morris Sloman, Tom Rodden
Subject: Re: E-Science Research agenda
Cc-ed to various

> The attached document is the outcome of a working group set up by EPSRC
> under the chairmanship of Tom Rodden (I was a member) to identify an
> E-Science research agenda.

Thanks Morris,

> We have been asked to circulate this widely within the IT/CS community to
> try to get comments and enthusiasm.

It's an interesting and ambitious report. Below are a few comments on
parts of the report, which I'll place here:
    http://www.cs.bham.ac.uk/~axs/misc/escience

1. "We need to understand how we might best support new forms of
    community..."

Perhaps all comments on the report, unless the sender requests
otherwise, should be put on a web site where others can see them and
can then write in with supporting or critical responses.

There's too much one-way commentary on proposals in many organisations:
comments are seen only by the proposers, and the proposers' views on
those comments are seen neither by those who comment nor by others who
might add support or rebuttal. Addressing this in the discussion of the
E-Science report would itself be an example of what the report
proposes.

The report itself should be posted somewhere people can find it easily,
instead of depending on individuals to circulate it. (Don't use the
dreadful EPSRC web site for this!)

2. "An archival repository that will allow experimental results to be
    shared across the computing community and made more generally
    available."

Why just the computing community? Doesn't that ignore the crucial
importance, both for computing and for other areas of science, of
cross-disciplinary fertilisation? (For reasons partly explained in:
    http://www.cs.bham.ac.uk/~axs/misc/cs-research.html )

We need open and easy access by everyone to the results of all the
sciences, not closed groups that talk only to one another. (Apologies
if I've mis-interpreted the document.)
There are already movements, with well developed resources and
software, for making the results of science open and freely
accessible. Any new initiative should support and build on those
instead of reinventing wheels. Some examples:

    arxiv.org
        Already includes mathematics, physics, computer science, and
        AI, with mirror sites all round the world.

    http://www.soros.org/openaccess
        The Budapest Open Access Initiative, funded by George Soros
        (financier and philanthropist), who founded the Open Society
        Institute.

    http://www.eprints.org/
        A set of tools dedicated to freeing the refereed research
        literature online through author/institution self-archiving,
        developed at the University of Southampton. EPrints 2 is free
        (GPL) software aimed at organisations and communities rather
        than individuals.

Google is excellent at tying such things together (compared with
anything else available, as far as I know). Perhaps it could, at least
initially, provide part of the infrastructure for some of the tasks
listed in the report, e.g.
    "...supports the management and traceability of knowledge?"

By contrast, topic-based search engines lose out because they are
constrained by their designers' ideas of relevance, which are
inevitably prejudiced. E.g. Elsevier's Science Direct
    http://www.sciencedirect.com/

3. "We need to make the infrastructure more available, more robust and
    more trusted by developing trusted ubiquitous systems."
                               ^^^^^^^
....

   "The Computing Community needs to respond to this trend and
    undertake the fundamental research needed to underpin the
    development and deployment of ubiquitous systems that people
    can trust and rely upon."
        ^^^^^

There's a lot about developing *trustworthy* systems in the document.
However, we need to educate the public, government organisations, and
potential users in many professions, as well as CS researchers, as to
why it is *impossible in principle* to build completely trustworthy
systems (while they inhabit our physical universe), and why all
potential purchasers and users should therefore always do appropriate
risk analysis and *never* assume that systems touted as trustworthy
will actually be totally trustworthy. (E.g. because of human error,
human wickedness, unanticipated technology developments, incomplete
analysis of total systems, NP =/= P, the difficulty of making
computing systems that understand what they are doing or what is
happening in their environment, etc.)

4. "An archival repository is necessary so that experimental results
    can be shared across the computing community and made more
    generally available."

Good repositories and tools already exist. See above.

5. "New theories and techniques to allow tolerant, safe and scalable
    reasoning over uncertain and incomplete knowledge where it
    embraces data, metadata and knowledge activities.

    Tools, methods and techniques to support the design, development
    and deployment of large-scale ontologies.

    Support for collaboration and sharing across different knowledge
    repositories at varying scales, including working across
    personalised knowledge structures and larger organisational and
    disciplinary structures.

    Support for semantic directed knowledge discovery to complement
    data mining methods.

    The development of lightweight and incidental knowledge capture
    techniques.

    Development of network based reasoning and decision support
    services that can be tailored to meet the demands of different
    domains and users."

AI researchers have been working on such topics for some time -- e.g.
the ARPA knowledge-sharing effort is 9 or 10 years old, and the CYC
project is even older. E.g.
see
    http://www.opencyc.org/
    http://www.cs.umbc.edu/agents
    http://www.cs.umbc.edu/kqml/

The problems are MUCH harder than most people anticipated, and much
work on these problems is seriously (fatally?) hampered by the fact
that most scientists receive an excessively narrow education, which
prevents them from seeing what they need to see and from thinking
thoughts that go beyond *their* ontologies. (E.g. see
    http://www.cs.bham.ac.uk/~axs/misc/talks/#talk8
which criticises AI vision research because of its narrow vision.)

6. "Autonomic Computing"

   "Complex assemblies of open systems ... We must develop a
    supporting open digital infrastructure ... Our current approach to
    system building and configuration is overly dependant on human
    intervention and simply does not scale. We must shift to
    self-configuring systems ..."

This is the latest fashionable topic: the phrase "autonomic computing"
keeps cropping up all over the place, including in a major project IBM
are attempting to lead
    http://www.research.ibm.com/autonomic/
(not mentioned in the report, as far as I can see).

There certainly are some deep and important problems to be addressed,
but it is not clear whether we are better placed to address them now
than we were five, or ten, or twenty years ago, apart from having far
more, far cheaper, more connected computing power.

Computing researchers are constantly announcing new potential
applications for computing. Delivery is a little harder. Do we have
any major new intellectual insights or research results that will
support this new initiative? (Enthusiasm and computer power are not
enough.)

Biological evolution solved many of the problems millions of years
before we recognised the problems. But I don't think anyone has much
understanding of what the problems are or how they were solved by
evolution. (Simple-minded notions of evolutionary computation fail to
address the issue -- they merely provide another search mechanism.)
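To make the "merely another search mechanism" point concrete, here is
a toy sketch (all names, parameters, and the OneMax fitness function
are my own illustrative choices, not anything from the report): strip
away the biological vocabulary and what remains is stochastic search
over bit strings towards a fitness function *we* have already defined.

```python
import random

def evolve(fitness, n_bits=20, pop_size=30, generations=100, mut_rate=0.05):
    """Minimal genetic algorithm: truncation selection plus bit-flip
    mutation. Nothing here addresses *which problems* evolution solved,
    or how the fitness landscape itself arose -- it only searches a
    space and objective we have specified in advance."""
    pop = [[random.randint(0, 1) for _ in range(n_bits)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]       # keep the fitter half
        children = []
        for parent in survivors:
            # flip each bit independently with probability mut_rate
            children.append([bit ^ (random.random() < mut_rate)
                             for bit in parent])
        pop = survivors + children             # survivors persist unchanged
    return max(pop, key=fitness)

# "OneMax" toy objective: fitness is simply the number of 1-bits.
random.seed(0)  # fixed seed so the sketch is repeatable
best = evolve(fitness=sum)
```

With selection pressure towards all-ones, `sum(best)` rapidly
approaches `n_bits` -- which is the point: the mechanism is a search
procedure, not an explanation of the problems evolution solved.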
Some thoughts on parts of the problem (prepared for an IBM workshop in
March on architectures for intelligent systems) are here:
    http://www.cs.bham.ac.uk/research/cogaff/ibm02/
Comments, criticisms, suggestions welcome.

Something not mentioned there: robust mutually supportive systems
depend on mechanisms that allow A to observe B without B taking any
special measures to be observed (e.g. I use light bouncing off your
face to see you and to tell how you feel; one neural net monitors the
output signals of another neural net). Current computing systems are
not designed to support non-intrusive involuntary monitoring. (Hence
the need for explicit tracing facilities.)

Is anyone working on an analogue of light bouncing off computational
processes to enable them to be observed (like the early mechanisms for
virtual memory in Apollo computers, where one CPU monitored another
and took control after an access violation occurred)? I.e. in order to
enhance robustness we need inherently less secure (more observable)
low-level mechanisms.

7. There was a potentially relevant BBSRC/EPSRC workshop on Adaptive
and interactive behaviour of animals and computational systems
(AIBACS), held at Coseners House about a year ago. It was fun (and
sometimes fractious), but I have no idea whether anything came out of
it. I guess the two groups never talked to each other.

The EPSRC web page that used to exist for AIBACS is gone, but Google
has this in its cache:
    http://216.239.51.100/search?q=cache:vuEQUmi8nmkC:www.epsrc.ac.uk/documents/support_for_researchers/calls_for_proposals/itcs/aibacsep_nw0.htm+aibacs+epsrc&hl=en

Apologies if this is all irrelevant to the actual E-Science proposal.
I may have hallucinated my own interests on to it.
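PS. A toy illustration of the non-intrusive monitoring point in item 6
(names and numbers are mine, invented for the sketch). Within one
address space the "light bouncing off a face" analogue is trivial: a
worker thread just *has* state, and a monitor samples it passively; the
worker takes no special measures to be observed and is never
interrupted. The interesting open question is what the analogue would
be across separately built systems, where today only explicit tracing
hooks exist.

```python
import threading
import time

class Worker(threading.Thread):
    """Does its job with no reporting hooks: it merely has observable
    state, the way a face merely reflects light."""
    def __init__(self):
        super().__init__(daemon=True)
        self.steps_done = 0            # ordinary state, not a reporting API
    def run(self):
        for _ in range(1000):
            self.steps_done += 1       # side effect of doing the work
            time.sleep(0.001)

def monitor(worker, duration=0.1, interval=0.01):
    """Observes the worker from outside. The worker neither cooperates
    nor notices; the monitor simply samples its exposed state."""
    samples = []
    deadline = time.time() + duration
    while time.time() < deadline:
        samples.append(worker.steps_done)   # passive, involuntary read
        time.sleep(interval)
    return samples

w = Worker()
w.start()
progress = monitor(w)   # non-decreasing readings track the worker's activity
```

The design point is the asymmetry: all the monitoring machinery lives
in `monitor`; `Worker` contains nothing monitoring-specific, which is
exactly what current systems fail to provide between separately
engineered components.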
Aaron
===
Aaron Sloman, ( http://www.cs.bham.ac.uk/~axs/ )
School of Computer Science, The University of Birmingham, B15 2TT, UK
PAPERS: http://www.cs.bham.ac.uk/research/cogaff/
FREE TOOLS: http://www.cs.bham.ac.uk/research/poplog/freepoplog.html
TALKS: http://www.cs.bham.ac.uk/~axs/misc/talks/
FREE BOOK: http://www.cs.bham.ac.uk/research/cogaff/crp/