Technology for haptic-auditory-visual interfaces
An Enactive Interface is a system that preserves the enactiveness of the natural interaction between the human and the environment. Enactive interaction is not represented symbolically; instead it relies on acts, gestures and movements. It is supported by the two principles of "embodiment" and "emergence". Human-world or human-human interaction constitutes a non-separable dynamic system integrating the multisensory and motor abilities of the human. Consequently, the mediation of human actions and sensory feedback necessary in Enactive Interfaces, and more generally in Enactive Systems, introduces specific technological constraints.
Within this technological research direction we have:
- elicited technological bottlenecks that need to be overcome to create Enactive Interfaces and Enactive Systems
- provided a typology of exemplary enactive tasks related to these bottlenecks and proposed solutions for a few critical cases
- produced draft guidelines and recommendations for designers of enactive interfaces.
1. Elicit the main bottlenecks
Enactive Interfaces is a multidisciplinary technological field. It involves Human Computer Interfaces, Virtual Reality, Haptics, Robotics, Multimodal Interfaces, Signal Processing, Computer Image Synthesis and Computer Sound Synthesis. Each of these is a domain of its own, with specific methods and traditions. To clarify the technological needs of Enactive Interfaces, a transversal analysis of all these domains has to be performed, taking into account the current state and the future of computer technologies. Two main "Bottlenecks" have been extracted as guidelines for the design of Enactive Interfaces:
1. Bottleneck 1 "Toward enactive interaction" is related to multimodal relations between actions and sensory feedback. It addresses the limitations on the ways in which actions and sensory feedback have to be linked to implement enactive interaction (Figure 1).
2. Bottleneck 2 "Toward enactive modeling" is related to the notion of metaphors as a central component of artificial mediation systems, introducing the necessity of a reference to equivalent non-computerized mediated systems already used and known by humans. It addresses how metaphors have to be modified or evolved to ensure the enactiveness of the interfaces (Figure 2).
Bottleneck 1 "Toward enactive interaction" - Multimodal relations between actions and sensory feedbacks
Multimodal interfaces are confronted with a first category of bottlenecks, not yet well identified, if they are to increase their degree of enactiveness. We call this "Enactive Complexity".
Until the arrival of force feedback devices, the main problems of the action-vision loop were, and still are, the spatial and visual complexity of the scene ("complexity for the eye"): 3D shape realism, modelling and real-time rendering of large scenes, level of detail, 3D spatial occlusions, photorealism, etc. Similarly, in the action-sound loop, the main problem relates to space exploration and identification. The mainstream of research has focused along these lines, leading to the use of the immersive situation as the reference situation. The processing inserted between the action and the sensory feedback has mainly been geometrically, optically and acoustically based.
Such an approach can be considered to be driven by the visual and auditory sensory feedback, and we call it "complexity for the senses". The main technological issue it raises is that of computer processing power.
The arrival of force feedback devices as core components of computerized interfaces, allowing direct physical manipulation, gradually introduces a shift toward what we metaphorically call "complexity for the hand". Manual manipulation and force feedback basically imply the insertion of computational components that simulate the behaviour of matter. This involves physically based modelling, computational dynamics, closed-loop system regulation and control, etc. The "soft" reactivity of systems designed for "complexity for the senses" will no longer be sufficient; instead, new constraints are introduced.
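As an illustrative sketch of the kind of physically based computation involved, the following simulates the classic spring-damper virtual wall stepped at a haptic rate; the stiffness and damping values, and the 1 kHz rate, are assumptions chosen for the example, not values from the text:

```python
# Minimal sketch of a physically based haptic computation (illustrative):
# a virtual wall rendered as a spring-damper, intended to run at ~1 kHz.
# K, B and the probe states are invented example values.

WALL_X = 0.0      # wall surface position (m)
K = 1500.0        # spring stiffness (N/m), assumed
B = 5.0           # damping coefficient (N.s/m), assumed

def wall_force(x, v):
    """Feedback force for probe position x (m) and velocity v (m/s)."""
    penetration = WALL_X - x
    if penetration <= 0.0:
        return 0.0                     # probe outside the wall: free motion
    # Spring pushes the probe back out; damper dissipates impact energy.
    return K * penetration - B * v

# One step: probe 2 mm inside the wall, still moving inward at 0.1 m/s
f = wall_force(-0.002, -0.1)
print(f)  # 1500*0.002 - 5*(-0.1) = 3.5 N
```

Unlike purely visual rendering, this loop must close within the device's reactivity budget on every cycle, which is what makes "complexity for the hand" a constraint on latency rather than on throughput.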
Thus, three open issues have to be addressed in order to increase the reactivity of input-output computer architectures:
- Until now, computer evolution has focused on increasing the bandwidth of chips (i.e. computational power) and has lagged in improving latencies (i.e. reactivity).
- Until now, computer algorithm implementations and architectures have favoured signal processing and spatial and logical computation, not the dynamic and physical computation required by haptic algorithms and matter behaviours (both highly demanding in terms of computational load and reactivity).
- Until now, developments have been linear, assuming continuous growth of computational power and reactivity. This will no longer be the case, as these evolutions enter an asymptotic phase, due both to Moore's law and to the lag of latency improvements relative to computational power.
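The latency point can be made concrete with some rough time-budget arithmetic; the 1 kHz haptic rate and 60 Hz visual rate below are typical figures assumed for illustration, not values from the text:

```python
# Rough per-cycle time budgets (illustrative figures):
# a haptic loop at ~1 kHz must finish all physics within 1 ms, while a
# 60 Hz visual loop has ~16.7 ms -- reactivity, not raw computational
# power, is the binding constraint for the haptic side.

def cycle_budget_ms(rate_hz):
    """Time available per update cycle, in milliseconds."""
    return 1000.0 / rate_hz

haptic_budget = cycle_budget_ms(1000)   # 1.0 ms per haptic step
visual_budget = cycle_budget_ms(60)     # ~16.7 ms per visual frame
print(haptic_budget, round(visual_budget, 1))
```

However fast the processor, a single missed 1 ms deadline is perceptible to the hand, which is why increasing chip bandwidth alone does not solve the problem.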
Figure 1. Bottleneck 1 "Toward Enactive Interaction"
The critical frontier separates spatially oriented tasks and dynamically oriented tasks. Here technological implementations are confronted with hard scaling problems:
- Scaling of spatial representations: computer representations and computations have to traverse about five orders of magnitude (10^5) in spatial scale, from metres down to 10 µm
- Scaling of temporal representations: computer representations and computations have to traverse about three orders of magnitude (10^3) in temporal scale, from 100 ms down to 0.1 ms of system temporal reactivity
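Reading the spatial range as one metre down to tens of micrometres, so that it spans the five powers of ten the text cites, the two scaling frontiers can be checked in a few lines:

```python
import math

# Orders of magnitude spanned by the two scaling frontiers of Bottleneck 1
# (spatial lower bound of 10 um is our reading of the text's 10^5 span).
spatial_span  = math.log10(1.0 / 10e-6)      # 1 m down to 10 um  -> 5
temporal_span = math.log10(100e-3 / 0.1e-3)  # 100 ms down to 0.1 ms -> 3

print(spatial_span, temporal_span)  # 5.0 3.0
```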
Bottleneck 2: "Toward Enactive Modeling" - Metaphors in Enactive Artificial Mediation Systems
Two different types of metaphors underlie current interaction systems: immersion and manipulation. Until now they have been considered as opposite approaches, each with its "pros" and "cons". In activities that are not mediated by computers, this opposition does not exist: the functionality of the interaction changes continuously, from environment to manipulated instrument, via proximal object and selection.
- Until now, the trade-off between these two types of metaphors has not really been implemented. This has to change, leading to a dynamic programming of the interaction.
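As a minimal sketch of such dynamic programming of the interaction, an object's current functionality could be reclassified continuously along the continuum described above; the distance thresholds and category names here are invented for illustration:

```python
# Hypothetical sketch: the same object shifts functionality depending on
# the hand's state, instead of being fixed as "immersive scenery" or
# "manipulated tool". Thresholds (0.05 m, 0.5 m) are arbitrary assumptions.

def object_functionality(distance_m, grasped):
    """Classify the current interaction metaphor for one object."""
    if grasped:
        return "manipulated instrument"   # object in hand
    if distance_m < 0.05:
        return "selected object"          # within grasping range
    if distance_m < 0.5:
        return "proximal object"          # near the hand, explorable
    return "environment"                  # part of the immersive scene

print(object_functionality(2.0, False))   # environment
print(object_functionality(0.2, False))   # proximal object
print(object_functionality(0.0, True))    # manipulated instrument
```

The point of the sketch is that no single metaphor is chosen at design time; the system re-evaluates the trade-off on every update.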
Figure 2. Bottleneck 2 "Toward Enactive Modelling"
The critical frontier separates "object near-the-hand" from "object in hand" and spatially oriented tasks from dynamically oriented tasks.
2. Typology of Enactive Exemplary Tasks
Proposed typologies of tasks in Human Computer Interfaces (Esposito 1996, Poupyrev 1998, Gabbard 1997, Bowman 2005) and characterizations of multimodal systems (Nigay & Coutaz 1993, Hovy & Arens 1990, Sturm 2002, Andersen 2000, De Suza 1993) usually do not address the question of the level of enactiveness of interfaces.
What is missing are physically based manipulation tasks requiring haptic interaction. We can say that these typologies stop their investigation in the middle area between "object selection" and "the tool in hand" in Figure 2, representing Bottleneck 2.
A new typology helping to identify the technological difficulties of Enactive Interfaces is therefore necessary. The Enactive network has put forward a proposal, theoretically based on two axes of challenge:
- Trade-off between "complexity for the senses" and "complexity for the hand".
- Versatile metaphor evolution from "immersion" to "physical object manipulation".
We establish five basic categories characterized by specific technological difficulties:
Type 1: Exploration of large environments indoor and outdoor: Localization, Navigation and path finding
Type 2: Object selection
Type 3: Object recognition, identification and exploration
Type 4: Spatially-oriented Manipulation tasks
Type 5: Physically-oriented Manipulation tasks
Figure 3. Positioning of exemplary tasks on the type of complexity axis
3. Recommendations to designers
Finally, the analysis of the previous work has led to a set of recommendations for the development of enactive interfaces. We have grouped these recommendations under the following headings: computer hardware (boards, architecture, operating systems, drivers and middleware); computer models and software; interaction and multisensory rendering; interaction metaphors; and Human Computer Interfaces design.
On the hardware side, the points to note are the use of specific boards, computer architecture and input-output latencies, computing power and computation rate, and data formats. For computer models and software, interaction and multisensory rendering, and interaction metaphors, recommendations are given for large scenes and for the frontier between "immersion" and "vis-à-vis", at medium and small spatial scales.
As a summary, we can say that:
- no real task implements only a single "functionality of object";
- real tasks constantly play with various functionalities of objects;
- when performing a task in the real world, the user moves along the axis from "thing of the environment to be perceived" to "thing as a tool to act with", and vice versa.
Beyond the improvement of each specific dedicated technology, the general recommendations are:
- Implemented computer models and interaction processes have to be dynamically flexible, versatile and adaptive during the performance of the task. This transformation has to be under the control of the functionality of the object and of the interaction during the task. We call this "Enactive Modeling", in the sense that the modeling process itself is an adaptive interaction situation, similar to the processes of enaction and autopoiesis in living organisms.
- Tasks, technology and human capabilities have to be analyzed together in order to find the optimal and generic point for real implementation. This leads to setting up the theoretical and pragmatic basis of a new methodology and know-how of implementation, in which the three partners - technology of interaction, users and uses - are not considered separately, but as a kind of "new ergonomics of interactive mediated computer tools".
When it comes to human computer interfaces design we have the following main recommendations:
- Consider users, usage and functionality before technology - i.e. start by analyzing the goals, tasks and actions the artifact should perform or respond to before considering which technology may be utilized.
- Involve real users and real usage in the design process. Note that aspects of usage may be situated, so it is important to also consider user interactions in the actual setting where the future artifact is intended to be used.
- Make use of standard usability heuristics such as Shneiderman’s "Eight Golden Rules", Bruce Tognazzini’s list of basic principles for interface design or Nielsen’s "Ten Usability Heuristics".
- Use targeted guidelines and recommendations for the type of system you are designing.
A specific case of enactive environment is the non-visual audio-haptic environment. Such environments are still not as well researched as environments involving vision, and specific problems arise in obtaining an overview, interacting with dynamic objects, navigation and object recognition. We have therefore put special effort into the audio-haptic area, and we have also provided a set of targeted recommendations for this type of environment, which include the following points:
- Elaborate the best virtual scene for each case
- Facilitate navigation in the virtual scene
- Facilitate obtaining an overview of the virtual scene
- Provide contextual information
- Use all available modalities
- Provide a suitable interface/interaction with the application
- Support the user in learning the interaction method and the specific application
- Support collaboration and social settings
Although at this level this may sound like merely saying "design good applications" (which of course is the top-level recommendation), we provide detailed recommendations under each heading in deliverable DRD1.2.2.