NCITS Information Technology Accommodation Study Group
Universal Accessibility Protocol Requirements Document (Draft)

Contents

Preamble

1. Introduction: What This Protocol Must Do
1.1 Provide Flexibility for Users (User Interfaces For All)
1.2 Provide IT Accommodation of People with Disabilities
1.3 A Communications Protocol to Support Universal Accessibility in Information Technology

2. General Requirements

3. Architectural Considerations

4. Informational Requirements
4.1 Entity Self-descriptions
4.2 Formats

5. Behavioral Requirements
5.1 Proxy protocols
5.2 Discovery Processes
5.3 Negotiation Processes
5.4 Configuration and Setup Processes
5.5 Task operation

6. Examples

7. Some Related Standards, Technology and Research References
 



Preamble: The following principles are fundamental to this activity and are to be sustained in whatever architecture and set of protocols ultimately result:
  • The access needs of people with disabilities will be retained as a primary decision mechanism; 
  • A minimal level of connectivity at the Human Computer Interface will be retained which does not depend upon intelligence or more sophisticated communications mechanisms; 
  • The architecture/protocols developed (or modified by our efforts) here will be able to function only at the Human Computer Interface without the need for explicit networking; 
  • There will be allowance for proxies and minimal participation (e.g., a laptop with appropriate, relatively simple software should be able to serve as an accessor device in the system); 
  • An implementation may be as sophisticated and intelligent as needed for the intended use without necessarily providing service which is less sophisticated or intelligent. 




 

1. Introduction: What This Protocol Must Do

1.1 Provide Flexibility for Users (User Interfaces For All)

The following is excerpted from an editorial [1] given by Constantine Stephanidis at the 1995 inaugural of the User Interfaces for All (UI4ALL) Working Group in the European Research Consortium for Informatics and Mathematics (ERCIM):

"The proliferation of computer-based systems and applications in every walk of life and the anticipated widespread use of emerging telematic services has introduced new dimensions to the issue of human-machine interaction, necessitating the design of high quality user interfaces accessible and usable by a diverse user population with different abilities, requirements and preferences. This user population at large, includes people with different cultural, educational, training and employment background, novice and experienced computer users, the very young and the elderly and people with different types of disabilities. Thus, it has become increasingly important to design human-machine interfaces, which not only support more efficient and effective user interaction, but also address the individual end user needs, requirements, skills and expectations, while exhibiting a wide range of `intelligent' and `cooperative' behaviour.

"One of the domains of tele-informatics research and development, which has recently emerged, concerns the development of methodologies, tools, applications and services which support the socio-economic integration and independent living of people with disabilities. In particular, the potential of the emerging telematic network infrastructure, offers new possibilities for the socio-economic integration of people with disabilities and can be exploited to facilitate direct access to the general purpose telematic terminals, services and applications. Issues related to the human-computer interaction, i.e. rendering the user interface accessible and usable also by users with functional limitations are of considerable importance and relevance.

"Currently, the plethora of existing methodologies and development tools for producing the user interface of such services and applications do not directly address the broad range of issues related to their accessibility by the various categories of disabled people. Therefore, `alternative' solutions for people with special needs have to be provided in order to support accessibility to the same computer-based systems and services and applications. Until now, these solutions were `adaptations', i.e. ad hoc and intuitive modifications targeted to addressing a particular access problem for a particular user (group).

"However, novel architectures and schemes for the design and implementation of development tools, may facilitate the construction of User Interfaces of services and applications, which are `inherently' accessible by all user categories. More recent work is seeking to address such issues through activities based on the principles of `design for all' and `universal accessibility'. Emerging technological advances can be exploited to design systems and tools which refine and extend the current state of the art in interface design, and support the development of user-tailored and (technological) platform-independent interfaces. This implies the development of user interfaces which can utilise the broad range of lexical interaction technologies and benefit from user-adaptability at design time and/or system supported adaptation at run-time (i.e. adaptive behaviour) according to the particular end-user abilities, requirements and preferences."

Additionally, governments, industry and academia have increased their focus on the importance of the human machine interface in the global information economy. More effective, efficient and natural human computer or computer mediated human-human interaction will require automated understanding and generation of multimedia, and will rely upon precise information about the user, discourse, task and context.



1.2 Provide IT Accommodation for People with Disabilities

The importance of the computer in the everyday life of the individual is growing. It is becoming the vehicle by which a person gets information from the government, orders merchandise, and manages bank accounts. Access to a computer will be mandatory in the foreseeable future just to be able to transact the day-to-day business of life - and the rate at which we are moving in that direction is accelerating.

To those individuals with limitations on their capacity to use some of their faculties - whether due to age, birth, or accident - this growth is a threat. It potentially isolates them from their communities. If they cannot communicate with the tools upon which others depend, their capacity to participate in society and the economy is diminished.

Alternatives to seeing the computer screen, using the mouse, and typing on the keyboard are emerging. Voice recognition is able to understand continuous speech, eye-trackers can control cursor positioning, and screen readers use synthesized speech to make visual information available to the sightless. Connecting these devices to popular applications is problematic, however.

Project Archimedes [2], a research project based at Stanford University, has established an architecture, the Total Access System (TAS), that provides a mechanism for resolving this "disconnect." The architecture separates the devices that compensate for a disability from the application that provides the application functionality. A compensating device, called a "Personal Accessor," might be a speech-recognition subsystem that can functionally substitute for a keyboard, or it might be a head-tracking device that simulates a mouse. These accessors need to be easily interfaced with different manufacturers' computers; the Project Archimedes team has defined a "smart interface", called a Total Access Port (TAP), that allows a single accessor to be used with a variety of computers and device controllers. The protocol and architecture whose requirements are described in this document generalize the Archimedes Total Access System.

In general, the participants in developing the assistive technology market have not been the major manufacturers of computing equipment, but rather specialists in this technology domain. The process of moving these products to the mainstream will depend on the joining of these smaller companies and the larger IT vendors on a partnering basis.

IT companies must find sustainable reasons to steer technology in a direction that provides access to computing for those with disabilities. Such a motivation could come from market opportunity. The market is expanding with both new users with disability, and new uses for this technology for the current non-disabled users.

According to PCEPD, the 54 million Americans with disabilities have a combined income of 700 billion dollars. Of that figure, 175 billion represents discretionary income. The fact that almost thirty percent of American families have at least one member with a disability means that there will be a significant market for assistive technology and accessible computing. The demographics of age are changing. In 1997, 12 percent of Americans were 65 years of age and older and 50% had some level of disability. By 2030, the number of older Americans with disabilities will almost double, and half the population will be over 65 years of age.

The march of technology makes possible appliances that yesterday were only science-fiction dreams. Development of assistive technology is a catalyst for discovery of new ways to make computing easier for everyone. Voice synthesis through a screen reader, essential for a blind person, might also help a technician with busy eyes. Voice recognition is useful for a worker with busy hands, and it is crucial for someone without any hands. An eye-tracking pointing device is useful for many applications and mandatory for some physically challenged individuals. Thus, the size of the market for any assistive technology is not confined to those with disabilities: It includes others that have special needs due to new applications of computing in the workplace and the home.

The "new users" enabled by accessible Information Technology, when added to these new uses, comprise the yet-to-be-tapped market of accessible technology. And there will be ripples of new business opportunity on top of this technology. The continuing cost erosion in the industry allows for affordable solutions for people with disabilities. One implication of Moore's law is that in 18 months, the cost of computing functionality will be half of what it is today. Applying this thinking to assistive technology suggests that it is time to focus on the marketplace for individuals with disabilities. Free enterprise will reward the stockholders of companies that address this market. At the same time, IT can deliver on its promise that it has held out to the world - namely to make life richer for even larger numbers of users.

To gain creative solutions and not freeze the technology at a single point in time, it will be important that regulators and legislators allow the marketplace to drive the necessary solutions. The goal of this architecture effort is to provide an environment in which the marketplace can resolve the tension between explosive development and cost reduction in Information Technology, on the one hand, and the need for solutions for disabled users that endure this rapid change, on the other. An architecture which couples assistive technology to the computers of today and to tomorrow's ubiquitous computers is necessary to let the assistive technology and the PC and embedded computer technologies evolve independently without compromising their connectability.

The final piece of this puzzle involves partners. Finding, developing, and joining business with companies that are ready to move these technologies into the market will mean that regulators and legislators, as well as the marketplace, will see rapid progress toward accommodation for more customers. An architecture that allows partners to join their technologies will mean a more cost-effective approach to joining their ventures.



1.3 A Communications Protocol to Support Universal Accessibility in Information Technology

What's Needed and Why

A UI4ALL white paper by Constantine Stephanidis, et al.[3], expresses Research and Development Activities required to realize the goals of user interfaces for all, especially including persons with disabilities. Among these are:

The intent of this document is to outline requirements for a communications protocol which will support the first of these Activities through promoting and utilizing the remaining Activities. This document is not concerned with the explicit transformation of user interfaces, nor with the design of hardware or software needed to effect such transformations. Rather the focus is upon providing a standard means of conveyance of profiles, or self-descriptions, of user interface preferences (e.g., which may express the accommodation needed for a disability), operating contexts and constraints, and system/device capabilities for matching/negotiation decisions toward configuration and operation to support the desired interaction. The standard would provide the concise definitions of service interfaces, message formats, and the information structures and behavior required to realize the conveyance of profiles in a wide variety of architectures and technologies. Detailed descriptions of preferences, operating contexts, constraints, and capabilities are beyond the scope of this document, but discussion is provided here to enable the reader to obtain some sense of what these entail.

Explicit negotiation, rather than simple attribute/capability matching, may be required in some systems. For example, in ATMs and public kiosks, there may be many hundreds of different people who use the device per day. Each of these persons would have their own particular way (plus local environmental conditions) of interacting with the system, so that many different profiles would need to be negotiated. Contrast this with a PC which may have only one user, with an essentially non-varying environment. Here the negotiation may need to be done only once, when the system is brought up. However, if the PC is on a network, then a negotiation may be necessary with each different system on the net with which the person wants to interact. Now consider a home network in which each person in a home has their own preferred way of interacting with the appliances: children differently from parents, from grandparents, and from a family member with a disability. Negotiation may be necessary as each new appliance (or new person) is added, but after that operation can use profiles extracted immediately from a local repository. However, if a new family should happen to move into the house, negotiation would have to begin anew.

The way that the protocol is intended to operate implies a general architecture for support and realization. This architecture is not intended to supplant existing or envisioned architectures, but rather to supplement them with universal accessibility. The architecture and protocol need to work with everything from hand-held devices to ATMs and kiosks, from home consumer electronics networks to global virtual enterprise networks. The architecture is not a "plug-and-play" architecture, but it could be implemented within such architectures. Some realizations (various interpretations) of the architecture might be distributed in nature, using agents along with CORBA, DCOM, RMI or other distributed access methods, while others could be built using local program calls and table lookups.

Because of the wide range of realizations, the protocol should not be thought of as a single entity. Instead, it should be considered to be a family of procedures of various complexities, each based around common service interface descriptions, message and data formats, and general behavior. Each member of the family might be targeted to an environment of given complexity and expected purpose. For example, the protocol family member for home networks might be far simpler than that for a virtual enterprise, because of the differences in assumptions of system and interaction complexity, operation and information exchange.



Adopt, Adapt or Invent?

To determine an approach for development of the protocol, i.e., adopt, adapt or invent, more than twenty emerging technologies and architectures were examined for suitability, at least in some respect, upon which to base the protocol. The following were general criteria used to judge suitability and completeness.

The technological areas examined are in Section 7. Nearly all of these areas have something to contribute, perhaps three of them more than the others: Salutation [4], NIIIP [5] and JetSend [6]. None of these is "universal" in any sense, but they are not intended to be. Of the three, JetSend and Salutation are closer to being extensible to universality, even though both are far less sophisticated than NIIIP. Both NIIIP and Salutation would require very significant rewrites to serve the purpose of the protocol sought. JetSend has a language for description of data types and structures that is focused on device-level descriptions. However, the language should be extensible and usable for other descriptions, since it is hierarchical. The JetSend negotiation protocol is similar in outline to what is required here, and might be easily extended for non-complex (e.g., one-on-one) situations. Extending the JetSend protocol for more complex cases would require a significant rewrite. On the other hand, it would not seem to be overwhelmingly difficult to adapt any of the three architectures to include the desired protocol. In actuality, this approach is what is needed: a protocol that can be included, in some form or another, in every architecture, with suitable modifications for integration.

Protocols suitable for direct adoption do not appear to be available in any of the technical approaches surveyed. Sun Microsystems' Jini [7], for example, provides some of the functionality needed, but leaves many of the details to the implementor and would serve better as infrastructure for the desired protocol. (Note: JetSend is intended to operate above the level of Jini). A protocol for attribute matching exists in the Salutation, the JetSend and the Jini specifications. But the Salutation and JetSend protocols are not sophisticated enough to cover negotiation using agents, as might be necessary in an Internet setting, without considerable rewriting. The Jini protocol, as already noted, requires considerable design detailing, particularly regarding agents. A very sophisticated set of protocols exists in the NIIIP specification which does use agents for negotiation of some preferences, but these preferences have to do with business rules rather than user interface preferences. The agents and the message formats would have to be rewritten. The state machines may also have to be redesigned to follow the proper sequence of protocol events.

Another possible source of difficulty with the Salutation, JetSend and NIIIP protocols is that they are designed for particular environments. This means that certain assumptions about where a protocol operates within an architecture (e.g., what "level") may prevent that protocol from being easily integrated into another architecture with different assumptions, even when the architectures are targeted for the same purpose. For example, in Salutation and JetSend, there is a protocol and mechanism for matching attributes and capabilities, a central feature of the architecture sought. However, this operates near the devices and does not appear to include the user interface, although it could be modified to handle it. JetSend, on the other hand, appears to operate at the right level, but assumes that devices are communicating without any other intervention. JetSend has no relationship to a user interface, so that protocol service interfaces ("API"s) will have to be constructed. Protocol service interfaces--and possibly state machines--may require significant rewriting in order to successfully integrate a Salutation protocol into an architecture for which it was not designed. JetSend should be rather easier to integrate, but not without some development costs. Consequently, adaptation may be possible if the cost of rewriting is less than that of just designing a new protocol based upon the functionality of an existing protocol. It must be borne in mind that the protocol desired must be usable, in some form, in every architecture of interest, not just in a narrow market interest.

A possible approach rather than outright invention is synthesis. This results in a new protocol, one which is built upon previous design work but without architectural "legacy". For example, the Salutation, JetSend and NIIIP approaches might be studied to garner the functionality, tools (e.g., JetSend's description language) and formats that would be combined toward forming the desired protocol. Changes in state machines necessary to conform to new protocol service interfaces would be undertaken and new message formats would be created as necessary. In order to accomplish this, a more detailed requirements document than the present one is needed. This would specify the information to be conveyed, the structure of service interface primitives required, expected behavior (i.e., abstract state machine) and architectural frameworks necessary to realize proper operation. This extended requirements document would evolve from the present document with inclusion of material from existing technology (e.g., Salutation, JetSend, Jini and NIIIP). The challenge will be to write the protocol abstractly enough that it can be incorporated into a wide variety of architectures. One way to do this is to use simpler models (e.g., JetSend or Salutation) for the simpler environments and NIIIP for the more complex ones. This begins to address the goal of a family of protocols of common functionality, and helps to alleviate the scaling problem.

Scaling is another concern that must be carefully considered. Some approaches are not intended to scale up (or down) and the universal accessibility architecture and protocols must be able to scale to any architecture. Consequently, the choice of the model for them is crucial. Further study is needed to determine the scaling properties of candidates for synthesis or adaptation.



2. General Requirements

The purpose of the protocol is the conveyance of self-descriptions, in terms of capabilities and explicit user preferences, of components interacting on some task and their operating environments and conditions, enabling negotiations on those descriptions and conditions, and construction of configurations and interface transformations according to the negotiations for accomplishing the intended task.

There are two modes of interaction to be supported: asymmetric and symmetric. In asymmetric interaction one component uses the services of another component. For example, a person with a disability uses an Accessor, in the sense of the Archimedes project, to interact with a PC or a FAX. In asymmetric usage, one side may have no interface preferences expressed. For example, a server, in conventional client-server architectures, might not express any preferences whereas the client likely would.

In symmetric interaction, the components are "peers", in that neither side need be seen as user or provider of services. For example, in E-Commerce the interactions may be for the purpose of trading or negotiating. Another way of viewing this is to consider that a service user-provider or client-server model is an instance of the asymmetric mode, whereas the symmetric mode might be represented by collaborative interactions, such as chat, trading, white boards and the like.

Conceptually, two Accessors should be able to communicate in a symmetric mode. For example, at present two cellphones can communicate, through the cell network. Consequently, two persons with disabilities should be able to interact symmetrically, with the preferences of each being taken into account.

When interfaces are to be implemented with multimodal interactions, collaboration among modal-accessor elements will be crucial. An example of this is a military command post where speech, visual, audio and gestural controls must be coordinated. Each officer will have preferences based upon rank, experience, specialty and mission assignment, in addition to personal effects.

Preferences in symmetric-mode operation need not be entirely based upon the user interface configurations; preferences could be involved with a particular "face" or role the user wants to present in the interaction, as in role-playing Internet games, or, in E-Commerce, particular product/price combinations over which the participants will negotiate (e.g., "fixed prices are dead" scenarios).

Some systems could be considered to be hybrid, in the sense that they have elements of symmetry and asymmetry to them. For example, in E-Commerce, a buyer and seller could negotiate on price (symmetric) on items that the buyer selects from the seller's catalog (asymmetric).

The self-descriptions of the interacting entities will include:

The task to be accomplished will impose its own conditions on the operation. This information must also be conveyed to the process deciding upon the configuration and transformations necessary for the activity.

'Operating contexts' connotes a different, broader concept than a similar terminology in operating systems or user task environments (in usability, for example). Here it means not only what tasks are being worked on but what the platform is (e.g., PC running Linux), what conditions are present in the physical environment, possibly even the state of the user (fatigue, stress, how busy, etc.).

Constraints may be due to the conditions imposed by the operating context or by a physical limitation of the user. The limitation of the user might be due to an explicit disability or to an inability to use certain interface devices due to conditions or other activities the user might be concurrently engaged in. For example, a helicopter pilot would not use a mouse while in flight. Speech input and output might be limited by ambient noise from the rotors.

Preference negotiation, as such, is to be interpreted broadly. In some cases it will be merely implied, in that the characteristics provided will either match precisely or will not be accepted. The simplest instance of this case is when devices are coupled together with a hardwire "network", e.g., coupling through a serial port on a PC. In other cases, more extensive decision-making will be necessary, in extreme cases possibly requiring the use of special agents to determine whether or not a configuration or transformation is possible, given all the conditions that must be met.
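
The simplest, implied form of negotiation described above might be sketched as follows. This is an illustration only, assuming that profiles are flat attribute dictionaries; the function name `implied_negotiation` and the serial-port attributes are hypothetical and are not drawn from any of the specifications surveyed.

```python
from typing import Mapping, Optional


def implied_negotiation(offered: Mapping[str, str],
                        required: Mapping[str, str]) -> Optional[dict]:
    """Accept only if every required characteristic is matched precisely.

    Returns the agreed configuration, or None when the offer is rejected.
    No counter-proposal is made: this is the "match or reject" case.
    """
    for name, value in required.items():
        if offered.get(name) != value:
            return None
    return dict(required)


# Hypothetical hardwired coupling through a serial port on a PC.
pc_port = {"link": "serial", "baud": "9600", "parity": "none"}
accessor_ok = {"link": "serial", "baud": "9600"}
accessor_bad = {"link": "serial", "baud": "19200"}

agreed = implied_negotiation(pc_port, accessor_ok)
rejected = implied_negotiation(pc_port, accessor_bad)
```

More extensive decision-making, such as agent-based negotiation, would replace the simple rejection with a search for an acceptable alternative configuration.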

There are four principal functions for interaction at interfaces between humans and systems:

These functions need not be orthogonal to one another. In fact, they commonly overlap. For example, a mouse is used for application control via the presentation on the monitor screen, and a touchscreen keyboard can be used for data input (consider an ATM with a touchscreen). Speech recognition can be used for both control and data entry.

Another "interaction function" that takes place among humans (or among systems, or systems and humans) is the Transaction. In a sense, all of the Interaction Functions listed above could be considered to be transactions, but a narrower interpretation is intended. What is intended is typified by exchanges in negotiation or collaboration, queries, responses to queries, and the like. Obviously, a transaction could be realized through one or more of the Interaction Functions above, but this need not be the case--transactions between systems need not involve the other Functions at all.

Any instantiation of these functions implies a context of operation, constraints on usage within context, and preferences and capabilities for realizing the function corresponding to context and constraint. In current computer systems in an office, for example, a mouse is used for control through pointing and clicking. The context here is the office and the computer is assumed to be capable of interpreting the mouse. A default preference is the mouse. However, a keyboard alternative is provided by Windows, for those users having Windows and able to use the alternative. Moreover, the office applications have interfaces designed with mouse-based control in mind and generally with the assumption of the user seated at a desk.

Contrast this now to a restaurant context, where the computer is used to determine seating, ordering and billing, and generally the user is not seated. Often the applications are designed for touchscreen or stylus control and a mouse would be entirely inappropriate for control input.

Then contrast this with a travel computer in a car where again the user is seated but may have both hands busy. Here neither a mouse nor a touchscreen is appropriate, but other modes of control input, such as speech, might be available. Note that this last example is functionally equivalent to a person seated in a wheelchair who has no effective use of his or her hands.

A disability can be expressed as a constraint to operation in some interaction function in some contexts. In this way, descriptions and preferences can avoid reference to any disability and be expressed entirely in terms of contexts and constraints, thus readily accommodating privacy concerns. A blind person might describe his or her preferences in presentation to be aural in relatively quiet contexts and tactile otherwise, without stating that he or she is sight-impaired in any way.
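
A preference of this kind might be represented as a set of context-conditioned rules, as in the sketch below. It assumes a single hypothetical context attribute (ambient noise level) and an assumed threshold for "relatively quiet"; the class and field names are illustrative only. Note that nothing in the profile refers to a disability.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Context:
    ambient_noise_db: float  # hypothetical description of the environment


@dataclass
class Rule:
    applies: Callable[[Context], bool]  # condition on the operating context
    modality: str                       # presentation to use when it holds


@dataclass
class PresentationPreference:
    rules: List[Rule]
    default: str

    def select(self, ctx: Context) -> str:
        """Return the modality of the first matching rule, else the default."""
        for rule in self.rules:
            if rule.applies(ctx):
                return rule.modality
        return self.default


QUIET_LIMIT_DB = 60.0  # assumed threshold, chosen only for illustration

# "Aural in relatively quiet contexts and tactile otherwise."
profile = PresentationPreference(
    rules=[Rule(lambda c: c.ambient_noise_db < QUIET_LIMIT_DB, "aural")],
    default="tactile",
)

quiet_choice = profile.select(Context(ambient_noise_db=40.0))
noisy_choice = profile.select(Context(ambient_noise_db=85.0))
```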

Sometimes a disability is irrelevant to a condition in a context. For example, a blind person who can read braille is generally able to read whether it is dark or light, quiet or noisy in the surrounding environment. On the other hand, a disability might be irrelevant in one interaction function but crucial in another. A person who has only a mobility difficulty might not be constrained by this disability in visual or aural presentations, but might be considerably constrained when control has to be actuated through the visual presentation--links in small font on the browser screen, for example.



3. Architectural Considerations

The protocol requirements are specified in the context of a general architecture that is intended to capture most if not all of the functionality demanded by very broad applicability: the approach is intended to apply to all Information Technology environments envisioned now and for the future. Moreover, the approach is not intended to be limited to accommodation of people with disabilities, nor to a specific technology or set of technologies.

Because of the wide range of possibilities that must be covered, multiple classes of behavior, from very simple to quite complex, must be included. For this reason, a single, all encompassing protocol is not envisioned. Rather, there should be a family of protocols, with core functionality, that correspond to the various levels of complexity encountered. By building around a core functionality, any new complexities (or simplifications) could be accommodated by just creating a new "family" member; an identifier in the initial handshake would determine the class to be used in any particular instance.
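
One way to picture the selection of a family member from the initial handshake is a registry keyed by identifier, as in the following sketch. The identifiers ("home", "enterprise"), the handshake field `family_member` and all class names are assumptions made for illustration; this document does not define them.

```python
from typing import Dict, List

# Registry mapping a family-member identifier to its implementing class.
FAMILY: Dict[str, type] = {}


def register(identifier: str):
    """Class decorator that adds a family member under the given identifier."""
    def wrap(cls):
        FAMILY[identifier] = cls
        return cls
    return wrap


class ProtocolMember:
    """Core functionality shared by every member of the family."""

    def features(self) -> List[str]:
        return ["self-description exchange"]


@register("home")
class HomeNetworkMember(ProtocolMember):
    """A simple member: core functionality only."""


@register("enterprise")
class EnterpriseMember(ProtocolMember):
    """A more complex member that extends the core."""

    def features(self) -> List[str]:
        return super().features() + ["agent-based negotiation"]


def open_session(handshake: dict) -> ProtocolMember:
    """Pick the family member named by the identifier in the handshake."""
    identifier = handshake["family_member"]
    if identifier not in FAMILY:
        raise ValueError(f"unknown protocol family member: {identifier!r}")
    return FAMILY[identifier]()


home = open_session({"family_member": "home"})
enterprise = open_session({"family_member": "enterprise"})
```

Under this arrangement, accommodating a new level of complexity amounts to registering one more class; existing members are not modified.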

The architecture below must not be viewed as supplanting any existing or envisioned architecture or system, but as supplementing them to provide universal accessibility.

The diagram below shows the general services arrangement for the accessibility protocols within the communications hierarchy. The grouping shown is explained below. Except in the simplest realizations, the accessibility protocol "stacks" should be capable of being altered ad hoc to meet the service needs. This will be especially useful when complex collaborations or multimodal interfaces are to be configured. By not using fixed stacks, an implementation can build stacks as needed and attach them as needed to applications. One way of doing this is to use a stack template class and populate it, to create stack instances, with objects representing instances of protocol function classes. By following this approach, protocol functions may be downloaded when updates are necessary or when an entity does not have a protocol that is needed for some interaction.
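
The stack template approach might look something like the sketch below. The protocol function classes (`Discovery`, `Negotiation`) and the dictionary message format are hypothetical stand-ins; downloading of functions is represented only by attaching an object to an existing stack instance.

```python
from typing import Iterable, List, Type


class ProtocolFunction:
    """Base class for one protocol function placed in a stack."""

    def process(self, message: dict) -> dict:
        return message


class Discovery(ProtocolFunction):
    def process(self, message: dict) -> dict:
        return {**message, "discovered": True}


class Negotiation(ProtocolFunction):
    def process(self, message: dict) -> dict:
        return {**message, "negotiated": True}


class Stack:
    """A stack instance: an ordered list of protocol function objects."""

    def __init__(self, layers: List[ProtocolFunction]) -> None:
        self.layers = layers

    def attach(self, layer: ProtocolFunction) -> None:
        # Stands in for adding a function obtained later, e.g. by download.
        self.layers.append(layer)

    def send(self, message: dict) -> dict:
        # Pass the message through each function in order.
        for layer in self.layers:
            message = layer.process(message)
        return message


class StackTemplate:
    """Holds function classes; each instantiate() call builds a new stack."""

    def __init__(self, layer_classes: Iterable[Type[ProtocolFunction]]) -> None:
        self.layer_classes = list(layer_classes)

    def instantiate(self) -> Stack:
        return Stack([cls() for cls in self.layer_classes])


template = StackTemplate([Discovery])
stack = template.instantiate()
before = stack.send({"task": "print"})
stack.attach(Negotiation())  # alter the stack ad hoc
after = stack.send({"task": "print"})
```

Because each stack instance owns its own list of function objects, altering one stack leaves the template and any other instances unchanged.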

The Accessor/Transformer entity shown in the diagram couples a user's interface device to the communications system. It is analogous to the Accessor/TAP in the Archimedes architecture, but does not directly correspond to the Accessor/TAP functionality. In Archimedes, the communications corresponding to the communications described in this document exists between the Accessor and TAP. In the usage here, the communications has been conceptually split, with the Accessor doing some of its own communications independent of the Transformer, such as lookup and possibly negotiation. The Transformer can be internal to the Accessor or it can be an external proxy. The Transformer and its functions are described farther below.

In asymmetric mode operation, the Accessor on the "server/target" side is replaced by a server/target application. In symmetric mode operation, both sides have Accessor/Transformer components.

 

Accessibility Protocol Structure

The Immediate Services are for an operation that has previously been negotiated and set up, or which is a direct connection between the session initiator ("Initiator") and the responding entity ("Respondent"). For example, accessing a frequently used application or service would not need to be renegotiated every time. The Mediated Services provide for negotiating or updating operating features and configurations between the Initiator and Respondent, as well as for collaboration among a group of systems. A service or device used by a large number of different individuals, such as a kiosk, would almost always involve negotiation. However, if there were repeat sessions (e.g., a person using the same ATM week in and week out) the user might be "remembered" by the system and interaction could be handled by the Immediate Services instead. In such cases a repository becomes crucial to operation.

The Connection Technology provides for discovery and lookup access, as might be provided, for example, by Jini, CORBA or any of the home networking systems. It might also provide Transaction Services, to ensure that the preferences and capabilities are properly received and that negotiations, configurations and setup are received and committed to. When entities are not closely coupled (e.g., not on the same hardware platform or in a network where entity or service availability has some probability of failure) this transaction should be a two-phase commit protocol. For example, when the two communicants are distributed in the Internet, the two-phase commit protocol should be used. When negotiation takes place between an Accessor and a kiosk, the kiosk itself may be doing all the negotiation, configuration and setup, and the connection with the Accessor may be considered coupled closely enough that reliability is not a significant question. Here a two-phase commit protocol would not be needed, but each transfer of information would have to be acknowledged.
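For readers unfamiliar with two-phase commit, the following sketch shows its basic shape: every participant first votes on a proposed configuration (phase 1), and the configuration is committed only if all votes are positive (phase 2). The `Participant` interface and the configuration keys are assumptions made for illustration. Timeouts, logging and crash recovery, which a deployed transaction service would need, are deliberately omitted.

```python
from typing import Dict, List, Optional


class Participant:
    """A communicant taking part in the transaction (hypothetical interface)."""

    def __init__(self, name: str, will_accept: bool = True) -> None:
        self.name = name
        self.will_accept = will_accept
        self.state = "idle"
        self.staged: Optional[Dict[str, str]] = None
        self.committed: Optional[Dict[str, str]] = None

    def prepare(self, config: Dict[str, str]) -> bool:
        # Phase 1: stage the proposed configuration and vote on it.
        if self.will_accept:
            self.staged = dict(config)
            self.state = "prepared"
            return True
        self.state = "refused"
        return False

    def commit(self) -> None:
        # Phase 2 (success): make the staged configuration permanent.
        self.committed, self.staged = self.staged, None
        self.state = "committed"

    def abort(self) -> None:
        # Phase 2 (failure): discard anything that was staged.
        self.staged = None
        self.state = "aborted"


def two_phase_commit(participants: List[Participant], config: Dict[str, str]) -> bool:
    """Commit `config` on all participants, or on none of them."""
    votes = [p.prepare(config) for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False
```

In the closely coupled kiosk case described above, the voting round would be dropped and a simple acknowledgment per transfer would take its place.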

The Operation Services provide Session and Presentation level transfer of information (i.e., data, messages, control) between connected entities. The Immediate Services can use the Operation Services or the Networking Services directly. The Operation Services provide an "API", appropriate for accessibility or collaboration functions, for Networking Service protocols that are needed to support the interactions. This may not be needed in many cases, whereby the Networking Services protocols can be used directly. However, the configurations created as a result of negotiation may require coordination of the Networking Service protocols that cannot be anticipated by an application. Consequently, an "API"--a set of classes instantiated and arranged ad hoc in configuration and setup--will be necessary to provide the coordinated interface to the application. This may be more likely to occur with collaborations and multi-media/multimodal access than with other kinds of access (e.g., one-on-one messaging).

Networking Service is to be provided by any networking technology appropriate to usage, including the Internet, IR and RF technologies, LAN technologies, power line technologies, phone line technologies, etc.

In the diagram below, Initiator and Respondent are two communicants; the Initiator begins the session ("is the calling party") and the Respondent is the entity with which the Initiator wishes to interact ("is the called party"). In the Archimedes architecture the Initiator corresponds to the Accessor and the Respondent corresponds to the Target Device. However, for the sake of complete generality, the Initiator need not be an accessor device--the Respondent may actually be an accessor system. The diagram assumes that discovery/lookup (see Section 5.2) has already taken place and that transfer of preference profiles and capabilities has been completed. The system is ready for negotiation to begin.
Proxies in this architecture are analogous to functions of the TAP in the Archimedes Architecture. However, unlike TAPs, the proxies may be part of the entity that invokes them. In this situation a protocol (or internal interface) specific to the invoking entity connects the two. Like TAPs, they could also be physically separate from the invoking entity, connected to it by proprietary protocol or by a standard protocol. In some situations, all of the proxies shown--for one communicant--could be contained in a single entity. Any of the proxies could be implemented within Initiator or Respondent or totally separately as agents, devices or other servers.

Any of the Protocols in this diagram can be virtual. That is, they could be implemented, not as explicit protocols like TCP, but perhaps as parameter passing in local method invocations or even database queries. This may be especially true of the protocols which are specific to an entity, such as the Intercession protocols described farther below. [Such protocols may be proprietary to a device or system or may require special implementation. Discussion of them is beyond the scope of this document.] Some of the proxies, e.g., the Negotiation and Configuration proxies, may be in fact implemented on one or the other of Initiator or Respondent.

Similarly, Repositories could be locally implemented or could be network services. For example, an ISP might offer repository services to its customers.

 

Preferences Configuration

The Intercession Proxy provides the ability ("intercedes") for an entity to execute a Transaction Protocol during preference negotiation, when one is necessary (e.g., the entity is not powerful enough). The Intercession Protocol may be entity-specific. The Negotiation Proxy possesses the ability to perform negotiations on preferences and capabilities. It passes the results of this negotiation back to the communicants and to the Configuration and Setup Proxy. The negotiations can be very simple, i.e., direct attribute matches, or more complicated, with preferences that influence one another or with acceptable ranges of behavior as an expressed preference. This will depend greatly upon the nature of the interacting entities and their preferences and capabilities. Having been presented the preferences and capabilities negotiated by the communicants, the Configuration and Setup Proxy decides how to configure the system between the communicants and what information is necessary to set up the interaction. This agent may use decision theoretic means to determine an optimal configuration (if one exists), or it may use simple table lookups based upon attribute matching, depending upon the nature of the communicants' profiles. When this has been accomplished, the communicants are informed and the proxy sends each communicant the appropriate information to set up the interactions.
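The two simplest kinds of negotiation mentioned above, direct attribute matching and matching against an acceptable range, can be sketched as follows. The representation is an assumption: an attribute is either an exact string value or a (low, high) numeric range, and the attribute names in the usage example are hypothetical. Preferences that influence one another and decision theoretic methods are outside this sketch.

```python
from typing import Dict, List, Optional, Tuple, Union

# Assumption: a value is an exact string or a (low, high) acceptable range.
Value = Union[str, Tuple[float, float]]


def match_attribute(pref: Value, cap: Value) -> Optional[Value]:
    """Return the agreed value, or None when preference and capability conflict."""
    if isinstance(pref, tuple) and isinstance(cap, tuple):
        # Two ranges agree on their overlap, if there is one.
        low, high = max(pref[0], cap[0]), min(pref[1], cap[1])
        return (low, high) if low <= high else None
    # Otherwise require a direct attribute match.
    return pref if pref == cap else None


def negotiate(
    preferences: Dict[str, Value], capabilities: Dict[str, Value]
) -> Tuple[Dict[str, Value], List[str]]:
    """Match every preference against the capabilities; collect conflicts."""
    agreed: Dict[str, Value] = {}
    conflicts: List[str] = []
    for name, pref in preferences.items():
        result = match_attribute(pref, capabilities[name]) if name in capabilities else None
        if result is None:
            conflicts.append(name)
        else:
            agreed[name] = result
    return agreed, conflicts
```

For example, a preferred font size range of (18.0, 36.0) matched against a capability of (8.0, 24.0) yields the agreed range (18.0, 24.0). Any entries returned in `conflicts` are the points at which a negotiation entity would need to go back to a communicant.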

Once the preferences and capabilities have been realized in a configuration, the communicants can begin to interact with one another. In order that the preferences and capabilities are respected during operation, it may be necessary to perform transformations on data passing between the communicants. This is exactly analogous to TAP operation in Archimedes. There are two modes: 1) an operation protocol carries information to a Transformer Proxy of the other communicant, which transforms the information and passes it via an entity-specific protocol for interface use; 2) an entity-specific protocol sends data to its associated Transformer Proxy, which transforms the data and passes it via an operation protocol to the other communicant. This is necessary because not all of the transformations that are to be made are consistent with only one mode of transfer. A simple example of transformation is speech-to-text. The speech can be converted to text before being sent, or the speech stream itself can be sent and converted when received. Clearly, this depends upon the nature and capability of the device where the speech is input. If neither the sending nor receiving devices is powerful enough, proxies will be required. This might occur when the sending device is a wireless unit with microphone input and a vocoder, and the receiving device is an appliance that understands only text-based commands.
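The choice between the two modes can be illustrated with a small decision function. It is a sketch under a strong simplifying assumption: each device is described by a single flag saying whether it is powerful enough to perform the transformation (for example, speech-to-text). The `Device` type and the preference for transforming on the sending side are choices made for illustration, not requirements of this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    can_transform: bool  # assumption: one flag summarizes processing power


def place_transformer(sender: Device, receiver: Device) -> str:
    """Decide where a transformation such as speech-to-text should run."""
    if sender.can_transform:
        # Mode 2: transform first, then pass via an operation protocol.
        return "sender side"
    if receiver.can_transform:
        # Mode 1: send the raw stream and transform it on arrival.
        return "receiver side"
    # Neither device is powerful enough, so a separate proxy is required.
    return "external proxy"
```

For the wireless unit and text-only appliance described above, both flags would be false and the function returns "external proxy".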


4. Informational Requirements

4.1 Entity Self-descriptions

Because every user will have different needs, and systems and collections of devices may have configurations that change with every session, it will be necessary to use self-descriptions of entities to provide flexibility. This information is used to determine operating conditions and configurations for a given session, and to support dynamic changes in these conditions and configurations during a session. These operating conditions and configurations are the result of negotiation, either by one of the communicants or by a neutral party. Simple descriptions will almost always result in simple negotiations (perhaps only attribute matching), while complex descriptions will often result in complex negotiations, including such things as matching an offer to a range (similarity), weighted decision models and counter-offers.

Profiles

The self-descriptions, referred to as Profiles, include: information about the user's working context; about the task that is to be performed; what kinds of constraints are placed on how the task is accomplished, e.g., by the working context or by limitations of the user; what solutions the user prefers to apply to ameliorate the constraints; platform and communications capacities; device capabilities and limitations. However, not all sessions will use or even have this information, depending on what the interacting components are and what is to be done. For example, in coupling a headtracker to a computer through a wireless connection, the profile might only state that for all contexts, this user prefers the head tracker as a pointing device to be used for control and data entry, and that visual presentation is preferred in all contexts. Note that no mention is made of a disability for the user. In fact, this user might not be disabled at all, but merely does not have hands free to operate a mouse and keyboard. It is important for privacy protection that any characterization of a user's disability not be present in a profile.

More complicated profiles could occur because the user is affected by noise or heat, for example, or because a tactical military situation prevents high light levels on a screen. A user might also find after a while that he or she is becoming fatigued or that the cognitive load in the presentation is too high. Again, these could be related to disabilities or not at all. The profile structure, therefore, must be able to handle various levels of complexity. In order for this to occur, there needs to be an abstract description language to provide the flexibility of structure that is required. This would enable implementation approaches appropriate to the complexity.

A profile is made up of Interaction Parameters, as described below. In practice, a profile may be "sparse", i.e., not completely filled. For example, when "widely recognized" defaults are to be used, there is little need in specifying their characteristics--the defaults could be referenced by a standard code or just applied indiscriminately when nothing is specified for them. Similarly, when a Parameter is frequently invoked by a user, it could be referenced by a code. A Parameter might be irrelevant to a given operating environment and could just be ignored. A Preference might be chosen to apply no matter what the context. For example, a user with a disability might have an accessor which has a breath switch. Regardless of the context, the user has but this one choice.
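One way to picture a sparse profile is as a small set of explicit entries layered over coded defaults. The sketch below assumes that defaults are looked up by a code such as "std-desktop". That code and the parameter names are hypothetical and stand in for whatever standard codes might eventually be defined.

```python
from collections import ChainMap
from typing import Dict

# Hypothetical "widely recognized" defaults, referenced by a standard code.
DEFAULTS: Dict[str, Dict[str, str]] = {
    "std-desktop": {
        "pointer": "mouse",
        "text_entry": "keyboard",
        "presentation": "visual",
    },
}


def resolve(sparse_profile: Dict[str, str], defaults_code: str = "std-desktop") -> Dict[str, str]:
    """Fill the unspecified parameters of a sparse profile from coded defaults."""
    # Entries in the sparse profile take priority over the defaults.
    return dict(ChainMap(sparse_profile, DEFAULTS[defaults_code]))
```

A profile that states only `{"pointer": "head tracker"}` would resolve to the head tracker for pointing, with the keyboard and visual presentation taken from the defaults.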

Interaction Parameters

Interaction parameters consist of the Task and the Interaction Functions that it uses, the Contexts, the Constraints and the Capabilities and Preferences. The sense of this is that Task, Context and Constraints determine a set of possible choices for interface elements. The user then states (explicitly or implicitly) a preference among these choices for the session. It is also possible to change preferences within a session whenever changes within the Task, Context and/or Constraints occur. Thus, for example, a user on the bridge of a warship may want to change to a dim navigation display when night comes on to avoid unnecessary light that may betray position. The diagram below shows the structural relationship of the Task, Context and Constraints. The interpretation is that the Interaction Functions of the Task are constrained within a given context. The constraints may be dependent or independent of context and may be different for each context of consideration. It is also possible that a constraint is irrelevant to a context or does not exist for a given context. Each of the parameters is discussed in more detail below. Capabilities are not task dependent in general, but capabilities must be properly matched with Task Interaction Functions in order to carry out the assigned task.

 

Contexts

A context consists of several environments which may be interrelated. Three primary environments are the Physical, Computing and Personal environments.

The Physical environment consists of the physical area where the user is located and the conditions of that area. For example, the Physical environment of an office might consist of a user space (in a room or cubicle), air temperature and humidity, lighting, furniture, and noise level. The Computing environment in this office might consist of Windows 98 (or MacOS or Linux) in a client-server system. The Personal environment might include time of day, levels of stress, fatigue, physical strain, discomfort. As another example, when accessing an ATM, the Physical environment might describe the area around the ATM, the air temperature and humidity (is it hot? is it raining?), lighting around the ATM, and noise level. The Computing environment might consist of the ATM embedded processor and wireless access to it. The Personal environment might be the same as for the Office example, but because ATM access is momentary and office usage is sustained over a period of time, the effects of the Personal environment are not as likely to be a factor. Personal environment becomes more critical when the Physical environment factors are likely to cause stress or fatigue, such as in a fighter cockpit. Computing environments are important to the extent that they condition the usage and interface configurations possible. For example, the computing environment for a hand-held wireless device is different than that of an office PC on an extensive intranet.

Preferences

The preference model is not a user model, but some preferences may be derived in some cases from a user model. In general, preferences represent an opposing concept to the user model, in that preferences more explicitly represent an individual in contrast to a user model which is usually the aggregated characteristics of many individuals. "Default" preferences may in fact be those of a large number of persons. For example, at present a default preference is the use of a mouse in a windowing environment in a home or office context, since millions of people successfully use a mouse in this situation.

Preferences are almost always context, task and constraint dependent. They are similar in some ways to capabilities in that given a set of capabilities (ways of doing something), a user makes a choice--states a preference--among the capabilities to accomplish a task in a given context with certain constraints.
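The dependence of a preference on task, context and constraint can be sketched as a lookup that tries the most specific entry first and then falls back to more general ones. The wildcard `ANY`, the key layout and the example values are assumptions made for illustration only.

```python
from typing import Dict, Optional, Tuple

ANY = "*"  # hypothetical wildcard meaning "no matter what"

Key = Tuple[str, str, str]  # (task, context, constraint)


def preferred(prefs: Dict[Key, str], task: str, context: str, constraint: str) -> Optional[str]:
    """Return the stated preference for this situation, if any."""
    # Try the most specific key first, then progressively more general ones.
    for key in (
        (task, context, constraint),
        (task, context, ANY),
        (task, ANY, ANY),
    ):
        if key in prefs:
            return prefs[key]
    return None
```

With entries such as `("navigate", "night", ANY): "dim display"` and `("navigate", ANY, ANY): "normal display"`, the same task resolves to different presentations as the context changes during a session.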

Sources of preferences can be from direct input by a user (or someone acting for a user), from intelligent agents observing the user's behavior, from user models or from usability studies.

Capabilities

Capabilities in this usage have as much to do with the capability of a system or device to transform objects for interface adaptability as for device characteristics. For example, if a user is trying to use a simple appliance, the appliance probably will not have the processing power to provide interface transformation for the user. When capabilities and preferences are examined, it will be clear that some kind of proxy is needed to do the transformation. In HomeAPI, this might be an object added to the API which performs this when needed, as an adjunct to the device properties checking function. Capabilities are not task dependent in general, but capabilities must be matched up with Interaction Functions in order to carry out the assigned task. For example, control input should be carried by a channel with as low a latency as possible. On the other hand, high bandwidth is needed for high volume transfers, and latency is of little concern. In accommodating a disability, the capabilities of the accessing device are as important as those of the target system or device. A wireless accessor using speech input may not have the power of converting the speech to text before transmitting. Similarly, it may not have the ability to convert received text back to speech.
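Matching capabilities to Interaction Functions can be sketched for the latency and bandwidth example given above. The `Channel` type, the function names "control" and "bulk-transfer", and the figures used in any example are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Channel:
    name: str
    latency_ms: float
    bandwidth_kbps: float


def choose_channel(function: str, channels: List[Channel]) -> Channel:
    """Pick the channel whose capabilities suit the Interaction Function."""
    if function == "control":
        # Control input wants as low a latency as possible.
        return min(channels, key=lambda c: c.latency_ms)
    if function == "bulk-transfer":
        # High volume transfers want bandwidth; latency is of little concern.
        return max(channels, key=lambda c: c.bandwidth_kbps)
    raise ValueError(f"no matching rule for function {function!r}")
```

Given one low latency, low bandwidth channel and one high bandwidth channel, control traffic and bulk transfers would be routed to different channels even though both belong to the same session.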

Constraints

A Constraint can be thought of as anything that limits the efficiency and/or effectiveness of the user to employ particular instances of Interaction Functions.

Impaired vision, for example, limits a user in employing visual means of Presentation. However, the person's vision might be impaired not by physical disability but by law, because some state does not permit a video screen in an automobile that can be seen by the driver. Persons who cannot use their hands, because of a disability or because their hands are busy with other activities--perhaps flying a helicopter--are limited in their means of employing hand-activated Control and Information Input devices. A Presentation might need to be limited in complexity because the user has a cognitive disability or because the operating environment is high stress and the user is cognitively overloaded--perhaps a fighter cockpit in combat.

Clearly, there is a wide interpretation to be had for Constraints. They can be the result of disability, of conflicting user activities, of the context or environment of the user and the system, or of the state of the user. As indicated in the Interaction Parameters diagram above, categories of Constraint include: Physical, Cognitive, and Emotional.

Task Analysis

A task may be considered to be a collection of different instantiations of Interaction Functions.

Since Tasks can be thought of as being made up of Interaction Functions which are invoked in sequences or in groups to accomplish some goal, Task descriptions must give some sense of what is to be done and what functions are necessary to achieve it. Ideally, task analysis for the most common types of tasks will be done off-line; then a simple index into a repository would be used for selection. Simple tasks, such as pointer actions (i.e., from mouse, touchscreen, eyetracker, etc.) may require only simple descriptions. Other kinds of tasks may involve complex collaborations and very complex descriptions. Some descriptions may not be adequate to make an automated analysis, and human intervention could be required.

Some common tasks are:

Capabilities will indicate the ability of one side or the other to handle such tasks. For example, one side may wish to control a device in some aspect. If the device does not provide access to that control, then the task cannot be accomplished. Access control (security) may also be considered a capability. If, for example, access is requested to write into a military database, and the user has no authorization to do so, then the task cannot be accomplished. In this sense, the protocol proposed might be successfully used in role-based access.

A task in a given context under certain conditions will provide a set of possibilities for preferences. In certain combinations, there may be no choice, or a preference may be meaningless.

4.2 Formats

A general format for expressing a profile is shown below. When only capabilities are being conveyed, the Preferences part is empty. This would be typical of a Respondent in an asymmetric mode of operation. A communicant might also transmit only the Preferences part. Any entry may be empty. Each Interaction Function has its own set of Contexts and Constraints, because one Function may be differently affected by a context or constraint than another. This may result in some duplication, but it is necessary to maintain flexibility. The structure implies that a hierarchical description language would be useful in describing the format for transmission.

Profile Format
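Since the structure described above is hierarchical, it can be pictured with nested record types. The sketch below is one possible reading of the prose and not a rendering of the actual format: the field names and the choice of JSON as the serialization are assumptions.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class FunctionEntry:
    """One Interaction Function with its own Contexts and Constraints."""
    function: str                                   # e.g. "pointing"
    contexts: List[str] = field(default_factory=list)
    constraints: List[str] = field(default_factory=list)
    preference: str = ""                            # empty when nothing is stated


@dataclass
class Profile:
    entity: str
    capabilities: Dict[str, str] = field(default_factory=dict)
    preferences: List[FunctionEntry] = field(default_factory=list)

    def to_json(self) -> str:
        # Any hierarchical description language would serve; JSON is used
        # here only because it is available in the standard library.
        return json.dumps(asdict(self), sort_keys=True)
```

A Respondent in asymmetric mode would fill only `capabilities` and leave `preferences` empty, while a user's accessor might send only `preferences`. Repeating the contexts and constraints inside each `FunctionEntry` reflects the duplication that the text accepts for the sake of flexibility.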


5. Behavioral Requirements

5.1 Proxy protocols

Since proxies are used whenever a device or system does not have the power to communicate, negotiate or execute certain operations on its own, protocols between the device and the proxy are required. Because these protocols may depend on the device and on the device-to-proxy network, their details are out of the scope of this requirements specification. Nevertheless, a proxy protocol must be capable of carrying pertinent information from or to the entity it represents. For example, consider a proxy for an accessor to a kiosk, where the accessor is an RF-linked device that is only capable of transmitting identification and preference information. The user prefers to interact with the kiosk through the kiosk's own speech input and output devices. The accessor is not capable of receiving any information from the kiosk (e.g., it might be a smart card). The proxy then negotiates with the kiosk for the user and notifies the user when setup is complete. Since the user has chosen audible output, this could be done by a simple audible signal or a spoken "OK". Note that a spoken "OK" is likely to be understood by almost all (hearing) users regardless of language, since it is used throughout virtually the entire world.

5.2 Discovery Processes

A protocol that an Initiator (Respondent) uses to invoke discovery (respond to a discovery request) is needed above the actual discovery service (of the Connectivity Services) in order that receipt of the transmitted data for identification and preference profiles can be acknowledged. This is especially critical for collaborations and multimodal sessions. The protocol must cover the situations where a system or device is only making itself known to the repository, is providing a profile, or is updating a previous profile. It also must cover the case when a previous profile is to be used again and a simple reference to it is provided. The diagram below shows the elements involved in the Discovery Process.

Discovery Process

The Discovery Processes consist of Identification, by which an Initiator makes itself known to the "universe" in which it exists or intends to operate, and Preferences Declaration, in which the preferences of the user are declared to ameliorate the operating conditions. At a minimum, Identification consists of providing a locator and a description of what the entity is (e.g., a computer, a device, an appliance, a service). In the general case depicted in Section 3, the Initiator is represented in this by a proxy when the Initiator is not powerful enough to handle the communications on its own. (The Respondent may also have a proxy for the same reason.) In any session, an Initiator has either made itself known previously or it has not. The Discovery protocol must be capable of handling both of these cases. The Discovery Process is supported by a Lookup Service in the Connectivity Service. An Identification session can have five purposes.

  1. For identification only, in which no preferences profile is to be registered, i.e., a session with a Respondent will be undertaken at a later time.
  2. A preferences profile is to be submitted, but no contact is desired yet.
  3. The Initiator is to provide a preferences profile and desires contact with a Respondent, either known or unknown.
  4. The Initiator has previously identified itself and submitted a preferences profile, and now desires a session with a Respondent, known or unknown.
  5. The Initiator wants to update a previously submitted preferences profile, with or without communicating with a Respondent.

    If a Respondent is unknown, then discovery through the Connectivity Service is needed to locate a Respondent before communication can begin. Since the Respondent is presumably unknown, it must be found by characteristics that the Initiator provides. If such a Respondent has not previously registered, then the session cannot be established.

    In any case, the preferences profile is not actually sent to the Repository until the Repository requests it. This is necessary to ensure that the Repository has actually received the Identification Request and is able to act upon it.
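The five purposes of an Identification session, together with the rule that a profile is sent only after the Repository asks for it, can be sketched as follows. The names `Purpose` and `Repository` and the method names are hypothetical, and the lookup of an unknown Respondent is left out.

```python
from enum import Enum
from typing import Dict, Set


class Purpose(Enum):
    IDENTIFY_ONLY = 1
    SUBMIT_PROFILE = 2
    SUBMIT_AND_CONTACT = 3
    CONTACT_WITH_STORED_PROFILE = 4
    UPDATE_PROFILE = 5


# Purposes for which the Repository must request a profile from the Initiator.
NEEDS_PROFILE = {Purpose.SUBMIT_PROFILE, Purpose.SUBMIT_AND_CONTACT, Purpose.UPDATE_PROFILE}


class Repository:
    def __init__(self) -> None:
        self.known: Dict[str, str] = {}      # locator -> kind of entity
        self.profiles: Dict[str, dict] = {}  # locator -> stored profile
        self.awaiting: Set[str] = set()      # locators whose profile was requested

    def identify(self, locator: str, kind: str, purpose: Purpose) -> bool:
        """Register the entity; return True if a profile is now requested."""
        self.known[locator] = kind
        if purpose in NEEDS_PROFILE:
            self.awaiting.add(locator)
            return True
        if purpose is Purpose.CONTACT_WITH_STORED_PROFILE and locator not in self.profiles:
            raise LookupError(f"no stored profile for {locator!r}")
        return False

    def receive_profile(self, locator: str, profile: dict) -> None:
        # The profile is accepted only after the Repository has requested it.
        if locator not in self.awaiting:
            raise RuntimeError("profile sent before it was requested")
        self.awaiting.discard(locator)
        self.profiles[locator] = dict(profile)
```

The return value of `identify` plays the role of the Repository's request: an Initiator that receives `True` knows that its Identification Request arrived and that it may now send the profile.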

    The Preferences Declaration proceeds when Identification has concluded and consists of a request to an Initiator for a preferences profile, as described in Section 4. This activity may also include a request to a Respondent for capabilities and, in the symmetric mode, a Respondent preferences profile as well. Each of the communicants should acknowledge receipt of the request before the Negotiation Process (below) can begin.

    The functional requirements outlined above are the basis for the grouping of discovery activities into Mediated and Immediate Services, as discussed in Section 3.


    5.3 Negotiation

    When the network and connectivity are such that there is a possibility that a resource (e.g., target system accessor or negotiator agent) might not be available because of network or host crashes or overloading, a two-phase commit transaction protocol may be necessary to assure that negotiation transactions are completed. This may not be necessary when operating in a tightly coupled or robust network system (e.g., a home network) where unavailability of a resource is unlikely or negotiation is infrequent. It may also not be needed when unsophisticated negotiation (i.e., simple attribute matching) is used. However, some form of commit and acknowledge must be used when every session requires negotiation, such as in public kiosks, ATMs and automated POS terminals. It is also crucial in collaboration or other "federated" activities.

    The form of the preference profiles will determine the degree of negotiation required. Simple attribute matching generally will not require the active participation of the prospective communicants. On the other hand if there are several options presented for some property or preference, agreement by the prospective communicants may be necessary to make the decision before configuration can take place.

    In any case, the Negotiation Process should proceed as follows:

    1. The negotiation entity (i.e., proxy or agent) will notify the communicants that negotiation is beginning. Until all communicants commit, there can be no negotiation. Consequently, the negotiation entity waits until it has received commitments from all communicants.
    2. When all communicants have committed to negotiation, the negotiation entity signals that negotiations have begun.
    3. At any point during negotiation, the negotiation entity may contact a communicant for the following reasons:
      • The negotiation has failed (for stated reason);
      • There is a conflict of preferences and capabilities--submit another preference (or capability)?
      • The negotiation entity has found a better option than the communicant originally offered--accept it or reject it?
      • Negotiations have successfully completed.
    4. When negotiations have completed successfully, the communicants are all notified and Configuration and Setup activities can begin.
    5. When negotiations have not successfully completed, connection with each communicant is severed. Some systems (communicant or negotiation entity) may want to do post-mortem analysis to avoid further problems or to improve future opportunities.

    The negotiation entity will obtain the preferences and capabilities information from the Repository. If the negotiation entity is not co-located with the Repository, then a separate protocol will be necessary to transfer this information from the Repository to the negotiation entity.
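The Negotiation Process steps listed above can be read as a small state machine. The sketch below assumes four states with hypothetical names and records notifications in a list instead of sending them. The mid-negotiation queries of step 3 (conflicts and better options) are not modeled.

```python
from typing import Iterable, List, Tuple


class NegotiationSession:
    """Tracks one negotiation from commitment through success or failure."""

    def __init__(self, communicants: Iterable[str]) -> None:
        self.everyone = set(communicants)
        self.pending = set(self.everyone)   # communicants yet to commit
        self.state = "awaiting-commitments"
        self.log: List[Tuple[str, str]] = []
        self._notify_all("negotiation is beginning")          # step 1

    def commit(self, who: str) -> None:
        if self.state != "awaiting-commitments":
            raise RuntimeError(f"cannot commit in state {self.state!r}")
        self.pending.discard(who)
        if not self.pending:                                  # step 2
            self.state = "negotiating"
            self._notify_all("negotiations have begun")

    def finish(self, success: bool, reason: str = "") -> None:
        if self.state != "negotiating":
            raise RuntimeError(f"cannot finish in state {self.state!r}")
        if success:                                           # step 4
            self.state = "configuring"
            self._notify_all("negotiations completed successfully")
        else:                                                 # step 5
            self.state = "severed"
            self._notify_all(f"negotiation failed: {reason}")

    def _notify_all(self, message: str) -> None:
        for name in sorted(self.everyone):
            self.log.append((name, message))
```

The session stays in "awaiting-commitments" until every communicant has committed, which mirrors the requirement that there can be no negotiation until all communicants commit.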


    5.4 Configuration and Setup

    Once profiles from the communicants have been negotiated, it will be necessary to convey information on configurations and for setting up the interaction. This can range from very simple information, e.g., how to couple to a device's driver and what data types will be used, to very complex, e.g., locators, interfaces and protocols for agents, proxies and collaborators. Since this information is critical to the success of the interactions, it needs to be reliable.

    The actions required for Configuration and Setup are


    5.5 Task operation

    Operation of tasks will often require protocols to communicate from one system to another. For example, in ATM operation, once contact has been made and the ATM's interface is configured according to customer preferences for the task, it remains to get particular data regarding the banking transaction (e.g., dollar amounts) to the ATM. Because the system may be using wireless communications for this and because of the criticality of the transaction, some form of reliable interaction must be used. A full two-phase commit transaction protocol is not warranted across the accessor link, but some form of commit and acknowledgment is necessary to assure the customer of the local completion of the transaction. Other protocols will be required to send interface actions and events to the ATM, and ATM reactions back to the customer's device interface. These could be "lower" level protocols with low latency. If the user happened to be using speech input, this might be transmitted directly to the ATM for recognition processing. This would require a different protocol than if (speech-to-) text were being used on the customer's device. If this same device is now used for obtaining street traffic information (perhaps the customer is blind), still a different protocol might be used.
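A lightweight form of commit and acknowledgment across the accessor link might look like the sketch below, which resends a message until an acknowledgment arrives or a retry limit is reached. The `send` callable, which returns `True` when an acknowledgment comes back, is an assumption standing in for the actual link. A real banking transaction would also need some means, such as a sequence number, of ensuring that a retransmitted request is not executed twice.

```python
from typing import Callable


def send_committed(send: Callable[[bytes], bool], message: bytes, max_attempts: int = 3) -> int:
    """Send `message` until acknowledged; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if send(message):  # True means an acknowledgment came back
            return attempt
    raise TimeoutError(f"no acknowledgment after {max_attempts} attempts")
```

The customer's device can report local completion of the transaction as soon as `send_committed` returns, and can report a failure if `TimeoutError` is raised.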

    This suggests that protocols might not exist in the device as fixed stacks but temporary stacks which would be built ad hoc from classes representing protocol functions and attached to the application/user interface as needed. If this approach were used, then protocol functions that a device did not have could be downloaded from a class library. Stack skeletons would be used for commonly occurring protocols. But this also means that some form of coordination function and interface will be needed in order to present an application a common way of using the protocols instead of a variety of different interfaces. While this is not necessarily an implementation issue for simple environments, with collaboration and multiple interaction communications it will be important to developers.


    6. Some Examples

    Computer-based Jobs

    The picture below is intended to represent controlling a FAX using a PC as an accessor. The task is to send a FAX of a document that is not already in digital form (i.e., on paper), so that a FAX internal to the PC is not an option. The PC has software that can control the FAX and can adapt its screen presentation to the user, through larger fonts, different contrasts, different layouts, etc. It could also have speech input and output, a touchscreen, headtracker, eyetracker or a haptic pad for control inputs, depending upon its users. These could be connected to the PC via a TAP as in Archimedes, but a more flexible approach would be to use an IR or RF link for those personal devices that would have to plug into the PC. This would alleviate, for example, a person who has a mobility disability from having to plug in a headtracker each time just to send a FAX. The software would permit a person (perhaps a system administrator-type) to select and set up the preferences for any user in any of these access media. These would just be stored on the PC and called up whenever the user came to send a FAX. If each user were given an identifier device that linked to the PC via an IR or RF link, the preferences could be brought from the repository without transferring them from an external source. There would be little need for negotiation. A protocol (in green on the picture) such as that in Salutation (or Jini or Chai or Universal Plug and Play or ...) coupled to the application would be used to actually control the FAX. The FAX and PC are shown connected by a LAN. They could be connected directly, but then the FAX might not be physically available to everyone, and an added advantage is that more than one PC can be connected to the FAX via the LAN. In many ways, this example is not very different from the Home Automation example below.

     

    PC-operated FAX

     

    The example above can be extended in many ways. For example, suppose that there are several computers all linked to a server via a LAN and any one of these is to be used by temporary employees who are to enter data into a database. A given employee may not be assigned the same computer every day (they could be assigned as they arrive). For employees with no disability, this is not a problem, in general, but a person with a disability could have difficulty in using a different computer every day. The baseline Archimedes solution would physically plug the user's device and accessor into the computer, but this would have to be done anew every day. If the coupling were RF- or IR-based, however, the user would only have to be in proximity to the computer to link with it. The user's preferences could be pre-loaded into the server and negotiation (if needed) would take place between the server and the client machine. The user would be identified through the accessor link to the computer. Other, non-disabled users could also have their own preferences enabled in a similar way--their identifier device might just be smaller.
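
    The pre-loaded preference lookup described above can be sketched as follows. This is a minimal illustration, not a proposed format: the Preferences fields, the PreferenceServer and Workstation classes, and the identifier "badge-17" are all hypothetical. The point is only that any workstation obtains the same preference set from the server once the user's identifier device is detected, and that an unknown identifier falls back to defaults.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Preferences:
    """Hypothetical preference set; real sets would be far richer."""
    font_size_pt: int = 12
    high_contrast: bool = False
    speech_output: bool = False


DEFAULTS = Preferences()


class PreferenceServer:
    """Repository on the LAN server, keyed by the user's identifier device."""

    def __init__(self) -> None:
        self._store: Dict[str, Preferences] = {}

    def preload(self, user_id: str, prefs: Preferences) -> None:
        self._store[user_id] = prefs

    def lookup(self, user_id: str) -> Preferences:
        # Unknown identifiers get the conventional (default) interface.
        return self._store.get(user_id, DEFAULTS)


class Workstation:
    """Any client PC; it holds no per-user state of its own."""

    def __init__(self, name: str, server: PreferenceServer) -> None:
        self.name = name
        self.server = server
        self.active = DEFAULTS

    def on_identifier_detected(self, user_id: str) -> Preferences:
        # Called when an IR or RF identifier device comes into proximity.
        self.active = self.server.lookup(user_id)
        return self.active


server = PreferenceServer()
server.preload("badge-17", Preferences(font_size_pt=24, high_contrast=True))
monday_pc = Workstation("pc-03", server)
tuesday_pc = Workstation("pc-11", server)
```

    Because the workstation only asks the server, the same user receives the same interface on whichever computer is assigned that day.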

     

    Open PC Access

     

     



    World Wide Web

    With existing technology and the accessibility preferences, it is possible to make Web pages more accessible than they are now or are proposed to be. This does not obviate the need for more accessible interfaces on browsers [10], but instead provides for more accessible content. This might be accomplished through the use of XML Schemas. A Schema would provide the structure of the content of the page, with tags to be interpreted by the client according to preferences. In realization, activating a link (aka "clicking") would cause the HTTP request to be captured by a proxy, which would insert its own locator in place of that of the client, much as is done with a firewall. The server would respond with an XML Schema of the page to be accessed. The proxy would interpret this in terms of the preferences provided to it by the client. The interpretation would be returned to the server, which would then build style sheets according to the interpretation provided. In general, this would be difficult to do, because of the wide range of possibilities in XML Schemas and in interpretations. However, for certain areas that expect wide usage and therefore emphasize accessibility, such as public information or commercial sales, there should be conventions, and these could be represented as capabilities to be negotiated by the proxy and server along with preferences. For example, for a blind person, text might be interpreted by the proxy as requiring a reader. But this need not be a conventional screen reader. The server could provide an applet along with the text that reads the text through the client machine's speakers. The applet might be activated by a mouse moving over a "hot" screen area. This can be done now with Java and JavaScript. Other presentation objects might become common and reusable, like JavaBeans, so that conventional interpretations might be had.
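
    The proxy's interpretation step can be sketched as follows. This is an illustration only: the page structure is reduced to a list of (tag, content) pairs rather than a real XML Schema, and the tag names, preference labels, and rendering names in CONVENTIONS are hypothetical stand-ins for the conventions that proxy and server would have to agree upon.

```python
from typing import Dict, List, Tuple

# Hypothetical structural description of a page, as the server might return it.
PAGE_STRUCTURE: List[Tuple[str, str]] = [
    ("heading", "Store hours"),
    ("text", "Open 9 to 5."),
    ("image", "storefront.png"),
]

# Hypothetical conventions: for each preference, how each tag is to be rendered.
CONVENTIONS: Dict[str, Dict[str, str]] = {
    "blind": {"heading": "speak", "text": "speak", "image": "speak-description"},
    "low-vision": {"heading": "large-print", "text": "large-print", "image": "magnify"},
}

DEFAULT_RENDERING = "display"


def interpret(structure: List[Tuple[str, str]], preference: str) -> List[Tuple[str, str]]:
    """Return (tag, rendering) pairs for the server to build style sheets from."""
    rules = CONVENTIONS.get(preference, {})
    return [(tag, rules.get(tag, DEFAULT_RENDERING)) for tag, _content in structure]


interpretation = interpret(PAGE_STRUCTURE, "blind")
```

    Tags or preferences with no agreed convention fall back to ordinary display, which mirrors the observation that a fully general interpretation would be difficult.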

     

    World Wide Web

     

     



    Kiosk

    A public kiosk poses special problems for universal accessibility because it must be adaptable to any person who comes up to it. This means that it must have a wide variety of input and presentation media available for people who can get close to it and for those who cannot. In the latter situation a kiosk must be able to get input from a user's accessor device and to provide that device with presentation material as well. Consequently, negotiation of preferences and capabilities is critical, especially since the likelihood of repeat encounters with a given person is vanishingly small. The diagram below shows how the kiosk might be configured for universal accessibility. None of the components that the kiosk ordinarily uses for conveying information--"show" software, databases, communications to servers, etc.--are shown on the diagram.

    At the bottom of the diagram are two accessors, one connecting to the kiosk via an RF link and the other through an IR link. (This does not mean that both accessors must be present or that both take part in negotiation.) A person with no disability might also come up to the kiosk and use it in the conventional manner (i.e., default preferences); this is not shown. A proxy will be needed if the accessor is simple, and it would be part of the kiosk system. This would be analogous to the case where preferences or other identifying data were input via a smart card: something must process the card-reader output. On the other hand, an accessor might be powerful enough to be its own proxy, in which case the RF or IR link would be on the kiosk-side of the proxy. In this case, the accessor might indicate in the initial handshake that it does not need a proxy.

    The kiosk handles all negotiation, since it will, in most cases, be configuring its input and presentation elements for the session. It also maintains repositories for preferences and configurations. There might also be an agent which examines the repositories to learn of accessor, preference and configuration patterns, in order to make the negotiations and configuration more efficient.
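
    The initial handshake and the kiosk-side negotiation can be sketched as follows. This is a minimal illustration under stated assumptions: the Handshake fields, the modality names, and the rule of picking the accessor's first preferred output that the kiosk supports are all hypothetical, and a real negotiation would cover inputs and many more parameters.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass
class Handshake:
    """Hypothetical first message from an accessor to the kiosk."""
    accessor_id: str
    needs_proxy: bool             # False if the accessor acts as its own proxy
    preferred_outputs: List[str]  # most preferred presentation medium first


@dataclass
class Session:
    via_proxy: bool
    output: str


# Hypothetical presentation media that this kiosk can offer.
KIOSK_OUTPUTS: FrozenSet[str] = frozenset({"screen", "speech", "large-print"})


def negotiate(handshake: Handshake,
              kiosk_outputs: FrozenSet[str] = KIOSK_OUTPUTS,
              default: str = "screen") -> Session:
    """Pick the first preferred output the kiosk supports, else the default."""
    output = next(
        (medium for medium in handshake.preferred_outputs if medium in kiosk_outputs),
        default,
    )
    return Session(via_proxy=handshake.needs_proxy, output=output)


simple_accessor = Handshake("rf-accessor", True, ["braille", "speech"])
smart_accessor = Handshake("ir-accessor", False, ["braille"])
```

    The kiosk decides everything here, which matches the assumption that it is the kiosk that configures its own input and presentation elements for the session.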

     

     

    Public Kiosk

     

     



    Home automation/networking

    Below is a diagram showing how HomeAPI might be made "universally accessible". The Accessor Attachment Network could be a separate network, such as IR, or it could be an adaptation of the Home Network that supports the HomeAPI device connectivity. The Accessors would be personal devices that each different person might use, with personal interface elements (e.g., speech input) attached to them or built into them. Assuming that the Accessors are not "smart" enough on their own to handle the communication with the Control PC, there is a proxy for that purpose. The proxy is shown as a device external to the Control PC, but it could as easily be internal to it. Accessors and proxy might be supplied by an assistive technology vendor. In the Control PC the preference sets are in a repository. These sets might be communicated in through the proxy, in which case the accessibility protocol would be handled by the proxy, or the preferences could be entered through the PC's conventional inputs (this might be done with a special application). Because of the power of the PC, transformation would take place in the PC, and the results would be transmitted via the accessibility protocol's Operation Service to the Accessors via the proxy. Unless new devices, new people, or people with new preferences are added, there will be little need to negotiate. The Control PC could handle all these negotiations in any case, since it has access to all the device properties via HomeAPI.

    In this example, the accessibility protocol is minimal, consisting mainly of the data structures for the preference sets and the interfaces mapping the accessors' functions to the HomeAPI control protocols. Although accessibility is shown separately here, the open nature of the HomeAPI specification suggests that there is no reason that accessibility cannot be built directly into a HomeAPI implementation.
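
    The mapping from accessor functions to device control can be sketched as follows. None of this uses actual HomeAPI calls: FakeHomeController is a stand-in for the Control PC's device layer, and the modality names, events, device names, and properties in ACCESSOR_MAP are hypothetical. The sketch shows only the shape of the data structure that would tie an accessor input to a device command.

```python
from typing import Dict, Tuple

# (device, property, value): a hypothetical device command.
DeviceCommand = Tuple[str, str, object]


class FakeHomeController:
    """Stand-in for the Control PC's device layer; not the real HomeAPI."""

    def __init__(self) -> None:
        self.state: Dict[Tuple[str, str], object] = {}

    def set_property(self, device: str, prop: str, value: object) -> None:
        self.state[(device, prop)] = value


# Hypothetical mapping from (accessor modality, event) to a device command.
ACCESSOR_MAP: Dict[Tuple[str, str], DeviceCommand] = {
    ("speech", "lights on"): ("hall-lamp", "power", True),
    ("single-switch", "press"): ("hall-lamp", "power", False),
    ("speech", "warmer"): ("thermostat", "setpoint_c", 22),
}


def handle_accessor_input(controller: FakeHomeController,
                          modality: str, event: str) -> bool:
    """Translate an accessor event into a device command; False if unmapped."""
    command = ACCESSOR_MAP.get((modality, event))
    if command is None:
        return False
    controller.set_property(*command)
    return True


controller = FakeHomeController()
handled = handle_accessor_input(controller, "speech", "lights on")
```

    Because the mapping is plain data, it could be stored alongside the preference sets in the Control PC's repository.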

     

     

    HomeAPI Solution

     

     



    Mobile Computing Device

    Handheld and other mobile computing devices are expected to be at the center of ubiquitous computing as it evolves over the next few years. This has been accelerated lately by the advent of very small RF transmitters and receivers, small processors and storage, and PCS services. Much of the infrastructure to support universal accessibility protocols already exists or is being developed (see AirJava, for example) as part of the need for universal roaming with mobile devices. The diagram below shows how universal accessibility might be incorporated into geographic computing. The diagram is an adaptation of a diagram in a Salutation white paper [5] on geographic computing, by Robert Pascoe.

    In many ways, this usage is similar to that of the Kiosk (above). The Local Server is analogous to the Kiosk and the mobile device to the Accessor. In fact, we can expect that mobile devices might be used as Accessors for all kinds of systems linked by wireless communications. The main difference between the mobile computing example and the Kiosk is that the Local Server is not likely to have any presentation media of its own, since it will probably be some distance away from the handheld device and remain so for the session. Consequently, any interface adaptation will appear on the mobile device, but may be configured by the Local Server and downloaded to the mobile device. Also in contrast to the Kiosk, mobile device interaction and access policy data is typically shared among the servers in the system for security and efficiency reasons, so that over time a user's preferences will become widely distributed and there would be less of a need for negotiation.

     

     

    Mobile Computing Device

     

     



    Military Applications

    There are an enormous number of ways that the military might use the accessibility protocol, from tactical situations, as depicted below, to 21st Century Command Posts. The Army has been conducting research for some time on interface adaptation on the basis of the rank, specialty, mission, experience and situation of the user. Aircraft coordinating fighters and bombers in the Persian Gulf and Kosovo routinely employ preferences for users of the onboard workstations, according to mission responsibilities and personal criteria. The Navy has a practical need for preference-based interfaces in ship sensor software for navigation that it shares with the Coast Guard.

    The situation shown below involves a command/analysis center, tanks, helicopters and troops in the field, A-10 "Tankbuster" aircraft in the air, with coordination and missile warning provided by an AWACS aircraft, all linked via satellite (or other communication). Each of these entities will have a need for a different view and control of information than the others, because of the differences in activity, mission responsibilities, locales, and rates of movement. For example, in viewing a map, the soldier in a tank will need to know the terrain he is to traverse, so he may prefer a contour map. If he is an enlisted man, he will not need the tactical information on the map that his commander requires. Because of the noise in a tank and the near-constant use of the hands, the soldier will probably use speech input from a helmet-mounted microphone to bring up maps and other information. In contrast, a helicopter pilot will be more concerned with features on the ground by which she can navigate than with specific contours, because she will be moving over the ground at a faster speed than the tank moves, and because tanks can be impeded by contours and helicopters generally are not. But similar to the soldier in the tank, the chopper pilot's hands are likely to be busy and the noise level will be high.

    In the command center, maps of all sorts will be displayed along with a wide variety of other tactical (and possibly strategic) information. There will be a number of personnel in the center of various ranks, specialties, experiences and responsibilities. A general standing in a command theater might bring up a display simply by a gesture or spoken command, in his preferred manner. This would require visual and voice tracking to implement the multimodal coordination. A major seated at an analysis desk might use a touchscreen and speech input to control her analysis.

    The diagram below shows a simplified approach to preference-based digital map retrieval. In this scenario it is assumed that there will be no negotiation, since map retrieval preferences for a given individual in a given situation are not likely to vary. The preference sets for each individual can be pre-loaded into a database. However, the format of records and the messages needed to pre-load the preferences would be those of the accessibility protocol. The preference set for an individual can be selected before the mission is begun. Because an individual in combat may become incapacitated, it is useful for other persons to be able to bring up their own preferences to continue the mission. There is no need for discovery in general in this scenario, since the databases will be at fixed logical locations--simple lookup will do. In the diagram the grey boxes denote the accessibility preference services while the rose boxes denote the operational protocols.
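
    The negotiation-free, lookup-only retrieval described above can be sketched as follows. This is an illustration under stated assumptions: the record keys (person and situation), the map types, and the MapStation class are hypothetical, and real records would use the formats of the accessibility protocol. The sketch shows pre-loaded preference sets being selected by simple lookup, including the case where another person takes over a station and brings up their own preferences.

```python
from typing import Dict, Optional, Tuple

# Hypothetical pre-loaded preference records, keyed by (person, situation).
PRELOADED: Dict[Tuple[str, str], Dict[str, str]] = {
    ("tank-driver", "ground-advance"): {"map": "contour", "input": "speech"},
    ("tank-gunner", "ground-advance"): {"map": "contour", "input": "touch"},
    ("helo-pilot", "ground-advance"): {"map": "landmark", "input": "speech"},
}


def select_preferences(person: str, situation: str) -> Dict[str, str]:
    """Simple lookup at a fixed logical location; no discovery or negotiation."""
    return PRELOADED[(person, situation)]


class MapStation:
    """A map display that always reflects its current operator's preferences."""

    def __init__(self) -> None:
        self.operator: Optional[str] = None
        self.prefs: Dict[str, str] = {}

    def take_over(self, person: str, situation: str) -> Dict[str, str]:
        # Called before the mission, or when another person must continue it.
        self.operator = person
        self.prefs = select_preferences(person, situation)
        return self.prefs


station = MapStation()
station.take_over("tank-driver", "ground-advance")
# The driver is incapacitated; the gunner continues with their own preferences.
station.take_over("tank-gunner", "ground-advance")
```

    A missing record raises a KeyError in this sketch; a fielded system would presumably need a defined default instead.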

    It is worth noting that the military devotes a certain amount of its technological research to applications that potentially will accommodate users with disabilities, as a matter of policy.

     

    Military Applications

     

    Image Retrieval Architecture

    Tactical Image Retrieval Architecture

     
     
     



    E-Commerce

    The diagram below shows an arrangement of preferences in an E-Commerce setting. Three users are represented, but any number could participate in the activity. This setting could also represent a role-playing game, with the Public Preferences being realized as character attributes. In an E-Commerce realization, the Public Preferences are views of what each user has to offer in trading, price, etc. For example, the "red" user has products with prices that are preferred, but wants to present a different view (the shape and shade of the color blob in the Presentation Plane) of these to the "green" and "blue" users, perhaps based upon prior knowledge that "red" has about "blue" and "green". This knowledge, plus the context of the setting (there might be a different approach that "red" would take if "yellow" and "purple" were playing) adhere to the model of preferences outlined earlier in Section 4: preferences within a context and constraints. What is different is that the task, negotiation, is defined in terms of transactions--as Interaction Functions--as discussed in Section 4. This implies that the accessibility formats and protocol could be used at two levels in this system: for private preferences, in the "accessibility" sense; and for supporting the "higher" level E-Commerce activities. Thus, an architecture such as described by the NIIIP could be used to build the system and the accessibility protocols and formats added to provide the preference functionality.

    E-Commerce

     
     
     



    7. Some Related Standards, Technology and Research

    The descriptions below of technologies and research were for the most part derived by editing the descriptions provided on Web sites by the cognizant organization or vendor. There is no particular rationale for including or excluding any technology or product. The list is not the result of any exhaustive search and is intended only to represent a segment of representative, related products or consortium efforts, for the purpose of ascertaining the availability of approaches that could be used for adoption or adaptation in the development of the accessibility protocol.

    CORBA

    The Object Management Group (OMG) has defined the Common Object Request Broker Architecture (CORBA), which allows applications to communicate with one another no matter where they are located or who has designed them. The Object Request Broker (ORB) is middleware that establishes the client-server relationships between objects. Using an ORB, a client can transparently invoke a method on a server object, which can be on the same machine or across a network. The ORB is responsible for finding an object that can implement the request, passing it the parameters, invoking its method, and returning the results. In so doing, the ORB provides interoperability between applications on different machines in heterogeneous distributed environments and seamlessly interconnects multiple object systems. The Internet Inter-ORB Protocol (IIOP), like HTTP, uses the Internet as a backbone, and provides exchange mechanisms for CORBA-defined messages for interoperability.

    DCOM

    Microsoft's Distributed Component Object Model (DCOM) is a protocol that enables software components to communicate directly over a network in a reliable, secure, and efficient manner. DCOM is designed for use across multiple network transports, including Internet protocols such as HTTP. DCOM is based on the Open Software Foundation's DCE-RPC spec and will work with both Java applets and ActiveX components through its use of the Component Object Model (COM). DCOM is location independent, meaning that an application can place related components on machines that are "close" to each other, on a single machine, or even in the same process. Components can run on the machine where it makes most sense: user interface and validation on, or close to, the client; database-intensive business rules on the server, close to the database. DCOM is also language-independent.

    RMI

    Sun's Java-based Remote Method Invocation (RMI) enables the programmer to create distributed applications, in which the methods of remote Java objects can be invoked from other Java virtual machines, possibly on different hosts. A Java program can make a call on a remote object once it obtains a reference to the remote object, either by lookup in the bootstrap naming service provided by RMI or by receiving the reference as an argument or a return value. Java RMI allows developers to treat remote objects very much like normal Java objects. Sun has released several interfaces that work on top of RMI, including Transactions, Leases, and Distributed Events. Access to multi-language environments is accomplished through an IIOP-compliant (Internet Inter-ORB Protocol) subset of Java RMI which allows access to legacy systems.

    iBus

    Softwire's iBus is an intelligent, Java-based communications infrastructure, organizing multiple middleware approaches, Knowledge Management technologies, and communication protocols into a unified whole. iBus supports Asynchronous Push, through transmission by IP multicast, TCP, or a combination thereof, and Synchronous Pull, a request-and-reply style of communication, much like RMI or CORBA, which operates above both TCP and IP multicast. In contrast to RMI and CORBA, iBus allows a pull request to be issued on more than one receiver for fault-tolerance and load sharing. Another feature of iBus is a protocol composition framework that permits choosing the types of service needed for an application. The qualities of service (QoS) offered include reliable IP multicast using negative acknowledgments, unreliable IP multicast, TCP communication, data encryption, and failure notification, among others. The protocol framework can also be extended with other (not yet supported) qualities of service, permitting iBus to work over various communication media--IP Multicast, TCP, Infrared, RDS, GSM, Satellite, etc. iBus readily scales, permitting objects to join and leave a group dynamically.

    Jini

    Sun Microsystems' Jini technology provides simple mechanisms which enable devices to plug together to form an impromptu community without any planning, installation, or human intervention. Each device provides services that other devices in the community may use. Jini technology defines a set of protocols for discovery, join, and lookup, and also a leasing and transaction mechanism to provide resilience in a dynamic networked environment. The Jini connection infrastructure is small enough that a community of devices enabled by Jini connection software can be built out of the simplest devices. In general, Jini only provides connectivity among systems which run the Java Virtual Machine (JVM), but devices or systems which do not run the JVM can be connected and operated via proxies which do run the JVM. Jini uses Remote Method Invocation (RMI) for object communication.



    NIIIP

    The National Industrial Information Infrastructure Protocols (NIIIP) Consortium is a team of organizations that has entered into a cooperative development agreement with the U.S. Government to develop open industry software protocols that will make it possible for manufacturers and their suppliers to effectively interoperate as if they were part of the same enterprise, even though many of these interactions are unscheduled, occur between both sophisticated and relatively unsophisticated users who utilize a wide range of computer systems, operating environments, and business processes. The NIIIP will allow individuals, enterprises and organizations, or their subdivisions, to assemble themselves into Virtual Enterprises in order to provide products, services, or solutions without being constrained by the use of different data, processes, information technologies or computing environments.

    Concordia

    Concordia, a product of Mitsubishi Electric's Information Technology Center, provides a middleware infrastructure for the development and management of mobile agent applications. Another Mitsubishi Product, Multi-Enterprise Links-by-Agents (MELBA) integrates applications across multiple enterprises as well as within a single heterogeneous environment in a company. Application plug-ins function as the interface for each application avoiding any application-specific interface programming. Concordia does not require persistent network connections, allowing asynchronous and one-way connections over low-cost communications facilities or the public Internet. An Agent Server ensures robust operation and reliable transport of agents and events using a two-phase commit protocol. The mobile agents are Java-based and are platform independent to the extent that a Java Virtual Machine is available on the system at which an agent is targeted.

    Open Agent Architecture

    The Open Agent Architecture (OAA) is focused on building distributed communities of agents--software processes that register the services they can provide in an acceptable form, that speak the Interagent Communication Language (ICL), and that share common functionalities, such as the ability to install triggers, manage data in certain ways, etc. The ICL is a logic-based declarative language capable of representing natural language expressions. Agents can communicate using simultaneous, multiple (natural) input modalities--from gestures, speech, drawing, handwriting, or a standard graphical user interface. The agents compete and cooperate in parallel to translate the user's request into an ICL expression to be handled. These techniques, with a special class of agents which reason about the agent interactions necessary for handling a given complex ICL expression, allow human users to closely interact with the ever-changing community of distributed agents. The agents are also mobile, with lightweight user interfaces which can run on handheld PDAs, and most applications can be run through a telephone-only interface.

    Universal Plug and Play

    Universal Plug and Play is an open standard technology for transparently connecting appliances, PCs, and services by extending Plug and Play to support networks and peer-to-peer discovery and configuration. The Universal Plug and Play Forum is an industry group of companies promoting Universal Plug and Play networking protocols and device interoperability standards. Universal Plug and Play members will work with Microsoft to enable device-to-device interoperability by promoting Universal Plug and Play protocols and cooperatively developing and contributing XML schemas for device description, naming and HTML-based control.

    Salutation

    Salutation is a "plug and play" architecture that has found primary application in interconnecting office devices (FAX, phone, computers, etc.). It features a structure similar in some respects to the Universal Accessibility Architecture and Protocol described in this document. It may be possible to extend Salutation to cover the functionality of the Universal Accessibility Architecture and Protocol. In particular, for a broader scope of usage than is current with Salutation, it would require the addition of a Transaction Service, a more comprehensive set of protocols for task operation, and the ability to include agents and proxies. However, for the current scope of usage, Salutation might require only the inclusion of more comprehensive preference descriptions and selection (plus any interface transformations necessary to complete the intent of universal accessibility). Salutation-Lite [6] is an adaptation of Salutation for hand-held and palm computers, to enable "geographic" [7] computing.



    Bluetooth

    Bluetooth is a proposed Radio Frequency (RF) specification for short-range, point-to-multipoint voice and data transfer. Bluetooth can transmit through solid, non-metal objects. Its nominal link range is from 10 cm to 10 m, but can be extended to 100 m by increasing the transmit power. It is based on a low-cost, short-range radio link, and facilitates ad hoc connections for stationary and mobile communication environments.

    IrDA

    The Infrared Data Association (IrDA) specifies three infrared communication standards: IrDA-Data, IrDA-Control, and a new emerging standard called AIr. For the purpose of this document, IrDA refers to the IrDA-Data standard. In general, IrDA is used to provide wireless connectivity technologies for devices that would normally use cables for connectivity. IrDA is a point-to-point, narrow angle (30° cone), ad-hoc data transmission standard designed to operate over a distance of 0 to 1 meter and at speeds of 9600 bps to 16 Mbps.

    Piano

    The Motorola Piano Platform provides short range, wireless connectivity at high bandwidth to a variety of mobile devices. It creates spontaneous, ad hoc networks between a wide variety of common devices. When Piano-enabled devices come into physical proximity with one another, they automatically detect each other's presence, and exchange information, either automatically or under user control, to determine whether further communication between the devices is warranted. If further communication is necessary or desired, a "just-in-time" intranet is established between the devices so that either device can use the services of the other. Piano is complementary to both Jini and Bluetooth.

    T Spaces and MODAL

    IBM-Almaden's T Spaces is a network communication buffer with database capabilities. It enables communication between applications and devices in a network of heterogeneous computers and operating systems. T Spaces provides group communication services, database services, URL-based file transfer services, and event notification services. With a small footprint, T Spaces is capable of bringing network services to small/embedded systems. Written in Java, T Spaces client applications can be loaded dynamically into any network-attached computer. At present IBM has no plans to market a commercial product based upon this experimental technology. IBM-Almaden has also developed a software technology called Mobile Document Application Language (MODAL) that converts any handheld device into a universal remote control. MODAL uses T Spaces for its communications, and has the capability to emulate any device's interface. The technology is functionally similar to Sun's Jini technology, but Jini has the devices only discover one another, then frees them to talk to each other any way they want; MODAL and T Spaces provide application communications beyond discovery. MODAL works in multiple communications modes from local wireless devices using infrared and Bluetooth radio frequencies. It also works over wide areas using paging networks or cellular digital packet data.

    Trace R&D EZ Access and URCC

    The Trace R&D Center at the University of Wisconsin-Madison has developed EZ Access and URCC. EZ Access is a flexible but standard set of interface strategies for allowing people to access and use electronic devices even when they are operating under constrained conditions, whether because of a disability or because of environmental factors. EZ Access is a set of modes and features which can change the way electronic devices operate. The EZ Access package includes strategies for people with low vision, blindness, reduced hearing, deafness, physical disabilities, reading problems, inability to read, and more. EZ Access provides a standard way for people with disabilities to use all manner of electronic devices, such as microwave ovens, cellular phones, interactive multimedia kiosks, or coffee vending machines. The URCC is a communications protocol that can be used over any transmission medium. URCC allows individuals to use a single controller (e.g., a dedicated controller, an electronic pocket organizer, a laptop computer) with an appropriate comm port to control any URCC-compatible device (VCR, stereo, thermostat, kiosk, etc.). It also allows people with disabilities who cannot use the displays and controls on the standard devices to use a special assistive technology as a remote console, allowing them to access and use the standard devices.

    HomeRF

    The HomeRF Working Group (HRFWG) was formed to provide the foundation for a broad range of interoperable consumer devices by establishing an open industry specification, called the Shared Wireless Access Protocol (SWAP), for wireless digital communication between PCs and consumer electronic devices anywhere in and around the home. The system is intended to operate at 2.4GHz. SWAP provides support for delivery of both voice and data traffic and interoperates with both the Public Switched Telephone Network and the Internet. It employs TDMA for voice and other time-critical services and CSMA/CD (Ethernet) for high-speed packet data transfer.

    HomeAPI

    HomeAPI is a set of programming interfaces enabling software applications to discover and control home devices such as TVs, VCRs, lights, security systems, thermostats, etc. HomeAPI and the Home Audio/Video Interoperability (HAVi) specification complement one another and target different clients. HomeAPI addresses the full spectrum of home devices and targets Windows application clients. HomeAPI will be capable of interfacing with Jini devices, but HomeAPI is based on a more centralized control model in which a few general-purpose intelligent devices control a wide variety of devices across arbitrary home networks. Jini assumes neither centralized control nor a Windows environment. HomeAPI is capable of interfacing with a variety of lower-level network and device control protocols, such as CEBus and Home PnP.

    HomePNA

    HomePNA has chosen to standardize on a technology already available which permits linking devices at speeds up to 1 Mbps over existing home phonelines. It supports the complex, random-tree type of wiring typically found in the home and does not require any hubs or new Category 5 wiring. Such an arrangement requires no special terminations, filters or splitters, and uses only the single pair of existing phone wires to make its connection, and operates concurrently with any normal telephone service that might be using those same wires. The technology also coexists with the new splitterless Universal ADSL standard. It is fully compatible with the Ethernet MAC layer standard (IEEE 802.3 CSMA/CD with a new physical layer).



    JetSend and Chai

    Hewlett-Packard has a new product called JetSend that is intended to operate above connection technologies such as Jini and Universal Plug and Play. The premise of JetSend is a protocol that can be embedded into different devices--printer, PC, PalmPilot, scanner, cellphone--to permit them to communicate and exchange information directly. JetSend is independent of transport protocol and can be used over network technologies as diverse as infrared or Internet IP, or it could be layered over Jini. JetSend has a "negotiation" protocol which permits the exchange of data type and data structure information, through the transmission of Java objects called "surfaces". There is also a language which is used to describe the contents and structure of a surface. Each surface has an owner-device, but a copy of a surface (called an "impression") sent to another device permits the recipient to operate the first device without having to load a driver. HP has published the JetSend specification and developers are free to build their own implementations.

    ChaiAppliance Plug and Play is a member of HP's Chai family of embedded-software products. It supports device and application discovery using Web standards such as HTTP and XML, and it supports the Universal Plug and Play initiative. ChaiAppliance Plug and Play is written in the Java programming language and is targeted for use in a broad range of appliances. ChaiServer uses the World Wide Web's URLs (Uniform Resource Locators) to identify and directly access individual devices and their applications. This makes it possible to access a device and its services via any browser, or to call these services programmatically. All communication is done over HTTP (Hypertext Transfer Protocol). ChaiVM 2.0, a virtual machine that complies with the Java Virtual Machine specification, allows for Java-language-based logic mobility in embedded appliances, permitting downloadable applications that can execute on a family of appliances regardless of operating system or CPU differences.
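
    As a rough illustration of URL-based device addressing, the sketch below builds an HTTP URL that names a device and one of its services. The URL layout, host name, and parameter names are assumptions made for this example and are not taken from the ChaiServer documentation. Only the Python standard library is used, and no network request is made.

```python
from typing import Mapping, Optional
from urllib.parse import quote, urlencode, urlunsplit


def device_service_url(host: str, device: str, service: str,
                       params: Optional[Mapping[str, object]] = None) -> str:
    """Build an HTTP URL identifying one service on one device.

    The /<device>/<service>?<params> layout is hypothetical.
    """
    # Percent-encode each path segment so unusual names stay valid in a URL.
    path = "/" + "/".join(quote(part, safe="") for part in (device, service))
    query = urlencode(params or {})
    return urlunsplit(("http", host, path, query, ""))


# Hypothetical appliance and service names.
url = device_service_url("appliance.example", "thermostat", "setpoint",
                         {"celsius": 21})
print(url)
```

    A browser or a program could then issue an ordinary HTTP request to such a URL, which is the property the paragraph above describes.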

    HAVi

    HAVi ("Home Audio/Video interoperability") is a consortium whose activity pertains to interconnecting and controlling AV electronics appliances connected in the Audio/Video Home Network based on IEEE 1394. The core open home network specification defines middleware elements that permit interoperation of different brands of AV electronic appliances, and the roles and functions of these elements. Display devices offer their display to other applications and need not take part in the application themselves or be aware of the functionality behind the User Interface that is shown by an application. HAVi also specifies an open and standardized Java programming environment for applications and Device Control Modules on HAVi devices. HAVi adopted the IEEE 1394 bus as the underlying network technology for the HAVi protocols as well as for the transport of the real-time AV streams.

    CEBus

    The CEBus Industry Council (CIC) has developed the CEBus Standard around the Common Application Language (CAL). The CAL defined within EIA-600 provides only a framework for communication among home LAN products produced within divergent industry sectors (e.g., entertainment, computers, heating/cooling, kitchen appliances, and many more). EIA/CEMA published CAL as a separate EIA Standard, to be known as EIA-721, after it was adopted by various industry sectors and CIC had defined "grammatical" rules for using the language. CIC's Home Plug & Play Specification is a set of interoperability guidelines. The guidelines are transport-protocol independent. For CAL to be properly implemented, each industry sector must define the "application contexts" (i.e., grammatical rules) under which its products will use the language.

    UI4ALL

    This ERCIM Working Group is initiated against the background of recent European R&D activities which have analyzed the requirements, identified the viability, and demonstrated the feasibility of constructing 'user interfaces for all', i.e. interfaces which address the individual user requirements of potentially all users. It has been successfully argued that alternative, technologically more powerful and methodologically more systematic approaches are needed to tackle the problems of accessibility and quality of interaction for all potential users and in a holistic way.


    References

    1. Stephanidis, C., Editorial at First Workshop "User Interfaces for All", European Research Consortium for Informatics and Mathematics, Heraklion, Crete, Greece, 1995.
    2. Project Archimedes [http://www-csli.stanford.edu/arch/projects97.html].
    3. Stephanidis, C., et al., "Toward an Information Society for All", International Journal of Human-Computer Interaction, Vol. 11 (1), 1999, pp. 1-28.
    4. Salutation Architecture Specification (Part 2) Version 2.0, The Salutation Consortium, December, 1996.
    5. NIIIP Reference Architecture, National Industrial Information Infrastructure Protocols Consortium, December 1998.
    6. Jini Architecture Specification, Revision 1.0, Sun Microsystems, January 1999.
    7. Pascoe, R., "Salutation-Lite: Find-and-Bind(tm) Technology for Mobile Devices", Salutation White Paper, The Salutation Consortium, June 6, 1999.
    8. Pascoe, R., "Geographic Computing: Enabling New Markets for Hand-held and Palm-size Information Appliances", Salutation White Paper, The Salutation Consortium, December 16, 1998.
    9. HP JetSend(tm) Communications Technology: Protocol Specification, Document Version 1.5, Hewlett-Packard Company, May 1999.
    10. Stephanidis, C., "Designing User Interfaces for All", CSUN '98, March, 1998.