in050242

----------

From: Charles Rich [mailto:rich@merl.com]
Sent: Monday, March 28, 2005 1:40 PM
To: Bennett, Barbara
Cc: psa@ansi.org; Zimmermann, Gottfried; gv@trace.wisc.edu; rich@merl.com
Subject: Public comments on INCITS 389-393


1. INTRODUCTION:

These are comments on the public draft of INCITS 389-393 (Protocol to Facilitate Operation of Information and Electronic Products through Remote and Alternative Interfaces and Intelligent Agents) from:

      Charles Rich, Ph.D.
      Distinguished Research Scientist
      Mitsubishi Electric Research Laboratories, Inc.
      201 Broadway
      Cambridge, MA 02139
      USA

      Phone: 617-621-7507
      Email: rich@merl.com

Please note that I have applied for membership on the V2 committee and have attended one plenary and several v2p and v2i meetings, but since I will not be a voting member until after the next plenary meeting, I am submitting these as "public" comments, as a means of getting them into the official record at this time.

All of my comments below are the result of reviewing V2 from the perspective of supporting intelligent agents, including natural language interfaces, areas in which I have extensive research experience (see http://www.merl.com/people/rich).  The suggestions in Section 2 below are minor extensions to the UI socket description (INCITS 390).  In Section 3, however, I propose a major extension to the overall V2 specification: the introduction of a new optional task model component.

It is also worth mentioning, as a general point, that a possible response to any proposed extension to the V2 specification is to say that "you can put that information in the resources (INCITS 393), if you want to".  Of course, that is technically true, since the resources provide an open-ended "trap-door" for any kind of application-specific information.  However, if the representation of a certain kind of information (such as command parameters) is not standardized, then it is impossible for the developer of a powerful, generic third-party URC to take advantage of it, whereas facilitating such development is a major goal of V2.

Finally, I understand that many or all of these suggestions may be best acted upon in the upcoming ISO process, rather than in the current INCITS draft.

2. IMPROVEMENTS TO USER INTERFACE SOCKET DESCRIPTION (INCITS 390)

2A. COMMAND PARAMETERS

The specification of a command type should (but does not currently) include an explicit list of the (typed) "parameters" or "inputs", if any, to each command instance.

For example, consider a "setDesiredTemp" command, which sets the target temperature of a thermostat to a desired temperature.  How is the desired temperature specified?  As I understand the current V2 spec, this would be coded by defining a variable in the socket description, e.g., "desiredTemp", and having the setDesiredTemp command (implicitly) use this variable. 

Unfortunately, this means that the connection between this variable (desiredTemp) and this command (setDesiredTemp) is not made explicit for use by an intelligent agent in a general way.  An intelligent agent needs to know what the parameters of each command are, for example, so it can use this information in a general strategy for querying about parameters with unknown values.  For example, suppose the user says "change the temperature".  A generic agent using only information in the V2 spec (with command parameters) can then respond "what is the desired temperature?".  Otherwise, an application-specific thermostat agent is required.
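
To illustrate the generic query strategy, here is a minimal sketch in Python (chosen only for illustration).  The Command and Parameter classes and their fields are hypothetical and are not part of the V2 specification; the sketch assumes only that each command lists its parameters explicitly.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Parameter:
    # Hypothetical fields: the socket variable name and a spoken label.
    name: str
    label: str


@dataclass
class Command:
    name: str
    parameters: List[Parameter] = field(default_factory=list)


def next_question(command: Command, known: Dict[str, object]) -> Optional[str]:
    """Return a question about the first parameter whose value is unknown."""
    for p in command.parameters:
        if p.name not in known:
            return f"What is the {p.label}?"
    return None  # every parameter is known, so the command can be issued


set_desired_temp = Command(
    "setDesiredTemp",
    [Parameter("desiredTemp", "desired temperature")],
)

print(next_question(set_desired_temp, {}))
print(next_question(set_desired_temp, {"desiredTemp": 68}))
```

Nothing in next_question is specific to thermostats; it would work unchanged for any command whose parameters are declared.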

Note that the "execute" command dependency already in the spec does not provide the needed representation.  This dependency is about whether the command may be executed, not about what its inputs are.

Finally, note that the UPnP command specification _does_ include explicit parameters for commands.  V2 should have them, too.

2B. COMMAND RESULTS

Parameters are values that need to be known _before_ a command can be executed.  "Results" or "outputs" are values that are not (often, cannot be) known until _after_ the command is executed.  For example, a new object created during the execution of a command would be a result.  Also, commands that obtain (e.g., sense) data from the surrounding environment would naturally represent the new data as a result.  As another example, a command that retrieves a web page would represent the web page as its result.

My experience with modeling for intelligent agents is that having both the parameters and results of actions explicitly represented is a good idea.
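
As a rough sketch of what an explicit signature with both parameters and results might look like (Python; CommandSignature, execute, and the readTemperature command are hypothetical names used only for illustration, not V2 syntax):

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class CommandSignature:
    name: str
    parameters: Tuple[str, ...] = ()  # values needed before execution
    results: Tuple[str, ...] = ()     # values available only after execution


def execute(sig: CommandSignature,
            impl: Callable[..., Dict[str, Any]],
            inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Check inputs against the signature, run impl, return declared results."""
    missing = [p for p in sig.parameters if p not in inputs]
    if missing:
        raise ValueError(f"missing parameters: {missing}")
    outputs = impl(**{p: inputs[p] for p in sig.parameters})
    return {r: outputs[r] for r in sig.results}


# Hypothetical sensing command: no parameters, one result.
read_temp = CommandSignature("readTemperature", results=("currentTemp",))


def fake_sensor() -> Dict[str, Any]:
    return {"currentTemp": 21.5}  # stand-in for a real device reading


print(execute(read_temp, fake_sensor, {}))
```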

2C. COMMAND POSTCONDITIONS

Although V2 currently has the ability to express the "preconditions" of a command (i.e., using the execute dependency), there is no specification for the "postconditions", i.e., a predicate which defines the necessary and sufficient conditions for successful completion of the command.

To give the simplest possible example, the postcondition of the turnOnLights command would be that the lights are on.  The current V2 spec correctly discourages the definition of commands, such as turnOnLights, which simply set one variable.  Many commands, however, have multiple effects.  For example, the effect of the thermometer reset command is to set both the min and max temperatures to the current temperature.

Explicitly representing the postconditions of commands facilitates an intelligent URC agent in a number of ways.

First, the postconditions of a command can be used as an index to answer "how to" questions from the user, effectively expanding the vocabulary to include not just the words associated with the command, but also the state variables it affects.  For example, if the user asks "How do I change the min temperature?", the agent can reply "Use the reset command".
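
A minimal sketch of such an index, in Python, assuming (hypothetically) that each command's postcondition can be summarized as the list of state variables it changes; the EFFECTS table and the variable names are illustrative only:

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical summary: command name -> state variables its postcondition mentions.
EFFECTS: Dict[str, List[str]] = {
    "reset": ["minTemp", "maxTemp"],
    "setDesiredTemp": ["desiredTemp"],
}


def build_howto_index(effects: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert the effects table: state variable -> commands that change it."""
    index = defaultdict(list)
    for command, variables in effects.items():
        for variable in variables:
            index[variable].append(command)
    return dict(index)


def how_do_i_change(variable: str, index: Dict[str, List[str]]) -> str:
    commands = index.get(variable)
    if not commands:
        return f"I do not know how to change {variable}."
    return "Use the " + " or ".join(commands) + " command."


index = build_howto_index(EFFECTS)
print(how_do_i_change("minTemp", index))
```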

Second, postconditions allow an intelligent agent to understand when a command is redundant.  (It is a separate modeling decision whether to allow a particular redundant command to be executed or not.)  For example, if the lights are on and the user says "turn on the lights", it is more intelligent for the agent to answer "The lights are already on" than to just redundantly turn on the lights or to reply "You cannot turn on the lights now".
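
The redundancy check can be sketched as follows (Python; the Command class, the lightsOn state variable, and the reply strings are assumptions made for illustration).  The postcondition is modeled as a predicate over the current socket state:

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Command:
    name: str
    postcondition: Callable[[Dict[str, object]], bool]
    already_done_message: str  # hypothetical field holding the spoken reply


def handle_request(command: Command, state: Dict[str, object]) -> str:
    """Reply that the command is redundant if its postcondition already holds."""
    if command.postcondition(state):
        return command.already_done_message
    return f"Executing {command.name}."


turn_on_lights = Command(
    "turnOnLights",
    lambda s: s.get("lightsOn") is True,
    "The lights are already on.",
)

print(handle_request(turn_on_lights, {"lightsOn": True}))
print(handle_request(turn_on_lights, {"lightsOn": False}))
```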

Third, a more advanced use of postconditions is to "backward chain", a standard technique in AI planning which matches preconditions with postconditions in order to decide what to do next.  For example, suppose that some command C has a precondition p, which is not true when the user says "please do C".  Backward chaining looks through available commands for a command D, which has p as one of its postconditions.  Then, rather than saying "You cannot do C (because p is not true)", the agent can be more helpful by asking "Do you want me to do D?".  Note that the backward chaining process can continue recursively if D has an unsatisfied precondition.
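
A simplified sketch of backward chaining, in Python.  It assumes that preconditions and postconditions are sets of atomic propositions and that commands never undo each other's effects; both are simplifications, and the Command class and the command E are hypothetical additions for illustration:

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Command:
    name: str
    pre: FrozenSet[str] = frozenset()   # propositions required beforehand
    post: FrozenSet[str] = frozenset()  # propositions true afterwards


def backward_chain(target: Command,
                   state: Iterable[str],
                   commands: List[Command],
                   visiting: FrozenSet[str] = frozenset()) -> Optional[List[Command]]:
    """Return commands to run in order, ending with target, or None if stuck."""
    if target.name in visiting:  # guard against circular dependencies
        return None
    achieved = set(state)
    plan: List[Command] = []
    for p in sorted(target.pre):
        if p in achieved:
            continue
        for d in commands:
            if p in d.post:
                sub = backward_chain(d, achieved, commands,
                                     visiting | {target.name})
                if sub is not None:
                    plan.extend(sub)
                    for c in sub:
                        achieved |= c.post
                    break
        else:
            return None  # no available command achieves p
    plan.append(target)
    return plan


C = Command("C", pre=frozenset({"p"}))
D = Command("D", pre=frozenset({"q"}), post=frozenset({"p"}))
E = Command("E", post=frozenset({"q"}))

plan = backward_chain(C, set(), [C, D, E])
print([c.name for c in plan])
```

Here D has its own unsatisfied precondition q, so the search recurses once more and finds E.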

Perhaps postconditions could be added to V2 as an optional command dependency called "effect".

2D. COMMAND SUCCESS/FAILURE

The current V2 spec has no general way for an intelligent agent to tell if a command it issued succeeded or failed.  There is a framework for specifying "notifications", which can include "exceptions", but not all exceptions are failures.

To motivate this issue, consider a command that doesn't always work (for whatever reason).  A helpful agent should know if the last command failed, so it can say something like "That didn't work, do you want to try again?".

One can obviously program this behavior for a specific command with pre-arranged notification types, but as mentioned in the introduction, our goal is a general agent that can work with UI socket descriptions it has never seen before.
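
One possible shape for such a general mechanism is sketched below in Python.  The Status and Outcome types are hypothetical; the sketch assumes only that every command reports a uniform success or failure status, independent of any command-specific notification types:

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass
class Outcome:
    command: str
    status: Status


def agent_reply(outcome: Outcome) -> str:
    """Generic reply that needs no knowledge of the specific command."""
    if outcome.status is Status.FAILED:
        return "That didn't work, do you want to try again?"
    return "Done."


print(agent_reply(Outcome("setDesiredTemp", Status.FAILED)))
```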

3.  ADDING A TASK MODEL DESCRIPTION (INCITS 389)

A key idea in the fields of both model-based UI design (cf., http://www.pebbles.hcii.cmu.edu/puc) and intelligent dialogue agents (cf., http://www.merl.com/projects/collagen), which is missing from V2, is the "task model".  I would like to propose adding a task model description as a new (optional) top-level component of the V2 framework, on the same level as the UI Socket Description and Presentation Template.

Basically, a task model is an abstract, hierarchical description of how to decompose typical high-level goals in some domain into primitive actions.  For example, in the domain of home audio-visual systems, a high-level goal might be to copy a videotape to a DVD.  A typical task model might decompose this goal into the following steps:

  (1) Insert source videotape in the VCR
  (2) Insert a blank DVD in the DVD burner
  (3) Configure the DVD player appropriately
  (4) Turn on the VCR

Without going into too much detail here (see citations above), I would like to point out a few important features of task models, illustrated with this example.

First, the steps of a goal decomposition (sometimes called a "recipe"), may be only partially ordered.  For example, steps (1) and (2) above can happen in any order, but both must occur before (3), which must in turn occur before (4).  Specification of partial order constraints is part of a task model description.
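
The partial order in this example can be written as a set of "before" pairs.  The following Python sketch (the representation is an assumption, not proposed V2 syntax) checks a step sequence against those pairs and enumerates the orderings they allow:

```python
from itertools import permutations
from typing import Iterable, List, Sequence, Tuple

STEPS = [1, 2, 3, 4]
# (a, b) means step a must occur before step b.
BEFORE: List[Tuple[int, int]] = [(1, 3), (2, 3), (3, 4)]


def respects_order(sequence: Sequence[int],
                   before: Iterable[Tuple[int, int]]) -> bool:
    """True if every 'before' constraint holds in the given sequence."""
    position = {step: i for i, step in enumerate(sequence)}
    return all(position[a] < position[b] for a, b in before)


valid_orders = [list(p) for p in permutations(STEPS) if respects_order(p, BEFORE)]
print(valid_orders)
```

Only two of the 24 possible orderings satisfy the constraints, differing in whether step (1) or step (2) comes first.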

Second, a step may be either a primitive action, i.e., an action that can be directly executed, such as (1), (2), and (4), or a goal, such as (3), which must be further decomposed by the task model (thus leading to its hierarchical structure).

Third, a goal may have alternative decompositions.  For example, the details of goal (3) above will have many alternatives, depending on the user's preferences, e.g., for quality versus recording size, etc.

Finally, notice that copying a videotape to DVD, like many high-level goals that users especially need help with, involves more than one target component. 

In terms of the syntax for specifying a task model in V2, the idea in a nutshell is to specify a tree which, like the Presentation Template, has UI socket commands as leaves.  However, in the task model tree, the intermediate nodes, rather than being aggregations for the purpose of presentation, are "goal" aggregations, i.e., the children of a goal node are the steps to achieve that goal. 

Furthermore, to represent alternative decompositions, we introduce a second type of node, each of whose children is an alternative.  The resulting structure is what is called an "and/or tree", commonly used in many AI and engineering applications: "and" nodes decompose goals into steps, "or" nodes specify alternative decompositions.  Finally, this tree structure can easily be decorated with applicability conditions (for alternatives), ordering constraints (for steps), and other logical constraints, using syntactic mechanisms already used in other places in the V2 specification.
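
As a rough illustration of the and/or tree, the Python sketch below enumerates the primitive command sequences that a small tree allows.  The Leaf, And, and Or classes and all command names are hypothetical, and applicability conditions and ordering constraints are omitted (steps are simply taken in listed order):

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Union


@dataclass
class Leaf:
    command: str  # a UI socket command


@dataclass
class And:
    goal: str
    steps: List["Node"]  # all steps are needed to achieve the goal


@dataclass
class Or:
    goal: str
    alternatives: List["Node"]  # any one alternative achieves the goal


Node = Union[Leaf, And, Or]


def decompositions(node: Node) -> List[List[str]]:
    """List every sequence of primitive commands that achieves the node."""
    if isinstance(node, Leaf):
        return [[node.command]]
    if isinstance(node, Or):
        return [d for alt in node.alternatives for d in decompositions(alt)]
    combos = product(*(decompositions(step) for step in node.steps))
    return [[cmd for part in combo for cmd in part] for combo in combos]


copy_tape = And("copy videotape to DVD", [
    Leaf("insertTape"),
    Leaf("insertBlankDVD"),
    Or("configure recording", [Leaf("setHighQuality"), Leaf("setLongPlay")]),
    Leaf("startVCR"),
])

for sequence in decompositions(copy_tape):
    print(sequence)
```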

This is obviously just a very rough sketch of how to incorporate a task model into V2.  Many more details need to be discussed and worked out.

-EOF

------------

From: Zimmermann, Gottfried
Sent: Friday, April 01, 2005 9:22 AM
To: Bennett, Barbara; LaPlant, William (Bill)
Subject: RE: Comment received on the public review of INCITS 389

Barbara,
 
thanks for archiving the message of Dr. Winters.
 
However, I did receive another comment, from Dr. Rich (see attachment).  You should have got it also.
 
Thanks,
Gottfried

___________

From: Bennett, Barbara [mailto:bbennett@itic.org]
Sent: Thursday, March 31, 2005 8:01 PM
To: Zimmermann, Gottfried; LaPlant, William (Bill)
Cc: jack.winters@marquette.edu
Subject: FW: Comment received on the public review of INCITS 389
Importance: High

Dear Gottfried, Bill,
 
The link for Dr. Winters' comments is as follows:
 
http://www.incits.org/archive/2005/in050225/in050225.htm
 
and NOT
 
http://www.incits.org/archive/2005/in050025/in050025.htm.
 
Sorry about the zeroes on the brain.
 
Best regards -

Barbara Bennett
Associate Manager, Standards Operations
INCITS/Information Technology Industry Council
1250 Eye Street, NW - Suite 200
Washington, DC 20005
202.626.5743
e-mail:   bbennett@itic.org
website: www.incits.org