From: Charles Rich [mailto:rich@merl.com]
Sent: Monday, March 28, 2005 1:40 PM
To: Bennett, Barbara
Cc: psa@ansi.org; Zimmermann, Gottfried; gv@trace.wisc.edu;
rich@merl.com
Subject: Public comments on INCITS 389-393
1. INTRODUCTION:
These are comments on the public draft of INCITS 389-393 (Protocol to
Facilitate Operation of Information and Electronic Products through
Remote and Alternative Interfaces and Intelligent Agents) from:
Charles Rich, Ph.D.
Distinguished Research Scientist
Mitsubishi Electric Research
Laboratories, Inc.
201 Broadway
Cambridge, MA 02139
USA
Phone: 617-621-7507
Email: rich@merl.com
Please note that I have applied for membership on the V2 committee and
have attended one plenary and several v2p and v2i meetings, but since I
will not be a voting member until after the next plenary meeting, I am
submitting these as "public" comments, as a means of getting them into
the official record at this time.
All of my comments below are the result of reviewing V2 from the
perspective of supporting intelligent agents, including natural
language interfaces, areas in which I have extensive research
experience (see http://www.merl.com/people/rich). The suggestions
in Section 2 below are minor extensions to the UI socket description
(INCITS 390). In Section 3, however, I propose a major extension
to the overall V2 specification: introducing a new optional task
model component.
It is also worth mentioning, as a general point, that a possible
response to any proposed extension of the V2 specification is to say that "you
can put that information in the resources (INCITS 393), if you want
to". Of course, that is technically true, since the resources
provide an open-ended "trap-door" for any kind of application-specific
information. However, if the representation of a certain kind of
information (such as command parameters) is not standardized, then it
is impossible for the developer of a powerful, generic third-party URC
to take advantage of it, whereas facilitating such development is a
major goal of V2.
Finally, I understand that many or all of these suggestions may be best
acted upon in the upcoming ISO process, rather than in the current
INCITS draft.
2. IMPROVEMENTS TO USER INTERFACE SOCKET DESCRIPTION (INCITS 390)
2A. COMMAND PARAMETERS
The specification of a command type should (but does not currently)
include an explicit list of the (typed) "parameters" or "inputs", if
any, to each command instance.
For example, consider a "setDesiredTemp" command, which sets the target
temperature of a thermostat to a desired temperature. How is the
desired temperature specified? As I understand the current V2
spec, this would be coded by defining a variable in the socket
description, e.g., "desiredTemp", and having the setDesiredTemp command
(implicitly) use this variable.
Unfortunately, this means that the connection between this variable
(desiredTemp) and this command (setDesiredTemp) is not made explicit
for use by an intelligent agent in a general way. An intelligent
agent needs to know what the parameters of each command are, for
example, so it can use this information in a general strategy for
querying the user about parameters with unknown values. For example,
suppose the user says "change the temperature". A generic agent
using only information in the V2 spec (with command parameters) can
then respond "what is the desired temperature?". Otherwise, an
application-specific thermostat agent is required.
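The point above can be illustrated with a minimal sketch in Python. The names Parameter, Command, and missing_parameter_prompts are hypothetical and not part of V2; the sketch only assumes that each command carries an explicit list of typed parameters with a human-readable label.

```python
from dataclasses import dataclass, field


@dataclass
class Parameter:
    name: str   # socket variable name, e.g. "desiredTemp"
    type: str   # declared type, e.g. "temperature"
    label: str  # human-readable phrase (assumed to come from a resource)


@dataclass
class Command:
    name: str
    parameters: list[Parameter] = field(default_factory=list)


def missing_parameter_prompts(command, known_values):
    """Return one question per declared parameter that has no value yet."""
    return [
        f"What is the {p.label}?"
        for p in command.parameters
        if p.name not in known_values
    ]


set_desired_temp = Command(
    "setDesiredTemp",
    [Parameter("desiredTemp", "temperature", "desired temperature")],
)

# The user said "change the temperature" without supplying a value.
print(missing_parameter_prompts(set_desired_temp, {}))
```

Nothing in missing_parameter_prompts is specific to thermostats, which is the property a generic agent needs.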
Note that the "execute" command dependency already in the spec does not
provide the needed representation. This dependency is about
whether the command may be executed, not about what its inputs are.
Finally, note that the UPnP command specification _does_ include
explicit parameters for commands. V2 should have them, too.
2B. COMMAND RESULTS
Parameters are values that need to be known _before_ a command can be
executed. "Results" or "outputs" are values that are not (often,
cannot be) known until _after_ the command is executed. For
example, a new object created during the execution of a command would
be a result. Also, commands that obtain (e.g., sense) data from
the surrounding environment would naturally represent the new data as a
result. As another example, a command that retrieves a web page
would represent the web page as its result.
My experience with modeling for intelligent agents is that having both
the parameters and results of actions explicitly represented is a good
idea.
2C. COMMAND POSTCONDITIONS
Although V2 currently has the ability to express the "preconditions"
of a command (i.e., using the execute dependency), there is no
specification for the "postconditions", i.e., a predicate which defines
the necessary and sufficient conditions for successful completion of
the command.
To give the simplest possible example, the postcondition of the
turnOnLights command would be that the lights are on. The current
V2 spec correctly discourages the definition of commands, such as
turnOnLights, which simply set one variable. Many commands,
however, have multiple effects. For example, the effect of the
thermometer reset command is to set both the min and max temperatures
to the current temperature.
Explicitly representing the postconditions of commands facilitates an
intelligent URC agent in a number of ways.
First, the postconditions of a command can be used as an index to
answer "how to" questions from the user, effectively expanding the
vocabulary to include not just the words associated with the command,
but also the state variables it affects. For example, if the user
asks "How do I change the min temperature?", the agent can reply "Use
the reset command".
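As a rough sketch of such an index, assuming postconditions are reduced to the list of state variables each command changes (the EFFECTS table and function names below are hypothetical, not V2 syntax):

```python
from collections import defaultdict

# Hypothetical postcondition data: command name -> state variables it sets.
EFFECTS = {
    "reset": ["minTemp", "maxTemp"],
    "setDesiredTemp": ["desiredTemp"],
}


def build_how_to_index(effects):
    """Invert the effects table: state variable -> commands that change it."""
    index = defaultdict(list)
    for command, variables in effects.items():
        for variable in variables:
            index[variable].append(command)
    return index


def answer_how_to(index, variable):
    """Answer "How do I change <variable>?" from the index alone."""
    commands = index.get(variable)
    if not commands:
        return f"I do not know a command that changes {variable}."
    return f"Use the {commands[0]} command."


index = build_how_to_index(EFFECTS)
print(answer_how_to(index, "minTemp"))
```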
Second, postconditions allow an intelligent agent to understand when a
command is redundant. (It is a separate modeling decision whether
to allow a particular redundant command to be executed or not).
For example, if the lights are on and the user says "turn on the
lights", it is more intelligent for the agent to answer "The lights are
already on" than to just redundantly turn on the lights or to reply
"You cannot turn on the lights now".
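The redundancy check can be sketched in a few lines, assuming a postcondition is simplified to a mapping from state variables to the values the command establishes (the variable name lightsOn and the function name are illustrative only):

```python
def classify_request(postcondition, state):
    """Return "redundant" if every postcondition already holds, else "execute"."""
    if all(state.get(var) == value for var, value in postcondition.items()):
        return "redundant"
    return "execute"


# Hypothetical postcondition of the turnOnLights command.
turn_on_lights_effect = {"lightsOn": True}

print(classify_request(turn_on_lights_effect, {"lightsOn": True}))
print(classify_request(turn_on_lights_effect, {"lightsOn": False}))
```

How the agent words its reply in the "redundant" case, and whether it executes the command anyway, remains the separate modeling decision mentioned above.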
Third, a more advanced use of postconditions is to "backward chain", a
standard technique in AI planning which matches preconditions with
postconditions in order to decide what to do next. For example,
suppose that some command C has a precondition p, which is not
true when the user says "please do C". Backward chaining looks
through available commands for a command D, which has p as one of its
postconditions. Then, rather than saying "You cannot do C
(because p is not true)", the agent can be more helpful by asking "Do
you want me to do D?". Note that the backward chaining process
can continue recursively if D has an unsatisfied precondition.
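A simplified sketch of this backward chaining follows. It assumes preconditions and postconditions are plain sets of proposition names, and it ignores the possibility that one command undoes the effect of another. The extra command E and proposition q are hypothetical, added only to show the recursive case.

```python
# Hypothetical command table: preconditions and postconditions as sets.
COMMANDS = {
    "C": {"pre": {"p"}, "post": {"c_done"}},
    "D": {"pre": {"q"}, "post": {"p"}},
    "E": {"pre": set(), "post": {"q"}},
}


def plan(goal_command, facts, commands, _visiting=None):
    """Return a command sequence ending in goal_command, or None if stuck."""
    visiting = (_visiting or set()) | {goal_command}
    facts = set(facts)
    steps = []
    for condition in sorted(commands[goal_command]["pre"]):
        if condition in facts:
            continue
        # Look for a command whose postconditions include the missing condition.
        for name, spec in commands.items():
            if condition in spec["post"] and name not in visiting:
                sub_plan = plan(name, facts, commands, visiting)
                if sub_plan is not None:
                    steps.extend(sub_plan)
                    for step in sub_plan:
                        facts |= commands[step]["post"]
                    break
        else:
            return None  # no command can establish this precondition
    return steps + [goal_command]


print(plan("C", set(), COMMANDS))
```

With the returned sequence in hand, the agent can offer "Do you want me to do D?" (or, here, E and then D) rather than simply refusing C.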
Perhaps postconditions could be added to V2 as an optional command
dependency called "effect".
2D. COMMAND SUCCESS/FAILURE
The current V2 spec has no general way for an intelligent agent to tell
if a command it issued succeeded or failed. There is a framework
for specifying "notifications", which can include "exceptions", but not
all exceptions are failures.
To motivate this issue, consider a command that doesn't always work
(for whatever reason). A helpful agent should know if the last
command failed, so it can say something like "That didn't work,
do you want to try again?".
One can obviously program this behavior for a specific command with
pre-arranged notification types, but as mentioned in the introduction,
our goal is a general agent that can work with UI socket descriptions
it has never seen before.
3. ADDING A TASK MODEL DESCRIPTION (INCITS 389)
A key idea in the fields of both model-based UI design (cf.,
http://www.pebbles.hcii.cmu.edu/puc) and intelligent dialogue agents
(cf., http://www.merl.com/projects/collagen), which is missing from V2,
is the "task model". I would like to propose adding a task model
description as a new (optional) toplevel component of the V2 framework,
on the same level as the UI Socket Description and Presentation
Template.
Basically, a task model is an abstract, hierarchical description of how
to decompose typical high-level goals in some domain into primitive
actions. For example, in the domain of home audio-visual systems,
a high-level goal might be to copy a videotape to a DVD. A
typical task model might decompose this goal into the following steps:
(1) Insert source videotape in the VCR
(2) Insert a blank DVD in the DVD burner
(3) Configure the DVD player appropriately
(4) Turn on the VCR
Without going into too much detail here (see citations above), I would
like to point out a few important features of task models, illustrated
with this example.
First, the steps of a goal decomposition (sometimes called a "recipe"),
may be only partially ordered. For example, steps (1) and
(2) above can happen in any order, but both must occur before (3),
which must in turn occur before (4). Specification of partial
order constraints is part of a task model description.
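The ordering constraints of this example can be written down as a small sketch, assuming each step simply lists the steps that must precede it (the step identifiers are illustrative names for steps (1) through (4)):

```python
from graphlib import TopologicalSorter

# step -> set of steps that must be completed before it
PREDECESSORS = {
    "insert_tape": set(),                            # step (1)
    "insert_dvd": set(),                             # step (2)
    "configure_dvd": {"insert_tape", "insert_dvd"},  # step (3)
    "turn_on_vcr": {"configure_dvd"},                # step (4)
}


def respects_order(sequence, predecessors):
    """Check that every step appears only after all of its predecessors."""
    seen = set()
    for step in sequence:
        if not predecessors[step] <= seen:
            return False
        seen.add(step)
    return True


# One valid linearization of the partial order (standard library, Python 3.9+).
one_order = list(TopologicalSorter(PREDECESSORS).static_order())
print(one_order, respects_order(one_order, PREDECESSORS))
```

Both orderings of steps (1) and (2) pass respects_order, while any sequence that configures the DVD before both media are inserted does not.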
Second, a step may be a primitive action, such as (1), (2), and (4),
i.e., an action that can be directly executed, or a goal, such as (3),
that must be further decomposed by the task model (thus
leading to its hierarchical structure).
Third, a goal may have alternative decompositions. For example,
the details of goal (3) above will have many alternatives, depending on
the user's preferences, e.g., for quality versus recording size.
Finally, notice that copying a videotape to DVD, like many high-level
goals that users especially need help with, involves more than one
target component.
In terms of the syntax for specifying a task model in V2, the idea in a
nutshell is to specify a tree which, like the Presentation Template,
has UI socket commands as leaves. However, in the task model
tree, the intermediate nodes, rather than being aggregations for the
purpose of presentation, are "goal" aggregations, i.e., the children of
a goal node are the steps to achieve that goal.
Furthermore, to represent alternative decompositions, we introduce a
second type of node, each of whose children is an alternative.
The resulting structure is what is called an "and/or tree", commonly
used in many AI and engineering applications: "and" nodes decompose
goals into steps, "or" nodes specify alternative decompositions.
Finally, this tree structure can easily be decorated with applicability
conditions (for alternatives), ordering constraints (for steps), and
other logical constraints, using syntactic mechanisms already used in
other places in the V2 specification.
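To make the and/or tree concrete, here is a minimal sketch in Python rather than in V2 syntax. The Node class, the command names, and the two quality alternatives are all hypothetical; ordering constraints and applicability conditions are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str = "command"  # "command" (leaf), "and" (steps), or "or" (alternatives)
    children: list["Node"] = field(default_factory=list)


def expand(node, choose):
    """Flatten a task tree into leaf commands; choose() picks one alternative."""
    if node.kind == "command":
        return [node.name]
    if node.kind == "or":
        return expand(choose(node), choose)
    commands = []
    for child in node.children:  # "and" node: every child is a step
        commands.extend(expand(child, choose))
    return commands


copy_tape_to_dvd = Node("copyTapeToDvd", "and", [
    Node("insertTape"),
    Node("insertDvd"),
    Node("configureDvd", "or", [
        Node("setHighQuality"),  # hypothetical alternative
        Node("setLongPlay"),     # hypothetical alternative
    ]),
    Node("turnOnVcr"),
])

# Pick the first alternative at every "or" node.
print(expand(copy_tape_to_dvd, lambda node: node.children[0]))
```

In a fuller model the choose function would consult the applicability conditions attached to each alternative, for example the user's preference for quality versus recording size.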
This is obviously just a very rough sketch of how to incorporate a task
model into V2. Many more details need to be discussed and worked
out.
-EOF
------------
From: Zimmermann, Gottfried
Sent: Friday, April 01, 2005 9:22 AM
To: Bennett, Barbara; LaPlant, William (Bill)
Subject: RE: Comment received on the public review of INCITS 389
Barbara,
thanks for archiving the message of Dr. Winters.
However, I did receive another comment, from Dr. Rich (see
attachment). You should have got it also.
Thanks,
Gottfried
___________
From: Bennett, Barbara [mailto:bbennett@itic.org]
Sent: Thursday, March 31, 2005 8:01 PM
To: Zimmermann, Gottfried; LaPlant, William (Bill)
Cc: jack.winters@marquette.edu
Subject: FW: Comment received on the public review of INCITS 389
Importance: High