Division of Biomedical Computer Science

CS550 Spoken Dialogue Systems
Summer 2010

 

Instructor
Peter Heeman

When
Monday/Wednesday 4:00pm-5:30pm

Where
Wilson Clark Center 403

3 credits


Course Information

Spoken dialogue systems are already being deployed to help people find flight information, trade stocks, access email, and check traffic conditions. With the continuing advancements in speech technology, more information and services will become readily available. A simple cell phone will be enough to hook into the information age.

This course teaches the fundamentals of spoken dialogue systems. Spoken dialogue systems include components for speech recognition, parsing, semantic interpretation, dialogue management, text generation, speech synthesis, and agent architecture. The course will be organized around three frameworks for dialogue management: finite-state machines, form-filling, and speech-act reasoning. We will examine how speech recognition, parsing, and semantic interpretation fit into each framework. We will also contrast hand-crafting a dialogue manager with using machine learning.
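To make the difference between the first two frameworks concrete, here is a minimal sketch in Python (not the Tcl/Tk used in the assignments). The flight-booking states, slot names, and prompts are hypothetical and are not taken from the course materials or the CSLU toolkit. A finite-state manager walks a fixed sequence of states, while a form-filling manager asks only for whatever slots are still empty.

```python
# Illustrative sketch only: states, slots, and prompts are hypothetical.

# Finite-state dialogue manager: each state has a fixed prompt and a fixed successor.
FSM = {
    "ask_origin": ("Where are you leaving from?", "ask_destination"),
    "ask_destination": ("Where are you going?", "ask_date"),
    "ask_date": ("What day are you travelling?", "done"),
}


def run_fsm(answers):
    """Walk the states in their fixed order, consuming one answer per state."""
    state, transcript = "ask_origin", []
    answers = iter(answers)
    while state != "done":
        prompt, next_state = FSM[state]
        transcript.append((prompt, next(answers)))
        state = next_state
    return transcript


# Form-filling dialogue manager: a form of slots, prompting only for empty ones.
SLOT_PROMPTS = {
    "origin": "Where are you leaving from?",
    "destination": "Where are you going?",
    "date": "What day are you travelling?",
}


def next_prompt(form):
    """Return the prompt for the first unfilled slot, or None if the form is complete."""
    for slot, prompt in SLOT_PROMPTS.items():
        if form.get(slot) is None:
            return prompt
    return None
```

In this sketch, a user who volunteers the origin and date in one utterance would still be asked every question by `run_fsm`, whereas `next_prompt` would skip straight to the destination.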

There is no textbook for the class.


Prerequisites

Programming assignments will be in Tcl/Tk and will use the CSLU toolkit. No prior experience with either is required.

During the course, we will cover the basics of different formalisms for expressing knowledge, such as finite-state machines and context-free grammars. Students will be taught the basics of these formalisms and are not expected to have already taken a course on automata and formal languages.
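As a rough illustration of what a context-free grammar looks like in practice, the following Python sketch recognizes sentences with a CYK-style bottom-up chart. The toy grammar and lexicon are hypothetical, the grammar is assumed to be in Chomsky normal form, and this is not necessarily the parsing algorithm presented in class.

```python
# Illustrative sketch only: a hypothetical toy grammar in Chomsky normal form.
BINARY_RULES = {  # (B, C) -> A encodes the rule A -> B C
    ("NP", "VP"): "S",
    ("Det", "N"): "NP",
    ("V", "NP"): "VP",
}
LEXICON = {  # word -> category
    "the": "Det",
    "a": "Det",
    "flight": "N",
    "agent": "N",
    "books": "V",
}


def recognize(words, start="S"):
    """Return True if the word list can be derived from the start symbol."""
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] holds the categories that span words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        if word in LEXICON:
            chart[i][i + 1].add(LEXICON[word])
    # Build longer spans from pairs of shorter adjacent spans.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for left in chart[i][k]:
                    for right in chart[k][j]:
                        parent = BINARY_RULES.get((left, right))
                        if parent is not None:
                            chart[i][j].add(parent)
    return start in chart[0][n]
```

For example, `recognize("the agent books a flight".split())` succeeds because the chart builds two noun phrases, then a verb phrase, then a sentence, while `recognize("books the".split())` fails.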


CSLU Speech Toolkit

Students will use the CSLU Speech Toolkit for this class. It has been loaded on the CSE Windows machines. Students who have their own Windows-based PC can download and install it for free. Instructions for downloading it are located at http://cslu.cse.ogi.edu/toolkit/download. The toolkit has many components. We will use it solely to build spoken dialogue systems, starting with the Rapid Application Development (RAD) environment. See http://cslu.cse.ogi.edu/toolkit/docs/2.0/apps/rad/, which has a series of tutorials. Tutorials 1, 2, 6, 11, 15, and 16 are particularly useful; the others use features of RAD that we will not be exploring.

Tcl/Tk

The CSLU Speech Toolkit allows you to incorporate Tcl/Tk code when building your spoken dialogue systems with RAD. If you want to bypass the graphical interface of RAD, the toolkit provides functions that can be easily incorporated into a Tcl/Tk program. Hence, in this course, we will be using Tcl/Tk. Tcl/Tk is automatically installed with the toolkit. Some of the tutorials mentioned in the previous section focus on using Tcl/Tk. Other sources of information are located at http://tcl.ActiveState.com/doc. Tcl is a scripting language and Tk is a graphics toolkit. You will mainly be using the Tcl part.

Here is some information that I compiled about Tcl/Tk to get you started.


Homework

For each section, there will be a homework assignment, which will involve either creating a technology or incorporating it into a spoken dialogue system.

Final Project

Toward the end of the course, students will do a final project, which can be individual or group-based (at most 3 students). Groups will build on the systems that they have built during the homework assignments. Below are some example projects. The writeup should discuss the application and the capabilities that the spoken dialogue system needs. It should also discuss and justify the choices of underlying technology.

Timeline
Week 5: Project groups decided
Week 6: Each group hands in a one-page writeup of what their project will entail
Week 7: Each group meets with the professor for feedback on their proposal
Week 10: Presentation and writeup due

Groups work well when all members contribute to the project. Members do not have to contribute in the same way; rather, the group should take advantage of the differing strengths of its members. To encourage each member to participate fully, after finishing the project, each team member must hand in an evaluation of their team, consisting of a one-paragraph statement of how well they thought the team worked together and a score between 0 and 10 for each of their team members.


Grading

Assignments 40%
Presentation 15%
Final Project & Presentation 30%
Final Exam 15%

Class Schedule

Below is a tentative version of the class schedule. It will change over the next two weeks, but you can count on at least the next class being accurate. I am making it available so that you can get an idea of what will be taught in the course. I am also making tentative versions of the homework assignments and the lecture slides available.

Mon Sep 27
Class 1
Finite-State Dialogue Management: Basics of building simple spoken dialogue systems using finite-state models, including how speech recognition, parsing, and semantic interpretation can be easily incorporated.
Wed Sep 29
Class 2
Parsing: Compositional meaning. Bottom-up parsing algorithm.
  Homework 1: Implement a simple spoken language system using the CSLU toolkit. This will be a system-controlled dialogue: user responses will be highly constrained, just single words or short phrases.
Due Wednesday October 6 by 4:00pm.
Mon Oct 4
Class 3
Semantic Interpretation: Semantic interpretation using parallel semantic rules. Knowledge representation formalisms, including frames, hierarchical frames, FOPC and lambda calculus, and event-based semantics.
Supplementary Reading: Chapters 14 and 15 of D. Jurafsky & P. Martin, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition, Prentice Hall 2000.
Wed Oct 6
Class 4
Form-based Dialogue Management: Dialogue manager uses data structures to guide its behaviors.
  Homework 2: Implement a system that uses the speech recognition grammar and that does limited semantic processing.
Due Wednesday October 13 by 4:00pm.
  Homework 3: Search, parsing, semantic interpretation.
Here is some information about writing Tcl scripts. Due Wednesday October 13 by 4:00pm.
Mon Oct 11
Class 5
Hierarchical Forms: Dialogue manager uses hierarchical data structures to guide its behaviors.
Wed Oct 13
Class 6
Speech Acts: Philosophical and Artificial Intelligence views of speech acts.
Required Reading: David R. Traum, Speech Acts for Dialogue Agents, in Michael Wooldridge and Anand Rao, editors, "Foundations And Theories Of Rational Agents", Kluwer Academic Publishers, pages 169-201, 1999.
  Homework 4: Lambda Expressions. Form-Filling Dialogues.
Here is the file class04form.tcl that you need. Due Wednesday October 20 by 4:00pm.
Mon Oct 18
Class 7
ISU Toolkit for building dialogue managers.

Required Reading: Staffan Larsson and David Traum (2000): Information state and dialogue management in the TRINDI Dialogue Move Engine Toolkit. In Natural Language Engineering Special Issue on Best Practice in Spoken Language Dialogue Systems Engineering, Cambridge University Press, U.K. (pp. 323-340, 18 pages)

Wed Oct 20
Class 8
Continuation  
  Homework 5: Build a form-based spoken dialogue system. The car inventory is here (hw5cars.tcl).
Due Wednesday October 27.
Mon Oct 25
Class 9
Information State Example: Banking application cast in the Information State approach.

Wed Oct 27
Class 10
Learning Dialogue Strategies: Rather than hand-crafting a dialogue strategy, machine learning techniques can be used.

Required Reading:
A Stochastic Model of Human-Machine Interaction for Learning Dialog Strategies, Levin, Pieraccini and Eckert, Transactions on Speech and Audio Processing, 2000.

  Homework 6: Augment an information-state dialogue system. Start with the code in ISEngine.tcl and ISAgent.tcl.
Due Monday November 8.
  Homework 7: Information State II. Use the code in ISEngine2.tcl.
Due Wednesday November 10.
  Homework 8: Simulated Dialogues. Use the code in ISEngine3.tcl and ISAgent3.tcl. Also, make sure you get the official answers for Homework 7 from the instructor, as you should use those for question 2. Due Monday November 15.
Mon Nov 1   No class
Wed Nov 3   No class
Mon Nov 8
Class 11
Learning Dialogue Strategies II: MDP, Model-based RL, Model-Free RL, Flight Domain problem
Wed Nov 10
Class 12
Continuation
Mon Nov 15
Class 13
Learning Dialogue Strategies III: Epsilon-Greedy, Alpha, Q-Learning
  Homework 9: RL-IS. Make sure you get the official answers for Homework 8 from the instructor. Due Wednesday November 24.
Wed Nov 17
Class 14
Combining RL and IS. Required Reading:
Combining Reinforcement Learning with Information-State Update Rules. Heeman. In Proceedings of the North American Chapter of the Association for Computational Linguistics Annual Meeting, pages 268-275, Rochester NY, April 2007.
 
Mon Nov 22
Class 15
Negotiation, Training Data
Representing the Reinforcement Learning State in a Negotiation Dialogue. Heeman. In Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding, Merano Italy, December 2009.
Quantitative Evaluation of User Simulation Techniques for Spoken Dialogue Systems. Schatzmann, Georgila, and Young. In 6th SIGdial Workshop on Discourse and Dialogue, Lisbon Portugal, September 2005.
Wed Nov 24
Class 16
RL, Discourse Structure
Movie Domain: Scheffler and Young, Automatic learning of dialogue strategy using dialogue simulation and reinforcement learning
Discourse Structure: Grosz and Sidner, Attention, Intentions, and the Structure of Discourse
  Homework 10: Learning Policies. Due Friday December 3.
Mon Nov 29
Class 17
Initiative
Walker and Whittaker, 1990. Mixed Initiative in Dialogue: An Investigation into Discourse Segmentation
Strayer, Heeman and Yang, 2003. Reconciling Control and Discourse Structure
Wed Dec 1
Class 18
Turntaking
Yang and Heeman, 2010. Initiative Conflicts in Task-Oriented Dialogue
Heeman and Selfridge, 2010. Importance-Driven Turn-bidding for Spoken Dialogue Systems
Mon Dec 6
Class 19
Final

Plagiarism

Learning from and with each other is encouraged. However, interacting so as to avoid learning is not tolerated. Any discussion in which no personal notes (or programs) are taken in, and none are taken out, is fine. From such discussions, students should learn the material well enough to construct their notes on their own afterwards. If you are in doubt, the onus is on you to discuss the situation with the professor beforehand.