Proj-2013-2014-Sign2Speech-English

Objective
The goal of our project is to allow speechless people to communicate with a computer through sign language. Our program should be able to understand sign language in order to carry out the commands it is given. These commands should then be displayed in text form and converted into a voice signal using speech synthesis technology.

The program should also be able to learn new hand movements, increasing its database so that it can recognize more ideas.

The Team

 * Tutor: Didier Donsez
 * Members: Arthur CLERC-GHERARDI, Patrick PEREA
 * Department: RICM 4, Polytech Grenoble

State of the Art
 * Recognition of the sign language alphabet
 * Recognition of particular hand movements that express ideas (ZCam camera)
 * Recognition of particular hand movements that express ideas, with translation into Spanish (OpenCV)
 * Recognition of particular hand movements that express ideas (Kinect)

Tools
We will use two different technologies:

 * Leap Motion

The Leap Motion is a device that allows you to control your computer with your hands. There is no physical contact: communication with the computer is based on hand movements. You place the Leap Motion under your hands, next to the keyboard.

Compared to the Kinect, it is much smaller. The device measures 8 × 2.9 × 1.1 cm and has a frame rate of 200 Hz (against 30 Hz for the Kinect). The Leap Motion contains two 1.3 MP cameras filming in stereoscopy and three infrared LEDs. It can detect the position of all ten fingers.

The official website has a section for developers. You can download the SDK 1.0 (about 47 MB), which contains APIs for the following languages: C++, C#, Java, Python, Objective-C and JavaScript. The SDK also contains examples illustrating its libraries and functions.
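
As a first experiment with the Leap Motion, a minimal listener can print, for each frame, the detected hands and fingertip positions. The sketch below assumes the SDK 1.0 Python bindings (the Leap module shipped with the SDK); exact class and property names should be checked against the SDK samples.

{{{
import sys
import Leap  # Python bindings shipped with the Leap Motion SDK 1.0


class PrintListener(Leap.Listener):
    def on_frame(self, controller):
        # One frame per sensor update (up to ~200 per second).
        frame = controller.frame()
        for hand in frame.hands:
            tips = [finger.tip_position for finger in hand.fingers]
            print("hand at %s with %d visible fingers" % (hand.palm_position, len(tips)))


def main():
    listener = PrintListener()
    controller = Leap.Controller()
    controller.add_listener(listener)
    print("Press Enter to quit...")
    sys.stdin.readline()
    controller.remove_listener(listener)


if __name__ == "__main__":
    main()
}}}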


 * The Creative camera with the Intel® Perceptual Computing SDK

This camera also acts as a remote controller for the computer. It is placed in front of the user.

The Creative camera provides depth sensing, which enables developers to distinguish between the different planes of a shot. The camera films at around 30 fps in 720p.

Intel also provides an SDK for developers. Some of its libraries will help us with hand and finger tracking and with facial recognition.
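
To illustrate what the depth information gives us, independently of the SDK's hand-tracking modules, the sketch below separates the foreground (a hand close to the camera) from the background in a single depth frame. How the frame is obtained from the Perceptual Computing SDK is left out; it is assumed here to be a plain NumPy array of distances in millimetres, and the 60 cm threshold is an arbitrary choice for illustration.

{{{
import numpy as np


def foreground_mask(depth_frame_mm, threshold_mm=600):
    """Return a boolean mask of pixels closer than threshold_mm.

    depth_frame_mm is assumed to be a 2-D array of distances in
    millimetres, with 0 meaning "no depth data".
    """
    valid = depth_frame_mm > 0
    return valid & (depth_frame_mm < threshold_mm)


# Synthetic 720p frame: background at 1.5 m, a small "hand" region at 40 cm.
frame = np.full((720, 1280), 1500, dtype=np.uint16)
frame[300:420, 560:720] = 400
mask = foreground_mask(frame)
print("foreground pixels:", int(mask.sum()))
}}}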

=1.  Introduction=

1.5 Overview of the remainder of the document
=2.  General description=

2.5 Assumptions and dependencies
=3. Specific requirements, covering functional, non-functional and interface requirements=
 * document external interfaces,
 * describe system functionality and performance
 * specify logical database requirements,
 * design constraints,
 * emergent system properties and quality characteristics.

a. Hand movements recognition
Description: The application must recognize the hand movements of the speechless user through the computer's input devices.

Inputs: Stream data

Source: Leap Motion and Creative camera sensors

Outputs: Database index corresponding to the recognized gesture

Destination: Local machine

Action: The user must place himself in front of the camera or above the Leap Motion. He communicates with the computer through hand movements. The movements are captured by the sensors and the data is transmitted to the program. The program processes it, performs the computation and determines which sign language movement it is.

Non functional requirements: The recognition process should be done in real time (3 seconds maximum).

Pre-condition: When using the Creative camera, ambient light must be sufficient for the user's movements to be seen correctly. When using the Leap Motion, the hands must be above the sensor. The recognized gesture is present in the database.

Post-condition: The user's hand movement has been correctly recognized.

Side-effects: Bad gesture recognition
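
The recognition algorithm itself is not fixed by this requirement. One simple approach consistent with it is nearest-neighbour matching of a fixed-length feature vector (for instance, normalised fingertip positions over a short time window) against the templates stored in the database; the feature extraction and the rejection threshold below are assumptions made for illustration only.

{{{
import math


def match_gesture(features, templates, max_distance=0.5):
    """Return the database index of the closest template, or None.

    features is a fixed-length list of floats extracted from the sensor
    stream; templates maps a database index to a feature vector of the
    same length.  A gesture further than max_distance from every known
    template is rejected (and handed over to the self-learning step).
    """
    best_index, best_dist = None, float("inf")
    for index, template in templates.items():
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(features, template)))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index if best_dist <= max_distance else None


# Example: two known gestures, and an observation close to gesture 1.
templates = {1: [0.0, 0.1, 0.9], 2: [0.8, 0.8, 0.1]}
print(match_gesture([0.05, 0.12, 0.88], templates))  # -> 1
}}}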

b. Literal translation
Description: The application must show on the screen the word or idea corresponding to the detected user movement.

Inputs: Data processed by our program and the sign language gesture database.

Source: Hard drive of the local machine

Outputs: String

Destination: The screen

Action: The program searches the database for the literal translation corresponding to the recognized gesture.

Non functional requirements: /

Pre-condition: The literal translation of the recognized gesture is present in the database.

Post-condition: The literal translation is correctly shown on the screen.

Side-effects:
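
The storage format of the gesture database is not decided in this document. Assuming a simple SQLite table that maps a gesture index to its literal translation, the lookup could look like the sketch below (the table and column names are hypothetical).

{{{
import sqlite3


def literal_translation(db_path, gesture_index):
    """Return the word or idea stored for gesture_index, or None."""
    connection = sqlite3.connect(db_path)
    try:
        row = connection.execute(
            "SELECT translation FROM gestures WHERE id = ?",
            (gesture_index,),
        ).fetchone()
        return row[0] if row else None
    finally:
        connection.close()


# Example: display the translation of the gesture recognized in step a.
# print(literal_translation("gestures.db", 1))
}}}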

c. Vocal synthesis
Description: The application must say aloud the literal translation of the user's gesture found in the previous step.

Inputs: Database index

Source: Local machine

Outputs: Sound

Destination: Speakers

Action: The program searches the database for the sound corresponding to the recognized gesture.

Non functional requirements: /

Pre-condition: The computer must have speakers. The sound of the recognized gesture is present in the database.

Post-condition: The sound is correctly played through the speakers.

Side-effects:
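
The speech synthesis technology has not been chosen yet. As an illustration, an offline text-to-speech library such as pyttsx3 can speak the literal translation found in step b; the library is an assumption, not part of the specification.

{{{
import pyttsx3  # offline text-to-speech engine; one possible choice, not imposed


def speak(text):
    """Say text through the local speakers."""
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()


# Example: voice output for the translation found in step b.
# speak(literal_translation("gestures.db", 1))
}}}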

d. Self-learning
Description: Learn a new sign language gesture

Inputs: Stream data

Source: Local machine

Outputs: New database index

Destination: Local machine

Action: If the gesture is not recognized, the program should ask the user to enter the corresponding literal translation. Then the application adds a new entry to the database.

Non functional requirements: The recognition process should be done quickly (3 seconds maximum).

Pre-condition:

Post-condition: A new entry corresponding to the unknown gesture is added to the database.

Side-effects:
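
Continuing with the hypothetical SQLite layout used above, adding an unknown gesture amounts to asking the user for its literal translation and inserting a new row, whose id becomes the new database index.

{{{
import json
import sqlite3


def learn_gesture(db_path, features):
    """Ask the user for a translation and store the unknown gesture.

    features is the feature vector that was not matched in step a; it is
    stored as JSON next to the translation, using the same hypothetical
    table as the translation sketch above.  Returns the index of the
    newly created database entry.
    """
    translation = input("Unknown gesture - enter its literal translation: ")
    connection = sqlite3.connect(db_path)
    try:
        cursor = connection.execute(
            "INSERT INTO gestures (translation, features) VALUES (?, ?)",
            (translation, json.dumps(features)),
        )
        connection.commit()
        return cursor.lastrowid
    finally:
        connection.close()
}}}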

=4. Product evolution=

=5. Appendices=

5.1. SRS structure
This document is based on a Software Requirements Specification (SRS) template inspired by the IEEE/ANSI 830-1998 standard.

References:
 * http://www.cs.st-andrews.ac.uk/~ifs/Books/SE9/Presentations/PPTX/Ch4.pptx
 * http://en.wikipedia.org/wiki/Software_requirements_specification
 * IEEE Recommended Practice for Software Requirements Specifications IEEE Std 830-1998

=6. Index=