SRS - Sign2Speech

Introduction

Purpose of the requirements document

This Software Requirements Specification (SRS) identifies the requirements for the "Sign2Speech" project. This is an open source project, and we present what we did in order to attract the interest of potential new contributors. This document serves as a guide to the functionalities offered and the problems that the system solves.

Scope of the product

Sign2Speech could be used at reception desks or during video conferences to allow signing people to communicate with people who don't know French Sign Language (FSL). The main point of this project is to use Intel's Real Sense camera to recognize gestures from FSL and so offer a new means of communication. The program will transcribe gestures made by a signing person into written words, displayed on the screen of the person who doesn't know FSL. This communication will take place through a chat application built on WebRTC.

Glossary

  • FSL: French Sign Language
  • JSON: JavaScript Object Notation. We use this format to store our dictionary.

References

Overview of the remainder of the document

The remainder of the document first gives a general description of the software. The functional and non-functional requirements are then specified. The document ends with the product evolution, the appendices and the index.

General Description

Product perspective

The main aim of our project is to help people who cannot speak to communicate with other people. To this end, we are developing software that recognizes and analyzes sign language and transcribes it into written text.

Product functions

Our application is made of two major parts:

  • The recognition and translating application

This part of the program tries to recognize the gestures performed in front of the camera. It is linked to a dictionary (a JSON file) that contains all the words the program is able to recognize. If a gesture is recognized, its meaning is printed on the screen. If not, nothing is displayed.
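The lookup behaviour described above can be sketched as follows. This is only an illustration: the JSON layout, the encoding strings and the words are hypothetical placeholders, since the actual schema of the Sign2Speech dictionary is not described in this document.

```python
import json

# Hypothetical dictionary layout: gesture encoding -> word.
# Keys and words are placeholders, not real FSL data.
DICTIONARY_JSON = """
{
  "11111|UP": "hello",
  "01100|RIGHT": "two",
  "00000|DOWN": "yes"
}
"""


def load_dictionary(text):
    """Parse the JSON dictionary into a plain mapping."""
    return json.loads(text)


def translate(dictionary, encoding):
    """Return the word for a recognized gesture, or None if it is unknown."""
    return dictionary.get(encoding)


dictionary = load_dictionary(DICTIONARY_JSON)
for encoding in ("11111|UP", "10101|LEFT"):
    word = translate(dictionary, encoding)
    if word is not None:
        print(word)  # recognized gesture: show its meaning on screen
    # unrecognized gesture: nothing is displayed
```

A flat mapping keeps the "recognized or nothing" rule easy to see; a real dictionary may well be structured differently.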

  • The WebRTC chat application

It enables communication between two people, using video and a text chat. It also allows real-time transmission of the subtitles.

User characteristics

There are two types of users for the application:

  • The first user (User1) knows the FSL,
  • The second user (User2) doesn't understand the FSL

They want to communicate with each other. User1 will stand in front of the camera and sign, while User2 will watch the monitor to see the translation of the gestures. User2 will be able to reply using the text chat.

Operating environment

The camera's SDK only works on Windows. A good Internet connection is also required for the chat application, which only works with Mozilla Firefox: Chrome restricts WebRTC to HTTPS connections, and other browsers do not provide full support for WebRTC. The user must not use a hotspot or a connection that sits behind a complex NAT, because the WebRTC connection process will struggle to traverse the NAT and will return an error message.

General constraints

Of course, the user needs to have Intel's Real Sense camera. We have identified several factors that affect the hand tracking process, which lead to the following recommendations:

  • The user must not wear bracelets or rings,
  • The user should, as much as possible, use the camera under natural light, rather than artificial light sources,
  • The user must wear a monochrome top that contrasts with the color of the skin.

Following these recommendations reduces errors and improves the tracking, but it still won't be perfect because of the imprecision of the camera itself.

Assumptions and dependencies

Specific requirements, covering functional, non-functional and interface requirements

Requirement X.Y.Z (in Structured Natural Language)

Gesture recognition

Description:

Inputs: Hand and finger data returned by the camera stream

Source: Intel's Real Sense camera

Outputs: The encoding corresponding to the gesture (fingers + trajectory)

Destination: Computer's memory

Action:

Non functional requirements: Real-time tracking

Pre-condition:

Post-condition: The gesture must have been correctly recognized by the camera in order to be correctly encoded

Side-effects: If the gesture has not been correctly recognized and encoded, it will not be recognized or correctly translated afterwards
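As an illustration of what a "fingers + trajectory" encoding could look like, here is a minimal sketch. The `HandFrame` type, the bit-string format, the direction labels and the threshold are all assumptions made for this example; they are not the project's actual encoding, nor the camera SDK's data model.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class HandFrame:
    """Hypothetical per-frame hand data (not the SDK's real structure)."""
    fingers: Tuple[bool, bool, bool, bool, bool]  # thumb to pinky: extended?
    palm: Tuple[float, float]                     # palm centre (x, y)


def encode_fingers(fingers):
    """One character per finger: '1' if extended, '0' if folded."""
    return "".join("1" if extended else "0" for extended in fingers)


def encode_trajectory(frames: List[HandFrame], threshold: float = 0.05):
    """Dominant direction of the palm between the first and last frame."""
    dx = frames[-1].palm[0] - frames[0].palm[0]
    dy = frames[-1].palm[1] - frames[0].palm[1]
    if max(abs(dx), abs(dy)) < threshold:
        return "STILL"
    if abs(dx) >= abs(dy):
        return "RIGHT" if dx > 0 else "LEFT"
    return "UP" if dy > 0 else "DOWN"


def encode_gesture(frames: List[HandFrame]):
    """Combine the initial finger state and the overall trajectory."""
    return encode_fingers(frames[0].fingers) + "|" + encode_trajectory(frames)


open_hand = (True, True, True, True, True)
frames = [HandFrame(open_hand, (0.0, 0.0)), HandFrame(open_hand, (0.0, 0.3))]
print(encode_gesture(frames))
```

Reducing the trajectory to a single direction is a deliberate simplification; a real encoder would probably sample the path at several points.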

Gesture translation

Description:

Inputs: The gesture encoding returned by the gesture recognition step

Source: Computer's memory

Outputs: The current node moves through the dictionary, and the corresponding word is sent if there is a match

Destination: WebSocket channel

Action:

Non functional requirements: Real-time search in the dictionary

Pre-condition: The encoding must have a translation in the dictionary

Post-condition: The gesture must be correctly translated

Side-effects:
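The "current node" wording above suggests a tree-shaped dictionary that is walked one gesture encoding at a time. The sketch below shows one possible reading of that idea. The trie layout, the `_word` marker, the `Translator` class and the `send` callback (standing in for the WebSocket channel) are hypothetical and not taken from the Sign2Speech code.

```python
# Hypothetical trie: each key is one gesture encoding; "_word" marks a match.
TRIE = {
    "11111|UP": {"_word": "hello"},
    "01100|RIGHT": {
        "01100|LEFT": {"_word": "thanks"},
    },
}


class Translator:
    """Walks the dictionary trie, one gesture encoding per step."""

    def __init__(self, root, send):
        self.root = root
        self.node = root   # the "current node" inside the dictionary
        self.send = send   # stand-in for the WebSocket channel

    def feed(self, encoding):
        """Advance with one encoding; return the word on a match, else None."""
        child = self.node.get(encoding)
        if child is None:
            # Dead end: retry this encoding as the start of a new word.
            child = self.root.get(encoding)
            if child is None:
                self.node = self.root
                return None
        word = child.get("_word")
        if word is not None:
            self.send(word)        # match: send the word and start over
            self.node = self.root
            return word
        self.node = child          # partial match: wait for the next gesture
        return None


sent = []
translator = Translator(TRIE, sent.append)
for step in ("11111|UP", "01100|RIGHT", "01100|LEFT", "10101|DOWN"):
    translator.feed(step)
print(sent)
```

Resetting to the root after a dead end is one possible recovery policy; the project may handle unmatched sequences differently.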

Learning mode

Function:

Description:

Inputs:

Source:

Outputs:

Destination:

Action:

Non functional requirements:

Pre-condition:

Post-condition:

Side-effects:

Product Evolution

Appendices

Index