- 1 Introduction
- 2 General Description
- 3 Specific requirements, covering functional, non-functional and interface requirements
- 4 Product Evolution
- 5 References
Purpose of the requirements document
This Software Requirements Specification (SRS) identifies the requirements for the project "RealTimeSubtitles". It serves as a guideline to the features offered and the problems we will have to solve. RealTimeSubtitles is an open source project hosted on GitHub; the code is organized so that it can be reviewed by us or by potential new contributors.
Scope of the product
The main goal of our project is to help partially deaf students be more autonomous when attending a lecture. The project was proposed by the department for disabled students at the UGA. In addition, we have to design a collaborative HMI that lets students correct the subtitles in real time.
The app is divided into 2 parts:
- The transcription by GoogleSpeech
First, the API must recognize the teacher's speech and transcribe it in real time. Final results are appended in the right place according to the current slide.
- The collaborative HMI
Designed for students, it allows logged-in students to follow a course. While the teacher speaks, students can either follow the course and read the subtitles, or edit the subtitles to correct the results.
There are three types of users for our app:
- The teacher talking while showing his slides
- The students editing notes
- The students reading the notes and the partially deaf students
The GoogleSpeech API works on Google Chrome. A good Internet connection is required for the transcription.
- The teacher needs to have his slides in reveal.js
- The teacher needs to speak loudly and not too fast
- The room has to be quiet (no background noise)
These conditions can reduce errors and help the API transcribe the speech well. However, the result won’t be perfect due to the instability of the GoogleSpeech API.
Specific requirements, covering functional, non-functional and interface requirements
Requirement X.Y.Z (in Structured Natural Language)
Description: Capture the voice and return a textual transcription
Inputs: Voice of a speaker
Outputs: Textual data
Action: A speaker talks into a microphone and the system returns the transcript as text
Non functional requirements: Accurate detection of spoken words
Pre-condition: User has a microphone
Post-condition: Words are detected
Side-effects: Words are not detected or are detected incorrectly
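As a rough illustration of this requirement, the sketch below separates final results from interim ones, so that only stable text is handed on as the transcript. The types are simplified local stand-ins for the result objects a browser speech recognizer delivers, and the names `RecognitionResult` and `splitTranscripts` are assumptions made for this example, not part of the project.

```typescript
// Simplified, hypothetical stand-ins for the results a speech recognizer emits.
interface Alternative {
  transcript: string; // recognized text
  confidence: number; // recognizer's estimate between 0 and 1
}

interface RecognitionResult {
  isFinal: boolean; // true once the recognizer will no longer revise this result
  alternatives: Alternative[]; // best guess first
}

interface TranscriptUpdate {
  finalText: string;
  interimText: string;
}

// Walk the results starting at resultIndex and split them into final and interim text.
function splitTranscripts(results: RecognitionResult[], resultIndex: number): TranscriptUpdate {
  let finalText = "";
  let interimText = "";
  for (const result of results.slice(resultIndex)) {
    const best = result.alternatives[0];
    if (!best) continue; // nothing recognized for this result
    if (result.isFinal) {
      finalText += best.transcript;
    } else {
      interimText += best.transcript;
    }
  }
  return { finalText, interimText };
}
```

Interim text can be shown as a preview, while only `finalText` would be stored as subtitles, which limits the effect of the wrong detections listed under side-effects.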
Render the subtitles on the slides
Description: Show the subtitles on the slides
Inputs: Spoken words
Source: Speech recognizer
Outputs: slides with subtitles
Action: Get the spoken words and display them correctly on the slides
Non functional requirements: No loss of data
Pre-condition: Spoken words are detected
Post-condition: Slides are shown with subtitles
Side-effects: Subtitles are badly displayed and hide the slides. Subtitles are not readable.
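One way to meet the "no loss of data" requirement is to append each final result to a per-slide list rather than overwrite earlier text. The sketch below assumes the current slide is identified by a plain number; the class `SubtitleStore` and its methods are hypothetical names chosen for this example.

```typescript
// Hypothetical in-memory store: slide index -> subtitle fragments in arrival order.
class SubtitleStore {
  private readonly bySlide = new Map<number, string[]>();

  // Append a final transcript fragment to the given slide; existing text is never overwritten.
  append(slideIndex: number, text: string): void {
    const trimmed = text.trim();
    if (trimmed === "") return; // ignore empty results
    const fragments = this.bySlide.get(slideIndex) ?? [];
    fragments.push(trimmed);
    this.bySlide.set(slideIndex, fragments);
  }

  // Return the subtitle text to render under a slide (empty string if none yet).
  textFor(slideIndex: number): string {
    return (this.bySlide.get(slideIndex) ?? []).join(" ");
  }
}
```

Keeping the fragments separate until display time also makes it possible to limit how many are shown at once, so the subtitles do not hide the slide content.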
Description: User can edit subtitles: add or edit words
Inputs: Wrongly detected word
Source: Speech recognizer
Outputs: corrected word
Destination: shown subtitles
Action: User clicks on the word he wants to edit, then edits it with his keyboard. User clicks on the blank space between words to add a word.
Non functional requirements: Easy to click between words, or add a word
Pre-condition: Words are detected
Post-condition: words are added or modified
Side-effects: Removing a correct word, text not displayed properly.
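The two editing actions can be sketched as operations on a list of words, assuming the subtitles are kept as one string per word. The function names `replaceWord` and `insertWord` and the gap numbering (gap `i` sits just before word `i`) are assumptions made for this example.

```typescript
// Replace the word at wordIndex and return a new array; the input is left untouched.
function replaceWord(words: readonly string[], wordIndex: number, corrected: string): string[] {
  if (!Number.isInteger(wordIndex) || wordIndex < 0 || wordIndex >= words.length) {
    throw new RangeError(`no word at index ${wordIndex}`);
  }
  const copy = [...words];
  copy[wordIndex] = corrected;
  return copy;
}

// Insert a word into the gap before words[gapIndex]; gapIndex === words.length appends.
function insertWord(words: readonly string[], gapIndex: number, added: string): string[] {
  if (!Number.isInteger(gapIndex) || gapIndex < 0 || gapIndex > words.length) {
    throw new RangeError(`no gap at index ${gapIndex}`);
  }
  return [...words.slice(0, gapIndex), added, ...words.slice(gapIndex)];
}
```

Because both functions return a new array, the previous version of the subtitles stays available, which could help undo an edit that removed a correct word.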
Description: User has his own session
Inputs: User profile
Source: User profile
Outputs: A logged-in user
Destination: security manager, session control
Action: User clicks on the login form and enters his login and password.
Non functional requirements: Secured against SQL injection
Pre-condition: User wants to log in and knows his login and password
Post-condition: User is logged in
Side-effects: Users are tracked by id. Users cannot delete other users' courses.
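A common way to guard a login form against SQL injection is to send the SQL text and the user input separately, as a parameterized query. This document does not name a database, so the sketch below only assumes an SQL database whose driver accepts placeholders; the table `users`, its columns, the `$1` placeholder style, and the function `buildLoginQuery` are all hypothetical.

```typescript
// Shape of a parameterized query: SQL text with placeholders, values sent separately.
interface ParameterizedQuery {
  text: string;
  values: string[];
}

// Build the lookup for a login attempt. The user-supplied login never becomes part of the SQL text.
function buildLoginQuery(login: string): ParameterizedQuery {
  return {
    // Hypothetical table and column names; $1 is a PostgreSQL-style placeholder.
    text: "SELECT id, password_hash FROM users WHERE login = $1",
    values: [login],
  };
}
```

With this approach the password is not part of the query at all: the stored hash would be fetched by login and compared in application code.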
Product Evolution
- A different, more efficient speech API
- Using RealTimeSubtitles in meetings and conferences