Developed by Team Anyhow: Samagra Agarwal, Srushti Talandage, and Bhoumik Sangle
SignCast is a real-time accessibility tool designed to bridge the communication gap for the speech-impaired. By leveraging computer vision and deep learning, SignCast translates American Sign Language (ASL) gestures into spoken language instantaneously.
For millions of individuals who rely on ASL, communicating with non-signers in real time remains a significant challenge. SignCast provides a software-based bridge, converting complex hand movements into synthesized speech, empowering users to express themselves naturally in any environment.
SignCast uses a sophisticated pipeline to ensure that both the shape and the motion of signs are captured accurately.
- Input: Real-time video stream processed at 30 FPS.
- Landmark Extraction: MediaPipe extracts 21 3D hand landmarks per frame.
- Normalization: To make the model invariant to where the user sits in the frame, all landmarks are normalized relative to the wrist (landmark 0).
- Temporal Processing: A Bi-Directional LSTM (Long Short-Term Memory) network processes a sliding window of 30 frames to understand the motion.
- Classification: A softmax layer selects the most likely gloss from a vocabulary of 2,700 signs.
- Output: The predicted text is converted to audio via a Text-to-Speech (TTS) engine.
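The normalization and sliding-window steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: the 30-frame window, 21 landmarks, and wrist index 0 come from the pipeline description, while the helper names (`normalize`, `push_frame`) are hypothetical.

```python
from collections import deque

import numpy as np

WINDOW = 30          # frames per sliding window, as in the pipeline
NUM_LANDMARKS = 21   # MediaPipe hand landmarks per frame
WRIST = 0            # landmark index used as the origin

def normalize(landmarks: np.ndarray) -> np.ndarray:
    """Translate all 21 (x, y, z) landmarks so the wrist sits at the origin,
    making the features invariant to the hand's position in the frame."""
    return landmarks - landmarks[WRIST]

# Rolling buffer of the most recent 30 normalized frames.
window = deque(maxlen=WINDOW)

def push_frame(landmarks: np.ndarray):
    """Add one (21, 3) frame; return a (30, 63) feature tensor once full."""
    window.append(normalize(landmarks).flatten())
    if len(window) == WINDOW:
        return np.stack(window)  # ready for the Bi-LSTM
    return None
```

Because the deque has `maxlen=30`, the oldest frame is discarded automatically as new frames arrive, so the model always sees the latest second of motion at 30 FPS.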
- Dataset: ASL Citizen (2,700+ distinct signs).
- Data Split: 60% Training | 25% Validation | 15% Testing.
- Accuracy: Achieved a final validation accuracy of 85.74%.
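The 60/25/15 split above can be reproduced with two passes of scikit-learn's `train_test_split` (a sketch with dummy arrays; the `random_state` value and variable names are illustrative):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Dummy stand-ins for landmark sequences and gloss labels.
X = np.arange(1000).reshape(1000, 1)
y = np.zeros(1000)

# First pass: 60% train, 40% held out.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.40, random_state=42)

# Second pass: split the 40% remainder into 25% validation
# and 15% test of the full dataset (0.15 / 0.40 = 0.375).
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.375, random_state=42)
```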
- TypeScript & JavaScript – Frontend logic and real-time interactions
- HTML5 & CSS3 – Responsive user interface and layout
- MediaPipe Hands – Real-time hand landmark detection
- OpenCV – Webcam capture and frame processing
- TensorFlow / Keras – Sequence-based sign language classification model
- NumPy – Landmark preprocessing and tensor handling
- Scikit-learn – Label encoding and dataset utilities
- pyttsx3 – Offline text-to-speech conversion
- Python – Core ML pipeline and inference logic
- Git & GitHub – Version control and collaboration
Prerequisites: Python 3.10
```bash
# Clone the repository
git clone https://github.com/your-repo/SignCast.git
cd SignCast

# Create a virtual environment
python3 -m venv signcast_env
source signcast_env/bin/activate

# Install dependencies
pip install tensorflow mediapipe opencv-python pandas numpy scikit-learn pyttsx3
```
```
├── /models
│   ├── SignCast_best_model.h5   # Trained Bi-LSTM model
│   └── label_map.json           # Dictionary for 2,700 signs
├── data_loader.py               # Custom ASLDataGenerator
├── train_model.py               # Model architecture & training script
└── webcam_test.py               # Real-time inference & TTS integration
```
Team Anyhow is committed to expanding SignCast into a full-scale accessibility suite:
- Custom Word Feature: Allowing users to record and train personal idiosyncratic signs.
- Subtitles: Integrated overlay for video conferencing platforms.
- Holistic Tracking: Expanding the landmark set beyond the hands to include facial expressions and body pose for more nuanced translation.