
IsharaKotha: Bangla Sign Language Corpus

Updated 28 November 2025
  • IsharaKotha is an avatar-based Bangla Sign Language resource that uses the HamNoSys notation system to encode signs for detailed, language-agnostic transcription.
  • It comprises a structured corpus of 3,823 annotated entries with integrated SiGML files, facilitating dynamic, real-time sign generation via a modular animation pipeline.
  • The system employs a deep learning-based lemmatizer achieving 79.22% accuracy and renders animations at approximately 30 fps; three human evaluators rated the output at a mean of 3.14 on a four-point scale.

IsharaKotha is an avatar-based Bangla Sign Language (BSL) resource designed for text-to-sign translation, integrating a structured linguistic corpus encoded in the Hamburg Notation System (HamNoSys) with a modular animation rendering pipeline. It is the first comprehensive, HamNoSys-based Bangla Sign Language corpus and supports both research and practical applications requiring dynamic sign generation and avatar animation from Bangla text input (Islam et al., 21 Nov 2025).

1. Corpus Architecture and Linguistic Representation

The IsharaKotha corpus is phonetically encoded using HamNoSys, a notation system developed for detailed, language-agnostic transcription of signed languages. Each sign, corresponding to a letter, digit, or word, is decomposed into five features:

  • Handshape (~200 possible configurations)
  • Orientation (six principal palm/finger directions)
  • Location (over 30 body-relative positions)
  • Movement (e.g., straight, circular, repeated)
  • Non-manual features (facial expressions, head/lip movement)

For instance, the sign for the Bangla word for "book" combines a two-hand symmetry operator, flat handshapes with palmar orientation inward, finger contact at chest height, an opening motion, and no non-manual markers. Signs authored in HamNoSys are converted into SiGML (Signing Gesture Markup Language) XML files using the SiS-Builder toolkit, with separate <hamnosys_manual> and <hamnosys_nonmanual> tags representing manual and non-manual components, respectively, and a gloss attribute linking each sign to its Bangla term (Islam et al., 21 Nov 2025).
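As a rough illustration of the SiGML layout described above, the following Python sketch assembles one sign from a list of HamNoSys symbol names. The `sigml` root element, the `hns_sign` wrapper, and the specific symbol names are assumptions based on common SiGML usage, not details taken from the IsharaKotha files, and the symbol sequence is a placeholder rather than the corpus transcription of "book".

```python
import xml.etree.ElementTree as ET


def build_sigml(gloss, manual_symbols, nonmanual_symbols=()):
    """Build a SiGML string for one sign from HamNoSys symbol names."""
    root = ET.Element("sigml")  # assumed root element
    sign = ET.SubElement(root, "hns_sign", {"gloss": gloss})  # assumed wrapper
    # Non-manual and manual components live in separate containers.
    nonmanual = ET.SubElement(sign, "hamnosys_nonmanual")
    for name in nonmanual_symbols:
        ET.SubElement(nonmanual, name)
    manual = ET.SubElement(sign, "hamnosys_manual")
    for name in manual_symbols:
        ET.SubElement(manual, name)
    return ET.tostring(root, encoding="unicode")


# Placeholder symbols: symmetry operator, flat hand, orientation,
# location, movement. Not the actual corpus entry.
book_symbols = ["hamsymmlr", "hamflathand", "hamextfingero",
                "hampalml", "hamchest", "hammover"]
doc = build_sigml("বই", book_symbols)
print(doc)
```

An empty `<hamnosys_nonmanual>` container corresponds to a sign with no non-manual markers, as in the "book" example.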

2. Corpus Scope, Coverage, and Metadata

The corpus comprises 3,823 annotated sign entries, spanning alphabets, digits, and 34 semantic classes of vocabulary:

| Category                | Entries |
|-------------------------|---------|
| Alphabets               | 49      |
| Digits                  | 10      |
| Word signs (34 classes) | 3,764   |
| Total                   | 3,823   |

Semantic classes, with their entry counts, include Crime & Law (38), Economics (35), Food & Drinks (234), Household Items (342), Human Characteristics (470), Sports (53), and Others (776). Each entry is annotated with:

  • Bangla orthographic gloss
  • HamNoSys transcription
  • SiGML file (manual/non-manual)
  • Semantic category tag

This structure enables downstream NLP tasks, comprehensive annotation, and consistent mapping between Bangla text and sign form (Islam et al., 21 Nov 2025).
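One way to hold these four annotation fields in code is a small record type plus a gloss index. This is a minimal sketch: the class name, field names, file paths, and the two sample entries are hypothetical, and only the category totals come from the table above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SignEntry:
    gloss: str       # Bangla orthographic gloss
    hamnosys: str    # HamNoSys transcription (placeholder text here)
    sigml_path: str  # location of the SiGML file (hypothetical path)
    category: str    # semantic category tag


# Entry counts as reported for the corpus.
REPORTED_COUNTS = {"Alphabets": 49, "Digits": 10, "Word signs": 3764}
total = sum(REPORTED_COUNTS.values())

# Hypothetical sample entries, for illustration only.
entries = [
    SignEntry("বই", "<hamnosys placeholder>", "sigml/boi.sigml", "Household Items"),
    SignEntry("১", "<hamnosys placeholder>", "sigml/one.sigml", "Digits"),
]

# Index by gloss for text-to-sign lookup; count entries per category.
by_gloss = {entry.gloss: entry for entry in entries}
per_category = Counter(entry.category for entry in entries)

print(total)
print(by_gloss["বই"].sigml_path)
```

Keying the index on the gloss mirrors the mapping from Bangla text to sign form that the corpus is designed to support.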

3. Text-to-Sign Translation Pipeline and Lemmatization

The IsharaKotha workflow operates as follows:

  1. Input Processing: Raw Bangla sentences are segmented and tokenized.
  2. Lemmatization: Inflected tokens are mapped to lemmas using a deep learning–based sequence-to-sequence (Seq2Seq) model with global attention. A Bi-LSTM encoder processes the input character sequence, and a unidirectional LSTM decoder attends over the encoder states with weights

$$\alpha_{t,s} = \frac{\exp(e_{t,s})}{\sum_{s'}\exp(e_{t,s'})}$$

where $e_{t,s} = v^\top \tanh(W_s h_s + W_d s_{t-1})$, $h_s$ is the encoder state at source position $s$, and $s_{t-1}$ is the previous decoder state. A softmax output layer predicts each output character, and the model is trained with the cross-entropy loss

$$\mathcal{L}_{CE} = -\sum_{t=1}^{T}\sum_{v=1}^{|V|} y_{t,v}\log\hat{y}_{t,v}$$

Trained on a corpus of 94,781 word-form pairs, the lemmatizer achieves 79.22% accuracy.

  3. SiGML Retrieval: For each lemma, a pre-computed SiGML file is located.
  4. Animation Rendering: SiGML sequences are rendered via a 3D avatar engine (Islam et al., 21 Nov 2025).
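The attention weights and loss from the lemmatization step can be checked numerically with a few lines of plain Python. This is a sketch of the formulas only, not the trained model: the two-dimensional states and the weight values are made-up toy numbers, and the function names are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def attention_weights(enc_states, dec_prev, W_s, W_d, v):
    """Return alpha_{t,s} over all source positions for one decoder step."""
    d_proj = matvec(W_d, dec_prev)  # W_d s_{t-1}
    scores = []
    for h in enc_states:
        h_proj = matvec(W_s, h)  # W_s h_s
        hidden = [math.tanh(a + b) for a, b in zip(h_proj, d_proj)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))  # e_{t,s}
    # Softmax over source positions, shifted by the max for stability.
    top = max(scores)
    exps = [math.exp(e - top) for e in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def cross_entropy(target_ids, predicted):
    """Cross-entropy for one-hot targets: only the target term survives."""
    return -sum(math.log(dist[y]) for y, dist in zip(target_ids, predicted))


# Toy values, chosen arbitrarily for illustration.
enc_states = [[0.1, 0.4], [0.3, -0.2], [-0.5, 0.2]]
dec_prev = [0.2, 0.1]
W_s = [[0.5, -0.1], [0.2, 0.3]]
W_d = [[0.1, 0.4], [-0.3, 0.2]]
v = [0.7, -0.6]

alpha = attention_weights(enc_states, dec_prev, W_s, W_d, v)
loss = cross_entropy([0, 1], [[0.5, 0.5], [0.5, 0.5]])
print(alpha, loss)
```

Because each target $y_t$ is one-hot, the inner sum over the vocabulary in $\mathcal{L}_{CE}$ reduces to the log-probability of the correct character, which is what `cross_entropy` computes.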

4. Avatar-Based Animation Generation

The rendering engine uses the JASigning platform to translate SiGML into avatar motion, mapping HamNoSys features as follows:

  • Handshapes: Symbol-to-joint angle presets
  • Orientation/Location: Relative palm/finger orientation mapped to avatar body coordinates
  • Movement: Bézier-style interpolation of hand trajectories
  • Non-manual features: Head and facial animations parsed from <hamnosys_nonmanual> tags
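To illustrate what Bézier-style interpolation of a hand trajectory involves, the sketch below samples a cubic Bézier curve at a fixed frame rate. The cubic order, the control points, the coordinate frame, and the function names are all assumptions for illustration; JASigning's internal interpolation is not described in the source.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    return tuple(
        u**3 * a + 3 * u * u * t * b + 3 * u * t * t * c + t**3 * d
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def sample_trajectory(p0, p1, p2, p3, duration_s, fps=30):
    """Return one hand position per frame, endpoints included."""
    steps = max(1, round(duration_s * fps))
    return [cubic_bezier(p0, p1, p2, p3, i / steps) for i in range(steps + 1)]


# Hypothetical 3D control points (metres, avatar-relative) for a
# half-second movement of one hand.
start = (0.0, 1.0, 0.3)
ctrl_a = (0.05, 1.1, 0.35)
ctrl_b = (0.15, 1.15, 0.35)
end = (0.2, 1.1, 0.3)

frames = sample_trajectory(start, ctrl_a, ctrl_b, end, duration_s=0.5)
print(len(frames), frames[0], frames[-1])
```

The curve passes through the first and last control points, so consecutive movements can be chained by reusing one movement's end position as the next one's start.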

This pipeline supports real-time signing at ≈30 fps, enabling dynamic generation without recourse to pre-recorded video data (Islam et al., 21 Nov 2025).
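The stages that feed the renderer (tokenization, lemmatization, and SiGML retrieval) can be sketched end to end as follows. This is a simplified stand-in: a lookup table replaces the Seq2Seq lemmatizer, and the function names, punctuation set, sample words, lemma forms, and file paths are hypothetical.

```python
PUNCTUATION = "।,;:?!"  # includes the Bangla danda


def tokenize(sentence):
    """Split on whitespace and strip surrounding punctuation."""
    tokens = (tok.strip(PUNCTUATION) for tok in sentence.split())
    return [tok for tok in tokens if tok]


def lemmatize(token, lemma_table):
    """Stand-in for the neural lemmatizer: table lookup, else unchanged."""
    return lemma_table.get(token, token)


def text_to_sigml_sequence(sentence, lemma_table, lexicon):
    """Return (token, lemma, sigml_path or None) for each token."""
    plan = []
    for token in tokenize(sentence):
        lemma = lemmatize(token, lemma_table)
        plan.append((token, lemma, lexicon.get(lemma)))
    return plan


# Hypothetical data: inflected form -> lemma, and lemma -> SiGML file.
lemma_table = {"বইগুলো": "বই", "দেখি": "দেখা"}
lexicon = {"আমি": "sigml/ami.sigml", "বই": "sigml/boi.sigml"}

plan = text_to_sigml_sequence("আমি বইগুলো দেখি।", lemma_table, lexicon)
for token, lemma, path in plan:
    print(token, lemma, path)
```

A `None` path marks a lemma that is missing from the dictionary, which is the incomplete-coverage case that a static lexicon cannot avoid.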

5. Evaluation Protocol and Quantitative Results

Evaluation was performed via a publicly accessible web interface, partitioned into alphabets, digits, word categories, and sentences. Three evaluators (two professional sign interpreters and one hearing-impaired athlete) rated each animation on a four-point scale (“Bad”=1, “Average”=2, “Good”=3, “Excellent”=4), yielding 3,828 ratings overall.

Distribution of ratings:

| Rating    | Count | Percentage |
|-----------|-------|------------|
| Bad       | 116   | 3.03%      |
| Average   | 294   | 7.68%      |
| Good      | 2,346 | 61.29%     |
| Excellent | 1,072 | 28.00%     |

The aggregate mean score is $\bar{x} = 3.14$, with a variance of ≈0.4774, standard deviation ≈0.691, and a 95% confidence interval of $3.14 \pm 0.02$. Digits scored ≈3.6–4.0, while full-sentence signing scored ≈3.06–3.32, a gap attributed to lemmatizer errors (≈20% inflection mistransformations) and incomplete lemma coverage (Islam et al., 21 Nov 2025).
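The percentages, mean, and interval width can be reproduced from the rating counts with a short calculation. The sketch assumes a normal-approximation interval with z = 1.96, which is a guess at how the reported interval was obtained, and it takes the reported standard deviation as an input rather than recomputing it.

```python
import math

# Rating counts from the distribution table (score -> count).
counts = {1: 116, 2: 294, 3: 2346, 4: 1072}

n = sum(counts.values())
mean = sum(score * count for score, count in counts.items()) / n
shares = {score: 100 * count / n for score, count in counts.items()}

# Assumption: 95% interval via the normal approximation, using the
# standard deviation as reported in the text.
reported_sd = 0.691
half_width = 1.96 * reported_sd / math.sqrt(n)

print(n, round(mean, 2), round(half_width, 2))  # 3828 3.14 0.02
```

With 3,828 ratings the standard error is small, which is why the interval around the mean is only about ±0.02 on a four-point scale.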

6. Applications, Limitations, and Future Directions

IsharaKotha supports applications including e-learning for the hearing-impaired, real-time smartphone/web text-to-sign translation, and sign language production or annotation for NLP research. Notable limitations include an estimated 1% omission rate for semantic units (chiefly directional/facial signs), reliance on a static dictionary with incomplete full-sentence coverage, and simplified avatar facial expression modeling. Extensions under development involve expanding the sign inventory (>10,000 entries), improving morphological analysis (target >90% lemmatizer accuracy), refining avatar blendshapes and eye gaze, grammar-based reordering for smoother multiword signing, and computer vision–assisted semi-automated HamNoSys transcription from usable video corpora (Islam et al., 21 Nov 2025).

By providing a rigorously annotated, extensible, and openly accessible Bangla Sign Language resource based on HamNoSys and SiGML standards, IsharaKotha establishes the foundation for scalable, dynamic text-to-sign translation systems and advances the state of computational sign linguistics for Bangla.
