
HamNoSys-based Bangla Sign Language Corpus

Updated 28 November 2025
  • The resource presents the first extensive, machine-readable Bangla sign language corpus utilizing HamNoSys for fine-grained articulatory encoding.
  • IsharaKotha integrates handcrafted annotations, a deep neural lemmatizer, and a 3D avatar pipeline to facilitate Bangla-to-sign conversion.
  • The corpus, with 3,823 sign entries, achieved a mean quality score of 3.14/4 from expert evaluations, highlighting its promise for accessibility and translation applications.

The HamNoSys-based Bangla Sign Language Corpus, designated as IsharaKotha, constitutes the first extensive, machine-readable resource for Bangla Sign Language (BDSL) encoded in the Hamburg Sign Language Notation System (HamNoSys). This corpus facilitates fine-grained, articulatory-level descriptions of BDSL signs and is intended to support avatar-based sign rendering, NLP research, and downstream applications in translation, accessibility, and education. IsharaKotha integrates hand-crafted HamNoSys annotations for thousands of lexical items, a deep neural lemmatizer to support sentence-level generation, and an evaluation interface, thereby laying the groundwork for the development of dynamic sign language technologies for the Bangladeshi context (Islam et al., 21 Nov 2025).

1. Motivation and Conceptual Foundations

BDSL has historically lacked a standardized, extensible digital resource suitable for computational applications and avatar-based generation. IsharaKotha addresses this gap by encoding BDSL entries in HamNoSys, a language-independent phonetic transcription system optimized for representing the five core phonological parameters of sign languages: handshape, orientation, location, movement, and non-manual features. HamNoSys features a compact symbol set (approximately 200 characters) and seamless conversion tools for Sign Language Markup Language (SiGML), enabling precise, avatar-consumable representations.

Adopting HamNoSys enables:

  • High-fidelity encoding for each BDSL sign at the articulatory level for downstream animation,
  • Interoperability with other sign language corpora and computational resources,
  • Support for parsing and synthesis tasks required in sign language processing pipelines.

The overarching objectives encompass bridging the communication divide between hearing and hearing-impaired users, facilitating automatic translation, powering learning tools, and supporting linguistic research in BDSL (Islam et al., 21 Nov 2025).

2. Corpus Composition and Lexical Statistics

IsharaKotha comprises 3,823 unique sign entries, systematically annotated and categorized. The corpus includes lexical entries for 49 Bangla alphabet letters, 10 numerals, and 36 thematic categories of root words. These categories span domains such as “Crime & Law” (38 signs), “Nature & Environment” (133 signs), “Household Items” (342 signs), “Human Characteristics” (470 signs), and an “Others” category (776 signs). The breakdown of select categories is summarized below:

| Category | Sign Count |
|---|---|
| Crime & Law | 38 |
| Economics | 35 |
| Alphabets | 49 |
| Numbers | 10 |
| Human Characteristics | 470 |
| Others | 776 |
| Total (all categories) | 3,823 |

While fundamentally a word-level resource, IsharaKotha extends toward sentences using a deep learning-based lemmatizer that maps inflected word forms to lexical roots, allowing sign sequence generation for sentences. Currently, the corpus does not incorporate explicit part-of-speech tagging, morphological paradigms, or syntactic dependencies beyond lemma mappings, though these are planned as future extensions (Islam et al., 21 Nov 2025).

3. HamNoSys Annotation Methodology

Every sign in IsharaKotha is encoded as a HamNoSys string capturing four dimensions: handshape ($H$), orientation ($O$), location ($L$), and movement ($M$). Each sign is thus represented as a tuple $(h, o, \ell, m) \in H \times O \times L \times M$, where:

  • $H$: handshape symbols (e.g., $\texttt{A}, \texttt{B}, \texttt{m2}, 1.1$)
  • $O$: orientation symbols (e.g., $\uparrow, \rightarrow$)
  • $L$: anatomical locations (e.g., forehead, chest, nose tip)
  • $M$: movement descriptors (e.g., straight_up, circleCW, touch, none)

Examples:

  • The gloss “boi” (“book”):
    • HamNoSys: $\langle \textsf{sym}(\textsf{both}),\, H=\texttt{m2},\, O=\uparrow,\, L=\textsf{chest},\, M=\textsf{none} \rangle$
    • SiGML:

      <hamnosys_manual>
        <handconfig sym="both">m2</handconfig>
        <orientation palm="inward" dir="up"/>
        <position body="chest"/>
      </hamnosys_manual>
  • The gloss “ma” (“mother”):
    • HamNoSys: $\langle \textsf{handshape}=1.1,\, \textsf{orientation}=\uparrow,\, \textsf{location}=\text{nose tip},\, \textsf{movement}=\text{touch} \rangle$

This granular encoding supports consistent SiGML serialization and subsequent 3D avatar animation via JASigning. Non-manual features are either specified manually or suppressed, as needed (Islam et al., 21 Nov 2025).
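The tuple representation and its serialization can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the `SignEntry` class, the `hands` field, and the `movement` element are assumptions introduced here, and the remaining element names are copied from the “boi” example above rather than from a validated SiGML schema (the `palm` attribute is omitted for brevity).

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class SignEntry:
    """One entry as a (handshape, orientation, location, movement) tuple."""
    gloss: str
    handshape: str
    orientation: str
    location: str
    movement: str = "none"
    hands: str = "one"  # hypothetical field; "both" marks a two-handed symmetric sign


def to_sigml_like(entry: SignEntry) -> str:
    """Serialize an entry with the element names used in the "boi" example.

    Illustrative only: the output is not checked against a real SiGML schema.
    """
    root = ET.Element("hamnosys_manual")
    hand = ET.SubElement(root, "handconfig", {"sym": entry.hands})
    hand.text = entry.handshape
    ET.SubElement(root, "orientation", {"dir": entry.orientation})
    ET.SubElement(root, "position", {"body": entry.location})
    if entry.movement != "none":
        # Assumed element name; the source example shows no movement element.
        ET.SubElement(root, "movement", {"type": entry.movement})
    return ET.tostring(root, encoding="unicode")


boi = SignEntry("boi", handshape="m2", orientation="up", location="chest", hands="both")
ma = SignEntry("ma", handshape="1.1", orientation="up", location="nose tip", movement="touch")

print(to_sigml_like(boi))
print(to_sigml_like(ma))
```

Keeping each entry as an immutable record makes it straightforward to index the lexicon by gloss and to regenerate the XML whenever the serialization rules change.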

4. Integration of Neural Lemmatization

For sentence-level sign generation, IsharaKotha incorporates a character-level sequence-to-sequence (Seq2Seq) lemmatizer with attention. The architecture consists of a BiLSTM encoder over input character embeddings, an attention mechanism defined by:

$$\alpha_{t,i} = \frac{\exp(e_{t,i})}{\sum_j \exp(e_{t,j})}, \quad e_{t,i} = v^\top \tanh(W_d h_{t-1} + W_e h_i)$$

and a unidirectional LSTM decoder producing lemma characters $y_t$ conditioned on context $c_t = \sum_i \alpha_{t,i} h_i$. The loss function is cross-entropy:

$$\mathcal{L} = -\sum_{t=1}^{T} \log P(y_t \mid y_{<t}, x)$$

The lemmatizer was trained on a corpus of 94,781 Bangla word–lemma pairs (80/10/10 split for train/validation/test) and achieved a test-set accuracy of 79.22%. This component allows mapping of inflected tokens in Bangla sentences to root forms indexed in IsharaKotha, an essential step for accurate multi-word sign generation (Islam et al., 21 Nov 2025).
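The attention and loss formulas above can be traced numerically with a small pure-Python sketch. It is not the trained model: the weights `W_d`, `W_e`, `v`, the decoder state `h_prev`, and the encoder states are toy values chosen here for illustration, and the helper names are hypothetical.

```python
import math


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def additive_scores(v, W_d, W_e, h_prev, enc_states):
    """e_{t,i} = v^T tanh(W_d h_{t-1} + W_e h_i) for every encoder state h_i."""
    d_part = matvec(W_d, h_prev)
    scores = []
    for h_i in enc_states:
        e_part = matvec(W_e, h_i)
        hidden = [math.tanh(a + b) for a, b in zip(d_part, e_part)]
        scores.append(sum(vk * hk for vk, hk in zip(v, hidden)))
    return scores


def softmax(scores):
    """alpha_{t,i} = exp(e_{t,i}) / sum_j exp(e_{t,j}), shifted for stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context(alphas, enc_states):
    """c_t = sum_i alpha_{t,i} h_i."""
    dim = len(enc_states[0])
    return [sum(a * h[k] for a, h in zip(alphas, enc_states)) for k in range(dim)]


def cross_entropy(gold_char_probs):
    """L = -sum_t log P(y_t | y_<t, x), given the probability of each gold character."""
    return -sum(math.log(p) for p in gold_char_probs)


# Toy values (assumptions): 2-dimensional states and identity weight matrices.
identity = [[1.0, 0.0], [0.0, 1.0]]
v = [1.0, 1.0]
h_prev = [0.0, 0.0]
enc_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

alphas = softmax(additive_scores(v, identity, identity, h_prev, enc_states))
c_t = context(alphas, enc_states)
print(alphas, c_t)
```

With these toy inputs the third encoder state receives the largest weight, because both of its components contribute to the score.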

5. Sign Generation and Avatar Animation Pipeline

The end-to-end process consists of:

  1. Input Processing: Tokenization of raw Bangla sentences.
  2. Lemmatization: Mapping each word to its lemma via the neural Seq2Seq model.
  3. SiGML Retrieval: Lookup of the corresponding SiGML file for each lemma.
  4. Avatar Rendering: Delivery of the SiGML sequence to JASigning for 3D avatar animation, with optional specification of non-manual features.

This pipeline allows direct Bangla-to-sign transduction and naturalistic sign rendering for letters, digits, words, and sentences. Non-manual features are currently suppressed or manually specified when information is available. The workflow supports both educational and communication applications requiring automated sign generation for arbitrary Bangla text (Islam et al., 21 Nov 2025).
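The four steps above can be sketched as a lookup chain. The sketch below is a stand-in under stated assumptions: the lemma table replaces the neural lemmatizer, the SiGML strings are placeholders rather than real files, the romanized tokens are invented for readability, and every function name is hypothetical. Delivery to JASigning is not shown.

```python
import re

# Hypothetical stand-ins: the described system uses a neural lemmatizer
# and per-lemma SiGML files; these toy tables only illustrate the data flow.
TOY_LEMMAS = {"boigulo": "boi", "mayer": "ma"}
TOY_SIGML = {"boi": "<sigml-for-boi/>", "ma": "<sigml-for-ma/>"}


def tokenize(sentence):
    """Split on whitespace and common punctuation, including the Bangla danda."""
    return [t for t in re.split(r"[\s,.!?।]+", sentence.strip()) if t]


def lemmatize(token, lemma_table=TOY_LEMMAS):
    """Map an inflected token to its root; unknown tokens pass through unchanged."""
    return lemma_table.get(token, token)


def sentence_to_sigml(sentence, lexicon=TOY_SIGML):
    """Return (SiGML sequence in sentence order, lemmas with no corpus entry)."""
    found, missing = [], []
    for token in tokenize(sentence):
        lemma = lemmatize(token)
        if lemma in lexicon:
            found.append(lexicon[lemma])
        else:
            missing.append(lemma)
    return found, missing


sigml_sequence, unknown = sentence_to_sigml("mayer boigulo kothay?")
print(sigml_sequence, unknown)
```

Returning the unmatched lemmas separately makes vocabulary gaps visible to the caller, which matters because a sign sequence with silently dropped words could change the meaning of the sentence.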

6. Evaluation Protocol and Empirical Findings

A dedicated web-based evaluation system was constructed with distinct sections for Alphabets, Numbers, Words (across 28 subcategories), and Sentences. Three expert raters (two professional interpreters and a native BDSL user) provided 3,828 ratings on randomly sampled items, each scored on a 4-point categorical scale:

  • 1 = Bad
  • 2 = Average
  • 3 = Good
  • 4 = Excellent

The mean score per evaluator was:

  • Interpreter 1: 3.32
  • Interpreter 2: 3.17
  • Sign-language user: 3.13

Overall mean score: 3.14/4.00, which falls between “Good” (3) and “Excellent” (4). No formal significance tests were performed. Rating standard deviations remained near 0.5, which indicates consistent scoring across evaluated content types and is suggestive of, though not a formal measure of, inter-rater agreement (Islam et al., 21 Nov 2025).
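As an illustration of how such summary figures can be computed, the sketch below derives per-rater means, standard deviations, and a pooled mean from a rating table. The ratings are invented toy data, not the study's, and the source does not state how its overall mean was aggregated; pooling all ratings is an assumption made here.

```python
from statistics import mean, pstdev

LABELS = {1: "Bad", 2: "Average", 3: "Good", 4: "Excellent"}


def summarize(ratings_by_rater):
    """Return per-rater (mean, std dev, count) and the mean over all pooled ratings."""
    per_rater = {
        name: (mean(scores), pstdev(scores), len(scores))
        for name, scores in ratings_by_rater.items()
    }
    pooled = [s for scores in ratings_by_rater.values() for s in scores]
    return per_rater, mean(pooled)


# Toy ratings on the 4-point scale (assumed values, unequal counts per rater).
toy_ratings = {
    "interpreter_1": [4, 3, 3, 4],
    "interpreter_2": [3, 3, 4],
    "sign_language_user": [3, 3, 3, 4, 2],
}

per_rater, overall = summarize(toy_ratings)
for name, (avg, sd, n) in per_rater.items():
    print(f"{name}: mean={avg:.2f} sd={sd:.2f} n={n}")
print(f"overall: {overall:.2f}")
```

A pooled mean weights each rater by the number of items they scored, so it generally differs from the simple average of the per-rater means whenever the raters scored different numbers of items.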

7. Applications, Limitations, and Prospects for Expansion

IsharaKotha enables:

  • Automatic Bangla-to-BDSL translation for broadcast, web, and educational platforms,
  • Interactive tools for learning BDSL vocabulary and structure,
  • Accessibility solutions for public information or e-government services.

Planned extensions include expanding to broader vocabulary coverage, dynamic generation in cases of missing root signs, enhanced annotation for non-manual signals (facial expression, head movement), enriched formal linguistic annotations (morphosyntactic, dependency structures), and speech-to-avatar mappings via integration of speech recognition technologies.

Current limitations involve lack of robust treatment for low-visibility articulations (e.g., spine-based signs), scenario-conditioned facial display, and coarticulation phenomena. Addressing these challenges will require further annotation and refinement of avatar control APIs (Islam et al., 21 Nov 2025).


For access to the evaluation platform and additional details, see: http://bdsl-isharakotha.ap-1.evennode.com (Islam et al., 21 Nov 2025).
