Generalization differences between BLOOM and BLOOM-1B7 on morpho-syntactic probing, especially for under-resourced languages
Investigate whether the 176B-parameter BLOOM model or the 1.7B-parameter BLOOM-1B7 model generalizes morpho-syntactic features better across under-resourced languages. BLOOM-1B7 leads on average in morpho-syntactic feature classification, but its stronger correlations with pretraining dataset size may indicate weaker generalization to under-resourced settings.
References
The following questions remain for further research: 1. Generalizing abilities. BLOOM-1B7 leads in average morphosyntactic feature classification performance for the languages in~\autoref{tab:bloom:probing}. BLOOM's results are lower, which can be interpreted as worse grammatical generalization over these languages. However, BLOOM-1B7's probing results correlate more strongly with factors such as pretraining dataset size, which potentially makes it less able to generalize to under-resourced languages than the larger model.
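The comparison described above rests on how strongly each model's per-language probing score tracks the amount of pretraining data for that language. A minimal sketch of that check follows, assuming Pearson correlation against log-scaled dataset size. The excerpt does not say which correlation measure was used, and every language label, model label, and number below is a hypothetical placeholder, not a reported result.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# HYPOTHETICAL placeholder data: pretraining size per language (arbitrary units).
pretraining_size = {"lang_a": 1e6, "lang_b": 1e7, "lang_c": 1e8, "lang_d": 1e9}

# HYPOTHETICAL placeholder probing scores per model and language.
probing_score = {
    "model_small": {"lang_a": 0.50, "lang_b": 0.60, "lang_c": 0.70, "lang_d": 0.80},
    "model_large": {"lang_a": 0.62, "lang_b": 0.60, "lang_c": 0.66, "lang_d": 0.64},
}


def size_correlation(scores, sizes):
    """Correlate probing scores with log10 of pretraining size over shared languages."""
    langs = sorted(set(scores) & set(sizes))
    log_sizes = [math.log10(sizes[lang]) for lang in langs]
    return pearson(log_sizes, [scores[lang] for lang in langs])


correlations = {
    model: size_correlation(scores, pretraining_size)
    for model, scores in probing_score.items()
}

for model, r in correlations.items():
    print(f"{model}: r = {r:.2f}")
```

In this toy setup the smaller model has the higher mean score on the languages with more data, but its scores rise in lockstep with dataset size, while the larger model's scores are flatter across languages. That pattern is the kind of evidence the excerpt points to when it suggests the smaller model may generalize less well to under-resourced languages.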