Multilingual Transformer Mapping
Supervisor: Dr. Isabel Papadimitriou · UBC · Jan 2025 – Present
This project investigates how multilingual transformer models represent and align linguistic structure across languages. The central question is whether the cross-lingual generalisation observed in models like mBERT and XLM-R arises from learned language-neutral representations, or from implicit mapping functions between language-specific subspaces — and what this implies for transfer learning across typologically diverse languages.
Background — Key Concepts
Multilingual Transformers · Representation Alignment · Linguistic Structure in Embeddings
Core Research Questions
Representation geometry
Are cross-lingual representations truly language-neutral, or do they retain language-specific structure that is implicitly mapped?

Layer specialisation
Which layers of a multilingual transformer encode syntactic vs. semantic information, and does this vary across languages?

Mapping & transfer
Can we learn explicit linear mappings between language subspaces and use them to improve zero-shot transfer?

Typological diversity
How do typological features (morphology, word order) affect the quality of cross-lingual alignment?

Research Pipeline
Model Selection
Select multilingual transformer checkpoints (mBERT, XLM-R) across multiple sizes and training regimes for controlled comparison.
Tools: HuggingFace Transformers
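A minimal loading sketch with HuggingFace Transformers; the checkpoint list is illustrative of the mBERT/XLM-R grid rather than the study's final selection, and `output_hidden_states=True` is set so later steps can read every encoder layer, not just the last:

```python
from transformers import AutoModel, AutoTokenizer

# Illustrative checkpoint list; the actual model grid may differ.
CHECKPOINTS = [
    "bert-base-multilingual-cased",  # mBERT
    "xlm-roberta-base",              # XLM-R base
    "xlm-roberta-large",             # XLM-R large
]

def load_checkpoint(name: str):
    """Load a multilingual encoder and its tokenizer from the Hub."""
    tokenizer = AutoTokenizer.from_pretrained(name)
    # Expose all hidden states so every layer can be analysed downstream.
    model = AutoModel.from_pretrained(name, output_hidden_states=True)
    model.eval()  # inference only: disable dropout
    return tokenizer, model

if __name__ == "__main__":
    for name in CHECKPOINTS:
        _, model = load_checkpoint(name)
        print(f"{name}: {model.config.num_hidden_layers} layers, "
              f"hidden size {model.config.hidden_size}")
```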
Representation Extraction
Extract hidden-state representations from each encoder layer for parallel sentence pairs across target language pairs.
Tools: Python · PyTorch · NumPy
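A sketch of the extraction step, assuming mean pooling over non-padding tokens to turn each layer's token states into a single sentence vector; the English–German pair is a toy stand-in for a real parallel corpus:

```python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL = "xlm-roberta-base"  # one checkpoint from the selection step
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL, output_hidden_states=True).eval()

@torch.no_grad()
def layer_representations(sentences):
    """Mean-pooled sentence vectors for every encoder layer.

    Returns a tensor of shape (num_layers + 1, num_sentences, hidden_size);
    index 0 is the embedding layer.
    """
    batch = tokenizer(sentences, padding=True, truncation=True,
                      return_tensors="pt")
    out = model(**batch)
    mask = batch["attention_mask"].unsqueeze(-1)  # zero out padding tokens
    pooled = [(h * mask).sum(1) / mask.sum(1) for h in out.hidden_states]
    return torch.stack(pooled)

# Toy parallel sentences (English / German).
en = layer_representations(["The cat sleeps.", "I read the book."])
de = layer_representations(["Die Katze schläft.", "Ich lese das Buch."])
print(en.shape)  # (13, 2, 768) for a 12-layer base model
```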
Alignment Analysis
Apply CKA, mutual nearest-neighbour retrieval, and learned linear maps to quantify how well language representations align at each layer.
Tools: SciPy · sklearn
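The first two metrics are compact enough to sketch directly in NumPy, operating on the per-layer (sentences × dim) matrices from the previous step; `linear_cka` is the standard linear-CKA formula, and `mutual_nn_accuracy` counts parallel pairs that retrieve each other under cosine similarity:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two representation matrices (n_samples x dim)."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    cross = np.linalg.norm(X.T @ Y, "fro") ** 2
    norm = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return cross / norm

def mutual_nn_accuracy(X, Y):
    """Fraction of aligned pairs that are mutual nearest neighbours
    under cosine similarity (rows of X and Y are parallel sentences)."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    sim = Xn @ Yn.T
    fwd = sim.argmax(axis=1)  # nearest target for each source sentence
    bwd = sim.argmax(axis=0)  # nearest source for each target sentence
    idx = np.arange(len(X))
    return np.mean((fwd == idx) & (bwd == idx))
```

Running both metrics layer by layer yields an alignment profile across network depth for each language pair.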
Probing Experiments
Train lightweight probing classifiers on frozen representations to determine which linguistic features (POS, dependency relations, morphology) are encoded, and at which layers.
Tools: Probing suite · multilingual UD treebanks
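A minimal probing sketch with scikit-learn: a linear classifier is fit on frozen representations, and held-out accuracy is read as evidence that the layer encodes the feature. The random features and labels below are loud placeholders for layer-wise token vectors and UD part-of-speech tags:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def probe_accuracy(features, labels):
    """Fit a linear probe on frozen representations; return held-out accuracy."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, labels, test_size=0.2, random_state=0)
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return probe.score(X_te, y_te)

# PLACEHOLDER data: in the real pipeline, `features` are layer-l token
# vectors from the extraction step and `labels` come from UD treebanks.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 768))
labels = rng.integers(0, 17, size=500)  # 17 universal POS tags in UD
print(f"probe accuracy: {probe_accuracy(features, labels):.3f}")
```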
Mapping & Transfer
Learn explicit mapping functions between language-specific subspaces and evaluate zero-shot cross-lingual transfer on downstream classification tasks.
Benchmarks: XNLI · MLQA · XTREME
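One standard choice of explicit map is orthogonal Procrustes, which has a closed-form SVD solution and is a common baseline for aligning language subspaces; the sketch below recovers a synthetic rotation as a sanity check, with the random matrices standing in for paired sentence representations from two languages:

```python
import numpy as np

def procrustes_map(X_src, Y_tgt):
    """Orthogonal matrix W minimising ||X_src @ W - Y_tgt||_F,
    given row-aligned representation matrices for two languages."""
    U, _, Vt = np.linalg.svd(X_src.T @ Y_tgt)
    return U @ Vt

# Sanity check: recover a known rotation from noisy paired data.
rng = np.random.default_rng(0)
d = 64
X = rng.normal(size=(1000, d))
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))   # ground-truth rotation
Y = X @ Q + 0.01 * rng.normal(size=(1000, d))  # noisy "target language" view
W = procrustes_map(X, Y)
print("relative error:", np.linalg.norm(X @ W - Y) / np.linalg.norm(Y))
```

In the actual experiments, a map learned on parallel data would be applied to source-language representations before zero-shot evaluation on XNLI, MLQA, or XTREME tasks.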