Multilingual Transformer Mapping
Supervisor: Dr. Isabel Papadimitriou · UBC · Jan 2025 – Present
This project investigates how multilingual transformer models represent and align linguistic structure across languages. The central question is whether the cross-lingual generalisation observed in models like mBERT and XLM-R arises from learned language-neutral representations, or from implicit mapping functions between language-specific subspaces — and what this implies for transfer learning across typologically diverse languages.
Background — Key Concepts
Multilingual Transformers · Representation Alignment · Linguistic Structure in Embeddings
Core Research Questions
Representation geometry
Are cross-lingual representations truly language-neutral, or do they retain language-specific structure that is implicitly mapped?

Layer specialisation
Which layers of a multilingual transformer encode syntactic vs. semantic information, and does this vary across languages?

Mapping & transfer
Can we learn explicit linear mappings between language subspaces and use them to improve zero-shot transfer?

Typological diversity
How do typological features (morphology, word order) affect the quality of cross-lingual alignment?

Research Pipeline
Model Selection
Select multilingual transformer checkpoints (mBERT, XLM-R) across multiple sizes and training regimes for controlled comparison.
Tools: HuggingFace Transformers
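A minimal loading sketch with HuggingFace Transformers; the checkpoint list is illustrative of the mBERT/XLM-R grid rather than the study's final selection, and `output_hidden_states=True` is set so later steps can read every encoder layer, not just the last:

```python
from transformers import AutoModel, AutoTokenizer

# Illustrative checkpoint list; the actual model grid may differ.
CHECKPOINTS = [
    "bert-base-multilingual-cased",  # mBERT
    "xlm-roberta-base",              # XLM-R base
    "xlm-roberta-large",             # XLM-R large
]

def load_checkpoint(name: str):
    """Load a multilingual encoder and its tokenizer from the Hub."""
    tokenizer = AutoTokenizer.from_pretrained(name)
    # Expose all hidden states so every layer can be analysed downstream.
    model = AutoModel.from_pretrained(name, output_hidden_states=True)
    model.eval()  # inference only: disable dropout
    return tokenizer, model

if __name__ == "__main__":
    for name in CHECKPOINTS:
        _, model = load_checkpoint(name)
        print(f"{name}: {model.config.num_hidden_layers} layers, "
              f"hidden size {model.config.hidden_size}")
```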
Representation Extraction
Extract hidden-state representations from each encoder layer for parallel sentence pairs across target language pairs.
Tools: Python · PyTorch · NumPy
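A sketch of the extraction step, assuming mean pooling over non-padding tokens to turn each layer's token states into a single sentence vector; the English–German pair is a toy stand-in for a real parallel corpus:

```python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL = "xlm-roberta-base"  # one checkpoint from the selection step
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL, output_hidden_states=True).eval()

@torch.no_grad()
def layer_representations(sentences):
    """Mean-pooled sentence vectors for every encoder layer.

    Returns a tensor of shape (num_layers + 1, num_sentences, hidden_size);
    index 0 is the embedding layer.
    """
    batch = tokenizer(sentences, padding=True, truncation=True,
                      return_tensors="pt")
    out = model(**batch)
    mask = batch["attention_mask"].unsqueeze(-1)  # zero out padding tokens
    pooled = [(h * mask).sum(1) / mask.sum(1) for h in out.hidden_states]
    return torch.stack(pooled)

# Toy parallel sentences (English / German).
en = layer_representations(["The cat sleeps.", "I read the book."])
de = layer_representations(["Die Katze schläft.", "Ich lese das Buch."])
print(en.shape)  # (13, 2, 768) for a 12-layer base model
```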
Alignment Analysis
Apply CKA, mutual nearest-neighbour retrieval, and learned linear maps to quantify how well language representations align at each layer.
Tools: SciPy · sklearn
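The first two metrics are compact enough to sketch directly in NumPy, operating on the per-layer (sentences × dim) matrices from the previous step; `linear_cka` is the standard linear-CKA formula, and `mutual_nn_accuracy` counts parallel pairs that retrieve each other under cosine similarity:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two representation matrices (n_samples x dim)."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    cross = np.linalg.norm(X.T @ Y, "fro") ** 2
    norm = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return cross / norm

def mutual_nn_accuracy(X, Y):
    """Fraction of aligned pairs that are mutual nearest neighbours
    under cosine similarity (rows of X and Y are parallel sentences)."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    sim = Xn @ Yn.T
    fwd = sim.argmax(axis=1)  # nearest target for each source sentence
    bwd = sim.argmax(axis=0)  # nearest source for each target sentence
    idx = np.arange(len(X))
    return np.mean((fwd == idx) & (bwd == idx))
```

Running both metrics layer by layer yields an alignment profile across network depth for each language pair.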
Probing Experiments
Train lightweight probing classifiers on frozen representations to determine which linguistic features (POS, dependency relations, morphology) are encoded, and at which layers.
Tools: Probing suite · multilingual UD treebanks
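A minimal probing sketch with scikit-learn: a linear classifier is fit on frozen representations, and held-out accuracy is read as evidence that the layer encodes the feature. The random features and labels below are loud placeholders for layer-wise token vectors and UD part-of-speech tags:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def probe_accuracy(features, labels):
    """Fit a linear probe on frozen representations; return held-out accuracy."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, labels, test_size=0.2, random_state=0)
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return probe.score(X_te, y_te)

# PLACEHOLDER data: in the real pipeline, `features` are layer-l token
# vectors from the extraction step and `labels` come from UD treebanks.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 768))
labels = rng.integers(0, 17, size=500)  # 17 universal POS tags in UD
print(f"probe accuracy: {probe_accuracy(features, labels):.3f}")
```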
Mapping & Transfer
Learn explicit mapping functions between language-specific subspaces and evaluate zero-shot cross-lingual transfer on downstream classification tasks.
Benchmarks: XNLI · MLQA · XTREME
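One standard choice of explicit map is orthogonal Procrustes, which has a closed-form SVD solution and is a common baseline for aligning language subspaces; the sketch below recovers a synthetic rotation as a sanity check, with the random matrices standing in for paired sentence representations from two languages:

```python
import numpy as np

def procrustes_map(X_src, Y_tgt):
    """Orthogonal matrix W minimising ||X_src @ W - Y_tgt||_F,
    given row-aligned representation matrices for two languages."""
    U, _, Vt = np.linalg.svd(X_src.T @ Y_tgt)
    return U @ Vt

# Sanity check: recover a known rotation from noisy paired data.
rng = np.random.default_rng(0)
d = 64
X = rng.normal(size=(1000, d))
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))   # ground-truth rotation
Y = X @ Q + 0.01 * rng.normal(size=(1000, d))  # noisy "target language" view
W = procrustes_map(X, Y)
print("relative error:", np.linalg.norm(X @ W - Y) / np.linalg.norm(Y))
```

In the actual experiments, a map learned on parallel data would be applied to source-language representations before zero-shot evaluation on XNLI, MLQA, or XTREME tasks.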