PyGress: Python Proficiency Visualization

Updated 12 November 2025
  • PyGress is a web-based tool that automatically analyzes and visualizes code proficiency by mapping Python constructs to CEFR levels from A1 to C2.
  • It uses a full-stack pipeline with PyDriller, pycefr, and Plotly to extract commit histories, compute delta-scores, and generate interactive visualizations.
  • Empirical evaluations on OSS projects such as django-silk and pandas-profiling reveal proficiency trends and highlight opportunities for AST diffing optimizations.

PyGress is a web-based system for automatically analyzing and visualizing the progression of code proficiency in Python open-source software (OSS) projects. It employs the pycefr analyzer to evaluate Python source code constructs according to the Common European Framework of Reference (CEFR), quantifying developer proficiency from beginner (A1) to advanced (C2) levels. By submitting a GitHub repository, users obtain interactive, project- and contributor-specific visualizations of proficiency distributions and their evolution over time, enabling a nuanced assessment of expertise dynamics in collaborative Python OSS environments.

1. System Architecture and Workflow

PyGress is implemented as a full-stack, end-to-end pipeline comprising five principal phases: repository ingestion, commit-history extraction, code preprocessing and diffing, proficiency scoring, and interactive visualization. User interaction begins with submission of a GitHub repository URL through a Flask-based front end. The back end invokes PyDriller to traverse the Git commit history, extract commit metadata (author, timestamp, file changes), and, for each modified Python file, generate "before" and "after" code snapshots. These snapshot pairs are analyzed by pycefr, which parses the corresponding ASTs and classifies Python constructs according to CEFR levels (A1–C2).

The resulting counts are processed to compute per-commit delta-scores, aggregating new construct insertions per level and tracking them by contributor and time window. Aggregated scores are transformed into interactive charts via Plotly and presented in the browser.

The pipeline is represented as follows:

[User Browser]
     |
 → Flask Front-end
     |
 → Back-end Controller
     |
 → PyDriller (commit extraction)
     |
 → Code Diffing (“before”/“after”)
     |
 → pycefr Engine (proficiency scoring)
     |
 → Aggregator (time series, smoothing)
     |
 → Visualization Module (Plotly)
     ↓
[User Browser Interactive Charts]
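
The control flow above can be sketched as a plain-Python skeleton. All function names and the stub data below are illustrative stand-ins, not PyGress's actual internals:

```python
# Illustrative pipeline skeleton; bodies are stubs, not PyGress's real code.

LEVELS = ("A1", "A2", "B1", "B2", "C1", "C2")

def extract_commit_history(repo_path):
    # Stand-in: PyGress uses PyDriller here to walk the Git history.
    return [{"author": "alice", "date": "2024-01-01",
             "snapshots": [("x = 1", "x = 1\nfor i in range(3):\n    pass")]}]

def score_proficiency(before, after):
    # Stand-in: PyGress runs pycefr on both snapshots and subtracts counts.
    return {"A1": 1, "A2": 0, "B1": 0, "B2": 0, "C1": 0, "C2": 0}

def aggregate(results):
    # Sum per-level insertions across all commits.
    totals = {lvl: 0 for lvl in LEVELS}
    for r in results:
        for lvl, n in r["delta"].items():
            totals[lvl] += n
    return totals

def analyze_repository(repo_path):
    # Orchestrate: extraction -> diffing -> scoring -> aggregation.
    results = []
    for commit in extract_commit_history(repo_path):
        for before, after in commit["snapshots"]:
            results.append({"author": commit["author"],
                            "date": commit["date"],
                            "delta": score_proficiency(before, after)})
    return aggregate(results)
```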

2. CEFR-Based Proficiency Modeling

The core of PyGress's analytic capability is its CEFR-aligned proficiency model, implemented via pycefr. Python constructs are mapped to CEFR levels as follows:

| CEFR Level | Construct Examples |
|---|---|
| A1–A2 | Basic (`if`, `for`, nested lists) |
| B1–B2 | Intermediate (`break`, list comprehensions) |
| C1–C2 | Advanced (generators, metaclasses) |
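
A minimal sketch of this kind of mapping, using the standard-library `ast` module. The node-to-level table here is a drastically simplified hypothetical; pycefr's actual ruleset is far more detailed:

```python
import ast

# Hypothetical, simplified mapping of AST node types to CEFR levels.
NODE_LEVELS = {
    ast.If: "A1",
    ast.For: "A1",
    ast.Break: "B1",
    ast.ListComp: "B1",
    ast.GeneratorExp: "C1",
    ast.Yield: "C1",
}

def count_constructs(source):
    """Count occurrences of each mapped construct in a source string."""
    counts = {lvl: 0 for lvl in ("A1", "A2", "B1", "B2", "C1", "C2")}
    for node in ast.walk(ast.parse(source)):
        level = NODE_LEVELS.get(type(node))
        if level:
            counts[level] += 1
    return counts

code = "for x in range(3):\n    if x:\n        squares = [i*i for i in range(x)]\n"
print(count_constructs(code))  # → {'A1': 2, 'A2': 0, 'B1': 1, 'B2': 0, 'C1': 0, 'C2': 0}
```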

Given a file version $f$, let $n_\ell(f)$ denote the count of constructs at level $\ell \in \{A1, \dots, C2\}$. For each commit $c$ modifying file $f$, PyGress calculates:

$$\Delta n_\ell(c, f) = \max\bigl(n_\ell(f_{\mathrm{after}}) - n_\ell(f_{\mathrm{before}}),\, 0\bigr)$$

This expression captures only added constructs (with deletions set to zero). The per-commit proficiency vector is:

$$\mathbf{p}(c) = [\Delta n_{A1}(c), \Delta n_{A2}(c), \dots, \Delta n_{C2}(c)] \in \mathbb{N}^6$$

Normalized proportion vectors can be formed:

$$\hat{p}_\ell(c) = \frac{\Delta n_\ell(c)}{\sum_{\ell'} \Delta n_{\ell'}(c)}$$

and weighted scalar scores can be computed with arbitrary weights $w_\ell$:

$$\mathrm{Score}(c) = \sum_\ell w_\ell\, \Delta n_\ell(c)$$

The prototype uses $w_\ell = 1$ (equal weighting), but the design permits alternative weighting or nonlinear transformation schemes.
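
The three formulas above translate directly into code. A minimal sketch, taking per-level construct counts as plain dictionaries:

```python
LEVELS = ("A1", "A2", "B1", "B2", "C1", "C2")

def delta_counts(before, after):
    """Per-level added constructs: max(after - before, 0)."""
    return {lvl: max(after[lvl] - before[lvl], 0) for lvl in LEVELS}

def normalized(delta):
    """Proportion of additions at each level (zero vector if nothing added)."""
    total = sum(delta.values())
    return {lvl: delta[lvl] / total for lvl in LEVELS} if total else dict(delta)

def score(delta, weights=None):
    """Weighted scalar score; the prototype's default is equal weights."""
    weights = weights or {lvl: 1 for lvl in LEVELS}
    return sum(weights[lvl] * delta[lvl] for lvl in LEVELS)

before = {"A1": 4, "A2": 1, "B1": 0, "B2": 0, "C1": 0, "C2": 0}
after  = {"A1": 6, "A2": 1, "B1": 2, "B2": 0, "C1": 1, "C2": 0}
d = delta_counts(before, after)
print(score(d))  # → 5
```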

3. Commit History Processing and AST Diffing

Commit and file-level analysis is conducted by invoking PyDriller's RepositoryMining API, restricted to Python file modifications. For each commit and file, the tool acquires complete ASTs of the file before and after the change. The AST snapshots enter the pycefr engine, generating level-wise construct counts. The difference operation yields $\Delta n_\ell(c, f)$, representing newly introduced constructs only.

Example code used internally:

from pydriller import RepositoryMining  # PyDriller 1.x; renamed Repository in 2.x

miner = RepositoryMining(repo_path, only_modifications_with_file_types=['.py'])
for commit in miner.traverse_commits():
    for mod in commit.modifications:
        before_snapshot = mod.source_code_before  # None for newly added files
        after_snapshot = mod.source_code
        # Process both snapshots with pycefr to obtain level-wise counts

4. Aggregation, Time-Series Construction, and Trend Detection

Per-commit proficiency vectors $\mathbf{p}(c)$ are grouped by committer and timestamp. For a set $C_t$ of commits within time window $t$, the aggregate project-level vector is:

$$\mathbf{P}_t = \frac{1}{|C_t|} \sum_{c \in C_t} \mathbf{p}(c) \in \mathbb{R}^6$$

Contributor-specific time series $\{\mathbf{P}_t^u\}$ for each contributor $u$ are constructed similarly. Smoothing is applied using a moving average of window size $k$:

$$\tilde{\mathbf{P}}_t = \frac{1}{k} \sum_{i=0}^{k-1} \mathbf{P}_{t-i}$$

Trend detection may take the form of linear regression fits for each CEFR level:

$$\tilde{P}_{t,\ell} = a_\ell\, t + b_\ell$$

This enables identification of progression slopes and periods of increasing construct complexity, at the project level or for individual contributors.
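
The smoothing and regression steps can be sketched in pure Python for a single CEFR level's time series. Note one assumption: this sketch uses trailing windows (shortened at the start of the series), since the summary does not specify how PyGress handles the first $k-1$ windows:

```python
def moving_average(series, k):
    """Trailing moving average of window size k (shortened windows at the start)."""
    out = []
    for t in range(len(series)):
        window = series[max(0, t - k + 1): t + 1]
        out.append(sum(window) / len(window))
    return out

def trend_slope(series):
    """Least-squares slope a for y = a*t + b over t = 0..n-1."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

# Illustrative C1-level insertion counts over six time windows.
c1_per_window = [0.0, 1.0, 1.0, 2.0, 3.0, 5.0]
smoothed = moving_average(c1_per_window, k=3)
print(round(trend_slope(smoothed), 2))  # → 0.62 (positive slope: rising complexity)
```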

5. Visualization Strategies

The visualization module utilizes Plotly.js via Python bindings to offer several chart types:

  • Spider (radar) charts: Show aggregate or contributor-specific proficiency across all CEFR levels.
  • Time-slider graphs: Stacked area charts of smoothed proficiency vectors over time, with interactive controls for temporal navigation.
  • Planned heatmap view: Module-by-level matrices, colored by normalized construct counts, to capture per-module proficiency nuances.

Plotly event callbacks in the Flask front end enable interactivity such as hover, zoom, and filtering by contributor.

6. Implementation and Deployment

Backend components utilize Python 3.10, Flask, PyDriller, and pycefr. Frontend elements are implemented with Jinja2 templates, Bootstrap, and Plotly for interactive chart rendering. Dedicated scripts transform aggregated results to JSON compatible with Plotly. The repository is structured into /pygress_backend (data extraction, scoring), /pygress_frontend (Flask app, templates, static assets), and includes Docker/Docker Compose files for containerized deployment.
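
A sketch of such a transformation for the spider-chart view. The trace fields follow Plotly's documented `scatterpolar` schema; the exact shape of PyGress's own scripts is not given in this summary:

```python
import json

LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

def to_radar_trace(name, proficiency):
    """Shape an aggregated proficiency vector as a Plotly scatterpolar trace."""
    return {
        "type": "scatterpolar",
        "name": name,                                # contributor or project label
        "theta": LEVELS,                             # angular axis: CEFR levels
        "r": [proficiency[lvl] for lvl in LEVELS],   # radial axis: construct counts
        "fill": "toself",
    }

trace = to_radar_trace("alice", {"A1": 9, "A2": 4, "B1": 3, "B2": 2, "C1": 1, "C2": 0})
payload = json.dumps(trace)  # ready to hand to Plotly.js in the browser
```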

Setup options are:

git clone https://github.com/MUICT-SERU/PyGress.git
cd PyGress
docker-compose up --build

or, for virtual environment deployment:

python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
flask run

7. Empirical Evaluation, Observed Patterns, and Limitations

PyGress has been empirically evaluated on three Python OSS projects—django-silk, pandas-profiling, and pytest-ansible—spanning 2014–2024. The findings include:

  • Dominance of A1–A2 constructs across all projects, consistent with Python’s design emphasis on readability and straightforward syntax.
  • django-silk exhibited substantial C1–C2 usage in its initial years, indicative of early architectural decisions involving advanced Python features.
  • pandas-profiling showed a surge in advanced constructs correlating with major refactor efforts during 2020–2021.
  • pytest-ansible maintained low C1–C2 frequencies over its history, reflecting preferences for maintainable simplicity.

Highly proficient contributors were identified via annual sums of C1 + C2 insertions, often corresponding to key project maintainers.

Performance benchmarking on repositories of 10,000 commits indicated a processing time of approximately 3 seconds per commit, with AST analysis as the primary computational bottleneck.

Current limitations include pycefr’s necessity of re-parsing entire files due to diffing granularity, limited per-module granularity (pending heatmap implementation), and lack of independent, human-rated proficiency ground truth. Future work aims to optimize diff-level AST extraction, extend CEFR-based classification to languages beyond Python (e.g., JavaScript via jscefr), incorporate advanced trend and anomaly detection, and validate insights through user studies with OSS maintainers.

PyGress is openly available at https://github.com/MUICT-SERU/PyGress, with a demonstration video at https://youtu.be/hxoeK-ggcWk.
