Research — Abdul Basit Tonmoy

01 : Interests

Representation geometry

How embedding spaces deform under fine-tuning, quantization, and LoRA adaptation — and whether unlabeled, model-agnostic metrics can predict downstream degradation before it shows up in eval. This is the thread behind SemanticSentry.

Cost-aware multimodal inference

Practical ways to keep vision-language pipelines cheap: frame deduplication, hierarchical attention budgets, adaptive sampling that preserves signal at a fraction of the API cost. This is the thread behind AdaFrame and the AdLovin pipeline.

Systems for ML at scale

The plumbing — Rust OCR engines, TensorRT inference, structured downstream analytics — that lets research artifacts move from notebook to production. This is the thread behind my work at Skop Intelligence.

02 : Publications

2026 BlackboxNLP @ EMNLP under review

Geometric Drift Metrics are Insufficient: A Matched-Magnitude Dissociation Between Aligned and Anti-Aligned Fine-Tuning

Tonmoy, A. B., Deng, Q.

Geometric similarity metrics — Centered Kernel Alignment, neighborhood preservation, isotropy — are widely used to compare neural representations and infer functional similarity. We show this inference fails in two complementary directions. On E5-base-v2, matched-NPS conditions produce order-of-magnitude differences in retrieval damage depending on the fine-tuning objective; on BERT, near-trivial geometric drift coexists with a large, CI-disjoint functional gap between MLM and contrastive fine-tuning on the same corpus. Geometric drift magnitude does not predict the sign or scale of functional change without knowledge of the gradient–pretraining alignment.
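
For readers unfamiliar with neighborhood preservation, the sketch below shows one common way such a score can be computed: the mean fraction of each point's k nearest neighbors (by cosine similarity) that survive from the base model's embeddings to the fine-tuned model's embeddings. This is an illustrative assumption, not the paper's exact NPS definition; the function names, the choice of cosine similarity, and the parameter `k` are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_indices(embeddings, i, k):
    """Indices of the k points most similar to point i, excluding i itself."""
    others = [j for j in range(len(embeddings)) if j != i]
    others.sort(key=lambda j: cosine(embeddings[i], embeddings[j]), reverse=True)
    return set(others[:k])


def neighborhood_preservation(base, tuned, k):
    """Mean k-NN overlap between two embeddings of the same items.

    Assumed definition (not taken from the paper): for each item, the
    fraction of its k nearest neighbors under `base` that are also among
    its k nearest neighbors under `tuned`, averaged over all items.
    Returns a value in [0, 1]; 1.0 means every neighborhood is unchanged.
    """
    if len(base) != len(tuned):
        raise ValueError("base and tuned must embed the same items")
    n = len(base)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of items")
    total = 0.0
    for i in range(n):
        shared = knn_indices(base, i, k) & knn_indices(tuned, i, k)
        total += len(shared) / k
    return total / n
```

Calling `neighborhood_preservation(base, base, k)` returns 1.0 for any valid `k`. The abstract's point is that two fine-tuning runs can land on the same value of a score like this while differing sharply in retrieval quality, so the score alone says little about functional change.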

contributions
- A matched-magnitude dissociation on E5 between gradient-aligned and gradient-anti-aligned fine-tuning at fixed Neighborhood Preservation Score (NPS), tested under a pre-registered hypothesis and demonstrated across three seeds.
- A converse dissociation on BERT where the more-drifted condition is functionally better, opposite to what NPS would predict.
- A pre-registered corpus control ruling out distribution shift, plus structural controls (matched-Frobenius random rank-4, full-rank fine-tuning) ruling out low-rank confinement as the driver.
- A linear-probe methodology finding: up to 28.5 pp accuracy swings at identical embeddings under weak vs. cross-validated probe configurations — large enough to contaminate published fine-tuning evaluations.

2026 ACM Multimedia in review

AdaFrame: Hierarchical Multimodal Deduplication with Adaptive Information Budgeting for Cost-Efficient Video Advertisement Analysis

Tonmoy, A. B., Luthra, A., Deng, Q.

Modern video advertisement analysis pipelines spend most of their compute and API budget on near-duplicate frames. AdaFrame tackles this with a hierarchical multimodal deduplication scheme that allocates an adaptive information budget across frames, prioritizing visual-textual diversity over uniform sampling. On an advertising-video corpus the method achieves 70–90% frame deduplication and reduces downstream vision-API cost by roughly 70% while preserving structured signal extraction.
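
To make the basic idea of embedding-based frame deduplication concrete, here is a minimal single-level sketch: a frame is kept only if its embedding is not too similar to any frame already kept. This is a deliberately simplified stand-in, not the AdaFrame algorithm. It omits the hierarchy, the audio features, and the adaptive budget, and the function names and the `threshold` value are assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def dedup_frames(embeddings, threshold=0.95):
    """Greedy near-duplicate removal over per-frame embedding vectors.

    Walks the frames in order and keeps a frame only if its cosine
    similarity to every previously kept frame is below `threshold`
    (a hypothetical default, not a value from the paper).
    Returns the indices of the kept frames.
    """
    kept = []
    for idx, emb in enumerate(embeddings):
        if all(cosine(emb, embeddings[j]) < threshold for j in kept):
            kept.append(idx)
    return kept


def dedup_rate(num_frames, num_kept):
    """Fraction of frames dropped as near-duplicates."""
    return 1.0 - num_kept / num_frames if num_frames else 0.0
```

In a real pipeline the vectors would come from an image encoder, and only the frames at the kept indices would be sent to the downstream vision API, which is where the cost saving comes from.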

contributions
- Hierarchical deduplication combining CLIP image embeddings with audio-state features from HuBERT.
- An adaptive per-clip information budget driven by local visual-textual diversity.
- 70–90% frame deduplication with negligible quality loss on downstream extraction tasks.

2025 IEEE ICAIC

Advanced Facial Emotion Classification with 135 Classes for Enhanced Cybersecurity Applications

Powers, G., Jin, A., Tonmoy, A. B., Cao, H., Gohel, H., Deng, Q.

A fine-grained facial-emotion classifier scaling to 135 compound emotion classes, oriented toward cybersecurity applications where coarse 7-class emotion models miss subtle deception, stress, and intent signals. Presented at the IEEE International Conference on Artificial Intelligence in Cybersecurity (ICAIC) 2025.


Code for these papers lives at github.com/abtonmoy. For talks, drafts, or collaboration: atonmoy27@wabash.edu.
