cv
Academic Interests
-
Multi-Modal Representation Learning
- Agentic AI
- Out-of-Distribution Robustness
- Self-Supervised Learning
- Diffusion Models
-
Learning on Graphs
- Knowledge Graphs
- Graph Representation Learning
Education
-
2019 - present
Ph.D. in Computer Science
McGill University, Montreal, Canada
-
2014 - 2019
B.Sc. in Software Engineering
Sharif University of Technology, Tehran, Iran
Experience
-
2024 - present
Visiting Researcher
ServiceNow Research, Montreal, Canada
- Worked in the multimodal foundation model team, supervised by a Senior Machine Learning Scientist
- Proposed a multimodal benchmark for evaluating state-of-the-art large multimodal models
- Designed efficient evaluation pipelines for large foundation models
-
2023
Machine Learning Research Intern
Recursion, Montreal, Canada
- Worked in the Data Science group and was supervised by a Senior Machine Learning Scientist
- Designed models to learn gene representations for data-centric drug discovery
- Applied multi-modal models with self-supervised learning techniques to integrate the sequential and visual modalities of gene perturbations and learn better gene representations
-
2018
Research Assistant
University of Toronto, Toronto, Canada
- Worked in a group under the supervision of Prof. Konstantinos N. Plataniotis
- Project goal was to improve the robustness of convolutional neural networks (CNNs) against adversarial attacks
- Implemented in Python 3
-
2018
Software Engineering Intern
Moj Secure E-commerce Co., Tehran, Iran
- Interned in the AI team under the supervision of the CTO and co-founder of Moj Company
- Project goal was to parse an arbitrary image of any valid bank card and detect the digits in the 16-digit client card number
- Used a Single Shot Detector for object detection and a CNN for digit recognition
- Implemented in Python 3 using TensorFlow and Keras
Projects
-
2025
PairBench: Are Vision-Language Models Reliable at Comparing What They See?
- Developed a framework for evaluating Vision-Language Models (VLMs) as similarity kernels using four key metrics
- Demonstrated the importance of thorough assessment before adopting VLMs for evaluation tasks
-
2025
WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation
- Released WebMMU, a large-scale benchmark for evaluating multimodal and multilingual website understanding
- Benchmarked code generation and UI reasoning tasks across languages, highlighting limitations in current vision-language models
-
2025
AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding
- Proposed AlignVLM, a framework to align latent spaces between vision and language models for improved multimodal reasoning
- Demonstrated that latent space alignment enhances cross-modal transfer and understanding across diverse tasks
-
2025
Rendering-Aware Reinforcement Learning for Vector Graphics Generation
- Introduced a rendering-aware reinforcement learning method for generating high-quality vector graphics
- Showed improvements in fidelity and structural consistency compared to prior vector graphics generation techniques
-
2024
BigDocs-7.5M: A Large-Scale Dataset for Document Understanding
- Created an open-access dataset with 7.5 million multimodal documents for document understanding tasks
- Demonstrated that models trained on BigDocs-Bench improved document reasoning performance by up to 25.8% over GPT-4o
-
2024
Guided Positive Sampling for Self-Supervised Learning
- Designed a novel method, namely GPS-SSL, for integrating a priori knowledge into any self-supervised learning (SSL) model
- GPS-SSL performs positive sampling by approximating strong augmentations
- The method encourages the base SSL method to be more robust against untuned data augmentations when applied to under-studied and/or real-world datasets
-
2022
Revisiting Hotels-50K and Hotel-ID
- Revisited two image datasets, Hotels-50K and Hotel-ID, and proposed new training and evaluation splits with different levels of difficulty
- Proposed evaluation splits based on the images' class and super-class information to imitate real-world scenarios
- Paper accepted at the ICML 2022 DataPerf workshop
-
2020
Structure-Aware Negative Sampling in Knowledge Graphs
- Designed and implemented a novel, efficient negative sampling method for knowledge graphs with low computational cost
- The method considers the local neighborhood of each node when selecting negative samples
- Paper accepted at the EMNLP 2020 conference
- Implemented in Python 3 using PyTorch
-
2019
Scientific Paper Acceptance Prediction
- Designed and implemented a neural network model that predicts the acceptance of scientific papers with 84% accuracy, based on their abstracts and introductions
- Implemented in Python 3 using Keras and TensorFlow
-
2019
Salary Trend Analysis
- Gathered and aggregated employee data across occupation sectors in Ontario, Canada, from 1996 to 2018
- Studied salary trends and found similar trend patterns exclusive to a few occupation sectors
- Used the Jensen–Shannon divergence between salary trends to construct a salary-similarity network across employees in different sectors
- Built a multipartite graph from the employers and found meaningful employer embeddings
Honors and Awards
-
2022
- Received the Graduate Research Enhancement and Travel (GREAT) Award
-
2022
- Received the Fonds de recherche du Québec – Nature et technologies (FRQNT) doctoral scholarship
-
2021
- Ranked 3rd in the DataJam Against Exploitation Canada competition
-
2014
- Ranked 100th out of over 220,000 students in the National University Entrance Exam