cv

Academic Interests

  • Multi-Modal Representation Learning
    • Agentic AI
    • Out-of-Distribution Robustness
    • Self-Supervised Learning
    • Diffusion Models
  • Learning on Graphs
    • Knowledge Graphs
    • Graph Representation Learning

Education

  • 2019 - present
    Ph.D. in Computer Science
    McGill University, Montreal, Canada
  • 2014 - 2019
    B.Sc. in Software Engineering
    Sharif University of Technology, Tehran, Iran

Experience

  • 2024 - present
    Visiting Researcher
    ServiceNow Research, Montreal, Canada
    • Worked in the multimodal foundation model team under the supervision of a Senior Machine Learning Scientist
    • Proposed a multimodal benchmark for evaluating state-of-the-art large multimodal models
    • Designed efficient evaluation pipelines for large foundation models
  • 2023
    Machine Learning Research Intern
    Recursion, Montreal, Canada
    • Worked in the Data Science group under the supervision of a Senior Machine Learning Scientist
    • Designed models to learn gene representations for data-centric drug discovery
    • Applied multi-modal models with self-supervised learning techniques to integrate the sequential and visual modalities of gene perturbations and learn better gene representations
  • 2018
    Research Assistant
    University of Toronto, Toronto, Canada
    • Worked in a group under the supervision of Prof. Konstantinos N. Plataniotis
    • Project goal was to improve the robustness of convolutional neural networks (CNNs) against adversarial attacks
    • Implemented in Python 3
  • 2018
    Software Engineering Intern
    Moj Secure E-commerce Co., Tehran, Iran
    • Interned in the AI team under the supervision of the CTO and co-founder of Moj Company
    • Project goal was to parse an arbitrary image of any valid bank card and detect the digits in the 16-digit client card number
    • Used a Single Shot Detector for object detection and a CNN for digit recognition
    • Implemented in Python 3 using TensorFlow and Keras

Projects

  • 2025
    PairBench: Are Vision-Language Models Reliable at Comparing What They See?
    • Developed a framework for evaluating Vision-Language Models (VLMs) as similarity kernels using four key metrics
    • Demonstrated the importance of thorough assessment before adopting VLMs for evaluation tasks
  • 2025
    WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation
    • Released WebMMU, a large-scale benchmark for evaluating multimodal and multilingual website understanding
    • Benchmarked code generation and UI reasoning tasks across languages, highlighting limitations in current vision-language models
  • 2025
    AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding
    • Proposed AlignVLM, a framework to align latent spaces between vision and language models for improved multimodal reasoning
    • Demonstrated that latent space alignment enhances cross-modal transfer and understanding across diverse tasks
  • 2025
    Rendering-Aware Reinforcement Learning for Vector Graphics Generation
    • Introduced a rendering-aware reinforcement learning method for generating high-quality vector graphics
    • Showed improvements in fidelity and structural consistency compared to prior vector graphics generation techniques
  • 2024
    BigDocs-7.5M: A Large-Scale Dataset for Document Understanding
    • Created an open-access dataset with 7.5 million multimodal documents for document understanding tasks
    • Demonstrated that models trained on BigDocs-Bench improved document reasoning performance by up to 25.8% over GPT-4o
  • 2024
    Guided Positive Sampling for Self-Supervised Learning
    • Designed GPS-SSL, a novel method for integrating a priori knowledge into any self-supervised learning (SSL) model
    • GPS-SSL performs positive sampling by approximating strong augmentations
    • The method encourages the base SSL method to be more robust against untuned data augmentations when applied to under-studied and/or real-world datasets
  • 2022
    Revisiting Hotels-50K and Hotel-ID
    • Revisited two image datasets, Hotels-50K and Hotel-ID, and proposed new training and evaluation splits with different levels of difficulty
    • Proposed evaluation splits based on the images' class and super-class information to imitate real-world scenarios
    • Paper accepted at the ICML 2022 DataPerf workshop
  • 2020
    Structure-Aware Negative Sampling in Knowledge Graphs
    • Designed and implemented a novel negative sampling method with low computational cost for knowledge graphs
    • The method considers the local neighborhood of each node when selecting negative samples
    • Paper accepted at the EMNLP 2020 conference
    • Implemented in Python 3 using PyTorch
  • 2019
    Scientific Paper Acceptance Prediction
    • Designed and implemented a neural network model that predicts the acceptance of scientific papers with 84% accuracy, based on their abstracts and introductions
    • Implemented in Python 3 using Keras and TensorFlow
  • 2019
    Salary Trend Analysis
    • Gathered and aggregated employee data across occupation sectors in Ontario, Canada, from 1996 to 2018
    • Studied salary trends and found similar trend patterns exclusive to a few occupation sectors
    • Used the Jensen–Shannon divergence between salary trends to construct a salary-similarity network across employees in different sectors
    • Built a multipartite graph from the employers and found meaningful employer embeddings

Honors and Awards

  • 2022
    • Received the Graduate Research Enhancement and Travel (GREAT) Award
  • 2022
    • Received the Fonds de recherche du Québec – Nature et technologies (FRQNT) doctoral scholarship
  • 2021
    • Ranked 3rd in the DataJam Against Exploitation Canada competition
  • 2014
    • Ranked 100th out of over 220,000 students in the National University Entrance Exam