LoRAverse: A Submodular Framework to Retrieve Diverse Adapters for Diffusion Models

Mert Sonmezer¹, Matthew Zheng², Pinar Yanardag²

¹Middle East Technical University

²Virginia Tech


LoRAverse Overview. LoRAverse enhances image diversity by using a submodular algorithm to select a diverse and representative set of LoRA adapters. It first clusters the adapters by semantic meaning, then selects from these clusters the models that maximize diversity while staying aligned with the user-provided prompt, so the generated images are both varied and relevant.

Abstract

Low-rank Adaptation (LoRA) models have revolutionized the personalization of pre-trained diffusion models by enabling fine-tuning through low-rank, factorized weight matrices specifically optimized for attention layers. These models facilitate the generation of highly customized content across a variety of objects, individuals, and artistic styles without the need for extensive retraining. Despite the availability of over 100K LoRA adapters on platforms like Civit.ai, users often face challenges in navigating, selecting, and effectively utilizing the most suitable adapters due to their sheer volume, diversity, and lack of structured organization. This paper addresses the problem of selecting the most relevant and diverse LoRA models from this vast database by framing the task as a combinatorial optimization problem and proposing a novel submodular framework. Our quantitative and qualitative experiments demonstrate that our method generates diverse outputs across a wide range of domains.

Method

Architecture of LoRAverse showing concept extractor and submodular retriever

Architecture of LoRAverse. LoRAverse is composed of two main modules: a concept extractor and a submodular retriever. The concept extractor processes the user prompt to identify concepts (keywords). These concepts guide the submodular retriever, which selects a diverse and relevant subset of LoRA adapters by clustering similar adapters per concept and applying submodular optimization. Additionally, a safety-checking mechanism is integrated to filter out adapters containing offensive or inappropriate content.
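The retrieval step described above can be illustrated with a minimal sketch of greedy submodular selection. This is not the authors' implementation: the page does not state the exact objective, so the sketch assumes a facility-location coverage term plus a weighted prompt-relevance term, and it skips the clustering and safety-checking stages. The names `greedy_select`, `cosine`, and `relevance_weight`, and the toy 2-D embeddings, are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def greedy_select(adapter_vecs, prompt_vec, k, relevance_weight=1.0):
    """Greedily pick up to k adapter indices.

    Assumed objective (not taken from the paper):
        F(S) = sum_i max_{j in S} sim(i, j) + relevance_weight * sum_{j in S} rel(j)
    The first term is facility-location coverage (monotone submodular) and the
    second is modular, so the standard greedy algorithm applies.
    """
    n = len(adapter_vecs)
    # Clip negatives so every similarity is non-negative.
    sim = [[max(0.0, cosine(adapter_vecs[i], adapter_vecs[j])) for j in range(n)]
           for i in range(n)]
    rel = [max(0.0, cosine(v, prompt_vec)) for v in adapter_vecs]

    selected = []
    best_cover = [0.0] * n  # best similarity of each adapter to the selected set
    for _ in range(min(k, n)):
        best_gain, best_j = -1.0, None
        for j in range(n):
            if j in selected:
                continue
            # Marginal coverage gain of adding j, plus its relevance bonus.
            cover_gain = sum(max(sim[i][j] - best_cover[i], 0.0) for i in range(n))
            gain = cover_gain + relevance_weight * rel[j]
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        best_cover = [max(best_cover[i], sim[i][best_j]) for i in range(n)]
    return selected


# Toy embeddings: adapters 0 and 1 are near-duplicates, adapter 2 is different.
adapters = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
prompt = [1.0, 1.0]
print(greedy_select(adapters, prompt, k=2))
```

On the toy data the second pick is the dissimilar adapter rather than the near-duplicate, because the near-duplicate adds almost no coverage once its twin is selected. In this sketch, `relevance_weight` trades diversity against prompt alignment.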

Quantitative Results

Quantitative comparison showing diversity metrics

Quantitative Comparison (CFG=7). LoRAverse increases the diversity of image sets across various metrics while maintaining comparable text-image alignment. The user study reports which method's outputs participants preferred (US-P), and the average ratings of faithfulness (US-F) and diversity (US-D) of the outputs on a scale of 1 to 5.

Qualitative Results

Qualitative comparison showing diverse image generation

Qualitative Comparison. LoRAverse produces more diverse image sets than Stylus and SD v1.5.

BibTeX

@inproceedings{loraverse2025,
    title = {LoRAverse: A Submodular Framework to Retrieve Diverse Adapters for Diffusion Models},
    author = {Sonmezer, Mert and Zheng, Matthew and Yanardag, Pinar},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    year = {2025}
}
