AeSlides is a reinforcement learning framework designed to improve the aesthetic layout of slides generated by large language models (LLMs) by addressing the modality gap between text-centric generation and visual quality. It uses verifiable metrics to quantify slide layout quality and a GRPO-based reinforcement learning method to optimize models for aesthetically coherent layouts.
| Category | Detail |
|---|---|
| Released by | Not yet disclosed. |
| Release date | Not yet disclosed. |
| What it is | A reinforcement learning framework for aesthetic slide generation. |
| Who it is for | Developers and researchers working on LLM-based presentation tools. |
| Where to get it | https://github.com/ympan0508/aeslides |
| Price | Not yet disclosed. |
- AeSlides is a reinforcement learning framework for LLM-based slide generation that addresses the modality gap between text-centric generation and visual aesthetics.
- It introduces verifiable metrics that quantify slide layout quality accurately and efficiently.
- A GRPO-based reinforcement learning method directly optimizes models for aesthetically coherent layouts.
- AeSlides improves aspect ratio compliance to 85% and reduces whitespace, element collisions, and visual imbalance.
- Human evaluators rated AeSlides-generated slides substantially higher in overall quality than other methods.
What is AeSlides
AeSlides is a reinforcement learning framework that incentivizes aesthetic layout in large language model (LLM)-based slide generation [arXiv:2604.22840]. It aims to bridge the gap between text-centric generation and the visual quality requirements of slides [arXiv:2604.22840]. The framework uses verifiable rewards to provide explicit aesthetic supervision during the generation process [arXiv:2604.22840].
What is new vs the previous version
AeSlides introduces explicit aesthetic principles as supervision, which was previously unexplored [arXiv:2604.22840]. Existing solutions often rely on heavy visual reflection or large-scale dataset fine-tuning [arXiv:2604.22840]. These methods incur high inference costs or provide weak aesthetic supervision [arXiv:2604.22840].
- Verifiable Metrics: AeSlides introduces a suite of meticulously designed verifiable metrics [arXiv:2604.22840]. These metrics quantify slide layout quality accurately and efficiently [arXiv:2604.22840].
- Direct Optimization: It employs a GRPO-based reinforcement learning method for direct optimization [arXiv:2604.22840]. This method specifically targets aesthetically coherent layouts [arXiv:2604.22840].
- Efficiency: The approach offers an efficient and scalable way to align slide generation with human aesthetic preferences [arXiv:2604.22840].
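The paper does not publish its exact metric formulas, but the idea of "verifiable" layout metrics can be illustrated with deterministic checks over element bounding boxes. The following sketch is hypothetical: the function names, the 16:9 target, and the 5% tolerance are illustrative assumptions, not AeSlides' actual metric suite.

```python
# Illustrative (hypothetical) verifiable layout metrics: each check is a
# deterministic function of element bounding boxes, so it can serve as an
# objective, low-cost reward signal without a visual judge model.

def overlaps(a, b):
    """True if axis-aligned boxes a and b, given as (x, y, w, h), intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def collision_penalty(boxes):
    """Fraction of element pairs that collide (0.0 means no collisions)."""
    pairs = [(i, j) for i in range(len(boxes)) for j in range(i + 1, len(boxes))]
    if not pairs:
        return 0.0
    hits = sum(overlaps(boxes[i], boxes[j]) for i, j in pairs)
    return hits / len(pairs)

def aspect_ratio_ok(width, height, target=16 / 9, tol=0.05):
    """Verifies that an element or canvas matches the target aspect ratio."""
    return abs(width / height - target) / target <= tol
```

Because every check is a pure function of the generated layout, the same code can score thousands of candidate slides during training without human review.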
How does AeSlides work
AeSlides operates by integrating verifiable aesthetic metrics into a reinforcement learning framework.
- Define Verifiable Metrics: AeSlides first establishes a set of verifiable metrics [arXiv:2604.22840]. These metrics quantify specific aspects of slide layout quality [arXiv:2604.22840].
- Quantify Layout Quality: The metrics capture key layout issues in an accurate, efficient, and low-cost manner [arXiv:2604.22840].
- Develop GRPO-based RL: A GRPO-based reinforcement learning method is then developed [arXiv:2604.22840].
- Optimize Slide Generation: This method directly optimizes slide generation models [arXiv:2604.22840]. The optimization targets aesthetically coherent layouts [arXiv:2604.22840].
- Incentivize Aesthetics: The verifiable metrics serve as rewards, incentivizing the LLM to produce more aesthetic designs [arXiv:2604.22840].
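The steps above can be sketched in code. GRPO's core move is to sample a group of candidate outputs per prompt, score each with the reward function, and normalize rewards within the group to obtain relative advantages. This is a minimal sketch of that normalization step, assuming scalar rewards from a verifiable metric; it is not the paper's implementation.

```python
# Hypothetical sketch of the GRPO reward step: sample a group of candidate
# layouts for one prompt, score each with the verifiable metrics, then
# normalize within the group so above-average layouts get positive advantage.

import statistics

def group_relative_advantages(rewards):
    """GRPO-style advantage: (r - group mean) / group std, per group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:
        # All candidates scored identically: no learning signal this group.
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]

# Example: four sampled layouts for the same prompt, scored in [0, 1].
rewards = [0.9, 0.4, 0.6, 0.5]
advantages = group_relative_advantages(rewards)
```

Layouts scoring above the group mean receive positive advantage and are reinforced; the rest are suppressed, so no learned value model is needed.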
Benchmarks and evidence
| Metric | Before AeSlides | With AeSlides | Improvement | Source |
|---|---|---|---|---|
| Aspect Ratio Compliance | 36% | 85% | +49 pp | [arXiv:2604.22840] |
| Whitespace Reduction | Not yet disclosed. | Not yet disclosed. | 44% | [arXiv:2604.22840] |
| Element Collisions Reduction | Not yet disclosed. | Not yet disclosed. | 43% | [arXiv:2604.22840] |
| Visual Imbalance Reduction | Not yet disclosed. | Not yet disclosed. | 28% | [arXiv:2604.22840] |
| Human Evaluation Score (Overall Quality) | 3.31 | 3.56 | +7.6% | [arXiv:2604.22840] |
AeSlides, trained with 5K prompts on GLM-4.7-Flash, significantly improved aspect ratio compliance [arXiv:2604.22840]. It reduced whitespace by 44%, element collisions by 43%, and visual imbalance by 28% [arXiv:2604.22840]. Human evaluation showed a 7.6% increase in overall quality, outperforming reflection-based and model-based reward optimization approaches [arXiv:2604.22840]. It even edged out Claude-Sonnet-4.5 in human evaluations [arXiv:2604.22840].
Who should care
Builders
Builders of AI presentation tools should care about AeSlides for its method of improving visual aesthetics [arXiv:2604.22840]. The verifiable reward system offers a new paradigm for efficient and scalable aesthetic alignment [arXiv:2604.22840]. This could lead to more sophisticated and user-friendly slide generation capabilities.
Enterprise
Enterprise users seeking high-quality, automated presentation generation will find AeSlides relevant. Improved aesthetic layouts can enhance professional communication and brand consistency. This technology can streamline content creation workflows within organizations.
End users
End users who rely on AI for creating presentations will benefit from AeSlides. They can expect more visually appealing and coherent slides with less manual adjustment. This directly addresses the common issue of aesthetically suboptimal layouts from current LLM tools [arXiv:2604.22840].
Investors
Investors in AI and productivity software should note AeSlides’ potential impact. Enhancing the visual quality of LLM outputs can increase market adoption for slide generation tools. This framework represents a valuable advancement in practical AI applications.
How to use AeSlides today
The repository for AeSlides is available at https://github.com/ympan0508/aeslides [arXiv:2604.22840]. Developers can access the code to implement or adapt the framework. This allows for direct experimentation and integration into existing LLM-based slide generation pipelines.
AeSlides vs competitors
| Feature | AeSlides | Model-based Reward Optimization | Reflection-based Agentic Approaches | Claude-Sonnet-4.5 |
|---|---|---|---|---|
| Aesthetic Supervision | Explicit, verifiable metrics [arXiv:2604.22840] | Indirect, model-dependent [arXiv:2604.22840] | Heavy visual reflection [arXiv:2604.22840] | Not yet disclosed. |
| Inference Cost | Low-cost [arXiv:2604.22840] | Not yet disclosed. | High [arXiv:2604.22840] | Not yet disclosed. |
| Optimization Method | GRPO-based reinforcement learning [arXiv:2604.22840] | Not yet disclosed. | Not yet disclosed. | Not yet disclosed. |
| Aspect Ratio Compliance | 85% [arXiv:2604.22840] | Not yet disclosed. | Not yet disclosed. | Not yet disclosed. |
| Human Evaluation (Overall Quality) | 3.56 (+7.6% improvement) [arXiv:2604.22840] | Inferior to AeSlides [arXiv:2604.22840] | Inferior to AeSlides [arXiv:2604.22840] | Inferior to AeSlides [arXiv:2604.22840] |
AeSlides distinguishes itself by using explicit, verifiable aesthetic principles for supervision [arXiv:2604.22840]. This contrasts with model-based reward optimization and reflection-based agentic approaches, which provide weaker or more costly supervision [arXiv:2604.22840]. Its GRPO-based reinforcement learning method directly optimizes for aesthetic layouts, leading to superior human evaluation scores [arXiv:2604.22840].
Risks, limits, and myths
- Subjectivity of Aesthetics: Aesthetic preferences can be subjective and vary across cultures [arXiv:2604.22840]. AeSlides’ metrics aim for general principles, but edge cases may exist.
- Training Data Dependency: The effectiveness of AeSlides still depends on the quality and diversity of its training prompts [arXiv:2604.22840].
- Computational Resources: While efficient, reinforcement learning frameworks can still require significant computational resources for training.
- Myth: LLMs inherently understand visual aesthetics. Fact: LLMs are text-centric, while slide quality is governed by visual aesthetics, creating a modality gap [arXiv:2604.22840]. AeSlides aims to bridge this gap.
FAQ
- What problem does AeSlides solve?
- AeSlides solves the problem of aesthetically suboptimal layouts in LLM-generated slides, which arises from the modality gap between text-centric generation and visual quality [arXiv:2604.22840].
- How does AeSlides improve slide aesthetics?
- AeSlides improves slide aesthetics by using verifiable metrics to quantify layout quality and a GRPO-based reinforcement learning method to directly optimize for aesthetically coherent layouts [arXiv:2604.22840].
- What are “verifiable rewards” in AeSlides?
- Verifiable rewards in AeSlides are based on meticulously designed metrics that accurately, efficiently, and at low cost quantify slide layout quality [arXiv:2604.22840].
- Which LLM was used to train AeSlides?
- AeSlides was trained using 5K prompts on GLM-4.7-Flash [arXiv:2604.22840].
- How much did AeSlides improve aspect ratio compliance?
- AeSlides improved aspect ratio compliance from 36% to 85% [arXiv:2604.22840].
- Did human evaluators prefer slides generated by AeSlides?
- Yes, human evaluators showed a substantial improvement in overall quality, increasing scores from 3.31 to 3.56 (+7.6%) for AeSlides-generated slides [arXiv:2604.22840].
- Is AeSlides open source?
- Yes, the repository for AeSlides is available at https://github.com/ympan0508/aeslides [arXiv:2604.22840].
- Can AeSlides be used with other LLMs?
- Not yet disclosed. The paper specifies training on GLM-4.7-Flash [arXiv:2604.22840].
Glossary
- Large Language Model (LLM)
- A type of artificial intelligence model trained on vast amounts of text data to understand and generate human-like language [Wikipedia].
- Reinforcement Learning (RL)
- A machine learning paradigm where an agent learns to make decisions by performing actions in an environment to maximize a cumulative reward.
- Modality Gap
- The discrepancy between the primary input/output modality of a system (e.g., text for LLMs) and the modality required for high-quality output (e.g., visual aesthetics for slides) [arXiv:2604.22840].
- GRPO
- Group Relative Policy Optimization, a reinforcement learning algorithm that estimates advantages by normalizing rewards within a group of sampled responses rather than using a learned value function [arXiv:2604.22840].
- Verifiable Metrics
- Quantifiable measures that can be objectively checked or proven, used in AeSlides to assess slide layout quality [arXiv:2604.22840].
Sources
- AeSlides: Incentivizing Aesthetic Layout in LLM-Based Slide Generation via Verifiable Rewards
- Large language model – Wikipedia