Frontier Signal

LLM-Guided Agentic Floor Plan Parsing for Accessible Indoor Navigation

A new agentic framework uses LLMs to parse floor plans into structured knowledge bases, generating accessible indoor navigation instructions for blind and low-vision individuals.


A new agentic framework leverages Large Language Models (LLMs) to convert floor plan images into structured, retrievable knowledge bases. This system generates safe, accessible indoor navigation instructions for blind and low-vision individuals, reducing reliance on costly infrastructure. It employs a multi-agent module for parsing and a Path Planner with a Safety Evaluator for instruction generation.

| Attribute | Detail |
| --- | --- |
| Released by | arXiv cs.MA |
| Release date | Not yet disclosed. |
| What it is | An agentic framework for generating accessible indoor navigation instructions from floor plans using LLMs. |
| Who it is for | Blind and low-vision individuals seeking accessible indoor navigation solutions. |
| Where to get it | arXiv (arxiv.org/abs/2604.23970) |
| Price | Not yet disclosed. |
  • The framework converts floor plan images into structured, retrievable knowledge bases.
  • It generates accessible indoor navigation instructions for blind and low-vision people.
  • A multi-agent module parses each floor plan into a spatial knowledge graph, with self-correction through iterative refinement.
  • A Path Planner generates instructions, while a Safety Evaluator assesses hazards along each route.
  • The system outperforms single-call LLM baselines on real-world building data.
  • It offers a scalable solution for accessible indoor navigation, reducing the need for expensive per-building infrastructure.

What is LLM-Guided Agentic Floor Plan Parsing?

LLM-Guided Agentic Floor Plan Parsing is an agentic framework that transforms a single floor plan image into a structured, retrievable knowledge base. This system generates safe, accessible navigation instructions for blind and low-vision (BLV) individuals [Source: arXiv:2604.23970]. It aims to provide lightweight infrastructure for indoor navigation accessibility [Source: arXiv:2604.23970].

What is new in this approach?

This framework introduces a novel agentic pipeline for floor plan parsing and navigation-instruction generation.

  • Multi-Agent Module: It uses a multi-agent module for parsing floor plans into a spatial knowledge graph [Source: arXiv:2604.23970].
  • Self-Correcting Pipeline: The parsing includes iterative retry loops and corrective feedback for accuracy [Source: arXiv:2604.23970].
  • Integrated Safety Evaluator: A Safety Evaluator agent assesses potential hazards along each generated route [Source: arXiv:2604.23970].
  • Performance Gains: It consistently outperforms single-call LLM baselines in navigation success rates [Source: arXiv:2604.23970].
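The self-correcting pipeline above can be sketched as a retry loop that feeds validator errors back to the parser as corrective hints. This is a minimal illustration, not the paper's implementation; `parse_fn` and `validate_fn` stand in for the LLM parsing agent and its structural checker.

```python
def parse_with_retries(image, parse_fn, validate_fn, max_retries=3):
    """Run the parser, feeding validator errors back as corrective feedback."""
    feedback = None
    for _ in range(max_retries):
        graph = parse_fn(image, feedback)   # LLM parsing call (stubbed in tests)
        errors = validate_fn(graph)         # structural checks on the output graph
        if not errors:
            return graph                    # accepted on this attempt
        feedback = errors                   # retry with the validator's complaints
    raise RuntimeError(f"parsing failed after {max_retries} attempts: {feedback}")
```

The key design point is that each retry sees the previous attempt's errors, so the parser can correct itself rather than repeat the same mistake blindly.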

How does LLM-Guided Agentic Floor Plan Parsing work?

The system operates in two main phases to generate accessible navigation instructions.

  1. Floor Plan Parsing: A multi-agent module processes a single floor plan image [Source: arXiv:2604.23970]. This module parses the floor plan into a spatial knowledge graph [Source: arXiv:2604.23970]. It uses a self-correcting pipeline with iterative retry loops and corrective feedback [Source: arXiv:2604.23970].
  2. Path Planning and Safety Evaluation: A Path Planner generates accessible navigation instructions [Source: arXiv:2604.23970]. A Safety Evaluator agent then assesses potential hazards for each route [Source: arXiv:2604.23970]. The LLM acts as an agent by incorporating a role, environment, and memory as inputs [Source: 5].
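The two phases above can be sketched with a toy spatial knowledge graph and a planner that only keeps edges the safety check accepts. The graph schema, node names, and hazard labels here are illustrative assumptions, not the paper's actual representation.

```python
from collections import deque

# Toy spatial knowledge graph: nodes are locations, edge values are hazard
# tags (None means the edge is considered safe).
GRAPH = {
    "entrance": {"hallway": None},
    "hallway": {"entrance": None, "stairs": "unguarded drop-off", "elevator": None},
    "stairs": {"hallway": "unguarded drop-off", "room_101": None},
    "elevator": {"hallway": None, "room_101": None},
    "room_101": {"stairs": None, "elevator": None},
}

def plan_path(graph, start, goal, is_safe):
    """BFS over edges the safety check accepts; returns a node sequence or None."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt, hazard in graph[path[-1]].items():
            if nxt not in seen and is_safe(hazard):
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Safety Evaluator stand-in: reject any edge carrying a hazard tag, so the
# planner routes via the elevator instead of the hazardous stairs.
safe_route = plan_path(GRAPH, "entrance", "room_101", lambda h: h is None)
```

With the hazard filter in place the planner returns the elevator route; with a permissive filter it would accept the shorter but hazardous route through the stairs.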

Benchmarks and evidence

| Evaluation Metric | UMBC MP-1 (Short Routes) | UMBC MP-1 (Medium Routes) | UMBC MP-1 (Long Routes) | UMBC MP-3 (Short Routes) | UMBC MP-3 (Medium Routes) | UMBC MP-3 (Long Routes) | Source |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Agentic Framework Success Rate | 92.31% | 76.92% | 61.54% | 76.92% | 61.54% | 38.46% | arXiv:2604.23970 |
| Claude 3.7 Sonnet Baseline Success Rate | 84.62% | 69.23% | 53.85% | 61.54% | 46.15% | 23.08% | arXiv:2604.23970 |

The system was evaluated on the UMBC Math and Psychology building (floors MP-1 and MP-3) and the CVC-FP benchmark [Source: arXiv:2604.23970]. It showed consistent gains over single-call LLM baselines [Source: arXiv:2604.23970]. For example, on MP-1, it achieved 92.31% success for short routes, outperforming Claude 3.7 Sonnet at 84.62% [Source: arXiv:2604.23970].
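One way to sanity-check the reported percentages: every figure in the table is a multiple of 1/13 ≈ 7.69%, which is consistent with 13 test routes per bucket. That route count is an inference from the numbers, not something stated in this summary.

```python
def success_rate(successes, total):
    """Fraction of routes completed successfully, as a percentage (2 d.p.)."""
    return round(100 * successes / total, 2)

# If each bucket holds 13 routes (inferred, not confirmed), the headline
# numbers fall out directly: 12/13 successes on MP-1 short routes gives
# 92.31%, versus 11/13 (84.62%) for the Claude 3.7 Sonnet baseline.
```

Under that assumption, the agentic framework's gain on MP-1 short routes corresponds to one extra successful route out of thirteen.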

Who should care

Builders

Builders of AI systems for accessibility should care about this framework. It demonstrates a scalable solution for indoor navigation for BLV individuals [Source: arXiv:2604.23970]. The multi-agent approach with self-correction is a valuable design pattern [Source: arXiv:2604.23970].

Enterprise

Enterprises in hospitality, retail, and public services can leverage this technology. It can enhance accessibility for BLV customers and employees [Source: arXiv:2604.23970]. This reduces the need for expensive per-building infrastructure [Source: arXiv:2604.23970].

End users

Blind and low-vision individuals are the primary beneficiaries of this innovation. It offers improved and reliable indoor navigation instructions [Source: arXiv:2604.23970]. This enhances independence and safety in unfamiliar indoor environments [Source: arXiv:2604.23970].

Investors

Investors interested in AI for social good and accessibility technology should take note. This framework presents a scalable and impactful application of LLMs [Source: arXiv:2604.23970]. The market for accessibility solutions is growing, driven by regulatory and social demands [Source: 7].

How to use LLM-Guided Agentic Floor Plan Parsing today

The framework is currently presented as an academic paper on arXiv [Source: arXiv:2604.23970]. Direct public access or an API for immediate use is not yet disclosed. Researchers can access the paper for implementation details [Source: arXiv:2604.23970].

LLM-Guided Agentic Floor Plan Parsing vs competitors

| Feature/System | LLM-Guided Agentic Floor Plan Parsing | Single-Call LLM Baselines (e.g., Claude 3.7 Sonnet) | Traditional Indoor Navigation (costly infrastructure) |
| --- | --- | --- | --- |
| Input | Single floor plan image | Text prompts/limited image input | Dedicated sensors, beacons, or pre-mapped environments |
| Parsing Method | Multi-agent module, self-correcting, iterative retry loops | Direct LLM interpretation | Manual mapping or specialized computer vision |
| Output | Structured spatial knowledge base; accessible navigation instructions with safety evaluation | Navigation instructions (potentially less reliable/safe) | Navigation instructions based on infrastructure data |
| Infrastructure Cost | Lightweight | Low (software-based) | High (per-building installation) |
| Accessibility for BLV | High, with safety considerations | Moderate, less reliable | High, but limited by infrastructure availability |
| Performance (UMBC MP-1 Short) | 92.31% success rate | 84.62% success rate | Not yet disclosed. |
| Scalability | Scalable solution | Limited by single-call LLM robustness | Limited by infrastructure deployment |

The agentic framework significantly outperforms single-call LLM baselines like Claude 3.7 Sonnet in navigation success rates [Source: arXiv:2604.23970]. It offers a lightweight infrastructure solution compared to traditional methods [Source: arXiv:2604.23970]. Multimodal LLMs, which integrate visual and textual reasoning, are a promising frontier for interpretable assessments [Source: 6]. GPT-5.5 (xhigh) currently ranks #1 on the Artificial Analysis LLM Leaderboard [Source: 2]. GLM-5.1 leads open-source LLMs in coding performance [Source: 3].

Risks, limits, and myths

  • Floor Plan Accuracy: The system’s performance depends on the clarity and accuracy of the input floor plan image. Imperfect or outdated floor plans could lead to incorrect navigation instructions.
  • Dynamic Environments: The current framework may struggle with dynamic changes in indoor environments, such as temporary obstacles or furniture rearrangements.
  • LLM Hallucinations: Like all LLMs, there’s a risk of the model generating plausible but incorrect information, especially in complex parsing tasks.
  • Myth: LLMs alone are sufficient for complex tasks. This research demonstrates that an agentic framework with iterative self-correction significantly outperforms single-call LLMs [Source: arXiv:2604.23970].
  • Myth: Costly infrastructure is always necessary for indoor navigation. This system aims to provide accessible navigation with lightweight infrastructure [Source: arXiv:2604.23970].

FAQ

What is the primary goal of LLM-Guided Agentic Floor Plan Parsing?
The primary goal is to generate safe, accessible indoor navigation instructions for blind and low-vision individuals using a single floor plan image [Source: arXiv:2604.23970].
How does the system process a floor plan image?
A multi-agent module parses the floor plan image into a spatial knowledge graph through a self-correcting pipeline [Source: arXiv:2604.23970].
What role does the Safety Evaluator play?
The Safety Evaluator agent assesses potential hazards along each generated navigation route [Source: arXiv:2604.23970].
Is this system better than using a single LLM for navigation?
Yes, the agentic framework consistently outperforms single-call LLM baselines in navigation success rates [Source: arXiv:2604.23970].
What kind of infrastructure does this system require?
It is designed to work with lightweight infrastructure, reducing the need for costly per-building installations [Source: arXiv:2604.23970].
On which datasets was the system evaluated?
The system was evaluated on the UMBC Math and Psychology building (floors MP-1 and MP-3) and the CVC-FP benchmark [Source: arXiv:2604.23970].
Can this technology be used in commercial applications?
Not yet disclosed. The research paper suggests it is a scalable solution for accessible indoor navigation [Source: arXiv:2604.23970].
What is an agentic framework in the context of LLMs?
An agentic framework extends an LLM by adding supporting elements like a role, environment, and memory, allowing it to perform complex tasks iteratively [Source: 5].
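The role/environment/memory pattern in that answer can be sketched as a short loop. Everything here is an illustrative assumption: `llm` is a stub standing in for a real model call, and the prompt format and stopping rule are invented for the example.

```python
def llm(prompt):
    # Stub for a language-model call: pretends to finish once its own
    # earlier output ("step 2...") appears in the memory section of the prompt.
    return "done" if "step 2" in prompt else "step 2: continue"

def run_agent(role, environment, max_steps=5):
    """Iterate an LLM call with a role, environment state, and growing memory."""
    memory = []
    for _ in range(max_steps):
        prompt = f"{role}\nEnvironment: {environment}\nMemory: {memory}"
        action = llm(prompt)
        memory.append(action)          # memory accumulates across iterations
        if action == "done":
            break
    return memory
```

The point of the pattern is the loop itself: each call sees the accumulated memory, which is what lets an agentic system refine or correct earlier outputs instead of answering in a single shot.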

Glossary

Agentic Framework
A system where an LLM is augmented with a role, environment, and memory to perform tasks iteratively and autonomously [Source: 5].
Blind and Low-Vision (BLV)
Individuals with significant visual impairment, requiring specialized accessibility solutions [Source: arXiv:2604.23970].
Floor Plan Parsing
The process of extracting structural and spatial information from a floor plan image [Source: arXiv:2604.23970].
Large Language Model (LLM)
A type of artificial intelligence model trained on vast amounts of text data, capable of understanding and generating human-like text [Source: 5].
Multimodal LLM
An LLM that can process and integrate information from multiple modalities, such as text and images [Source: 6].
Spatial Knowledge Graph
A structured representation of spatial relationships and entities within an environment, derived from a floor plan [Source: arXiv:2604.23970].

Explore the full research paper on arXiv to understand the technical details and potential for implementation.

Author

  • siego237

    Writes for FrontierWisdom on AI systems, automation, decentralized identity, and frontier infrastructure, with a focus on turning emerging technology into practical playbooks, implementation roadmaps, and monetization strategies for operators, builders, and consultants.

