From SaaS Sprawl to a Knowledge Graph

Turning fragmented signals into a foundation for secure automation

SaaS adoption has outpaced security’s ability to keep up. Every new application defines its own model of users, roles, and permissions. Each integration adds another path for access. Inactive accounts pile up. What looks like visibility at the app level turns into blind spots at the enterprise level.

Security teams feel this every day. They are buried in alerts, but the basic questions linger: Who still has admin rights across multiple apps? Which third-party connections create exposure? How many dormant identities still hold sensitive access?

Data collection is essential, but it isn’t enough on its own. Without a way to connect the dots, it’s impossible to see how identities, permissions, and activity interact to create real risk. Solving that requires a new kind of data architecture, one that doesn’t just ingest records but understands relationships, derives context, learns continuously as new data flows in, and turns those insights into action. That’s where a knowledge graph comes in.

What a Knowledge Graph Really Is

A knowledge graph connects facts about entities such as people, accounts, tokens, and resources into relationships that reveal context. Instead of storing data in tables or indexes, it links things the way they exist in the real world. That structure makes it possible to ask questions that cross systems: who owns what, how access was granted, when it changed, and what it connects to now. For security, it means turning scattered identity and application state or process data into a single, living model that can explain how risk actually forms and moves.
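To make the idea concrete, here is a minimal sketch of such a model in Python. The entity kinds, relationship names, and sample identifiers are hypothetical and chosen only for illustration; they are not Obsidian’s schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    kind: str  # illustrative kinds: "person", "account", "token", "resource"
    name: str


class KnowledgeGraph:
    """Tiny in-memory graph: labeled, directed edges between entities."""

    def __init__(self):
        # source entity -> list of (relation, target entity)
        self._out = defaultdict(list)

    def add(self, source, relation, target):
        self._out[source].append((relation, target))

    def neighbors(self, source, relation=None):
        """Targets one hop away, optionally filtered by relation name."""
        return [t for r, t in self._out[source] if relation is None or r == relation]


# Hypothetical sample data.
alice = Entity("person", "alice")
crm_account = Entity("account", "alice@crm")
api_token = Entity("token", "crm-export-token")
customer_db = Entity("resource", "customer-records")

graph = KnowledgeGraph()
graph.add(alice, "owns", crm_account)
graph.add(crm_account, "issued", api_token)
graph.add(api_token, "can_read", customer_db)

# A question that crosses entity types:
# which resources can tokens issued by Alice's accounts read?
readable = [
    resource
    for account in graph.neighbors(alice, "owns")
    for token in graph.neighbors(account, "issued")
    for resource in graph.neighbors(token, "can_read")
]
```

The point of the sketch is the shape of the question: it follows relationships from a person to a resource rather than joining separate tables of users, tokens, and files.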

Why You Need a Purpose-Built Graph That Knows SaaS

SaaS environments are not like traditional infrastructure. They’re built around people, permissions, and connections that cross products, tenants, and identity systems. Answering basic but critical questions (Who has access? How was it granted? Where does that access spread?) requires tracing how a single identity moves through multiple systems, roles, and permissions. Each relationship builds on the next, creating a long chain of dependencies that is hard to follow. Relational databases struggle with these questions because each additional hop typically means another join, and deep or variable-length chains become slow and unwieldy to query.

A graph built for SaaS makes this practical:

Many-to-many identity resolution. See how one user’s identity ties together across different systems (their department, their roles, their app accounts) and understand the full access footprint. For example, connect a user created in Azure AD to their department and role in the HRIS (linked through Okta), to their app access in Salesforce, and to the specific app role that defines what they can do. Every link is normalized, contextualized, and queryable.
Multi-hop visibility. Understand how risk spreads by following relationships several steps deep. For instance, trace how a single compromised account in Google Workspace, with a connected integration to Slack, can escalate into unauthorized access to a linked GitHub repository through shared tokens or automation workflows. The graph can model those multi-hop relationships such as user → integration → API token → external system in real time, surfacing exposure paths before they’re exploited. What used to take hours of manual correlation becomes a single, traceable query.
Nested permissions. Most SaaS environments layer roles and groups on top of one another, hiding privilege inheritance deep within entitlements. A graph built for SaaS untangles these hierarchies, showing when one role inherits another or when a group nested inside multiple teams gives unintended admin rights. For example, a user might belong to a project group that sits inside a broader department role with elevated permissions. Instead of collapsing that data into massive tables, the graph preserves every relationship, revealing how those hidden permissions stack up to create silent privilege creep.
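The multi-hop and nested-permission cases above are the same operation underneath: follow edges several steps deep and keep the path as evidence. A minimal sketch, assuming a hypothetical adjacency list with made-up node names (a real graph would carry typed, normalized edges from many connectors):

```python
from collections import deque

# Hypothetical adjacency list: node -> nodes reachable in one hop.
EDGES = {
    "user:alice@workspace": ["integration:slack-connector"],
    "integration:slack-connector": ["token:slack-bot"],
    "token:slack-bot": ["system:github-repo"],
    "user:bob@workspace": ["group:project-x"],
    "group:project-x": ["role:dept-admin"],
}


def exposure_paths(start, is_target, max_hops=4):
    """Breadth-first search returning each cycle-free path from start to a target."""
    paths = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if len(path) > 1 and is_target(node):
            paths.append(path)
            continue
        if len(path) - 1 >= max_hops:
            continue  # hop budget exhausted on this branch
        for nxt in EDGES.get(node, []):
            if nxt not in path:  # never revisit a node on the same path
                queue.append(path + [nxt])
    return paths


# Multi-hop: which external systems can a compromised account reach?
alice_paths = exposure_paths("user:alice@workspace", lambda n: n.startswith("system:"))

# Nested permissions: which roles does a user inherit through group membership?
bob_paths = exposure_paths("user:bob@workspace", lambda n: n.startswith("role:"))
```

Returning whole paths rather than just endpoints is a deliberate choice in this sketch: the path is what lets an analyst see how the access came to exist, not only that it does.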

This is why the Knowledge Graph must be purpose-built for SaaS from the start. It models relationships as they really exist across vendors and identity systems, scales those connections naturally, and brings visibility to places other tools can’t reach. Traditional databases and retrofitted cloud graphs simply weren’t designed to capture how SaaS identities, permissions, and integrations actually interconnect.

Why Traditional Graphs Don't Work for SaaS

At first glance, it might seem like the security industry has already solved this problem. Graphs are everywhere: in EDRs, in cloud posture tools, in SIEMs. Each promises to secure SaaS. But how a graph is built defines what it can represent. And no matter how advanced these legacy models are, none can be retrofitted to deliver comprehensive SaaS security. The foundation simply isn’t built for it. Let’s take a look at a few of them.

Endpoint-first graphs don’t understand SaaS: These graphs were designed for a world of hosts and processes, where everything revolves around client agents and events from devices. That model breaks when there’s no agent to observe. In SaaS, there are no hosts to monitor. There are only users, roles, and entitlements exposed through APIs that don’t share a common schema. The result: endpoint-style graphs can tell you what’s running on a machine, but not who has access to what in your SaaS estate, or how that access shifts when an integration or admin token changes ownership.

Cloud posture graphs miss the SaaS and identity context: Cloud posture systems map infrastructure like compute, storage, IAM policies, and configuration into a digital twin of the environment. That’s great for answering questions like “which S3 buckets are public?” but not access questions like “which vendor integration can impersonate a user across tenants?” SaaS apps don’t behave like EC2 or IAM. Each app defines its own identity model, and many don’t emit full telemetry (and Amazon CloudWatch logs are expensive!). Treating SaaS as cloud infrastructure creates an illusion of coverage, while blind spots and real risk accumulate in the relationships between users, entitlements, and third-party integrations.

Your SIEM can’t explain SaaS access: SIEMs excel at centralizing and correlating events, which is valuable for detection. But SaaS risk doesn’t live in events, it lives in relationships. Questions like “Who still has admin rights across multiple apps?” or “Which inactive account retains effective access?” require a normalized entitlement model that spans hundreds of SaaS APIs. SIEMs don’t maintain that structure. Without it, you can ingest events endlessly and still miss where the real exposure lives.

The bottom line: extending endpoint graphs, cloud posture graphs, or SIEMs into SaaS doesn’t work because their anchors, joins, and update patterns were never meant for it. They model infrastructure, not identities. SaaS security demands something purpose-built: a graph that treats users, roles, permissions, and integrations as first-class citizens and captures how they actually interact across systems.

Rethinking Security Graph Foundations for the SaaS and AI Enterprise

It’s clear that traditional graphs weren’t built for the realities of SaaS, and extending them won’t bridge the gap. Securing this new landscape requires rethinking, from the ground up, how SaaS security graphs work.

We know SaaS risk doesn’t live in one account or one misconfiguration. It hides in the connections between users, systems, and now AI agents. As enterprises adopt AI-driven workflows with delegated credentials and broad access to data, the attack surface shifts from static to dynamic. A dormant admin token might seem harmless until it is paired with an overly broad OAuth integration or an AI workflow that inherits those permissions. A forgotten local account becomes dangerous when an agent chains its access through other unsupervised SaaS APIs.

Logs can’t reveal this kind of exposure. Only a stateful security graph that understands how users, AI agents, integrations, and workflows relate can map the paths that attackers exploit or that misaligned AI actions might take advantage of.

A few outcomes that matter on day one:

Find orphaned accounts and lingering API access. Shows identities and tokens that still own resources but no longer belong to active users.
See blast radius with identity and AI agent level risk. Combines role, usage, integration scope, and activity to rank where mistakes would have the biggest impact.
Catch configuration and permission drift. Exposes where settings or entitlements quietly expand over time, creating unexpected openings that attackers can exploit.
Operate across many tenants. Summarizes posture and drift across environments without rewriting policies per app.
Trace known and unknown data paths. Maps how corporate data flows through integrations or AI agents into shadow SaaS, exposing the hidden web of interconnections that make up the SaaS supply chain, so you can enforce the right controls to stop lateral risk.
Purpose-built for SaaS-native incident response. Surfaces true positives across external and insider activity and brings the right logs into context so investigations focus on evidence, not noise.
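As an illustration of the first outcome, the check for orphaned accounts reduces to a simple join once accounts are linked to people. The sketch below assumes hypothetical inputs: a set of active people (for example, from an HR source) and account records with a resolved owner and a count of resources or tokens still owned. The field names are invented for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Account:
    account_id: str
    owner: Optional[str]   # linked person, or None if no owner was resolved
    owned_resources: int   # resources or tokens the account still owns


# Hypothetical sample data.
active_people = {"alice", "bob"}
accounts = [
    Account("alice@crm", "alice", 12),
    Account("carol@crm", "carol", 7),   # carol is no longer an active user
    Account("svc-export", None, 3),     # no owner on record
    Account("dave@wiki", "dave", 0),    # inactive owner, but owns nothing
]


def find_orphans(accounts, active_people):
    """Accounts that still own something but have no active owner behind them."""
    return [
        a.account_id
        for a in accounts
        if a.owned_resources > 0 and a.owner not in active_people
    ]


orphans = find_orphans(accounts, active_people)
```

The hard part in practice is not this filter but the identity resolution that populates the owner link across applications, which is what the graph is for.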

In the SaaS and AI enterprise, visibility isn’t just about collecting data, it’s about understanding how everything connects. That’s the foundation of a modern SaaS security graph, and it’s why Obsidian is redefining how security intelligence is built for this new SaaS and AI era.

The Obsidian Knowledge Graph

If traditional graphs fall short in SaaS, the answer isn’t to retrofit, it’s to rebuild from first principles. The Obsidian Knowledge Graph does exactly that.

Our approach starts with SaaS itself as the system of record. The graph models people, accounts (human, non-human), entitlements, applications, tokens, and the relationships that connect them. Every entity and edge is versioned to capture how things change over time. Data from admin APIs, activity streams, and integrations is normalized into a common schema that scales naturally across hundreds of applications and many tenants. The result is a living, stateful model of your SaaS environment, one that continuously learns and adapts as your ecosystem evolves.
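One common way to version edges, shown here only as a sketch and not as a description of Obsidian’s storage, is to give every edge a validity interval and answer questions “as of” a point in time. The class, field names, and sample history below are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class EdgeVersion:
    source: str
    relation: str
    target: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the edge is still current


# Hypothetical history: a role assignment that changed over time.
history = [
    EdgeVersion("alice@crm", "has_role", "viewer", date(2024, 1, 10), date(2024, 6, 1)),
    EdgeVersion("alice@crm", "has_role", "admin", date(2024, 6, 1)),
]


def edges_as_of(history, when):
    """Return the edge versions in effect on the given date (end date exclusive)."""
    return [
        e
        for e in history
        if e.valid_from <= when and (e.valid_to is None or when < e.valid_to)
    ]


spring = [e.target for e in edges_as_of(history, date(2024, 3, 1))]
autumn = [e.target for e in edges_as_of(history, date(2024, 9, 1))]
```

Because old versions are closed rather than overwritten, the same query can be replayed for any date, which is what makes drift and “when did this change?” questions answerable.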

Why the Obsidian Knowledge Graph Has an Advantage

Enterprise scale. Obsidian dominates at the enterprise level, with some of the largest SaaS deployments in the world, including Fortune 25 environments. The platform is built to handle millions of identities and thousands of connected apps while staying extensible for any customer architecture and emerging use case.

Deep and extensible connectors. We build deep connectors where they matter most, pulling both posture data and activity data across major SaaS platforms. And through our community connector strategy, partners, customers and SaaS vendors can use our SDK to extend coverage, adding their own integrations without losing normalization or control.

Soft visibility on top of hard data. Structured APIs only tell part of the story. Obsidian layers inferential insight from the client side through our browser extension, email metadata, and related behavioral context. This reveals shadow SaaS and AI use, local access patterns, and the hidden web of interconnections across integrations, tokens, and identities that connect your business. The result is full-spectrum visibility that grows with your environment and sees beyond the systems you manage to the ones your people actually use.

How We Layered the Data to See What Others Can’t

Each layer of the Obsidian Knowledge Graph builds upon the last, transforming raw data into actionable intelligence:

Start with identity and activity: Security begins with two essentials: who someone is and what they did. Our connectors pull identity data such as users, groups, roles, and entitlements, along with activity data like authentications and in-app actions, and align them over time. These records are versioned and never overwritten, so identity changes and behavior stay consistent and analysts can replay changes across vendors and tenants.
Build baselines and learn from real incidents: Our machine learning algorithms establish behavioral baselines across users, roles, and applications, continuously refining them with lessons from real world SaaS incidents investigated by Obsidian, the only vendor standardized for SaaS incident response. The system learns which patterns matter, and analysts can reinforce it in product by marking outcomes to keep detections precise and grounded in reality.
Add SaaS state to manage rights and surface misconfigurations: Configuration and policy data, such as settings, controls, roles, and ownership, are layered in to show how access and posture align across applications. This makes it easy to spot critical misconfigurations and privilege overlaps before they create exposure. Each finding includes its evidence path and the smallest change needed to restore least privilege.
Model non-human identities: Service accounts, API keys, OAuth clients, and AI agents are modeled as first-class entities with origin, scope, and usage. Linking them to apps and data makes it clear when automation drifts beyond its intended role.
Capture client-side truth: Server-side APIs show what’s reported, not always what’s real. Our browser extension fills that gap by reading authentication signals, confirming local account use, and identifying when sessions move to different infrastructure that could indicate an active threat. It also exposes shadow SaaS and AI use that admin consoles never see.

Together, these layers form a living, learning graph, one that starts with normalized facts, learns from real-world incidents, explains posture with evidence, separates humans from AI automation, and adds client visibility where servers are blind. That is the foundation for understanding and governing SaaS risk at enterprise scale.

Understanding Authentication and Authorization Sprawl in Your SaaS Supply Chain

Once you have a graph that captures every relationship between identities, applications, and integrations, a deeper challenge emerges: understanding how access is actually exercised.

In SaaS, most access doesn’t happen through an interactive user session, it happens server-side, driven by integrations and automations. OAuth clients receive tokens routinely with scopes that can outlive the click that created them. That is where authentication and authorization sprawl takes root.

The Obsidian Knowledge Graph pulls all of this into one system. It connects who (users, AI agents) and what (identities, accounts, applications, resources) with how access is exercised (grants, scopes, and observed use from admin APIs and browser telemetry). Because it’s time-aware, you can see when scopes expand, when usage patterns change, and where tokens sit idle with unnecessary privilege. It turns integration risk into something you can query and govern, restored to your control environment rather than left dangling as a silo.

The graph is designed to make this manageable and actionable:

Use versus grant. Compares scopes granted to those that are actually used, informed by admin APIs and browser telemetry. The result is a concrete, data-backed "shrink to use" plan rather than guesswork.
Publisher and posture signals. The model captures activity and configuration posture so risky publishers and patterns can be descoped or removed safely, without business interruption.
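At its core, the use-versus-grant comparison is a set difference per client. The sketch below assumes two hypothetical inputs: the scopes granted to each OAuth client and the scopes actually observed in use over some review window. The client and scope names are invented for the example.

```python
# Hypothetical data: scopes granted to each OAuth client vs. scopes observed in use.
granted = {
    "calendar-sync": {"calendar.read", "calendar.write", "contacts.read"},
    "report-bot": {"files.read", "files.write", "admin.directory"},
}
observed = {
    "calendar-sync": {"calendar.read", "calendar.write"},
    "report-bot": {"files.read"},
}


def shrink_to_use(granted, observed):
    """For each client, list granted scopes with no observed use (removal candidates)."""
    plan = {}
    for client, scopes in granted.items():
        unused = scopes - observed.get(client, set())
        if unused:
            plan[client] = sorted(unused)
    return plan


plan = shrink_to_use(granted, observed)
```

The output is best read as a list of candidates rather than a verdict: a scope with no observed use in the window may still be needed for a rare job, so the length of the observation window matters.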

With these relationships modeled in the graph, control becomes straightforward: find unused or over-scoped tokens, right-size privilege safely, detect anomalous API and token activity, and enforce consistent security and governance policies across every application and tenant.

This is how the Knowledge Graph protects your SaaS supply chain end to end. You can restore trust with third- and fourth-party integrations by modeling authentication, authorization, identity, and entitlements as part of a single connected fabric.

Seeing the SaaS Estate Through Identities

Once authentication and authorization are connected, the next layer of understanding comes into view: identity.

The Knowledge Graph makes the account-to-identity picture explicit across tenants and applications and, just as importantly, shows how each account is used. In practice, a single identity can fan out to multiple application accounts across core SaaS systems such as productivity suites, HR platforms, data warehouses, and CRM tools, with clear markers for SSO logins, local logins, and integration-driven access. Obsidian turns this sprawl into a navigable map of relationships and access paths. What’s represented:

Accounts and login paths. Both SSO and local access are modeled as first-class edges, showing where a person uses federation, connects from unmanaged devices, and where a local account still exists that bypasses central control.
Client evidence. Through the browser extension, the session issued on a client is correlated with follow-on activity, revealing when strong authentication was applied and when it wasn’t. Investigations gain insights into ‘how it happened’ and not just ‘what they had’.
Proving privileged access with evidence of enforcement. At the click of a button, you can pull every privileged identity, show the controls in place, and present the evidence that those identities are governed appropriately. SaaS security becomes audit-ready by design, feeding seamlessly into GRC processes without manual effort.
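When login paths are modeled as edges, a question such as “where does a local account bypass central control?” becomes a filter over those edges. A small sketch, assuming hypothetical edge tuples of identity, application account, and login method:

```python
from collections import defaultdict

# Hypothetical login-path edges: (identity, application account, login method).
login_edges = [
    ("alice", "alice@crm", "sso"),
    ("alice", "alice@warehouse", "local"),
    ("bob", "bob@crm", "sso"),
    ("bob", "bob@hr", "sso"),
]


def local_bypass(login_edges):
    """Group accounts reachable by local login under the identity that owns them."""
    bypass = defaultdict(list)
    for identity, account, method in login_edges:
        if method == "local":
            bypass[identity].append(account)
    return dict(bypass)


bypass = local_bypass(login_edges)
```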

This identity-centric view transforms SaaS security from reactive log review into proactive governance, built on clear relationships and verifiable evidence.

Scaling to AI Agent Access

The rise of AI agents raises the stakes because they automate actions across many SaaS applications. They also show why Obsidian’s approach to the Knowledge Graph model matters most now.

AI agents don’t just interact with SaaS systems, they act across them, chaining permissions and actions that span multiple applications. Obsidian has spent years mapping identities, accounts, tokens, scopes, sessions, and their history across large SaaS estates, and refining that model with real incident outcomes and analyst feedback. The same Knowledge Graph that answers who has what, and how, now extends that precision to who is behind an agent, what tools it used, and where it moves your critical data.

What makes this powerful is Obsidian’s ability to identify toxic combinations where agent access, user privilege, and SaaS data sensitivity align in ways that expose critical systems. The graph understands permissions for both user and runner, the routes each agent can take, and which SaaS objects carry sensitive data. That insight reveals paths to unauthorized use, privilege escalation, public exposure, or cross-SaaS data leakage before they happen.
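A toxic combination can be thought of as a conjunction of conditions that are individually tolerable but risky together. The sketch below is a simplified illustration with invented field names: it flags agent grants where the agent runs with a privileged user’s permissions, touches a sensitive resource, and is able to share externally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentGrant:
    agent: str
    runs_as: str                # user whose privileges the agent inherits
    resource: str
    can_share_externally: bool  # whether this grant allows sending data outside


# Hypothetical context that the graph would supply.
privileged_users = {"alice"}
sensitive_resources = {"customer-records"}
grants = [
    AgentGrant("summarizer", "alice", "customer-records", True),
    AgentGrant("summarizer", "alice", "team-wiki", True),
    AgentGrant("scheduler", "bob", "customer-records", False),
]


def toxic_combinations(grants, privileged_users, sensitive_resources):
    """Flag grants where all three risk factors are present at once."""
    return [
        g
        for g in grants
        if g.runs_as in privileged_users
        and g.resource in sensitive_resources
        and g.can_share_externally
    ]


flagged = toxic_combinations(grants, privileged_users, sensitive_resources)
```

Only the first grant is flagged here; the other two each miss one factor. A real evaluation would also follow multi-hop routes between agent and data rather than checking direct grants alone.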

Just as important, it allows defenders to take decisive, evidence-based action. Analysts can visualize an agent’s identity lineage, review which OAuth scopes it has inherited, and see exactly how its activity traverses tenants and apps. With this understanding, teams can confidently revoke dangerous access, redesign workflows safely, and set policies that limit agent privileges to what’s truly required.

By grounding AI visibility in the same graph that models SaaS identities and entitlements, Obsidian makes AI security operational, not theoretical. The result is control that keeps pace with automation, ensuring that innovation doesn’t outstrip governance.

In a world where agents act faster than humans can review, the Obsidian Knowledge Graph gives security teams back the context, confidence, and control they need to move just as fast, safely.

Closing thought

SaaS sprawl isn’t slowing down, and the rise of AI agents is adding even more motion and complexity. The durable answer isn’t another integration or dashboard or superficial SaaS ‘visibility’. It’s a knowledge graph that starts from identities (human and agentic) and access, learns from real data, and keeps the relationships current. Do that and automation becomes safe, policy becomes portable, and the SaaS estate finally makes sense.

This is the foundation Obsidian was built on from day one. Every part of our platform, from how we collect data to how we model it, is engineered for the realities of SaaS. We didn’t retrofit an endpoint, cloud or SIEM graph. We built a one-of-a-kind Knowledge Graph purpose-built for SaaS and extended the intelligence for the world of AI. It’s designed to represent how access, identity and behavior actually interact in dynamic, multi-tenant environments.

That deep, normalized model, powered by an AI-native security engine, can reason over complex relationships, understand how risk propagates, and surface the signals that matter most. It’s how Obsidian turns SaaS chaos into clarity, making automation safe, policy portable, and security decisions explainable.

If we’ve piqued your interest, here are a few resources to get you started:

Explore our platform differentiators on this page
Listen to CISOs from some of the leading enterprises as they discuss identity, AI, and the SaaS supply chain as the next security reckoning
Try out Obsidian at no cost right away
