Autograph: Knowledge Graphs are just the beginning

May 9, 2024

I’ve been obsessed with knowledge graphs since I started using Obsidian in 2022. I think having a densely connected knowledge graph is like having a densely connected brain/neural network — there is a lot of intelligence embedded within.

A knowledge graph, according to Wikipedia, is “a knowledge base that uses a graph-structured data model or topology to represent and operate on data. Knowledge graphs are often used to store interlinked descriptions of entities – objects, events, situations or abstract concepts – while also encoding the semantics or relationships underlying these entities.” They are very powerful structures that encode complex, multidimensional relationships, which makes them useful to query!

The idea of generating a knowledge graph with an LLM has, to me, been a latent killer app, as maintaining a knowledge graph manually is hard to do in a robust and thorough way. Doing so with an LLM creates massive possibilities for these knowledge graphs and their use cases, as the tasks associated with building a knowledge graph play to the strengths of an LLM (notably summarization) while minimizing exposure to its weaknesses.

I think knowledge graphs can empower something much better than RAG, which I call ASTRA: Autograph-based Semantic Text Retrieval and Augmentation. It’s a rough idea in my mind, but it’s basically finding some “relevant” data and retrieving not just that data but also its semantic neighbors, then putting all of that in the context. Any “decent” RAG implementation is likely some form of this; otherwise you’re just getting top-k junk and hoping for the best, so you make k bigger.
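To make the "relevant data plus its semantic neighbors" idea concrete, here is a minimal sketch. It is not the AutoGraph implementation: the toy graph, the word-overlap `score` function (a stand-in for embedding similarity), and the `retrieve_with_neighbors` name are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical toy graph: node id -> text, plus undirected edges.
NODES = {
    "obsidian": "Obsidian is a note-taking app built on linked notes.",
    "knowledge_graph": "A knowledge graph stores entities and their relationships.",
    "rag": "RAG retrieves top-k chunks and adds them to the prompt.",
    "llm": "An LLM can summarize text and extract entities.",
    "eevee": "Eevee is a Pokemon with many evolutions.",
}
EDGES = [
    ("obsidian", "knowledge_graph"),
    ("knowledge_graph", "llm"),
    ("knowledge_graph", "rag"),
]


def build_adjacency(edges):
    """Turn an undirected edge list into a node -> neighbors mapping."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def score(query, text):
    """Stand-in relevance score: number of shared lowercase words.

    A real system would likely use embedding similarity here.
    """
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve_with_neighbors(query, nodes, edges, k=1, hops=1):
    """Return the top-k seed nodes plus everything within `hops` edges of them."""
    adjacency = build_adjacency(edges)
    ranked = sorted(nodes, key=lambda n: score(query, nodes[n]), reverse=True)
    seeds = ranked[:k]

    selected = list(seeds)  # keeps insertion order: seeds first, then neighbors
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in selected:
                selected.append(neighbor)
                frontier.append((neighbor, depth + 1))
    return selected


context_ids = retrieve_with_neighbors("what is a knowledge graph", NODES, EDGES)
context = "\n".join(NODES[node_id] for node_id in context_ids)
```

With `k=1` and `hops=1`, the seed is the best-matching node and the context also picks up its direct neighbors, while unconnected nodes stay out. Raising `hops` widens the neighborhood instead of raising k.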

I’ve created a small pipeline to do this and, to me, this pipeline has a world of potential, like a “stem cell” or the Pokémon “Eevee”. I’ll use this post to explore some ideas of what is possible if you take the generic knowledge graph generator and create a pipeline oriented toward a specific use case. These definitions are illustrative and require deeper thought to really find some gold, but I’m pretty sure there is some here.

AutoGraph and its Potential Evolutions #

Like Eevee, AutoGraph is a lot of different things to a lot of different people. This is an attempt to capture that, in order to focus on making sure the base AutoGraph is adaptable. That generality may make the base worse for any single use case, as the design and opinions are geared towards generalization, but in theory, it should be a good jumping-off point for many different ideas!

On Data Privacy: A lot of these ideas expect a world where audio transcripts of conversations are accessible, whether they’re from meeting recordings, an AI voice pendant, or some other method of capturing the relevant data of what a user is experiencing (biometrics, etc.). The privacy implications are not addressed in this post, but I do think very seriously about this. I would encourage looking at the capability without regard to privacy – for now.

Normal: AutoGraph #

The normal Eevee evolution is the generic and default AutoGraph. This is the raw capability to go from unstructured data to a generated knowledge graph. This approach, like the base form of Eevee, is not strongly opinionated: it is a generalist that works with many different use cases but is not tailored to any one of them. It lacks the structures and niceties that could enable opinionated use cases.
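As a rough sketch of what "unstructured data in, knowledge graph out" could look like, the snippet below turns text into (subject, relation, object) triples and then into a graph. The `extract_triples` function is a hypothetical stand-in for the LLM extraction step: it only understands three-word sentences, which is enough to show the shape of the data, not how AutoGraph actually extracts it.

```python
from collections import defaultdict


def extract_triples(text):
    """Hypothetical stand-in for an LLM extraction call.

    Only recognizes sentences of exactly three words,
    read as "<subject> <relation> <object>".
    """
    triples = []
    for sentence in text.split("."):
        words = sentence.split()
        if len(words) == 3:
            triples.append(tuple(words))
    return triples


def build_graph(triples):
    """Group triples by subject: subject -> list of (relation, object)."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return dict(graph)


graph = build_graph(extract_triples("Obsidian uses Markdown. Markdown links notes."))
```

Each specialized AutoGraph described in the following sections would mostly change the extraction step and the node and edge types it is allowed to produce, while the graph-building step stays generic.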

Water: Therapy Conversation #

The Therapy Conversation AutoGraph is focused on capturing thoughts, ideas, emotions, experiences, mental states, and more from recordings of therapy sessions. This enables reflective tools and systems to help individuals explore their thoughts, emotions, and experiences within a therapeutic context. By capturing and organizing therapy sessions into a knowledge graph, this system enables users to gain insights, identify patterns, and track their personal growth over time.

Dark: Company Conversations #

The Company Conversations AutoGraph is a tool that captures and organizes discussions, decisions, and ideas shared within an organization. By transforming unstructured data into a dynamic knowledge graph, this system empowers employees to access and leverage collective knowledge, enhancing collaboration, alignment, and productivity across teams. Such a powerful graph contains many secrets, so nodes can be permissioned (unlike embeddings), with only the nodes you are permitted to see being pulled into your context.
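Node-level permissioning can be sketched as a filter that runs before anything reaches the model's context. The `Node` class, the group names, and the sample data below are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: str
    text: str
    allowed_groups: frozenset  # groups permitted to read this node


# Hypothetical sample data.
COMPANY_NODES = [
    Node("roadmap", "The team agreed to prioritize the search feature.", frozenset({"eng", "exec"})),
    Node("comp", "Compensation review notes.", frozenset({"hr"})),
    Node("allhands", "All-hands recap.", frozenset({"eng", "hr", "exec"})),
]


def visible_nodes(nodes, user_groups):
    """Keep only nodes that share at least one group with the user."""
    user_groups = set(user_groups)
    return [node for node in nodes if node.allowed_groups & user_groups]


def build_context(nodes, user_groups):
    """Join the text of every node the user is allowed to see."""
    return "\n".join(node.text for node in visible_nodes(nodes, user_groups))


eng_context = build_context(COMPANY_NODES, {"eng"})
```

Because the filter operates on discrete nodes, a node the user cannot see never enters the prompt at all, which is the property the paragraph above contrasts with embeddings.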

Fire: Customer Journey #

The Customer Journey AutoGraph is a tool that captures and maps the full spectrum of customer experiences, from initial contact to post-purchase interactions. By generating a comprehensive knowledge graph, this system enables businesses to understand customer needs, preferences, and pain points, facilitating personalized engagement and service delivery. Such an AutoGraph could enable a sales agent or customer service agent to effectively understand the nature and context of a user interaction.

Electric: Personal Finance #

The Personal Finance AutoGraph is an illuminating tool that helps individuals navigate the complex landscape of their financial lives. By organizing income, expenses, assets, and liabilities into a coherent knowledge graph, this system empowers users to gain clarity, identify opportunities, and make informed decisions to achieve their financial goals. Such a graph could be used and manipulated by budgeting, investing, or personal shopper agents to understand your financial landscape, goals, and day-to-day life.

Grass: Learning Journey #

The Learning Journey AutoGraph is a tool that supports individuals in their pursuit of knowledge and skill development. By capturing and linking learning experiences, resources, and results into a dynamic knowledge graph, learners can interact with AI systems that understand what they’ve encountered and how they’ve encountered it, and that can provide powerful insights to make learning easier. (For example, generating nodes in a graph based on lecture notes.) Systems built on this AutoGraph enable learners to track their progress, identify areas for improvement, and unlock new growth opportunities.

Fairy: Language Learning #

The Language Learning AutoGraph is a tool that builds a representation of a person’s language-learning journey by capturing what they are saying and how they are saying it. By capturing and categorizing moments of language learning (making mistakes, applying previous feedback, and using the language successfully) into an interconnected knowledge graph, this system empowers language-learning systems to help learners identify patterns, track their progress, and unlock new ways of expressing themselves in their target language.

Ice: Legal Case #

The Legal Case AutoGraph is a precise and interconnected tool that helps legal professionals navigate the complex landscape of case law, statutes, and precedents. By generating a knowledge graph that links key legal documents, arguments, and decisions, this system enables users to efficiently retrieve relevant information, identify patterns, and build strong cases. Adding and connecting case data while the knowledge graph is being built lets AI agents and systems work with the graph and layer their own insights on top of the existing connections.

Psychic: Medical Knowledge #

The Medical Knowledge AutoGraph is a tool that integrates patient data, symptoms, and medical expertise into a unified knowledge graph. By leveraging this interconnected web of information, health agents, doctor AIs, and human healthcare providers can gain a deeper understanding of each patient’s unique needs, facilitate accurate diagnoses, and develop personalized treatment plans.

Conclusion #

Thanks for exploring these ideas with me! If you want to use and explore AutoGraph, check out the repo here. If you have any feedback, message me on Twitter/X @samgbafa.