UCL Centre for Digital Innovation

Technical Case Study: Graffinity’s technical evolution - Advancing search and knowledge graphs

Graffinity

6 June 2024

Graffinity, an EdTech startup leveraging machine learning to create searchable mind maps and accelerate comprehension for learners disadvantaged by large volumes of text, joined Cohort 4 of the UCL CDI Impact Accelerator to enhance the technical aspects of their digital innovation. Through participation in the Technical Needs Assessment workshop, they pinpointed three critical areas for improvement to focus on during the programme: revolutionising search capabilities, boosting the quality of their Knowledge Graph (KG), and refining the platform’s design. Their journey was marked by challenges, experimentation, and breakthroughs, culminating in significant technical advancements. 

Revolutionising Search Functionality 

Graffinity identified that their search functionality was underperforming in three key areas, necessitating improvements for a better user experience. 

1. Addressing Misspellings: The system was intolerant of misspellings, requiring a "fuzzier" approach to accommodate user typos while still functioning effectively. 

2. Supporting Concept Searches: Searches were limited to single entities (e.g., the name of a person or company) that had to exist within the Knowledge Graph (KG). They aimed to support searches on concepts even if these weren't directly represented in the KG. 

3. Enhancing Relevance: The system often returned irrelevant related entities, highlighting the need to enhance the relevance of the results. They sought a solution that was both robust and intuitive. 

Initially, the team turned to Amazon Kendra for its semantic search. However, Kendra could not interact directly with the KG, so integrating the two proved infeasible and Graffinity had to rethink their approach. They next explored the potential of Large Language Models (LLMs). They considered a Retrieval Augmented Generation (RAG) approach but encountered a catch-22: to implement RAG, they first needed the very search improvements they were hoping RAG would deliver. This led them to a different strategy: LLM-augmented search.

Using Anthropic’s Claude 2 foundation model via Amazon Bedrock, Graffinity worked with data scientists from ARC to devise a system in which each user query is passed to the LLM, which identifies the relevant entities. Those entities are then used to pull additional data from the KG. The results were striking. For instance, a user entering “Mab ceqth Shbajelaspear” would receive a graph displaying Macbeth and key characters from Shakespeare’s play, showing that the system can handle both misspellings and contextual searches. Concept searches, such as “what is the historical context for Macbeth,” yielded graphs connecting entities from the play with relevant figures from Scottish history.
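The flow described above (query to LLM, LLM to entity names, entity names to KG lookup) can be sketched roughly as follows. This is a minimal illustration rather than Graffinity’s implementation: the prompt wording, the function names, the `TOY_KG` data and the `fake_llm` stand-in are all assumptions made for the example. In a real system the `llm` callable would wrap a call to the hosted model, and the KG lookup would be a graph database query.

```python
import json
from typing import Callable, Dict, List

# Toy knowledge graph: entity name -> related entity names (illustrative data only).
TOY_KG: Dict[str, List[str]] = {
    "Macbeth": ["Lady Macbeth", "Banquo", "Duncan"],
    "William Shakespeare": ["Macbeth", "Hamlet"],
}


def build_prompt(query: str) -> str:
    """Ask the model for canonical entity names as a JSON list (hypothetical wording)."""
    return (
        "Identify the entities the user is most likely searching for, "
        "correcting any misspellings. Reply with a JSON list of names only.\n"
        f"Query: {query}"
    )


def parse_entities(reply: str) -> List[str]:
    """Parse the model reply; return no entities if the reply is malformed."""
    try:
        names = json.loads(reply)
    except json.JSONDecodeError:
        return []
    if not isinstance(names, list):
        return []
    return [name for name in names if isinstance(name, str)]


def llm_augmented_search(
    query: str,
    llm: Callable[[str], str],
    kg: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Return {entity: neighbours} for each entity the model names that exists in the KG."""
    entities = parse_entities(llm(build_prompt(query)))
    return {name: kg[name] for name in entities if name in kg}


def fake_llm(prompt: str) -> str:
    """Stand-in for the hosted model so the sketch runs offline."""
    return '["Macbeth", "William Shakespeare"]'


if __name__ == "__main__":
    print(llm_augmented_search("Mab ceqth Shbajelaspear", fake_llm, TOY_KG))
```

Keeping the model behind a plain callable means the spelling correction and concept resolution stay in the LLM step, while the KG remains the only source of the entities and relationships that are actually displayed.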

Enhancing the Knowledge Graph 

With search capabilities on a new trajectory, Graffinity turned to improving the KG itself. The goal was to extract entities and relationships from unstructured text to enrich their existing data. 

Here, they explored Amazon Comprehend’s beta features for relationship extraction. While promising, these fell short on entity disambiguation, which was crucial for their needs. Returning to Bedrock and Claude 2, they used the LLM to extract entities and relationships from text, and this approach worked well. A proof-of-concept scraped websites to generate KGs, with plans to let users upload their own text. By adding Amazon Transcribe, Graffinity envisioned a future where video and audio files could also be converted into rich KGs.
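As a rough sketch of what LLM-based extraction might look like, the example below asks a model for subject/relation/object triples in JSON and merges the valid ones into a graph held as a set of triples. The prompt text, the JSON shape and every function name here are assumptions for illustration; the source does not describe the actual prompts or output format used.

```python
import json
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def build_extraction_prompt(text: str) -> str:
    """Hypothetical prompt asking for relationships as a JSON list of objects."""
    return (
        "Extract entities and relationships from the text below. "
        'Reply with a JSON list of objects with keys "subject", "relation", "object".\n'
        f"Text: {text}"
    )


def parse_triples(reply: str) -> List[Triple]:
    """Keep only well-formed triples; skip anything malformed or empty."""
    try:
        items = json.loads(reply)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    triples: List[Triple] = []
    for item in items:
        if not isinstance(item, dict):
            continue
        values = (item.get("subject"), item.get("relation"), item.get("object"))
        if all(isinstance(v, str) and v.strip() for v in values):
            triples.append(tuple(v.strip() for v in values))
    return triples


def merge_triples(graph: Set[Triple], triples: Iterable[Triple]) -> int:
    """Add triples to the graph in place and return how many were new."""
    before = len(graph)
    graph.update(triples)
    return len(graph) - before


if __name__ == "__main__":
    # Canned reply standing in for a real model response.
    reply = json.dumps([
        {"subject": "Macbeth", "relation": "KILLS", "object": "Duncan"},
        {"subject": "Macbeth", "relation": "", "object": "Banquo"},
    ])
    graph: Set[Triple] = set()
    print(merge_triples(graph, parse_triples(reply)), sorted(graph))
```

Validating the model output before it reaches the graph matters because a free-text model can return incomplete or malformed records. Disambiguation (deciding that two names refer to the same entity) is not handled in this sketch.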

Optimising Platform Design 

Graffinity’s Neo4j graph database was running in a single container within their Amazon EKS cluster, lacking persistent storage and scalability. They needed a more robust solution.

Amazon Neptune presented the answer. Because Neptune supports the openCypher query language, Graffinity was able to create a serverless cluster and load their graph database from the same CSV files used for Neo4j. Their existing Cypher queries worked without modification, so the web application moved to a more scalable, enterprise-grade database environment without changes to its query code.
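One way to picture why unchanged queries make such a migration cheap is to keep the Cypher text separate from the database connection, so only the connection layer differs between backends. The sketch below is an assumption-laden illustration: the `Entity` label, the `name` property, the `GraphBackend` interface and the `RecordingBackend` stand-in are all hypothetical and are not taken from Graffinity’s codebase.

```python
from typing import Any, Dict, List, Protocol, Tuple

# Hypothetical schema: nodes labelled Entity with a `name` property.
RELATED_ENTITIES = (
    "MATCH (e:Entity {name: $name})-[r]-(n:Entity) "
    "RETURN n.name AS name, type(r) AS relation"
)


class GraphBackend(Protocol):
    """Anything that can run a parameterised Cypher query and return rows."""

    def run(self, query: str, params: Dict[str, Any]) -> List[Dict[str, Any]]: ...


def related_entities(backend: GraphBackend, name: str) -> List[Dict[str, Any]]:
    """Application code depends only on the query text and the backend interface."""
    return backend.run(RELATED_ENTITIES, {"name": name})


class RecordingBackend:
    """Stand-in backend for local testing: records calls and returns canned rows."""

    def __init__(self, rows: List[Dict[str, Any]]) -> None:
        self.rows = rows
        self.calls: List[Tuple[str, Dict[str, Any]]] = []

    def run(self, query: str, params: Dict[str, Any]) -> List[Dict[str, Any]]:
        self.calls.append((query, params))
        return self.rows


if __name__ == "__main__":
    backend = RecordingBackend([{"name": "Banquo", "relation": "FRIEND_OF"}])
    print(related_entities(backend, "Macbeth"))
```

With this shape, a Neo4j-backed and a Neptune-backed implementation of `run` could both receive the same `RELATED_ENTITIES` string, which mirrors the behaviour reported above.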

A Technical Triumph 

Graffinity’s technical journey on the UCL CDI Accelerator was transformative. By overcoming search limitations, enhancing their KG, and optimising their database design, they not only improved user experience but set a new standard for their platform’s capabilities. Through innovation and resilience, Graffinity emerged stronger, ready to tackle future challenges and seize new opportunities.