The development of large language models (LLMs) has exploded in recent years, leaving many organisations asking ‘how can we benefit from LLMs?’ But before you start on the LLM journey, it is vital to understand your use cases: the areas of the organisation where LLMs will add the most business value. This helps avoid disappointing results, products that don’t cut it and investment going down the drain.
Why proof of concept is important
Short proof-of-value projects can be extremely beneficial in helping organisations understand the benefit of LLMs. And proving business value is not just a technical matter. Embedding an LLM application causes significant process changes, so implementation needs to align with management strategy, involve end users and train those users in new skills.
LLMs in action at ProRail
ProRail, the Dutch railway management company, was keen to understand the value of LLMs. As a result, they initiated five proof-of-concept projects, so-called ‘pressure cooker’ projects, to investigate the value of LLMs for various business cases in their organisation. Valcon was asked to conduct one of these projects.
LLMs in use in the railway sector
ProRail is responsible for setting up and enforcing regulations for the safe and accessible design of railway stations across Holland. These design regulations prescribe, for example, the minimum width of a platform, the amount of free space in front of an elevator entrance and the required space for bicycle parking.
Together with thousands of other documents, the design regulations are stored in a large database, the Rail Infra Catalogue (RIC). Contractors, architects and inspectors regularly access this database to search for information they might need.
But the large size of the database and the limited search engine were stopping users from getting the right information quickly. ProRail was keen to understand how LLMs could improve RIC, so it asked Valcon to embark on a pressure cooker project to assess how RIC could use an LLM for information retrieval.
Meet RICO the chatbot
In just a week, using a team of data scientists and business change consultants, Valcon built RICO, a chatbot prototype that could answer questions about some of the design regulations in the RIC. RICO is a so-called RAG chatbot. RAG stands for Retrieval-Augmented Generation and involves a two-step approach to answering user questions. First, RICO searches the rail catalogue for the passages that contain the information it needs to answer the question. Then, an LLM answers the question based on the retrieved passages.
For instance, if a user asks about the minimum width of a platform, RICO will identify the sections in the catalogue that discuss platform widths and then generate an easy-to-understand response that answers the question.
While RICO looks similar to ChatGPT, the key distinction is that RICO is instructed to use only information from the catalogue in its answers and to refuse to answer questions outside its scope.
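To make the two-step approach concrete, here is a minimal Python sketch. It is not RICO's implementation, which the project did not publish. The passages, their ids, the keyword-overlap scoring and the prompt wording are all illustrative assumptions, and the call to an actual LLM is left out.

```python
import re

# Hypothetical placeholder passages, NOT real RIC content.
PASSAGES = {
    "DOC-A/1": "The platform width must meet the minimum value set for the station type.",
    "DOC-B/2": "Free space must be kept clear in front of every elevator entrance.",
    "DOC-C/3": "Each station must provide space for bicycle parking.",
}

# Small illustrative stopword list so common words do not count as matches.
STOPWORDS = {"a", "an", "the", "of", "in", "for", "is", "be", "must", "what", "who", "to"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword words."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def retrieve(question, passages, top_k=2):
    """Step 1: return ids of the passages sharing the most words with the question.

    Keyword overlap is an assumed stand-in for a real search or embedding index.
    """
    q = tokenize(question)
    scored = [(len(q & tokenize(text)), pid) for pid, text in passages.items()]
    hits = sorted(
        (item for item in scored if item[0] > 0),
        key=lambda item: (-item[0], item[1]),
    )
    return [pid for _, pid in hits[:top_k]]


def build_prompt(question, passage_ids, passages):
    """Step 2 input: a prompt that restricts the LLM to the retrieved passages.

    Returns None when nothing was retrieved, so the caller can refuse
    instead of letting the model answer from general knowledge.
    """
    if not passage_ids:
        return None
    context = "\n".join(f"[{pid}] {passages[pid]}" for pid in passage_ids)
    return (
        "Answer using only the passages below and cite their ids. "
        "If they do not contain the answer, say you cannot answer.\n\n"
        f"Passages:\n{context}\n\nQuestion: {question}"
    )
```

With these placeholder passages, `retrieve("What is the minimum width of a platform?", PASSAGES)` returns `["DOC-A/1"]`, while an out-of-scope question such as "Who won the football match?" retrieves nothing and `build_prompt` returns `None`, which is one simple way to implement a refusal.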
Demonstrating value quickly
After one week, Valcon demonstrated RICO to end users to show its business value. RICO could answer most questions correctly and was able to identify and explain ambiguous and conditional information in the documents. To facilitate information validation, RICO includes references to the original, relevant passages in its answers. End users found RICO to be a significant improvement over the search engine they had been using.
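References only help validation if they point at passages that were actually retrieved. The Python sketch below shows one possible check, assuming answers cite passages with bracketed ids such as `[DOC-A/1]`. That citation format and the function names are hypothetical; the source does not describe how RICO formats its references.

```python
import re


def cited_ids(answer):
    """Return every bracketed id in the answer, in order of appearance."""
    return re.findall(r"\[([^\[\]]+)\]", answer)


def check_citations(answer, retrieved_ids):
    """Flag answers with no reference, or with references to passages
    that were not among those handed to the model."""
    cited = cited_ids(answer)
    unknown = [pid for pid in cited if pid not in retrieved_ids]
    return {"cited": cited, "unknown": unknown, "has_citation": bool(cited)}
```

An answer whose `unknown` list is non-empty, or whose `has_citation` is false, could be held back or marked for manual checking against the catalogue.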
Valcon also identified opportunities for improvement and outlined the risks of full-scale implementation. First, RICO would benefit from an improved document management system, which would take time and effort but would eventually lead to higher-quality answers.
Secondly, despite the improvement over the existing search engine, RICO’s answers cannot be guaranteed to be 100% accurate, because an LLM generates its output probabilistically. Users therefore need to know that they should consult the relevant passages in RIC to confirm RICO’s answers. This highlights the importance of involving end users in demonstrations and test sessions during development.
Business value
The productive collaboration between ProRail and Valcon clearly demonstrates the value of the pressure cooker approach in quickly validating innovative LLM solutions. A pressure cooker project is a low-risk way for organisations to get a swift idea of the impact AI might have on them and the business value it can create.
To understand how large language models can assist in your business processes and find out about chatbot pressure cookers, please get in touch with [email protected].