How to analyse customer feedback using ChatGPT

Given a dataset of customer reviews, we want to understand specific details (for example, about the gym or the beverages served at breakfast). These details are buried inside the broad “Hotel facilities” and “Breakfast” topics. LLMs can help us with this analysis and save many hours of reading through customer reviews. Here are some possible approaches:
1. Keyword approach

The most straightforward way to find comments related to a specific topic is to search for particular words in the text, like “gym” or “drink”. I used this approach many times before ChatGPT existed. Its problems are pretty obvious:
- You will get quite a lot of irrelevant comments, for example about gyms nearby or alcoholic drinks in the hotel restaurant. Such filters are not specific enough and can’t take context into account, so you will get a lot of false positives.
- On the other hand, coverage might not be good enough either. People tend to use slightly different words for the same things (for example, drinks, refreshments, beverages, juices, etc.), there might be typos, and the task becomes even more convoluted if your customers speak different languages.

So this approach has problems with both precision and recall. It will give you a rough understanding of the question, but its capabilities are limited (a minimal sketch of such a filter appears at the end of this section).

2. Topic modelling on all the comments

The other potential solution is to send all customer comments to the LLM and ask the model to define whether each one is related to our topic of interest (beverages at breakfast or the gym). We can even ask the model to sum up all the customer feedback and provide a conclusion. This approach is likely to work pretty well. However, it has its limitations too: you will need to send all the documents you have to the LLM each time you want to dive deeper into a particular topic. Even with high-level filtering based on the topics we defined, it might be quite a lot of data to pass to the LLM, which makes it rather costly or outright impossible (see the second sketch below).

3. Retrieval-augmented generation (RAG)

We have a set of documents (customer reviews), and we want to ask questions related to the content of these documents (for example, “What do customers like about breakfast?”). As we discussed, we don’t want to send all the customer reviews to the LLM, so we need a way to select only the most relevant ones. Then the task becomes pretty straightforward: pass the user’s question and these documents as context to the LLM, and that’s it.

The RAG pipeline consists of the following stages (an end-to-end sketch closes this section):
- Loading documents from the data sources we have.
- Splitting documents into chunks that are easy to work with further.
- Storage: vector stores are often used for this use case to process data effectively.
- Retrieval of the documents relevant to the question.
- Generation: passing the question and the relevant documents to the LLM and getting the final answer.
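To make these approaches concrete, here are a few minimal sketches. First, the keyword filter from approach 1; the sample reviews and the keyword list are made up for illustration:

```python
import re

# Hypothetical sample reviews; in practice these come from your dataset.
reviews = [
    "Loved the breakfast, great selection of juices and fresh coffee.",
    "Drinks at the pool bar were overpriced.",
    "Nice cocktails in the evening.",
]

# Naive topic filter: match any listed keyword as a whole word.
BEVERAGE_KEYWORDS = ["drink", "drinks", "beverage", "beverages", "juice", "coffee", "tea"]
pattern = re.compile(r"\b(?:" + "|".join(BEVERAGE_KEYWORDS) + r")\b", re.IGNORECASE)

for review in reviews:
    if pattern.search(review):
        print(review)
# Prints reviews 1 and 2. Review 2 is a false positive (the bar, not breakfast),
# and review 3 is missed entirely: "cocktails" is not in the keyword list.
# Note also that "juices" in review 1 doesn't match the singular "juice" —
# only "coffee" saves it. This is the precision/recall problem in miniature.
```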
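Approach 2, sending everything to the model at once, might look like the sketch below. It assumes the OpenAI Python SDK (openai >= 1.0) with an API key in the environment; the prompt and the model name are illustrative, not prescriptive:

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def summarise_topic(reviews: list[str], topic: str) -> str:
    """Send every review to the model in one prompt and ask for a summary."""
    # This is exactly the limitation described above: the prompt grows with
    # the number of reviews, so a large dataset becomes costly or simply
    # won't fit into the context window.
    numbered = "\n".join(f"{i + 1}. {review}" for i, review in enumerate(reviews))
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # example model name
        messages=[
            {"role": "system", "content": "You analyse hotel customer reviews."},
            {"role": "user", "content": (
                f"Summarise what customers say about {topic} "
                f"in the following reviews:\n{numbered}"
            )},
        ],
    )
    return response.choices[0].message.content

print(summarise_topic(
    ["Great juices at breakfast!", "The gym equipment is outdated."],
    "beverages at breakfast",
))
```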
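Finally, a minimal end-to-end version of the RAG pipeline from approach 3. This is a sketch under the same SDK assumption: a real setup would load and chunk documents properly and use a dedicated vector store instead of an in-memory NumPy array, and the model names are examples:

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

# Stages 1-2 (loading and splitting): here each review is short enough
# to serve as its own chunk.
chunks = [
    "Breakfast had a wonderful choice of fresh juices and smoothies.",
    "The gym is small and the treadmill was broken.",
    "Rooms were clean and the staff was friendly.",
]

def embed(texts: list[str]) -> np.ndarray:
    # Stage 3 (storage): embed the chunks; a vector store would hold these
    # vectors in production, but a NumPy array is enough for the sketch.
    response = client.embeddings.create(
        model="text-embedding-3-small",  # example embedding model
        input=texts,
    )
    vectors = np.array([item.embedding for item in response.data])
    return vectors / np.linalg.norm(vectors, axis=1, keepdims=True)

doc_vectors = embed(chunks)

def answer(question: str, top_k: int = 2) -> str:
    # Stage 4 (retrieval): cosine similarity between the question and each chunk.
    q = embed([question])[0]
    best = np.argsort(doc_vectors @ q)[::-1][:top_k]
    context = "\n".join(chunks[i] for i in best)
    # Stage 5 (generation): pass the question plus only the retrieved chunks.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # example model name
        messages=[{
            "role": "user",
            "content": (
                f"Answer using only this context:\n{context}\n\n"
                f"Question: {question}"
            ),
        }],
    )
    return response.choices[0].message.content

print(answer("What do customers like about breakfast?"))
```

The key design point is that only the `top_k` most relevant chunks reach the LLM, which is what keeps the cost fixed no matter how many reviews you have.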