A conversation between your users and LLMs consists of multiple interactions.
Previously, our platform focused on analyzing user interactions with LLMs at the interaction level, meaning that we computed metrics such as topics, user intents, and user issues for each individual interaction. Now, we’ve expanded our capabilities to also provide insights at the conversation level, where multiple interactions are summarized to give a more holistic view of the entire user conversation.
• Conversation-Level Analysis (Aggregate Mode).
In this mode, we provide insights at the conversation level, offering a holistic view of entire user conversations with LLMs. By aggregating multiple interactions within a conversation, this mode highlights key metrics like overall topics, user intents, and recurring problems.
• Interaction-Level Analysis (Detailed Mode).
This mode focuses on the individual interactions within each conversation, enabling a more granular analysis. By breaking down each interaction separately, it captures specific user intents, topics, and problems at each step of the conversation.
You can choose between the two modes using the button in the top right corner of the platform.
We’ve also added a “Conversations” page, where you can view individual conversations, similar to how you could previously view interactions on the interactions page.
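To make the distinction between the two modes concrete, here is a minimal sketch of how interaction-level metrics could roll up into a conversation-level summary. The field names and aggregation choices are illustrative assumptions, not the platform’s actual schema:

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical shapes, for illustration only; these are not the platform's data model.
@dataclass
class Interaction:
    topic: str
    user_intent: str
    has_issue: bool

@dataclass
class ConversationSummary:
    top_topics: list[str]
    top_intents: list[str]
    issue_rate: float

def summarize(interactions: list[Interaction]) -> ConversationSummary:
    """Roll interaction-level metrics up into a conversation-level view."""
    topics = Counter(i.topic for i in interactions)
    intents = Counter(i.user_intent for i in interactions)
    issues = sum(i.has_issue for i in interactions)
    return ConversationSummary(
        top_topics=[t for t, _ in topics.most_common(3)],
        top_intents=[t for t, _ in intents.most_common(3)],
        issue_rate=issues / len(interactions) if interactions else 0.0,
    )
```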
We’re excited to introduce a new overview page for LLM performance issues, along with a global metric to help you assess how your LLM is performing based on user interactions: the LLM User Error Rate. This metric provides a clear and actionable representation of your LLM’s performance from the users’ perspective. Additionally, we’ve added benchmarks to compare your performance against market data, providing better context for evaluation.
Here’s how the LLM User Error Rate is categorized:
• “Optimal” - Error rate < 10%
The LLM is performing exceptionally well, with minimal user frustration.
• “Moderate” - Error rate between 10% and 20%
The LLM’s performance is generally acceptable, though some users are encountering issues. In this case, we highly recommend reviewing the “Problems Identified” section and addressing those issues to improve the user experience and bring the metric closer to “Optimal.”
• “Critical” - Error rate > 20%
The LLM’s performance is suboptimal, with a significant number of users experiencing frustration. It’s highly recommended to immediately investigate the “Problems Identified” section and prioritize solving these issues to enhance the user experience and reduce the error rate.
This new metric will help you better understand and improve your LLM’s performance, ensuring you can take actionable steps toward achieving optimal user satisfaction.
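For illustration, here is a minimal sketch of how an error rate could be bucketed into these bands. It assumes the rate is expressed as the fraction of interactions in which users hit an issue; this is an illustrative reading, not the platform’s exact definition:

```python
def categorize_llm_user_error_rate(error_rate: float) -> str:
    """Bucket an LLM User Error Rate (a fraction between 0 and 1) into the bands above."""
    if error_rate < 0.10:
        return "Optimal"   # < 10%: minimal user frustration
    if error_rate <= 0.20:
        return "Moderate"  # 10-20%: review "Problems Identified"
    return "Critical"      # > 20%: investigate and prioritize fixes

# e.g. 14 problematic interactions out of 100 -> "Moderate"
print(categorize_llm_user_error_rate(14 / 100))
```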
New & Improved:
• Improved the chart duplication UX: when duplicating a chart, you can now directly select the report you want to duplicate it into.
Fixes:
• Fixed a bug that prevented scrolling directly to the new chart when duplicating a chart in the same report.
• Fixed a bug in user filters that prevented the filters from being applied correctly.
• Improved the general stability of the reports page.
• Interactive Chart Reports
You can now click directly on points of interest (e.g., spikes) within your report charts to view the associated interactions, user intents, or topics. This feature allows for a more detailed and intuitive exploration of data trends.
• Enhanced Navigation for Horizontal Charts
For horizontal charts with breakdowns, you can now seamlessly navigate to the detailed user-intelligence page, just as you would when interacting with the user-intelligence tables, providing a consistent and efficient user experience.
We’ve improved the clarity of the interaction details page by clearly distinguishing between user messages and assistant responses, making it easier to follow conversations.
• Translation UI Improvements
We’ve added a new feature to the translation UI that allows you to switch back to the original text after a message (or conversation) has been translated, offering greater flexibility in viewing content.
Now, when you hover over a chart (whether a line chart or horizontal bar chart) with a breakdown applied, the selected breakdown value is highlighted relative to the other lines or bars, enhancing the visibility and clarity of the chart data.
We added support for the two latest models released by OpenAI:
• o1-preview
• o1-mini
New & Improved:
• On the reports page, when a new chart is created, the page now automatically scrolls to the bottom where the chart has been added, ensuring you can easily view it right away.
• We have removed the global filters from reports, as they were causing confusion with the dedicated report filters; this streamlines the filtering process for better clarity.
Fixes:
• Fixed a bug preventing the sharing of user-intelligence pages.
On the user intelligence page, you can now customise the tabs to suit your preferences.
You can activate or deactivate the data types you’re interested in and arrange them in your desired order. Simply click the edit button on the right side of the tabs component and drag and drop the tab names to reorder them.
We have also added the ability to sort charts in reports.
Simply click the “Edit order” button in the top right corner of the report page. You can then rearrange the charts by dragging and dropping their names to your desired positions. This feature gives you greater control over how your data is displayed and shared with different stakeholders.
For applications where users interact in languages other than English, we added the option to translate the entire conversation instead of just a single message. This full-conversation translation feature greatly enhances the user experience by providing a broader view of what the user discussed across multiple messages.
We have improved the clarity of the retention charts. The configurable parameters and retention descriptions now explain the chart functionality more effectively. It is also easier to understand which parameters need to be modified to obtain the desired retention chart.
New & Improved:
• The granularity in the overview page charts now automatically switches to hourly when a single-day time range is selected on the platform.
• Default granularity values have been improved across the platform.
Fixes:
• Fixed several UI bugs across the platform, resulting in a cleaner user interface and improved user experience.
• Fixed a bug on the report page that directed users to the wrong page when clicking on the breadcrumb.
• Fixed a bug in the average sentiment score rounding.
To give better visibility into trends for quantities like topics, behavioural alerts, and user intents, we updated the user intelligence page. You can now choose between two different visualizations:
• Table
Where you continue to see the information you are used to, presented in tabular form.
• Chart
Where you can select the topics, intents, or behavioural alerts you are most interested in and visualize them as a time series over the selected time range. This helps you analyse trends in the time frames that matter most to you.
For each user metric (such as a specific topic), you can choose to visualize the trend of various related “primary metrics.” For example, you might explore:
• The trend of interactions related to the topic
• The LLM error rate associated with the topic, and more
To visualize these trends, simply select the primary metric of interest using the button below:
To adjust the granularity of the x-axis, you can select from “Hour,” “Day,” “Week,” “Month,” or “Year.”
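As a rough illustration of what the granularity setting does to a trend, here is a sketch that buckets interaction timestamps by truncating them to the selected granularity and counts interactions per bucket. The bucketing rules (e.g. weeks starting on Monday) are assumptions made for the example, not necessarily how the platform aggregates:

```python
from collections import Counter
from datetime import datetime, timedelta

def bucket(ts: datetime, granularity: str) -> datetime:
    """Truncate a timestamp to the start of its bucket."""
    day = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if granularity == "Hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if granularity == "Day":
        return day
    if granularity == "Week":
        return day - timedelta(days=day.weekday())  # assume weeks start on Monday
    if granularity == "Month":
        return day.replace(day=1)
    if granularity == "Year":
        return day.replace(month=1, day=1)
    raise ValueError(f"Unknown granularity: {granularity}")

# Count interactions per bucket, e.g. the daily trend for one topic:
timestamps = [datetime(2024, 9, 12, 9, 30), datetime(2024, 9, 12, 17, 5), datetime(2024, 9, 13, 8, 0)]
daily_trend = Counter(bucket(ts, "Day") for ts in timestamps)
```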
To provide a more comprehensive analysis of how your users are interacting with the chatbot, we have added two more metrics to the user intelligence page for you to monitor:
• User emotions
We selected a set of 27 emotions based on psychological literature.
For each interaction, we detect whether the user is expressing one or more of these emotions.
• User sentiment
Each interaction is classified into one of six sentiment categories:
• Very Negative. When the user uses highly explicit negative terms or expresses a strongly opinionated negative view.
• Negative. When the user is irritated, complaining, or mildly insulting the assistant.
• Neutral. No discernible sentiment is detected.
• Mixed. Both positive and negative sentiments are expressed in the interaction.
• Positive. When the user praises something, either implicitly or explicitly.
• Very Positive. When the user is enthusiastic about something.
Both metrics are computed at the interaction level and can be visualized on the User Intelligence page as well as in reports.
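To make the shape of these per-interaction metrics concrete, here is a hypothetical representation using the six sentiment categories above and a list of detected emotions. The class names, field names, and emotion labels are illustrative assumptions, not the platform’s API:

```python
from dataclasses import dataclass, field
from enum import Enum

class Sentiment(Enum):
    VERY_NEGATIVE = "Very Negative"
    NEGATIVE = "Negative"
    NEUTRAL = "Neutral"
    MIXED = "Mixed"
    POSITIVE = "Positive"
    VERY_POSITIVE = "Very Positive"

@dataclass
class InteractionMetrics:
    sentiment: Sentiment
    emotions: list[str] = field(default_factory=list)  # zero or more of the 27 detected emotions

# Example: an interaction where the user is clearly pleased (emotion labels are illustrative).
metrics = InteractionMetrics(sentiment=Sentiment.POSITIVE, emotions=["gratitude", "joy"])
```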
For applications where users interact in languages other than English (such as Spanish, German, etc.), you can now easily translate raw interactions into English using the “Translate to English” feature.
This functionality simplifies platform use in multilingual environments, eliminating the need to manually translate what your users are saying.
To activate translation, simply click the “Translation” button within the details of the raw interaction.
To simplify using the User Retention page, you can now easily select the “Retention Frequency” from:
• “Daily”: users coming back every day
• “Weekly”: users coming back every week
• “Monthly”: users coming back every month
You can also select the “Starting Date”, which is the date from which you want to begin analyzing retention.
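For context on what the retention chart measures, here is a minimal sketch of a cohort-retention calculation. It assumes retention is the share of users active in the starting period who come back in each subsequent period, with “Monthly” approximated as 30 days; the platform’s exact definition and calendar handling may differ:

```python
from datetime import date, timedelta

# Illustrative period lengths; real calendar months vary in length.
PERIODS = {"Daily": timedelta(days=1), "Weekly": timedelta(weeks=1), "Monthly": timedelta(days=30)}

def retention_curve(activity: dict[str, list[date]], start: date, frequency: str, n_periods: int) -> list[float]:
    """Share of the starting cohort that returns in each subsequent period."""
    step = PERIODS[frequency]
    cohort = {u for u, days in activity.items() if any(start <= d < start + step for d in days)}
    if not cohort:
        return [0.0] * n_periods
    curve = []
    for n in range(1, n_periods + 1):
        lo, hi = start + n * step, start + (n + 1) * step
        returned = sum(1 for u in cohort if any(lo <= d < hi for d in activity[u]))
        curve.append(returned / len(cohort))
    return curve
```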
Use the “copy link” button to copy the link to your clipboard.