Apple explains how App Store reviews are summarized with AI

Apple has explained how its AI generates summaries of App Store reviews.

Apple’s iconic App Store was recently updated to feature AI-generated summaries of user reviews, and now we know how it all works.

In October 2024, an unlisted App Store article revealed that Apple wanted to summarize user application reviews with the help of artificial intelligence. Months later, in March 2025, the feature became available to the general public with the release of iOS 18.4.

While we already had a few details about Apple’s AI-generated review summaries, a new post on Apple’s Machine Learning blog explains the intricacies and specifics of the feature.

The characteristics and goals of AI-generated review summaries

The ultimate goal of these summaries is to provide users with a clear picture of an app’s reviews, so that they may more easily decide whether or not to purchase or install a particular application. In summarizing user reviews, however, Apple had to make sure that the AI output was up to date and that it didn’t include off-topic or offensive information.

App Store applications often receive updates, and changes such as new features, bug fixes, or in-app items often influence user reviews. App reviews themselves also vary by style, length, and even relevance. Apple’s AI summarization needed to account for all of these factors, so the company implemented a multi-step process.

How Apple’s AI summarizes user reviews

First, user reviews containing spam or profanity are filtered out. Eligible reviews are then passed through a series of large language models (LLMs), which extract key insights from them. After that, common themes are aggregated, and user sentiment is balanced. The result is an AI-generated summary that reflects broad user sentiment, with a length of 100 to 300 characters.
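To make the first step concrete, the filtering stage could be sketched roughly as follows. This is only an illustration: the `Review` type, the `BLOCKED_WORDS` and `SPAM_MARKERS` lists, and the function names are all hypothetical, and Apple has not said that its filters work by simple keyword matching.

```python
from dataclasses import dataclass

# Hypothetical lists for illustration; Apple's real filters are not public.
BLOCKED_WORDS = {"darn", "heck"}
SPAM_MARKERS = ("http://", "https://", "promo code")


@dataclass
class Review:
    text: str
    rating: int  # star rating, 1 to 5


def is_eligible(review: Review) -> bool:
    """Return True if the review passes the toy spam and profanity checks."""
    lowered = review.text.lower()
    # Strip basic punctuation so "heck!" still matches "heck".
    words = {word.strip(".,!?") for word in lowered.split()}
    if words & BLOCKED_WORDS:
        return False
    if any(marker in lowered for marker in SPAM_MARKERS):
        return False
    return True


def filter_reviews(reviews: list[Review]) -> list[Review]:
    """Keep only the reviews that would move on to insight extraction."""
    return [review for review in reviews if is_eligible(review)]
```

Only the reviews returned by `filter_reviews` would be handed to the later LLM stages; everything else is dropped before any summarization takes place.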

During the first phase of the process, known as “Insight Extraction,” user reviews are boiled down to distinct insights. Apple says that these insights encapsulate “one specific aspect of the review, articulated in standardized, natural language, and confined to a single topic and sentiment.”

“Dynamic Topic Modeling” lets Apple’s AI compare relevant topics across different reviews, so that the software can identify the most prominent topics discussed. The approach and terminology bear some resemblance to Apple’s AI test applications, which we outlined in 2024.

For each app, a set of topics, along with the “most representative” insights for those topics, is used by the AI to create the summary. The specially designed LLMs ensure that user sentiment is balanced and that the summaries maintain the required form and length.
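The topic selection described above can be illustrated with a small sketch. It assumes that each insight already carries a single topic and sentiment, that prominence simply means how often a topic appears, and that the "most representative" insight is the most frequently repeated one. Those heuristics, along with the `Insight` type and the function names, are assumptions made for illustration and not Apple's published method.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Insight:
    topic: str      # a single topic, e.g. "sync"
    sentiment: str  # "positive" or "negative"
    text: str       # standardized natural-language statement


def top_topics(insights: list[Insight], k: int = 2) -> list[str]:
    """Rank topics by how many insights mention them; return the top k."""
    counts = Counter(insight.topic for insight in insights)
    return [topic for topic, _ in counts.most_common(k)]


def representative(insights: list[Insight], topic: str) -> str:
    """Return the most frequently repeated insight text for a topic.

    Assumed heuristic: frequency stands in for "most representative".
    """
    texts = Counter(i.text for i in insights if i.topic == topic)
    return texts.most_common(1)[0][0]
```

A summarizing model would then receive only the top topics and their representative insights, rather than every raw review.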

During development, Apple’s AI-generated summaries were evaluated for characteristics such as groundedness, composition, helpfulness, and more. This part of the process involved human reviewers, which serves as an indication of how seriously Apple took its AI summary development.

Apple’s blog details all of the steps mentioned here, with more specific information on the technology used during each part of the process. All in all, the iPhone maker’s approach is designed to keep AI-generated summaries of user reviews accurate, helpful, spam-free, and up to date.
