Overview of the ML Kit GenAI APIs

ML Kit's GenAI APIs use Gemini Nano to perform tasks in your apps on-device. These APIs provide out-of-the-box quality for popular use cases through a high-level interface. They are built on top of AICore, an Android system service that runs GenAI foundation models on-device. Because data is processed locally, apps can offer richer functionality while improving user privacy.

The ML Kit GenAI APIs support the following features:

  • Summarization: Summarize articles or chat conversations as a bulleted list.
  • Proofreading: Polish short content by refining grammar and fixing spelling errors.
  • Rewriting: Rewrite short messages in different tones or styles.
  • Image description: Generate a short description of a given image.

Benefits of GenAI APIs

Like other ML Kit features, the GenAI APIs run entirely on-device, which provides the following benefits:

  • Input, inference, and output data are processed locally
  • Functionality remains the same without a reliable internet connection
  • No additional server cost is incurred for each API call

In addition, because the GenAI APIs are built on top of AICore and powered by Gemini Nano, every app uses the same shared Gemini Nano model on the device. An app doesn't have to wait for a download if the model already exists on the device, and sharing a single copy conserves storage space. AICore also isolates requests to protect privacy.

Streaming versus non-streaming

ML Kit GenAI APIs offer both streaming and non-streaming options for receiving results. The streaming API delivers responses incrementally as they are generated, providing a continuous flow of data. In contrast, the non-streaming API waits until the entire response is complete before returning it as a single block.

Choose the streaming API for lengthy responses, as it allows for quicker initial feedback. The non-streaming API is more suitable for short responses or when processing results in batches.
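The difference between the two delivery modes can be sketched in plain Java. This is only an illustration, not the ML Kit API: FakeSummarizer, run, and runStreaming are hypothetical names, and the pre-split list of chunks stands in for text that a model would produce incrementally.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for an on-device generator; NOT an ML Kit class.
final class FakeSummarizer {
    private final List<String> chunks;

    FakeSummarizer(List<String> chunks) {
        this.chunks = chunks;
    }

    // Non-streaming: the caller sees nothing until the whole result is assembled.
    String run() {
        StringBuilder full = new StringBuilder();
        for (String chunk : chunks) {
            full.append(chunk);
        }
        return full.toString();
    }

    // Streaming: each chunk is handed to the callback as soon as it is produced.
    // The complete text is also returned for callers that want the final result.
    String runStreaming(Consumer<String> onNewText) {
        StringBuilder full = new StringBuilder();
        for (String chunk : chunks) {
            onNewText.accept(chunk);
            full.append(chunk);
        }
        return full.toString();
    }
}
```

With streaming, a UI can append each chunk to the screen as it arrives; with the non-streaming call, it renders once when the full string is returned.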

Device support

The ML Kit GenAI APIs are available on the following devices, with plans to expand support to additional devices:

  • Google: Pixel 9, Pixel 9 Pro, Pixel 9 Pro XL, Pixel 9 Pro Fold
  • Honor: Magic 7 Pro, Magic 7
  • iQOO: iQOO 13
  • Motorola: Razr 60 Ultra
  • OnePlus: OnePlus 13, OnePlus 13s
  • OPPO: Find N5, Find X8, Find X8 Pro
  • POCO: POCO F7 Ultra
  • realme: realme GT 7 Pro
  • Samsung: Galaxy S25, Galaxy S25+, Galaxy S25 Ultra
  • vivo: vivo X200, vivo X200 Pro
  • Xiaomi: Xiaomi 15 Ultra, Xiaomi 15

Availability of specific language support may vary depending on the particular device's configuration and the models that have been downloaded to the device.

Quota per application

AICore enforces an inference quota per app. Making too many GenAI API requests in a short period results in an ErrorCode.BUSY response. If you receive this error, consider retrying the request with exponential backoff.
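A minimal sketch of exponential backoff, assuming the busy condition surfaces as an exception. BusyException, Backoff, and all parameter names are hypothetical and only stand in for however your code detects ErrorCode.BUSY; the delay values are illustrative, not recommended settings.

```java
import java.util.function.LongConsumer;
import java.util.function.Supplier;

// Hypothetical exception standing in for a failure that carries ErrorCode.BUSY.
final class BusyException extends RuntimeException {
    BusyException() {
        super("BUSY");
    }
}

final class Backoff {
    // Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
    static long delayMs(int attempt, long baseMs, long maxMs) {
        long delay = baseMs << Math.min(attempt, 20); // bound the shift to avoid overflow
        return Math.min(delay, maxMs);
    }

    // Runs `request`, retrying only on BusyException, for at most maxAttempts tries.
    // `sleeper` receives the wait in milliseconds; injecting it keeps the sketch testable.
    static <T> T retryOnBusy(Supplier<T> request, int maxAttempts,
                             long baseMs, long maxMs, LongConsumer sleeper) {
        for (int attempt = 0; ; attempt++) {
            try {
                return request.get();
            } catch (BusyException e) {
                if (attempt + 1 >= maxAttempts) {
                    throw e; // out of attempts: surface the error to the caller
                }
                sleeper.accept(delayMs(attempt, baseMs, maxMs));
            }
        }
    }
}
```

In an app, the sleeper would typically schedule the retry off the main thread rather than block it. Adding random jitter to each delay is a common refinement that this sketch leaves out.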

Background usage

GenAI API inference is permitted only when the app is the top foreground application. Calling the API while the app is not in the foreground, including from a foreground service, results in an ErrorCode.BUSY response because there is currently no quota for background usage.
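One way to avoid spending requests that are bound to fail is to gate calls on the app's own foreground state. The sketch below is an assumption-laden illustration: ForegroundGate and its methods are hypothetical, and it assumes your app flips the flag from its lifecycle callbacks (for example, an activity's onResume and onPause). Whether that signal matches AICore's definition of "top foreground application" exactly is not confirmed here.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Hypothetical helper; NOT part of ML Kit or AICore.
final class ForegroundGate {
    private final AtomicBoolean inForeground = new AtomicBoolean(false);

    // Call when the app becomes visible and active.
    void onForeground() {
        inForeground.set(true);
    }

    // Call when the app leaves the foreground.
    void onBackground() {
        inForeground.set(false);
    }

    // Runs the request only while the flag says the app is in the foreground;
    // otherwise skips it and returns an empty Optional.
    <T> Optional<T> runIfForeground(Supplier<T> request) {
        if (!inForeground.get()) {
            return Optional.empty();
        }
        return Optional.ofNullable(request.get());
    }
}
```

The gate is only a best-effort check: the app can still move to the background after the check passes, so callers should continue to handle ErrorCode.BUSY.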

Sample code

To get this code, check out the following samples: