Label Images with ML Kit on Android

You can use ML Kit to label objects recognized in an image, using either an on-device model or a cloud model. See the overview to learn about the benefits of each approach.

Before you begin

  1. If you haven't already, add Firebase to your Android project.
  2. Add the dependencies for the ML Kit Android libraries to your module (app-level) Gradle file (usually app/build.gradle):
    apply plugin: 'com.android.application'
    apply plugin: 'com.google.gms.google-services'

    dependencies {
        // ...

        implementation 'com.google.firebase:firebase-ml-vision:24.0.3'
        implementation 'com.google.firebase:firebase-ml-vision-image-label-model:20.0.1'
    }
  3. Optional but recommended: If you use the on-device API, configure your app to automatically download the ML model to the device after your app is installed from the Play Store.

    To do so, add the following declaration to your app's AndroidManifest.xml file:

    <application ...>
      ...
      <meta-data
          android:name="com.google.firebase.ml.vision.DEPENDENCIES"
          android:value="label" />
      <!-- To use multiple models: android:value="label,model2,model3" -->
    </application>

    If you do not enable install-time model downloads, the model will be downloaded the first time you run the on-device detector. Requests you make before the download has completed will produce no results.
  4. If you want to use the Cloud-based model, and you have not already enabled the Cloud-based APIs for your project, do so now:

    1. Open the ML Kit APIs page of the Firebase console.
    2. If you have not already upgraded your project to a Blaze pricing plan, click Upgrade to do so. (You will be prompted to upgrade only if your project isn't on the Blaze plan.)

      Only Blaze-level projects can use Cloud-based APIs.

    3. If Cloud-based APIs aren't already enabled, click Enable Cloud-based APIs.

    If you want to use only the on-device model, you can skip this step.

Now you are ready to label images using either an on-device model or a cloud-based model.

1. Prepare the input image

Create a FirebaseVisionImage object from your image. The image labeler runs fastest when you use a Bitmap or, if you use the camera2 API, a JPEG-formatted media.Image. Use one of these formats when possible.

  • To create a FirebaseVisionImage object from a media.Image object, such as when capturing an image from a device's camera, pass the media.Image object and the image's rotation to FirebaseVisionImage.fromMediaImage().

    If you use the CameraX library, the OnImageCapturedListener and ImageAnalysis.Analyzer classes calculate the rotation value for you, so you just need to convert the rotation to one of ML Kit's ROTATION_ constants before calling FirebaseVisionImage.fromMediaImage():

    Java

    private class YourAnalyzer implements ImageAnalysis.Analyzer {

        private int degreesToFirebaseRotation(int degrees) {
            switch (degrees) {
                case 0:
                    return FirebaseVisionImageMetadata.ROTATION_0;
                case 90:
                    return FirebaseVisionImageMetadata.ROTATION_90;
                case 180:
                    return FirebaseVisionImageMetadata.ROTATION_180;
                case 270:
                    return FirebaseVisionImageMetadata.ROTATION_270;
                default:
                    throw new IllegalArgumentException(
                            "Rotation must be 0, 90, 180, or 270.");
            }
        }

        @Override
        public void analyze(ImageProxy imageProxy, int degrees) {
            if (imageProxy == null || imageProxy.getImage() == null) {
                return;
            }
            Image mediaImage = imageProxy.getImage();
            int rotation = degreesToFirebaseRotation(degrees);
            FirebaseVisionImage image =
                    FirebaseVisionImage.fromMediaImage(mediaImage, rotation);
            // Pass image to an ML Kit Vision API
            // ...
        }
    }

    Kotlin

    private class YourImageAnalyzer : ImageAnalysis.Analyzer {
        private fun degreesToFirebaseRotation(degrees: Int): Int = when (degrees) {
            0 -> FirebaseVisionImageMetadata.ROTATION_0
            90 -> FirebaseVisionImageMetadata.ROTATION_90
            180 -> FirebaseVisionImageMetadata.ROTATION_180
            270 -> FirebaseVisionImageMetadata.ROTATION_270
            else -> throw Exception("Rotation must be 0, 90, 180, or 270.")
        }

        override fun analyze(imageProxy: ImageProxy?, degrees: Int) {
            val mediaImage = imageProxy?.image
            val imageRotation = degreesToFirebaseRotation(degrees)
            if (mediaImage != null) {
                val image = FirebaseVisionImage.fromMediaImage(mediaImage, imageRotation)
                // Pass image to an ML Kit Vision API
                // ...
            }
        }
    }

    If you don't use a camera library that gives you the image's rotation, you can calculate it from the device's rotation and the orientation of the camera sensor in the device:

    Java

    private static final SparseIntArray ORIENTATIONS = new SparseIntArray();
    static {
        ORIENTATIONS.append(Surface.ROTATION_0, 90);
        ORIENTATIONS.append(Surface.ROTATION_90, 0);
        ORIENTATIONS.append(Surface.ROTATION_180, 270);
        ORIENTATIONS.append(Surface.ROTATION_270, 180);
    }

    /**
     * Get the angle by which an image must be rotated given the device's current
     * orientation.
     */
    @RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
    private int getRotationCompensation(String cameraId, Activity activity, Context context)
            throws CameraAccessException {
        // Get the device's current rotation relative to its "native" orientation.
        // Then, from the ORIENTATIONS table, look up the angle the image must be
        // rotated to compensate for the device's rotation.
        int deviceRotation = activity.getWindowManager().getDefaultDisplay().getRotation();
        int rotationCompensation = ORIENTATIONS.get(deviceRotation);

        // On most devices, the sensor orientation is 90 degrees, but for some
        // devices it is 270 degrees. For devices with a sensor orientation of
        // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
        CameraManager cameraManager = (CameraManager) context.getSystemService(CAMERA_SERVICE);
        int sensorOrientation = cameraManager
                .getCameraCharacteristics(cameraId)
                .get(CameraCharacteristics.SENSOR_ORIENTATION);
        rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360;

        // Return the corresponding FirebaseVisionImageMetadata rotation value.
        int result;
        switch (rotationCompensation) {
            case 0:
                result = FirebaseVisionImageMetadata.ROTATION_0;
                break;
            case 90:
                result = FirebaseVisionImageMetadata.ROTATION_90;
                break;
            case 180:
                result = FirebaseVisionImageMetadata.ROTATION_180;
                break;
            case 270:
                result = FirebaseVisionImageMetadata.ROTATION_270;
                break;
            default:
                result = FirebaseVisionImageMetadata.ROTATION_0;
                Log.e(TAG, "Bad rotation value: " + rotationCompensation);
        }
        return result;
    }

    Kotlin

    private val ORIENTATIONS = SparseIntArray()

    init {
        ORIENTATIONS.append(Surface.ROTATION_0, 90)
        ORIENTATIONS.append(Surface.ROTATION_90, 0)
        ORIENTATIONS.append(Surface.ROTATION_180, 270)
        ORIENTATIONS.append(Surface.ROTATION_270, 180)
    }

    /**
     * Get the angle by which an image must be rotated given the device's current
     * orientation.
     */
    @RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
    @Throws(CameraAccessException::class)
    private fun getRotationCompensation(cameraId: String, activity: Activity, context: Context): Int {
        // Get the device's current rotation relative to its "native" orientation.
        // Then, from the ORIENTATIONS table, look up the angle the image must be
        // rotated to compensate for the device's rotation.
        val deviceRotation = activity.windowManager.defaultDisplay.rotation
        var rotationCompensation = ORIENTATIONS.get(deviceRotation)

        // On most devices, the sensor orientation is 90 degrees, but for some
        // devices it is 270 degrees. For devices with a sensor orientation of
        // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
        val cameraManager = context.getSystemService(CAMERA_SERVICE) as CameraManager
        val sensorOrientation = cameraManager
                .getCameraCharacteristics(cameraId)
                .get(CameraCharacteristics.SENSOR_ORIENTATION)!!
        rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360

        // Return the corresponding FirebaseVisionImageMetadata rotation value.
        val result: Int
        when (rotationCompensation) {
            0 -> result = FirebaseVisionImageMetadata.ROTATION_0
            90 -> result = FirebaseVisionImageMetadata.ROTATION_90
            180 -> result = FirebaseVisionImageMetadata.ROTATION_180
            270 -> result = FirebaseVisionImageMetadata.ROTATION_270
            else -> {
                result = FirebaseVisionImageMetadata.ROTATION_0
                Log.e(TAG, "Bad rotation value: $rotationCompensation")
            }
        }
        return result
    }

    Then, pass the media.Image object and the rotation value to FirebaseVisionImage.fromMediaImage():

    Java

    FirebaseVisionImage image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation);

    Kotlin

    val image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation)
  • To create a FirebaseVisionImage object from a file URI, pass the app context and file URI to FirebaseVisionImage.fromFilePath(). This is useful when you use an ACTION_GET_CONTENT intent to prompt the user to select an image from their gallery app.

    Java

    FirebaseVisionImage image;
    try {
        image = FirebaseVisionImage.fromFilePath(context, uri);
    } catch (IOException e) {
        e.printStackTrace();
    }

    Kotlin

    val image: FirebaseVisionImage
    try {
        image = FirebaseVisionImage.fromFilePath(context, uri)
    } catch (e: IOException) {
        e.printStackTrace()
    }
  • To create a FirebaseVisionImage object from a ByteBuffer or a byte array, first calculate the image rotation as described above for media.Image input.

    Then, create a FirebaseVisionImageMetadata object that contains the image's height, width, color encoding format, and rotation:

    Java

    FirebaseVisionImageMetadata metadata = new FirebaseVisionImageMetadata.Builder()
            .setWidth(480)   // 480x360 is typically sufficient for
            .setHeight(360)  // image recognition
            .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
            .setRotation(rotation)
            .build();

    Kotlin

    val metadata = FirebaseVisionImageMetadata.Builder()
            .setWidth(480)   // 480x360 is typically sufficient for
            .setHeight(360)  // image recognition
            .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
            .setRotation(rotation)
            .build()

    Use the buffer or array, and the metadata object, to create a FirebaseVisionImage object:

    Java

    FirebaseVisionImage image = FirebaseVisionImage.fromByteBuffer(buffer, metadata);
    // Or: FirebaseVisionImage image = FirebaseVisionImage.fromByteArray(byteArray, metadata);

    Kotlin

    val image = FirebaseVisionImage.fromByteBuffer(buffer, metadata)
    // Or: val image = FirebaseVisionImage.fromByteArray(byteArray, metadata)
  • To create a FirebaseVisionImage object from a Bitmap object:

    Java

    FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bitmap);

    Kotlin

    val image = FirebaseVisionImage.fromBitmap(bitmap)
    The image represented by the Bitmap object must be upright, with no additional rotation required.
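The rotation arithmetic used above for media.Image and ByteBuffer input can be checked in isolation, without a device. The following is a minimal sketch in plain Java, not part of the ML Kit API: the class name RotationSketch and its method are hypothetical, and the int array simply mirrors the ORIENTATIONS table, indexed by the integer values of Surface.ROTATION_0 through Surface.ROTATION_270 (0 through 3).

```java
final class RotationSketch {
    // Mirrors the ORIENTATIONS table from the sample above. The index is the
    // integer value of Surface.ROTATION_0 .. Surface.ROTATION_270 (0, 1, 2, 3).
    private static final int[] ORIENTATIONS = {90, 0, 270, 180};

    /**
     * Returns the angle, in degrees, by which the image must be rotated.
     * Hypothetical helper: it applies the same formula as
     * getRotationCompensation() but takes plain ints instead of Android objects.
     */
    static int rotationCompensation(int surfaceRotation, int sensorOrientation) {
        if (surfaceRotation < 0 || surfaceRotation > 3) {
            throw new IllegalArgumentException(
                    "Unknown surface rotation: " + surfaceRotation);
        }
        return (ORIENTATIONS[surfaceRotation] + sensorOrientation + 270) % 360;
    }

    public static void main(String[] args) {
        // Print the compensation for the two sensor orientations mentioned above.
        int[] sensorOrientations = {90, 270};
        for (int sensor : sensorOrientations) {
            for (int surface = 0; surface <= 3; surface++) {
                System.out.printf("sensor=%d surfaceRotation=%d -> %d%n",
                        sensor, surface, rotationCompensation(surface, sensor));
            }
        }
    }
}
```

With a sensor orientation of 90 and the device in its natural orientation (Surface.ROTATION_0), the helper returns 90. With a sensor orientation of 270 it returns 270 for the same device rotation, which is the additional 180 degrees described in the code comments.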

2. Configure and run the image labeler

To label objects in an image, pass the FirebaseVisionImage object to the FirebaseVisionImageLabeler's processImage method.

  1. First, get an instance of FirebaseVisionImageLabeler.

    If you want to use the on-device image labeler:

    Java

    FirebaseVisionImageLabeler labeler = FirebaseVision.getInstance()
            .getOnDeviceImageLabeler();

    // Or, to set the minimum confidence required:
    // FirebaseVisionOnDeviceImageLabelerOptions options =
    //         new FirebaseVisionOnDeviceImageLabelerOptions.Builder()
    //                 .setConfidenceThreshold(0.7f)
    //                 .build();
    // FirebaseVisionImageLabeler labeler = FirebaseVision.getInstance()
    //         .getOnDeviceImageLabeler(options);
    

    Kotlin

    val labeler = FirebaseVision.getInstance().getOnDeviceImageLabeler()

    // Or, to set the minimum confidence required:
    // val options = FirebaseVisionOnDeviceImageLabelerOptions.Builder()
    //         .setConfidenceThreshold(0.7f)
    //         .build()
    // val labeler = FirebaseVision.getInstance().getOnDeviceImageLabeler(options)
    

    If you want to use the cloud image labeler:

    Java

    FirebaseVisionImageLabeler labeler = FirebaseVision.getInstance()
            .getCloudImageLabeler();

    // Or, to set the minimum confidence required:
    // FirebaseVisionCloudImageLabelerOptions options =
    //         new FirebaseVisionCloudImageLabelerOptions.Builder()
    //                 .setConfidenceThreshold(0.7f)
    //                 .build();
    // FirebaseVisionImageLabeler labeler = FirebaseVision.getInstance()
    //         .getCloudImageLabeler(options);
    

    Kotlin

    val labeler = FirebaseVision.getInstance().getCloudImageLabeler()

    // Or, to set the minimum confidence required:
    // val options = FirebaseVisionCloudImageLabelerOptions.Builder()
    //         .setConfidenceThreshold(0.7f)
    //         .build()
    // val labeler = FirebaseVision.getInstance().getCloudImageLabeler(options)
    
  2. Then, pass the image to the processImage() method:

    Java

    labeler.processImage(image)
            .addOnSuccessListener(new OnSuccessListener<List<FirebaseVisionImageLabel>>() {
                @Override
                public void onSuccess(List<FirebaseVisionImageLabel> labels) {
                    // Task completed successfully
                    // ...
                }
            })
            .addOnFailureListener(new OnFailureListener() {
                @Override
                public void onFailure(@NonNull Exception e) {
                    // Task failed with an exception
                    // ...
                }
            });
    

    Kotlin

    labeler.processImage(image)
            .addOnSuccessListener { labels ->
                // Task completed successfully
                // ...
            }
            .addOnFailureListener { e ->
                // Task failed with an exception
                // ...
            }
    

3. Get information about labeled objects

If the image labeling operation succeeds, a list of FirebaseVisionImageLabel objects will be passed to the success listener. Each FirebaseVisionImageLabel object represents something that was labeled in the image. For each label, you can get the label's text description, its Knowledge Graph entity ID (if available), and the confidence score of the match. For example:

Java

for (FirebaseVisionImageLabel label : labels) {
    String text = label.getText();
    String entityId = label.getEntityId();
    float confidence = label.getConfidence();
}

Kotlin

for (label in labels) {
    val text = label.text
    val entityId = label.entityId
    val confidence = label.confidence
}
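Once you have the text, entity ID, and confidence of each label, a common follow-up is to keep only the confident labels and show the strongest ones first. The sketch below illustrates one way to do that in plain Java. The Label class is a hypothetical stand-in for FirebaseVisionImageLabel so that the example runs without the ML Kit libraries, and the entity IDs and the 0.7 threshold are placeholder values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LabelFilterSketch {
    /** Hypothetical stand-in for FirebaseVisionImageLabel. */
    static final class Label {
        final String text;
        final String entityId;
        final float confidence;

        Label(String text, String entityId, float confidence) {
            this.text = text;
            this.entityId = entityId;
            this.confidence = confidence;
        }
    }

    /** Keeps labels at or above minConfidence, sorted from highest to lowest confidence. */
    static List<Label> confidentLabels(List<Label> labels, float minConfidence) {
        List<Label> kept = new ArrayList<>();
        for (Label label : labels) {
            if (label.confidence >= minConfidence) {
                kept.add(label);
            }
        }
        // Descending order: compare b to a.
        kept.sort((a, b) -> Float.compare(b.confidence, a.confidence));
        return kept;
    }

    public static void main(String[] args) {
        // Placeholder data; real values come from the success listener.
        List<Label> labels = Arrays.asList(
                new Label("Dog", "id-1", 0.92f),
                new Label("Grass", "id-2", 0.55f),
                new Label("Pet", "id-3", 0.81f));
        for (Label label : confidentLabels(labels, 0.7f)) {
            System.out.println(label.text + " " + label.confidence);
        }
    }
}
```

In an app, the same loop would read getText(), getEntityId(), and getConfidence() from each FirebaseVisionImageLabel inside the success listener. If you only need a fixed cutoff, setting the confidence threshold in the labeler options may make this filtering step unnecessary.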

Tips to improve real-time performance

If you want to label images in a real-time application, follow these guidelines to achieve the best framerates:

  • Throttle calls to the image labeler. If a new video frame becomes available while the image labeler is running, drop the frame.
  • If you are using the output of the image labeler to overlay graphics on the input image, first get the result from ML Kit, then render the image and overlay in a single step. By doing so, you render to the display surface only once for each input frame.
  • If you use the Camera2 API, capture images in ImageFormat.YUV_420_888 format.

    If you use the older Camera API, capture images in ImageFormat.NV21 format.
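The first guideline, dropping frames that arrive while the labeler is still busy, can be sketched with a single atomic flag. This is a minimal illustration in plain Java rather than an ML Kit API: the class FrameThrottleSketch and its method names are hypothetical, and the place where the labeler would be called is simulated.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class FrameThrottleSketch {
    // True while a frame is being labeled.
    private final AtomicBoolean busy = new AtomicBoolean(false);

    /**
     * Returns true if the caller may process this frame. Returns false if a
     * previous frame is still in flight, in which case the frame should be dropped.
     */
    boolean tryStartFrame() {
        return busy.compareAndSet(false, true);
    }

    /** Call when labeling finishes, from both the success and failure listeners. */
    void finishFrame() {
        busy.set(false);
    }

    public static void main(String[] args) {
        FrameThrottleSketch throttle = new FrameThrottleSketch();
        // Simulate three frames arriving while the first is still being labeled.
        for (int frame = 1; frame <= 3; frame++) {
            if (throttle.tryStartFrame()) {
                System.out.println("frame " + frame + ": processing");
                // In an app, labeler.processImage(image) would be called here.
            } else {
                System.out.println("frame " + frame + ": dropped");
            }
        }
        throttle.finishFrame();
        System.out.println("next frame accepted: " + throttle.tryStartFrame());
    }
}
```

In an analyzer, you would call tryStartFrame() at the top of the frame callback and return immediately when it yields false. Calling finishFrame() in both listeners matters: if it runs only on success, a single failure would leave the flag set and every later frame would be dropped.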


Last updated 2025-11-10 UTC.