StartGcpVisionAnnotateImagesOperation

StartGcpVisionAnnotateImagesOperation 2.5.0

Bundle: org.apache.nifi | nifi-gcp-nar
Description: Triggers a Vision operation on image input. It should be followed by the GetGcpVisionAnnotateImagesOperationStatus processor to monitor the operation status.
Tags: Cloud, Google, Machine Learning, Vision
Input Requirement
Supports Sensitive Dynamic Properties: false

Additional Details for StartGcpVisionAnnotateImagesOperation 2.5.0
Google Vision

Google Cloud Vision - Start Annotate Images Operation

Prerequisites
- Make sure the Vision API is enabled and the account you are using has permission to use it
- Make sure the input image(s) are available in a GCS bucket under the /input folder
Usage

StartGcpVisionAnnotateImagesOperation is designed to trigger image annotation operations. This processor should be used together with the GetGcpVisionAnnotateImagesOperationStatus processor. Outgoing FlowFiles contain the raw response to the request returned by the Vision server. The response is in JSON format and contains the result and additional metadata as described in the Google Vision API Reference documents.

Payload

The JSON Payload is a request in JSON format as documented in the Google Vision REST API reference document. The payload can be supplied to the processor via the JSON Payload property or as FlowFile content. The property takes precedence over FlowFile content, so delete the default value of the property if you want to use the FlowFile content as the payload. A JSON payload template example:
```
{
  "requests": [
    {
      "image": {
        "source": {
          "imageUri": "gs://${gcs.bucket}/${filename}"
        }
      },
      "features": [
        {
          "type": "${vision-feature-type}",
          "maxResults": 4
        }
      ]
    }
  ],
  "outputConfig": {
    "gcsDestination": {
      "uri": "gs://${output-bucket}/${filename}/"
    },
    "batchSize": 2
  }
}
```
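To make the placeholders in the template concrete, the following is a minimal Python sketch of how they might resolve. It is not NiFi's Expression Language engine: it only handles plain `${name}` references, and all values in `values` are hypothetical. It assumes `${gcs.bucket}` and `${filename}` are FlowFile attributes written by an upstream processor such as ListGCSBucket, while `${output-bucket}` and `${vision-feature-type}` come from the processor properties of the same name.

```python
import json
import re

# The default template from this page, with its Expression Language placeholders.
TEMPLATE = """
{
  "requests": [
    {
      "image": {"source": {"imageUri": "gs://${gcs.bucket}/${filename}"}},
      "features": [{"type": "${vision-feature-type}", "maxResults": 4}]
    }
  ],
  "outputConfig": {
    "gcsDestination": {"uri": "gs://${output-bucket}/${filename}/"},
    "batchSize": 2
  }
}
"""


def render(template: str, values: dict) -> str:
    """Replace each ${name} with values[name]; raises KeyError for unknown names."""
    return re.sub(r"\$\{([^}]+)\}", lambda m: values[m.group(1)], template)


# Hypothetical attribute and property values, for illustration only.
values = {
    "gcs.bucket": "my-input-bucket",
    "filename": "receipt.png",
    "vision-feature-type": "TEXT_DETECTION",
    "output-bucket": "my-output-bucket",
}

payload = json.loads(render(TEMPLATE, values))
print(payload["requests"][0]["image"]["source"]["imageUri"])  # gs://my-input-bucket/receipt.png
print(payload["outputConfig"]["gcsDestination"]["uri"])       # gs://my-output-bucket/receipt.png/
```

Note that the output URI ends with the file name followed by a slash, so with this template each input image gets its own output prefix in the output bucket.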
Feature types
- TEXT_DETECTION: Optical character recognition (OCR) for an image; text recognition and conversion to machine-coded text. Identifies and extracts UTF-8 text in an image.
- DOCUMENT_TEXT_DETECTION: Optical character recognition (OCR) for a file (PDF/TIFF) or dense text image; dense text recognition and conversion to machine-coded text.
- LANDMARK_DETECTION: Provides the name of the landmark, a confidence score and a bounding box in the image for the landmark.
- LOGO_DETECTION: Provides a textual description of the entity identified, a confidence score, and a bounding polygon for the logo in the file.
- LABEL_DETECTION: Provides generalized labels for an image.
- etc.
You can find more details in the Google Vision Feature List.
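Because `features` is an array, a single request can ask for more than one of the feature types listed above. The Python sketch below builds such a payload in the same shape as the template on this page. The helper name `build_payload`, the bucket names, and the file name are hypothetical and only serve the illustration.

```python
import json


def build_payload(image_uri, output_uri, feature_types, max_results=4, batch_size=2):
    """Build an annotate-images request with one feature entry per requested type."""
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [
                    {"type": feature_type, "maxResults": max_results}
                    for feature_type in feature_types
                ],
            }
        ],
        "outputConfig": {
            "gcsDestination": {"uri": output_uri},
            "batchSize": batch_size,
        },
    }


# Hypothetical buckets and file name.
payload = build_payload(
    image_uri="gs://my-input-bucket/storefront.jpg",
    output_uri="gs://my-output-bucket/storefront.jpg/",
    feature_types=["LABEL_DETECTION", "LOGO_DETECTION"],
)
print(json.dumps(payload, indent=2))
```

The printed JSON could be pasted into the JSON Payload property or sent as FlowFile content.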

Example: How to set up a simple Annotate Image Flow

Prerequisites
- Create an input bucket and an output bucket
- Input image files should be available in a GCS bucket
- This bucket must contain nothing but the input image files
- Set the Bucket property of the ListGCSBucket processor to your input bucket name
- Keep the default value of the JSON Payload property in StartGcpVisionAnnotateImagesOperation
- Set the Output Bucket property to your output bucket name in StartGcpVisionAnnotateImagesOperation
- Set up the GCP Credentials Provider Service for all GCP-related processors
Execution steps:
- The ListGCSBucket processor returns a list of the files in the bucket on the first run.
- ListGCSBucket returns only new items on subsequent runs.
- The StartGcpVisionAnnotateImagesOperation processor triggers GCP Vision image annotation jobs based on the JSON payload.
- The StartGcpVisionAnnotateImagesOperation processor populates the operationKey FlowFile attribute.
- The GetGcpVisionAnnotateImagesOperationStatus processor periodically queries the status of the job.
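The last two steps follow a common "start, then poll by key" pattern. The Python sketch below models only that idea. In a real flow the GetGcpVisionAnnotateImagesOperationStatus processor performs the status checks. The function names, the operation key value, and the `fake_status` stand-in for the Vision service are all hypothetical.

```python
import time
from typing import Callable


def poll_until_done(
    operation_key: str,
    get_status: Callable[[str], bool],
    interval_s: float = 1.0,
    max_polls: int = 10,
) -> int:
    """Call get_status(operation_key) until it returns True; return the polls used.

    Raises TimeoutError if the operation is still running after max_polls checks.
    """
    for attempt in range(1, max_polls + 1):
        if get_status(operation_key):
            return attempt
        time.sleep(interval_s)
    raise TimeoutError(f"{operation_key} not finished after {max_polls} polls")


# Hypothetical stand-in for the Vision service: reports "done" on the third check.
_calls = {"count": 0}


def fake_status(operation_key: str) -> bool:
    _calls["count"] += 1
    return _calls["count"] >= 3


polls = poll_until_done("operations/example-key", fake_status, interval_s=0.0)
print(polls)  # 3
```

Carrying the key on the FlowFile, as the operationKey attribute does, keeps the start step and the status step decoupled: the status step needs nothing but the key.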

Properties

GCP Credentials Provider Service
The Controller Service used to obtain Google Cloud Platform credentials.

Display Name

GCP Credentials Provider Service

Description

The Controller Service used to obtain Google Cloud Platform credentials.

API Name

gcp-credentials-provider-service

Service Interface

org.apache.nifi.gcp.credentials.service.GCPCredentialsService

Service Implementations

org.apache.nifi.processors.gcp.credentials.service.GCPCredentialsControllerService

Expression Language Scope

Not Supported

Sensitive

false

Required

true
JSON Payload
JSON request for the Google Cloud Vision service. The Processor will use FlowFile content for the request when this property is not specified.

Display Name

JSON Payload

Description

JSON request for the Google Cloud Vision service. The Processor will use FlowFile content for the request when this property is not specified.

API Name

json-payload

Default Value

{ "requests": [{ "image": { "source": { "imageUri": "gs://${gcs.bucket}/${filename}" } }, "features": [{ "type": "${vision-feature-type}", "maxResults": 4 }] }], "outputConfig": { "gcsDestination": { "uri": "gs://${output-bucket}/${filename}/" }, "batchSize": 2 } }

Expression Language Scope

Environment variables and FlowFile Attributes

Sensitive

false

Required

false
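
The fallback described for the JSON Payload property can be sketched in a few lines of Python. This is an illustration of the documented rule, not the processor's code: the function name `choose_payload` is hypothetical, and treating a whitespace-only property value as "not specified" is an assumption.

```python
from typing import Optional


def choose_payload(json_payload_property: Optional[str], flowfile_content: str) -> str:
    """Return the property value when it is set, otherwise the FlowFile content.

    Assumption: a whitespace-only property value counts as "not specified".
    """
    if json_payload_property is not None and json_payload_property.strip():
        return json_payload_property
    return flowfile_content


# Property set: it wins, and the FlowFile content is ignored.
print(choose_payload('{"requests": []}', '{"from": "flowfile"}'))
# Property cleared: the FlowFile content is used as the request.
print(choose_payload(None, '{"from": "flowfile"}'))
```

This is why the default property value has to be deleted before a FlowFile-supplied payload can take effect.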
Output Bucket
Name of the GCS bucket where the output of the Vision job will be persisted. The value of this property applies when the JSON Payload property is configured. The JSON Payload property value can use Expression Language to reference the value of ${output-bucket}

Display Name

Output Bucket

Description

Name of the GCS bucket where the output of the Vision job will be persisted. The value of this property applies when the JSON Payload property is configured. The JSON Payload property value can use Expression Language to reference the value of ${output-bucket}

API Name

output-bucket

Expression Language Scope

Environment variables and FlowFile Attributes

Sensitive

false

Required

false
Vision Feature Type
Type of GCP Vision Feature. The value of this property applies when the JSON Payload property is configured. The JSON Payload property value can use Expression Language to reference the value of ${vision-feature-type}

Display Name

Vision Feature Type

Description

Type of GCP Vision Feature. The value of this property applies when the JSON Payload property is configured. The JSON Payload property value can use Expression Language to reference the value of ${vision-feature-type}

API Name

vision-feature-type

Default Value

TEXT_DETECTION

Expression Language Scope

Environment variables and FlowFile Attributes

Sensitive

false

Required

false

Relationships

Name	Description
failure	FlowFiles are routed to failure relationship
success	FlowFiles are routed to success relationship

Writes Attributes

Name	Description
operationKey	A unique identifier of the operation returned by the Vision server.
