TailFile 2.1.0
- Bundle
- org.apache.nifi | nifi-standard-nar
- Description
- "Tails" a file, or a list of files, ingesting data from the file as it is written to the file. The file is expected to be textual. Data is ingested only when a new line is encountered (carriage return or new-line character or combination). If the file to tail is periodically "rolled over", as is generally the case with log files, an optional Rolling Filename Pattern can be used to retrieve data from files that have rolled over, even if the rollover occurred while NiFi was not running (provided that the data still exists upon restart of NiFi). It is generally advisable to set the Run Schedule to a few seconds, rather than running with the default value of 0 secs, as this Processor will consume a lot of resources if scheduled very aggressively. At this time, this Processor does not support ingesting files that have been compressed when 'rolled over'.
- Tags
- file, log, source, tail, text
- Input Requirement
- FORBIDDEN
- Supports Sensitive Dynamic Properties
- false
-
Additional Details for TailFile 2.1.0
Introduction
This processor offers a powerful capability, allowing the user to periodically look at a file that is actively being written to by another process. When the file changes, the new lines are ingested. This Processor assumes that data in the file is textual.
Tailing a file from a filesystem is a seemingly simple but notoriously difficult task. This is because we are periodically checking the contents of a file that is being written to. The file may be constantly changing, or it may rarely change. The file may be “rolled over” (i.e., renamed), and it’s important that even after restarting the application (NiFi, in this case), we are able to pick up where we left off. Additional complexities also come into play. For example, NFS-mounted drives may indicate that data is readable but then return NUL bytes (Unicode 0) when attempting to read, as the actual bytes are not yet known (see the ‘Reread when NUL encountered’ property), and file systems have different timestamp granularities. This Processor is designed to handle all of these cases. This can lead to slightly more complex configuration, but this document should provide you with all you need to get started!
Modes
This processor is used to tail a file or multiple files, depending on the configured mode. The mode to choose depends on the logging pattern followed by the file(s) to tail. In any case, if there is a rolling pattern, the rolling files must be plain text files (compression is not supported at the moment).
- Single file: the processor will tail the file with the path given in the ‘File(s) to tail’ property.
- Multiple files: the processor will look for files in the ‘Base directory’. It will look for files recursively according to the ‘Recursive lookup’ property and will tail all files matching the regular expression provided in the ‘File(s) to tail’ property.
Rolling filename pattern
If the ‘Rolling filename pattern’ property is set, then when the processor detects that the file to tail has rolled over, it will look for possibly missed messages in the rolled file. To do so, the processor will use the pattern to find the rolled files in the same directory as the file to tail.
In order to keep this property usable in the ‘Multiple files’ mode when multiple files to tail are in the same directory, it is possible to use the ${filename} tag to reference the name (without extension) of the file to tail. For example, if we have:
/my/path/directory/my-app.log.1
/my/path/directory/my-app.log
/my/path/directory/application.log.1
/my/path/directory/application.log
the ‘rolling filename pattern’ would be ${filename}.log.*.
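As an illustration, here is a hypothetical Java sketch of how such a pattern could be expanded and matched against files in the tailed file’s directory (the helper name findRolledFiles is made up for illustration; this is a simplified sketch, not NiFi’s actual implementation):

```java
import java.io.File;
import java.util.regex.Pattern;

public class RollingPatternDemo {

    // Hypothetical helper: substitute ${filename}, translate the * and ?
    // wildcards into a regular expression, and list matching rolled-over
    // files in the tailed file's directory.
    static File[] findRolledFiles(File tailedFile, String rollingPattern) {
        String name = tailedFile.getName();
        int dot = name.indexOf('.');
        // ${filename} is the tailed file's name without its extension
        String base = (dot >= 0) ? name.substring(0, dot) : name;
        String glob = rollingPattern.replace("${filename}", base);
        // Quote the glob literally, then re-open the quoting around each wildcard
        String regex = Pattern.quote(glob)
                .replace("*", "\\E.*\\Q")
                .replace("?", "\\E.\\Q");
        Pattern pattern = Pattern.compile(regex);
        // Assumes the parent directory exists and is readable
        return tailedFile.getParentFile()
                .listFiles(f -> pattern.matcher(f.getName()).matches());
    }

    public static void main(String[] args) {
        File tailed = new File("/my/path/directory/my-app.log");
        // With the example above, this matches my-app.log.1 but not application.log.1
        for (File rolled : findRolledFiles(tailed, "${filename}.log.*")) {
            System.out.println(rolled);
        }
    }
}
```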
Descriptions for different modes and strategies
The ‘Single file’ mode assumes that the file to tail always has the same name, even if there is a rolling pattern. Example:
/my/path/directory/my-app.log.2
/my/path/directory/my-app.log.1
/my/path/directory/my-app.log
and new log messages are always appended to the my-app.log file.
If ‘Recursive lookup’ is set to ‘true’, the regular expression for the files to tail must cover the possible intermediate directories between the base directory and the files to tail. Example:
/my/path/directory1/my-app1.log
/my/path/directory2/my-app2.log
/my/path/directory3/my-app3.log
Base directory: /my/path
Files to tail: directory[1-3]/my-app[1-3].log
Recursivity: true
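A minimal Java sketch of such a recursive lookup (the helper listFilesToTail is hypothetical, not NiFi’s actual implementation; it matches the regular expression against paths relative to the base directory, assuming ‘/’ separators):

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class MultiFileLookupDemo {

    // Hypothetical helper: list regular files under baseDir whose path,
    // relative to baseDir, matches the 'File(s) to tail' regular expression.
    static List<Path> listFilesToTail(Path baseDir, String fileRegex, boolean recursive)
            throws IOException {
        Pattern pattern = Pattern.compile(fileRegex);
        try (Stream<Path> stream = recursive ? Files.walk(baseDir) : Files.list(baseDir)) {
            return stream
                    .filter(Files::isRegularFile)
                    // Match against the path relative to the base directory,
                    // e.g. "directory1/my-app1.log"
                    .filter(p -> pattern.matcher(baseDir.relativize(p).toString()).matches())
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Mirrors the example above: base directory /my/path,
        // files to tail directory[1-3]/my-app[1-3].log, recursive lookup
        List<Path> files = listFilesToTail(
                Paths.get("/my/path"), "directory[1-3]/my-app[1-3]\\.log", true);
        files.forEach(System.out::println);
    }
}
```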
If the processor is configured with ‘Multiple files’ mode, two additional properties are relevant:
- Lookup frequency: specifies the minimum duration the processor will wait before listing the files to tail again.
- Maximum age: specifies the minimum duration, measured from a file’s last modification date, after which the processor considers that no new messages will be appended to that file. If the time elapsed since the file was last modified is larger than this period, the file will not be tailed. For example, if a file was modified 24 hours ago and this property is set to 12 hours, the file will not be tailed; but if this property is set to 36 hours, then the file will continue to be tailed.
It is necessary to pay attention to the ‘Lookup frequency’ and ‘Maximum age’ properties, as well as the frequency at which the processor is triggered, in order to achieve high performance. It is recommended to keep ‘Maximum age’ > ‘Lookup frequency’ > processor scheduling frequency to avoid missing data. It is also recommended not to set ‘Maximum age’ too low, because if messages are appended to a file after it has been considered “too old”, all the messages in that file may be read again, leading to data duplication.
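As a rough illustration of the ‘Maximum age’ check, the sketch below (plain Java with a hypothetical helper, not NiFi’s actual implementation) skips any file whose last modification is older than the configured age:

```java
import java.io.File;
import java.util.concurrent.TimeUnit;

public class MaximumAgeDemo {

    // Hypothetical helper: a file is skipped when its last modification is
    // older than the configured 'Maximum age'.
    static boolean isRecentEnough(File file, long maxAgeMillis) {
        long sinceModified = System.currentTimeMillis() - file.lastModified();
        return sinceModified <= maxAgeMillis;
    }

    public static void main(String[] args) {
        File f = new File("/my/path/directory/my-app.log");
        // With 'Maximum age' = 36 hours, a file modified 24 hours ago is still tailed
        long maxAge = TimeUnit.HOURS.toMillis(36);
        System.out.println(isRecentEnough(f, maxAge) ? "tail" : "skip");
    }
}
```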
If the processor is configured with ‘Multiple files’ mode, the ‘Rolling filename pattern’ property must be specific enough to ensure that only the rolling files will be listed, and not other currently tailed files in the same directory (this can be achieved using the ${filename} tag).
Handling Multi-Line Messages
Most of the time, when we tail a file, we are happy to receive data periodically, however it was written to the file. There are scenarios, though, where we may have data written in such a way that multiple lines need to be retained together. Take, for example, the following lines of text that might be found in a log file:
2021-07-09 14:12:19,731 INFO [main] org.apache.nifi.NiFi Launching NiFi...
2021-07-09 14:12:19,915 INFO [main] o.a.n.p.AbstractBootstrapPropertiesLoader Determined default application properties path to be '/Users/mpayne/devel/nifi/nifi-assembly/target/nifi-1.14.0-SNAPSHOT-bin/nifi-1.14.0-SNAPSHOT/./conf/nifi.properties'
2021-07-09 14:12:19,919 INFO [main] o.a.nifi.properties.NiFiPropertiesLoader Loaded 199 properties from /Users/mpayne/devel/nifi/nifi-assembly/target/nifi-1.14.0-SNAPSHOT-bin/nifi-1.14.0-SNAPSHOT/./conf/nifi.properties
2021-07-09 14:12:19,925 WARN Line 1 of Log Message
Line 2: This is an important warning.
Line 3: Please do not ignore this warning.
Line 4: These lines of text make sense only in the context of the original message.
2021-07-09 14:12:19,941 INFO [main] Final message in log file
In this case, we may want to ensure that the log lines are ingested in such a way that our multi-line log message is not broken up into Lines 1 and 2 in one FlowFile and Lines 3 and 4 in another. To accomplish this, the Processor exposes the ‘Line Start Pattern’ property. If we set this property to a value of \d{4}-\d{2}-\d{2}, then we are telling the Processor that each message should begin with 4 digits, followed by a dash, followed by 2 digits, a dash, and 2 digits. I.e., we are telling it that each message begins with a timestamp in yyyy-MM-dd format. Because of this, even if the Processor runs and sees only Lines 1 and 2 of our multi-line log message, it will not ingest the data yet. It will wait until it sees the next message, which starts with a timestamp.
Note that, because of this, the last message that the Processor will encounter in the above situation is the “Final message in log file” line. At this point, the Processor does not know whether the next line of text it encounters will be part of this line or a new message. As such, it will not ingest this data. It will wait until either another message is encountered (that matches our regex) or until the file is rolled over (renamed). Because of this, there may be some delay in ingesting the last message in the file, if the process that writes to the file simply stops writing at that point.
Additionally, we run the chance of the Regular Expression not matching the data in the file. This could result in buffering all the file’s content, which could cause NiFi to run out of memory. To avoid this, the ‘Max Buffer Size’ property limits the amount of data that can be buffered. If this amount of data is buffered, it will be flushed to the FlowFile, even if another message hasn’t been encountered.
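The following minimal Java sketch (hypothetical, not NiFi’s actual implementation) illustrates the buffering behavior described above: lines are held until another line matches the Line Start Pattern, and the buffer is flushed early once it reaches the Max Buffer Size:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class LineStartPatternDemo {

    // Hypothetical sketch: group lines into messages. A completed message is
    // emitted when the next line matching the start pattern arrives, or when
    // the buffer exceeds the maximum size.
    static List<String> groupMessages(List<String> lines, String lineStartRegex,
                                      int maxBufferSize) {
        Pattern start = Pattern.compile(lineStartRegex);
        List<String> messages = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        for (String line : lines) {
            // lookingAt() anchors the match at the start of the line
            if (start.matcher(line).lookingAt() && buffer.length() > 0) {
                messages.add(buffer.toString()); // previous message is complete
                buffer.setLength(0);
            }
            buffer.append(line).append('\n');
            // Buffer length in chars is used here as a stand-in for bytes
            if (buffer.length() >= maxBufferSize) {
                messages.add(buffer.toString()); // flush even without a new match
                buffer.setLength(0);
            }
        }
        // Whatever remains is held back: the next line might continue it
        return messages;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "2021-07-09 14:12:19,925 WARN Line 1 of Log Message",
                "Line 2: This is an important warning.",
                "2021-07-09 14:12:19,941 INFO [main] Final message in log file");
        // Each message begins with a yyyy-MM-dd timestamp; the final message
        // stays buffered, just as described above
        groupMessages(lines, "\\d{4}-\\d{2}-\\d{2}", 64 * 1024)
                .forEach(m -> System.out.println("--- message ---\n" + m));
    }
}
```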
-
Properties
State Location
Specifies where the state is located, either local or cluster, so that state can be stored appropriately to ensure that all data is consumed without duplicating data upon restart of NiFi.
- Display Name
- State Location
- Description
- Specifies where the state is located, either local or cluster, so that state can be stored appropriately to ensure that all data is consumed without duplicating data upon restart of NiFi.
- API Name
- File Location
- Default Value
- Local
- Allowable Values
-
- Local
- Remote
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
-
File(s) to Tail
Path of the file to tail in case of single file mode. If using multifile mode, a regular expression to find the files to tail in the base directory. If recursivity is set to true, the regular expression will be used to match the path starting from the base directory (see Additional Details for examples).
- Display Name
- File(s) to Tail
- Description
- Path of the file to tail in case of single file mode. If using multifile mode, a regular expression to find the files to tail in the base directory. If recursivity is set to true, the regular expression will be used to match the path starting from the base directory (see Additional Details for examples).
- API Name
- File to Tail
- Expression Language Scope
- Environment variables defined at JVM level and system properties
- Sensitive
- false
- Required
- true
-
Initial Start Position
When the Processor first begins to tail data, this property specifies where the Processor should begin reading data. Once data has been ingested from a file, the Processor will continue from the last point from which it has received data.
- Display Name
- Initial Start Position
- Description
- When the Processor first begins to tail data, this property specifies where the Processor should begin reading data. Once data has been ingested from a file, the Processor will continue from the last point from which it has received data.
- API Name
- Initial Start Position
- Default Value
- Beginning of File
- Allowable Values
-
- Beginning of Time
- Beginning of File
- Current Time
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
-
Line Start Pattern
A Regular Expression to match against the start of a log line. If specified, any line that matches the expression, and any following lines, will be buffered until another line matches the Expression. In doing this, we can avoid splitting apart multi-line messages in the file. This assumes that the data is in UTF-8 format.
- Display Name
- Line Start Pattern
- Description
- A Regular Expression to match against the start of a log line. If specified, any line that matches the expression, and any following lines, will be buffered until another line matches the Expression. In doing this, we can avoid splitting apart multi-line messages in the file. This assumes that the data is in UTF-8 format.
- API Name
- Line Start Pattern
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- false
- Dependencies
-
- Tailing mode is set to any of [Single file]
-
Max Buffer Size
When using the Line Start Pattern, there may be situations in which the data in the file being tailed never matches the Regular Expression. This would result in the processor buffering all data from the tailed file, which can quickly exhaust the heap. To avoid this, the Processor will buffer only up to this amount of data before flushing the buffer, even if it means ingesting partial data from the file.
- Display Name
- Max Buffer Size
- Description
- When using the Line Start Pattern, there may be situations in which the data in the file being tailed never matches the Regular Expression. This would result in the processor buffering all data from the tailed file, which can quickly exhaust the heap. To avoid this, the Processor will buffer only up to this amount of data before flushing the buffer, even if it means ingesting partial data from the file.
- API Name
- Max Buffer Size
- Default Value
- 64 KB
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
- Dependencies
-
- Line Start Pattern is set to any value specified
-
Post-Rollover Tail Period
When a file is rolled over, the processor will continue tailing the rolled over file until it has not been modified for this amount of time. This allows for another process to rollover a file, and then flush out any buffered data. Note that when this value is set, and the tailed file rolls over, the new file will not be tailed until the old file has not been modified for the configured amount of time. Additionally, when using this capability, in order to avoid data duplication, this period must be set longer than the Processor's Run Schedule, and the Processor must not be stopped after the file being tailed has been rolled over and before the data has been fully consumed. Otherwise, the data may be duplicated, as the entire file may be written out as the contents of a single FlowFile.
- Display Name
- Post-Rollover Tail Period
- Description
- When a file is rolled over, the processor will continue tailing the rolled over file until it has not been modified for this amount of time. This allows for another process to rollover a file, and then flush out any buffered data. Note that when this value is set, and the tailed file rolls over, the new file will not be tailed until the old file has not been modified for the configured amount of time. Additionally, when using this capability, in order to avoid data duplication, this period must be set longer than the Processor's Run Schedule, and the Processor must not be stopped after the file being tailed has been rolled over and before the data has been fully consumed. Otherwise, the data may be duplicated, as the entire file may be written out as the contents of a single FlowFile.
- API Name
- Post-Rollover Tail Period
- Default Value
- 0 sec
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- false
-
Pre-Allocated Buffer Size
Sets the amount of memory that is pre-allocated for each tailed file.
- Display Name
- Pre-Allocated Buffer Size
- Description
- Sets the amount of memory that is pre-allocated for each tailed file.
- API Name
- pre-allocated-buffer-size
- Default Value
- 65536 B
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
-
Reread when NUL encountered
If this option is set to 'true', when a NUL character is read, the processor will yield and try to read the same part again later. (Note: Yielding may delay the processing of other files tailed by this processor, not just the one with the NUL character.) The purpose of this flag is to allow users to handle cases where reading a file may return temporary NUL values. NFS for example may send file contents out of order. In this case the missing parts are temporarily replaced by NUL values. CAUTION! If the file contains legitimate NUL values, setting this flag causes this processor to get stuck indefinitely. For this reason users should refrain from using this feature if they can help it and try to avoid having the target file on a file system where reads are unreliable.
- Display Name
- Reread when NUL encountered
- Description
- If this option is set to 'true', when a NUL character is read, the processor will yield and try to read the same part again later. (Note: Yielding may delay the processing of other files tailed by this processor, not just the one with the NUL character.) The purpose of this flag is to allow users to handle cases where reading a file may return temporary NUL values. NFS for example may send file contents out of order. In this case the missing parts are temporarily replaced by NUL values. CAUTION! If the file contains legitimate NUL values, setting this flag causes this processor to get stuck indefinitely. For this reason users should refrain from using this feature if they can help it and try to avoid having the target file on a file system where reads are unreliable.
- API Name
- reread-on-nul
- Default Value
- false
- Allowable Values
-
- true
- false
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- false
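The retry behavior can be pictured with the following hypothetical Java sketch (not NiFi’s actual implementation): consumption stops before the first NUL byte, and a later run resumes from that position:

```java
import java.io.IOException;
import java.io.RandomAccessFile;

public class RereadOnNulDemo {

    // Hypothetical helper: read what is available from the given position,
    // but stop before the first NUL byte so that part is re-read on a later
    // run, once the real bytes have arrived.
    static long readAvailable(RandomAccessFile file, long position) throws IOException {
        byte[] buffer = new byte[8192];
        file.seek(position);
        int read = file.read(buffer);
        if (read <= 0) {
            return position; // nothing new to read yet
        }
        for (int i = 0; i < read; i++) {
            if (buffer[i] == 0) {
                read = i; // NUL encountered: yield and retry this part later
                break;
            }
        }
        // ... hand buffer[0..read) off for line splitting (omitted) ...
        return position + read;
    }
}
```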
-
Rolling Filename Pattern
If the file to tail "rolls over" as would be the case with log files, this filename pattern will be used to identify files that have rolled over, so that if NiFi is restarted and the file has rolled over, it will be able to pick up where it left off. This pattern supports the wildcard characters * and ?; it also supports the notation ${filename} to specify a pattern based on the name of the file (without extension), and assumes that the files that have rolled over live in the same directory as the file being tailed. The same glob pattern will be used for all files.
- Display Name
- Rolling Filename Pattern
- Description
- If the file to tail "rolls over" as would be the case with log files, this filename pattern will be used to identify files that have rolled over, so that if NiFi is restarted and the file has rolled over, it will be able to pick up where it left off. This pattern supports the wildcard characters * and ?; it also supports the notation ${filename} to specify a pattern based on the name of the file (without extension), and assumes that the files that have rolled over live in the same directory as the file being tailed. The same glob pattern will be used for all files.
- API Name
- Rolling Filename Pattern
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- false
-
Base directory
Base directory used to look for files to tail. This property is required when using Multifile mode.
- Display Name
- Base directory
- Description
- Base directory used to look for files to tail. This property is required when using Multifile mode.
- API Name
- tail-base-directory
- Expression Language Scope
- Environment variables defined at JVM level and system properties
- Sensitive
- false
- Required
- false
-
Tailing mode
Mode to use: 'Single file' will tail only one file; 'Multiple files' will look for a list of files. In 'Multiple files' mode, the Base directory is required.
- Display Name
- Tailing mode
- Description
- Mode to use: 'Single file' will tail only one file; 'Multiple files' will look for a list of files. In 'Multiple files' mode, the Base directory is required.
- API Name
- tail-mode
- Default Value
- Single file
- Allowable Values
-
- Single file
- Multiple files
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
-
Lookup frequency
Only used in Multiple files mode. It specifies the minimum duration the processor will wait before listing the files to tail again.
- Display Name
- Lookup frequency
- Description
- Only used in Multiple files mode. It specifies the minimum duration the processor will wait before listing the files to tail again.
- API Name
- tailfile-lookup-frequency
- Default Value
- 10 minutes
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- false
-
Maximum age
Only used in Multiple files mode. It specifies the minimum duration since a file's last modification after which the processor considers that no new messages will be appended to it. This should not be set too low, to avoid duplication of data in case new messages are appended at a lower frequency.
- Display Name
- Maximum age
- Description
- Only used in Multiple files mode. It specifies the minimum duration since a file's last modification after which the processor considers that no new messages will be appended to it. This should not be set too low, to avoid duplication of data in case new messages are appended at a lower frequency.
- API Name
- tailfile-maximum-age
- Default Value
- 24 hours
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- false
-
Recursive lookup
When using Multiple files mode, this property defines whether files in the base directory must be listed recursively.
- Display Name
- Recursive lookup
- Description
- When using Multiple files mode, this property defines whether files in the base directory must be listed recursively.
- API Name
- tailfile-recursive-lookup
- Default Value
- false
- Allowable Values
-
- true
- false
- Expression Language Scope
- Not Supported
- Sensitive
- false
- Required
- true
State Management
Scopes | Description |
---|---|
CLUSTER, LOCAL | Stores state about where in the Tailed File it left off so that on restart it does not have to duplicate data. State is stored either locally or clustered, depending on the <File Location> property. |
Restricted
Required Permission | Explanation |
---|---|
read filesystem | Provides operator the ability to read from any file that NiFi has access to. |
Relationships
Name | Description |
---|---|
success | All FlowFiles are routed to this Relationship. |
Writes Attributes
Name | Description |
---|---|
tailfile.original.path | Path of the original file the flow file comes from. |