
I have the following fluent.conf

<source>
  type forward
</source>

<source>
  type monitor_agent
  port 24220
</source>

# Listen DRb for debug
<source>
  type debug_agent
  port 24230
</source>


<source>
  type tail
  path /var/data/www/apps/app/logs/*.log
  pos_file /tmp/fluent.nginx.pos
  format syslog
  tag app.nginx-access
  # Regex fields
  format /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*) "(?<referer>[^\"]*)" "(?<agent>[^\"]*)"$/
  # Date and time format
  time_format %d/%b/%Y:%H:%M:%S %z
</source>

<match app.**>
  type copy
  <store>
    type file
    path /var/log/fluent/app
  </store>
</match>

Is it necessary to use logrotate on /var/log/fluent/app/* or will Fluentd handle this itself?

james_womack

3 Answers


As pointed out by @kiyoto-tamura, fluentd can partition output files by day, and this is the default behaviour.

And, as pointed out by @vaab, fluentd cannot delete old files. Hence, the obvious solution would be to disable fluentd's partitioning and let logrotate handle partitioning along with capping the number of files.

However, this might introduce unnecessary complexity, especially for simple cases: one has to install, configure, and monitor an additional service, namely logrotate.

Moreover, it can be problematic to get an analogue of logrotate on Windows (unless one is comfortable with installing Cygwin or running 0.0.0.x versions in production).

So, another feasible solution is to let fluentd partition output files by day as usual, but to delete old files periodically.

On UNIX-like systems this is a no-brainer and can be achieved with a one-liner shell script that invokes find, scheduled via cron.
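For example, a minimal sketch (the log directory, file pattern, and 31-day retention here are assumptions to adjust to your setup):

# crontab entry: every day at 05:00, delete fluentd output files older than 31 days
0 5 * * * find /var/log/fluent -name 'app.*.log' -mtime +31 -delete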

The same logic applies to the Windows environment. For example, one can write a PowerShell script and schedule it via the system Task Scheduler (see the Gist for readability).

The following listing defines such a script as an example:

# delete-old-service-logs.ps1

[CmdletBinding()]

param(
  [Parameter(Position=0, Mandatory=$true)]
  [ValidateScript({
    if( -Not ($_ | Test-Path) ){
      throw "Specified logs dir path does not exist (path='$_')"
    }
    if(-Not ($_ | Test-Path -PathType Container) ){
      throw "Specified logs dir path does not point to a directory (path='$_')"
    }
    return $true
  })]
  [string]$LogsDirPath,

  [Parameter(Position=1)]
  [int]$LogsFileMaxN = 31,

  [Parameter(Position=2)]
  [ValidateNotNullOrEmpty()]
  [string]$LogsFileNamePattern = "service.????-??-??.log"
)


# Reverse-sort by creation time, skip the $LogsFileMaxN newest files
# and collect the names of the remaining, older ones.
[string[]]$FileNamesToRemove = Get-ChildItem -Path $LogsDirPath -Filter $LogsFileNamePattern |
  Sort-Object -Property CreationTime -Descending |
  Select-Object -Skip $LogsFileMaxN |
  Select-Object -ExpandProperty "Name"


# Use the Shell.Application COM object so that deleted files end up in the Recycle Bin.
$Shell = New-Object -ComObject "Shell.Application"
$LogsDir = $Shell.Namespace($LogsDirPath)


Foreach ($FileName in $FileNamesToRemove)
{
  $Item = $LogsDir.ParseName($FileName)
  $Item.InvokeVerb("delete")
}

The logic is pretty straightforward:

  1. Take the path to the dir with logs.
  2. Take the max number of files to keep.
  3. Take the file pattern to search logs by.
  4. Find all files in the logs dir, reverse-sort by creation date and take names of old files.
  5. Delete old files if any.

Invocation example:

./delete-old-service-logs.ps1 "path/to/logs/dir"

or:

./delete-old-service-logs.ps1 -LogsDirPath "path/to/logs/dir"

As creating a scheduled task in Windows via the GUI can be a pain, one can use a script that automates this:

# create-task_delete-old-service-logs.ps1

[CmdletBinding()]


param(
  [Parameter(Position=0, Mandatory=$true)]
  [ValidateNotNullOrEmpty()]
  [string]$LogsDirPath,

  [Parameter(Position=1)]
  [int]$LogsFileMaxN = 31,

  [Parameter(Position=2)]
  [ValidateNotNullOrEmpty()]
  [string]$LogsFileNamePattern = "service.????-??-??.log",

  [Parameter(Position=3)]
  [ValidateNotNullOrEmpty()]
  [string]$TaskScriptFilePath = "delete-old-service-logs.ps1",

  [Parameter(Position=4)]
  [ValidateNotNullOrEmpty()]
  [string]$TaskName = "SERVICE NAME - Delete Old Logs",

  [Parameter(Position=5)]
  [ValidateNotNullOrEmpty()]
  [string]$TaskDescription = "Delete old logs of SERVICE NAME aggregated via Fluentd",

  [Parameter(Position=6)]
  [ValidateNotNullOrEmpty()]
  [string]$TaskTriggerTime = "5:00:00 AM"
)


try {
  $LogsDirPath = Resolve-Path -Path $LogsDirPath
}
catch {
  throw "Specified logs dir path does not exist (path='$LogsDirPath')"
}

if(-Not ($LogsDirPath | Test-Path -PathType Container) ){
  throw "Specified logs dir path does not point to a directory (path='$LogsDirPath')"
}


try {
  $TaskScriptFilePath = Resolve-Path -Path $TaskScriptFilePath
}
catch {
  throw "Specified task script file path does not exist (path='$TaskScriptFilePath')"
}

if( -Not ($TaskScriptFilePath | Test-Path -PathType Leaf) ){
  throw "Specified task script file path is not a file (path='$TaskScriptFilePath')"
}


$TaskAction = New-ScheduledTaskAction -Execute "C:\Windows\system32\WindowsPowerShell\v1.0\powershell.exe" `
  -Argument "-NoProfile -WindowStyle Hidden -command ""$TaskScriptFilePath -LogsDirPath ""$LogsDirPath"" -LogsFileMaxN $LogsFileMaxN -LogsFileNamePattern ""$LogsFileNamePattern"""""


$TaskTrigger = New-ScheduledTaskTrigger -Daily -At $TaskTriggerTime


Register-ScheduledTask -TaskName $TaskName -Description $TaskDescription -Action $TaskAction -Trigger $TaskTrigger

The actual logic happens in the last three lines, where an action, a trigger, and a task are created and registered. Overall, the workflow is as follows:

  1. Take params for delete-old-service-logs.ps1.
  2. Take the path to the delete-old-service-logs.ps1 script.
  3. Take the task name, description, and time to trigger the script.
  4. Try to resolve and validate paths.
  5. Create an action for invoking PowerShell with args to run delete-old-service-logs.ps1.
  6. Create a daily trigger.
  7. Register the task.

Invocation example, assuming both scripts are in the same directory:

./create-task_delete-old-service-logs.ps1 "path/to/logs/dir"

Or:

./create-task_delete-old-service-logs.ps1 -LogsDirPath "path/to/logs/dir" -TaskScriptFilePath "path/to/delete-old-service-logs.ps1"
oblalex

Fluentd's out_file plugin automatically partitions the output files by day, so you do NOT need to use logrotate.

If you want to partition at a different granularity, change the time_slice_format parameter (by default, it is %Y%m%d).
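For instance, a sketch of slicing hourly instead of daily, adapting the store block from the question (the hourly value is just an illustration):

<store>
  type file
  path /var/log/fluent/app
  # hourly slices instead of the default daily %Y%m%d
  time_slice_format %Y%m%d%H
</store>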

However, this means there is no current, canonical name for the output file. For that, you can use the symlink_path parameter together with buffer_type file. This is not a feature of out_file per se, but of any buffered output.
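A sketch of that configuration, assuming file buffering (the buffer_path location here is just an illustration):

<store>
  type file
  path /var/log/fluent/app
  buffer_type file
  buffer_path /var/log/fluent/app.buffer
  # a stable name that always points at the current buffer chunk
  symlink_path /var/log/fluent/app.log
</store>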


Unfortunately, while the out_file plugin can currently split log files by time (something logrotate can also do), it does not remove the oldest files beyond a retention count (which logrotate does).

So using logrotate still seems necessary if you need to keep control of the number and size of the kept logs (source: https://github.com/fluent/fluentd/issues/2111 ).

At that point, you can disable fluentd's time-related file-splitting functionality and make sure to use append true, so that logrotate can do its full job. Note that there is no need for postrotate niceties in logrotate's conf, as fluentd re-opens the file at each flush of the buffer, which is a welcome perk of using fluentd.
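One way to approximate this, sketched as a plain file output rather than the copy/store form from the question (the coarse yearly time_slice_format is just one way to keep the file name effectively stable; treat it as an assumption):

<match app.**>
  type file
  path /var/log/fluent/app
  # keep appending to (nearly) the same file so logrotate can rotate it
  append true
  time_slice_format %Y
</match>

With this in place, the count- and size-based rotation described above is left to logrotate.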

fluentd will remain useful for its filters, copying of log streams, file splitting by tag, and buffering.

vaab