Closed
Tell us about the issue
Description:
There are various situations where ES may reject an event with a "document already exists" error. The purpose of this issue is to collect such cases and add short documentation (in whichever place is suitable: troubleshooting, the support docs, or the es-output docs), as we are getting the same question over and over.
- Ingestion from agent
Two possible cases I can think of for now:
- The events go to a data stream through an integration, and the integration has a fingerprint processor that sets the document _id. For example, the tenable_sc integration may have logs-tenable_sc.vulnerability-{version} and logs-tenable_sc.plugin-{version} ingest pipelines with a fingerprint processor that sets the _id:
{ "fingerprint": { "fields": [ "json.lastSeen", "json.pluginID", "json.ip", "json.uuid", "json.firstSeen", "json.lastSeen", "json.exploitAvailable", "json.vulnPubDate", "json.patchPubDate", "json.pluginPubDate", "json.pluginModDate", "json.pluginText", "json.dnsName", "json.macAddress", "json.operatingSystem", "json.pluginInfo" ], "target_field": "_id", "ignore_missing": true } },
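To illustrate why a fingerprint-derived _id leads to a 409, here is a minimal Python sketch. It is not the Elasticsearch fingerprint processor: the hashing and serialization (SHA-1 over sorted JSON), the shortened field list, the `CreateOnlyIndex` class, and the sample values are all assumptions made for illustration. It only models the idea that identical field values produce an identical _id, and that a create-only write for an existing _id is rejected.

```python
import hashlib
import json

# Illustrative subset of the fields listed in the pipeline above.
FIELDS = ["json.pluginID", "json.ip", "json.lastSeen"]


def get_path(doc, dotted):
    """Return the value at a dotted path, or None if any part is missing."""
    cur = doc
    for key in dotted.split("."):
        if not isinstance(cur, dict) or key not in cur:
            return None
        cur = cur[key]
    return cur


def fingerprint(doc, fields=FIELDS):
    """Hash the selected fields; missing fields are skipped (like ignore_missing)."""
    present = {}
    for field in sorted(fields):
        value = get_path(doc, field)
        if value is not None:
            present[field] = value
    payload = json.dumps(present, sort_keys=True).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


class CreateOnlyIndex:
    """Toy stand-in for a data stream, which only accepts create operations."""

    def __init__(self):
        self.docs = {}

    def create(self, doc):
        doc_id = fingerprint(doc)
        if doc_id in self.docs:
            return 409  # document already exists
        self.docs[doc_id] = doc
        return 201


# Hypothetical sample event; the values are made up.
event = {"json": {"pluginID": "19506", "ip": "10.0.0.5", "lastSeen": "1713967945"}}
index = CreateOnlyIndex()
first = index.create(event)
second = index.create(json.loads(json.dumps(event)))  # same content, sent again
print(first, second)
```

In this model, any event that repeats the same values for the fingerprinted fields is rejected, whether it is a genuine resend or a new event that happens to carry identical data.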
Example log when Logstash receives a rejected event:
[2024-04-24T14:13:25,988][WARN ][logstash.outputs.elasticsearch][Elastic-Agent-to-Logstash].[6ef18a6008d3cea8f01e0cd409c22213845cff0829c5f76b02d18a30a22c024d] Failed action {:status=>409, :action=>["create", {:_id=>nil, :_index=>"metrics-windows.service-default", :routing=>nil}, {"host"=>{"mac"=>["...", "..."], "name"=>"redacted-name", "ip"=>["1", "2", "3"], "architecture"=>"x86_64", "id"=>"aae15", "os"=>{"name"=>"Windows Server 2016 Datacenter", "platform"=>"windows", "type"=>"windows", "kernel"=>"10.0.14393.6897 (rs1_release.240404-1613)", "family"=>"windows", "build"=>"14393.6897", "version"=>"10.0"}, "hostname"=>"redacted-host"}, "service"=>{"type"=>"windows"}, "elastic_agent"=>{"version"=>"8.11.3", "id"=>"48423d1c-5a87-46ef-b6a8-baf90b515e63", "snapshot"=>false}, "metricset"=>{"name"=>"service", "period"=>60000}, "event"=>{"duration"=>211345500, "module"=>"windows", "dataset"=>"windows.service"}, "cloud"=>{"service"=>{"name"=>"redacted-name"}, "instance"=>{"id"=>"redacted-id", "name"=>"redacted-name"}, "provider"=>"openstack", "machine"=>{"type"=>"t"}, "availability_zone"=>"zone"}, "@timestamp"=>2024-04-24T14:12:24.991Z, ...., "type"=>"metricbeat", "id"=>"48423d1c-5a87-46ef-b6a8-baf90b515e63", "ephemeral_id"=>"fdd26638-3865-4e63-961a-69d8f419538a", "version"=>"8.11.3"}, "@version"=>"1"}], :response=>{"create"=>{"status"=>409, "error"=>{"type"=>"version_conflict_engine_exception", "reason"=>"[abcd][{agent.id=my-agent-id, cloud.availability_zone=zone, cloud.instance.id=i-id, windows.service.pid=996, windows.service.state=Running}@2024-04-24T14:12:24.991Z]: version conflict, document already exists (current version [1])", "index_uuid"=>"wBkYL16ZSo2fJ7tAC3zbbw", "shard"=>"0", "index"=>".ds-metrics-windows.service-default-2024.04.22-000065"}}}}
- Logstash is under backpressure and cannot acknowledge events to the agent; as a result, the agent times out and resends them. In reality, the events may already be indexed in ES. A quick resolution is to extend the agent timeout, though this may depend on the situation.
- etc.
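For the backpressure case above, it can help to separate "already exists" rejections from other failures when inspecting a bulk response. The following is a minimal Python sketch, assuming the standard bulk response shape seen in the log above (an item keyed by action, with `status` and `error.type`). The function name `triage_bulk_items` and the sample response are hypothetical and not part of Logstash or any client library.

```python
def triage_bulk_items(bulk_response):
    """Split bulk items into likely duplicates (409 on create) and other failures."""
    duplicates, failures = [], []
    for item in bulk_response.get("items", []):
        for action, result in item.items():
            status = result.get("status", 0)
            error_type = result.get("error", {}).get("type")
            if (
                action == "create"
                and status == 409
                and error_type == "version_conflict_engine_exception"
            ):
                # The document already exists, for example after a resend.
                duplicates.append(result)
            elif status >= 300:
                failures.append(result)
    return duplicates, failures


# Hypothetical bulk response with one success, one conflict, one other failure.
sample = {
    "errors": True,
    "items": [
        {"create": {"status": 201, "_index": "metrics-windows.service-default"}},
        {
            "create": {
                "status": 409,
                "_index": "metrics-windows.service-default",
                "error": {"type": "version_conflict_engine_exception"},
            }
        },
        {
            "create": {
                "status": 429,
                "_index": "metrics-windows.service-default",
                "error": {"type": "es_rejected_execution_exception"},
            }
        },
    ],
}

duplicates, failures = triage_bulk_items(sample)
print(len(duplicates), "likely duplicate(s),", len(failures), "other failure(s)")
```

A 409 classified this way only indicates that a document with the same _id is already present; whether it is a harmless resend still depends on the situation, as noted above.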
URL:
Example: https://www.elastic.co/guide/en/logstash/current/introduction.html
Anything else?