The assumption, "ARP poisoning attacks only work when the gateway's MAC address is not in the victim's ARP cache", is false.
There is a kind of ARP request or reply known as a "gratuitous ARP". From the Wireshark Wiki:
"Gratuitous in this case means a request/reply that is not normally
needed according to the ARP specification (RFC 826)"
A gratuitous ARP reply is a reply sent without any request having been made for it.
The wiki page also has some information about legitimate use of gratuitous ARP replies. However, they can also be used maliciously, and they are the driving force behind most ARP poisoning attacks.
Let's look at an example network setup:
Gateway:
IP address: 192.168.1.1
MAC address: 11:11:11:11:11:11
Victim:
IP address: 192.168.1.100
MAC address: 22:22:22:22:22:22
Victim's ARP cache:
192.168.1.1 is at 11:11:11:11:11:11
Attacker:
IP address: 192.168.1.200
MAC address: 33:33:33:33:33:33
In this case, the gateway's MAC address is already cached by the victim, and the attacker wants to convince the victim that the attacker's machine is the gateway. The attacker can send a gratuitous ARP reply to the victim:
192.168.1.1 is at 33:33:33:33:33:33
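To make that message concrete, here is a rough sketch of how the 28-byte ARP payload for such a reply is laid out for Ethernet and IPv4, following the field order in RFC 826. The helper names `mac_to_bytes` and `build_arp_reply` are made up for this illustration, and the code only builds the bytes in memory; it does not send anything.

```python
import socket
import struct


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into its 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_arp_reply(sender_mac: str, sender_ip: str,
                    target_mac: str, target_ip: str) -> bytes:
    """Pack an ARP reply payload (Ethernet/IPv4) in network byte order."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address length
        4,                            # protocol address length
        2,                            # opcode: 2 = reply
        mac_to_bytes(sender_mac),     # "is at" MAC being claimed
        socket.inet_aton(sender_ip),  # IP address being claimed
        mac_to_bytes(target_mac),     # recipient's MAC
        socket.inet_aton(target_ip),  # recipient's IP
    )


# The reply from the example: "192.168.1.1 is at 33:33:33:33:33:33"
payload = build_arp_reply(
    "33:33:33:33:33:33", "192.168.1.1",
    "22:22:22:22:22:22", "192.168.1.100",
)
print(len(payload), payload.hex())
```

Note that nothing in the payload refers back to a request: there is no transaction ID or nonce, which is part of why the receiver cannot tell a solicited reply from an unsolicited one.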
Even though the victim did not request this information, it will happily update its ARP table:
Victim's ARP cache:
192.168.1.1 is at 33:33:33:33:33:33
Now the victim's traffic will go to the attacker instead of the gateway.
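The cache behavior in this walkthrough can be modeled with a few lines of Python. This is only a toy model of the behavior described here, not the code of any particular operating system; the `ArpCache` class and its method names are invented for the example, and real network stacks differ in detail.

```python
class ArpCache:
    """Toy ARP cache: maps an IP address to the last MAC address announced for it."""

    def __init__(self):
        self.entries = {}

    def on_arp_reply(self, ip: str, mac: str) -> None:
        # Assumption of this model: the cache never checks whether a
        # request was outstanding, so any reply overwrites the entry.
        self.entries[ip] = mac

    def lookup(self, ip: str):
        return self.entries.get(ip)


victim = ArpCache()

# Legitimate entry learned from the real gateway.
victim.on_arp_reply("192.168.1.1", "11:11:11:11:11:11")
before = victim.lookup("192.168.1.1")

# Unsolicited reply from the attacker for the same IP address.
victim.on_arp_reply("192.168.1.1", "33:33:33:33:33:33")
after = victim.lookup("192.168.1.1")

print("before:", before)
print("after: ", after)
```

In this model the most recent announcement simply replaces whatever was cached, which is the property the attack relies on.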
So, the takeaway is that gratuitous ARP replies can be used to update existing, legitimate entries in favor of malicious ones. The victim doesn't have a way of knowing whether or not the gratuitous ARP is malicious; what if the machine was just reconfigured or had a NIC swap? There is not really a way to tell.
"I would like to understand which ARP reply should he save and which he should discard"
As indicated above, the most recently received ARP reply is used to update the cache, regardless of whether it was gratuitous or not.