Duplicate frames are compressed to zero (after CABAC, they use ~1byte/frame). VFR is a hack; it is not a compression tool.
Unless you have the original source from before any artificial noise or grain was added, the "duplicate" frames are never compressed to zero. Even with quite heavy temporal smoothing I still get somewhere between 200 bytes and 7 KB per "duplicate" frame.
There are two types of VFR. One mixes clips with different base rates; the other lets the rate vary freely, dropping as low as 1.2 FPS (1-pass Dedup). I personally find that deduping (probably a better word than VFR, which gets confused with the first type) drops between 2000 and 16000 frames per episode of anime. On most samples I've tried, the bitrate saving is worth as much as 3 CRF. The preset is not so much the issue; the key part really is --ref 16, unless you want your encode to be DXVA compatible, in which case you're stuck with however many reference frames the resolution allows.
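To make the 1.2 FPS figure concrete, here is a rough sketch of how collapsing runs of duplicates turns into a variable rate. It is not Dedup's actual code. The base rate of 24000/1001 fps and the cap of 20 frames per run are my assumptions, picked because together they reproduce the ~1.2 FPS floor; collapse_runs and effective_fps are made-up names.

```python
from fractions import Fraction

BASE_FPS = Fraction(24000, 1001)  # assumed base rate (typical for anime)
MAX_RUN = 20                      # assumed cap on how long one frame may be held

def collapse_runs(is_duplicate, max_run=MAX_RUN):
    """is_duplicate[i] is True when frame i duplicates frame i-1.

    Returns a list of (kept_frame_index, frames_held) pairs.
    """
    kept = []
    for i, dup in enumerate(is_duplicate):
        if i > 0 and dup and kept[-1][1] < max_run:
            # Extend the display time of the last kept frame.
            kept[-1] = (kept[-1][0], kept[-1][1] + 1)
        else:
            # Either a real change or the run hit the cap: keep this frame.
            kept.append((i, 1))
    return kept

def effective_fps(frames_held, base_fps=BASE_FPS):
    """Instantaneous rate while one frame is held for frames_held base frames."""
    return base_fps / frames_held

# One still shot of 25 frames followed by two frames of motion.
flags = [False] + [True] * 24 + [False, False]
kept = collapse_runs(flags)
print(kept)
print(float(effective_fps(MAX_RUN)))
```

With these assumptions the 27 input frames become 4 encoded frames, and the longest hold runs at roughly 1.2 FPS.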
The saving comes from the reduced number of I-frames, better use of reference frames, and temporal smoothing.
The main problem with Dedup is that unless you come up with some really good hack (helping Dedup tell detail from noise), you're going to drop a lot of detail from your clip. Noise can register as high as 0.9%, but a small mouth movement or a moving cloud will only register 0.3%. Set the Dedup threshold too low and you don't drop enough duplicates to benefit from it; set it too high and your encode will be jerky.
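Here is a toy illustration of why a single whole-frame threshold can't separate the two cases. It is not Dedup's real metric, just a mean absolute difference expressed as a percentage of full scale; the frame size, the pixel values, and the per-block variant are all assumptions of mine, chosen to land near the 0.9% vs 0.3% situation described above.

```python
def mean_abs_diff_percent(a, b):
    """Mean absolute pixel difference as a percentage of full scale (255)."""
    total = sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return 100.0 * total / (len(a) * len(a[0]) * 255)

def max_block_diff_percent(a, b, block=8):
    """Same metric, computed per block; returns the worst block."""
    worst = 0.0
    for by in range(0, len(a), block):
        for bx in range(0, len(a[0]), block):
            sub_a = [row[bx:bx + block] for row in a[by:by + block]]
            sub_b = [row[bx:bx + block] for row in b[by:by + block]]
            worst = max(worst, mean_abs_diff_percent(sub_a, sub_b))
    return worst

SIZE = 64
base = [[128] * SIZE for _ in range(SIZE)]

# "Noise": every pixel wobbles by +/-2, nothing actually moves.
noisy = [[128 + (2 if (x + y) % 2 else -2) for x in range(SIZE)]
         for y in range(SIZE)]

# "Mouth movement": one 8x8 patch changes a lot, the rest is identical.
mouth = [row[:] for row in base]
for y in range(8, 16):
    for x in range(8, 16):
        mouth[y][x] += 50

print(mean_abs_diff_percent(base, noisy), mean_abs_diff_percent(base, mouth))
print(max_block_diff_percent(base, noisy), max_block_diff_percent(base, mouth))
```

Over the whole frame the noise scores higher than the real movement, so any threshold that drops the noisy "duplicate" also drops the mouth frame. Looking at the worst block flips the ranking, which is the sort of hack I mean, at the cost of extra processing per frame.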
Depending on how resource-intensive your Dedup hack is (if you have one) and on your x264 settings, you might actually see your encoding time increase. I also frequently see Dedup skip so many frames that dgavc gives decoding errors on H.264 sources, essentially requiring a lossless pass prior to dedup().
Seekability is NOT affected, because any given frame is still no further from a keyframe than the keyframe interval. I can't see the point of using Dedup on live action, because there is hardly anything in it that is 100% still.