I don't have the numbers, but listen to Sven in one of the recent interviews (I think it was the PFH), where he talked about rigs.
It's not just 1x large and 1x short.
Let's start with the basics. I will make an assumption here: humans, tieflings, elves, gith and half-elves most likely share a rig (= the skeleton that gets animated). Halflings, dwarves and gnomes (goblins as well) most likely share a second one. I would have to check the heights, but there is a chance dragonborn and half-orcs share a third rig, plus a tail for dragonborn (otherwise those 2 might be separate as well, or half-orcs use the human one).
Now multiply that by 2 for genders. Let's ignore the body types, because I've not seen enough to say how many there are and what can possibly be reused.
Since males and females usually move very differently (think of the complaints about female Shepard's male walk), you will need separate animations for each gender.
Ok, that's 6 sets of custom animations for the player character in each scene (probably all created just for these scenes and not used in the rest of the game). Let's assume the origin characters always have the same animation and role in their own scene. That also assumes that when you play an origin character, they don't get their own animation in someone else's sex scene (so every average-sized male will act like Astarion when sleeping with Halsin). This puts us at 6 different sets of animations + the origin character per sex scene.
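To make the arithmetic explicit, here is a quick Python sketch of that count. The rig groupings are only my guesses from above, not confirmed numbers, and the names (`RIG_GROUPS`, `animation_sets`) are made up for illustration; none of this comes from Larian.

```python
from itertools import product

# ASSUMPTION: three shared rigs, grouped the way I guessed above.
# Larian has not confirmed these groupings.
RIG_GROUPS = {
    "medium": ["human", "tiefling", "elf", "gith", "half-elf"],
    "short": ["halfling", "dwarf", "gnome"],
    "large": ["dragonborn", "half-orc"],
}
GENDERS = ["male", "female"]


def animation_sets(rig_groups, genders):
    """Return one (rig, gender) pair per custom animation set needed for the player side."""
    # Iterating a dict yields its keys, so this pairs every rig name with every gender.
    return list(product(rig_groups, genders))


sets_per_scene = animation_sets(RIG_GROUPS, GENDERS)
print(len(sets_per_scene))  # 3 rigs x 2 genders = 6
```

If dragonborn and half-orcs turn out to need separate rigs, adding a fourth entry to `RIG_GROUPS` bumps the count to 8 per scene, and that is still before body types.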
That's just the animations. Now the different sizes might mean different camera and light setups - probably up to 3 different setups - and just looking at Halsin's scene, it's not 1 camera angle and done.
Add to that special effects for narrative reasons, like Halsin's transformation or other cases.
All of this is in scenes that have high quality expectations because they are important to players. The camera is often close up, and a lot of touching is happening involving things like hair and softer body parts like breasts - all things that can't be simulated well with regular video game character assets (so if you want those to look good, you actually have to build custom solutions for those scenes that are not feasible in the regular game due to performance).
Compare that to Geralt having sex scenes with chosen female characters 10 years ago, when the graphical fidelity was way lower, or to games where those scenes fade to black - it's a tremendous difference in effort and time spent.
Sure, you can take shortcuts and lower the quality of each of those, but that's my point - if you are doing cinematics you should have a certain level of consistency, at least in the main ones (main story, important side quests, romances, ...). If you can't, it will look off in the game, and that usually isn't something AAA game devs want.