It makes me really look forward to tech such as Adobe's VoCo (which was only a prototype).

I would definitely take no VO over something like that. There is so much more to voice-over than reading lines.
That prototype was not text-to-speech; it sounded natural. It could be used, for example, to make deepfakes indistinguishable from the actual person. It was very impressive, but it may remain only a prototype because of privacy concerns and the like.