Idk the full process, but the common (if you can call it that) method takes the raw vocals of an artist, trains a model on them, then sort of layers that voice on top of another artist's vocals. So it imitates the sound of the artist, but it isn't generating entirely new audio per se. A different method, which generates audio from text input, sounds more like the artist it's trying to replicate IMO, but it fails more often and throws in random notes where they shouldn't be (e.g. the Ariana Grande cover of Diamonds).
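
To make the first method a bit more concrete, here's a toy sketch of the "keep the performance, swap the voice" idea. It is not how any real voice cloning tool works internally: a real one learns the voice with a neural network, whereas here the "voice" is just a hand-picked list of harmonic strengths. Every name and number below (`render`, `convert`, `target_timbre`, the pitches) is made up for illustration.

```python
import math

SAMPLE_RATE = 8000   # samples per second (toy value)
FRAME_LEN = 80       # samples per pitch frame, i.e. 10 ms at 8 kHz

def render(notes_hz, timbre):
    """Turn per-frame pitches plus harmonic strengths into audio samples."""
    samples, phase = [], 0.0
    for f0 in notes_hz:
        for _ in range(FRAME_LEN):
            # Advance the phase of the fundamental by one sample.
            phase += 2 * math.pi * f0 / SAMPLE_RATE
            # Sum the harmonics; their relative strengths stand in for timbre.
            samples.append(sum(a * math.sin((k + 1) * phase)
                               for k, a in enumerate(timbre)))
    return samples

def convert(performance_notes, target_timbre):
    """Keep the other singer's notes and timing, swap in the target's timbre."""
    return render(performance_notes, target_timbre)

# Hypothetical values, purely for illustration.
performance_notes = [220.0, 220.0, 246.9, 261.6]  # what the other artist sang
target_timbre = [1.0, 0.5, 0.25]                  # stand-in for the "learned" voice

audio = convert(performance_notes, target_timbre)
```

The notes and timing all come from the existing performance, which is why this approach tends to stay on key: nothing new is being composed, only the sound of the voice changes.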