Looking into this further and arguing with Bing, I think I see what the issue is: the brain doesn't function like an artificial neural network, because there is no backpropagation in the brain. Instead, all training for humans is performed race-wide: genetics change from generation to generation, which manifests as the brain updating based on the new DNA. There is no immediate updating of the individual neurons tied to a training experience. What happens instead is that negative or positive experiences reinforce whichever neural pathway was used, much the way strain placed on a muscle strengthens it.
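To make that contrast concrete, here's a rough Python sketch (entirely my own illustration, nothing from the Bing conversation, and the sizes are made up) of that kind of local "the pathway you used gets stronger" update, a Hebbian-style rule with no error signal arriving from anywhere else:

```python
import numpy as np

# Hypothetical illustration: connections only strengthen when they are
# actually used, with no gradient and no correction sent back from above.

rng = np.random.default_rng(0)
weights = rng.normal(scale=0.1, size=(4, 3))    # 4 inputs feeding 3 "neurons"

def local_update(x, weights, lr=0.01):
    """Strengthen whatever connections just fired together (purely local)."""
    y = np.maximum(weights.T @ x, 0.0)          # how strongly each neuron fires
    weights += lr * np.outer(x, y)              # co-activity -> stronger connection
    return y

x = rng.normal(size=4)
for _ in range(5):
    local_update(x, weights)                    # the used pathway keeps reinforcing itself
```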
The actual "learning" takes place collectively, at an organizational level, externally, and that is what backpropagation emulates. Each neuron here represents a person: orders are passed down from management and reports are passed upward from the workers, which is how the process looks irl. This creates inertia, where management won't easily change, and that is what produces things like an image generation AI locking in a flawed structure (six fingers and three arms, for example) high up in the hierarchy, then refining it to a high level of detail without ever easily fixing the major errors.
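For comparison, this is roughly what the "reports up, orders down" picture looks like as actual backpropagation. A toy two-layer network in plain numpy, purely for illustration; the layer names and sizes are my own invention:

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(scale=0.1, size=(3, 4))   # lower layer ("workers")
W2 = rng.normal(scale=0.1, size=(1, 3))   # upper layer ("management")
x = rng.normal(size=4)
target = np.array([0.5])
lr = 0.1

for _ in range(50):
    # Forward pass: activity flows upward, like reports going to management.
    h = np.tanh(W1 @ x)
    out = W2 @ h

    # Backward pass: the error is judged at the top and corrections are
    # handed back down the chain, like orders flowing from management.
    err = out - target
    grad_W2 = np.outer(err, h)
    grad_h = (W2.T @ err) * (1 - h ** 2)
    grad_W1 = np.outer(grad_h, x)

    W2 -= lr * grad_W2
    W1 -= lr * grad_W1
```

The point of the sketch is just that every correction arrives from above, which is where the inertia in the analogy comes from.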
If you want an AI that can actually create art, this way of working has to change: let a bot instead learn from how artists actually work, by training on their use of image creation software or, in more advanced cases, by letting the AI watch someone paint or draw by hand.
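As a sketch of what "training on the process" could mean, here's a toy example, all of it hypothetical (the 8x8 canvas, the four-number stroke, the linear model), that treats a recorded drawing session as (canvas state, next action) pairs and fits an imitation policy to them:

```python
import numpy as np

# Stand-in for a logged drawing session: snapshots of the canvas paired with
# the stroke the artist made next. A real system would log real software use.

rng = np.random.default_rng(2)

def fake_recording(n_steps=50):
    canvas = np.zeros((8, 8))
    pairs = []
    for _ in range(n_steps):
        action = rng.uniform(0, 1, size=4)        # e.g. x, y, pressure, size
        r, c = int(action[0] * 7), int(action[1] * 7)
        pairs.append((canvas.flatten().copy(), action))
        canvas[r, c] += action[2]                 # "apply" the stroke
    return pairs

data = fake_recording()
X = np.stack([state for state, _ in data])        # canvas states
Y = np.stack([action for _, action in data])      # actions the artist took

# Fit a linear imitation policy by least squares: predict the next stroke
# from the current canvas. A serious model would be far richer than this.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)
predicted_actions = X @ W
```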
This is how self-driving cars are trained: they learn from the driving process itself, not from a mere finished product the way the image generation AIs do (whatever a "finished product" would even be in the self-driving car case).