My current focus is on improving evaluation methods in the field of co-speech gesture generation. Co-speech gestures for embodied conversational agents (such as virtual agents and social robots) are nowadays generated by machine-learning models trained on human motion data, and objective metrics are used in both the training and evaluation phases of these models. However, before the generated nonverbal behaviour of these models can be deployed in virtual agents and social robots, it is crucial to run user studies with human participants rather than rely solely on the outcomes of objective metrics. This evaluation phase depends on subjective methods, as the perception of nonverbal behaviour is inherently subjective. Improving subjective evaluation strategies will in turn lead to better generation models, so that they can reliably drive nonverbal behaviour in virtual agents and social robots.