Typically, the person running the stream has created a bot that runs locally and is connected to the game or emulator. The bot reads chat and looks for specific input words or letters (for example "a", "start", or "up"). Every time it reads "a", it sends the input that activates the "A" button in the game or emulator.
The same applies to the other inputs: "start" presses the Start button, "up" presses the up arrow, and so on.
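As a rough illustration of that loop, here is a minimal Python sketch. It is not any particular streamer's bot: the `COMMANDS` table, the button names, and the `press` callback are all hypothetical stand-ins, and it assumes a message only counts when the whole message is the command word. A real bot would replace `press` with whatever actually sends a key press to the game or emulator, and would get its messages from the chat service.

```python
from typing import Callable, Iterable, Optional

# Hypothetical mapping from chat words to button names.
# A real bot would use whatever names its emulator expects.
COMMANDS = {
    "a": "A",
    "b": "B",
    "start": "START",
    "select": "SELECT",
    "up": "UP",
    "down": "DOWN",
    "left": "LEFT",
    "right": "RIGHT",
}


def parse_command(message: str) -> Optional[str]:
    """Return the button for a chat message, or None if it is not a command.

    Assumption: the whole message must be the command word,
    ignoring case and surrounding whitespace.
    """
    return COMMANDS.get(message.strip().lower())


def handle_chat(messages: Iterable[str], press: Callable[[str], None]) -> None:
    """Read chat messages in order and press a button for each command."""
    for message in messages:
        button = parse_command(message)
        if button is not None:
            press(button)  # stand-in for sending the input to the emulator


if __name__ == "__main__":
    pressed = []
    handle_chat(["a", "hello everyone", "Start", " up "], pressed.append)
    print(pressed)  # ['A', 'START', 'UP']
```

Ordinary chat such as "hello everyone" is simply ignored, which is why normal conversation can carry on alongside the commands.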