I asked Claude to write questions to put to another LLM (without saying which one), to assess which of the two was better at Baroque theory. Claude won easily: https://claude.ai/share/1f846bcd-e5d6-45a5-9c3f-f7e57f9d7308
To be fair, I asked Google Gemini Pro to do the same thing for Claude. It admitted defeat, 5-0 for Claude: https://gemini.google.com/share/77d69ec1a19c