As well as playing against themselves and fellow AI agents, the LLMs played against 2,000 experienced human players. They were evaluated based on how well they kept track of what was going on. For ...