Facebook researchers believe that the game NetHack is well suited to training, testing, and evaluating artificial intelligence models. To that end, they have released the NetHack Learning Environment, a research tool for benchmarking the robustness and generalization of reinforcement learning agents.
Games have served as benchmarks for artificial intelligence for decades, but things kicked into gear in 2013. That was the year Google subsidiary DeepMind demonstrated an artificial intelligence system that could play Q*bert, Enduro, Beamrider, Seaquest, Space Invaders, Breakout, and Pong at superhuman levels. According to people such as DeepMind cofounder Demis Hassabis, these advances are not merely improving game design. Instead, they inform the development of systems that might one day diagnose illnesses, predict complicated protein structures, and segment CT scans.
NetHack is more complicated than it initially seems. It tasks players with descending more than 50 dungeon levels to retrieve a magical amulet. Along the way, they must use hundreds of items and fight monsters while contending with the rich interactions between the two. Levels in NetHack are procedurally generated, so every playthrough is different. The Facebook researchers note that this tests the generalization limits of current state-of-the-art artificial intelligence.
NetHack has another advantage in its lightweight architecture. Its complexity is captured by a turn-based, ASCII-art world and a game engine written primarily in C. It forgoes all but the most straightforward physics and renders symbols instead of pixels, which allows models to learn fast without wasting computational resources on simulating dynamics or rendering observations.
This matters because training sophisticated machine learning models in the cloud is prohibitively expensive. According to a recent Synced report, the University of Washington's Grover, which is tailored for both the generation and detection of fake news, cost $25,000 to train over two weeks. OpenAI racked up $256 per hour to train its GPT-2 language model, and Google spent around $6,912 to train BERT, a bidirectional transformer model that redefined the state of the art for 11 natural language processing tasks.
By contrast, a single high-end graphics card is enough to train artificial intelligence-driven NetHack agents for hundreds of millions of steps a day using the TorchBeast framework, which supports further scaling by adding more graphics cards or machines. Agents can experience billions of steps in the environment in a reasonable time frame while still challenging the limits of what current artificial intelligence techniques can achieve.
NetHack presents a challenge that is on the frontier of current methods, without the computational costs of other challenging simulation environments, the Facebook researchers wrote this week in a preprint paper. Standard deep reinforcement learning agents currently operating on NetHack explore only a fraction of the overall game. To make progress in this challenging new environment, the researchers added, agents will have to move beyond tabula rasa learning.
The NetHack Learning Environment consists of three components: a Python interface to NetHack using the widely used OpenAI Gym API, a suite of benchmark tasks, and a baseline agent. The suite comprises seven benchmark tasks against which an agent's progress can be measured.
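To give a sense of what interacting through a Gym-style interface looks like, here is a minimal sketch of a random agent loop. It assumes the classic Gym contract, in which reset() returns an observation and step(action) returns an observation, a reward, a done flag, and an info dictionary. ToyDungeonEnv and run_random_episode are hypothetical names invented for this illustration; the toy environment is only a stand-in so the loop runs without NetHack installed, and it is not part of the NetHack Learning Environment.

```python
import random


class ToyDungeonEnv:
    """Hypothetical stand-in that follows the classic Gym reset/step contract."""

    def __init__(self, depth=5, seed=0):
        self.depth = depth              # number of levels to descend
        self.rng = random.Random(seed)  # private RNG so runs are repeatable
        self.level = 0

    def reset(self):
        self.level = 0
        return self.level               # the observation is the current level

    def step(self, action):
        reward = 0.0
        # Action 1 attempts to descend and succeeds half of the time;
        # action 0 does nothing.
        if action == 1 and self.rng.random() < 0.5:
            self.level += 1
            reward = 1.0
        done = self.level >= self.depth
        return self.level, reward, done, {}


def run_random_episode(env, num_actions, max_steps=1000, seed=0):
    """Play one episode with uniformly random actions.

    Returns (total_reward, steps_taken).
    """
    rng = random.Random(seed)
    env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done and steps < max_steps:
        action = rng.randrange(num_actions)
        _obs, reward, done, _info = env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps


if __name__ == "__main__":
    total, steps = run_random_episode(ToyDungeonEnv(), num_actions=2)
    print(f"reward={total} after {steps} steps")
    # Assumption, not executed here: with the real package installed, the
    # environment would be created along the lines of
    #     import gym, nle
    #     env = gym.make("NetHackScore-v0")
    # Newer Gym-compatible releases may return a different tuple from step().
```

The same loop should apply to any environment that honors this contract, which is the practical appeal of building on a shared API: a learned policy can replace the random action choice without changes to the surrounding code.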