“Researchers at Google’s AI labs created a couple of novel neural networks that can navigate Web forms, such as those on an online flight-booking site. Although these are baby steps at the moment, the programs perform as well as or better than some models trained using human demonstrations of pointing and clicking.
“As an example, in the flight-booking environment the number of possible instructions/tasks can grow to more than 14 million, with more than 1,700 vocabulary words and approximately 100 Web elements in each episode.
The first, “QWeb,” is a Deep Q-Network that is enhanced by breaking up the Web page into rewards for each step in a travel-booking exercise, such as entering the date of a flight. That tends to increase the rewards that the neural net receives as it goes along.
The second, called “INET,” is another Deep Q-Network that gets rewards as it properly generates instructions for QWeb to follow. It is INET’s job to digest the Web page, in the form of a “document-object model,” or “DOM,” and come up with the steps QWeb should take to make choices in the Web form, such as picking an airport code from a drop-down list of “destinations” in the form.”
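
The idea of rewarding QWeb for each step of the booking, rather than only for a fully correct form, can be illustrated with a minimal Python sketch. This is not the researchers' implementation: the function names, the field names ("from", "to", "date"), and the choice to reward the change in the fraction of correctly filled fields are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Form = Dict[str, str]


def sparse_reward(form: Form, goal: Form) -> float:
    """Reward 1.0 only when every field matches the instruction (hypothetical baseline)."""
    return 1.0 if all(form.get(k) == v for k, v in goal.items()) else 0.0


def progress(form: Form, goal: Form) -> float:
    """Fraction of goal fields that currently hold the requested value."""
    return sum(form.get(k) == v for k, v in goal.items()) / len(goal)


def stepwise_reward(before: Form, after: Form, goal: Form) -> float:
    """Reward the change in progress caused by a single action (assumed shaping rule)."""
    return progress(after, goal) - progress(before, goal)


def run_episode(goal: Form, actions: List[Tuple[str, str]]) -> Tuple[Form, List[float]]:
    """Apply (field, value) actions to an empty form and collect per-step rewards."""
    form: Form = {}
    rewards: List[float] = []
    for field, value in actions:
        before = dict(form)   # snapshot of the form before the action
        form[field] = value   # the action fills in one field
        rewards.append(stepwise_reward(before, form, goal))
    return form, rewards


# Illustrative instruction and a sequence of actions with one mistake.
goal = {"from": "SFO", "to": "JFK", "date": "2019-01-15"}
actions = [("from", "SFO"), ("to", "LAX"), ("to", "JFK"), ("date", "2019-01-15")]
final_form, rewards = run_episode(goal, actions)
print(rewards)
print(sparse_reward(final_form, goal))
```

In this sketch, each correct entry earns a share of the total reward and the mistaken entry earns nothing, so the agent receives feedback throughout the episode instead of a single signal at the end.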