Google says its algorithm can correctly caption a photograph with nearly 94 percent accuracy.
The company says the improvements come with the third version of Inception, the image-recognition model underlying the system, with the scores measured on ImageNet, a standardized image-classification benchmark. It reports that the first version scored 89.6 percent, the second 91.8 percent and the new one 93.9 percent.
According to Google, the biggest improvement has come from teaching the system to describe images rather than merely classify them. It gives the example of a basic model that can recognize a dog, grass and a Frisbee, versus a more sophisticated one that also captures useful descriptive information, such as how the dog and the Frisbee are interacting.
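Captioning models of this kind typically pair an image encoder with a language decoder that emits a sentence one word at a time, each choice conditioned on the image and the words generated so far. The toy Python sketch below (the lookup-table "model" and all names are illustrative assumptions, not Google's released code) shows the greedy word-by-word decoding idea, which is how a descriptive phrase like "a dog catching a frisbee" gets composed rather than just a bag of labels:

```python
# Toy stand-in for a trained decoder: maps a caption prefix to the most
# likely next word. A real model would score an entire vocabulary here,
# conditioned on features extracted from the image.
NEXT_WORD = {
    (): "a",
    ("a",): "dog",
    ("a", "dog"): "catching",
    ("a", "dog", "catching"): "a",
    ("a", "dog", "catching", "a"): "frisbee",
    ("a", "dog", "catching", "a", "frisbee"): "<end>",
}

def greedy_caption(image_features, max_len=10):
    """Greedy decoding: repeatedly append the most likely next word
    until the model emits an end token or the length limit is hit.
    (image_features is unused in this toy; a real decoder conditions
    every step on it.)"""
    words = []
    for _ in range(max_len):
        word = NEXT_WORD.get(tuple(words), "<end>")
        if word == "<end>":
            break
        words.append(word)
    return " ".join(words)

print(greedy_caption(image_features=None))  # a dog catching a frisbee
```

Production systems usually replace the greedy choice with a beam search over several candidate sentences, but the step-by-step composition is the same.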
The improvements also extend to color: rather than just identifying colors more accurately, the model works out for which elements of a picture the color is significant and genuinely variable.
Google has now open-sourced the model so that people can try it for themselves and train their own copies. It is available through TensorFlow, the company's open source machine-learning library.