The final step in UI test automation is checking for correct appearance. What seems to be a simple image-processing task makes even modern AI struggle: it surprisingly fails to correctly attribute many simple visual changes.
LLMs can explain images, but they only detect differences in features they were trained to recognize. Traditional image comparison libraries, on the other hand, require near-perfect pixel alignment and tolerate little distortion. This limitation is problematic in visual test automation, where screenshots of the new and previous versions must be compared. The goal is to identify how an element has changed, not just whether it did. Did it just move a bit, or did the content change, too? This talk covers common algorithms used in test automation, compares the performance of various AI tools, and explains why this task is hard.
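To make the brittleness concrete, here is a minimal pixel-diff sketch in OpenCV. The file names are placeholders, and both screenshots are assumed to have identical dimensions. Note how even a one-pixel shift of otherwise unchanged content is flagged as a large difference, because a raw pixel comparison has no notion of "moved":

```python
import cv2
import numpy as np

# Placeholder file names: two renders of the same page.
before = cv2.imread("screenshot_old.png")
after = cv2.imread("screenshot_new.png")

# Absolute per-pixel difference; requires identical image dimensions.
diff = cv2.absdiff(before, after)

# Flag pixels where any color channel differs by more than a small tolerance.
mask = np.any(diff > 10, axis=2)
print(f"{mask.mean():.2%} of pixels differ")

# A 1-pixel horizontal shift of identical content still lights up the diff.
shifted = np.roll(before, 1, axis=1)
shift_mask = np.any(cv2.absdiff(before, shifted) > 10, axis=2)
print(f"{shift_mask.mean():.2%} of pixels differ after a 1px shift")
```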
We will discuss:
- what multimodal foundation models can achieve and where they struggle
- why this task is easy for humans
- what tools are readily available in test automation suites such as Applitools, Cypress, or Playwright
- how to set up and train a neural network for this task using OpenCV and TensorFlow (a minimal sketch follows this list)
- what else can be done and where the limits are
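As a taste of the neural-network point above, here is a minimal sketch of one possible setup: stack the before/after screenshots into a single 6-channel input and train a small CNN to classify the change type. The class names, image size, and architecture are illustrative assumptions, not the talk's actual pipeline:

```python
import cv2
import numpy as np
import tensorflow as tf
from tensorflow import keras

IMG_SIZE = (128, 128)
NUM_CLASSES = 3  # assumed labels: unchanged / moved / content_changed

def load_pair(path_before: str, path_after: str) -> np.ndarray:
    """Read both screenshots with OpenCV and stack them into a 6-channel input."""
    before = cv2.resize(cv2.imread(path_before), IMG_SIZE).astype("float32") / 255.0
    after = cv2.resize(cv2.imread(path_after), IMG_SIZE).astype("float32") / 255.0
    return np.concatenate([before, after], axis=-1)  # shape: (128, 128, 6)

def build_model() -> keras.Model:
    """A small CNN over the stacked pair; big enough to demo, not to ship."""
    inputs = keras.Input(shape=(*IMG_SIZE, 6))
    x = inputs
    for filters in (16, 32, 64):
        x = keras.layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = keras.layers.MaxPooling2D()(x)
    x = keras.layers.GlobalAveragePooling2D()(x)
    outputs = keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Training, given arrays of stacked pairs and integer labels, would look like:
# model = build_model()
# model.fit(x_train, y_train, validation_split=0.1, epochs=10)
```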