How does the model make predictions on a single image?
How does the model make predictions on a single image?