AutoAlt
2024
·
A browser extension that describes shoe images for screen reader users — built at the LG AI Youth Camp.
The problem
Visually impaired people shopping online often hit a wall with product images. Screen readers need alt text to describe what’s on screen — and most shopping sites don’t have it. You can’t tell two pairs of shoes apart if the page just says “image” twelve times.
Our team spotted this during the LG AI Youth Camp in 2024 — a program run by LG Discovery Lab and Seoul National University. We spent about three months building a solution.
What we built
AutoAlt takes a product image, runs it through a custom object detection model, then hands the result off to GPT-4 to turn into a natural-language description. What comes out the other end is text a screen reader can actually read aloud.
We narrowed the scope a lot over the project. Started with “all clothing on shopping sites,” ended at shoes — the only category we could train reliably in the time we had.
Stack
- YOLOv8 trained on my MacBook Pro (MPS acceleration — first time I heard the fans sound like a jet)
- FastAPI backend — receives image, returns model JSON
- GPT-4 Turbo — turns
{type: "sneaker", laces: true}into a full sentence - Single HTML frontend, triggered via right-click context menu on images
I was the only developer on the team. Everyone else handled planning, design, and presentation.
Results
We won three awards at the final ceremony — LG Talent, Growth, and Exploration prizes — and I was individually selected for the US Silicon Valley trip.

At Stanford during the US camp, we also prototyped an AI legal advice chatbot for minor traffic violations — different project, same design-thinking energy.
More
The full story — application panic, SNU dorm all-nighter, YOLO training on a laptop — is in my LG AI Youth Camp blog post. The US camp writeup covers Silicon Valley.