Tech Log - Browser Image Agent project
Tech Log Entry - Browser Image Agent: An AI-Powered Image Tagger and Saver

Category: AI / Computer Vision / Browser Automation / Local LLM / Python

Initial Goal

I wanted a tool that automatically saves images from open browser tabs to a local folder, with descriptive, search-friendly filenames and tags generated by an AI vision model, replacing my manual save-and-rename workflow.

- All processing must be fully local: no cloud APIs, no subscriptions, no data leaving the machine.
- The tool should use only free open-source software, run on my Windows 11 desktop, and use my RTX 3060 GPU for local AI inference.
- Two operating modes: fully automatic (save all qualifying images without interruption) and human-in-the-loop review (confirm each image before saving).

Hardware Used

- Desktop: Windows 11 Home, NVIDIA GeForce RTX 3060 (12 GB VRAM)
- No additional hardware required; everything runs locally on the desktop

Software Stack

- Python 3.12
- Ollama 0.17.7 (local AI runtime)
- Vision mode...