Overview
This prompt guides the construction of a complex decentralized AI vision assistant on Ethereum through clear, modular phases. It is aimed at developers and learners in blockchain, AI, and computer vision who want a structured, end-to-end project plan.
Prompt Overview
Purpose: To create a decentralized AI vision assistant integrating computer vision, AI reasoning, privacy proofs, and blockchain.
Audience: Developers and researchers interested in AI, blockchain, privacy, and decentralized autonomous agents.
Distinctive Feature: Combines real-time vision, autonomous AI logic, zero-knowledge proofs, and Ethereum smart contracts.
Outcome: A modular, privacy-preserving system that verifies physical events and interacts with Ethereum autonomously.
Quick Specs
- Media: Text
- Use case: Generation
- Industry: Content Management Systems (CMS), Computer Vision, Generative AI
- Techniques: Plan-Then-Solve, Structured Output, Role/Persona Prompting
- Models & tools: Llama 4 Maverick, YOLOv10, circom/snarkjs, ZoKrates, Solidity, Web3.py, React, Streamlit
- Estimated time: 5-10 minutes
- Skill level: Beginner
Variables to Fill
No inputs required — just copy and use the prompt.
Example Variables Block
No example values needed for this prompt.
The Prompt
You are tasked with building a complete end-to-end project named “AIVA — Agentic Intelligent Vision Assistant on Ethereum” based on the detailed proposal provided. To ensure a focused and thorough development process, the project should be divided into multiple phases, each focusing on specific core modules and functionalities.
### Project Overview
– Develop a decentralized autonomous AI agent that combines computer vision, large language models, zero-knowledge proof-based privacy, blockchain smart contract interaction, and user interface components.
– Enable real-world physical event verification through vision inputs, autonomous decision-making via AI, privacy-preserving proofs, and Ethereum smart contract transactions.
### Core Modules to Implement
1. **Vision Module:** Real-time detection of objects/faces/documents from video frames using Python, OpenCV, and YOLOv10.
2. **Agent Logic Module:** Interpret vision outputs and create actionable plans using LLaMA (a large language model) with Hugging Face Transformers and LangChain.
3. **ZK Proof & Identity Module:** Generate and verify zero-knowledge proofs for image-based verification and handle decentralized identities (DIDs) using circom/snarkjs or ZoKrates and DID management libraries.
4. **Blockchain Module:** Deploy and interact with Ethereum smart contracts, automatically sending signed transactions via Web3.py, Solidity, Hardhat/Ganache.
5. **Frontend & UX Module:** Build a user-friendly interface using Streamlit or React, integrating webcam preview, scan controls, wallet connection (MetaMask/Web3Modal), and real-time status updates.
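The interface between the Vision and Agent Logic modules can be sketched in plain Python before an LLM is wired in. Everything below is a hypothetical stand-in: the labels, the confidence threshold, and the contract method names are illustrative assumptions, and a rule table substitutes for the LLaMA/LangChain planner.

```python
# Hypothetical sketch of the Agent Logic Module: maps vision detections to
# candidate on-chain actions. In the full system an LLM (LLaMA via LangChain)
# would produce this plan; here a simple rule table stands in for it.

CONFIDENCE_THRESHOLD = 0.8  # assumed cutoff for acting on a detection

# Hypothetical mapping from detected labels to smart-contract methods.
ACTION_TABLE = {
    "document": "submitDocumentProof",
    "face": "verifyIdentity",
    "package": "confirmDelivery",
}

def plan_actions(detections):
    """Turn a list of (label, confidence) pairs into an ordered action plan."""
    plan = []
    for label, confidence in detections:
        if confidence < CONFIDENCE_THRESHOLD:
            continue  # ignore low-confidence detections
        action = ACTION_TABLE.get(label)
        if action:
            plan.append({"action": action, "label": label, "confidence": confidence})
    return plan

# Example: only the confident, known label survives into the plan.
detections = [("document", 0.93), ("cat", 0.99), ("face", 0.55)]
print(plan_actions(detections))
```

Keeping this mapping as a pure function makes it easy to unit-test the agent's decisions independently of the camera and the chain.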
### Development Phases
– Follow the detailed 24-hour hackathon day plan as a guideline, breaking the project into time-boxed phases that run from environment setup through module development, integration, and testing to final review and demo preparation.
– Ensure modular, clean, and well-documented code that supports each feature and facilitates integration.
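One way to keep the time-boxed plan honest is to express it as data and check the budget programmatically. The phase names and hour allocations below are illustrative assumptions, not taken from the proposal's actual day plan.

```python
# Illustrative time-boxed schedule for the 24-hour hackathon plan.
# Phase names and hour allocations are assumptions for demonstration only.

PHASES = [
    ("Environment setup", 3),
    ("Vision module", 4),
    ("Agent logic module", 4),
    ("ZK proof & identity module", 5),
    ("Blockchain module", 3),
    ("Frontend & UX module", 3),
    ("Integration, testing, demo prep", 2),
]

def total_hours(phases):
    """Sum the hour budget to confirm the plan fits the 24-hour window."""
    return sum(hours for _, hours in phases)

print(total_hours(PHASES))  # → 24
```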
### Expected Deliverables
– Fully functioning webcam-based vision system with live detection overlays.
– Autonomous AI agent interpreting vision data and triggering correct smart contract calls.
– Zero-knowledge proof generation and verification integrated with blockchain events.
– Ethereum smart contracts deployed on a testnet with verified transactions.
– Interactive UI supporting scanning, camera feeds, wallet connection, and transaction feedback.
– Comprehensive documentation including README, architecture diagrams, quickstart guides, and an 8–10 slide pitch deck.
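The "zero-knowledge proof generation and verification integrated with blockchain events" deliverable can be prototyped with a much simpler primitive while the real circuits are being written. The sketch below is only a salted hash commitment, not a zero-knowledge proof: a real deployment would use circom/snarkjs or ZoKrates to prove properties of an image without revealing it. This stand-in illustrates just the commit-then-reveal flow a contract would anchor.

```python
# Minimal stand-in for the proof pipeline: commit to an image's bytes with a
# salted SHA-256 hash. This is NOT a ZK proof; it only illustrates the
# "publish a commitment on-chain, reveal and verify later" pattern.
import hashlib
import os

def commit(image_bytes: bytes, salt: bytes) -> str:
    """Return a hex commitment suitable for storing in a contract."""
    return hashlib.sha256(salt + image_bytes).hexdigest()

def verify_commitment(image_bytes: bytes, salt: bytes, commitment: str) -> bool:
    """Check a revealed (image, salt) pair against the stored commitment."""
    return commit(image_bytes, salt) == commitment

salt = os.urandom(16)
c = commit(b"frame-0001", salt)
print(verify_commitment(b"frame-0001", salt, c))  # → True
print(verify_commitment(b"tampered", salt, c))    # → False
```

Swapping this module for a real snarkjs verifier later should leave the rest of the pipeline unchanged, since only the commit/verify interface is exposed.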
### Guidance
– Reason through each phase’s design and implementation before proceeding.
– Emphasize secure, privacy-preserving approaches when handling sensitive data.
– Maintain modularity to allow scalability and multi-agent extensions.
– Provide explanations and code comments to ensure clarity and maintainability.
### Output Format
– Provide completion for each phase distinctly and modularly, including code snippets, configuration files, and usage instructions.
– Deliver documentation artifacts and architecture diagrams in markdown or compatible formats.
– Generate the pitch deck outline as a bullet-point slide list.
### Notes
– Assume access to all relevant open-source tools, libraries, and frameworks (YOLOv10, LLaMA models, circom/snarkjs, ZoKrates, Solidity, Hardhat, Web3.py, React/Streamlit).
– Adhere strictly to decentralized principles: no centralized oracles or manual verification.
Build the full project with clear phase-driven explanations and outputs, ensuring someone new to blockchain or AI can understand and extend it.
Screenshot Examples
[Insert relevant screenshots after testing]
How to Use This Prompt
- Copy the prompt exactly as provided for full project context and requirements.
- Use the detailed project overview to understand core modules and objectives.
- Follow the phased development plan to build each module step-by-step.
- Include code snippets, config files, and usage instructions per phase.
- Generate documentation and architecture diagrams in markdown format.
- Create a bullet-point pitch deck outline summarizing the project.
Tips for Best Results
- Phase 1 – Environment Setup: Configure Python, Node.js, and Ethereum testnet tools; install YOLOv10, Hugging Face Transformers, circom/snarkjs, Hardhat, and React/Streamlit dependencies to create a unified development workspace.
- Phase 2 – Vision & Agent Logic: Implement real-time object detection with YOLOv10 and OpenCV; integrate LLaMA via LangChain to interpret detections and generate autonomous action plans linked to smart contract calls.
- Phase 3 – ZK Proof & Blockchain Integration: Develop zero-knowledge circuits for image verification using circom or ZoKrates; deploy Ethereum smart contracts with Hardhat; connect proof verification and DID handling to trigger secure blockchain transactions via Web3.py.
- Phase 4 – Frontend & Final Integration: Build an interactive UI with webcam preview, wallet connection, and transaction feedback using React or Streamlit; integrate all modules for seamless user experience; provide comprehensive documentation and a pitch deck for project demonstration.
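Before dropping in the real YOLOv10, LLaMA, snarkjs, and Web3.py components, the four phases can be wired together with stubs to validate the module boundaries. Every function below is a hypothetical stand-in; the interfaces, not the internals, are the point of the sketch.

```python
# Hypothetical end-to-end wiring of the four phases using stub functions.
# Each stub marks where the real component would plug in.
import hashlib

def detect(frame: bytes):
    """Phase 2 stub: pretend YOLOv10 found a document in the frame."""
    return [("document", 0.91)]

def decide(detections):
    """Phase 2 stub: the agent picks a contract call for the top detection."""
    label, _conf = max(detections, key=lambda d: d[1])
    return {"method": "submitDocumentProof", "label": label}

def prove(frame: bytes) -> str:
    """Phase 3 stub: a hash commitment in place of a real ZK proof."""
    return hashlib.sha256(frame).hexdigest()

def send_transaction(call: dict, proof: str) -> dict:
    """Phase 3/4 stub: where Web3.py would sign and broadcast the tx."""
    return {"method": call["method"], "proof": proof, "status": "pending"}

def pipeline(frame: bytes) -> dict:
    """Run vision -> agent -> proof -> transaction for one webcam frame."""
    detections = detect(frame)
    call = decide(detections)
    proof = prove(frame)
    return send_transaction(call, proof)

print(pipeline(b"webcam-frame"))
```

Replacing each stub one phase at a time keeps integration risk low, which suits the tight hackathon schedule described above.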
FAQ
- What is the primary goal of the AIVA project?
  To create a decentralized AI agent combining vision, AI, privacy, and Ethereum smart contracts.
- Which technologies power the Vision Module in AIVA?
  Python, OpenCV, and YOLOv10 for real-time object, face, and document detection.
- How does the Agent Logic Module function?
  It interprets vision outputs using LLaMA, Hugging Face Transformers, and LangChain for planning.
- What privacy technique is used for image verification?
  Zero-knowledge proofs generated and verified via circom/snarkjs or ZoKrates.
Compliance and Best Practices
- Best Practice: Review AI output for accuracy and relevance before use.
- Privacy: Avoid sharing personal, financial, or confidential data in prompts.
- Platform Policy: Your use of AI tools must comply with their terms and your local laws.
Revision History
- Version 1.0 (March 2026): Initial release.


