Showcasing my creative work and technical projects. Each project represents a unique challenge and learning experience.
• Immersive language learning using your own photos. Take a photo and start learning a language!
• Chain-of-thought (CoT) LLM for image recognition and dialogue generation
• TTS for speech synthesis
• Next.js + Tailwind + FastAPI
Go to Viseal Website →
Product Hunt Featured →
• An intelligent AI agent for image translation and explanation.
• Method: mixed OCR + LLM with cross-matching
• Lesson learned: an LLM is good at understanding the context of an image, but it can't give correct text coordinates. OCR plus traditional section-based translation gives coordinates but can't fully capture the context. Cross-matching the two combines their strengths.
• Cons: speed is a challenge. For complex images with a lot of text, translation takes too long (up to 20 s).
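The cross-matching idea can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `OcrBox` and `LlmSegment` types, the `cross_match` function, and the 0.6 similarity threshold are all assumptions made for the example. Each OCR box (which has reliable coordinates) is paired with the most similar LLM segment (which has the context-aware translation) by fuzzy string matching.

```python
from __future__ import annotations

from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class OcrBox:
    """Hypothetical OCR result: recognized text plus its pixel coordinates."""
    text: str
    bbox: tuple[int, int, int, int]  # (x, y, width, height)


@dataclass
class LlmSegment:
    """Hypothetical LLM result: transcribed source text plus its translation."""
    source: str
    translation: str


def similarity(a: str, b: str) -> float:
    """Case-insensitive fuzzy similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def cross_match(boxes, segments, threshold=0.6):
    """Attach to each OCR box the translation of its most similar LLM segment.

    Boxes with no segment above the (assumed) threshold get None, so the
    caller can fall back to plain per-box translation.
    """
    placed = []
    for box in boxes:
        best = max(segments, key=lambda s: similarity(box.text, s.source), default=None)
        if best is not None and similarity(box.text, best.source) >= threshold:
            placed.append((box.bbox, best.translation))
        else:
            placed.append((box.bbox, None))
    return placed


boxes = [
    OcrBox("EXIT", (10, 10, 50, 20)),
    OcrBox("N0 SMOKING", (10, 40, 120, 20)),  # OCR misread "O" as "0"
    OcrBox("12:30", (200, 10, 40, 20)),       # nothing similar in the LLM output
]
segments = [
    LlmSegment("No smoking", "Interdit de fumer"),
    LlmSegment("Exit", "Sortie"),
]
result = cross_match(boxes, segments)
for bbox, translation in result:
    print(bbox, translation)
```

Fuzzy matching tolerates small OCR misreads, while the threshold keeps unrelated boxes from receiving a wrong translation. Comparing every box with every segment is quadratic, which is acceptable for a sketch but would add to the latency problem on text-heavy images.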
Try My Demo →
• Academic collaboration to help reviewers process large batches of unstructured PDFs
• Each PDF contains up to 40 pages of personal information, emails, publications, etc., in no defined order.
• Method: sectional RAG with a local LLM to make sure it doesn't hallucinate
• Lesson learned: RAG + loop + a straightforward (non-thinking) model; thinking models often get wrong results
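Since the project itself is confidential, the following is only a generic sketch of what "sectional RAG + loop" could look like, not the real pipeline. All names (`split_sections`, `rank_sections`, `extract_field`, `fake_llm`) are hypothetical, the keyword-count retriever is a deliberately crude stand-in, and `fake_llm` replaces the local model so the example runs on its own. The loop asks about one small section at a time and accepts an answer only if it appears verbatim in that section.

```python
from __future__ import annotations

import re
from typing import Callable, Optional


def split_sections(pages: list[str], pages_per_section: int = 2) -> list[str]:
    """Group consecutive pages into small sections so each prompt stays short."""
    return [
        "\n".join(pages[i:i + pages_per_section])
        for i in range(0, len(pages), pages_per_section)
    ]


def rank_sections(sections: list[str], keywords: list[str]) -> list[str]:
    """Order sections by keyword hits (a crude stand-in for a real retriever)."""
    def score(section: str) -> int:
        text = section.lower()
        return sum(text.count(k.lower()) for k in keywords)
    return sorted(sections, key=score, reverse=True)


def extract_field(
    sections: list[str],
    question: str,
    keywords: list[str],
    ask_llm: Callable[[str, str], str],
    max_tries: int = 3,
) -> Optional[str]:
    """Loop over the best-ranked sections until an answer is grounded in the text."""
    for section in rank_sections(sections, keywords)[:max_tries]:
        answer = ask_llm(question, section).strip()
        # Grounding check: reject anything not copied verbatim from the section.
        if answer and answer in section:
            return answer
    return None


def fake_llm(question: str, context: str) -> str:
    """Stand-in for a local model; invents an address when none is present."""
    match = re.search(r"[\w.+-]+@[\w-]+\.[\w.-]+", context)
    return match.group(0) if match else "unknown@example.com"


pages = ["Publications: three journal papers", "Contact email: jane@uni.example", "Hobbies: hiking"]
sections = split_sections(pages, pages_per_section=1)
print(extract_field(sections, "What is the email address?", ["email", "@"], fake_llm))
# jane@uni.example
print(extract_field(["no contact details here"], "What is the email address?", ["email"], fake_llm))
# None (the invented address fails the grounding check)
```

The verbatim check is strict and would reject legitimate paraphrases, so it only suits extraction-style fields such as emails or titles.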
🔒 Private Project - Confidential
• Research lab collaboration with the University of Electro-Communications (UEC), Tokyo
• A Python analysis tool for calculating and plotting transient absorption spectroscopy data, with an intuitive graphical user interface
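For readers unfamiliar with the measurement, the core calculation in transient absorption spectroscopy is the change in absorbance between the pumped and unpumped sample, ΔA = -log10(I_pumped / I_unpumped). The snippet below is a small standalone illustration of that standard formula; the function name and sample numbers are made up for the example and are not taken from the tool itself.

```python
import math


def delta_absorbance(pumped, unpumped):
    """Return ΔA = -log10(I_pumped / I_unpumped) for each pair of probe intensities."""
    if len(pumped) != len(unpumped):
        raise ValueError("pumped and unpumped must have the same length")
    return [-math.log10(p / u) for p, u in zip(pumped, unpumped)]


# Made-up probe intensities at three wavelengths (arbitrary units).
pumped = [90.0, 100.0, 110.0]
unpumped = [100.0, 100.0, 100.0]
delta_a = delta_absorbance(pumped, unpumped)
for value in delta_a:
    print(f"{value:+.4f}")
```

A positive ΔA means the excited sample absorbs more probe light at that wavelength, and a negative ΔA means it absorbs less (for example, ground-state bleaching).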
View on GitHub →
Let's connect and discuss potential collaborations or opportunities.