LATEST TECH ARTICLES
![[On-Device AI Chatbot] Part 10: The Future of On-Device AI and TecAce's Roadmap (Conclusion)](https://static.wixstatic.com/media/2ea07e_d1771a9889764093a8c855756693ba51~mv2.png/v1/fill/w_300,h_169,fp_0.50_0.50,q_95,enc_avif,quality_auto/2ea07e_d1771a9889764093a8c855756693ba51~mv2.webp)
[On-Device AI Chatbot] Part 10: The Future of On-Device AI and TecAce's Roadmap (Conclusion)
The Future of On-Device AI and TecAce's Roadmap. Throughout the preceding nine parts of this series, we have chronicled the entire journey of developing an on-device chatbot—a solution to cloud cost and data security concerns. We covered everything from selecting a Small Language Model (SLM) and applying quantization to integrating offline STT/TTS, building local RAG, rigorously validating quality with AI SuperVision, and overcoming hardware performance constraints. In this grand finale, Part 10…
3 days ago
![[On-Device AI Chatbot] Part 9: Challenging Performance Limits: Heat, Battery, and Response Speed](https://static.wixstatic.com/media/2ea07e_826bc45db874477090ea018335b34059~mv2.png/v1/fill/w_300,h_169,fp_0.50_0.50,q_95,enc_avif,quality_auto/2ea07e_826bc45db874477090ea018335b34059~mv2.webp)
[On-Device AI Chatbot] Part 9: Challenging Performance Limits: Heat, Battery, and Response Speed
Challenging Performance Limits: Heat, Battery, and Response Speed. In Part 8, we shared how we caught hallucinations and improved response quality using AI SuperVision. Making the model smarter and more accurate is a huge milestone, but running it in a real-world smartphone environment (like the Galaxy S25 FE) forces us to confront hard physical walls: thermal management, battery consumption, and latency limits. Unlike the limitless resources of cloud data centers, a mobile…
6 days ago
![[On-Device AI Chatbot] Part 8: Catching Hallucinations: Analyzing SuperVision Test Results](https://static.wixstatic.com/media/2ea07e_69fba1e933354148a97a50bbfb2f2dcb~mv2.png/v1/fill/w_300,h_169,fp_0.50_0.50,q_95,enc_avif,quality_auto/2ea07e_69fba1e933354148a97a50bbfb2f2dcb~mv2.webp)
[On-Device AI Chatbot] Part 8: Catching Hallucinations: Analyzing SuperVision Test Results
Catching Hallucinations: Analyzing SuperVision Test Results. In Part 7, we built an automated testing pipeline that bridged our on-device chatbot app running on a smartphone with the AI SuperVision server on a PC, enabling an end-to-end flow from prompt injection and answer extraction to automated grading. We finally had an environment capable of running dozens of test cases automatically. So, what kind of report card did our on-device SLM (Gemma-2B based) receive from these…
Feb 24
![[On-Device AI Chatbot] Part 7: Building SuperVision: An Automated Chatbot Testing Pipeline](https://static.wixstatic.com/media/2ea07e_22b8a8781b1743cb8aaa018b782ab4da~mv2.png/v1/fill/w_300,h_169,fp_0.50_0.50,q_95,enc_avif,quality_auto/2ea07e_22b8a8781b1743cb8aaa018b782ab4da~mv2.webp)
[On-Device AI Chatbot] Part 7: Building SuperVision: An Automated Chatbot Testing Pipeline
Building SuperVision: An Automated Chatbot Testing Pipeline. In Part 6, we explained why we introduced Testworks' AI SuperVision tool to objectively evaluate the chronic hallucination issues inherent in generative AI. To actually apply the tool to our project, however, we had to overcome a significant technical barrier: our LLM chatbot operates completely offline, on-device (inside a smartphone), whereas the AI SuperVision system evaluating it exists in a PC…
Feb 23
![[On-Device AI Chatbot] Part 6: How to Verify AI Quality? (Introduction to SuperVision)](https://static.wixstatic.com/media/2ea07e_38184c3eec5940288ae0fcc2e73f6e2d~mv2.png/v1/fill/w_300,h_169,fp_0.50_0.50,q_95,enc_avif,quality_auto/2ea07e_38184c3eec5940288ae0fcc2e73f6e2d~mv2.webp)
[On-Device AI Chatbot] Part 6: How to Verify AI Quality? (Introduction to SuperVision)
How to Verify AI Quality? An Introduction to SuperVision. In Part 5, we explored how to inject our company's proprietary knowledge into the on-device chatbot using local RAG (Retrieval-Augmented Generation) and multi-context switching. However, equipping the chatbot with knowledge does not immediately solve every problem: "How can we be sure that this chatbot isn't fabricating answers and is truthfully speaking only about what is in the provided documents?" In Part 6…
Feb 21
![[On-Device AI Chatbot] Part 4: The Ears and Mouth of a Chatbot: On-Device STT/TTS Integration](https://static.wixstatic.com/media/2ea07e_f9b2f825229d4e4b8e86be78ac4fd73b~mv2.png/v1/fill/w_300,h_169,fp_0.50_0.50,q_95,enc_avif,quality_auto/2ea07e_f9b2f825229d4e4b8e86be78ac4fd73b~mv2.webp)
[On-Device AI Chatbot] Part 4: The Ears and Mouth of a Chatbot: On-Device STT/TTS Integration
The Ears and Mouth of a Chatbot: On-Device STT/TTS Integration. In Part 3, we walked through compressing a massive language model to fit the constrained resources of a smartphone and boosting inference speed with the mobile NPU. Now that a fast, smart "brain" is embedded in the device, it is time to give our chatbot the "ears and mouth" it needs to interact naturally with users. In a mobile environment, typing out long texts…
Feb 20
![[On-Device AI Chatbot] Part 5: A Chatbot That Understands Context: Implementing Local RAG and Multi-Context Switching](https://static.wixstatic.com/media/2ea07e_42172a5ac3454535a81160a2408d0b5b~mv2.png/v1/fill/w_300,h_169,fp_0.50_0.50,q_95,enc_avif,quality_auto/2ea07e_42172a5ac3454535a81160a2408d0b5b~mv2.webp)
[On-Device AI Chatbot] Part 5: A Chatbot That Understands Context: Implementing Local RAG and Multi-Context Switching
A Chatbot That Understands Context: Implementing Local RAG and Multi-Context Switching. In Part 4, we gave our chatbot "eyes, ears, and a mouth" by integrating on-device STT and TTS. However, no matter how well a chatbot listens and speaks, it is only half as useful as a business assistant if it doesn't know your specific domain knowledge—internal company regulations or product manuals, for example. Because Small Language Models (SLMs) are compact, they do not perform as well…
Feb 19
![[On-Device AI Chatbot] Part 3: Core Technologies of Mobile AI: Quantization and NPU Optimization](https://static.wixstatic.com/media/2ea07e_08ed983f9efb45fe9129e06967a91163~mv2.png/v1/fill/w_300,h_169,fp_0.50_0.50,q_95,enc_avif,quality_auto/2ea07e_08ed983f9efb45fe9129e06967a91163~mv2.webp)
[On-Device AI Chatbot] Part 3: Core Technologies of Mobile AI: Quantization and NPU Optimization
Core Technologies of Mobile AI: Quantization and NPU Optimization. In Part 2, we explained why we selected Gemma-2B as the Small Language Model (SLM) for our project and shared our experience benchmarking CPU and GPU performance in a constrained smartphone environment. Those initial tests revealed significant challenges: noticeable latency and out-of-memory errors. To run LLMs in real time on a mobile device held in the palm of your hand—not in a data center…
Feb 18
![[On-Device AI Chatbot] Part 2: Giant Language Models in the Palm of Your Hand: Mobile SLM Selection Strategy](https://static.wixstatic.com/media/2ea07e_7ef19534e8cc4690850ed424d904dee6~mv2.png/v1/fill/w_300,h_169,fp_0.50_0.50,q_95,enc_avif,quality_auto/2ea07e_7ef19534e8cc4690850ed424d904dee6~mv2.webp)
[On-Device AI Chatbot] Part 2: Giant Language Models in the Palm of Your Hand: Mobile SLM Selection Strategy
Giant Language Models in the Palm of Your Hand: A Mobile SLM Selection Strategy. In Part 1, we explored how "On-Device AI" is becoming an essential paradigm for solving cloud cost and data security issues. But how can we fit massive Large Language Models (LLMs) with tens or hundreds of billions of parameters—models that typically run on GPU racks in data centers—into a small smartphone? The answer lies in Small Language Models (SLMs). In Part 2, we will compare the most notable…
Feb 17
![[On-Device AI Chatbot] Part 1: Why "On-Device AI" Now? (Overview)](https://static.wixstatic.com/media/2ea07e_fe141ac84a2c46b8b5daf9987efc1ea7~mv2.png/v1/fill/w_300,h_169,fp_0.50_0.50,q_95,enc_avif,quality_auto/2ea07e_fe141ac84a2c46b8b5daf9987efc1ea7~mv2.webp)
[On-Device AI Chatbot] Part 1: Why "On-Device AI" Now? (Overview)
Why "On-Device AI" Now? Over the past few years, generative AI, led by services like ChatGPT, has revolutionized our daily lives and workflows. Behind these powerful AI services, however, lies a common limitation: cloud dependency. The standard architecture—where user queries are sent to cloud servers and results computed in massive data centers are sent back—inevitably introduces risks such as data privacy breaches, network latency, and exorbitant server maintenance costs…
Feb 16