The future is shaped by intelligent systems, and I’m passionate about building AI solutions that create real impact.

Currently working on projects like Zeni and YogAI, focused on ultra-low-latency voice AI, robot control, and real-time learning.

Winner of 2 hackathons and Amazon ML School ’25 alumnus, dedicated to advancing AI and machine learning.

You can find my projects and updates here, or get in touch via the Contact section.

For more, check out my Resume, Projects, and Skills.

Recent projects

  • Zeni — RAG-Powered AI Web Search & Voice Assistant Robot.
  • YogAI — Posture correction system using computer vision and ML.
  • DeepTrack Attendance System — Real-time video attendance using facial recognition.

Portfolio

Education

Graphic Era Hill University (Ongoing)

Bachelor of Technology
Specialization: CSE (Hons.) with Machine Learning and Artificial Intelligence
GPA: 8.65
Aug 2023 – Jul 2027

Location: Haldwani, Uttarakhand

The Masters School

Completed Intermediate with 83.40% and Matriculation with 82.17%.
July 2021 – July 2023

Location: Bhimtal, Uttarakhand

Certificates & Courses

Achievements & Awards

AWS JAM 2026 - 2nd Place

Secured 2nd place among 400+ participants in a competitive cloud challenge organized in association with Amazon Web Services (AWS).

Location: Graphic Era Hill University

24-Hour Hackathon Winner

Created FaultExpert, an AI/ML fault diagnosis model that detects production line defects, at a Graphic Era Hill University hackathon.

Location: Haldwani, Uttarakhand

Amazon ML School ’25 Alumnus

Completed Amazon Machine Learning School 2025, gaining advanced hands-on AI and ML skills.

Location: Virtual Program (Amazon India)

Experience

AI/ML Mentor

WeCode  |  Aug 2025 – Present

Mentoring students in AI/ML at my college. Conducting sessions on machine learning and AI basics, and breaking complex concepts down into simple explanations.

Artificial Intelligence (AI) and Machine Learning

Freelance AI Developer

Personal Client Project  |  Aug 2025 – Present

Currently building a comprehensive fitness app that integrates AI-powered Yoga posture detection, personalized gym routines, and nutritional guidance — all in one platform. The app uses computer vision, voice AI, and real-time feedback to deliver a tailored wellness experience.

Technologies: Computer Vision, LLMs, Python, TensorFlow

Location: Remote / Freelance

Projects

Zeni: Real-Time Bilingual Voice AI Robot

Zeni is a production-grade, ultra-low-latency voice AI assistant deployed as a physical AI receptionist at Graphic Era Hill University. Built by a team of CSE students, it reaches roughly 500 ms first-word latency and supports full-duplex conversation in both Hindi and English.

Technical Highlights:

  • Speculative ASR & RAG: Pre-warms the LLM with high-confidence partial transcripts and uses ChromaDB to fetch college FAQ data.
  • Streaming Engine: Uses Groq (Llama 3.3 70B) for ~200 ms first-token latency, paired with zero-buffer Google Cloud TTS.
  • Multimodal Functionality: Autonomously controls its physical robot body while analyzing camera feeds in parallel with vision models.
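To make the speculative pre-warming idea concrete, here is a minimal sketch of how high-confidence partial transcripts could trigger retrieval before the speaker finishes. It is not Zeni's actual code: the `PartialTranscript` type, the `SpeculativePrefetcher` class, the 0.8 threshold, the placeholder FAQ entries, and the keyword lookup (standing in for a ChromaDB query) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PartialTranscript:
    """Hypothetical ASR event: text so far, a confidence score, and a final flag."""
    text: str
    confidence: float
    is_final: bool = False


# Placeholder FAQ data; a real deployment would query a vector store instead.
FAQ = {
    "library": "placeholder answer about the library",
    "admission": "placeholder answer about admissions",
}


def retrieve(query, faq):
    """Toy keyword retrieval standing in for a vector-store lookup."""
    words = set(query.lower().split())
    return [answer for key, answer in faq.items() if key in words]


class SpeculativePrefetcher:
    """Runs retrieval early on confident partials and caches the result."""

    def __init__(self, faq, threshold=0.8):
        self.faq = faq
        self.threshold = threshold  # assumed cut-off for "high confidence"
        self.cache = {}
        self.retrieval_calls = 0

    def _lookup(self, text):
        # Only hit the retriever when this exact text has not been seen yet.
        if text not in self.cache:
            self.retrieval_calls += 1
            self.cache[text] = retrieve(text, self.faq)
        return self.cache[text]

    def on_transcript(self, transcript):
        """Return retrieved context for a final transcript, otherwise None."""
        if transcript.is_final:
            return self._lookup(transcript.text)
        if transcript.confidence >= self.threshold:
            self._lookup(transcript.text)  # speculative: warm the cache
        return None
```

In this sketch, if the final transcript matches a partial that was already confident enough, the context comes from the cache and no second retrieval is needed. That is where the latency saving would come from.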

Technologies: Llama 3.3 70B, FastAPI, Google Cloud ASR/TTS, ChromaDB, WebSockets, Android/Java

Impact: Deployed live on campus as a robot receptionist. View on GitHub
“The Zeni speaks so realistically and so fast, it feels like you’re talking to a real human.”
—Student User, GEHU Bhimtal
“This project has real product potential — it’s awesome and ready to be launched.”
—Head of the Department of Computer Science, GEHU

Previous Projects

DeepTrack: AI Face Recognition Attendance System

An AI-powered bilingual attendance system that uses face recognition on CCTV or webcam feeds to automate attendance in real time.

Built with YOLOv8, DeepFace, TensorFlow, and OpenCV, this project captures and identifies faces from live feeds or videos, applies Region of Interest (ROI) filtering, and logs attendance directly to a CSV file, with an emphasis on speed and accuracy.

Technologies: YOLOv8, DeepFace, OpenCV, NumPy, Pandas
GitHub: DeepTrack Repo
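The ROI filtering and CSV logging steps in DeepTrack can be sketched roughly as follows. This is an illustrative outline rather than the repository's code: the function names, the box and ROI format `(x1, y1, x2, y2)`, the centre-point test, and the once-per-person rule are assumptions, and the face detection and recognition stages (YOLOv8, DeepFace) are represented only by already-labelled detections.

```python
import csv
import io
from datetime import datetime


def inside_roi(box, roi):
    """True when the centre of a face box (x1, y1, x2, y2) lies inside the ROI."""
    cx = (box[0] + box[2]) / 2
    cy = (box[1] + box[3]) / 2
    rx1, ry1, rx2, ry2 = roi
    return rx1 <= cx <= rx2 and ry1 <= cy <= ry2


def log_attendance(detections, roi, writer, seen, now):
    """Write one CSV row per newly seen person inside the ROI.

    detections: iterable of (name, box) pairs from an upstream recognizer.
    seen: set of names already logged, so each person is recorded once.
    Returns the number of rows written for this frame.
    """
    written = 0
    for name, box in detections:
        if name in seen or not inside_roi(box, roi):
            continue
        seen.add(name)
        writer.writerow([name, now.isoformat(timespec="seconds")])
        written += 1
    return written


if __name__ == "__main__":
    buffer = io.StringIO()  # stands in for an attendance CSV file
    writer = csv.writer(buffer)
    writer.writerow(["name", "timestamp"])
    seen = set()
    roi = (100, 100, 500, 400)  # hypothetical doorway region in pixels
    frame = [
        ("student_a", (120, 150, 220, 250)),  # centre falls inside the ROI
        ("student_b", (600, 50, 700, 150)),   # centre falls outside the ROI
    ]
    log_attendance(frame, roi, writer, seen, datetime(2025, 1, 1, 9, 0, 0))
    print(buffer.getvalue())
```

Testing the box centre rather than requiring the whole box to sit inside the region is one simple design choice. A stricter overlap rule would fit the same structure.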

Bio

Shankar Singh is an AI researcher, developer, and engineer with deep interests in large language models (LLMs), computer vision, and real-world applications of artificial intelligence. His work bridges the gap between cutting-edge AI and its practical deployment through mobile and web applications.

He is the creator of several notable open-source projects, including DeepTrack, a real-time facial recognition attendance system using YOLOv8 and DeepFace, and AlphaMind, an offline bilingual campus assistant built using LLaMA 3.2 and a RAG pipeline. His projects span voice interfaces, offline AI systems, and multimodal applications combining speech, vision, and LLMs.

Shankar is an alumnus of the prestigious Amazon ML Summer School 2025, and has won multiple hackathons, including the Neural Nexus AI/ML International Hackathon and the 24-Hour TechFest Hackathon at Graphic Era Hill University.

His recent freelance work involves building an integrated fitness and wellness platform combining gym routines, AI-driven yoga posture correction, and personalized nutrition tracking.

For more about his work, visit his GitHub, check out his LinkedIn, or explore his open-source DeepTrack project.

My Work

My work revolves around developing AI systems that operate in real-world environments — from educational institutions to physical spaces like gyms or classrooms. I specialize in building practical, offline-first applications that integrate local LLMs, computer vision, voice interaction, and retrieval-augmented generation (RAG).

The common theme across my projects is solving real, on-ground problems using modern AI. Whether it's AlphaMind — an offline bilingual campus assistant, DeepTrack — a facial recognition attendance system, or YogAI — a real-time yoga posture corrector, each system aims to deliver fast, accurate, and accessible experiences with minimal hardware requirements.

I build with a strong focus on privacy (offline/local models), inclusivity (Hindi-English bilingualism), and edge-efficiency (optimized pipelines with low latency). My stack includes tools like FastAPI, LLaMA via Ollama, YOLOv8, DeepFace, FAISS, ChromaDB, AWS (EC2, S3), and Docker.

Over the last few years, I’ve worked across diverse AI domains — from face recognition and posture correction to building multilingual AI assistants and personalized recommendation engines. Many of my projects originate from everyday problems I encounter as a student or developer — whether it's automating classroom attendance, understanding human behavior through Reddit activity, or making offline AI tools accessible in Hindi and English.

While most of my work is open-source and individually developed, I’ve also collaborated in hackathons and team projects, often winning or placing at national-level events. I value hands-on, end-to-end execution — from data pipelines and model training to deployment and optimization. Some systems, like DeepTrack and Zeni, required integrating multiple complex modules, including LLMs, YOLO-based object detection, TensorFlow models, and speech recognition—all working together in a high-performance setting.

I approach AI from a builder's mindset, but I’m also drawn to the ethics, generalization, and long-term impacts of deploying such systems at scale. A recurring question in my mind is: Can we make AI systems that are both useful and contextually aware, without requiring massive cloud infrastructure or exposing private data?

Some of my other public projects include:

  • YogAI: Posture correction for yoga sessions using webcam-based pose classification.
  • RedditUserPersona: Behavioral analysis and persona generation using local LLMs + Reddit scraping.
  • BonelossDetector: A dental AI model for identifying alveolar bone loss in X-rays using YOLOv8.
  • NextBest: Dual-mode recommendation engine (movies + anime) powered by content-based filtering.
  • FaultXpert: Micro-fault detection system for physical systems with real-time alerts.

My work continues to evolve, and I’m currently experimenting with lightweight RAG-based tools, emotion-aware models, and agents that can reason with voice, text, and image inputs — all while remaining fully offline or edge-deployable.

Contact

I'm open to new collaborations, freelance projects, or just tech chats.

Email:
shankarbisht1224@gmail.com

LinkedIn:
linkedin.com/in/shankarsingh077

GitHub:
github.com/shankarsingh077

Let's Talk

Feel free to reach out — whether it's a project, a question, or just a hello.

Featured Projects

DeepTrack — Real-time AI Attendance via Face Recognition

Zeni — Real-Time Bilingual Voice AI Robot

RedditUserPersona — LLM-based Persona Generation from Reddit