🎓 Data Science Capstone Project

Learn Python &
Data Science,
one lesson at a time

DataLingo is an AI-powered learning platform that generates structured, beginner-friendly data science lessons with interactive MCQ quizzes and a built-in tutor chatbot.

50+
Lessons
100+
Practice MCQs
Free
Forever
🔥 7 Day Streak!
https://datalingo-ucsd.vercel.app
Level 1: Fundamentals
Seeing the World Through Data
Computational Tools
Completed
2
Statistical Techniques
In Progress ⭐
🔒
Why Data Science?
Locked
Lesson Progress 3/5 questions
🎯 About the Project

What is DataLingo?

The Problem

Creating quality data science learning content is labor-intensive and hard to scale. Beginners often lack structured, interactive, and affordable resources to get started.

💡
Our Solution

We built a full-stack pipeline that uses LLMs to generate structured lesson and quiz datasets, serves them through a FastAPI backend, and delivers them via a beautiful Next.js frontend with an AI tutor.

🎯
Who It's For

Absolute beginners who want to learn Python and Data Science through bite-sized, gamified lessons. Free, forever.

"Our goal is to reduce the manual effort of content creation while still giving learners a guided experience with quizzes and tutor-style explanations."
Intended Impact

A reproducible AI pipeline for generating lesson content, and a platform that makes that content usable for real learners. DataLingo demonstrates how generative AI can accelerate educational content creation without sacrificing quality.

⚙️ System Architecture

How DataLingo Works

A three-layer system that takes raw data science content and turns it into interactive lessons.

1
Data Collection

Raw data science content is collected and cleaned using Python scripts and Jupyter notebooks.

Python · Notebooks
2
AI Generation

n8n workflows apply LLM-based generation to transform content into structured lessons and MCQ datasets.

n8n · OpenAI · Supabase
3
Backend API

FastAPI loads the lesson JSON and exposes endpoints for lessons, questions, and a context-scoped tutor chat.

FastAPI · Python
4
Frontend App

Next.js renders lesson content, quizzes, and the embedded AI tutor widget for learners.

Next.js · React · Tailwind
✨ Key Features

Everything a beginner needs

DataLingo combines gamification, AI tutoring, and structured curriculum to make learning stick.

📚
Structured Lessons

Bite-sized, beginner-friendly lessons covering Python basics, statistics, and data science fundamentals, organized by topic and level.

🧩
Interactive MCQ Quizzes

Multiple-choice quizzes with difficulty ratings test your understanding after each lesson. Progress is saved automatically.

🤖
AI Tutor Chat

Stuck on a question? Ask the built-in tutor for hints, explanations, or breakdowns of each answer choice, without spoiling the answer.

🏆
Gamification

Streaks, XP points, locked levels, and daily goals keep learners motivated and coming back.

🔄
Reproducible Pipeline

Lesson content is generated via an automated n8n + LLM workflow inspired by academic research papers, making it easy to scale to new topics and difficulty levels.

🆓
Free Forever

DataLingo is completely free to use. No paywalls, no subscriptions, just learning.

🛠️ Tech Stack

Built with modern tools

A full-stack system combining AI workflows, a Python API, and a React frontend.

🔧 Backend
🐍 Python 3.8+ ⚡ FastAPI 🦄 Uvicorn 🤖 OpenAI SDK 🗄️ Supabase 📦 Pydantic
🎨 Frontend
⚛️ Next.js 16 💙 React 19 📘 TypeScript 🎨 Tailwind CSS 📝 react-markdown
🤖 AI Pipeline
🔄 n8n Workflows 🧠 OpenAI GPT 📄 JSON Datasets 🔬 Jupyter
👥 The Team

Meet the builders

DSC 180B Capstone - University of California San Diego

Capstone Mentor: Dr. Benjamin Smarr

Norah Kerendian
Norah Kerendian
AI Engineer
Natasha Lie
Natasha Lie
Frontend Developer
Yiran Zhao
Yiran Zhao
Database Engineer
Chloe Kim
Chloe Kim
Frontend Developer
Sadrac Santacruz
Sadrac Santacruz
AI Engineer
Kevin Wu
Kevin Wu
Database Engineer
Omar Ali
Omar Ali
Recommender System Developer
Eshaan Roy
Eshaan Roy
Recommender System Developer
🚀 Get Started

Ready to start learning?

Jump into DataLingo and start your data science journey today - completely free.

🚀 Launch DataLingo 📂 View on GitHub