Agentic loop that searches Arxiv and the web, deduplicates findings, and writes a structured briefing doc on any topic you give it.
Your portfolio is your proof
You are competing for a role you have not done yet. Without on-the-job FDE reps, a portfolio of things you have actually built and deployed is the evidence that you can do the work — and the clearest way to stand out from a crowd holding the same certificates.
Why a portfolio wins
Three reasons proof-of-work beats credentials for FDE hiring.
A portfolio is the proof of work that stands in for the on-the-job experience you can't have until someone hires you.
A green commit wall proves capability. Hiring panels screen for shipped work, not course-completion badges.
Consistent building is visible, searchable, and differentiating. Every commit is one more data point in your favour.
What an FDE portfolio must prove
Six capabilities, weighted to the technical core. Five are AI-engineering; the sixth is the customer craft that makes you an FDE rather than an AI engineer.
You can build an LLM agent that plans, calls tools, and acts — not just a chatbot.
You can ground a model in a customer's own data, with anti-hallucination guardrails and evals.
You can orchestrate specialised agents for a real workflow and keep them reliable.
You ship with tracing, evaluation, guardrails, PII redaction, and cost control — not demos.
You can deploy into the customer's environment — APIs, MCP servers, auth, RBAC, audit.
You can run discovery → SoW → build → deploy → handover, the way an FDE actually works.
Map your technical skills to projects
One read of where each project demonstrates each capability. The flagship FDE capstone (C7) covers the full set; the rest each prove a focused subset.
These projects are built on a specific skill set. See the full FDE skills roadmap →
Inside IK FDE: the live build slate
You do not start from a blank repo. The program ships you a portfolio: eight guided builds across the AI-engineering spine, then a capstone you own end-to-end — including a full FDE customer engagement.
Guided projects
P1–P8 · live code-along buildsCRM Lead Qualifier Agent
Build your first LLM-powered agent from scratch: function calling as a reasoning engine, tools for domain lookup and lead scoring, and a working think–act–observe loop. It is the 0→1 every later build stands on — the moment a model stops answering and starts acting.
- Function calling
- tool use
- the ReAct loop
- Python
- Git & GitHub
- Docker
- AWS
SupportDesk-RAG
Production RAG for IT support that answers strictly from retrieved tickets — never the model's imagination. You implement chunking and multi-index strategies, anti-hallucination guardrails, and a two-layer evaluation harness that proves every answer is grounded in real source data.
- LangChain
- LlamaIndex
- Chroma
- FAISS
- Python
- Git & GitHub
- Docker
- AWS
Multi-Agent Travel Planner
An Orchestrator → Search → Itinerary Planner → Synthesizer workflow that coordinates specialised agents toward one goal. You wire up routing, parallelization, subagent delegation, and shared state — the exact patterns behind every real multi-agent system you will ship later.
- LangGraph
- LangChain
- Tavily
- SerpAPI
- Python
- Git & GitHub
- Docker
- AWS
AxiomCart · Voice Shopping Assistant
A stateful voice assistant built on a LangGraph StateGraph, with RAG-based product discovery, human-in-the-loop order tracking, and a full Whisper + OpenAI TTS voice pipeline. Speech in, reasoning in the middle, speech out — a complete conversational loop.
- LangGraph
- Whisper
- OpenAI TTS
- RAG
- Python
- Git & GitHub
- Docker
- AWS
Real Estate Negotiation Simulator
Buyer and seller agents that negotiate against each other over typed Pydantic schemas, finite-state-machine terminal states, MCP-grounded tools, and true agent-to-agent transport on Google ADK. The deep end of agent communication protocols, built to spec.
- Google ADK
- A2A
- MCP
- Pydantic
- Python
- Git & GitHub
- Docker
- AWS
Vertical Insights Agent
A domain agent (Finance, Health, or SaaS) that ingests a corpus and ships extractive and abstractive summaries, sentiment scoring, and top-N recommendations through a Streamlit dashboard. Hybrid RAG-over-API retrieval with OAuth'd calls, rate-limit handling, caching, and structured JSON output.
- LangChain
- hybrid RAG + REST
- Streamlit
- Python
- Git & GitHub
- Docker
- AWS
Production-Ready Fintech Support Agent
Take a supervisor + specialist multi-agent system and harden it for production: LangSmith tracing, DeepEval evaluation, Guardrails AI, Presidio PII redaction, and live cost dashboards. This is the gap between a demo and something a regulated enterprise will actually run.
- LangSmith
- DeepEval
- Guardrails AI
- Presidio
- Python
- Git & GitHub
- Docker
- AWS
Domain-Specific Fine-Tuned Agent
A healthcare Q&A agent fine-tuned with 4-bit QLoRA on Qwen2.5-1.5B. You deploy the LoRA adapter to the HF Hub and benchmark it side-by-side against the base model in LangSmith — learning first-hand exactly when fine-tuning beats prompting or RAG, and when it does not.
- HF Transformers
- PEFT
- TRL SFTTrainer
- BitsAndBytes
- Python
- Git & GitHub
- Docker
- AWS
Capstone projects
C1–C7 · you own these end-to-endFinnie · AI Finance Assistant
A six-agent system on LangGraph + RAG that delivers personalised investment guidance, portfolio analysis, goal planning, and tax education through one conversational interface. Six specialists coordinated into a single coherent financial co-pilot.
- LangGraph
- RAG
- six-agent orchestration
- Python
- Git & GitHub
- Docker
- AWS
ContentAlchemy · Content Marketing Assistant
A multi-agent content platform that generates SEO-optimised blogs, LinkedIn posts, visuals, and long-form — with brand-voice and platform-format guardrails baked in so every output is on-brand and channel-ready, not generic AI filler.
- Multi-agent LLM orchestration
- Python
- Git & GitHub
- Docker
- AWS
Call Center Intelligence System
A seven-stage LangGraph pipeline that turns raw call audio into compliance-ready insight: STT + diarization → PII redaction → injection defense → dual-LLM analysis → a weighted QA scorecard → PDF/JSON reports. Audio in, audit-ready report out.
- LangGraph
- faster-whisper
- dual-LLM analysis
- Python
- Git & GitHub
- Docker
- AWS
Multi-Agent Customer Support Assistant
A supervisor coordinates specialist agents over a live relational database — catalog search, invoice lookups, identity verification, and per-customer memory — with strict anti-hallucination grounding so every answer comes from real data, never a guess.
- Multi-agent orchestration
- relational-DB grounding
- Python
- Git & GitHub
- Docker
- AWS
SmartDesk AI · IT & HR Ops Agent
First-line IT and HR support resolved end-to-end: RAG-grounded answers from a curated knowledge base, ticket creation in Jira / Asana / Notion with human-in-the-loop confirmation, and on-demand ticket status — the full self-service desk, automated.
- RAG
- multi-agent
- Jira / Asana / Notion
- Python
- Git & GitHub
- Docker
- AWS
BYOP · Bring Your Own Project
Build a personal or professional project of your choice with mentor guidance — structured feedback plus help on scope, tools, and framework selection. The one slot where the portfolio bends to your own goal instead of a set brief.
- Mentor-guided
- learner-led
- Python
- Git & GitHub
- Docker
- AWS
PriorAuth AI · Hospital Prior-Auth Co-pilot
Own a full FDE engagement at a regional hospital network whose prior-auth team is buckling under a 9-day turnaround and a stalled IT pilot that died in HIPAA review. Across five weeks you run discovery, write the SoW, then build a multi-agent co-pilot that reads chart notes, retrieves payer policy, drafts citation-backed auth requests, and routes ambiguous cases to a human reviewer — with a per-facility FastMCP server, end-user SSO, and full audit logging.
- OpenAI Agents SDK + Azure
- alt: Google ADK / Vertex
- Claude + Bedrock
- Python
- Git & GitHub
- Docker
- AWS
You graduate with this portfolio, mentor-guided — eight guided builds plus an end-to-end FDE capstone, not a blank repo.
Beyond the program — build your own
The program slate proves the floor. What separates you is what you build on your own time. There are two underused sources, and both are open right now.
One of the most underused portfolio opportunities is inside your current company. You don't need a side project to demonstrate real-world AI engineering — your employer is your first client. Find a workflow, process, or pain point on your team and build an internal AI agent or tool that solves it end-to-end. What matters is a full deployment pipeline against a genuine problem, not a toy demo.
It does three things at once: it makes your AI skills visible inside your org — which can accelerate your transition or open internal opportunities — it generates concrete, specific interview talking points, and it counts as legitimate portfolio work. "I deployed this at my company and here's what happened" lands far harder than a generic GitHub project. Treat it like a real engagement.
Personal projects have no gatekeeper. No approval, no budget, no scope review — just you, an idea, and a deploy button. That freedom is the point: you can ship every week, fail cheaply, and stack proof faster than any cohort moves.
Building and shipping something real is the most rewarding way to learn the stack — and the clearest signal that you'll do the same for a customer. Start small, ship often, and let the repo tell the story. The contribution wall above is not luck; it's a habit anyone can start this week.
The AI stack to build with
Pick the stack that matches your goal. Start low-friction to ship fast; graduate to the engineering-heavy stack to prove the production judgment that defines the FDE bar. Both use the same tools as the in-program build slate.
40 ideas to start today
Twenty personal builds and twenty you can ship inside your company. Switch tabs, filter by type, and pick one you can deploy this week.
Embed and index your entire Obsidian vault; query your own notes with citations back to the source file.
Build a custom MCP server exposing read/write tools for your calendar so any LLM client can schedule, reschedule, and summarize events.
Given a channel or playlist, transcribe videos, chunk by topic, and generate a weekly email digest with key timestamps.
Scrape job postings, score fit against your resume with an LLM, draft tailored cover letters, and track applications in a structured store.
Build a reusable framework that runs a prompt suite against multiple models and scores outputs on accuracy, format adherence, and hallucination rate.
Watch a downloads folder, classify files with an LLM, and move them into a labelled directory structure — with a dry-run mode for safety.
Poll subreddits on a schedule, cluster rising threads by topic using embeddings, and push a Slack or email digest.
Pull papers by keyword, chunk and embed abstracts + full text, and let you ask cross-paper questions with source citations.
Pull financials from a public API, pass structured data through a chain of LLM calls, and produce a formatted analyst-style report.
Multi-step agent that searches flights, hotels, and activities, resolves constraints (budget, dates, interests), and outputs a structured day-by-day plan.
Index a personal recipe collection, accept natural-language ingredient constraints, and return ranked suggestions with substitution notes.
Scrape curated sources, run LLM summarization and deduplication, render an HTML email, and schedule delivery — end to end without manual editing.
Given a GitHub URL, clone the repo, walk the file tree, and produce a structured README-style summary of architecture, dependencies, and entry points.
Pull weekly player stats via API, engineer features, and run an LLM chain that drafts natural-language waiver-wire recommendations.
Build a chatbot grounded in a corpus of source material (interviews, writings, transcripts) with retrieval to stay in-character and cite sources.
Connect an LLM to a local or hosted database; accept plain-language questions, generate and validate SQL, execute, and return formatted results.
Identify contacts who missed an event, score re-engagement likelihood from prior behavior, and generate personalized follow-up emails in bulk.
Pull merged PRs and commit messages from GitHub, cluster by theme with embeddings, and draft versioned release notes in your team's style.
Upload any PDF or doc bundle; an agent extracts structure, answers questions, flags inconsistencies, and outputs a structured JSON summary.
Ingest webinar/demo sign-up data and transcripts, run an LLM diagnostic against an ICP rubric, and output a scored JSON payload to CRM.
Index wikis, runbooks, and Confluence docs; serve a chat interface so engineers and ops staff can query institutional knowledge with citations.
Automated harness that runs production prompts against a rubric, scores outputs across dimensions (accuracy, tone, policy), and surfaces regressions.
Process call transcripts through a structured extraction chain to surface objections, competitor mentions, next steps, and miss-selling signals.
Classify incoming tickets by product area, urgency, and sentiment using an LLM; route to the right queue and draft a first-response suggestion.
Scrape and parse public RFPs and competitor filings, extract pricing and feature signals, and generate a structured competitive brief.
Run embedding similarity and structured field matching across AP invoices to flag likely duplicates before payment approval.
Given an incident alert, the agent queries logs and metrics, retrieves relevant runbooks via RAG, and drafts a ranked RCA with suggested mitigations.
Extract structured data (tables, clauses, key fields) from PDFs at scale using multimodal LLMs and load into a data warehouse.
Embed resumes and JDs, run semantic similarity ranking, and surface top-N candidates with LLM-generated fit reasoning per role.
Given a data source and a report template, an agent queries the data, performs analysis, populates the template, and exports a formatted PDF.
Wrap internal APIs (ticketing, HR, analytics) as MCP-compatible tool endpoints so any LLM client in the org can call them safely.
Version-control prompts, run evals on every change, track metric drift across model upgrades, and gate deploys on eval thresholds.
Ingest reviews from multiple sources, extract structured themes and sentiment with an LLM, and push aggregated signals to a BI dashboard.
RAG system over HR and onboarding docs that new hires can query in natural language — with fallback escalation to a human if confidence is low.
Parse vendor and customer contracts, extract key clauses (termination, liability, SLA), flag non-standard terms, and load into a structured store.
Score unconverted leads from event behavior, generate personalized outreach copy with an LLM, and push to email or CRM sequences in bulk.
Collect labeled examples from internal data, fine-tune a base model on a classification task (ticket type, lead segment, doc category), and serve via API.
Orchestrate a planner + researcher + writer agent loop that synthesizes internal and external sources into a structured deliverable on demand.
Build a middleware layer that runs every LLM response through policy, hallucination, and PII checks before it reaches the end user.
Your portfolio blueprint
You do not need fifteen projects. Four, chosen to cover the bar, beat a pile of demos.
Proves you can ground a model in customer data with evals.
Proves orchestration and reliability under coordination.
Proves you ship with tracing, guardrails, and cost control.
Proves the customer craft — discovery to handover — that defines the role.
Built your portfolio? The next test is talking about it. See the FDE interview loop →
An Interview Kickstart advisor walks you through where you stand today, the exact gap to close, and the fastest route to a Forward Deployed Engineer offer — built around your background.
Book a call with an advisor →