ML Research | Consti Ertel

Machine Learning Research

RAID: Retrieval-Augmented World Models for Robotics

A robotics research project exploring retrieval-augmented inverse dynamics and world-model-style representations for manipulation. Built with PyTorch on LIBERO-style simulation workflows, using frozen visual representations, next-state prediction, and a demonstration memory bank to test whether memory can improve action inference when direct environment interaction is expensive.

Read paper | Watch demo | GitHub
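
As a rough illustration of how a demonstration memory bank could inform action inference, the sketch below retrieves the most similar stored transitions by cosine similarity and blends their actions. It is not the project's code: the `(key, action)` memory layout, the function names, and the similarity-weighted average are all assumptions made for this example, and the keys stand in for embeddings produced by a frozen visual encoder.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical memory entry: (transition embedding, action taken in the demo).
MemoryEntry = Tuple[Sequence[float], Sequence[float]]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_action(query: Sequence[float],
                    memory: Sequence[MemoryEntry],
                    k: int = 2) -> List[float]:
    """Blend the actions of the k most similar memory entries.

    Weights are the (non-negative) cosine similarities; if every
    retrieved similarity is <= 0 the neighbours are averaged uniformly.
    """
    if not memory:
        raise ValueError("memory bank is empty")
    scored = sorted(
        ((cosine(query, key), action) for key, action in memory),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    weights = [max(score, 0.0) for score, _ in scored]
    if sum(weights) == 0.0:
        weights = [1.0] * len(scored)
    total = sum(weights)
    action_dim = len(scored[0][1])
    return [
        sum(w * action[d] for w, (_, action) in zip(weights, scored)) / total
        for d in range(action_dim)
    ]


# Toy memory bank with 2-D keys and 2-D actions (illustrative values only).
memory = [
    ([1.0, 0.0], [1.0, 0.0]),
    ([0.0, 1.0], [0.0, 1.0]),
    ([-1.0, 0.0], [-1.0, 0.0]),
]
prior_action = retrieve_action([1.0, 0.0], memory, k=1)
```

In a retrieval-augmented setup, a retrieved action like `prior_action` would typically be passed to the inverse dynamics model as extra context and not used directly as the output.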

Tiny GPT

A complete GPT-style language model implemented from the ground up in PyTorch, trained on 104M tokens drawn from a large corpus of short narratives. Three model configurations were trained and evaluated: small (1.05M parameters), medium (3.40M), and large (7.07M). Every component of the stack was written from scratch: a 3,000-token BPE vocabulary, sinusoidal positional encodings, pre-norm transformer blocks with 8-head causal self-attention, GELU feed-forward layers, and a weight-tied output projection. The large configuration converges to a validation perplexity of 8.08, more than halving the perplexity of the small model (17.75) and producing coherent multi-sentence narrative output.

Read case study | GitHub
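
For readers unfamiliar with sinusoidal positional encodings, here is a minimal pure-Python sketch of the standard fixed formulation, in which each pair of dimensions holds the sine and cosine of the position at one frequency. The function name and the base of 10000 are assumptions taken from the usual transformer recipe, not from the project's PyTorch code.

```python
import math
from typing import List


def sinusoidal_positions(seq_len: int, d_model: int) -> List[List[float]]:
    """Return a seq_len x d_model table of fixed sinusoidal encodings."""
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    table = []
    for pos in range(seq_len):
        row = [0.0] * d_model
        for i in range(0, d_model, 2):
            # Dimensions i and i + 1 share one frequency: 1 / 10000^(i / d_model).
            angle = pos / (10000.0 ** (i / d_model))
            row[i] = math.sin(angle)
            row[i + 1] = math.cos(angle)
        table.append(row)
    return table


# Position 0 encodes as alternating 0 (sin) and 1 (cos) in every pair.
table = sinusoidal_positions(seq_len=4, d_model=8)
```

Because the table depends only on position and width, it can be computed once and added to the token embeddings without introducing any trainable parameters.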

Deep RL Agents: DQN and PPO for Strategic Play

A two-part reinforcement learning project: a rigorous treatment of value functions, Bellman equations, policy gradients, and advantage estimation, followed by full implementations of DQN (experience replay, target network) and PPO (clipped surrogate objective, generalized advantage estimation) trained on a Connect Four environment. Both agents learn from self-play, and their training dynamics are measured through win-rate curves and loss trajectories across thousands of episodes.

Read case study | GitHub
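
The two PPO ingredients named above can be sketched in a few lines of plain Python. This is an illustrative version under simplifying assumptions (a single trajectory with no mid-sequence episode boundaries, per-sample scalar inputs, and commonly used default hyperparameters); the function names are hypothetical and the project's own implementation may differ.

```python
from typing import List, Sequence


def gae_advantages(rewards: Sequence[float],
                   values: Sequence[float],
                   gamma: float = 0.99,
                   lam: float = 0.95) -> List[float]:
    """Generalized advantage estimation for one trajectory.

    `values` must hold len(rewards) + 1 entries; the last one is the
    bootstrap value of the state after the final step (0.0 if terminal).
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have one more entry than rewards")
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error, then an exponentially weighted backward sum.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def clipped_surrogate(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """Per-sample PPO objective: min of the unclipped and clipped terms."""
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


# A win at the final step propagates back through earlier moves.
advantages = gae_advantages([0.0, 0.0, 1.0], [0.0, 0.0, 0.0, 0.0],
                            gamma=1.0, lam=1.0)
```

The clipping removes the incentive to push the probability ratio beyond `1 ± eps` in the direction the advantage favors, which is what keeps each policy update close to the policy that collected the data.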

JEPA, Rebuilt from the Paper

A research report and experiment comparing two self-supervised vision objectives: JEPA-style latent prediction and MAE-style masked-patch reconstruction. Both approaches are implemented with matched Vision Transformer backbones on CIFAR-10, and the learned representations are evaluated through linear probing, retrieval, anomaly detection, and embedding visualization. The report argues that the choice of objective shapes not just accuracy but also the structure of the learned embedding space and its invariance to perturbations.

Read paper | GitHub

Fincast

A financial valuation platform built with Next.js and Claude AI. Enter any stock ticker, choose DCF or exit-multiple analysis (P/E, EV/EBITDA, EV/FCF, EV/Sales), and get AI-generated 5-year projections, fair value estimates, and upside-to-current-price calculations, all powered by real-time market data and the Anthropic API.

Open live | GitHub
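
The arithmetic behind a DCF fair value and an upside figure can be sketched as follows. This is a textbook simplification, not Fincast's implementation: it assumes a Gordon-growth terminal value, a single constant discount rate, and hypothetical function and parameter names, and the inputs shown are made-up numbers.

```python
from typing import Sequence


def dcf_fair_value_per_share(free_cash_flows: Sequence[float],
                             discount_rate: float,
                             terminal_growth: float,
                             net_debt: float,
                             shares_outstanding: float) -> float:
    """Discount projected cash flows plus a terminal value to a per-share price."""
    if not free_cash_flows:
        raise ValueError("need at least one projected cash flow")
    if discount_rate <= terminal_growth:
        raise ValueError("discount_rate must exceed terminal_growth")
    # Present value of each explicitly projected year (year 1, 2, ...).
    pv_explicit = sum(
        cf / (1.0 + discount_rate) ** year
        for year, cf in enumerate(free_cash_flows, start=1)
    )
    # Gordon-growth terminal value at the end of the projection window.
    final_year = len(free_cash_flows)
    terminal_value = (free_cash_flows[-1] * (1.0 + terminal_growth)
                      / (discount_rate - terminal_growth))
    pv_terminal = terminal_value / (1.0 + discount_rate) ** final_year
    equity_value = pv_explicit + pv_terminal - net_debt
    return equity_value / shares_outstanding


def upside(fair_value: float, current_price: float) -> float:
    """Fractional upside of fair value over the current price."""
    return fair_value / current_price - 1.0


# Illustrative 5-year projection (arbitrary units, not real market data).
projection = [100.0, 110.0, 121.0, 133.0, 146.0]
fair = dcf_fair_value_per_share(projection, discount_rate=0.10,
                                terminal_growth=0.02, net_debt=200.0,
                                shares_outstanding=50.0)
gain = upside(fair, current_price=20.0)
```

An exit-multiple variant would replace the Gordon-growth terminal value with a final-year metric (for example EBITDA) multiplied by the chosen multiple, leaving the discounting steps unchanged.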

Personalized Listing Photo Editing

A knowledge distillation pipeline for personalizing real-estate photo edits. FLUX.1 Kontext, a flow-matching diffusion transformer, acts as the teacher, generating content-preserving, instruction-following edits that capture a photographer's editing style. InstructPix2Pix with LoRA distills that behavior into compact, photographer-specific adapters (~3 MB each), collapsing a multi-billion-parameter diffusion stack into a lightweight module that can be deployed without the teacher. BLIP-2 bridges the gap between visual style signals and the language-conditioned edit instructions both models require.

Open live | GitHub

Tesla Supply Chain Consulting Project

Tesla brought this problem to our team because the standard spreadsheet approach to factory location was not good enough for a decision of this magnitude. We built a Bayesian simulation comparing manufacturing costs across three countries, incorporating raw materials, labor, logistics, foreign exchange, tariffs, and discrete risk events. The model replaces hand-crafted assumptions with data-driven posteriors fit to FRED economic series, producing uncertainty intervals that reflect how these variables actually move together.

View one-pager | Open live
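
To show what it means for cost drivers to move together in a simulation, the sketch below draws correlated foreign-exchange and labor shocks plus a discrete risk event, then reads off an uncertainty interval. It is a plain Monte Carlo toy, not the project's Bayesian model: there is no posterior fitting to FRED data here, and every name, parameter, and number is a hypothetical placeholder.

```python
import math
import random
from typing import List, Tuple


def simulate_unit_cost(n_draws: int, base_cost: float, fx_vol: float,
                       labor_vol: float, corr: float, risk_prob: float,
                       risk_cost: float, seed: int = 0) -> List[float]:
    """Draw unit costs with correlated FX/labor shocks and a rare risk event."""
    if not -1.0 <= corr <= 1.0:
        raise ValueError("corr must lie in [-1, 1]")
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        fx_shock = fx_vol * z1
        # Mixing z1 into the labor shock induces correlation `corr` with FX.
        labor_shock = labor_vol * (corr * z1 + math.sqrt(1.0 - corr ** 2) * z2)
        # Discrete risk event (e.g. a disruption) adds a fixed cost when it fires.
        event_cost = risk_cost if rng.random() < risk_prob else 0.0
        draws.append(base_cost * (1.0 + fx_shock + labor_shock) + event_cost)
    return draws


def interval(draws: List[float], lo: float = 0.05,
             hi: float = 0.95) -> Tuple[float, float]:
    """Empirical quantile interval from sorted simulation draws."""
    ordered = sorted(draws)
    last = len(ordered) - 1
    return ordered[int(lo * last)], ordered[int(hi * last)]


# Hypothetical inputs for one candidate country.
draws = simulate_unit_cost(n_draws=1000, base_cost=100.0, fx_vol=0.05,
                           labor_vol=0.03, corr=0.6, risk_prob=0.1,
                           risk_cost=20.0, seed=1)
low, high = interval(draws)
```

Running the same function once per country with that country's parameters yields intervals that can be compared directly, which a single-point spreadsheet estimate cannot provide.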