Consulting

AI Infrastructure

You don't need an ML team to ship AI features. You need infrastructure that lets your engineers integrate LLMs without becoming AI experts.

I help startups deploy AI applications reliably, whether that means integrating OpenAI APIs, running local models, or deploying to air-gapped environments where cloud APIs aren't an option.

What I Do

LLM Integration Architecture

Design the infrastructure for AI-powered features. API gateway patterns, prompt management, response caching, fallback strategies. Production-ready, not prototype-quality.
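To make two of these patterns concrete, here is a minimal Python sketch of response caching combined with a provider fallback chain. It is an illustration under stated assumptions, not a production gateway: the `CachedFallbackClient` class and the `Provider` callable type are hypothetical names, the cache is a plain in-memory dict with no expiry, and each provider is assumed to be a function that takes a prompt and returns text.

```python
import hashlib
from typing import Callable, Dict, List, Sequence

# Hypothetical provider shape: any callable that maps a prompt to a completion.
Provider = Callable[[str], str]


class CachedFallbackClient:
    """Try providers in order and cache the first successful response."""

    def __init__(self, providers: Sequence[Provider]) -> None:
        if not providers:
            raise ValueError("at least one provider is required")
        self.providers: List[Provider] = list(providers)
        # In-memory cache keyed by a hash of the prompt (assumption: no TTL).
        self.cache: Dict[str, str] = {}

    def complete(self, prompt: str) -> str:
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key in self.cache:
            return self.cache[key]

        errors: List[Exception] = []
        for provider in self.providers:
            try:
                result = provider(prompt)
            except Exception as exc:  # fall through to the next provider
                errors.append(exc)
                continue
            self.cache[key] = result
            return result

        raise RuntimeError(f"all {len(self.providers)} providers failed: {errors}")
```

In a real deployment the providers would wrap actual API clients or a self-hosted model server, and the cache would more likely live in a shared store with an expiry policy.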

Self-Hosted AI Deployment

Deploy open-source models (Llama, Mistral, etc.) on your own infrastructure. GPU provisioning, model serving, scaling strategies. For when you can't or won't use cloud APIs.

Air-Gapped AI Infrastructure

Deploy AI capabilities in environments with no internet access. On-premises, government, healthcare, financial services. This is what I do at Tensor9.

Cost Optimization

OpenAI bills adding up? Audit token usage, implement caching, evaluate when to switch to smaller models or self-hosted alternatives. Right-size your AI spend.
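A token audit usually starts with something like the following sketch, which rolls up spend per product feature from usage records. Everything here is an assumption made for illustration: the `Usage` record shape, the `cost_by_feature` helper, the model names, and especially the prices, which are placeholders and not any vendor's actual rates.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Usage:
    """One logged LLM call (hypothetical record shape)."""
    feature: str
    model: str
    input_tokens: int
    output_tokens: int


# Placeholder prices in USD per 1M tokens as (input, output). Not real rates.
PRICES: Dict[str, Tuple[float, float]] = {
    "large-model": (10.0, 30.0),
    "small-model": (0.5, 1.5),
}


def cost_by_feature(
    records: Iterable[Usage],
    prices: Dict[str, Tuple[float, float]],
) -> Dict[str, float]:
    """Sum estimated spend per feature; raises KeyError for an unpriced model."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        input_price, output_price = prices[record.model]
        cost = (
            record.input_tokens * input_price
            + record.output_tokens * output_price
        ) / 1_000_000
        totals[record.feature] += cost
    return dict(totals)
```

Grouping by feature rather than by model makes it easier to see which parts of the product are candidates for a smaller model or a cache.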

RAG & Vector Search

Retrieval-augmented generation for your data. Vector database selection (Pinecone, Weaviate, pgvector), embedding strategies, chunk sizing. Make AI work with your content.
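Chunk sizing is easiest to reason about with a small example. The sketch below splits text into overlapping word-based chunks before embedding. The `chunk_words` function and its default sizes are hypothetical; real pipelines often count tokens instead of words and tune both values against retrieval quality.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into chunks of up to chunk_size words, sharing overlap words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the window already reached the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a boundary retrievable from either neighboring chunk, at the cost of storing and embedding some words twice.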

Background

I'm currently a founding engineer at Tensor9, building an Any-Prem platform for deploying SaaS and AI applications anywhere: AWS, Azure, GCP, on-premises, or air-gapped environments.

20+ years building infrastructure at Amazon, AWS, Uber, Twitter, and Meta. I bring hyperscale operational experience to AI deployment challenges.

Let's Talk

Book a 30-minute intro call to discuss your AI infrastructure needs.

AI Platforms

  • OpenAI / Azure OpenAI
  • Anthropic Claude
  • AWS Bedrock
  • Self-hosted (Llama, Mistral)

Infrastructure

  • GPU provisioning (AWS, GCP)
  • Model serving (vLLM, TGI)
  • Vector databases
  • Air-gapped deployment

Use Cases

  • Product AI features
  • Internal tooling
  • Document processing
  • Customer support automation

Location

Seattle, WA

Remote-friendly