Chat API

The chat API (/api/chat) handles design generation requests with support for reference images, garment mockups, and multiple AI models.

Endpoint

POST /api/chat

Request Body

interface ChatRequest {
  prompt: string;           // Design description (min 8 chars)
  images?: string[];        // Legacy: reference image URLs
  garmentUrls?: string[];   // Garment blank URLs for mockup
  designUrls?: string[];    // Existing design URLs for reference
  model?: "nano-banana" | "nano-banana-pro"; // AI model
  threadId?: string | null; // Continue existing thread
}

Response

interface ChatResponse {
  threadId: string;
  messageId: string;
  mockups: string[];           // Generated mockup URLs
  graphicUrl: string | null;   // Clean design (no mockup)
  printAssetUrl: string | null; // Print-ready file
  tokensSpent: number;
  tokensRemaining: number;
}

Example Request

curl -X POST https://yourapp.com/api/chat \
  -H "Content-Type: application/json" \
  -H "Cookie: sb-access-token=..." \
  -d '{
    "prompt": "a vintage surf logo with palm trees and sunset vibes",
    "garmentUrls": ["https://cdn.shopify.com/blank-tee.png"],
    "model": "nano-banana"
  }'

Example Response

{
  "threadId": "550e8400-e29b-41d4-a716-446655440000",
  "messageId": "6ba7b810-9dad-11d1-80b4-00c04fd430c8",
  "mockups": [
    "https://replicate.delivery/pbxt/xxx/mockup-0.png",
    "https://replicate.delivery/pbxt/xxx/mockup-1.png"
  ],
  "graphicUrl": "https://replicate.delivery/pbxt/xxx/design.png",
  "printAssetUrl": "https://ucarecdn.com/xxx/print-asset.png",
  "tokensSpent": 12,
  "tokensRemaining": 88
}
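
The same call from TypeScript (a minimal sketch; in the browser the session cookie is sent automatically, so only the JSON body is needed):
const res = await fetch("/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    prompt: "a vintage surf logo with palm trees and sunset vibes",
    garmentUrls: ["https://cdn.shopify.com/blank-tee.png"],
    model: "nano-banana",
  }),
});
const data: ChatResponse = await res.json();
console.log(data.mockups, data.tokensRemaining);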

Two-Step Generation

The chat API uses a two-step process:

  1. Design Generation: the AI generates a clean design graphic from the prompt.
  2. Mockup Compositing: the design is placed on garment blanks (if provided).

// lib/ai/mockup.ts
const result = await generateUnified({
  prompt,
  referenceImages,
  garmentUrls,
  designUrls,
  mockupCount: 2,
  modelKey: "nano-banana",
});

// result.graphicUrl = clean design
// result.mockups = designs on garments

Model Selection

Model              | Tokens | Quality     | Speed
-------------------|--------|-------------|---------------
nano-banana        | 12     | Standard    | Fast (~5s)
nano-banana-pro    | 45     | High        | Medium (~15s)
nano-banana-pro-4k | 90     | Print-ready | Slower (~25s)
The default is nano-banana; the pro model is recommended for final designs.
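
If the client needs to show costs up front, a small lookup mirroring the table above works; the constant name here is illustrative, not part of the API:
// Hypothetical client-side mirror of the model table above
const MODEL_TOKEN_COST: Record<string, number> = {
  "nano-banana": 12,
  "nano-banana-pro": 45,
  "nano-banana-pro-4k": 90,
};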

Reference Images

Garment Blanks

Provide garment images for mockup generation:
{
  "prompt": "minimalist mountain logo",
  "garmentUrls": [
    "https://cdn.example.com/black-tee.png",
    "https://cdn.example.com/white-hoodie.png"
  ]
}

Design References

Provide existing designs for style reference:
{
  "prompt": "similar style but with ocean theme",
  "designUrls": [
    "https://cdn.example.com/existing-design.png"
  ]
}

Combined

{
  "prompt": "adapt this logo for summer collection",
  "garmentUrls": ["https://cdn.example.com/tank-top.png"],
  "designUrls": ["https://cdn.example.com/my-logo.png"]
}

Token Validation

Before generation, the API checks token balance:
const tokensNeeded = estimateTokensForJob({ imagesRequested: 2 });
const tokensAvailable = await getProjectTokenBalance(projectId);

if (tokensAvailable < tokensNeeded) {
  return Response.json({
    error: "Token balance too low",
    tokensNeeded,
    tokensAvailable,
  }, { status: 402 });
}
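
estimateTokensForJob itself is not shown here; a minimal sketch consistent with the model table and the example response (12 tokens for a two-mockup nano-banana job) might charge a flat per-job cost, though the real formula may also weigh imagesRequested:
// Illustrative only; the actual estimator lives in the app's billing code
function estimateTokensForJob({
  imagesRequested,
  modelKey = "nano-banana",
}: {
  imagesRequested: number;
  modelKey?: string;
}): number {
  const flatCost: Record<string, number> = {
    "nano-banana": 12,
    "nano-banana-pro": 45,
    "nano-banana-pro-4k": 90,
  };
  // Assumed: flat cost per job regardless of imagesRequested
  return flatCost[modelKey] ?? 12;
}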

Database Records

Each generation creates:
  1. Thread (if new conversation)
  2. User message with prompt
  3. Assistant message with mockups
  4. Render job with generation details
  5. Render assets for each output image
  6. Usage ledger entry for tokens
// Simplified flow
// 1. Thread for a new conversation
await supabase.from('threads').insert({ ... });
// 2-3. User prompt and assistant reply
await supabase.from('messages').insert([userMessage, assistantMessage]);
// 4. Render job with generation details
await supabase.from('render_jobs').insert({ ... });
// 5. One render asset per output image
await supabase.from('render_assets').insert(assets);
// 6. Usage ledger entry for tokens spent
await recordUsageLedger({ tokens: tokensSpent, ... });

Error Responses

Status | Error                   | Description
-------|-------------------------|-----------------------------
400    | "Prompt is required"    | Missing or too-short prompt
401    | "Unauthorized"          | No valid session
402    | "Token balance too low" | Insufficient tokens
500    | "Generation failed"     | AI provider error
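
On the client, the 402 body includes the token counts from the validation step above, so callers can surface a useful message (sketch):
const res = await fetch("/api/chat", { /* ... request body ... */ });
if (res.status === 402) {
  const { tokensNeeded, tokensAvailable } = await res.json();
  // e.g. prompt the user to top up tokens
  console.warn(`Need ${tokensNeeded} tokens, have ${tokensAvailable}`);
} else if (!res.ok) {
  const { error } = await res.json();
  throw new Error(error);
}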

Provider Fallback

If Replicate fails, the API falls back to fal.ai:
// lib/ai/mockup.ts
try {
  return await generateWithReplicate(params);
} catch (error) {
  // Log the primary provider's failure before falling back
  console.warn("Replicate failed, trying fal.ai", error);
  return await generateWithFal(params);
}

Rate Limiting

  • 10 requests per minute per user
  • 100 requests per hour per project
  • Enforced via middleware (see the sketch below)
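
A minimal sketch of the per-user limit as Next.js middleware; the in-memory store and cookie key are assumptions (a production limiter would use a shared store such as Redis, and the per-project hourly limit is omitted):
// middleware.ts (illustrative only)
import { NextResponse, type NextRequest } from "next/server";

const WINDOW_MS = 60_000;  // 1 minute
const MAX_REQUESTS = 10;   // per user per window
const hits = new Map<string, number[]>();

export function middleware(req: NextRequest) {
  if (!req.nextUrl.pathname.startsWith("/api/chat")) return NextResponse.next();

  // Key requests by session cookie, falling back to "anon"
  const key = req.cookies.get("sb-access-token")?.value ?? "anon";
  const now = Date.now();
  const recent = (hits.get(key) ?? []).filter((t) => now - t < WINDOW_MS);

  if (recent.length >= MAX_REQUESTS) {
    return NextResponse.json({ error: "Rate limit exceeded" }, { status: 429 });
  }

  recent.push(now);
  hits.set(key, recent);
  return NextResponse.next();
}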

Streaming (Future)

The API currently returns complete results; streaming support is planned:
// Future: streaming response
return new Response(
  new ReadableStream({
    async start(controller) {
      const encoder = new TextEncoder();
      // Stream progress updates as server-sent events
      controller.enqueue(
        encoder.encode(`data: ${JSON.stringify({ status: "generating" })}\n\n`)
      );
      // ...
      controller.close();
    }
  }),
  { headers: { "Content-Type": "text/event-stream" } }
);
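
A client could then read that stream incrementally; this sketch assumes the server-sent-event framing shown above:
const res = await fetch("/api/chat", { method: "POST", body });
const reader = res.body!.getReader();
const decoder = new TextDecoder();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // Each frame looks like: data: {"status":"generating"}\n\n
  for (const line of decoder.decode(value).split("\n")) {
    if (line.startsWith("data: ")) {
      console.log(JSON.parse(line.slice(6)).status);
    }
  }
}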