AI API Model Guide

Your complete reference for choosing, configuring, and optimizing our AI models. We’ve laid out the specs, parameters, and best practices in one place so you can get the most out of each one.

Quick Start: Find Your Model

Not sure which model to use? Here’s the breakdown:

Text Generation

From quick chatbots to complex reasoning. GPT-4o for most cases, O3 Pro for heavy thinking.

Image Generation

Professional artwork to product photos. Midjourney for style, GPT-Image-1 for products, Flux for speed.

Video Creation

Cinematic clips to stylized storytelling. Gemini Veo 3.1 for quality, Sora 2 for versatility.

Image Editing

Transform existing images. Magic Edit for natural changes, Flux for style transfers.

Text Generation Models

Pick the right LLM for your project based on speed, reasoning power, and cost.
| Model Name | Capability | Cost | Best For | Special Notes |
|---|---|---|---|---|
| OpenAI O3 Pro (Deep Thinker) | Heavy-duty reasoning, complex analysis | Max | Complex problem-solving, research | High precision |
| OpenAI GPT-5 (Omni Expert) | Advanced reasoning, multimodal | Standard | Advanced understanding, multimodal tasks | Versatile |
| OpenAI GPT-4o (Smart Assistant) | Conversational, general purpose | Standard | Most everyday tasks, chatbots | Reliable workhorse |
| Claude 4.0 Sonnet (Creative) | Writing, creative content | Standard | Copywriting, creative work | Excellent for long-form |
| Gemini 2.5 Pro (Tech Expert) | Technical Q&A, structured thinking | Standard | Technical documentation, code review | Great for precision |
| OpenAI o4-mini (Real-time) | Quick responses, time-sensitive | Standard | Real-time interactions, live chat | Ultra-fast |
| Claude Haiku 3.5 (Summarizer) | Summarization, content condensing | Low | Quick summaries, bulk processing | Cost-effective |
| Gemini 2.5 Flash (Fastest) | Lightweight tasks, snappy responses | Low | Basic tasks, high-volume processing | Speed-optimized |
| DeepSeek V3 (Multi-Tool) | General-purpose, cost-effective | Low | Budget-conscious projects | All-rounder |
| DeepSeek R1 (Problem Solver) | Step-by-step reasoning | Low | Transparent reasoning, logic problems | Shows its work |

  • Need to think hard? Go with O3 Pro or GPT-5 for complex analysis
  • Building a chatbot? GPT-4o is your default choice
  • Writing-heavy content? Claude Sonnet shines for copywriting and creative work
  • Budget tight? DeepSeek or Haiku will get you there without breaking the bank
  • Speed matters most? o4-mini or Flash for real-time applications
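
If you want to encode these rules of thumb in your own app, a small lookup table is usually enough. The sketch below is only an illustration: the priority keys and the `recommend` helper are hypothetical names, and the model names are copied from the bullets above rather than from any official model ID list.

```python
# Hypothetical lookup built from the recommendations above.
# The keys are made-up labels; the values are display names, not API model IDs.
RECOMMENDED_MODELS = {
    "complex_analysis": ["O3 Pro", "GPT-5"],
    "chatbot": ["GPT-4o"],
    "writing": ["Claude Sonnet"],
    "budget": ["DeepSeek", "Haiku"],
    "real_time": ["o4-mini", "Flash"],
}


def recommend(priority: str) -> list[str]:
    """Return the suggested models for a priority, or raise on an unknown one."""
    try:
        return RECOMMENDED_MODELS[priority]
    except KeyError:
        options = ", ".join(sorted(RECOMMENDED_MODELS))
        raise ValueError(f"unknown priority {priority!r}; expected one of: {options}") from None


print(recommend("budget"))  # ['DeepSeek', 'Haiku']
```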

Image Generation Models

From fast drafts to premium artwork—pick based on your quality needs and timeline.
| Model | Use Case | Cost | Response Format | Supported Features | Quality |
|---|---|---|---|---|---|
| Midjourney (Creative Studio) | Professional artistic imagery, style diversity | Standard | Only JPG, JSON | Custom style, detailed prompts | High, authentic |
| GPT-Image-1 (Design Studio) | Product visuals, polished designs | Standard | Only JPG, JSON | Design-focused, professional | High |
| Gemini Imagen 4 (Consistent Image) | High-quality generation, clear text rendering | Standard | JSON_JSON | Text-clear, consistent results | High |
| Doubao Seedream 4.0 (Smart Image) | Premium 2K output, fine detail | Max | Only url | Fine detail, HD quality | Premium |
| Gemini Nano (Magic Edit) | Photo editing, natural modifications | Standard | Only jpg | Natural edits, style adjustment | High |
| Flux AI (Quick Image) | Fast turnaround, basic needs | Low | MT | Simple, efficient | Standard |

Midjourney Parameters Guide

If you’re using Midjourney Creative Studio, here are the key parameters:

Aspect Ratio

--aspect or --ar
Examples: 16:9, 1:1, 9:16

Quality

--quality or --q
Values: 0.25, 0.5, 1 (higher = more detail, more cost)

Style

--style raw, --style cute, etc.
Affects artistic direction and mood

Image Upload

Include the URL of a reference image
For guided generation from existing images
Pro tip: Start with --q 0.5 to preview, then bump to 1 for final versions to save credits.
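
To keep drafts and finals consistent, it can help to assemble the prompt string in one place. Here is a minimal sketch, assuming the parameters are appended to the prompt text as flags. The `build_midjourney_prompt` helper is hypothetical, and putting the reference URL at the start of the prompt is an assumption, so check how your integration expects it.

```python
# Quality values listed in this guide.
ALLOWED_QUALITY = (0.25, 0.5, 1)


def build_midjourney_prompt(description, aspect_ratio=None, quality=None,
                            style=None, reference_url=None):
    """Join a description and optional flags into one prompt string (hypothetical helper)."""
    if quality is not None and quality not in ALLOWED_QUALITY:
        raise ValueError(f"quality must be one of {ALLOWED_QUALITY}, got {quality}")
    parts = []
    if reference_url:
        # Assumption: the reference image URL goes before the text description.
        parts.append(reference_url)
    parts.append(description.strip())
    if aspect_ratio:
        parts.append(f"--ar {aspect_ratio}")
    if quality is not None:
        parts.append(f"--q {quality:g}")
    if style:
        parts.append(f"--style {style}")
    return " ".join(parts)


# Preview at --q 0.5 first, then rerun the same prompt at --q 1 for the final version.
draft = build_midjourney_prompt("a lighthouse at dusk", aspect_ratio="16:9", quality=0.5, style="raw")
final = build_midjourney_prompt("a lighthouse at dusk", aspect_ratio="16:9", quality=1, style="raw")
print(draft)  # a lighthouse at dusk --ar 16:9 --q 0.5 --style raw
print(final)  # a lighthouse at dusk --ar 16:9 --q 1 --style raw
```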

Image Editing & Image-to-Image Models

Transform existing images with AI, from subtle edits to full style transfers.

Gemini Nano (Magic Edit)

Use Case: Photo editing by natural description
Cost: Standard | Format: JPG
Edit, adjust, and modify existing images with natural language instructions.

Flux

Use Case: Style transfer & variations
Cost: Standard | Format: URL
Create variations and transform images with different artistic styles.

Midjourney

Use Case: Upscaling & iteration
Cost: Standard | Format: MT
Natural editing with full Midjourney parameters for advanced control.

Doubao

Use Case: Quick practical edits
Cost: Standard | Format: MT
Fast processing for straightforward image adjustments and modifications.

Fooocus

Use Case: Batch & custom editing
Cost: Standard | Format: MT-JSON URL
Process multiple images with custom edits and batch operations.

Magic Edit Specs

  • Aspect ratio: 16:9, 1:1, 9:16, etc. (default: 1:1)
  • Response: Always returns a url (typically 4 images unless you change the number of images)
  • Display: Images typically display at default size unless you specify a different resolution
  • Quality: Solid results, can adjust style via prompt
Example: “Edit this beach photo to have a sunset sky instead” — Magic Edit handles it naturally

Video Generation Models

Create videos from scratch or from images. Choose based on quality, speed, and complexity.
| Model | Text-to-Video | Image-to-Video | Description | Capabilities | Cost |
|---|---|---|---|---|---|
| Gemini Veo 3.1 (Cinema Studio) | | | Full-featured video generation with audio synthesis. Synced by the latest AI model. | High-quality generation with audio, synchronized speech/music | Max credits |
| Sora 2 (Synced-audio) | | | Richly detailed, dynamic clips from natural language or reference images | Synced audio, dynamic output, detailed generation | Standard credits |
| Midjourney Video (Artistic Video) | | | Stylized storytelling with visual flair, artistic motion | Artistic effects, smooth transitions, stylized motion | Standard credits |
Common Parameters:
  • prompt (required): Text description of the video to generate
  • video_resolution: Output resolution (default: 720p)
  • aspect_ratio (optional): 16:9, 9:16, 1:1 (default: 16:9)
  • generate_audio (optional): Generate audio track (true or false)
  • image_input (optional): Base image file to lock the opening frame for image-to-video generation
Response: Generates video and returns url (typically 1 video output)
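
The parameters above can be sketched as a request body. This is an illustration only: the field names come from the list above, but the `build_video_request` helper, the JSON shape, and the validation rules are assumptions, and no endpoint is shown because this guide does not specify one.

```python
import json

# Aspect ratios listed in this guide.
ASPECT_RATIOS = {"16:9", "9:16", "1:1"}


def build_video_request(prompt, video_resolution="720p", aspect_ratio="16:9",
                        generate_audio=None, image_input=None):
    """Build a request body from the common video parameters (hypothetical helper)."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    payload = {
        "prompt": prompt.strip(),
        "video_resolution": video_resolution,
        "aspect_ratio": aspect_ratio,
    }
    # Optional fields are only sent when the caller sets them.
    if generate_audio is not None:
        payload["generate_audio"] = bool(generate_audio)
    if image_input is not None:
        payload["image_input"] = image_input
    return payload


body = build_video_request("A slow pan across a foggy harbor", aspect_ratio="9:16", generate_audio=True)
print(json.dumps(body, indent=2))
```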

Model Parameters & Advanced Usage

Temperature (0.0 - 2.0)
  • Lower = more focused, deterministic
  • Higher = more creative, varied
  • Default: typically around 0.7-1.0
Max Tokens
  • Controls response length
  • Varies by model
Top P (0-1)
  • Controls diversity of output
  • Lower = more focused, higher = more variety
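
Temperature and Top P are usually applied to the model's next-token scores before a token is sampled. The sketch below shows the common textbook definitions (softmax with temperature, then nucleus filtering) on a toy list of scores. It is not a description of how any specific model in this guide implements them, and treating a temperature of 0 as "always pick the top token" is an assumption.

```python
import math


def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature < 0:
        raise ValueError("temperature must be >= 0")
    if temperature == 0:
        # Assumption: 0 means fully deterministic, so all mass goes to the top score.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of top tokens whose total probability reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


logits = [2.0, 1.0, 0.0]
for t in (0.5, 1.0, 2.0):
    print(t, [round(p, 3) for p in apply_temperature(logits, t)])
print(top_p_filter([0.6, 0.3, 0.1], top_p=0.8))  # keeps tokens 0 and 1 only
```

Lower temperature pushes more probability onto the top token, and a lower Top P drops more of the unlikely tokens, which is why both make output more focused.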
Size
  • Standard: 1024x1024, 512x512
  • Some models support custom: 1280x720, 720x1280, etc.
  • Check model specs for supported sizes
Number of Images
  • Most models: 1-4 outputs
  • Batch operations: can request multiple sets
Guidance Scale
  • Controls how strictly the model follows your prompt
  • Higher = more adherence, lower = more creative freedom
Duration
  • Typical: 5-10 seconds depending on model
  • Some models support longer outputs with parameter adjustment
Frame Rate
  • Standard: 24-30 fps
  • Can specify for smooth playback
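
Duration and frame rate together determine how many frames a clip contains, which is a handy number for estimating file size or processing time. A quick worked example (the `frame_count` helper is just for illustration):

```python
def frame_count(duration_seconds: float, fps: float) -> int:
    """Number of frames in a clip: duration multiplied by frame rate."""
    if duration_seconds <= 0 or fps <= 0:
        raise ValueError("duration and fps must be positive")
    return round(duration_seconds * fps)


for seconds, fps in [(5, 24), (10, 30)]:
    print(f"{seconds} s at {fps} fps = {frame_count(seconds, fps)} frames")
# 5 s at 24 fps = 120 frames
# 10 s at 30 fps = 300 frames
```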

Choosing the Right Model: Quick Decision Tree

What are you building?

  • Chatbot
  • Content Creation
  • E-commerce
  • Video Platform
  • Photo Editor
  • Budget App
  • Research/Analysis
  • Real-time App
Chatbot → GPT-4o (Smart Assistant)
Reliable, conversational, great for customer support. Fast enough for real-time chat.

Cost Breakdown

Understanding Credit Tiers

Max Credit Models

Highest quality, best reasoning or output
Use when quality and accuracy matter most

Standard Credit Models

Balanced cost and quality
Best for most projects; good value

Low Credit Models

Budget-friendly, still solid quality
Best for high volume, quick drafts, and tight budgets

Pro Tips to Save Credits

  1. Start small — Use a low/standard model first, upgrade if needed
  2. Batch operations — Process multiple requests together when possible
  3. Optimize prompts — Better descriptions = fewer re-runs
  4. Version control — Save what works so you don’t re-run experiments
  5. Quality settings — Use lower quality settings for drafts and previews
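
The "start small, upgrade if needed" tip can be sketched as a loop that tries cheaper tiers first. Everything here is hypothetical scaffolding: `run_model` stands in for whatever call you make to the API, and `is_good_enough` is your own quality check.

```python
from typing import Callable, Sequence, Tuple

# Credit tiers from this guide, cheapest first.
TIERS = ("low", "standard", "max")


def generate_with_escalation(
    prompt: str,
    run_model: Callable[[str, str], str],   # hypothetical: (tier, prompt) -> output
    is_good_enough: Callable[[str], bool],  # hypothetical: your own acceptance check
    tiers: Sequence[str] = TIERS,
) -> Tuple[str, str]:
    """Try each tier in order and stop at the first acceptable result."""
    if not tiers:
        raise ValueError("at least one tier is required")
    result = ""
    for tier in tiers:
        result = run_model(tier, prompt)
        if is_good_enough(result):
            return tier, result
    # Nothing passed the check; return the output from the highest tier tried.
    return tiers[-1], result


# Stubbed usage: pretend only the standard tier produces an acceptable answer.
fake_run = lambda tier, prompt: f"{tier}: {prompt}"
tier, output = generate_with_escalation("summarize this", fake_run, lambda r: r.startswith("standard"))
print(tier, "|", output)  # standard | standard: summarize this
```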

API Response Formats

When using our API directly, models return responses in these formats:

`url`

Direct image/video URL for immediate access

`JPG`

JPG image format for compatibility

`JSON`

Structured JSON response for programmatic use

`MT`

Native Midjourney format with advanced metadata
Always check your specific model’s output spec when using the API directly. Some models support multiple formats; choose based on your integration needs.

Rate Limits & Quotas

  • Concurrent requests: Varies by model tier
  • Monthly limits: Can be adjusted based on your plan
  • Burst capacity: Standard models support higher burst rates
Contact our support team if you need custom quotas for your project.
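
Because concurrent request limits vary by tier, it is safer to cap parallelism on your side than to fire every request at once. A minimal sketch using a thread pool follows. The per-tier numbers are placeholders, not real limits, and `call` stands in for your own request function.

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder caps for illustration only; look up the real limits for your plan.
MAX_CONCURRENT = {"low": 8, "standard": 4, "max": 2}


def run_capped(jobs, call, tier):
    """Run call(job) for every job with at most MAX_CONCURRENT[tier] in flight at a time."""
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT[tier]) as pool:
        # pool.map returns results in the same order as the input jobs.
        return list(pool.map(call, jobs))


# Stubbed usage: str.upper stands in for a real API call.
print(run_capped(["a", "b", "c"], str.upper, "max"))  # ['A', 'B', 'C']
```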

Still Have Questions?

Pro tip: The decision tree on this page matches models to your use case. Start there to narrow down your choices.