Key Takeaways
- DALL-E 1 was trained on 250 million image-text pairs scraped from the internet
- DALL-E 2 uses a diffusion decoder conditioned on CLIP image-text embeddings; the open LAION dataset was later built to replicate this style of web-scale image-text training data
- DALL-E 3 was trained on descriptive synthetic captions produced by a purpose-built image captioner (with GPT-4 rewriting user prompts at inference) for improved prompt adherence
- The DALL-E 1 model has 12 billion parameters in total
- DALL-E 2's GLIDE-based diffusion decoder has 3.5 billion parameters
- DALL-E 3 uses a 128x128-to-1024x1024 upscaling decoder with 1 billion parameters
- DALL-E 3 achieves 92% prompt adherence on the Evals benchmark
- DALL-E 2 scores 2.0 vs DALL-E 1's 1.7 on a 0-4 human-preference scale
- DALL-E 1 achieves 72.3% nearest-neighbor accuracy on retrieval tasks
- DALL-E 3 is integrated into ChatGPT Plus with a 50-generations-per-week limit; the same model is available programmatically (see the first sketch after this list)
- DALL-E 2 generated over 2 million images daily at peak in 2022
- Over 1.5 million users accessed DALL-E via ChatGPT by Q1 2024
- DALL-E 3 safety filters block 86% of violent prompts
- C2PA provenance metadata is embedded in 100% of DALL-E 3 outputs (see the verification sketch after this list)
- DALL-E 2 rejected 1.5% of generation attempts for policy violations
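For readers who want to try the programmatic path mentioned in the ChatGPT Plus takeaway, here is a minimal sketch using OpenAI's Python SDK. The `images.generate` endpoint, the `dall-e-3` model name, and the `revised_prompt` field are documented in OpenAI's API reference; the prompt and size chosen here are illustrative only.

```python
# Minimal sketch: generating an image with DALL-E 3 via the OpenAI Python SDK.
# Requires `pip install openai` and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

response = client.images.generate(
    model="dall-e-3",
    prompt="A watercolor painting of a lighthouse at dawn",  # illustrative prompt
    size="1024x1024",  # DALL-E 3 also supports 1792x1024 and 1024x1792
    n=1,               # DALL-E 3 accepts only one image per request
)

print(response.data[0].url)             # hosted URL of the generated image
print(response.data[0].revised_prompt)  # the expanded prompt actually rendered
```

The `revised_prompt` field makes the caption-rewriting behavior behind DALL-E 3's prompt adherence visible: the API returns the expanded prompt that was actually used for generation.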
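To check the C2PA provenance claim above on a concrete file, one option is the open-source `c2patool` CLI from the Content Authenticity Initiative. The sketch below shells out to it from Python; it assumes a recent c2patool whose default invocation prints the manifest report as JSON, and the file name is hypothetical.

```python
# Hedged sketch: inspecting the C2PA manifest embedded in a DALL-E 3 output.
# Assumes the open-source c2patool CLI (github.com/contentauth/c2patool) is
# installed and that its default invocation prints the manifest report as JSON.
import json
import subprocess

def read_c2pa_manifest(path: str) -> dict | None:
    """Return the parsed C2PA manifest report for an image, or None if absent."""
    result = subprocess.run(["c2patool", path], capture_output=True, text=True)
    if result.returncode != 0:  # no manifest found, or the tool failed
        return None
    return json.loads(result.stdout)

report = read_c2pa_manifest("dalle3_output.png")  # hypothetical file name
if report:
    # "active_manifest" is the label of the current manifest in c2patool's
    # report format (an assumption about recent tool versions).
    print("C2PA provenance present:", report.get("active_manifest"))
else:
    print("No C2PA metadata found")
```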
The DALL-E statistics above span five areas: model parameters and architecture, performance metrics, safety and moderation, training and data, and user engagement and usage.
Sources & References
- arXiv (arxiv.org)
- OpenAI (openai.com)
- LAION (laion.ai)
- The Verge (theverge.com)
- OpenAI Platform (platform.openai.com)
- TechCrunch (techcrunch.com)
- Business Insider (businessinsider.com)
- Bing Blogs (blogs.bing.com)
- Microsoft Designer (designer.microsoft.com)
- OpenAI Help Center (help.openai.com)
- Social Media Today (socialmediatoday.com)