Industry Ingest [Archive]

Historical record of major AI breakthroughs and milestones

Category:

Show archived developments

Showing 31 of 31 developments

2025(4 developments)

Image GenerationAug 26, 2025Recent

Google's 'Nano Banana' AI Image Editor Launches in Gemini 2.5 Flash

Google releases Gemini 2.5 Flash Image (codenamed 'Nano Banana'), a breakthrough AI image editor that excels at maintaining character consistency while enabling natural language transformations and multi-image blending. Available via Gemini API at $0.04 per image.

Foundation ModelsAug 7, 2025Recent

GPT-5: OpenAI's Most Advanced Model

OpenAI unveils GPT-5, breaking the 98% barrier on HumanEval with 98.2% and achieving 99.1% on GSM8K. The model demonstrates unprecedented capabilities in code generation and mathematical reasoning.

Foundation ModelsAug 5, 2025Recent

Claude Opus 4.1: Anthropic's Latest Flagship Model

Anthropic releases Claude Opus 4.1 with breakthrough performance, achieving 97.5% on HumanEval and 98.7% on GSM8K. Features enhanced reasoning capabilities and improved long-context understanding.

World ModelsAug 5, 2025Recent

Genie 3: Google DeepMind's Interactive World Model

DeepMind unveils Genie 3, a foundation world model generating real-time interactive environments at 24 fps from text prompts. The model represents a crucial stepping stone toward AGI with auto-regressive frame generation and physical consistency without explicit 3D representation.

2024(10 developments)

Foundation ModelsDec 12, 2024

Gemini 2.0: Native Tool Use and Agentic Capabilities

Google releases Gemini 2.0 with native tool use capabilities, enabling AI agents to interact with external systems and perform complex multi-step tasks autonomously.

Foundation ModelsOct 22, 2024

Claude 3.5 Computer Use: AI Controls Computers

Anthropic introduces computer use capabilities, allowing Claude to control computers through screenshots, opening new possibilities for automation and accessibility.

Foundation ModelsSep 12, 2024

OpenAI o1: Reasoning Through Chain-of-Thought

OpenAI releases o1-preview, a model that uses extended chain-of-thought reasoning to solve complex problems in mathematics, coding, and science.

Foundation ModelsAug 23, 2024

Llama 3.1 405B: Meta's Open-Source Frontier Model

Meta releases Llama 3.1 with a 405B parameter model, the largest open-source model to date, democratizing access to frontier AI capabilities.

Foundation ModelsJun 20, 2024

Claude 3.5 Sonnet: Enhanced Coding and Reasoning

Anthropic releases Claude 3.5 Sonnet with significant improvements in code generation, mathematical reasoning, and visual understanding.

Foundation ModelsMay 13, 2024

GPT-4o: Omni-Modal Understanding

OpenAI introduces GPT-4o with native multimodal capabilities, processing text, vision, and audio in a unified architecture for more natural interactions.

ResearchApr 15, 2024

AI Index 2024: AI Beats Humans on Several Benchmarks

Stanford HAI's AI Index reveals AI now surpasses human performance on several benchmarks including image classification, visual reasoning, and English understanding, but still lags in complex tasks like math and planning.

Foundation ModelsMar 4, 2024

Claude 3: Opus, Sonnet, and Haiku Family

Anthropic releases the Claude 3 model family with vision capabilities, offering different sizes for various use cases from edge devices to complex reasoning tasks.

Computer VisionFeb 15, 2024

Sora: OpenAI's Text-to-Video Generation

OpenAI unveils Sora, a text-to-video model capable of generating minute-long videos with consistent characters, physics, and camera movements.

Foundation ModelsFeb 8, 2024

Gemini 1.5: 1M Token Context Window

Google introduces Gemini 1.5 with a breakthrough 1 million token context window, enabling processing of entire books, codebases, and long videos.

2023(7 developments)

Foundation ModelsDec 6, 2023

Gemini: Google's Multimodal Model Family

Google launches Gemini, a family of multimodal models designed from the ground up to understand text, images, audio, video, and code.

Foundation ModelsNov 6, 2023

GPT-4 Turbo: 128K Context and Improved Performance

OpenAI releases GPT-4 Turbo with 128K context window, improved instruction following, and significantly reduced costs.

Computer VisionSep 20, 2023

DALL-E 3: Advanced Text Understanding for Images

OpenAI launches DALL-E 3 with dramatically improved text understanding and adherence to prompts, integrated with ChatGPT for iterative refinement.

Foundation ModelsJul 11, 2023

Claude 2: 100K Context Window

Anthropic releases Claude 2 with 100K token context window, improved reasoning, and better performance on coding and mathematical tasks.

Foundation ModelsMay 10, 2023

PaLM 2: Google's Multilingual Reasoning Model

Google announces PaLM 2 with improved reasoning capabilities, better multilingual support, and efficient scaling across different model sizes.

Foundation ModelsMar 14, 2023

GPT-4: Multimodal Capabilities

OpenAI releases GPT-4 with vision capabilities, improved reasoning, and significantly better performance on academic and professional exams.

Foundation ModelsMar 14, 2023

Claude: Constitutional AI Assistant

Anthropic launches Claude, an AI assistant trained with Constitutional AI to be helpful, harmless, and honest.

2022(4 developments)

Foundation ModelsNov 30, 2022

ChatGPT: Conversational AI Reaches Mainstream

OpenAI releases ChatGPT, reaching 100 million users in just 2 months and sparking global interest in conversational AI.

Computer VisionAug 22, 2022

Stable Diffusion: Open-Source Image Generation

Stability AI releases Stable Diffusion, democratizing AI art generation with an open-source model that can run on consumer hardware.

Computer VisionMay 23, 2022

Imagen: Google's Photorealistic Text-to-Image

Google Research unveils Imagen, a text-to-image diffusion model with unprecedented photorealism and deep language understanding.

Foundation ModelsApr 4, 2022

PaLM: 540B Parameters Show Emergent Abilities

Google introduces PaLM with 540 billion parameters, demonstrating breakthrough performance and emergent capabilities in reasoning tasks.

2021(3 developments)

Foundation ModelsAug 10, 2021

Codex: AI Code Generation Powers GitHub Copilot

OpenAI releases Codex, a descendant of GPT-3 fine-tuned on code, powering GitHub Copilot and revolutionizing programming assistance.

Computer VisionJan 5, 2021

DALL-E: Text-to-Image Generation

OpenAI introduces DALL-E, demonstrating the ability to generate creative and diverse images from text descriptions.

Computer VisionJan 5, 2021

CLIP: Connecting Text and Images

OpenAI releases CLIP, a neural network that learns visual concepts from natural language supervision, enabling zero-shot image classification.

2020(3 developments)

ResearchNov 30, 2020

AlphaFold 2: Solving Protein Folding

DeepMind's AlphaFold 2 achieves a breakthrough in protein structure prediction, solving a 50-year grand challenge in biology.

Foundation ModelsJun 11, 2020

GPT-3: 175B Parameters Enable Few-Shot Learning

OpenAI releases GPT-3 with 175 billion parameters, demonstrating remarkable few-shot learning capabilities across diverse tasks.

ResearchJan 23, 2020

Scaling Laws for Neural Language Models

OpenAI discovers predictable scaling relationships between model size, dataset size, and performance, guiding future model development.