Phi-3 Family
The Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and the next size up across a variety of language, reasoning, coding, and math benchmarks. This release expands the selection of high-quality models for customers, offering more practical choices for composing and building generative AI applications.
The Phi-3 Family includes mini, small, medium, and vision versions, trained at different parameter counts to serve various application scenarios. Each model is instruction-tuned and developed in accordance with Microsoft's Responsible AI, safety, and security standards, so it is ready to use off the shelf.
Phi-3-Mini
Phi-3-mini is a 3.8B parameter language model, available in two context lengths: 4K and 128K.
Phi-3-Mini is a Transformer-based language model with 3.8 billion parameters. It was trained using high-quality data containing educationally useful information, augmented with new data sources consisting of various NLP synthetic texts, and both internal and external chat datasets, which significantly improve chat capabilities. Additionally, Phi-3-Mini has been chat fine-tuned after pre-training through supervised fine-tuning (SFT) and Direct Preference Optimization (DPO). Following this post-training, Phi-3-Mini has demonstrated significant improvements in several capabilities, particularly in alignment, robustness, and safety. The model is part of the Phi-3 family and comes in the Mini version with two variants, 4K and 128K, which represent the context length (in tokens) that it can support.
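To make this concrete, here is a minimal inference sketch using the Hugging Face transformers library. The repo id microsoft/Phi-3-mini-4k-instruct and the generation settings are illustrative assumptions, not part of this overview; the 128K variant would be loaded the same way under its own repo id.

```python
# Minimal sketch: chat inference with Phi-3-mini via Hugging Face transformers.
# The repo id below is assumed; swap in the 128K variant for long-context work.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"  # assumed repo id
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Because the model is chat fine-tuned (SFT + DPO), prompts should go through
# the chat template rather than being passed as raw text.
messages = [{"role": "user", "content": "Summarize what an SLM is in one sentence."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=120)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```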
Phi-3-Small
Phi-3-small is a 7B parameter language model, available in two context lengths: 8K and 128K.
Phi-3-Small is a Transformer-based language model with 7 billion parameters. It was trained using high-quality data containing educationally useful information, augmented with new data sources that consist of various NLP synthetic texts, and both internal and external chat datasets, which significantly improve chat capabilities. In addition, Phi-3-Small has been chat fine-tuned after pre-training via supervised fine-tuning (SFT) and Direct Preference Optimization (DPO). Following this post-training, Phi-3-Small has shown significant improvements in several capabilities, particularly in alignment, robustness, and safety. Phi-3-Small is also more intensively trained on multilingual datasets compared to Phi-3-Mini. The model family offers two variants, 8K and 128K, which represent the context length (in tokens) that it can support.
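Because the model is chat fine-tuned, multi-turn conversations should be rendered through its chat template so the prompt matches the format seen during SFT and DPO. A minimal sketch, assuming microsoft/Phi-3-small-8k-instruct as the repo id, with a French conversation to echo the heavier multilingual training:

```python
# Minimal sketch: rendering a multi-turn, multilingual conversation into the
# prompt format the chat-fine-tuned model expects. The repo id is assumed.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "microsoft/Phi-3-small-8k-instruct", trust_remote_code=True
)

messages = [
    {"role": "user", "content": "Bonjour ! Peux-tu te présenter ?"},
    {"role": "assistant", "content": "Bonjour ! Je suis un assistant IA."},
    {"role": "user", "content": "Quelle est la capitale du Canada ?"},
]

# apply_chat_template emits the special role tags the model was trained on,
# ending with the assistant tag so generation continues as the assistant.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```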
Phi-3-Medium
Phi-3-medium is a 14B parameter language model, available in two context lengths: 4K and 128K.
Phi-3-Medium is a Transformer-based language model with 14 billion parameters. It was trained using high-quality data containing educationally useful information, augmented with new data sources that consist of various NLP synthetic texts, and both internal and external chat datasets, which significantly improve chat capabilities. Additionally, Phi-3-Medium has been chat fine-tuned after pre-training through supervised fine-tuning (SFT) and Direct Preference Optimization (DPO). Following this post-training, Phi-3-Medium has exhibited significant improvements in several capabilities, particularly in alignment, robustness, and safety. The model family offers two variants, 4K and 128K, which represent the context length (in tokens) that it can support.
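In practice, the difference between the 4K and 128K variants is how much input fits in a single prompt. Below is a hypothetical helper (the repo ids and the pick_variant function are assumptions for illustration, not an official API) that counts tokens and selects the smallest variant whose context window can hold the input plus a reply budget:

```python
# Hypothetical helper: choose the Phi-3-Medium variant whose context window
# fits a given input. Repo ids below are assumed Hugging Face names.
from transformers import AutoTokenizer

VARIANTS = {
    4_096: "microsoft/Phi-3-medium-4k-instruct",     # 4K context
    131_072: "microsoft/Phi-3-medium-128k-instruct",  # 128K context
}

def pick_variant(document: str, reply_budget: int = 1_024) -> str:
    # Both variants share a tokenizer, so either repo id works for counting.
    tokenizer = AutoTokenizer.from_pretrained(VARIANTS[4_096])
    n_tokens = len(tokenizer.encode(document))
    for window, repo_id in sorted(VARIANTS.items()):
        if n_tokens + reply_budget <= window:
            return repo_id
    raise ValueError(f"Input of {n_tokens} tokens exceeds the largest window.")

print(pick_variant("A short prompt easily fits the 4K variant."))
```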
Phi-3-vision
Phi-3-vision is a 4.2B parameter multimodal model with language and vision capabilities.
Phi-3-vision is the first multimodal model in the Phi-3 family, bringing together text and images. Phi-3-vision can be used to reason over real-world images and extract and reason over text from images. It has also been optimized for chart and diagram understanding and can be used to generate insights and answer questions. Phi-3-vision builds on the language capabilities of the Phi-3-mini, continuing to pack strong language and image reasoning quality in a small size.
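Because the model takes interleaved image and text input, invocation differs slightly from the text-only models: the image is referenced in the prompt via a placeholder tag and passed to the processor alongside the rendered text. A minimal sketch, where the repo id microsoft/Phi-3-vision-128k-instruct and the <|image_1|> placeholder convention are assumptions based on published model-card usage:

```python
# Minimal sketch: image + text chat with Phi-3-vision.
# Repo id and the <|image_1|> placeholder are assumed conventions.
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Phi-3-vision-128k-instruct"  # assumed repo id
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
    _attn_implementation="eager",  # avoids the flash-attention dependency
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("chart.png")  # e.g. a bar chart to reason over
messages = [{"role": "user", "content": "<|image_1|>\nWhat trend does this chart show?"}]
prompt = processor.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# The processor pairs the <|image_1|> tag in the prompt with the image tensor.
inputs = processor(prompt, images=[image], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(processor.batch_decode(
    outputs[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0])
```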