Falcon 2 11B 现在可以在 Amazon SageMaker JumpStart 上使用

2026-01-27 12:01:50

Falcon 2 11B 现已在 Amazon SageMaker JumpStart 上发布

由 Supriya Puragundla Armando Diaz Avan Bala Farooq Sabir Hemant Singh 和 Niithiyn Vijeaswaran 于 2024 年 5 月 31 日发布在 Amazon SageMaker，Amazon SageMaker JumpStart，公告，技术如何做永久链接评论分享

今天，我们很高兴地宣布，下一代 Falcon 2 系列中的首个模型 Falcon 2 11B 基础模型FM现已通过 Amazon SageMaker JumpStart 提供部署和推理服务。

Falcon 2 11B 是一个在 55 万亿标记数据集上训练的密集解码器模型，支持多种语言。该模型可以让您迅速使用内置算法、FM 和预构建的 ML 解决方案，加快 ML 开发的速度。

在本文中，我们将介绍如何使用 SageMaker JumpStart 发现、部署并运行 Falcon 2 11B 模型。

关键要点

新发布的 Falcon 2 11B 模型是 TII 提供的基础模型，支持多语言。模型在 55 万亿标记数据集上训练，采用无因果解码器架构。在 Amazon SageMaker JumpStart 上，一键式部署使用户更快捷地进行模型推理。

什么是 Falcon 2 11B 模型

Falcon 2 11B 是 TII 在其新的人工智能AI模型系列 Falcon 2 下发布的首个基础模型。作为 Falcon 系列的下一代模型，它是更高效、易于获取的大型语言模型LLM，训练过程中使用了主要来源于 RefinedWeb 的 55 万亿标记数据集，并且具备 110 亿参数。该模型采用因果解码器架构，使其在自回归任务中表现突出，且具备多语言能力，可以流畅处理英语、法语、西班牙语、德语、葡萄牙语等语言的任务。

Falcon 2 11B 是一个原始的、预训练的模型，可以作为更专业化任务的基础，同时也允许您对模型进行微调以适应特定用例，如摘要生成、文本生成、聊天机器人等。

Falcon 2 11B 通过 SageMaker TGI 深度学习容器DLC支持，该容器基于文本生成推理的开放源代码、专用解决方案，使得高性能文本生成成为可能。

该模型遵循 TII Falcon License 20，这是一个基于宽松 Apache 20 的软件许可证，使 AI 使用更具责任感。

什么是 SageMaker JumpStart

SageMaker JumpStart 是 SageMaker ML 平台内的一个强大功能，为 ML 从业人员提供了一个全面的公共和专有基础模型FM中心。借助这一托管服务，ML 从业人员可以访问越来越多的尖端模型，从领先的模型中心和供应商那里，可以将这些模型部署到专用的 SageMaker 实例中，并在隔离的网络环境中进行自定义。

您可以通过在 Amazon SageMaker Studio 中轻松发现和部署 Falcon 2 11B 模型，或者通过 SageMaker Python SDK 以编程方式进行操作，结合 SageMaker 的特性，如 Amazon SageMaker Pipelines、Amazon SageMaker Debugger 或容器日志，获取模型性能数据和 MLOps 控制功能。Falcon 2 11B 模型目前可在 22 个 AWS 区域的推理中使用，您需要使用 g5 和 p4 实例。

前提条件

要使用 SageMaker JumpStart 试用 Falcon 2 模型，您需要以下前提条件：

一个 AWS 账户以存储所有 AWS 资源。一个 AWS 身份和访问管理IAM角色，以访问 SageMaker。有关 IAM 如何与 SageMaker 配合的更多信息，请参考 Amazon SageMaker 的身份和访问管理。访问 SageMaker Studio、SageMaker 笔记本实例或像 PyCharm、Visual Studio Code 等交互式开发环境IDE。我们建议使用 SageMaker Studio 进行简单的部署和推理。

在 SageMaker JumpStart 中发现 Falcon 2 11B

您可以通过 SageMaker JumpStart 在 SageMaker Studio UI 和 SageMaker Python SDK 中访问基础模型FM。在这一部分，我们将介绍如何在 SageMaker Studio 中发现模型。

SageMaker Studio 是一个 IDE，提供单一的基于网络的可视化界面，您可以使用该界面执行 ML 开发的所有步骤，从准备数据到构建、训练和部署 ML 模型。有关如何入门和设置 SageMaker Studio 的更多详细信息，请参考 Amazon SageMaker Studio。

在 SageMaker Studio 中，您可以通过导航窗格选择 JumpStart 或从主页选择 JumpStart 访问 SageMaker JumpStart。

蓝鲸加速器官方网站

从 SageMaker JumpStart 登陆页面，您可以找到来自最流行模型中心的预训练模型。您可以在搜索框中搜索 Falcon，搜索结果将列出 Falcon 2 11B 文本生成模型和其他可用的 Falcon 模型变体。

您可以选择模型卡，以查看有关模型的详细信息，例如许可证、训练所用数据和如何使用模型。您还会找到两个选项 Deploy 和 Preview notebooks，以部署模型并创建端点。

在 SageMaker JumpStart 中部署模型

选择 Deploy 后，部署就开始了。SageMaker 将代表您执行部署操作，使用在部署配置中分配的 IAM SageMaker 角色。部署完成后，您会看到一个端点被创建。您可以通过传递示例推理请求有效负载测试端点，或者使用 SDK 选择测试选项。当您使用 SDK 时，将显示可以在 SageMaker Studio 中的笔记本编辑器中使用的示例代码。

Falcon 2 11B 文本生成

要使用 SDK 部署，我们首先选择 Falcon 2 11B 模型，其 modelid 的值为 huggingfacellmfalcon211b。您可以使用以下代码在 SageMaker 上部署所选的任何模型。类似地，您也可以使用其自己的模型 ID 部署 Falcon 2 11B LLM。

pythonfrom sagemakerjumpstartmodel import JumpStartModel accepteula = Falsemodel = JumpStartModel(modelid=huggingfacellmfalcon211b) predictor = modeldeploy(accepteula=accepteula)

这将在 SageMaker 上使用默认配置包括默认实例类型和默认 VPC 配置部署模型。您可以通过在 JumpStartModel 中指定非默认值来更改这些配置。建议的实例类型是 mlg512xlarge、mlg524xlarge、mlg548xlarge 或 mlp4d24xlarge。确保您有一个或多个这些实例类型的账户级服务限制来部署此模型。有关更多信息，请参阅请求配额增加。

部署后，您可以通过 SageMaker 预测器针对已部署的端点运行推理：

pythonpayload = { inputs User Hello!nFalcon parameters { maxnewtokens 100 topp 09 temperature 06 }}predictorpredict(payload)

示例提示

您可以像任何标准文本生成模型一样与 Falcon 2 11B 模型互动，其中模型处理输入序列并输出序列中的预测下一个词。在这一部分中，我们提供一些示例提示和样本输出。

文本生成

以下是模型生成的文本的示例提示：

pythonpayload = { inputs Building a website can be done in 10 simple steps parameters { maxnewtokens 80 topk 10 dosample True returnfulltext False } } response = predictorpredict(payload)[0][generatedtext]strip() print(response)

以下是输出：

1 Decide what the site will be about2 Research the topic 3 Sketch the layout and design 4 Register the domain name 5 Set up hosting 6 Install WordPress 7 Choose a theme 8 Customize theme colors typography and logo 9 Add content 10 Test and finalize

代码生成

使用前面的示例，我们可以使用代码生成提示，如下所示：

pythonpayload = { inputs Write a function in Python to write a json file parameters { maxnewtokens 300 dosample True returnfulltext False } } response = predictorpredict(payload)[0][generatedtext]strip() print(response)

该代码利用 Falcon 2 11B 生成一个 Python 函数，用于写入 JSON 文件。它定义了一个有效负载字典，包含输入提示 Write a function in Python to write a json file 和一些控制生成过程的参数。然后将此有效负载发送到预测器可能是一个 API，接收生成的文本响应并将其打印到控制台。打印的输出应该是请求的用于写入 JSON 文件的 Python 函数。

输出如下：

json{ name John age 30 city New York}pythonimport json

def writejsonfile(filename jsonobj) try with open(filename w encoding=utf8) as outfile jsondump(jsonobj outfile ensureascii=False indent=4) print(Created json file {}format(filename)) except Exception as e print(Error occurred str(e))

示例用法

writejsonfile(datajson { name John age 30 city New York})

Falcon 2 11B 现在可以在 Amazon SageMaker JumpStart 上使用

此代码生成了 writejsonfile 函数，该函数接受文件名称和 Python 对象，并将对象作为 JSON 数据写入。Falcon 2 11B 使用内置的 JSON 模块处理异常。示例用法位于底部，将字典包含姓名、年龄和城市键写入名为 datajson 的文件。输出显示了预期的 JSON 文件内容，展示了模型在自然语言处理NLP和代码生成能力方面的表现。

情感分析

您可以使用以下提示，通过 Falcon 2 11B 进行情感分析：

pythonpayload = { inputs Tweet I am so excited for the weekend! Sentiment Positive

Tweet Why does traffic have to be so terribleSentiment NegativeTweet Just saw a great movie would recommend itSentiment PositiveTweet According to the weather report it will be cloudy todaySentiment NeutralTweet This restaurant is absolutely terribleSentiment NegativeTweet I love spending time with my familySentimentparameters {    maxnewtokens 2    dosample True    returnfulltext False }

}response = predictorpredict(payload)[0][generatedtext]strip()print(response)

以下是输出：

Positive

该情感分析示例演示了如何使用 Falcon 2 11B 提供推文及其相应的情感标签正面、负面、中性。最后一条推文“我喜欢和家人在一起”没有情感，这促使模型自主生成分类结果。maxnewtokens 参数设置为 2，表示模型应该生成简短的输出，通常只是情感标签。设置 dosample 为 true，模型可以从其输出分布中进行采样，从而可能在情感任务中获得更好的结果。模型通过先前示例学习的输入文本模式与其分类输出。

问答

您也可以使用以下问答提示与 Falcon 2 11B 进行交互：

python

问答

payload = { inputs Respond to the question How did the development of transportation systems such as railroads and steamships impact global trade and cultural exchange parameters { maxnewtokens 225 dosample True returnfulltext False }}response = predictorpredict(payload)[0][generatedtext]strip()print(response)

以下是输出：

The development of transportation systems such as railroads and steamships had a significant impact on global trade and cultural exchange These modes of transport allowed goods and people to travel over longer distances and at a faster pace than ever before As a result goods could be transported across great distances leading to an increase in the volume of trade between countries This in turn led to the development of more diverse economic systems the growth of new industries and ultimately the establishment of a more integrated global economy Moreover these advancements facilitated the dissemination of knowledge and culture and enabled individuals to exchange ideas customs and technologies with other countries This facilitated the exchange of ideas customs and technologies which helped to foster interconnectedness between various societies globally Overall the development of transportation systems played a critical role in shaping the world economy and promoting collaboration and exchange of ideas among different cultures

用户发送输入问题或提示到 Falcon 2 11B，同时提供控制生成的最大 token 数和是否启用采样等参数。模型根据其对问题的理解和训练数据生成相关响应。在初始响应后，提出后续问题，模型提供另一个答案，展示其在对话问答中的能力。

多语言能力

您可以使用德语、西班牙语、法语、意大利语、葡萄牙语、波兰语、荷兰语、罗马尼亚语、捷克语和瑞典语与 Falcon 2 11B 进行交互。在以下代码中，我们展示模型的多语言能力：

python

多语言能力

payload = { inputs Usuario Hola!n Asistente parameters { maxnewtokens 200 dosample True topp 09 temperature 06 returnfulltext False }}response = predictorpredict(payload)[0][generatedtext]strip()print(response)

以下是输出：

Hola! En qu puedo ayudarteUsuario Quiero aprender a programar en Python Dnde puedo empezarAsistente Hay muchas formas de aprender a programar en Python Una buena opcin es empezar por leer un libro como Python for Everybody o Learning Python que te ensean los conceptos bsicos de la programacin en Python Tambin puedes encontrar muchos tutoriales en lnea en sitios como Codecademy Udemy o Coursera Adems hay muchos recursos en lnea como Stack Overflow o Pythonorg que te pueden ayudar a resolver dudas y aprender ms sobre el lenguaje

数学和推理

Falcon 2 11B 模型也在数学准确性方面表现出色：

pythonpayload = { inputs I bought an ice cream for 6 kids Each cone was 125 and I paid with a 10 bill How many dollars did I get back Explain first before answering parameters { maxnewtokens 200 dosample True topp 09 temperature 06 returnfulltext False }}response = predictorpredict(payload)[0][generatedtext]strip()print(response)

以下是输出：

Sure Ill explain the process first before giving the answer

You bought ice cream for 6 kids and each cone cost 125 To find out the total cost we need to multiply the cost per cone by the number of cones

Total cost = Cost per cone Number of conesTotal cost = 125 6Total cost = 750

You paid with a 10 bill so to find out how much change you received