v1.23.0
⭐️ Highlights
🪨 Amazon Bedrock support for PromptNode
(#6226)
Haystack now supports Amazon Bedrock models, including all existing and previously announced
models, like Llama-2-70b-chat. To use these models, simply pass the model ID in the
model_name_or_path parameter, like you do for any other model. For details, see
Amazon Bedrock Documentation.
For example, the following code loads the Llama 2 Chat 13B model:
from haystack.nodes import PromptNode
prompt_node = PromptNode(model_name_or_path="meta.llama2-13b-chat-v1")
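Once constructed, the node is invoked like any other PromptNode. A minimal continuation of the snippet above, assuming your AWS credentials are resolvable by boto3 (environment variables, a shared credentials file, or an IAM role):
# Continues from the snippet above; AWS credentials are assumed to be picked up by boto3.
result = prompt_node("Explain in one sentence what Amazon Bedrock is.")
print(result[0])  # PromptNode calls return a list of generated strings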
🗺️ Support for MongoDB Atlas Document Store (#6471)
With this release, we introduce support for MongoDB Atlas as a Document Store. Try it with:
from haystack.document_stores.mongodb_atlas import MongoDBAtlasDocumentStore
document_store = MongoDBAtlasDocumentStore(
    mongo_connection_string="mongodb+srv://USER:PASSWORD@HOST/?retryWrites=true&w=majority",
    database_name="database",
    collection_name="collection",
)
...
document_store.write_documents(...)
Note that you need MongoDB Atlas credentials to fill the connection string. You can get such credentials by registering here: https://www.mongodb.com/cloud/atlas/register
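Continuing the snippet above, a minimal sketch of writing documents and checking the count; the document contents and metadata below are illustrative:
from haystack import Document

# Continues from the snippet above.
document_store.write_documents([
    Document(content="MongoDB Atlas is a managed MongoDB service.", meta={"source": "example"}),
    Document(content="Haystack Documents hold text plus metadata for retrieval.", meta={"source": "example"}),
])
print(document_store.get_document_count())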
⬆️ Upgrade Notes
- Remove deprecated OpenAIAnswerGenerator, BaseGenerator, GenerativeQAPipeline, and related tests. Generative QA pipelines should use PromptNode instead; see https://haystack.deepset.ai/tutorials/22_pipeline_with_promptnode.
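A minimal migration sketch in the spirit of the linked tutorial, assuming an in-memory BM25 setup and a small open model; the document store contents, model choice, and prompt wording are illustrative, not the only supported configuration:
from haystack import Document, Pipeline
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import AnswerParser, BM25Retriever, PromptNode, PromptTemplate

# Illustrative setup; swap in your own document store, retriever, and model.
document_store = InMemoryDocumentStore(use_bm25=True)
document_store.write_documents([Document(content="Haystack pipelines combine retrievers and LLMs.")])
retriever = BM25Retriever(document_store=document_store, top_k=3)

qa_template = PromptTemplate(
    prompt="Answer the question using the given context.\n"
    "Context: {join(documents)}\nQuestion: {query}\nAnswer:",
    output_parser=AnswerParser(),
)
prompt_node = PromptNode(model_name_or_path="google/flan-t5-base", default_prompt_template=qa_template)

pipeline = Pipeline()
pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
pipeline.add_node(component=prompt_node, name="PromptNode", inputs=["Retriever"])

result = pipeline.run(query="What do Haystack pipelines combine?")
print(result["answers"][0].answer)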
🚀 New Features
- Add PptxConverter: a node to convert pptx files to Haystack Documents (see the sketch after this list).
- Add split_length by token in PreProcessor (also shown in the sketch below).
- Support for dense embedding instructions used in retrieval models such as BGE and LLM-Embedder.
- You can use Amazon Bedrock models in Haystack.
- Add MongoDBAtlasDocumentStore, providing support for MongoDB Atlas as a document store.
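A combined sketch for the converter and token-based splitting above; the file path and split settings are illustrative:
from haystack.nodes import PptxConverter, PreProcessor

# Convert a presentation into Haystack Documents (the path is illustrative).
converter = PptxConverter()
docs = converter.convert(file_path="slides/quarterly_update.pptx", meta=None)

# Split the converted Documents into chunks of roughly 200 tokens each.
preprocessor = PreProcessor(
    split_by="token",
    split_length=200,
    split_respect_sentence_boundary=False,  # disabled here to keep the token-based split simple
)
split_docs = preprocessor.process(docs)
print(len(split_docs))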
⚡️ Enhancement Notes
- Change PromptModel constructor parameter invocation_layer_class to accept a str too. If a str is used, the invocation layer class is imported and used. This should ease serialisation to YAML when using invocation_layer_class with PromptModel (see the sketch after this list).
- Users can now define the number of pods and pod type directly when creating a PineconeDocumentStore instance.
- Add batch_size to the init method of FAISS Document Store. This works as the default value for all methods of FAISS Document Store that support batch_size.
- Introduce a new timeout keyword argument in PromptNode, fixing issue #5380 and giving more control over individual calls to OpenAI (also shown in the sketch after this list).
- Upgrade Transformers to the latest version, 4.35.2. This version adds support for DistilWhisper, Fuyu, Kosmos-2, SeamlessM4T, and Owl-v2.
- Upgrade openai-whisper to version 20231106 and simplify installation through the re-introduced audio extra. The latest openai-whisper version unpins its tiktoken dependency, which resolves a version conflict with Haystack's dependencies.
- Make it possible to load additional fields from the SQuAD format file into the meta field of the Labels.
- Add a new model_kwargs parameter to the ExtractiveReader so that different loading options supported by Hugging Face can be passed.
- Add a new token limit for the gpt-4-1106-preview model.
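A sketch covering the string-based invocation_layer_class and the new PromptNode timeout; the dotted class path, model names, placeholder API key, and timeout value are illustrative assumptions:
from haystack.nodes import PromptModel, PromptNode

# invocation_layer_class as a dotted import path string, which is easier to express in YAML.
prompt_model = PromptModel(
    model_name_or_path="google/flan-t5-base",
    invocation_layer_class="haystack.nodes.prompt.invocation_layer.HFLocalInvocationLayer",
)

# timeout (in seconds) bounds each individual call the node makes to OpenAI.
openai_node = PromptNode(
    model_name_or_path="gpt-3.5-turbo",
    api_key="YOUR_OPENAI_API_KEY",  # placeholder
    timeout=30,
)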
🐛 Bug Fixes
- Fix Pipeline.load_from_deepset_cloud to work with the latest version of deepset Cloud.
- When using JoinDocuments with join_mode=concatenate (the default) and passing duplicate documents, including some with a null score, the node raised an exception. This case is now handled correctly and the documents are joined as expected.
- Add LostInTheMiddleRanker, DiversityRanker, and RecentnessRanker to haystack/nodes/__init__.py to ensure they are included in JSON schema generation.