Update README.md
-fixed typos
edemirci-aai authored Sep 26, 2024
1 parent 6226d24 commit de51af6
Showing 1 changed file with 6 additions and 6 deletions.
12 changes: 6 additions & 6 deletions README.md
@@ -51,7 +51,7 @@ This document outlines the core functions used in the `VirtualHavruta` class. Th
| `retrieve_docs(self, query: str, msg_id: str = '', filter_mode: str = 'primary')` | Retrieves documents matching a query, filtered as primary or secondary sources. | - `query: str`: Query string<br>- `msg_id: str = ''`: Message ID for logging<br>- `filter_mode: str = 'primary'`: 'primary' or 'secondary' | List of documents |
| `retrieve_docs_metadata_filtering(self, query: str, msg_id: str = '', metadata_filter: = None)` | Retrieves documents matching a query, filtered based on metadata. | - `query: str`: Query string<br>- `msg_id: str = ''`: Message ID for logging<br>- `list: A list of documents that meet the criteria of the specified metadata filter. | List of documents |
| `retrieve_nodes_matching_linker_results(self, linker_results: list[dict], msg_id: str = '', filter_mode: str = 'primary', url_prefix: str = "https://www.sefaria.org/")` | Retrieves nodes corresponding to linker results from the graph database. | - `linker_results: list[dict]`: Results from the linker API<br>- `msg_id: str = ''`: Message ID for logging<br>- `filter_mode: str = 'primary'`: 'primary' or 'secondary'<br>- `url_prefix: str`: URL prefix | List of `Document` objects |
-| `get_retrieval_results_knowledge_graph(self, url: str, direction: str, order: int, score_central_node: float, filter_mode_nodes: str | None = None, msg_id: str = '')` | Retrieves neighbor nodes of a given URL from the knowledge graph. | - `url: str`: Central node URL<br>- `direction: str`: Edge direction ('incoming', 'outgoing', 'both_ways')<br>- `order: int`: Number of hops<br>- `score_central_node: float`: Central node score<br>- `filter_mode_nodes: str | None = None`: Node filter mode<br>- `msg_id: str = ''`: Message ID for logging | List of tuples `(Document, score)` |
+| `get_retrieval_results_knowledge_graph(self, url: str, direction: str, order: int, score_central_node: float, filter_mode_nodes: str, msg_id: str = '')` | Retrieves neighbor nodes of a given URL from the knowledge graph. | - `url: str`: Central node URL<br>- `direction: str`: Edge direction ('incoming', 'outgoing', 'both_ways')<br>- `order: int`: Number of hops<br>- `score_central_node: float`: Central node score<br>- `filter_mode_nodes: str= None`: Node filter mode<br>- `msg_id: str = ''`: Message ID for logging | List of tuples `(Document, score)` |
| `query_graph_db_by_url(self, urls: list[str])` | Queries the graph database for nodes with given URLs. | - `urls: list[str]`: List of URLs | List of `Document` objects |
| `query_sefaria_linker(self, text_title="", text_body="", with_text=1, debug=0, max_segments=0, msg_id: str = '')` | Queries the Sefaria Linker API and returns the JSON response. | - `text_title: str = ""`: Text title<br>- `text_body: str = ""`: Text body<br>- `with_text: int = 1`: Include text in response<br>- `debug: int = 0`: Debug flag<br>- `max_segments: int = 0`: Max segments<br>- `msg_id: str = ''`: Message ID for logging | JSON response (dict or str) |
| `retrieve_docs_linker(self, screen_res: str, enriched_query: str, msg_id: str = '', filter_mode: str = 'primary')` | Retrieves documents from the Sefaria Linker API based on a query. | - `screen_res: str`: Screen result query<br>- `enriched_query: str`: Enriched query<br>- `msg_id: str = ''`: Message ID for logging<br>- `filter_mode: str = 'primary'`: 'primary' or 'secondary' | List of document dictionaries |
@@ -64,7 +64,7 @@ This document outlines the core functions used in the `VirtualHavruta` class. Th
| Function Name | Purpose | Input Parameters | Output |
|---------------|---------|------------------|--------|
| `select_reference(self, query: str, retrieval_res, msg_id: str = '')` | Selects useful references from retrieval results using a language model. | - `query: str`: Query string<br>- `retrieval_res`: Retrieved documents<br>- `msg_id: str = ''`: Message ID for logging | Tuple `(selected_retrieval_res: list, tokens_used: int)` |
-| `sort_reference(self, scripture_query: str, enriched_query: str, retrieval_res, filter_mode: str | None = 'primary', msg_id: str = '')` | Sorts retrieval results based on relevance to the query. | - `scripture_query: str`: Scripture query<br>- `enriched_query: str`: Enriched query<br>- `retrieval_res`: Retrieval results<br>- `filter_mode: str | None = 'primary'`: Filter mode<br>- `msg_id: str = ''`: Message ID for logging | Tuple `(sorted_src_rel_dict: dict, src_data_dict: dict, src_ref_dict: dict, total_tokens: int)` |
+| `sort_reference(self, scripture_query: str, enriched_query: str, retrieval_res, filter_mode: str = 'primary', msg_id: str = '')` | Sorts retrieval results based on relevance to the query. | - `scripture_query: str`: Scripture query<br>- `enriched_query: str`: Enriched query<br>- `retrieval_res`: Retrieval results<br>- `filter_mode: str = 'primary'`: Filter mode<br>- `msg_id: str = ''`: Message ID for logging | Tuple `(sorted_src_rel_dict: dict, src_data_dict: dict, src_ref_dict: dict, total_tokens: int)` |
| `merge_references_by_url(self, retrieval_res: list[tuple[Document, float]], msg_id: str = '')` | Merges chunks with the same URL to consolidate content and sources. | - `retrieval_res: list[tuple[Document, float]]`: Documents and scores<br>- `msg_id: str = ''`: Message ID for logging | Tuple `(sorted_src_rel_dict: dict, src_data_dict: dict, src_ref_dict: dict)` |
| `merge_linker_refs(self, retrieved_docs: list, p_sorted_src_rel_dict: dict, p_src_data_dict: dict, p_src_ref_dict: dict, msg_id: str = '')` | Merges new linker references into existing reference dictionaries. | - `retrieved_docs: list`: New documents<br>- `p_sorted_src_rel_dict: dict`: Existing relevance dict<br>- `p_src_data_dict: dict`: Existing data dict<br>- `p_src_ref_dict: dict`: Existing ref dict<br>- `msg_id: str = ''`: Message ID for logging | Tuple of updated dictionaries |

@@ -75,7 +75,7 @@ This document outlines the core functions used in the `VirtualHavruta` class. Th
| Function Name | Purpose | Input Parameters | Output |
|---------------|---------|------------------|--------|
| `score_document_by_graph_distance(self, n_hops: int, start_score: float, score_decrease_per_hop: float) -> float` | Scores a document based on its distance from the central node in the graph. | - `n_hops: int`: Number of hops<br>- `start_score: float`: Starting score<br>- `score_decrease_per_hop: float`: Score decrease per hop | `float` score |
-| `rank_documents(self, chunks: list[Document], enriched_query: str, scripture_query: str | None = None, semantic_similarity_scores: list[float] | None = None, filter_mode: str | None = None, msg_id: str = '')` | Ranks documents based on relevance to the query. | - `chunks: list[Document]`: Documents to rank<br>- `enriched_query: str`: Enriched query<br>- `scripture_query: str | None = None`: Scripture query<br>- `semantic_similarity_scores: list[float] | None = None`: Precomputed scores<br>- `filter_mode: str | None = None`: Filter mode<br>- `msg_id: str = ''`: Message ID for logging | Tuple `(sorted_chunks: list[Document], ranking_scores: list[float], total_token_count: int)` |
+| `rank_documents(self, chunks: list[Document], enriched_query: str, scripture_query: str = None, semantic_similarity_scores: list[float]= None, filter_mode: str = None, msg_id: str = '')` | Ranks documents based on relevance to the query. | - `chunks: list[Document]`: Documents to rank<br>- `enriched_query: str`: Enriched query<br>- `scripture_query: str = None`: Scripture query<br>- `semantic_similarity_scores: list[float] = None`: Precomputed scores<br>- `filter_mode: str = None`: Filter mode<br>- `msg_id: str = ''`: Message ID for logging | Tuple `(sorted_chunks: list[Document], ranking_scores: list[float], total_token_count: int)` |
| `compute_semantic_similarity_documents_query(self, documents: list[Document], query: str, msg_id: str = '')` | Computes semantic similarity between documents and a query. | - `documents: list[Document]`: Documents<br>- `query: str`: Query string<br>- `msg_id: str = ''`: Message ID for logging | `np.array` of similarity scores |
| `get_reference_class(self, documents: list[Document], scripture_query: str, enriched_query: str, msg_id: str = '')` | Determines the reference class for each document based on the query. | - `documents: list[Document]`: Documents<br>- `scripture_query: str`: Scripture query<br>- `enriched_query: str`: Enriched query<br>- `msg_id: str = ''`: Message ID for logging | Tuple `(reference_classes: np.array, total_token_count: int)` |
| `get_page_rank_scores(self, documents: list[Document], msg_id: str = '')` | Retrieves PageRank scores for documents. | - `documents: list[Document]`: Documents<br>- `msg_id: str = ''`: Message ID for logging | `np.array` of PageRank scores |
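
The `score_document_by_graph_distance` row above lists only a signature and a one-line purpose. As a minimal sketch of what such a scorer might look like, the standalone function below assumes a linear decay per hop, floored at zero. Both the decay rule and the floor are assumptions for illustration; the README does not state the actual formula, and `score_by_graph_distance` is a hypothetical name, not the class method itself.

```python
def score_by_graph_distance(
    n_hops: int, start_score: float, score_decrease_per_hop: float
) -> float:
    """Hypothetical scorer: linear decay per hop, never below zero.

    The real VirtualHavruta method may use a different rule; this only
    illustrates how the three documented parameters could interact.
    """
    if n_hops < 0:
        raise ValueError("n_hops must be non-negative")
    # Subtract a fixed penalty for every hop away from the central node.
    decayed = start_score - n_hops * score_decrease_per_hop
    # Assumed floor: distant nodes get a score of zero rather than a negative one.
    return max(0.0, decayed)


if __name__ == "__main__":
    for hops in range(4):
        print(hops, score_by_graph_distance(hops, 1.0, 0.25))
```

Under these assumptions the central node (zero hops) keeps `start_score`, and each additional hop lowers the score by the same amount.
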
@@ -86,8 +86,8 @@ This document outlines the core functions used in the `VirtualHavruta` class. Th

| Function Name | Purpose | Input Parameters | Output |
|---------------|---------|------------------|--------|
-| `get_graph_neighbors_by_url(self, url: str, relationship: str, depth: int, filter_mode_nodes: str | None = None, msg_id: str = '')` | Retrieves neighbor nodes from the graph database based on a URL. | - `url: str`: Central node URL<br>- `relationship: str`: Edge relationship<br>- `depth: int`: Neighbor depth<br>- `filter_mode_nodes: str | None = None`: Node filter mode<br>- `msg_id: str = ''`: Message ID for logging | List of tuples `(Node, distance)` |
-| `get_chunks_corresponding_to_nodes(self, nodes: list[Document], batch_size: int = 20, max_nodes: int | None = None, unique_url: bool = True, msg_id: str = '')` | Retrieves chunks corresponding to given nodes. | - `nodes: list[Document]`: Nodes<br>- `batch_size: int = 20`: Batch size<br>- `max_nodes: int | None = None`: Max nodes<br>- `unique_url: bool = True`: Ensure unique URLs<br>- `msg_id: str = ''`: Message ID for logging | List of `Document` objects |
+| `get_graph_neighbors_by_url(self, url: str, relationship: str, depth: int, filter_mode_nodes: str = None, msg_id: str = '')` | Retrieves neighbor nodes from the graph database based on a URL. | - `url: str`: Central node URL<br>- `relationship: str`: Edge relationship<br>- `depth: int`: Neighbor depth<br>- `filter_mode_nodes: str = None`: Node filter mode<br>- `msg_id: str = ''`: Message ID for logging | List of tuples `(Node, distance)` |
+| `get_chunks_corresponding_to_nodes(self, nodes: list[Document], batch_size: int = 20, max_nodes: int = None, unique_url: bool = True, msg_id: str = '')` | Retrieves chunks corresponding to given nodes. | - `nodes: list[Document]`: Nodes<br>- `batch_size: int = 20`: Batch size<br>- `max_nodes: int = None`: Max nodes<br>- `unique_url: bool = True`: Ensure unique URLs<br>- `msg_id: str = ''`: Message ID for logging | List of `Document` objects |
| `get_node_corresponding_to_chunk(self, chunk: Document, msg_id: str = '')` | Retrieves the node corresponding to a given chunk. | - `chunk: Document`: Chunk document<br>- `msg_id: str = ''`: Message ID for logging | `Document` object representing the node |
| `is_primary_document(self, doc: Document) -> bool` | Checks if a document is a primary document. | - `doc: Document`: Document to check | `bool` |

@@ -114,7 +114,7 @@ This document outlines the core functions used in the `VirtualHavruta` class. Th

| Function Name | Purpose | Input Parameters | Output |
|---------------|---------|------------------|--------|
-| `graph_traversal_retriever(self, screen_res: str, scripture_query: str, enriched_query: str, filter_mode_nodes: str | None = None, linker_results: list[dict] | None = None, semantic_search_results: list[tuple[Document, float]] | None = None, msg_id: str = '')` | Retrieves related chunks by traversing the graph starting from seed chunks. | - `screen_res: str`: Screen result query<br>- `scripture_query: str`: Scripture query<br>- `enriched_query: str`: Enriched query<br>- `filter_mode_nodes: str | None = None`: Node filter mode<br>- `linker_results: list[dict] | None = None`: Linker results<br>- `semantic_search_results: list[tuple[Document, float]] | None = None`: Semantic search results<br>- `msg_id: str = ''`: Message ID for logging | Tuple `(retrieval_res_kg: list[tuple[Document, float]], total_token_count: int)` |
+| `graph_traversal_retriever(self, screen_res: str, scripture_query: str, enriched_query: str, filter_mode_nodes: str = None, linker_results: list[dict] = None, semantic_search_results: list[tuple[Document, float]] = None, msg_id: str = '')` | Retrieves related chunks by traversing the graph starting from seed chunks. | - `screen_res: str`: Screen result query<br>- `scripture_query: str`: Scripture query<br>- `enriched_query: str`: Enriched query<br>- `filter_mode_nodes: str = None`: Node filter mode<br>- `linker_results: list[dict]= None`: Linker results<br>- `semantic_search_results: list[tuple[Document, float]] = None`: Semantic search results<br>- `msg_id: str = ''`: Message ID for logging | Tuple `(retrieval_res_kg: list[tuple[Document, float]], total_token_count: int)` |
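
Several changed rows above differ only in how an optional parameter is annotated (`str | None = None` versus `str = None`). The sketch below compares the two spellings on a pair of hypothetical functions. `with_optional` and `without_optional` are illustrative names, not part of `VirtualHavruta`, and `Optional[str]` is used as the equivalent of `str | None` so the snippet also runs on Python versions before 3.10.

```python
from typing import Optional


def with_optional(filter_mode_nodes: Optional[str] = None) -> str:
    # Annotation states explicitly that None is an accepted value.
    return "all" if filter_mode_nodes is None else filter_mode_nodes


def without_optional(filter_mode_nodes: str = None) -> str:
    # Same default, but the annotation alone claims the value is always a str.
    return "all" if filter_mode_nodes is None else filter_mode_nodes


if __name__ == "__main__":
    # Python does not enforce annotations, so both calls behave identically.
    print(with_optional(), without_optional())
    # Only the recorded annotation differs.
    print(with_optional.__annotations__["filter_mode_nodes"])
    print(without_optional.__annotations__["filter_mode_nodes"])
```

At runtime the two spellings are interchangeable. The difference shows up only for readers and for static type checkers, which may report `str = None` as an implicit optional depending on their configuration.
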

## Configuration Guide for config.yaml
