Feature : added TikaPdfParser #778

Open
wants to merge 1 commit into
base: community/tikapdfparser

Conversation

palash018


Link to Issue

Feature Name

TikaPdfParser.py

Feature Description

TikaPdfParser extracts text from PDF files.
Solves #502

Description

Files Added:

  • pkgs/community/swarmauri_community/parsers/concrete/TikaPdfParser.py
    Implements the TikaPdfParser class, which reads the text content of PDF documents.

  • Test file: pending
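
For reviewers, here is a minimal sketch of the kind of text extraction described above. It is not the code in this PR. It assumes the tika Python package, whose `parser.from_file` returns a dict with `content` and `metadata` keys, and it needs a Java runtime for the Tika server. The names `extract_pdf_text`, `parse_file`, and `fake_tika` are hypothetical and chosen only for illustration.

```python
from typing import Any, Callable, Dict, Optional


def extract_pdf_text(
    path: str,
    parse_file: Optional[Callable[[str], Dict[str, Any]]] = None,
) -> str:
    """Return the plain text of a PDF, or "" if nothing could be extracted."""
    if parse_file is None:
        # Imported lazily so the function can be tested without tika or Java.
        from tika import parser

        parse_file = parser.from_file

    parsed = parse_file(path)
    # Tika reports "content" as None when a document has no extractable text.
    content = parsed.get("content") or ""
    return content.strip()


def fake_tika(path: str) -> Dict[str, Any]:
    """Hypothetical stand-in for tika's parser.from_file, used only in tests."""
    return {"content": "\n\nHello PDF\n", "metadata": {"resourceName": path}}
```

Passing the parse function as an argument is one possible way to write the pending test file without a running Tika server: a test can call `extract_pdf_text("sample.pdf", parse_file=fake_tika)` and check the returned string.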


I kindly ask the maintainers to review my code and point out any mistakes. Thank you!

@palash018 palash018 requested a review from cobycloud as a code owner November 10, 2024 09:59
@cobycloud cobycloud changed the base branch from master to community/tikapdfparser November 11, 2024 09:52
@cobycloud
Contributor

This is looking good. Once we have the test file, we will be able to start pushing the component forward. Great work.

@cobycloud
Contributor

pending test file

@palash018
Author

pending test file

Please check Discord. I have some import issues that are preventing me from writing the tests.
