Spicy

Spicy is a tag-based text parser.

Markup languages such as HTML and XML are built from tags. Parsing such text is useful when you need to find tags by name and attributes, or to extract separate parts of a document.

Running

git clone 'https://github.com/michael7nightingale/spicy.git'

Requires Python >= 3.11. No third-party Python libraries need to be installed; everything works out of the box.

"""your parser file"""
from spicy import Spicy
from urllib.request import Request, urlopen

request = Request(url='https://example.com/')
with urlopen(request) as response:
    html_text = response.read().decode(encoding='utf-8')
    
spicy = Spicy(
    text=html_text, 
    doctype='html'    # this is already the default
)

print(spicy.tag)    # html
print(spicy.children)   # ['<Tag: head>', '<Tag: body>']
print(spicy)    # all the document in string type

head, body = spicy.children
# build a list so the attributes are printed, not a generator object
print([el.attributes for el in body.children])

Spicy tags and documents provide rich search methods:

findAll() - returns a list of tag objects matching the given parameters;
findIter() - generator version of findAll(), which can reduce memory usage;
findFirst() - returns the first tag object matching the given parameters;
findLast() - returns the last tag object matching the given parameters;
getElementById() - returns the tag object with the given id.
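
The relationship between these methods can be illustrated with a small self-contained sketch. It does not use Spicy and is not its implementation: the Node class, the snake_case method names, and the keyword-argument matching are assumptions made only for this example. Check the Spicy source for the real signatures.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a parsed tag; not Spicy's actual class."""
    tag: str
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def find_iter(self, tag: Optional[str] = None, **attrs) -> Iterator["Node"]:
        """Lazily yield matching descendants in document order."""
        for child in self.children:
            name_ok = tag is None or child.tag == tag
            attrs_ok = all(child.attributes.get(k) == v for k, v in attrs.items())
            if name_ok and attrs_ok:
                yield child
            # descend into the subtree before moving to the next sibling
            yield from child.find_iter(tag, **attrs)

    def find_all(self, tag: Optional[str] = None, **attrs) -> list:
        """Eager version: materialise every match into a list."""
        return list(self.find_iter(tag, **attrs))

    def find_first(self, tag: Optional[str] = None, **attrs):
        """Stop at the first match; None if nothing matches."""
        return next(self.find_iter(tag, **attrs), None)

    def find_last(self, tag: Optional[str] = None, **attrs):
        """Return the last match in document order; None if nothing matches."""
        matches = self.find_all(tag, **attrs)
        return matches[-1] if matches else None

    def get_element_by_id(self, id_: str):
        """Look up a descendant by its id attribute."""
        return self.find_first(id=id_)


# A tiny hand-built tree standing in for a parsed document.
body = Node("body", children=[
    Node("div", {"id": "menu"}, [
        Node("a", {"href": "/home"}),
        Node("a", {"href": "/admin/user"}),
    ]),
    Node("a", {"href": "/about"}),
])

links = body.find_all("a")
print([n.attributes["href"] for n in links])  # ['/home', '/admin/user', '/about']
print(body.find_first("a").attributes["href"])  # /home
print(body.find_last("a").attributes["href"])  # /about
print(body.get_element_by_id("menu").tag)  # div
```

In this sketch the generator does the traversal and the other methods are thin wrappers around it, which is why the iterator variant can avoid building the full result list when only a few matches are needed.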

Useful properties:

tag - the tag name (link, div, html, img, etc.);
className - the value of the class attribute, if present;
id - the value of the id attribute, if present;
attributes - all attributes of the tag, for example: align=center, href=/admin/user;
parent - the parent tag node in the DOM;
children - the list of child tag nodes.
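
To show what these properties describe, here is a rough sketch that builds a comparable tree with the standard library's html.parser. It is not how Spicy works internally. The Element and TreeBuilder classes are hypothetical and exist only for this illustration; the builder assumes well-formed input and treats a small fixed set of tags as void.

```python
from html.parser import HTMLParser


class Element:
    """Hypothetical node type used only for this illustration."""

    def __init__(self, tag, attributes=None, parent=None):
        self.tag = tag
        self.attributes = dict(attributes or {})
        self.parent = parent
        self.children = []

    @property
    def id(self):
        # value of the id attribute, or None when absent
        return self.attributes.get("id")

    @property
    def className(self):
        # value of the class attribute, or None when absent
        return self.attributes.get("class")


class TreeBuilder(HTMLParser):
    """Builds a parent/children tree from start and end tag events."""

    # tags that never have a closing tag (a partial list, for the sketch)
    VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}

    def __init__(self):
        super().__init__()
        self.root = Element("document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Element(tag, attrs, parent=self.current)
        self.current.children.append(node)
        if tag not in self.VOID_TAGS:
            self.current = node  # following tags become children of this node

    def handle_endtag(self, tag):
        # assumes well-formed input: each end tag closes the current node
        if tag not in self.VOID_TAGS and self.current.parent is not None:
            self.current = self.current.parent


builder = TreeBuilder()
builder.feed(
    '<html><head></head><body>'
    '<div id="main" class="box" align="center"><img src="a.png"></div>'
    '</body></html>'
)

html = builder.root.children[0]
head, body = html.children
div = body.children[0]

print(html.tag)  # html
print([child.tag for child in html.children])  # ['head', 'body']
print(div.id, div.className)  # main box
print(div.attributes)  # {'id': 'main', 'class': 'box', 'align': 'center'}
print(div.parent.tag)  # body
```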
