This repository has been archived by the owner on Jan 3, 2024. It is now read-only.

Fix sub-definitions not being listed #68

Closed
wants to merge 4 commits into from
12 changes: 10 additions & 2 deletions tests/testOutput.json
@@ -117,7 +117,11 @@
     "partOfSpeech": "noun",
     "text": [
         "grapple (countable and uncountable, plural grapples)",
-        "A tool with claws or hooks which is used to catch or hold something.",
+        [
+            "A tool with claws or hooks which is used to catch or hold something.",
+            "(nautical) A device consisting of iron claws, attached to the end of a rope, used for grasping and holding an enemy ship prior to boarding; a grappling iron.",
+            "(nautical) A grapnel (“type of anchor”)."
+        ],
         "A close hand-to-hand struggle.",
         "(uncountable) The act of grappling."
     ],
@@ -156,7 +160,11 @@
         "house (countable and uncountable, plural houses or (dialectal) housen or (chiefly humorous) hice)",
         "A structure built or serving as an abode of human beings. [from 9thc.]",
         "The people who live in a house; a household. [from 9thc.]",
-        "A building used for something other than a residence (typically with qualifying word). [from 10thc.]",
+        [
+            "A building used for something other than a residence (typically with qualifying word). [from 10thc.]",
+            "A place of business; a company or organisation, especially a printing press, a publishing company, or a couturier. [from 10thc.]",
+            "A place of public accommodation or entertainment, especially a public house, an inn, a restaurant, a theatre, or a casino; or the management thereof.[from 10thc.]"
+        ],
         "The audience for a live theatrical or similar performance. [from 10thc.]",
         "(politics) A building where a deliberative assembly meets; whence the assembly itself, particularly a component of a legislature. [from 10thc.]",
         "A dynasty; a family with its ancestors and descendants, especially a royal or noble one. [from 10thc.]",
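With this change, an entry in a `text` list is either a plain string or a list whose first item is the parent definition and whose remaining items are its sub-definitions. Callers that previously assumed a flat list of strings would need to handle both shapes. A minimal sketch of one way to do that follows; the helper name `iter_definitions` and the `(depth, definition)` output are assumptions for illustration and are not part of WiktionaryParser.

```python
from typing import Iterator, List, Tuple, Union

# An entry is either a definition string or [parent, sub, sub, ...].
Definition = Union[str, List[str]]


def iter_definitions(text: List[Definition]) -> Iterator[Tuple[int, str]]:
    """Yield (depth, definition) pairs; depth 1 marks a sub-definition."""
    for entry in text:
        if isinstance(entry, list):
            # Assumes a non-empty list: the parent comes first, then its subs.
            head, *subs = entry
            yield 0, head
            for sub in subs:
                yield 1, sub
        else:
            yield 0, entry


# Abridged from the "grapple" expected output in tests/testOutput.json.
sample_text = [
    "grapple (countable and uncountable, plural grapples)",
    [
        "A tool with claws or hooks which is used to catch or hold something.",
        "(nautical) A grapnel (“type of anchor”).",
    ],
    "A close hand-to-hand struggle.",
]

flattened = list(iter_definitions(sample_text))
```

Keeping the depth alongside each string lets a caller indent or number sub-definitions without re-inspecting the nested structure.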
12 changes: 10 additions & 2 deletions wiktionaryparser/core.py
@@ -176,7 +176,15 @@ def parse_definitions(self, word_contents):
                 if definition_tag.name in ['ol', 'ul']:
                     for element in definition_tag.find_all('li', recursive=False):
                         if element.text:
-                            definition_text.append(element.text.strip())
+                            sub_definitions = element.find_all("li")
+                            if sub_definitions:
+                                element.find("ol").extract()
+                                top_definition = element.text.strip()
+                                sub_definitions_list = [sub_definition.text.strip() for sub_definition in sub_definitions]
+                                sub_definitions_list.insert(0, top_definition)
+                                definition_text.append(sub_definitions_list)
+                            else:
+                                definition_text.append(element.text.strip())
             if def_type == 'definitions':
                 def_type = ''
             definition_list.append((def_index, definition_text, def_type))
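Two details of the added branch may be worth checking. `element.find_all("li")` searches recursively by default, so items nested more than one level deep would also be collected, and `element.find("ol")` returns `None` when the nested list is a `ul`, in which case `.extract()` would raise. The sketch below illustrates the same idea (separate an item's own text from its direct sub-items) in a way that sidesteps both cases. It deliberately uses the standard library's `xml.etree.ElementTree` on a well-formed snippet instead of BeautifulSoup so that it runs without extra dependencies; the names `own_text` and `parse_definition_items` are hypothetical, and treating nested `ul` items as sub-definitions is an assumption.

```python
import xml.etree.ElementTree as ET

LIST_TAGS = ("ol", "ul")


def own_text(item):
    """Return the text of an <li>, skipping any nested <ol>/<ul> inside it."""
    parts = [item.text or ""]
    for child in item:
        if child.tag not in LIST_TAGS:
            # Inline markup such as <i> or <a> contributes its text.
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")
    return "".join(parts).strip()


def parse_definition_items(list_element):
    """Map each direct <li> to a string, or to [top, sub, ...] if it has sub-items."""
    definitions = []
    for item in list_element.findall("li"):  # direct children only
        top = own_text(item)
        subs = [
            own_text(sub)
            for nested in item
            if nested.tag in LIST_TAGS
            for sub in nested.findall("li")
        ]
        if subs:
            definitions.append([top] + subs)
        elif top:
            definitions.append(top)
    return definitions


# Hand-written snippet for illustration; not real Wiktionary markup.
snippet = ET.fromstring(
    "<ol>"
    "<li>A tool with claws or hooks."
    "<ol><li>(nautical) A grappling iron.</li><li>(nautical) A grapnel.</li></ol>"
    "</li>"
    "<li>A close <i>hand-to-hand</i> struggle.</li>"
    "</ol>"
)
parsed = parse_definition_items(snippet)
```

Here `parsed` has the same mixed shape as the updated `tests/testOutput.json`: a nested list for the first item and a plain string for the second.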
@@ -198,7 +206,7 @@ def parse_examples(self, word_contents):
                         examples.append(example_text)
                     element.clear()
                 example_list.append((def_index, examples, def_type))
-                for quot_list in table.find_all(['ul', 'ol']):
+                for quot_list in table.find_all("ul", recursive=True):
                     quot_list.clear()
                 table = table.find_next_sibling()
         return example_list