Describe the workflow you want to enable
I've developed an addition to mlxtend that makes the process of encoding transactions, mining rules, and filtering the extracted rules seamless using scikit-learn pipelines. As input you provide a clean pandas DataFrame; it passes through each step and produces rules filtered by the specified values, as in the sketch below.
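A minimal sketch of the intended flow, step by step. The class names and parameters are from this proposal and are illustrative only; none of them exist in mlxtend today:

```python
# Proposed flow (illustrative class names, sketched further below):
onehot = TransactionEncoder().fit_transform(df)               # clean DataFrame in
rules = RuleExtractor(min_support=0.1).fit_transform(onehot)  # mined rules out
filtered = MetricFilter(metric="lift", min_value=1.2).fit_transform(rules)
```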
Describe your proposed solution
I've developed a new TransactionEncoder class, which first discretizes numerical variables and then encodes them using their intervals, while categorical variables are encoded using their discrete values.
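A minimal sketch of what such an encoder could look like. The equal-width binning via `pd.cut` and the one-hot encoding via `pd.get_dummies` are assumptions about the implementation, and note that this name would shadow mlxtend's existing `TransactionEncoder`:

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin

class TransactionEncoder(BaseEstimator, TransformerMixin):
    """Sketch: discretize numeric columns, then one-hot encode everything."""

    def __init__(self, n_bins=5):
        self.n_bins = n_bins

    def fit(self, X, y=None):
        # Learn equal-width bin edges for every numeric column.
        self.bins_ = {
            col: pd.cut(X[col], bins=self.n_bins, retbins=True)[1]
            for col in X.select_dtypes(include=np.number).columns
        }
        return self

    def transform(self, X):
        X = X.copy()
        # Replace numeric values with the learned interval they fall into.
        for col, edges in self.bins_.items():
            X[col] = pd.cut(X[col], bins=edges, include_lowest=True)
        # One indicator column per (column, value) or (column, interval) pair.
        return pd.get_dummies(X.astype("object"), prefix_sep="=")
```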
Another class, RuleExtractor, encapsulates frequent-itemset and rule extraction. It takes a one-hot encoded DataFrame and produces rules for the desired support and metric values.
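A sketch of how this could wrap mlxtend's existing `apriori` and `association_rules` functions; the wrapper class itself is hypothetical, only the two mlxtend functions are real:

```python
from mlxtend.frequent_patterns import apriori, association_rules
from sklearn.base import BaseEstimator, TransformerMixin

class RuleExtractor(BaseEstimator, TransformerMixin):
    """Sketch: mine frequent itemsets, then derive association rules."""

    def __init__(self, min_support=0.1, metric="confidence", min_threshold=0.5):
        self.min_support = min_support
        self.metric = metric
        self.min_threshold = min_threshold

    def fit(self, X, y=None):
        return self  # stateless: all work happens in transform

    def transform(self, X):
        itemsets = apriori(X, min_support=self.min_support, use_colnames=True)
        return association_rules(
            itemsets, metric=self.metric, min_threshold=self.min_threshold
        )
```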
As a last step, there are two classes: one for filtering extracted rules by items in the consequent or antecedent, and another for filtering rules based on their metric values.
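Illustrative sketches of the two filters, assuming they operate on the DataFrame returned by `association_rules` (where `antecedents` and `consequents` hold frozensets); the names `ItemFilter` and `MetricFilter` are placeholders:

```python
from sklearn.base import BaseEstimator, TransformerMixin

class ItemFilter(BaseEstimator, TransformerMixin):
    """Sketch: keep rules whose antecedents or consequents contain given items."""

    def __init__(self, items, side="consequents"):
        self.items = items  # e.g. {"color=red"}
        self.side = side    # "antecedents" or "consequents"

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        # Keep rows whose item set intersects the requested items.
        mask = X[self.side].apply(lambda s: bool(set(self.items) & set(s)))
        return X[mask]

class MetricFilter(BaseEstimator, TransformerMixin):
    """Sketch: keep rules whose metric value meets a threshold."""

    def __init__(self, metric="lift", min_value=1.0):
        self.metric = metric
        self.min_value = min_value

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X[X[self.metric] >= self.min_value]
```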
As an independent module, there is a class used for negative transaction generation. It takes a one-hot encoded transaction DataFrame and, for each specified column, generates a new column for the negated variable.
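A sketch of that generator, assuming the negated columns are named with a `NOT_` prefix (the class name and prefix are illustrative):

```python
from sklearn.base import BaseEstimator, TransformerMixin

class NegationGenerator(BaseEstimator, TransformerMixin):
    """Sketch: add a negated indicator column for each selected item column."""

    def __init__(self, columns=None):
        self.columns = columns  # None means negate every column

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        X = X.copy()
        cols = self.columns if self.columns is not None else list(X.columns)
        for col in cols:
            # New column is True exactly where the original item is absent.
            X[f"NOT_{col}"] = ~X[col].astype(bool)
        return X
```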
Usually, association rules describe relations between items, A -> B (the presence of A is associated with the presence of B). By including negated items, we can study groups of rules that involve them. For example, rules where both A -> B and ¬A -> ¬B have high confidence would be strong rules, since the presence of the items is associated and so is their absence.
All mentioned classes conform to the scikit-learn fit/transform standard, so they can be seamlessly integrated into sklearn pipelines with other algorithms.
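For example, the whole workflow could then be composed as a single pipeline, again using the illustrative class names from the sketches above:

```python
from sklearn.pipeline import Pipeline

pipe = Pipeline([
    ("encode", TransactionEncoder(n_bins=4)),
    ("negate", NegationGenerator()),
    ("mine", RuleExtractor(min_support=0.05, metric="confidence", min_threshold=0.7)),
    ("filter", MetricFilter(metric="lift", min_value=1.2)),
])
rules = pipe.fit_transform(df)  # clean DataFrame in, filtered rules out
```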
Describe alternatives you've considered, if relevant
Additional context