ENH: Allow flexible spacing for numerical list-of-list (array-like) #758

asottile · 2021-04-03T06:43:20Z

In GitLab by @larsoner on Jul 8, 2020, 17:21

For numerical libraries there is a need for flexible spacing of list(-of-list) of numerical values. The related issues raised initially for PyCodestyle basically boiled down to (use flake8 and then):

Add # noqa: E221,... on the lines that need it, or
--ignore the problematic E2XX's, then write a plugin to restore the checks everywhere but in arrays (or so)

refs:

(1) is a pain because there are a lot of these cases for example in SciPy. I started working toward (2) with the following to be able to deal with E201/202/203/221/222/241 by recasting them as A2XX:

$ flake8 --ignore E201,E202,E203,E221,E222,E241 --select A --array-spacing flake8_array_spacing
flake8_array_spacing/__init__.py:1:8: A201 whitespace after '['
flake8_array_spacing/__init__.py:1:11: A221 multiple spaces before operator
flake8_array_spacing/__init__.py:1:17: A203 whitespace before ','
flake8_array_spacing/__init__.py:1:19: A241 multiple spaces after ','
flake8_array_spacing/__init__.py:1:26: A222 multiple spaces after operator
flake8_array_spacing/__init__.py:1:31: A202 whitespace before ']'

setup.py

import setuptools

requires = [
    "flake8 > 3.0.0",
]

setuptools.setup(
    name="flake8_array_spacing",
    license="MIT",
    version="0.1.0",
    description="our extension to flake8",
    author="Me",
    author_email="[email protected]",
    url="https://github.com/larsoner/flake8-array-spacing",
    packages=[
        "flake8_array_spacing",
    ],
    install_requires=requires,
    entry_points={
        'flake8.extension': [
            'A2 = flake8_array_spacing:ArraySpacing',
        ],
    },
    classifiers=[
        "Framework :: Flake8",
        "Environment :: Console",
        "Intended Audience :: Developers",
        "License :: OSI Approved :: MIT License",
        "Programming Language :: Python",
        "Programming Language :: Python :: 3",
        "Topic :: Software Development :: Libraries :: Python Modules",
        "Topic :: Software Development :: Quality Assurance",
    ],
)

flake8_array_spacing/__init__.py

_ok = [ -3  + 4j ,  -10 +  20j ]


class ArraySpacing(object):
    name = 'array-spacing'
    version = '0.1.dev0'

    @classmethod
    def add_options(cls, option_manager):
        option_manager.add_option(
            '--array-spacing', action='store_true', default=False,
            parse_from_config=True,
            help='Allow array-like of float and int to violate E2XX.')

    @classmethod
    def parse_options(cls, options):
        cls.enabled = options.array_spacing

    def __init__(self, logical_line):
        self.logical_line = logical_line

    def __iter__(self):
        if not self.enabled:
            return
        # TODO: figure out range(s) that should be skipped because they are
        # lists of lists of numbers, then skip them if "found" is in the ranges
        # below
        from pycodestyle import (
            extraneous_whitespace,  # 201, 202, 203
            whitespace_around_operator,  # 221, 222
            whitespace_around_comma,  # 241
        )
        for found, msg in extraneous_whitespace(self.logical_line):
            yield found, 'A' + msg[1:]
        for found, msg in whitespace_around_operator(self.logical_line):
            yield found, 'A' + msg[1:]
        for found, msg in whitespace_around_comma(self.logical_line):
            yield found, 'A' + msg[1:]

You can see the TODO bit still needs to be done (i.e., don't emit cases that are within the acceptable array-like-of-numerical ranges of the logical lines), but I think some suitable regex could do it. However, this is already pretty clearly a sub-optimal solution because now any existing # noqa: E2XX's in the codebase would need to be renamed # noqa: A2XX, and also any checks that are really E2XX are now called by this other annoying A2XX name.

My sense is that some deeper restructuring of flake8 might be required to support this sort of use case properly. Maybe adding a hook into ``Manager._handle_results, Manager.report`, `StyleGuide`, or somewhere else, allowing post-hoc filtering of results by a plugin like:

        # filter results
        for plugin in filtering_plugins:
            results = plugin.filter_results(results)

If there were such a hook, the class above could make a filename->logical line->ranges to ignore or so. Of course this could have some memory problems, so maybe there is a better way to do it.

The text was updated successfully, but these errors were encountered:

asottile · 2021-04-03T06:43:23Z

In GitLab by @asottile on Jul 8, 2020, 18:19

the complaint for (2) is pretty easily solved with sed

I believe the rest of this is a duplicate of #515

(note that sigmavirus42 and I (the two people helping in the pycodestyle issues) are the maintainers here)

asottile · 2021-04-03T06:43:26Z

In GitLab by @asottile on Jul 8, 2020, 18:19

closed

asottile · 2021-04-03T06:43:29Z

In GitLab by @larsoner on Jul 8, 2020, 18:51

the complaint for (2) is pretty easily solved with sed

I would call this a work around rather than solution. Then people need to know what these A errors are about and how they relate to the E ones. It certainly seems suboptimal to me at least.

I believe the rest of this is a duplicate of #515

That one appears to be about ignoring entire files / adding to the file ignore list.

The proposal here is really about (ideally) filtering errors based on logical-line (and/or physical line) criteria after they have been detected by the given global rules. In other words, adding this new hook would allow general algorithmic subselection of errors based on content. For example, maybe you want to:

ignore E501 for any line that is a variable assignment to a string
ignore E501 for any line with a string containing "http" (if we allow the subselection hook to pass the physical as well as logical line)
ignore some E2XX within any part of the logical line that is list of numerical (SciPy's use case)

I know the third would be useful in NumPy and scipy, and the second would be useful in many projects I maintain. But maybe I'm underestimating the complexity of the implementation / API contract Flake8 would have to absorb for this functionality.

asottile · 2021-04-03T06:43:31Z

In GitLab by @larsoner on Jul 8, 2020, 18:58

TL;DR: I guess what I'm saying would be awesome would be a hook that allowed you to algorithmically effectively do the equivalent of inserting # noqa: ... based on the logical and/or physical content of each error that is actually found. This seems different from extending the file exclude list.

asottile · 2021-04-03T06:43:34Z

In GitLab by @sigmavirus24 on Jul 9, 2020, 06:25

So I'll say that the more complex we make Flake8 the less we can add in the future and this genuinely isn't something that has a benefit that would outweigh the sheer complexity of what you're actually asking for. On the other hand, Flake8 could be used as part of a pipeline and produce machine-readable output that is parsed/handled/etc by some other tool. flake8-json exists. You could build something that reads that, analyses the line it's referring to and does the algorithmic decision making you're concerned with. That's something you can do today (could have done months ago really) and will continue to work as the output isn't likely to change. It keeps Flake8 maintainable. It doesn't introduce the nightmare of "2 hooks have differing opinions on the same violation on the same line, what do?" it means you control all of this (which it sounds like is what you want) and you don't need to block on us.

asottile · 2021-04-03T06:43:37Z

In GitLab by @larsoner on Jul 9, 2020, 06:59

this genuinely isn't something that has a benefit that would outweigh the sheer complexity of what you're actually asking for.

Fair enough, I had wrongly assumed this wouldn't require too many additional lines or extensive modifications to flake8.

flake8-json exists. You could build something that reads that, analyses the line it's referring to

Indeed I could even pretty easily write a custom reporter and filter directly, which seems easy enough to do. However, I wouldn't have access to the logical line at that point so still wouldn't be as straightforward, unless there is some way to (at the reporter stage level) get the logical line that a physical line error corresponds to. (I also don't know if I'm using these terms correctly since I'm relatively new.)

But I think what I'll really do instead is just make this A error-raiser and make it respect any existing # noqa: E2XX variants. It seems like the simplest way to do what we need. Thanks for your help!

asottile closed this as completed Apr 3, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ENH: Allow flexible spacing for numerical list-of-list (array-like) #758

ENH: Allow flexible spacing for numerical list-of-list (array-like) #758

asottile commented Apr 3, 2021

asottile commented Apr 3, 2021

asottile commented Apr 3, 2021

asottile commented Apr 3, 2021

asottile commented Apr 3, 2021

asottile commented Apr 3, 2021

asottile commented Apr 3, 2021

ENH: Allow flexible spacing for numerical list-of-list (array-like) #758

ENH: Allow flexible spacing for numerical list-of-list (array-like) #758

Comments

asottile commented Apr 3, 2021

asottile commented Apr 3, 2021

asottile commented Apr 3, 2021

asottile commented Apr 3, 2021

asottile commented Apr 3, 2021

asottile commented Apr 3, 2021

asottile commented Apr 3, 2021