Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add flag to print list of files processed #205

Closed
waldyrious opened this issue Sep 10, 2023 · 7 comments
Closed

Add flag to print list of files processed #205

waldyrious opened this issue Sep 10, 2023 · 7 comments
Labels
enhancement New feature or request fixed in next Fixed in the "next" branch for the next release

Comments

@waldyrious
Copy link

Basically something along the same lines as igorshubovych/markdownlint-cli#415. The output would change from:

$ markdownlint-cli2 '**/*.md'
markdownlint-cli2 v0.9.2 (markdownlint v0.30.0)
Finding: **/*.md
Linting: 5 file(s)
Summary: 3 error(s)
CHANGELOG.md:13:52 MD052/reference-links-images Reference links and images should use a label that is defined [Missing link or image reference definition: "#161"] [Context: "[#161][]"]
docs/index.md:1 MD041/first-line-heading/first-line-h1 First line in a file should be a top-level heading [Context: "Foobar"]
README.md:94 MD001/heading-increment/header-increment Heading levels should only increment by one level at a time [Expected: h3; Actual: h4]

to this:

$ markdownlint --verbose '**/*.md'
markdownlint-cli2 v0.9.2 (markdownlint v0.30.0)
Finding: **/*.md
Linting: 5 file(s)
    README.md
    LICENSE.md
    CONTRIBUTING.md
    CHANGELOG.md
    docs/index.md 
Summary: 3 error(s)
    CHANGELOG.md:13:52 MD052/reference-links-images Reference links and images should use a label that is defined [Missing link or image reference definition: "#161"] [Context: "[#161][]"]
    docs/index.md:1 MD041/first-line-heading/first-line-h1 First line in a file should be a top-level heading [Context: "Foobar"]
    README.md:94 MD001/heading-increment/header-increment Heading levels should only increment by one level at a time [Expected: h3; Actual: h4]

(Or something similar — I'm not particular about the specific formatting of the output!)

@DavidAnson
Copy link
Owner

You can do part of that with an output formatter. This one lists files that failed validation (the second half of your example): https://github.com/DavidAnson/markdownlint-cli2/tree/main/formatter-summarize

The first half of the example, printing all the files that are examined is not possible with an output formatter today. If you do something like that interactively with a large set of files, the cost of writing everything to the console can become overwhelming and slow everything down. There's also no guarantee of the order things would be printed because of the asynchronous nature of the tool. And ordering output would slow everything down even further.

What is the objective here? Are you trying to verify the right set of files are being looked at? Or make sure the tool is not hung?

@DavidAnson DavidAnson added the question Further information is requested label Sep 10, 2023
@waldyrious
Copy link
Author

waldyrious commented Sep 10, 2023

Thanks for the quick response! Replies below with inline quotes.

What is the objective here? Are you trying to verify the right set of files are being looked at?

Yes, in essence that's it. More detailed context below, collapsed for convenience.

This came out in the context that triggered DavidAnson/markdownlint#915. I created that issue after being puzzled by a mistake that was being caught locally in VS Code, but not by markdownlint-cli2 in GitHub Actions (see relevant comment).

It turns out the issue was being caught by VS Code's own markdown linter, not by the markdownlint extension (which I also have installed in VS Code). There's an UI issue here which is not the responsibility of markdownlint, but VS Code's, which does not make it clear that the source of the reported problem is its native markdown support and not markdownlint extension, despite the presentation being very similar.

But beyond that, I was further confused due to not being aware that MD0052 explicitly excluded shortcut-style links. I ended up sifting the markdownlint repo for clues into this, leading to this comment and the discussion following it.

Having the list of files would have confirmed that markdownlint-cli2 was indeed processing the file in question and not finding any issue with it.


There's also no guarantee of the order things would be printed because of the asynchronous nature of the tool.

IIUC, this is already the case in the printing of the issues list, so I wasn't expecting the list of files to be processed to be sorted either. (It would be nice, sure, but most of the value of the flag would be preserved even without the sorting).

If you do something like that interactively with a large set of files, the cost of writing everything to the console can become overwhelming and slow everything down.

Isn't this already the case if the command is invoked with a plain (unquoted) glob in a directory with many markdown files? In that case the "Finding" line already prints the list of files anyway:

$ markdownlint-cli2 *.md
markdownlint-cli2 v0.9.2 (markdownlint v0.30.0)
Finding: CHANGELOG.md CONTRIBUTING.md LICENSE.md README.md
...

...so the potential issue already exists (and happens by default in that case). In the mode I'm proposing, printing the full file list (for quoted globs) would only happen if the user explicitly opts in to that via the --verbose flag (or whatever it's called). I would argue that these two aspects diminish the severity of the potential slowdown you describe.

@DavidAnson
Copy link
Owner

Thanks for the info! I have some ideas here I will explore soon.

Regarding how to verify a file is being included, an easy thing is to introduce a problem and see if it gets caught.

Regarding status output slowing things down, it's not displaying the large list that's a problem (in the example you show, the shell has already enumerated those files), instead it's the printing of each file as it is scanned that can slow things down. If you have a tool that does that and it has an option not to, time it both ways and you may be surprised how much faster it is without the status. :) I agree with you that opting into this demonstrates a person's willingness to deal with that, though.

@waldyrious
Copy link
Author

Regarding how to verify a file is being included, an easy thing is to introduce a problem and see if it gets caught.

Yes, that's what we eventually did, but not at first because we assumed a problem was already there. When there are rules with subtle exceptions (like MD052), what one assumes to be a problem may not actually be a problem in markdownlint's eyes, so a file list might help reassure/guide one to look in the right places.

Regarding status output slowing things down, it's not displaying the large list that's a problem (in the example you show, the shell has already enumerated those files), instead it's the printing of each file as it is scanned that can slow things down.

I assumed markdownlint either consumed the expanded list produced by the shell, or produced such a list itself via globby, before processing the files. I wasn't expecting the tool to interactively print the list of files as it processes them, but rather just dump the list all at once at the beginning. I'm not sure if this alleviates the issue, but if I'm understanding you correctly, then perhaps it does?

@DavidAnson
Copy link
Owner

Yes, displaying the whole list at once avoids the overhead of doing so during processing. I'm thinking to maybe provide the file list to the output formatters so people can do whatever they want with it. If so, it will be provided after linting along with results. (Unless linting takes a very long time, I don't think this is worse than displaying it earlier.)

@waldyrious
Copy link
Author

waldyrious commented Sep 11, 2023

I'll be honest, I would prefer not having to install multiple packages (markdownlint-cli2 plus the formatter) as well as set up a .markdownlint-cli2.jsonc config file just to get this behavior... 😅 Dunno, IMHO it feels basic enough to warrant a CLI flag.

@DavidAnson
Copy link
Owner

@waldyrious, here's the proposal: 3ea8f24

It's not a command-line option because those are reserved for things that NEED to be.

I plan to review this and commit tomorrow evening.

@DavidAnson DavidAnson added enhancement New feature or request fixed in next Fixed in the "next" branch for the next release and removed question Further information is requested labels Sep 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request fixed in next Fixed in the "next" branch for the next release
Projects
None yet
Development

No branches or pull requests

2 participants