Keep content of textual, unparsed files to the dataset #151

Open
nguyenhoan opened this issue Jul 19, 2017 · 7 comments
Comments

@nguyenhoan
Contributor

Useful for tasks that process file contents as plain-text documents.
Users might also write a simple parser in Boa to parse the text and extract information.

@nguyenhoan
Contributor Author

53daba4

@psybers psybers reopened this Aug 14, 2017
@psybers
Member

psybers commented Aug 14, 2017

Should we store all textual files, including the ones we parsed? That would give users a consistent way to analyze any text file (including source).

@psybers
Member

psybers commented Aug 14, 2017

The data should be stored in its own data file, not in the AST sequence file.

@psybers
Member

psybers commented Aug 14, 2017

There needs to be a Boa function to read the file and return the contents for a given ChangedFile.
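
As a rough illustration only (in Python, not Boa, and not the actual Boa implementation), such a lookup might sit on top of a content store that is kept apart from the ASTs. The class name `ContentStore`, the method `get_content`, and the `(revision, path)` key are all hypothetical; the sketch assumes contents are appended to one data stream and located through an offset index.

```python
import io


class ContentStore:
    """Hypothetical store: raw file contents live in their own data stream,
    separate from the ASTs, with an index of (offset, length) per file."""

    def __init__(self, stream=None):
        # An in-memory stream stands in for the separate data file.
        self._data = stream if stream is not None else io.BytesIO()
        self._index = {}  # key -> (offset, length in bytes)

    def put(self, key, text):
        """Append the text for one changed file and remember where it went."""
        payload = text.encode("utf-8")
        self._data.seek(0, io.SEEK_END)
        offset = self._data.tell()
        self._data.write(payload)
        self._index[key] = (offset, len(payload))

    def get_content(self, key):
        """Return the stored text for a changed file, or None if absent."""
        entry = self._index.get(key)
        if entry is None:
            return None
        offset, length = entry
        self._data.seek(offset)
        return self._data.read(length).decode("utf-8")


# The key here is an assumed (revision id, file path) pair.
store = ContentStore()
store.put(("r1", "README.md"), "# Title\n")
print(store.get_content(("r1", "README.md")))
```

Keeping only offsets in the index means the contents are read lazily, so analyses that never ask for file text would not pay for it.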

@nguyenhoan
Contributor Author

I don't see the benefit of storing parsed files, because their contents are already in the ASTs.
Storing them would increase the dataset size significantly.

@hridesh
Member

hridesh commented Aug 14, 2017 via email

@psybers
Member

psybers commented Aug 14, 2017

Parsed files do not retain all information (such as whitespace and comments), however. Hence my thought of including them.
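
The same effect is easy to see by analogy with Python's standard `ast` module (this is not Boa's parser, and Boa's ASTs may differ in what they keep). The helper name `round_trip` is made up for this sketch, and `ast.unparse` requires Python 3.9 or later.

```python
import ast


def round_trip(source):
    """Parse source to an AST, then regenerate code from the AST alone."""
    return ast.unparse(ast.parse(source))


original = "x = 1  # the answer\n\n\ny   =   x + 1\n"
regenerated = round_trip(original)

# The comment and the original spacing are not part of the AST,
# so they cannot be recovered from it.
print("comment kept:", "the answer" in regenerated)
print("text identical:", regenerated == original)
```

The regenerated code still parses to the same tree, but the comment and the original layout are gone, which is the information only the raw text can supply.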
