redatam #672
Comments
@ropensci-review-bot check package |
Thanks, about to send the query. |
Error (500). The editorcheck service is currently unavailable |
@ropensci-review-bot check package |
Thanks, about to send the query. |
Error (500). The editorcheck service is currently unavailable |
@pachadotdev and @emilyriederer Sorry for any inconvenience caused by these errors. Our check system hasn't yet been properly configured to handle packages in sub-directories. I'll let you know here when we've updated, and you can call checks again. |
Hey @pachadotdev ! This seems like some very cool software and an important goal. Could you please elaborate on how you see this package interacting with the |
redatamx is a new package made by ECLAC. It focuses on the new "Redatam X" format, and I have no part in it. redatam (retired from CRAN, I hope to get it back there soon) is more focused on data "archeology," and I already have a group of users from Latin America who need demographic data for the period 1990-2020, the span of years in which the formats DIC (Redatam versions 1 to 5) and DICX (Redatam 6 and ongoing) were in use. @litalbarkai wrote the C++ parts, then I focused on the R and Python code and refactored it to work with C++11 and very minimal dependencies (i.e., pugixml instead of building/installing Apache Xerces), but it is a collaborative project and Lital is a co-author. I also wrote the article that we sent to the journal, where I was 100% focused on the "human writing" and not the "code writing", and Lital is the lead singer for the C++ parts. We have two repos and send each other PRs to keep it neat. Could we use branches? Yes, but I am a boomer. The alternative to this package is to use old hardware and a point-and-click tool on Windows 98/XP, which is why I keep my old ThinkPad X200 and an external DVD reader. It is not feasible to read old census data with modern hardware, a problem that stems from the data being in a closed-source format. Even worse, some recent census data comes with an installer that does not work on Windows 10+, and I was able to extract the data by using Wine on my main modern laptop. |
Hi @pachadotdev - thanks for your patience as we discussed internally. This looks like a great project that is undoubtedly extremely useful and well within the rOpenSci scope. However, we feel overall we need to adhere to the rOpenSci policy of not reviewing forks, due to the complexities that could cause in downstream transparency, maintainability, etc. For example, changes in one project are less likely to be fully tested for the other; one part of the project could change its license; etc. I think either a solution where the full codebase is contained in one repo, or one that limits yours to the current R package and uses the C++ project as a dependency, could work. If you might be interested in restructuring, I'd happily hold this and plan for a full review of the new project. Otherwise, it may be best to close for now. |
Thinking outside the box, should https://github.com/litalbarkai/open-redatam/ be the URL? I think @litalbarkai could comment on that solution. I cannot just create a "copy and paste" in my own repository; that would hide Lital's great contributions. I have also contributed to the C++ codebase and I prefer to keep track of all the changes I made. This is only a fork in technical terms, but it is not a fork in terms of a derived project. About this "changes in one project are less likely to be fully tested for the other; one part of the project could change its license; etc.":
|
Thanks @pachadotdev for the replies. Just to clarify: I understand that you and @litalbarkai have established a good working model that sounds very effective for your purpose. I understand the complexities here, and the concern isn't so much the operating model of this specific package as the general concerns that illustrate why we have the current policy. |
hi @mpadge @emilyriederer |
Hi @pachadotdev ! Thanks for the update. We do review submissions on branches, but that case assumes that packages post-review will ultimately be hosted on the default GitHub branch as the sole and primary instance. If both you and @litalbarkai are happy with that, this might work. Otherwise, I can run this by the team again to discuss. |
@litalbarkai are you OK with submitting your repo? I was considering merging my fork into your repo, but all the tutorials were MacGyver level and I gave up. |
@pachadotdev @litalbarkai This StackOverflow chunk can be used to merge two repos. I just tried it for your repos, and everything worked perfectly. You can also easily transfer issues across, if you need. Because the two repos are in different orgs, you might need to create a dummy repo in origin-org, transfer all issues to that, transfer dummy repo to destination-org, then move issues from there to destination-repo. |
can I keep my fork just in case? using https://github.com/litalbarkai/open-redatam/tree/main/rpkg seems to be ok (that is the original repo) |
or for a dummy/meta repo use https://github.com/orgs/open-redatam/repositories? I just created that |
Per discussion here, we're moving this to "hold" state while |
Submitting Author Name: Mauricio Pacha Vargas Sepulveda
Submitting Author Github Handle: @pachadotdev
Other Package Authors Github handles: (comma separated, delete if none) @litalbarkai
Repository: https://github.com/litalbarkai/open-redatam/tree/main/rpkg
Submission type: Pre-submission
Language: en
Scope
Please indicate which category or categories from our package fit policies or statistical package categories this package falls under. (Please check one or more appropriate boxes below):
Data Lifecycle Packages
Statistical Packages
Explain how and why the package falls under these categories (briefly, 1-2 sentences). Please note any areas you are unsure of:
REDATAM is a closed-source format for census and survey data. This package is an "archeological" version of the haven package, and allows reading this specific format, which is widely used in Latin America by different govt. statistical offices. With this package, I have been able to convert census data from the 1990s, which is not possible with the Redatam software on Windows because of multiple hardware changes in the last 30 years. The package also reads recent census data (2017-2020) correctly.
If submitting a statistical package, have you already incorporated documentation of standards into your code via the srr package?
Who is the target audience and what are scientific applications of this package?
Sociologists, political scientists and economists who need census data and an easy way to read it in R (or Python) to fit regression models or run other kinds of analysis.
No. There is a "redatamx" that reads a newer format.
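(If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research (https://devguide.ropensci.org/policies.html#ethics-data-privacy-and-human-subjects-research)?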
Yes.
This package was removed from CRAN for asking about a specific CLANG-ASAN error that took me a long time to replicate. I also asked about the error here: https://stackoverflow.com/questions/79171799/addresssanitizer-error-alloc-dealloc-mismatch-operator-new-vs-free-in-r-packa