The normalized MARC XML files that POD generates from the MARC21 files submitted by Johns Hopkins contain non-printing control characters that are not valid in XML. The visible symptom in the UI is that the record count is displayed as `???` for the XML version of the normalized files. Parsing these files with standard MARC tooling such as MarcEdit or ruby-marc raises errors because the files are not valid XML.

I'm not sure what we can do about this in POD, since the problem originates in the files submitted to POD. The records will still be available to downstream consumers, but those consumers are likely to run into problems using the files because they are not valid.
It's possible we could filter out non-printing characters during the normalized file writing process. Will need to investigate.
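A rough sketch of what that filtering could look like, written in Python purely for illustration (it is not POD's actual writer code, and the name `strip_invalid_xml_chars` is hypothetical). It assumes we simply want to drop every character that falls outside the XML 1.0 `Char` production before a field value is serialized:

```python
import re

# Matches any character that is NOT allowed by the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Return `text` with characters that XML 1.0 forbids removed.

    Hypothetical helper: the idea is to call it on each field or subfield
    value just before the normalized MARC XML is written.
    """
    return _INVALID_XML_CHARS.sub("", text)


# Made-up subfield value containing two BEL (\x07) characters.
cleaned = strip_invalid_xml_chars("Part one\x07 -- Part two\x07")
assert "\x07" not in cleaned
```

Tabs, line feeds, and carriage returns are kept because XML allows them. Whether silently dropping characters is acceptable, or whether they should be logged or replaced with a placeholder, is still an open question.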
Example record:
JHU 001/bib number: 9638894 contains two instances of `\x07` in the MARC 505$a
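For anyone who wants to reproduce the failure without the JHU file, here is a minimal sketch in Python using only the standard library. The subfield text is made up and is not the actual content of record 9638894, and `is_well_formed` is a hypothetical helper name:

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the standard-library XML parser accepts `xml_text`."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Made-up 505$a value with two BEL (\x07) control characters.
bad = '<subfield code="a">Part one\x07 -- Part two\x07</subfield>'
# The same value with the control characters removed.
good = '<subfield code="a">Part one -- Part two</subfield>'

print(is_well_formed(bad))
print(is_well_formed(good))
```

The parser rejects the first string because `\x07` is not a legal XML 1.0 character, which matches what the MARC tooling reports for the normalized files.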
Ideally, Johns Hopkins would fix this on their end. (I'm not even sure they know it's happening.) Has @bobpersing talked with them about this?
cc @bobpersing