CSL omitted in translation to "machine" form #19

Closed
jgm opened this issue May 28, 2020 · 17 comments

Comments

@jgm jgm commented May 28, 2020

In processor-tests/machines/name_BibliographyNameFormNeverShrinks.json I see

    "csl": "<info>\n    <title>SCRDiss2</title>\n    <id>http://www.zotero.org/styles/juristische-zitierweise</id>\n    <link href=\"http://www.zotero.org/styles/juristische-zitierweise\" rel=\"self\"/>\n    <link href=\"www.niederle-media.de/Zitieren.pdf\" rel=\"documentation\"/>\n    <link href=\"https://forums.zotero.org/discussion/20886/citation-style-for-german-lawyers/\" rel=\"documentation\"/>\n    <author>\n      <name>SCR</name>\n      <email>Sophie.Catherine@gmx.de</email>\n    </author>\n    <contributor>\n      <name>SCR</name>\n    </contributor>\n    <category citation-format=\"note\"/>\n    <category field=\"law\"/>\n    <summary>Juristische Zitierweise nach Stüber www.niederle-media.de/Zitieren.pdf</summary>\n    <updated>2017-12-28T23:40:35+00:00</updated>\n    <rights license=\"http://creativecommons.org/licenses/by-sa/3.0/\">This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License</rights>\n  </info>\n  <locale xml:lang=\"de-DE\">\n    <terms>\n      <term name=\"accessed\">besucht am</term>\n    </terms>\n  </locale>\n  <macro name=\"author\">\n    <names variable=\"author\" font-style=\"italic\">\n      <name delimiter=\"/ \" name-as-sort-order=\"all\" sort-separator=\", \" form=\"long\"/>\n      <label form=\"short\" prefix=\" (\" suffix=\")\"/>\n      <substitute>\n        <names variable=\"editor\"/>\n      </substitute>\n    </names>\n  </macro>\n\n<macro name=\"editor\">\n    <names variable=\"editor\" font-style=\"italic\">\n      <name delimiter=\"/ \" name-as-sort-order=\"all\" sort-separator=\", \" form=\"long\"/>\n      <label form=\"short\" prefix=\" (\" suffix=\")\"/>\n    </names>\n  </macro>\n  <macro name=\"author-note\">\n    <names variable=\"author\" font-style=\"italic\">\n      <name form=\"short\" delimiter=\"/\" et-al-min=\"3\" et-al-use-first=\"1\"  initialize-with=\". 
\"/>\n    </names>\n  </macro>\n  <macro name=\"autor-editor-note\">\n    <names variable=\"author\" font-style=\"italic\">\n      <name form=\"short\" delimiter=\"/\" et-al-min=\"3\" et-al-use-first=\"1\" sort-separator=\"\"/>\n      <substitute>\n        <names variable=\"editor\"/>\n      </substitute>\n    </names>\n  </macro>\n  <citation disambiguate-add-names=\"true\" disambiguate-add-givenname=\"true\" givenname-disambiguation-rule=\"all-names-with-initials\">\n    <layout delimiter=\"; \" suffix=\".\">\n      <text macro=\"autor-editor-note\"/>\n    </layout>\n  </citation>\n  <bibliography>\n    <sort>\n      <key macro=\"author\"/>\n      <key variable=\"issued\"/>\n    </sort>\n    <layout>\n       <text macro=\"author\"/>\n    </layout>\n  </bibliography>\n</style>",

which leaves off the beginning of the CSL file. The whole thing is correctly represented in the corresponding (human) .txt file, and it seems to be valid. So this seems to be an issue with the generation of the machine-readable versions by python processor.py --grind.
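
A check along these lines could flag the affected machine files. This is only a sketch: it assumes each file in the machines directory is a JSON object with the style stored under a top-level "csl" key (as in the excerpt above), and the helper names `csl_is_complete` and `find_truncated_csl` are hypothetical, not part of processor.py.

```python
import json
from pathlib import Path
from typing import List


def csl_is_complete(csl_text: str) -> bool:
    """Return True if the CSL string opens with an XML declaration or a <style> root."""
    head = csl_text.lstrip()
    return head.startswith("<?xml") or head.startswith("<style")


def find_truncated_csl(machines_dir: str) -> List[str]:
    """List the .json files whose "csl" value is missing the start of the style."""
    truncated = []
    for path in sorted(Path(machines_dir).glob("*.json")):
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
        # Assumption: the style lives under a top-level "csl" key.
        csl_text = data.get("csl", "") if isinstance(data, dict) else ""
        if not csl_is_complete(csl_text):
            truncated.append(path.name)
    return truncated
```

Calling `find_truncated_csl("processor-tests/machines")` after a grind run would return the names of any files whose style begins partway through, such as with `<info>`.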

Author

@jgm jgm commented May 28, 2020

Similar issues in several other tests, including
bugreports_DelimiterOnLayout
name_BibliographyNameFormNeverShrinks
group_LegalWithAuthorDate
bugreports_EnvAndUrb

Also textcase_NoSpaceBeforeApostrophe has invalid xml:

            </label prefix=" (" suffix=")" form="short"/>
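
One way to catch this kind of problem is to run each style through an XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `xml_error` is hypothetical, and it only tests well-formedness, not validity against the CSL schema.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def xml_error(text: str) -> Optional[str]:
    """Return the parser's error message, or None if the text is well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return str(err)
    return None
```

A closing tag that carries attributes, like the line quoted above, makes `xml_error` return a message instead of `None`.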
Member

@bdarcus bdarcus commented May 28, 2020

Is this a new problem @jgm?

I merged a PR yesterday that was intended to fix a regular expression, but perhaps it had unintended consequences.

Can you take a look at this please?

af561aa

Member

@bdarcus bdarcus commented May 28, 2020

And here's the PR:

#10

Author

@jgm jgm commented May 28, 2020

I don't know, I'm doing something new (fooling around with writing a lua citeproc), so it's not that something that was working before broke.

Member

@bdarcus bdarcus commented May 28, 2020

OK. I don't really know anything about this myself.

Maybe @fbennett can offer insight.

Author

@jgm jgm commented May 28, 2020

Not sure what is going on now!
I reset to current master HEAD.
Now when I run python processor.py --grind I get

Traceback (most recent call last):
  File "processor.py", line 421, in <module>
    params.refreshSource(force=True)
  File "processor.py", line 151, in refreshSource
    self.grindFile(hpath,filename,mp)
  File "processor.py", line 173, in grindFile
    test.parse()
  File "processor.py", line 238, in parse
    self.extract(element,required=True,is_json=False)
  File "processor.py", line 258, in extract
    raise ElementMissing(self.script,tag,self.testname)
__main__.ElementMissing: ('processor.py', 'MODE', 'ignore.txt')

and I notice that many of the .txt files are changed from their repository versions. (I wouldn't have expected this program to change the sources.)

Reverting the commit you mention doesn't affect this.

Am I doing something wrong?

If I revert to 4a549a6, then the script succeeds but again I see changes in the .txt files. (Maybe just line endings.)

The error noted above is still present with that version.
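
To test the line-endings guess, the repository version and the rewritten version of a file can be compared after normalizing newlines. This is a minimal sketch; the function name is hypothetical, and how the two byte strings are obtained (for example `Path.read_bytes()` on the working copy versus the committed blob) is left to the caller.

```python
def differs_only_in_line_endings(original: bytes, modified: bytes) -> bool:
    """True if the inputs differ, but match once CRLF and CR are normalized to LF."""
    def normalize(data: bytes) -> bytes:
        return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

    return original != modified and normalize(original) == normalize(modified)
```

A `True` result means the only change is in the newline convention; `False` means either the files are identical or the content itself changed.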

Member

@bdarcus bdarcus commented May 28, 2020

OK, that error I understand, and can fix. Give me a bit.

Member

@bdarcus bdarcus commented May 28, 2020

That error should be fixed.

Member

@bdarcus bdarcus commented May 28, 2020

I just ran the script and looked at the processor-tests/machines/name_BibliographyNameFormNeverShrinks.json. Note that I do NOT see the same output as you. I see a full CSL style for the "csl" value, with the root "style" element.

   "csl": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<style xmlns=\"http://purl.org/net/xbiblio/csl\" 

I'm running Python 3.8.3.

Member

@bdarcus bdarcus commented May 28, 2020

> Also textcase_NoSpaceBeforeApostrophe has invalid xml:
>
>     </label prefix=" (" suffix=")" form="short"/>

Again, on this I see what I expect: a full style; not the above.

adam3smith added a commit that referenced this issue May 28, 2020
Reported in #19 (comment) (and I believe also by Norm)
Member

@adam3smith adam3smith commented May 28, 2020

@bdarcus -- pretty sure @jgm just pulled out that line, which is invalid XML. I've created a pull request, should be OK to merge, but I'm doing too many things at once so please take a quick look before merging.

bdarcus pushed a commit that referenced this issue May 28, 2020
Closes #15.

Also reported in #19.
Member

@bdarcus bdarcus commented May 28, 2020

Right; I clearly misread that one.

So @jgm - some of this is now fixed.

Author

@jgm jgm commented May 28, 2020

Great! That fixes all my xml parsing errors, thank you.

I'm still wondering why the grind script changes the source files, but that's a more minor issue.

Author

@jgm jgm commented May 28, 2020

Feel free to close this if you intend the changing .txt files.

Member

@bdarcus bdarcus commented May 28, 2020

I'll leave it open for a bit in case @adam3smith or @fbennett have anything to say on that.

But glad it's working now!

Since you're looking at the test-suite now, if you have any time, I'd welcome input on #16. I've just pushed a PR that addresses it, I think.

Member

@fbennett fbennett commented May 29, 2020

For citeproc-js development, I built a new test runner, citeproc-test-runner, that can run in Travis-CI. It has some nice features, including a watch mode that provides dynamic updates in a terminal window as a style file is being edited, and configurable paths to the various inputs needed to run a processor. It's coded to run citeproc-js, but could probably be adapted for use with other processors.

Member

@bdarcus bdarcus commented May 29, 2020

@fbennett do you recommend we add that new test runner to this README?

@bdarcus bdarcus closed this May 29, 2020