Page ranges with letters #25
I don't have my copy of Chicago handy. Anyone else know? Beyond confirming the tests, it looks like we need to add something on this to the spec.
I think the logic makes some sense: at times, page numbers have a prefix that identifies them as pages, e.g., in a separately published appendix. In those cases, there's no reason to think that standard page number collapsing rules shouldn't apply, i.e., if you're citing A231 to A232, A231-32 (Chicago) or A231-2 (minimal) does seem to make sense and increase readability. The downside I'm seeing (and I think that may have come up before) is that you may see hyphens in electronic article numbers, and then this rule can produce bizarre outcomes. I think it tries to prevent this by testing for identical prefixes. The Chicago Manual has nothing to say on this or on any of the examples in the Chicago weird tests, so no help there ;). However, Citing Medicine (aka Vancouver), which uses minimal page ranges, does have a number of relevant examples and, unless I'm misreading something, they confirm the test suite's behavior: https://www.ncbi.nlm.nih.gov/books/NBK7282/#A32739
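For illustration only, here is a minimal Python sketch of the prefix check described above. It is not taken from citeproc-js or the test suite: the function names, the `min_digits` parameter, and the choice to leave ranges untouched when the digit counts differ are all assumptions made for this example. With `min_digits=1` it reproduces the minimal form A231-2; with `min_digits=2` it happens to produce the A231-32 form quoted above, but it does not model the full Chicago rules.

```python
import re


def split_page(page):
    """Split a label like 'A231' into (prefix, digits); None if it does not fit."""
    m = re.fullmatch(r"(\D*)(\d+)", page)
    return (m.group(1), m.group(2)) if m else None


def collapse_page_range(first, last, min_digits=1, sep="-"):
    """Hypothetical helper: collapse a page range only when prefixes are identical."""
    a, b = split_page(first), split_page(last)
    # Assumption: leave the range untouched if either page cannot be parsed,
    # the prefixes differ, or the numeric parts have different lengths.
    if a is None or b is None or a[0] != b[0] or len(a[1]) != len(b[1]):
        return first + sep + last
    # Assumption: only collapse ascending ranges.
    if int(b[1]) <= int(a[1]):
        return first + sep + last
    # Drop leading digits shared with the first page, keeping at least min_digits.
    i = 0
    while i < len(b[1]) - min_digits and a[1][i] == b[1][i]:
        i += 1
    return first + sep + b[1][i:]


print(collapse_page_range("A231", "A232"))                # A231-2
print(collapse_page_range("A231", "A232", min_digits=2))  # A231-32
print(collapse_page_range("A231", "B232"))                # A231-B232
```

The mismatched-prefix case is the guard against the electronic-article-number problem: when the two sides don't share a prefix, the sketch leaves the range exactly as written.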
It would be helpful if the test suite could have a field that says whether the test really tests spec behavior or just some additional behavior that citeproc-js implements but isn't part of the spec.
Anyway, thanks for explaining the logic. I think it makes sense, and I'm happy to close this!
By the way, the reason I have all these questions is that I'm writing a new Haskell CSL processing library. The legacy code in pandoc-citeproc (inherited from citeproc-hs) is really hairy and I can't understand it well enough to maintain it; in addition, I never really understood CSL, and this is forcing me to learn it. The new library will be faster and more accurate than pandoc-citeproc, and it is parameterized on a document type, so it should be easy to use outside of the pandoc ecosystem. If quality is high enough I might make it a dependency of pandoc so a filter isn't needed. Just about everything is implemented now except disambiguation and collapsing. I'm sure I'll have more questions as I go along, and I'll put it in a public repository once it gets a bit closer.
Wow, very cool.
This is related to #17, so I strongly agree. Identifying these would also give us a checklist of details that we should add to the spec. As you work through these, could you perhaps post a list of tests you think might qualify, beyond this one? Also, do you have in mind what the content of that field should be? I experimented a bit with just adding this to page_Minimal, and the current Python script just ignores it; it would of course be easy to extend, though.
So possible values would be the releases ("1.0", "1.1"), with an optional variant, including maybe "undocumented" (to flag what needs updating in the spec)?
Since you're working on this (great!): the activity on the schema repo is aimed at pushing out two releases this summer, one of them a 1.1 release (we're also doing a minor release with new strings: types, variables, etc.). So far 1.1 doesn't have any breaking changes, but it does make explicit, and extend, a feature you already support in pandoc: citet citation config. The new element allows us to fully support styles like APA on this.
jgm commented Jun 12, 2020
The spec doesn't make clear what to do when page ranges have letters, but there are tests like this. Some of the expected results seem wrong to me, though.
E.g. in page_Minimal.txt, the second to last example
turns into
Why? That makes no sense to me. I am inferring that the algorithm is:
Is that the algorithm, and is that really how it should work here?
Similar questions apply to some of the weird Chicago cases.