Chore/maintenance #32

Merged
rempairamore merged 36 commits into main from chore/maintenance
Apr 3, 2026

@arcangelo7
Member

DRY refactoring and dead code removal

Shared logic between indexapi_v1.py and indexapi_v2.py (config, SPARQL metadata queries, date/author/self-citation helpers) extracted into indexapi_common.py. Removed the unreachable v1 /metadata endpoint, dead branches, unused WebLogger module, and replaced dateutil.parser with strptime.

Integration tests

Added tests for index v1 and v2, running against real SPARQL backends (QLever) with pre-built indexes tracked in the repo.

Ramose: vendored copy replaced by package dependency

Deleted the vendored src/ramose.py and switched to ramose>=2.0.0 as a package dependency.

Other

Python support: moved from 3.9-3.11 to 3.10-3.12 (3.13 dropped because of a web.py incompatibility).

arcangelo7 and others added 30 commits March 8, 2026 13:22
Add pyright as dev dependency for static type checking.
Citations from DOI 10.1162/qss_a_00292 (OpenCitations Meta paper).
Use the local ramose fork instead of the installed package to
support multi-value parameter handling required by non-OMID
identifier queries. Add tests for venue-citation-count endpoint.
…uation bug

Virtuoso incorrectly evaluates subsequent OPTIONAL blocks when a
preceding OPTIONAL contains a UNION with a transitive property path
(frbr:partOf+) that produces no matches. This caused empty author and
source metadata for non-JournalArticle entities (e.g. fabio:Expression),
leading to wrong author_sc and journal_sc values in citation results.

Splitting the single OPTIONAL with UNION into two separate OPTIONALs
sharing the same ?venue variable is semantically equivalent and avoids
the bug.
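
The rewrite described above can be illustrated with a small query builder. This is only a sketch of the shape of the change, not the project's actual query: the two branch patterns, the `?br` variable, and the `fabio:Journal` / `fabio:Book` type checks are hypothetical, and only `frbr:partOf+` and `?venue` come from the commit message. The two forms return the same bindings when at most one branch matches for a given resource, which is the assumption made here.

```python
# Hypothetical branch patterns; only frbr:partOf+ and ?venue are taken from the commit message.
JOURNAL_BRANCH = "?br frbr:partOf+ ?venue . ?venue a fabio:Journal ."
BOOK_BRANCH = "?br frbr:partOf ?venue . ?venue a fabio:Book ."


def optional(pattern: str) -> str:
    """Wrap a graph pattern in an OPTIONAL block."""
    return f"OPTIONAL {{ {pattern} }}"


def union(*patterns: str) -> str:
    """Join graph patterns with UNION, each in its own group."""
    return " UNION ".join(f"{{ {p} }}" for p in patterns)


def venue_clause_single_optional() -> str:
    """Old shape: one OPTIONAL containing a UNION (the form that triggered the bug)."""
    return optional(union(JOURNAL_BRANCH, BOOK_BRANCH))


def venue_clause_split_optionals() -> str:
    """New shape: two OPTIONAL blocks that share the ?venue variable."""
    return "\n".join(optional(branch) for branch in (JOURNAL_BRANCH, BOOK_BRANCH))


if __name__ == "__main__":
    print(venue_clause_single_optional())
    print(venue_clause_split_optionals())
```

Any OPTIONAL blocks that follow the venue clause (authors, for example) stay unchanged; only the venue clause is restructured.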

Also adds integration tests for author self-citation, journal
self-citation, negative timespan, and month/day precision in timespan
calculations, with real data from OpenCitations.
…rptime

Remove unreachable code paths (encode, isinstance else branches,
multi=False, reverse=False, defensive key checks) and replace
dateutil.parser.parse with datetime.strptime using date padding.
Add mock tests for SPARQL endpoint failures to reach 100% coverage.
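
A minimal sketch of what replacing `dateutil.parser.parse` with `datetime.strptime` plus date padding might look like. The helper name `parse_padded_date` and the choice to pad missing month and day with `01` are assumptions made for illustration; the commit message does not show the actual helper.

```python
from datetime import datetime


def parse_padded_date(value: str) -> datetime:
    """Parse 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' with strptime.

    Missing month/day components are padded with '01' (an assumption),
    so a single fixed format string is enough.
    """
    parts = value.strip().split("-")
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unsupported date value: {value!r}")
    padded = parts + ["01"] * (3 - len(parts))
    return datetime.strptime("-".join(padded), "%Y-%m-%d")


if __name__ == "__main__":
    for raw in ("2020", "2020-05", "2020-05-17"):
        print(raw, "->", parse_padded_date(raw).date())
```

Unlike a general-purpose parser, this rejects anything that is not in one of the three expected shapes, which makes failures explicit.
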
The v1 __get_omid_of used plain literals only, failing to resolve DOIs
stored as typed literals (^^xsd:string). The __br_meta_metadata had a
UNION inside OPTIONAL that triggered a Virtuoso evaluation bug,
corrupting GROUP_CONCAT results for authors. Both issues caused
inconsistent author_sc values between v1 and v2 for the same citations.
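
As a rough illustration of the typed literal issue, the pattern below matches an identifier value stored either as a plain literal or as an `xsd:string` typed literal. The helper name and the `literal:hasLiteralValue` predicate are assumptions for the sake of the example; the real `__get_omid_of` query is not shown in this PR.

```python
def literal_value_pattern(subject_var: str, value: str) -> str:
    """Build a SPARQL group matching a value as plain or xsd:string literal.

    The predicate literal:hasLiteralValue is assumed here, not taken from the source.
    """
    # Escape backslashes and double quotes so the value is a valid SPARQL string.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    plain = f'{subject_var} literal:hasLiteralValue "{escaped}" .'
    typed = f'{subject_var} literal:hasLiteralValue "{escaped}"^^xsd:string .'
    return f"{{ {plain} }} UNION {{ {typed} }}"


if __name__ == "__main__":
    print(literal_value_pattern("?identifier", "10.1162/qss_a_00292"))
```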

Port the typed literal UNION and the split OPTIONAL fixes from v2.
Replace bare except clauses with except RequestException and return
({}, []) instead of (None, None) from __br_meta_metadata to resolve
pyright errors. Add dedicated v1 test file, rename v2 tests, extract
shared helpers into conftest, and exclude ramose.py from coverage.

BREAKING CHANGE: the /v1/metadata/{dois} endpoint has been removed.
Clients relying on this endpoint will receive a 410 Gone response.

Extract common functions (metadata fetching, citation helpers, duration
calculation) into indexapi_common to eliminate duplication between v1 and
v2.

Move test database setup from shell scripts into pytest session fixtures
with pinned Docker images. Pin CI actions to commit SHAs and consolidate
setup-uv with python-version parameter.

web-py 0.62 imports the cgi module, removed in Python 3.13 (PEP 594).
No compatible release exists yet. Narrow requires-python and CI matrix
to <=3.12 until upstream fixes this.
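
The constraint above can be checked from Python itself. The sketch below is illustrative only: the function names are hypothetical, and the 3.10 to 3.12 bounds are taken from the PR description rather than from the project's packaging metadata.

```python
import importlib.util
import sys

# Bounds taken from the PR description (3.10-3.12); not read from pyproject.toml.
SUPPORTED_MIN = (3, 10)
SUPPORTED_MAX = (3, 12)


def in_supported_range(version=None) -> bool:
    """Return True if the (major, minor) version falls within the supported range."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


def cgi_importable() -> bool:
    """Return True if a module named 'cgi' can be found, without importing it."""
    return importlib.util.find_spec("cgi") is not None


if __name__ == "__main__":
    print("supported interpreter:", in_supported_range())
    print("cgi module available:", cgi_importable())
```
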
@rempairamore
Contributor

LGTM

@rempairamore rempairamore merged commit 05272ce into main Apr 3, 2026
14 checks passed
3 participants