CRS Entity matching #37
Reference: WASHWeb/WASHWeb-2019#37
We need a way to start matching organizations across the different WASH datasets with Wikidata. I'm opening this issue so that, aside from manual matching, we can discuss alternatives and find better solutions, such as the approach here:
https://github.com/cwrc/wikidata-entity-lookup
I've made good progress using OpenRefine for assisted manual matching. Some matches are automatic, but the first round still takes significant manual effort. The neat thing is that OpenRefine has an API, so we can control it and repeat procedures from Python. It is also well documented and supported in the Wikidata space: https://www.wikidata.org/wiki/Wikidata:Tools/OpenRefine/Editing
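For discussion, here is a minimal sketch of what a scripted first pass could look like, separate from OpenRefine. It assumes we query the public Wikidata `wbsearchentities` API directly and auto-accept a match only when exactly one candidate label equals the organization name; everything else would still go to manual review. The function names, the User-Agent string, and the exact-label rule are all hypothetical choices for illustration, not something we have agreed on or tested against our datasets.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_search_url(name, language="en", limit=5):
    """Build a wbsearchentities request URL for an organization name."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": language,
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urllib.parse.urlencode(params)


def parse_candidates(payload):
    """Reduce the API response to id/label/description per candidate."""
    return [
        {
            "id": hit["id"],
            "label": hit.get("label", ""),
            "description": hit.get("description", ""),
        }
        for hit in payload.get("search", [])
    ]


def pick_match(name, candidates):
    """Hypothetical rule: accept only a single exact, case-insensitive label match.

    Returns the candidate id, or None when the name needs manual review.
    """
    wanted = name.strip().casefold()
    exact = [c for c in candidates if c["label"].strip().casefold() == wanted]
    return exact[0]["id"] if len(exact) == 1 else None


def lookup(name):
    """Fetch candidates for one name (makes a network call to Wikidata)."""
    request = urllib.request.Request(
        build_search_url(name),
        # Placeholder User-Agent; replace with a real contact string.
        headers={"User-Agent": "washweb-matching-sketch/0.1"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_candidates(json.load(response))
```

Usage would be something like `pick_match(name, lookup(name))` for each organization name in a dataset, collecting the `None` results into a list for assisted matching in OpenRefine. The strict exact-label rule is deliberately conservative, so it will miss abbreviations and alternate spellings.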