
Usage

Actions

This extension adds numerous new actions. Most are CRUD actions for managing models, with inline documentation and predictable interactions, so only the more "unusual" actions are covered here.

agent_list

Search for agents by name or external ID, or just list all agents.

data_dict = {
    'q': 'QUERY',  # optional; searches in name, family_name, given_names, and external_id
}

toolkit.get_action('agent_list')({}, data_dict)

package_contributions_show

Show all contribution records for a package, grouped by agent. Optionally provide a limit and offset for pagination.

data_dict = {
    'id': 'PACKAGE_ID',
    'limit': 'PAGE_SIZE',
    'offset': 'OFFSET'
}

toolkit.get_action('package_contributions_show')({}, data_dict)

Returns a dict:

{
    'contributions': [
        {
            'agent': {
                # Agent.as_dict()
            },
            'activities': [
                # list of Activity.as_dict()
            ],
            'affiliations': [
                {
                    'affiliation': {
                        # Affiliation.as_dict()
                    },
                    'other_agent': {
                        # Agent.as_dict()
                    }
                },
                # ...
            ]
        },
        # ...
    ],
    'total': total,
    'offset': offset,
    'page_size': limit or total
}
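
The limit and offset parameters can be combined with the returned total to walk through every page. The following is a minimal sketch rather than part of the extension: iter_contributions and fake_show are hypothetical names, and it assumes that total and offset both count grouped contribution records (one per agent). fake_show stands in for the real action so the example runs without CKAN.

```python
def iter_contributions(show_action, package_id, page_size=50):
    """Yield every contribution record for a package, one page at a time.

    show_action is any callable with the (context, data_dict) signature,
    e.g. toolkit.get_action('package_contributions_show').
    """
    offset = 0
    while True:
        page = show_action(
            {}, {'id': package_id, 'limit': page_size, 'offset': offset}
        )
        records = page['contributions']
        yield from records
        offset += len(records)
        # Stop when a page is empty or every record has been seen.
        # Assumption: 'total' counts the grouped (per-agent) records.
        if not records or offset >= page['total']:
            break


def fake_show(context, data_dict):
    """Hypothetical stand-in for the real action, returning made-up data."""
    agents = [
        {'agent': {'id': str(i)}, 'activities': [], 'affiliations': []}
        for i in range(5)
    ]
    limit, offset = data_dict['limit'], data_dict['offset']
    return {
        'contributions': agents[offset:offset + limit],
        'total': len(agents),
        'offset': offset,
        'page_size': limit or len(agents),
    }


all_records = list(iter_contributions(fake_show, 'PACKAGE_ID', page_size=2))
```

In a CKAN plugin, toolkit.get_action('package_contributions_show') would be passed in place of fake_show.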

agent_affiliations

Show all affiliations for a given agent, optionally limited to a specific dataset/package (plus 'global' affiliations).

data_dict = {
    'agent_id': 'AGENT_ID',
    'package_id': 'PACKAGE_ID'  # optional
}

toolkit.get_action('agent_affiliations')({}, data_dict)

Returns a list of records formatted as such:

{
    'affiliation': {
        # Affiliation.as_dict()
    },
    'other_agent': {
        # Agent.as_dict()
    }
}

attribution_controlled_lists

Returns collections of defined values (which can be modified by using @toolkit.chained_action).

data_dict = {
    'lists': ['NAME1', 'NAME2']  # optional; only return these lists
}

toolkit.get_action('attribution_controlled_lists')({}, data_dict)

There are four collections:

  1. agent_types describes valid types for agents and adds additional detail;
  2. contribution_activity_types contains role/activity taxonomies (i.e. DataCite and CRediT) and lists the available activity values;
  3. contribution_activity_levels is a list of contribution levels (i.e. 'lead', 'equal', and 'supporting', from CRediT);
  4. agent_external_id_schemes describes valid schemes for external IDs (currently, ORCID and ROR).

These collections are useful for validation and for keeping frontend values consistent. They are exposed through an action in order to (a) enable frontend access via AJAX requests, and (b) allow users to override values as needed.
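
Overriding values with a chained action might look roughly like the sketch below. The structure of each collection is not described here, so the example assumes agent_types is a dict keyed by type name; the 'collective' entry, its 'label' detail, and fake_original are all hypothetical. The decorator is left as a comment so that the snippet runs without CKAN installed.

```python
# In a CKAN plugin this function would be decorated with
# @toolkit.chained_action and returned from IActions.get_actions()
# under the key 'attribution_controlled_lists'.
def attribution_controlled_lists(original_action, context, data_dict):
    """Call the extension's action, then add a custom agent type."""
    lists = dict(original_action(context, data_dict))
    if 'agent_types' in lists:
        # Assumption: agent_types is a dict keyed by type name;
        # the real structure may differ.
        agent_types = dict(lists['agent_types'])
        agent_types['collective'] = {'label': 'Collective'}  # hypothetical entry
        lists['agent_types'] = agent_types
    return lists


def fake_original(context, data_dict):
    """Hypothetical stand-in for the extension's own action."""
    return {'agent_types': {'person': {}, 'org': {}}}


result = attribution_controlled_lists(fake_original, {}, {})
```

Copying the dicts before modifying them avoids changing whatever object the original action returned.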

agent_external_search

Search external sources (ORCID and ROR) for agent data. Ignores records that already exist in the database.

data_dict = {
    'q': 'QUERY_STRING',
    'sources': ['SOURCE1', 'SOURCE2']  # optional; only search these sources
}

toolkit.get_action('agent_external_search')({}, data_dict)

Results are returned formatted as such:

{
    'SCHEME_NAME': {
        'records': [
            # list of agent dicts
        ],
        'remaining': 10000  # number of other records found
    }
}
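
Because results are keyed by scheme, they are straightforward to summarise. The sketch below is illustrative only: summarise_external_results is a hypothetical helper, the sample data is made up, and it reads 'remaining' as the number of matches found beyond those returned in 'records'.

```python
def summarise_external_results(results):
    """Count returned and total matches for each external scheme."""
    summary = {}
    for scheme, found in results.items():
        shown = len(found['records'])
        # Assumption: 'remaining' excludes the records already returned.
        summary[scheme] = {
            'shown': shown,
            'total_found': shown + found['remaining'],
        }
    return summary


# Made-up results in the documented shape.
sample_results = {
    'orcid': {'records': [{'name': 'A'}, {'name': 'B'}], 'remaining': 8},
    'ror': {'records': [], 'remaining': 0},
}

summary = summarise_external_results(sample_results)
```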

agent_external_read

Read data from an external source like ORCID or ROR, either from an existing record or a new external ID.

# EITHER
data_dict_existing = {
    'agent_id': 'AGENT_ID',
    'diff': False
    # optional; only show values that differ from the record's current values (default False)
}

# OR
data_dict_new = {
    'external_id': 'EXTERNAL_ID',
    'external_id_scheme': 'orcid'  # or 'ror', etc.
}

toolkit.get_action('agent_external_read')({}, data_dict_existing)  # or data_dict_new
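
Since the two forms of data_dict are alternatives, a small helper can make the choice explicit. This is a sketch under the assumption that the two forms should not be mixed; external_read_params is a hypothetical name, not a function provided by the extension.

```python
def external_read_params(agent_id=None, external_id=None,
                         external_id_scheme=None, diff=False):
    """Build a data_dict for agent_external_read in one of its two forms."""
    has_new = external_id is not None and external_id_scheme is not None
    has_any_new = external_id is not None or external_id_scheme is not None
    if agent_id is not None and not has_any_new:
        # Existing record: look up by agent ID, optionally as a diff.
        return {'agent_id': agent_id, 'diff': diff}
    if agent_id is None and has_new:
        # New record: look up by external ID and scheme.
        return {'external_id': external_id,
                'external_id_scheme': external_id_scheme}
    raise ValueError(
        'Provide either agent_id, or external_id with external_id_scheme.'
    )


existing = external_read_params(agent_id='AGENT_ID', diff=True)
new = external_read_params(external_id='EXTERNAL_ID', external_id_scheme='orcid')
```

Either result could then be passed as the data_dict to toolkit.get_action('agent_external_read').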

Commands

NB: you will have to install the optional [cli] packages to use several of these commands.

initdb

ckan -c $CONFIG_FILE attribution initdb
Initialise database tables.

sync

ckan -c $CONFIG_FILE attribution sync $OPTIONAL_ID $ANOTHER_OPTIONAL_ID
Retrieve up-to-date information from external APIs for contributors with an external ID set.

refresh-packages

ckan -c $CONFIG_FILE attribution refresh-packages $OPTIONAL_ID $ANOTHER_OPTIONAL_ID
Update the author string for all (or the specified) packages.

agent-external-search

ckan -c $CONFIG_FILE attribution agent-external-search --limit 10 $OPTIONAL_ID $ANOTHER_OPTIONAL_ID
Search external APIs for contributors without an external ID set. Run refresh-packages and rebuild the search index after this command.

merge-agents

ckan -c $CONFIG_FILE attribution merge-agents --q $SEARCH_QUERY --match-threshold 75
Find agents with similar names (optionally matching the search query) and merge them. Run refresh-packages and rebuild the search index after this command.

migratedb

ckan -c $CONFIG_FILE attribution migratedb --limit 10 --dry-run --no-search-api
Attempt to extract names of contributors from author fields and convert them to the new format.

  - --limit will only convert a certain number of packages at a time.
  - --dry-run prevents saving to the database.
  - --no-search-api just extracts the names, without searching external APIs for contributors after.

It is recommended to run merge-agents, refresh-packages, and rebuild the search index after running this command.
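
The recommended follow-up steps could be scripted along these lines. This is a sketch, not a supported workflow: post_migration_commands and the config path are hypothetical, merge-agents is called without options on the assumption that they are optional, and the index rebuild uses CKAN's core search-index rebuild command. The commands are only printed; the subprocess call is left commented out.

```python
def post_migration_commands(config_file):
    """Return the follow-up commands, in the recommended order."""
    base = ['ckan', '-c', config_file]
    return [
        base + ['attribution', 'merge-agents'],
        base + ['attribution', 'refresh-packages'],
        base + ['search-index', 'rebuild'],  # CKAN core command
    ]


# Hypothetical config path; replace with the real one.
commands = post_migration_commands('/etc/ckan/default/ckan.ini')
for argv in commands:
    print(' '.join(argv))
    # To execute instead of printing (requires `import subprocess`):
    # subprocess.run(argv, check=True)
```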