Data Curation API (3)

This API provides services to curate and enrich Business Partner and address data.

Servers
Mock server

https://idp.cdq.com/_mock/apis/data-curation-api/api-v3/

Production SOAP

https://api.cdq.com/data-curation/soap/v3/

Production

https://api.cdq.com/data-curation/

Batch Curation

Everything about Batch Curation

Operations

Start Curation Job

Request

Start a batch curation job on a given storage.

Security
basicAuth
Headers
X-Credential-Username string required

Username passed in the X-Credential-Username header. The value can be either:

  • username (e.g. "lukaszmichta")
  • user id (e.g. "87b1bdb1-ba87-4522-b363-c5a0e6e917b3")
Example: 87b1bdb1-ba87-4522-b363-c5a0e6e917b3
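
Every operation in this section combines basicAuth with the X-Credential-Username header. The following Python sketch shows one way to assemble both; build_headers is a hypothetical helper name, not part of the API, and the Content-Type entry is only needed for requests that carry a JSON body.

```python
import base64


def build_headers(username: str, password: str, credential_username: str) -> dict:
    """Build HTTP Basic auth headers plus the X-Credential-Username header.

    build_headers is an illustrative helper, not something the API provides.
    """
    # HTTP Basic auth is "username:password", base64-encoded.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {
        "Authorization": f"Basic {token}",
        # Either a username or a user ID is accepted here, per the description above.
        "X-Credential-Username": credential_username,
        "Content-Type": "application/json",
    }


headers = build_headers("user", "pass", "87b1bdb1-ba87-4522-b363-c5a0e6e917b3")
```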
Body application/json required
name string <= 50 characters

Name of a Job.

Example: "Process vendor data."
description string <= 200 characters

Detailed description of a Job.

Example: "I started this job to improve quality of our data."
storageId string (BusinessPartnerStorageId) required

Unique identifier of the Storage.

Example: "72d6900fce6b326088f5d9d91049e3e6"
dataSourceIds Array of strings (BusinessPartnerStorageDataSourceId)

If set, only records that belong to the data sources identified by these IDs are processed. By default, records from all data sources of the storage are processed, subject to the other filters.

Example: ["648824a691d8d2503d65103e"]
countryShortNames Array of strings (CountryShortName)

If set, only records that belong to the countries identified by these short names are processed. By default, records from all countries in the storage are processed, subject to the other filters.

Example: ["CH"]
workers integer [ 1 .. 8 ]

Number of workers to be used for the job.

Default 1
Example: 3
profile string (Profile)
Enum: "STANDARD", "ADDRESS_ONLY", "STANDARD_ADDRESS_CURATION_AND_ENRICHMENT", "ADDRESS_STANDARDIZATION", "ADDRESS_TRANSLATION", "BUSINESS_PARTNER_ONLY", "FEATURES_OFF", "NATURAL_PERSON_SCREENING", "PRECURATION", "GOLDEN_RECORD"
Example: "STANDARD"
language string (LanguageTechnicalKey)

ISO 639-1 two-letter language code.

Example: "DE"
outputCharsets Array of objects (OutputCharset)

List of Output Character Sets.

addressDataSources object (AddressDataSources)

Preferred data sources for curation. Default PrimaryAddressDataSource is HERE. Default SecondaryAddressDataSource is CDQ.

developmentOptions Array of objects (DevelopmentOption)

List of Development Options.

addressCurationLevelThreshold string (CurationLevel)

Indicator of curation quality. Defines how good the curation was.

  • UNKNOWN: The curation level could not be determined.
  • LEVEL_1: The address was not found by CDQ in the external data sources used.
  • LEVEL_2: The address was found, but there were significant changes in critical fields.
  • LEVEL_3: The address was found, and there were minor changes in highly important fields.
  • LEVEL_4: The address was found by CDQ. There were changes only in less critical fields such as address/premise or address/thoroughfare/number.
  • LEVEL_5: The address was found by CDQ, and no major changes were made because the address was correct.
  • LEVEL_6: The address was found in the shared CDQ data pool. Another company uses the same address, which is a very reliable indicator that the address is correct (available only in an alpha version).

Additional documentation can be found here.

Example: "LEVEL_1"
fields Array of strings

This field is deprecated.

Items Enum: "formattedAddress", "legalAddress", "companyStatus", "classifications"
Example: ["formattedAddress"]
featuresOn Array of strings (Feature)

List of features to be activated.

Items Enum: "ACTIVATE_DATASOURCE_BVD", "ACTIVATE_DATASOURCE_DNB", "ALL_ADDRESS_VERSIONS", "CAPITALIZE_ADDRESS", "CONFIRM_IDENTIFIERS", "DETECT_INDUSTRIAL_ZONE", "ENABLE_FUZZY_ENRICHMENTS", "ENABLE_SETTINGS", "ENABLE_UNALLOWED_NAME_VALUE_VALIDATION", "ENRICH_ADDRESS"
Example: ["ENRICH_ADDRESS"]
featuresOff Array of strings (Feature)

List of features to be deactivated.

Items Enum: "ACTIVATE_DATASOURCE_BVD", "ACTIVATE_DATASOURCE_DNB", "ALL_ADDRESS_VERSIONS", "CAPITALIZE_ADDRESS", "CONFIRM_IDENTIFIERS", "DETECT_INDUSTRIAL_ZONE", "ENABLE_FUZZY_ENRICHMENTS", "ENABLE_SETTINGS", "ENABLE_UNALLOWED_NAME_VALUE_VALIDATION", "ENRICH_ADDRESS"
Example: ["ENRICH_ADDRESS"]
configurationId string (DataCurationConfigurationId)

ID of the configuration used to set up curation. If provided, the configuration supplies the parameters listed below. Any of these parameters that is also set in this request overrides the value from the configuration, except for features, which are merged:

  • outputLanguageTechnicalKey
  • addressDataSources
  • profile
  • featuresOn
  • featuresOff
  • outputCharsets
  • addressCurationLevelThreshold
  • numberSeparator
  • inputAddressConceptsIgnored
Example: "5c5356588c72a028c448adbd"
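
The override rule above is applied by the service, not by the client. The Python sketch below only illustrates how the result might be reasoned about; effective_settings is a hypothetical function, and treating "merged" as an order-preserving union of the feature lists is an assumption, since the exact merge behavior is not specified here.

```python
# Keys that are merged rather than overwritten, per the description above.
MERGED_KEYS = ("featuresOn", "featuresOff")


def effective_settings(configuration: dict, request: dict) -> dict:
    """Illustrate how request parameters might combine with a stored configuration.

    Assumption: "merged" means a union of both feature lists. The service
    may resolve features differently.
    """
    result = dict(configuration)
    for key, value in request.items():
        if key in MERGED_KEYS:
            merged = list(configuration.get(key, []))
            merged += [feature for feature in value if feature not in merged]
            result[key] = merged
        else:
            # Any other parameter set in the request overrides the configuration.
            result[key] = value
    return result


stored = {"profile": "STANDARD", "featuresOn": ["ENRICH_ADDRESS"]}
sent = {"profile": "ADDRESS_ONLY", "featuresOn": ["CAPITALIZE_ADDRESS"]}
combined = effective_settings(stored, sent)
```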
tags Array of strings (JobTags) <= 12 items, unique

List of Job Tags.

Example: ["Reporting"]
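
Several of the body fields above carry constraints (name, description, workers, tags, and the required storageId). A client may want to check them before sending the request. The following is a minimal Python sketch; validate_job_request is a hypothetical helper, and the service remains the authority on what it accepts.

```python
def validate_job_request(body: dict) -> list:
    """Return a list of problems found; an empty list means none were found.

    Checks only the constraints stated in the field list above.
    """
    errors = []
    if not body.get("storageId"):
        errors.append("storageId is required")
    if len(body.get("name", "")) > 50:
        errors.append("name must be at most 50 characters")
    if len(body.get("description", "")) > 200:
        errors.append("description must be at most 200 characters")
    workers = body.get("workers", 1)  # the documented default is 1
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(workers, bool) or not isinstance(workers, int) or not 1 <= workers <= 8:
        errors.append("workers must be an integer between 1 and 8")
    tags = body.get("tags", [])
    if len(tags) > 12:
        errors.append("tags must contain at most 12 items")
    if len(set(tags)) != len(tags):
        errors.append("tags must be unique")
    return errors


problems = validate_job_request({"storageId": "72d6900fce6b326088f5d9d91049e3e6", "workers": 3})
```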
taskItemSize integer

Size of a task item. By default, the size is 100.

Example: 100
waitThreshold integer

Wait threshold. By default, the threshold is 1000.

Example: 1000
jobDependencies Array of objects (JobDependency)

List of Job Dependencies.

processingLogId string (ProcessingLogId)

ID of the processing log to which the worker upserts data. In the future, when a processing log ID is provided, data will no longer be upserted to the job result storage.

Example: "CURATION_LOG"
processingLogTriggerType string (ProcessingLogTriggerType)

Processing log trigger type, which indicates how the curation item or job was triggered.

Example: "INITIALIZED"
modifiedAfter string

If set, the job evaluates only business partners that were modified after the given date, expressed in ISO 8601 format.

Example: "2023-07-06T12:14:03.204Z"
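
A value in the same shape as the modifiedAfter example (UTC, millisecond precision, trailing Z) can be produced with the Python standard library. This is a sketch; to_modified_after is a hypothetical helper, and whether the service accepts other ISO 8601 variants is not stated here.

```python
from datetime import datetime, timezone


def to_modified_after(moment: datetime) -> str:
    """Format a timezone-aware datetime like the modifiedAfter example above."""
    if moment.tzinfo is None:
        # A naive datetime would be silently interpreted as local time.
        raise ValueError("moment must be timezone-aware")
    utc = moment.astimezone(timezone.utc)
    # isoformat() yields "+00:00" for UTC; the example uses the "Z" suffix.
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


value = to_modified_after(datetime(2023, 7, 6, 12, 14, 3, 204000, tzinfo=timezone.utc))
```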
optionSkipReport boolean Deprecated

Deprecated and not usable. To create a report, use reportsRequest.

Default true
Example: true
reportsRequest object (DataCurationReportsRequest) Deprecated

Deprecated. Reports are available in the Data Clinic app.

curl -i -X POST \
  -u <username>:<password> \
  https://idp.cdq.com/_mock/apis/data-curation-api/api-v3/public/curationjobs \
  -H 'Content-Type: application/json' \
  -H 'X-Credential-Username: 87b1bdb1-ba87-4522-b363-c5a0e6e917b3' \
  -d '{
    "name": "Process vendor data.",
    "description": "I started this job to improve quality of our data.",
    "storageId": "72d6900fce6b326088f5d9d91049e3e6",
    "dataSourceIds": [
      "648824a691d8d2503d65103e"
    ],
    "countryShortNames": [
      "CH"
    ],
    "workers": 3,
    "profile": "STANDARD",
    "language": "DE",
    "outputCharsets": [
      {
        "concept": "ADDRESS",
        "charset": "LATIN"
      }
    ],
    "addressDataSources": {
      "primaryAddressDataSource": {
        "technicalKey": "HERE",
        "threshold": "0.4"
      },
      "secondaryAddressDataSources": [
        {
          "technicalKey": "HERE",
          "threshold": "0.4"
        }
      ]
    },
    "developmentOptions": [
      {
        "element": "N/A",
        "countryShortName": "CH",
        "customization": "N/A"
      }
    ],
    "addressCurationLevelThreshold": "LEVEL_1",
    "fields": [
      "formattedAddress"
    ],
    "featuresOn": [
      "ENRICH_ADDRESS"
    ],
    "featuresOff": [
      "ENRICH_ADDRESS"
    ],
    "optionSkipReport": true,
    "reportsRequest": {
      "dataCurationJobId": "a34fb367-85aa-400f-b369-53863432050c",
      "name": "Data Curation Reports Job",
      "description": "The report will be generated for the Data Curation Job with ID: a34fb367-85aa-400f-b369-53863432050c.",
      "tags": [
        "Reporting"
      ],
      "jobDependencies": [
        {
          "id": "35f23c03-1c22-45fe-9484-3ffe769325de",
          "strategy": "SCHEDULE_NEXT_WHEN_FINISHED"
        }
      ],
      "reportsConfiguration": {
        "addressCuration": {
          "build": true
        },
        "legalEntityCuration": {
          "build": true
        },
        "naturalPersonScreening": {
          "build": true
        },
        "vatRegistrationData": {
          "build": true
        }
      }
    },
    "configurationId": "5c5356588c72a028c448adbd",
    "tags": [
      "Reporting"
    ],
    "taskItemSize": 100,
    "waitThreshold": 1000,
    "jobDependencies": [
      {
        "id": "35f23c03-1c22-45fe-9484-3ffe769325de",
        "strategy": "SCHEDULE_NEXT_WHEN_FINISHED"
      }
    ],
    "processingLogId": "CURATION_LOG",
    "processingLogTriggerType": "INITIALIZED",
    "modifiedAfter": "2023-07-06T12:14:03.204Z"
  }'
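
The same request can be assembled in Python. The sketch below is illustrative only: build_start_job_request is a hypothetical helper, BASE_URL points at the mock server listed on this page, and the function constructs the request without sending it. Only storageId is required, so every other field is optional.

```python
import json
import urllib.request

# Mock server URL from the Servers list; swap in the production URL as needed.
BASE_URL = "https://idp.cdq.com/_mock/apis/data-curation-api/api-v3"


def build_start_job_request(headers: dict, storage_id: str, **options) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts a curation job.

    headers should carry basicAuth credentials, X-Credential-Username,
    and Content-Type: application/json.
    """
    body = {"storageId": storage_id, **options}
    return urllib.request.Request(
        url=f"{BASE_URL}/public/curationjobs",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_start_job_request(
    {"Content-Type": "application/json"},
    "72d6900fce6b326088f5d9d91049e3e6",
    name="Process vendor data.",
    profile="STANDARD",
    workers=3,
)
# Sending it would be: urllib.request.urlopen(request)
```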

Responses

OK

Body application/json
id string (JobId)

Unique identifier of a job.

Example: "35f23c03-1c22-45fe-9484-3ffe769325de"
name string

Name of a Job.

Example: "Process vendor data"
description string

Detailed description of a Job.

Example: "I started this job to improve quality of our data."
storageId string (BusinessPartnerStorageId)

Unique identifier of the Storage.

Example: "72d6900fce6b326088f5d9d91049e3e6"
dataSourceIds Array of strings (BusinessPartnerStorageDataSourceId)

List of Data Source IDs.

Example: ["648824a691d8d2503d65103e"]
countryShortNames Array of strings (CountryShortName)

List of country short names.

Example: ["CH"]
status string (JobStatus)

Curation Job execution status.

Enum: "UNKNOWN", "CREATED", "PERSISTED", "SCHEDULED", "WAITING", "RUNNING", "FINISHED", "DIED", "CANCELED", "FAILED"
Example: "RUNNING"
statusMessage string (JobStatusMessage)

Additional information to explain the status.

Example: "The job failed because storage is empty."
createdAt string (CreatedAt)

Date of creation (ISO 8601-compliant).

Example: "2020-08-31T16:47+00:00"
user string (JobUser)

ID of (human) user or API key.

Example: "742429-234242-4343-232323"
progress integer (JobProgress) [ 0 .. 100 ]

Progress (%) of the job.

Example: 77
attachments Array of objects (FileResource)

List of attachments.

reportsJobId string

Unique identifier of a Reports Job.

Example: "6be92567-4327-4463-813f-a8c990410d79"
reportsConfiguration object (DataCurationReportsConfiguration)

Configures if and how Data Curation reports are generated.

jobDependencies Array of objects (JobDependency)

List of Job Dependencies.

Response
application/json
{
  "id": "35f23c03-1c22-45fe-9484-3ffe769325de",
  "name": "Process vendor data",
  "description": "I started this job to improve quality of our data.",
  "storageId": "72d6900fce6b326088f5d9d91049e3e6",
  "dataSourceIds": [
    "648824a691d8d2503d65103e"
  ],
  "countryShortNames": [
    "CH"
  ],
  "status": "RUNNING",
  "statusMessage": "The job failed because storage is empty.",
  "createdAt": "2020-08-31T16:47+00:00",
  "user": "742429-234242-4343-232323",
  "progress": 77,
  "attachments": [
    {}
  ],
  "reportsJobId": "6be92567-4327-4463-813f-a8c990410d79",
  "reportsConfiguration": {
    "addressCuration": {},
    "legalEntityCuration": {},
    "naturalPersonScreening": {},
    "vatRegistrationData": {}
  },
  "jobDependencies": [
    {}
  ]
}

Poll Curation Job

Request

After you start a curation job, the response contains a job ID in the form { "id": <ID> }. Use this ID to poll the status of the job with this endpoint. Once the status is FINISHED, you can download the results.

Security
basicAuth
Path
id string (JobId) required

ID of the Data Curation job.

Example: 35f23c03-1c22-45fe-9484-3ffe769325de
Headers
X-Credential-Username string required

Username passed in the X-Credential-Username header. The value can be either:

  • username (e.g. "lukaszmichta")
  • user id (e.g. "87b1bdb1-ba87-4522-b363-c5a0e6e917b3")
Example: 87b1bdb1-ba87-4522-b363-c5a0e6e917b3
curl -i -X GET \
  -u <username>:<password> \
  https://idp.cdq.com/_mock/apis/data-curation-api/api-v3/public/curationjobs/35f23c03-1c22-45fe-9484-3ffe769325de \
  -H 'X-Credential-Username: 87b1bdb1-ba87-4522-b363-c5a0e6e917b3'
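
Polling until the job is done can be sketched as a small loop. In the Python example below, wait_for_job and its fetch_job callback are hypothetical names; fetch_job stands for whatever code performs the GET request above and returns the parsed JSON. Only FINISHED is described here as the success state. Treating FAILED, CANCELED and DIED as states that also end polling is an assumption.

```python
import time
from typing import Callable

# Assumption: these statuses will not change again once reached.
TERMINAL_STATUSES = {"FINISHED", "FAILED", "CANCELED", "DIED"}


def wait_for_job(
    fetch_job: Callable[[], dict],
    interval_seconds: float = 5.0,
    max_attempts: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_job until the job reaches a terminal status or attempts run out."""
    for attempt in range(max_attempts):
        job = fetch_job()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"job did not finish after {max_attempts} polls")


# Usage with a stand-in for the real HTTP call:
responses = iter([{"status": "SCHEDULED"}, {"status": "RUNNING"}, {"status": "FINISHED"}])
job = wait_for_job(lambda: next(responses), sleep=lambda seconds: None)
```

The sleep parameter is injectable so the loop can be exercised without real delays; callers should still check the returned status, because a terminal status is not necessarily FINISHED.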

Responses

OK

Body application/json
id string (JobId)

Unique identifier of a job.

Example: "35f23c03-1c22-45fe-9484-3ffe769325de"
name string

Name of a Job.

Example: "Process vendor data"
description string

Detailed description of a Job.

Example: "I started this job to improve quality of our data."
storageId string (BusinessPartnerStorageId)

Unique identifier of the Storage.

Example: "72d6900fce6b326088f5d9d91049e3e6"
dataSourceIds Array of strings (BusinessPartnerStorageDataSourceId)

List of Data Source IDs.

Example: ["648824a691d8d2503d65103e"]
countryShortNames Array of strings (CountryShortName)

List of country short names.

Example: ["CH"]
status string (JobStatus)

Curation Job execution status.

Enum: "UNKNOWN", "CREATED", "PERSISTED", "SCHEDULED", "WAITING", "RUNNING", "FINISHED", "DIED", "CANCELED", "FAILED"
Example: "RUNNING"
statusMessage string (JobStatusMessage)

Additional information to explain the status.

Example: "The job failed because storage is empty."
createdAt string (CreatedAt)

Date of creation (ISO 8601-compliant).

Example: "2020-08-31T16:47+00:00"
user string (JobUser)

ID of (human) user or API key.

Example: "742429-234242-4343-232323"
progress integer (JobProgress) [ 0 .. 100 ]

Progress (%) of the job.

Example: 77
attachments Array of objects (FileResource)

List of attachments.

reportsJobId string

Unique identifier of a Reports Job.

Example: "6be92567-4327-4463-813f-a8c990410d79"
reportsConfiguration object (DataCurationReportsConfiguration)

Configures if and how Data Curation reports are generated.

jobDependencies Array of objects (JobDependency)

List of Job Dependencies.

Response
application/json
{
  "id": "35f23c03-1c22-45fe-9484-3ffe769325de",
  "name": "Process vendor data",
  "description": "I started this job to improve quality of our data.",
  "storageId": "72d6900fce6b326088f5d9d91049e3e6",
  "dataSourceIds": [
    "648824a691d8d2503d65103e"
  ],
  "countryShortNames": [
    "CH"
  ],
  "status": "RUNNING",
  "statusMessage": "The job failed because storage is empty.",
  "createdAt": "2020-08-31T16:47+00:00",
  "user": "742429-234242-4343-232323",
  "progress": 77,
  "attachments": [
    {}
  ],
  "reportsJobId": "6be92567-4327-4463-813f-a8c990410d79",
  "reportsConfiguration": {
    "addressCuration": {},
    "legalEntityCuration": {},
    "naturalPersonScreening": {},
    "vatRegistrationData": {}
  },
  "jobDependencies": [
    {}
  ]
}

Read Business Partner Curation Batch Results

Request

Retrieves the curation results for a particular job.

Security
basicAuth
Path
id string (JobId) required

ID of the Data Curation job.

Example: 35f23c03-1c22-45fe-9484-3ffe769325de
Query
businessPartnerId Array of strings (BusinessPartnerId)

Business Partner IDs to filter the results by.

Example: businessPartnerId=63e635235c06b7396330fe40
startAfter string (StartAfter)

Used to retrieve the next page of results.

Example: startAfter=5712566172571652
limit integer (int32) [ 1 .. 100 ]

Number of results to fetch. A page can contain at most 100 results.

Default 100
Example: limit=50
Headers
X-Credential-Username string required

Username passed in the X-Credential-Username header. The value can be either:

  • username (e.g. "lukaszmichta")
  • user id (e.g. "87b1bdb1-ba87-4522-b363-c5a0e6e917b3")
Example: 87b1bdb1-ba87-4522-b363-c5a0e6e917b3
curl -i -X GET \
  -u <username>:<password> \
  'https://idp.cdq.com/_mock/apis/data-curation-api/api-v3/public/v2/curationjobs/35f23c03-1c22-45fe-9484-3ffe769325de/results?businessPartnerId=63e635235c06b7396330fe40&startAfter=5712566172571652&limit=50' \
  -H 'X-Credential-Username: 87b1bdb1-ba87-4522-b363-c5a0e6e917b3'
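
The query string of the request above can be built programmatically. In this Python sketch, build_results_url is a hypothetical helper; sending several Business Partner IDs as repeated businessPartnerId parameters is an assumption, since the example shows only a single value.

```python
from urllib.parse import quote, urlencode


def build_results_url(base_url, job_id, business_partner_ids=(), start_after=None, limit=100):
    """Build the results URL for a curation job, enforcing the documented limit range."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    # Assumption: array values are sent as repeated query parameters.
    params = [("businessPartnerId", bp_id) for bp_id in business_partner_ids]
    if start_after is not None:
        params.append(("startAfter", start_after))
    params.append(("limit", limit))
    path = f"/public/v2/curationjobs/{quote(job_id, safe='')}/results"
    return f"{base_url}{path}?{urlencode(params)}"


url = build_results_url(
    "https://idp.cdq.com/_mock/apis/data-curation-api/api-v3",
    "35f23c03-1c22-45fe-9484-3ffe769325de",
    business_partner_ids=["63e635235c06b7396330fe40"],
    start_after="5712566172571652",
    limit=50,
)
```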

Responses

OK

Body application/json
startAfter string (StartAfter)

The ID which is used to read the page.

Example: "5712566172571652"
limit integer (Limit)

Number of items per page.

Example: 100
total integer (PageTotal)

Total number of items which can be paged.

Example: 67
values Array of objects (CurationJobResult)

List of Curation Job Results.

nextStartAfter string (NextStartAfter)

Value to pass as startAfter in the request for the next page.

Example: "5712566172571652"
Response
application/json
{
  "startAfter": "5712566172571652",
  "limit": 100,
  "total": 67,
  "values": [
    {}
  ],
  "nextStartAfter": "5712566172571652"
}
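
Reading all results means following nextStartAfter from page to page. The Python sketch below shows one way to do that; iter_results and its fetch_page callback are hypothetical names, with fetch_page standing for the code that performs the GET request and returns the parsed JSON. How the last page is signaled is not stated here, so the loop assumes that nextStartAfter is then missing, empty, or unchanged.

```python
from typing import Callable, Iterator, Optional


def iter_results(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every item in values across pages by following nextStartAfter."""
    start_after = None  # the first request carries no startAfter
    while True:
        page = fetch_page(start_after)
        yield from page.get("values", [])
        next_start = page.get("nextStartAfter")
        # Assumption: a missing, empty, or repeated cursor marks the last page.
        if not next_start or next_start == start_after:
            return
        start_after = next_start


# Usage with canned pages in place of real HTTP calls:
pages = {
    None: {"values": [{"n": 1}, {"n": 2}], "nextStartAfter": "a"},
    "a": {"values": [{"n": 3}]},
}
results = list(iter_results(lambda cursor: pages[cursor]))
```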

Business Partners

Everything about Business Partners

Processing Logs

Configuration

Cache

Addresses