Creating a harmonisation layer
About this guide
In this guide you will learn to create a data harmonisation layer that unifies disparate data fields, dimensions, and more into a composite dataset. The Data API configuration DSL supports building a harmonisation layer with the top-level function $map.
Steps
1 - Create a new Data API
Navigate to https://demyst.com/app/create-api and select the Connectors that you want to include in your new Data API. If you need help, follow the guide on leveraging multiple connectors.
2 - Configure data harmonisation through the refine section
The refine option lets you define custom attributes and build the harmonisation layer, which is evaluated and returned in the response. Its expressions create an abstraction layer in which disparate data fields, formats, dimensions, and columns are unified into a composite output. This gives your downstream users and processes democratic access to clean, conforming, high-quality data.
Add your low-code JSON snippet to the configuration section of the Data API.
{
  "providers": {
    "zoominfo_company_enrich": {
      "version": "$latest"
    },
    "hosted_experian_cpdb": {
      "version": "$latest"
    },
    "hosted_infogroup_business_places": {
      "version": "$latest"
    }
  },
  "refine": {
    "sales": {
      "$firstOf": [
        "hosted_infogroup_business_places.results[0].location_sales_volume",
        "hosted_experian_cpdb.results[0].est_annual_sales_amt",
        {
          "zoominfo_company_enrich.data[0].es_revenue": {
            "$divide": 12
          }
        }
      ]
    }
  },
  "config": {
    "mode": "cache",
    "return_raw_data": true,
    "return_flattened_data": true,
    "fixed_list_size": 3
  }
}
That configuration can be saved in the Data API directly.
Your API is now configured.
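To make the intent of the sales attribute concrete, here is a minimal Python sketch of how the refine expression in this step might be read. It is not Demyst's implementation. It assumes that $firstOf yields the first candidate that has a value and that $divide divides the upstream value by its argument. The helper names (first_of, divide, harmonised_sales) and the flat dictionary keyed by attribute path are hypothetical and exist only for illustration.

```python
from typing import Mapping, Optional


def first_of(*values: Optional[float]) -> Optional[float]:
    """Return the first value that is not None (assumed $firstOf behaviour)."""
    for value in values:
        if value is not None:
            return value
    return None


def divide(value: Optional[float], divisor: float) -> Optional[float]:
    """Divide an upstream value, passing a missing value through unchanged."""
    return None if value is None else value / divisor


def harmonised_sales(upstream: Mapping[str, float]) -> Optional[float]:
    """Mirror the 'sales' expression: try each provider attribute in order."""
    return first_of(
        upstream.get("hosted_infogroup_business_places.results[0].location_sales_volume"),
        upstream.get("hosted_experian_cpdb.results[0].est_annual_sales_amt"),
        # Only the ZoomInfo figure is divided by 12, as in the config.
        divide(upstream.get("zoominfo_company_enrich.data[0].es_revenue"), 12),
    )
```

Under these assumptions, a subject covered only by ZoomInfo falls through to the third candidate, while a subject covered by an earlier provider never reaches it.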
3 - Retrieve harmonised attributes in the results
Once the config is saved, you can run transactions and integrate the newly created Data API with your systems. When you run a request and start receiving data, the refine section appears right at the top of the response.
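As a rough illustration of consuming the harmonised attributes, the sketch below reads one value from a response body. The response layout is an assumption: this guide only states that a refine section is returned, so the top-level "refine" key, the sample body, and the get_refined helper are all hypothetical.

```python
import json
from typing import Any, Optional


def get_refined(response_text: str, attribute: str) -> Optional[Any]:
    """Return one harmonised attribute, or None if it is absent.

    Assumes the response is JSON with a top-level "refine" object;
    check an actual response for the real layout.
    """
    body = json.loads(response_text)
    return body.get("refine", {}).get(attribute)


# Hypothetical sample body, not real provider output.
sample = '{"refine": {"sales": 250000}}'
sales = get_refined(sample, "sales")
```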
4 (Optional) - Play around with the config
You can add further elements from Demyst's proprietary DSL and play around with the config. Refer to the Top-level functions and Arithmetic Functions.
Additional Details
JSON Syntax
Arguments
A mapping from upstream → custom attributes.
Returns
The custom attribute, or the upstream attribute if no argument matched.
Notes
$map needs to be called together with another function. If you only want to change a single attribute, you can use the $at function.
Payload Example
The following payload standardises the upstream values of living status and home owner status.
{
  "$firstOf": [
    "domain_from_email.details.demographics.living_status",
    "provider_5.home_owner_status"
  ],
  "$map": {
    "r": "renter",
    "rent": "renter",
    "o": "owner",
    "homeowner": "owner",
    "own": "owner"
  }
}
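The payload above can be read as "pick the first available upstream value, then standardise it". The Python sketch below follows the Arguments and Returns description: a mapped value is replaced, and an unmatched value is returned unchanged. It assumes exact, case-sensitive matching and that $firstOf skips missing values. The function names are hypothetical and this is not the DSL's actual implementation.

```python
from typing import Mapping, Optional

# Same mapping as in the payload: upstream value -> custom value.
STATUS_MAP = {
    "r": "renter",
    "rent": "renter",
    "o": "owner",
    "homeowner": "owner",
    "own": "owner",
}


def first_of(*values: Optional[str]) -> Optional[str]:
    """Return the first value that is not None (assumed $firstOf behaviour)."""
    for value in values:
        if value is not None:
            return value
    return None


def map_value(value: Optional[str], mapping: Mapping[str, str]) -> Optional[str]:
    """Return the custom value, or the upstream value if nothing matched."""
    if value is None:
        return None
    return mapping.get(value, value)


def living_status(living: Optional[str], home_owner: Optional[str]) -> Optional[str]:
    """Combine the two upstream attributes into one standardised attribute."""
    return map_value(first_of(living, home_owner), STATUS_MAP)
```

For example, living_status("r", None) gives "renter", while an unmapped value such as "unknown" is passed through as-is.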
$map can also be applied to each individual attribute passed into $firstOf. The example below shows that the same code, C006, can be mapped to different values depending on whether it comes from Attom or Quantarium.
{
  "providers": {
    "attomdata_attom_id": {
      "version": "$latest"
    },
    "hosted_attom_residential_tax_assessor": {
      "version": "$latest",
      "inputs": {
        "attom_id": "attomdata_attom_id.attom_id"
      },
      "when": {
        "attomdata_attom_id": "$isMatch"
      }
    },
    "hosted_quantarium_open_lien": {
      "version": "$latest"
    }
  },
  "inputs": {
    "street": "221 Clinton Ave",
    "city": "brooklyn",
    "state": "ny",
    "post_code": "11205",
    "country": "us"
  },
  "refine": {
    "route_demo": {
      "$firstOf": [
        {
          "$at": "hosted_attom_residential_tax_assessor.results[0].contact_owner_mail_address_crrt",
          "$map": {
            "C006": "City"
          }
        },
        {
          "$at": "hosted_quantarium_open_lien.results[0].pa_carrier_route",
          "$map": {
            "C006": "County"
          }
        }
      ],
      "$map": {
        "C001": "Street",
        "C002": "Road"
      }
    }
  }
}
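One way to reason about the nested mappings in route_demo is sketched below in Python. The evaluation order is an assumption: each per-provider $map is applied to its own attribute first, $firstOf then picks the first value that is present, and the outer $map is applied to that result, with unmatched values passed through. The names used here are hypothetical.

```python
from typing import Mapping, Optional

ATTOM_MAP = {"C006": "City"}         # inner $map for the Attom attribute
QUANTARIUM_MAP = {"C006": "County"}  # inner $map for the Quantarium attribute
OUTER_MAP = {"C001": "Street", "C002": "Road"}  # $map applied after $firstOf


def map_value(value: Optional[str], mapping: Mapping[str, str]) -> Optional[str]:
    """Return the mapped value, or the original value if nothing matched."""
    if value is None:
        return None
    return mapping.get(value, value)


def route_demo(attom_code: Optional[str], quantarium_code: Optional[str]) -> Optional[str]:
    """Apply per-provider maps, take the first available value, then the outer map."""
    candidates = (
        map_value(attom_code, ATTOM_MAP),
        map_value(quantarium_code, QUANTARIUM_MAP),
    )
    first = next((c for c in candidates if c is not None), None)
    return map_value(first, OUTER_MAP)
```

With this reading, C006 becomes "City" when it comes from Attom and "County" when only Quantarium returns it, while C001 and C002 are mapped the same way for both providers.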
Example Scenarios
Waterfall products for pre-fill: Use two or more base products (for increased coverage) to extract data attributes that are then passed to downstream Connectors to get additional details about a subject. For example, retrieve a business address in order to get metrics for a property risk analysis.
Input to ML model: Harmonise the output attribute across base products so that it can be used as an input to downstream AI/ML models. This improves efficiency and removes the need for ETL before loading the data into the model.