Specify how to sync data
Your connector must include details on how Netlify should insert documents using data from an instance of your data source type and how to process updates whenever the source data changes.
Sync documents
When your connector first runs in a data layer or a site with visual editing enabled, Netlify calls the connector.sync() API to perform an initial sync from your data source.
The API has access to a models object. This object contains each document model you defined with define.document(), where the keys are the document model names and the values are the insert and delete APIs for that model. For example, if you defined a Post document model, you can use models.Post.insert() and models.Post.delete().
As you configure the actions Netlify should take on initial sync, note the following:
- All documents must have a unique id that matches the ID defined in your data source. Make sure to pass an id value for each document when you call insert(). Even if the data source ID isn’t globally unique, Netlify makes it globally unique using a combination of your connector instance ID, the document model name, and the document’s ID from your data source. For example, [connector-id]-[model-name]-[document.id].
- All relationship fields must contain the raw document ID. Similar to id values, all relationship fields should contain the raw document ID from your data source. Netlify will make the ID globally unique and use it to make the relationship to the correct document type you defined. Learn more about adding documents that have relationship fields.
- The insert model action is an upsert. As a result, calling insert multiple times on objects that contain the same id will update the same stored document. You can use the cache helper to work around this.
- Connect to any data source in this API. Any data source will work, including JSON APIs, GraphQL APIs, and local files such as .csv or Excel files.
- Consider storing cache-related metadata. The connector.sync() API has access to the cache helper, which you can use to store sync-related metadata to help with caching on subsequent syncs. For example, you can store a CMS sync token or a timestamp containing the moment your last sync finished.
For example:
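The following is a minimal sketch of an initial sync, not the SDK's own listing. So that it runs on its own, models is replaced by a small in-memory stand-in; in a real connector Netlify passes models to the connector.sync() callback. The Post fields, the sample data, and the assumption that delete() takes a raw document ID are all illustrative.

```typescript
// Local stand-ins so this sketch runs without the SDK. In a real connector,
// Netlify passes `models` to the connector.sync() callback.
type PostInput = { id: string; title: string };

const postStore = new Map<string, PostInput>();

const syncModels = {
  Post: {
    // Mimics the upsert behaviour: one id always maps to one stored document.
    insert(doc: PostInput): void {
      postStore.set(doc.id, doc);
    },
    // Assumption: delete() takes the raw document id.
    delete(id: string): void {
      postStore.delete(id);
    },
  },
};

// Hypothetical response from a data source; ids are the source's own ids.
const cmsPosts: PostInput[] = [
  { id: "1", title: "Hello world" },
  { id: "2", title: "Second post" },
];

// Body of a connector.sync() callback for the initial sync.
function runInitialSync(models: typeof syncModels): void {
  for (const post of cmsPosts) {
    models.Post.insert({ id: post.id, title: post.title });
  }
  // Inserting an existing id again updates that document instead of adding one.
  models.Post.insert({ id: "1", title: "Hello world (edited)" });
}

runInitialSync(syncModels);
```

After the run, the stand-in store holds two posts, and post 1 carries the edited title because the second insert with the same id replaced the first.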
Add documents that have relationship fields
To insert a document that contains a relationship field, use the raw document ID from your data source. As long as you provide the ID from your data source, Netlify will figure out how to make the relationship between the document types you’ve defined.
For example:
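As a rough sketch of why raw IDs are enough, the code below stores a Post whose author field holds only the Author's raw ID. The toGlobalId helper imitates the [connector-id]-[model-name]-[document.id] format described earlier; it is an illustration of what Netlify does for you, not an SDK function. The connector ID, model names, and field names are hypothetical.

```typescript
type AuthorInput = { id: string; name: string };
// `author` holds the raw Author id from the data source, nothing more.
type PostWithAuthor = { id: string; title: string; author: string };

// Illustration only: imitates the "[connector-id]-[model-name]-[document.id]" format.
function toGlobalId(connectorId: string, modelName: string, rawId: string): string {
  return `${connectorId}-${modelName}-${rawId}`;
}

const CONNECTOR_ID = "my-connector"; // hypothetical connector instance id
const authorsByGlobalId = new Map<string, AuthorInput>();
const postsByGlobalId = new Map<string, PostWithAuthor>();

// Stand-ins for models.Author.insert() and models.Post.insert().
function insertAuthor(author: AuthorInput): void {
  authorsByGlobalId.set(toGlobalId(CONNECTOR_ID, "Author", author.id), author);
}
function insertPost(post: PostWithAuthor): void {
  postsByGlobalId.set(toGlobalId(CONNECTOR_ID, "Post", post.id), post);
}

// Both documents use the raw id "7"; the model name keeps them apart.
insertAuthor({ id: "7", name: "Ada" });
insertPost({ id: "7", title: "Hello", author: "7" });

// Resolving the relationship needs only the raw id plus the target model name.
function resolveAuthor(post: PostWithAuthor): AuthorInput | undefined {
  return authorsByGlobalId.get(toGlobalId(CONNECTOR_ID, "Author", post.author));
}

const storedPost = postsByGlobalId.get(toGlobalId(CONNECTOR_ID, "Post", "7"));
const linkedAuthor = storedPost ? resolveAuthor(storedPost) : undefined;
```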
If relationship fields are union types, they are required to have the ID and type of the relationship. For example:
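A minimal sketch of the data shape, assuming an Author document whose posts field is a union of Post and News. The id key name and the sample values are assumptions; the __typename field is the part the text requires. The grouping helper is hypothetical and only shows how the type tag disambiguates the references.

```typescript
// Each entry carries the raw id plus the model it points to.
type UnionRef = { id: string; __typename: "Post" | "News" };
type AuthorWithPosts = { id: string; name: string; posts: UnionRef[] };

// Shape that would be passed to an insert call for the Author model.
const authorWithUnionPosts: AuthorWithPosts = {
  id: "1",
  name: "Ada",
  posts: [
    { id: "10", __typename: "Post" },
    { id: "20", __typename: "News" },
  ],
};

// Hypothetical helper: groups referenced ids by their declared type.
function groupRefsByType(refs: UnionRef[]): Record<string, string[]> {
  const grouped: Record<string, string[]> = {};
  for (const ref of refs) {
    const ids = grouped[ref.__typename] ?? [];
    ids.push(ref.id);
    grouped[ref.__typename] = ids;
  }
  return grouped;
}

const refsByType = groupRefsByType(authorWithUnionPosts.posts);
```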
In this example, since posts can be either a News or Post document model, a __typename field is required. Netlify will use this field to identify the type of document in the union field.
Update documents
After the initial sync, Netlify calls the connector.sync() API again for all subsequent syncs.
We recommend that you support data caching by only updating documents that have changed since the last sync. However, this may not be possible for some data sources, such as file-based data sources.
The following sections outline how to cache data when data updates, how to use the cache helper to manage sync-related metadata, and how to configure your connector if it does not cache data.
If you can cache data
To support data caching and only update documents that have changed, check the isInitialSync argument in connector.sync(): insert everything when it is true, and apply only the changed data when it is false.
All previously existing documents inserted during connector.sync() will continue to exist unless you modify them (by re-inserting them) or delete them during connector.sync(). The previously existing documents that you don’t modify are cached between data syncs.
The API has access to a models object. This object contains each document model you defined with define.document(), where the keys are the document model names and the values are the insert and delete APIs for that model. For example, if you defined an Author document model, you can use models.Author.insert() and models.Author.delete().
Code example:
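The sketch below shows one way to branch on isInitialSync. It is not the original listing: models is an in-memory stand-in, and the change feed format (upsert and delete entries) is a hypothetical data source response.

```typescript
type AuthorDoc = { id: string; name: string };
// Hypothetical change feed entry returned by a data source.
type AuthorChange = { type: "upsert"; doc: AuthorDoc } | { type: "delete"; id: string };

const authorStore = new Map<string, AuthorDoc>();

// Stand-in for the `models` object Netlify passes to connector.sync().
const authorModels = {
  Author: {
    insert: (doc: AuthorDoc): void => {
      authorStore.set(doc.id, doc);
    },
    delete: (id: string): void => {
      authorStore.delete(id);
    },
  },
};

// Hypothetical data: the full data set and the changes since the last sync.
const allAuthors: AuthorDoc[] = [
  { id: "1", name: "Ada" },
  { id: "2", name: "Grace" },
  { id: "3", name: "Linus" },
];
const changesSinceLastSync: AuthorChange[] = [
  { type: "upsert", doc: { id: "2", name: "Grace Hopper" } },
  { type: "delete", id: "3" },
];

// Body of a connector.sync() callback.
function syncAuthors(args: { models: typeof authorModels; isInitialSync: boolean }): void {
  const { models, isInitialSync } = args;
  if (isInitialSync) {
    // First run: insert everything.
    for (const author of allAuthors) models.Author.insert(author);
    return;
  }
  // Later runs: touch only what changed; untouched documents stay cached.
  for (const change of changesSinceLastSync) {
    if (change.type === "delete") {
      models.Author.delete(change.id);
    } else {
      models.Author.insert(change.doc);
    }
  }
}

syncAuthors({ models: authorModels, isInitialSync: true });
syncAuthors({ models: authorModels, isInitialSync: false });
```

After both runs, author 1 is untouched, author 2 is updated, and author 3 is gone.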
Store cache-related metadata
When you insert and update documents, you can use the cache helper to store and access non-document data about your data sources. For example, you may want to reference a sync token or last updated timestamp from your CMS.
The cache helper is a key/value store that is available as an argument in each connector’s lifecycle, and provides two methods:
- set: pass in a key and value to store or update
- get: pass in a key to retrieve the stored value
For example:
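A minimal sketch of the set and get pattern, assuming a timestamp stored under a hypothetical lastSyncTime key. The cache here is a synchronous in-memory stand-in so the code runs on its own; the real helper may return promises, in which case each call would need to be awaited.

```typescript
// Stand-in for the cache helper: a plain key/value store with set and get.
interface SyncCache {
  set(key: string, value: unknown): void;
  get(key: string): unknown;
}

function createMemoryCache(): SyncCache {
  const entries = new Map<string, unknown>();
  return {
    set: (key, value) => {
      entries.set(key, value);
    },
    get: (key) => entries.get(key),
  };
}

// Reads the previous sync time, decides what to fetch, then stores the new time.
// "lastSyncTime" is a hypothetical key name.
function runSyncWithCache(cache: SyncCache, now: string): string {
  const lastSync = cache.get("lastSyncTime");
  const mode = typeof lastSync === "string" ? `changes since ${lastSync}` : "everything";
  cache.set("lastSyncTime", now);
  return mode;
}

const memoryCache = createMemoryCache();
const firstRun = runSyncWithCache(memoryCache, "2024-01-01T00:00:00Z");
const secondRun = runSyncWithCache(memoryCache, "2024-01-02T00:00:00Z");
```

The first run finds no stored value and fetches everything; the second run reads the timestamp written by the first.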
If you can’t cache data
If your connector does not support caching, you must explicitly indicate this by setting supports.deltaSync to false in the call to addConnector().
For example:
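As a sketch of the options object only: the type below is a local assumption so the snippet stands alone, and typePrefix is a hypothetical placeholder for whatever other options your connector passes. The part taken from the text is supports.deltaSync set to false.

```typescript
// Assumed options shape for illustration; only `supports.deltaSync` comes from the text.
type ConnectorOptions = {
  typePrefix: string; // hypothetical placeholder for other connector options
  supports: { deltaSync: boolean };
};

const connectorOptions: ConnectorOptions = {
  typePrefix: "Example",
  supports: {
    deltaSync: false, // this connector re-inserts everything on each sync
  },
};

// In a real extension, an object like this is what you pass to addConnector().
```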
When supports.deltaSync is set to false, isInitialSync is true on every data sync and stale document deletion is enabled.
As a result, every time data syncs and connector.sync(fn) runs, your connector needs to re-insert all relevant documents. Any documents that aren’t re-inserted will be automatically deleted.
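To make the consequence concrete, the sketch below imitates a sync without delta support: every document is re-inserted, and anything not re-inserted is removed afterwards. The cleanup step is a stand-in for what Netlify does automatically; the Page model and its data are hypothetical.

```typescript
type PageDoc = { id: string; title: string };

const pageStore = new Map<string, PageDoc>();

// Stand-in for one sync without delta support.
function runFullSync(sourceDocs: PageDoc[]): void {
  const reinserted = new Set<string>();
  for (const doc of sourceDocs) {
    pageStore.set(doc.id, doc); // what an insert call for the Page model would do
    reinserted.add(doc.id);
  }
  // Imitates stale document deletion: drop anything this sync did not re-insert.
  Array.from(pageStore.keys()).forEach((id) => {
    if (!reinserted.has(id)) pageStore.delete(id);
  });
}

runFullSync([
  { id: "1", title: "Home" },
  { id: "2", title: "About" },
]);
// Page "2" was removed at the source, so the next sync no longer inserts it.
runFullSync([{ id: "1", title: "Home" }]);
```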
Normalize model field data
Sometimes the data in your data source doesn’t match the exact data shape defined in your models. You can normalize the data before it’s stored on Netlify by implementing a visitor function for your document, object, enum, and union definitions as well as for any field definition.
In this example, every time an ExampleDocument is inserted, the title field will have some text appended to it. Similarly, any time a field with the ExampleObject type exists on a document that was inserted, the subtitle field on that object will have a string appended to it.
This data will be stored in the database as follows:
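The behaviour described in this section can be sketched as follows. This is not the SDK's visitor API: the two visitor functions and normalizeOnInsert are local stand-ins that imitate how a document visitor and an object visitor would each run once during insertion. The content field name and the appended strings are assumptions.

```typescript
type ExampleObject = { subtitle: string };
type ExampleDocument = { id: string; title: string; content: ExampleObject };

// Visitor for the document definition: appends text to the title.
function visitExampleDocument(doc: ExampleDocument): ExampleDocument {
  return { ...doc, title: doc.title + " (document visited)" };
}

// Visitor for the object definition: appends text to the subtitle.
function visitExampleObject(obj: ExampleObject): ExampleObject {
  return { ...obj, subtitle: obj.subtitle + " (object visited)" };
}

// Stand-in for the traversal performed on insert: the document visitor runs
// first, then the visitor for each field whose type has one.
function normalizeOnInsert(raw: ExampleDocument): ExampleDocument {
  const visited = visitExampleDocument(raw);
  return { ...visited, content: visitExampleObject(visited.content) };
}

const storedExample = normalizeOnInsert({
  id: "1",
  title: "Hello",
  content: { subtitle: "World" },
});
// storedExample is now:
// { id: "1", title: "Hello (document visited)", content: { subtitle: "World (object visited)" } }
```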
If you implement visitor functions for your document, object, enum, and union definitions, you can avoid writing recursive normalization code when inserting data into Connect or the visual editor.
This is an important performance enhancement because the Netlify SDK also recursively normalizes your data. Using visitors will prevent the system from needing to recurse on the same CMS data multiple times.
Visitor context
If you need to pass some data down to each nested visitor in your models, you can use visitor context. Visitor context is a value that can be set in one visitor and then accessed in a child visitor.
A common use case for visitor context is passing the locale of a document down to be used in field values of that document.
In the following example, the locale of each document is added to the id so that documents can only link to other documents in the same locale.
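A minimal sketch of the idea, using plain functions rather than the SDK's visitor API. The parent visitor builds a context object from the document's locale and hands it to the child visitor for the relationship field; how context is actually set and read in the SDK is not shown here. The field names and the id-locale format are assumptions.

```typescript
type VisitorContext = { locale: string };
type LocalizedPost = { id: string; locale: string; title: string; relatedPost: string };

// Child visitor: reads the context set by the parent and localizes the reference.
function visitRelatedPost(rawId: string, context: VisitorContext): string {
  return `${rawId}-${context.locale}`;
}

// Parent visitor: sets the context once, then applies it to its own id
// and passes it down to the nested field visitor.
function visitLocalizedPost(post: LocalizedPost): LocalizedPost {
  const context: VisitorContext = { locale: post.locale };
  return {
    ...post,
    id: `${post.id}-${context.locale}`,
    relatedPost: visitRelatedPost(post.relatedPost, context),
  };
}

const frenchPost = visitLocalizedPost({
  id: "1",
  locale: "fr",
  title: "Bonjour",
  relatedPost: "2",
});
```

Because both the document id and the reference gain the same locale suffix, the French post can only point at the French version of post 2.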
Visitor context can be used to pass any data down from any object or document model to any nested field at any depth.
Concurrently fetch data
In the above examples, documents for each model type are fetched in series:
Fetching in series will work in a real-world connector, but you’ll lose out on the benefits of JavaScript’s asynchronous concurrency. Instead, you can use the models.concurrent method to fetch multiple data types from your CMS concurrently:
models.concurrent() takes the number provided as the first argument and uses it as the maximum number of concurrent calls to the function passed as the second argument.
In the above example, assuming there are eight different model types defined, concurrent calls the function on the first four model types at the same time. Each time one of those calls resolves, concurrent calls the function again with the next model type, so no more than four calls are in flight at once.
This can help you avoid hitting rate limits or overwhelming low-powered servers, and it’s a simple way to fetch more than one model at a time.
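To illustrate the scheduling described above, here is a small stand-in named runConcurrent. It is not the SDK's implementation, and unlike models.concurrent() it takes the list of items explicitly. The eight model names and the fake fetch are hypothetical; the counters exist only to show that no more than four calls run at once.

```typescript
// Stand-in limiter: runs `fn` over `items` with at most `limit` calls in flight.
async function runConcurrent<T>(
  limit: number,
  items: T[],
  fn: (item: T) => Promise<void>,
): Promise<void> {
  const queue = items.slice();
  // Each worker takes the next item as soon as its previous call resolves.
  async function worker(): Promise<void> {
    for (let item = queue.shift(); item !== undefined; item = queue.shift()) {
      await fn(item);
    }
  }
  const workers: Promise<void>[] = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) {
    workers.push(worker());
  }
  await Promise.all(workers);
}

let inFlight = 0;
let maxInFlight = 0;
const fetchedModels: string[] = [];

// Hypothetical fetch for one model type; the awaited promise stands in for a request.
async function fetchModelDocuments(modelName: string): Promise<void> {
  inFlight++;
  maxInFlight = Math.max(maxInFlight, inFlight);
  await Promise.resolve();
  fetchedModels.push(modelName);
  inFlight--;
}

const modelNames = ["Post", "Author", "News", "Page", "Tag", "Category", "Event", "Product"];
const concurrentRun = runConcurrent(4, modelNames, fetchModelDocuments);
```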
Inspect model definitions while creating documents
You may need to check the types of model fields while fetching and inserting data. You can achieve this by checking the fields property on each model object.
This is useful for dynamically building your schema and then dynamically determining how to fetch and insert data into each model. Refer to the TypeScript type for model.fields in your IDE to review the available data:
Note that model.fields returned here may include fields that have additional fields within them. Be careful when writing recursive code: a self-referencing field has its own definition available infinitely deep, for example model.fields.relatedPost.fields.relatedPost.fields.relatedPost.fields.relatedPost.
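One way to walk nested fields safely is to remember which field maps have already been visited. The sketch below assumes a simplified FieldDef shape (a type name plus optional nested fields); the real type of model.fields is richer, so treat the shape and the sample model as hypothetical.

```typescript
// Simplified, assumed shape of a field definition.
type FieldDef = { type: string; fields?: Record<string, FieldDef> };

// A Post whose relatedPost field points back at the Post's own fields.
const postFields: Record<string, FieldDef> = {
  title: { type: "String" },
};
postFields["relatedPost"] = { type: "Post", fields: postFields };

// Collects "path: type" entries, skipping any field map already visited
// so that self-references do not recurse forever.
function collectFieldPaths(
  fields: Record<string, FieldDef>,
  seen: Set<object> = new Set<object>(),
  prefix: string = "",
): string[] {
  if (seen.has(fields)) return [];
  seen.add(fields);
  const paths: string[] = [];
  for (const key of Object.keys(fields)) {
    const field = fields[key];
    if (!field) continue;
    const path = prefix + key;
    paths.push(`${path}: ${field.type}`);
    if (field.fields) {
      paths.push(...collectFieldPaths(field.fields, seen, path + "."));
    }
  }
  return paths;
}

const fieldPaths = collectFieldPaths(postFields);
```

The walk stops at relatedPost because its nested fields are the same object that was already visited.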
Accept webhook bodies while syncing data
If your data source relies on sending information to your connector through a webhook body, you can access the body in the first argument passed to connector.sync(fn):
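A rough sketch of a sync callback that reacts to a webhook body. The property name webhookBody and the payload shape (updatedIds, deletedIds) are assumptions for illustration; check the SDK types for the actual argument name and use your data source's real payload format.

```typescript
// Hypothetical payload sent by a data source.
type WebhookBody = { updatedIds?: string[]; deletedIds?: string[] };
// Assumed shape of the first argument passed to the sync callback.
type SyncArgs = { webhookBody?: WebhookBody };

const pendingActions: string[] = [];

// Body of a connector.sync() callback that reads the webhook body.
function handleSync({ webhookBody }: SyncArgs): void {
  if (!webhookBody) {
    // No body was sent: fall back to checking the whole data source.
    pendingActions.push("check-all");
    return;
  }
  for (const id of webhookBody.updatedIds ?? []) {
    pendingActions.push(`refetch:${id}`);
  }
  for (const id of webhookBody.deletedIds ?? []) {
    pendingActions.push(`delete:${id}`);
  }
}

handleSync({ webhookBody: { updatedIds: ["1"], deletedIds: ["2"] } });
```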
To simulate sending a webhook body in local development, send a POST request with a JSON object as the body to http://localhost:8000/__refresh.
Handle scheduled updates
For use with Netlify Visual Editor only.
If your data source supports scheduled publishing, you can load scheduled actions data into a site with visual editing enabled using the models.ScheduledAction interface that Netlify exposes to the connector.sync() callback.
Learn more in the scheduled publishing doc.
Inserting cross-referenced fields
Data must be inserted in a specific format when using cross-reference fields.
When there is a single referenced model type, only the reference field is required. In this example, we insert data into the bonusContent cross-reference field with just the reference ID.
When there are multiple referenced model types, additional fields are required. In this example, we insert data into the allTheBonusContent cross-references field with the reference, connectorName, instanceId, and modelName.
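The two shapes described in this section can be sketched as plain data. The key names reference, connectorName, instanceId, and modelName come from the text; every value, the Book document, and the isCompleteMultiRef helper are hypothetical.

```typescript
// Single referenced model type: only the reference id is needed.
type SingleTypeCrossRef = { reference: string };

// Multiple referenced model types: the target must be fully identified.
type MultiTypeCrossRef = {
  reference: string;
  connectorName: string;
  instanceId: string;
  modelName: string;
};

// Shape that would be passed to an insert call; all values are placeholders.
const bookInput: {
  id: string;
  bonusContent: SingleTypeCrossRef;
  allTheBonusContent: MultiTypeCrossRef[];
} = {
  id: "book-1",
  bonusContent: { reference: "bonus-1" },
  allTheBonusContent: [
    { reference: "bonus-1", connectorName: "example-connector", instanceId: "instance-a", modelName: "BonusChapter" },
    { reference: "video-9", connectorName: "example-connector", instanceId: "instance-a", modelName: "BonusVideo" },
  ],
};

// Hypothetical pre-insert check that a multi-type reference has all four fields.
function isCompleteMultiRef(ref: Partial<MultiTypeCrossRef>): ref is MultiTypeCrossRef {
  return Boolean(ref.reference && ref.connectorName && ref.instanceId && ref.modelName);
}

const allRefsComplete = bookInput.allTheBonusContent.every(isCompleteMultiRef);
```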