Specify how to sync data

Your connector must include details on how Netlify should insert documents using data from an instance of your data source type and how to process updates whenever the source data changes.

Sync documents

When your connector first runs in a data layer, Netlify calls the connector.sync() API to perform an initial sync from your data source.

The API has access to a models object. This object contains each document model you defined with define.document(), where the keys are the document model names and the values are the insert and delete APIs for that model. For example, if you defined a Post document model, you can use models.Post.insert() and models.Post.delete().

As you configure the actions Netlify should take on initial sync, note the following:

All documents must have a unique id that matches the ID defined in your CMS or data source. Make sure to pass an id value for each document when you call insert(). Even if the data source ID isn’t globally unique, Netlify makes it globally unique by combining your connector instance ID, the document model name, and the document’s ID from your data source. For example, [connector-id]-[model-name]-[document.id].
All relationship fields must contain the raw document ID. Similar to id values, all relationship fields should contain the raw document ID from your data source. Netlify will make the ID globally unique and use it to make the relationship to the correct document type you defined. Learn more about adding documents that have relationship fields.
The insert model action is an upsert. As a result, calling insert multiple times on objects that contain the same id will update the same stored document. You can use the cache helper to work around this.
Connect to any data source in this API. Any data source will work, including JSON APIs, GraphQL APIs, and local files such as .csv or Excel files.
Consider storing cache-related metadata. The connector.sync() API has access to the cache helper, which you can use to store sync-related metadata to help with caching on subsequent syncs. For example, you can store a CMS sync token or a timestamp that records when your last sync finished.

For example:

const data = {
  Post: [
    {
      id: "Post-1",
      description: "Hello world!",
      authorId: "Author-1",
      updatedAt: "2020-01-01T00:00:00.000Z",
    },
    {
      id: "Post-2",
      description: "Second post!",
      authorId: "Author-2",
      updatedAt: "2020-01-01T00:00:00.000Z",
    },
    {
      id: "Post-3",
      description: "Third post!",
      authorId: "Author-2",
      updatedAt: "2020-01-01T00:00:00.000Z",
    },
  ],
  Author: [
    {
      id: "Author-1",
      name: "Jane",
      updatedAt: "2020-01-01T00:00:00.000Z",
    },
    {
      id: "Author-2",
      name: "Marta",
      updatedAt: "2020-01-01T00:00:00.000Z",
    },
  ],
};

connector.sync(async ({ models, isInitialSync, options }) => {
  if (!isInitialSync) return; // this example only shows initial syncing logic

  for (const model of models) {
    // for each model, insert documents from the array of data for that model type
    const cmsData = data[model.name]; // note: this would usually be an API call to a CMS
    model.insert(cmsData);
  }

  /* For example, the first data item would be inserted as follows. Note
  that Netlify will add extra characters to make the id globally
  unique on insertion:

    models.Post.insert({
      id: "Post-1", // internally will be converted to a uuid
      _objectId: "Post-1", // this will stay as the original ID.
      description: "Hello world!",
    });

  */
});
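
The ID and upsert notes above can be illustrated with a small in-memory sketch. It does not use the Netlify SDK: makeGlobalId and createStore are hypothetical helpers, and the ID format only mirrors the [connector-id]-[model-name]-[document.id] pattern described earlier, so the real internal format may differ.

```javascript
// Hypothetical helper: combine the three parts described above into one key.
function makeGlobalId(connectorId, modelName, documentId) {
  return `${connectorId}-${modelName}-${documentId}`;
}

// Hypothetical in-memory store whose insert() behaves like an upsert.
function createStore(connectorId) {
  const documents = new Map();
  return {
    insert(modelName, document) {
      const key = makeGlobalId(connectorId, modelName, document.id);
      // Inserting the same raw id again replaces the stored document.
      documents.set(key, { ...document });
      return key;
    },
    get: (key) => documents.get(key),
    size: () => documents.size,
  };
}

const store = createStore("my-connector");
const firstKey = store.insert("Post", { id: "Post-1", description: "Hello world!" });
const secondKey = store.insert("Post", { id: "Post-1", description: "Hello again!" });
```

Because both calls use the same raw id, they resolve to the same key and the store still holds a single Post document with the latest description.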

Add documents that have relationship fields

To insert a document that contains a relationship field, use the raw document ID from your data source. As long as you provide the ID from your data source, Netlify resolves the relationship between the document types you’ve defined.

For example:

connector.model(async ({ define }) => {
  const UserModel = define.document({
    name: "User",
    fields: {
      posts: {
        // relationship field from a User document to a list of Post documents
        type: "Post",
        list: true,
      },
    },
  });

  define.document({
    name: "Post",
    fields: {
      author: {
        // relationship field from a Post document to a single User document
        type: UserModel,
      },
    },
  });
});

connector.sync(async ({ models, isInitialSync }) => {
  if (!isInitialSync) return; // this example only shows an initial data sync

  models.User.insert({
    id: "1",
    posts: ["1"],
    // `posts` was defined as a list field, so an array is required.
    // Notice "1" is the "raw id" of a Post. Netlify will insert
    // a globally unique ID from this that matches the ID of the Post
    // inserted with the ID "1".
  });
  models.Post.insert({
    id: "1",
    author: "1",
    // This `author` relationship field isn’t required for User.posts to
    // work. For now, the only way to do back-references is to
    // explicitly set the ID on each connected document. Each relationship
    // field is a one-way relationship from one document to another.
  });
});

If a relationship field is a union type, each value must include both the ID and the type of the related document. For example:

connector.model(async ({ define }) => {
  // define is only available inside the connector.model() callback
  const Content = define.union({
    types: ["Post", "News"],
  });

  const UserModel = define.document({
    name: "User",
    fields: {
      posts: {
        type: Content,
        list: true,
      },
      mostPopularPost: {
        type: Content,
      },
    },
  });

  define.document({
    name: "News",
    fields: {
      title: {
        type: "String",
      },
    },
  });

  define.document({
    name: "Post",
    fields: {
      author: {
        type: UserModel,
      },
    },
  });
});

connector.sync(async ({ models, isInitialSync }) => {
  if (!isInitialSync) return // this example only shows an initial data sync

  models.User.insert({
    id: "1",
    posts: [
      {
        __typename: "Post",
        id: "1"
      },
      {
        __typename: "News",
        id: "2"
      }
    ],
    mostPopularPost: {
      __typename: "News",
      id: "2"
    }
  });
  models.Post.insert({
    id: "1",
    author: "1"
  });
  models.News.insert({
    id: "2",
    title: "Hello world"
  });
});

In this example, since posts can be either a News or Post document model, a __typename field is required. Netlify will use this field to identify the type of document in the union field.
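
If your data source returns references in a different shape, you can convert them before inserting. The following is a minimal sketch, assuming a hypothetical CMS that returns references as { contentType, id } objects. The contentType values and the toUnionReference helper are illustrative and not part of the Netlify SDK.

```javascript
// Hypothetical mapping from CMS content type names to document model names.
const typeNameByContentType = new Map([
  ["post", "Post"],
  ["news", "News"],
]);

// Convert a raw CMS reference into the shape a union relationship field expects.
function toUnionReference(cmsReference) {
  const typeName = typeNameByContentType.get(cmsReference.contentType);
  if (!typeName) {
    throw new Error(`Unknown content type: ${cmsReference.contentType}`);
  }
  return { __typename: typeName, id: cmsReference.id };
}

// Hypothetical raw data as it might come back from the CMS.
const cmsUser = {
  id: "1",
  posts: [
    { contentType: "post", id: "1" },
    { contentType: "news", id: "2" },
  ],
};

const userDocument = {
  id: cmsUser.id,
  posts: cmsUser.posts.map(toUnionReference),
};
// userDocument can then be passed to models.User.insert().
```

Throwing on an unknown content type surfaces a missing mapping early instead of inserting a reference without a usable __typename.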

Update documents

After the initial sync, Netlify calls the connector.sync() API again for all subsequent syncs.

We recommend that you support data caching by only updating documents that have changed since the last sync. However, this may not be possible for some data sources, such as file-based data sources.

The following sections outline how to cache data when data updates, how to use the cache helper to manage sync-related metadata, and how to configure your connector if it does not cache data.

If you can cache data

To support data caching and only update documents that have changed, use the connector.sync() API to insert or delete only the changed documents when the isInitialSync argument is false.

All previously existing documents inserted during connector.sync() will continue to exist unless you modify them (by re-inserting them) or delete them during connector.sync(). The previously existing documents that you don’t modify are cached between data syncs.

The API has access to a models object. This object contains each document model you defined with define.document(), where the keys are the document model names and the values are the insert and delete APIs for that model. For example, if you defined an Author document model, you can use models.Author.insert() and models.Author.delete().

Code example:

const changedData = {
  Post: [
    {
      id: "Post-1",
      description: "Hello world again!",
      authorId: "Author-1",
      updatedAt: "2020-01-01T00:00:00.001Z",
    },
  ],
};

const deletedData = {
  User: ["1"],
};

connector.sync(async ({ models, isInitialSync, options }) => {
  if (isInitialSync) return; // this example only shows a data update, not an initial sync

  for (const model of models) {
    // handle updates, skipping models that have no changed documents
    const changed = changedData[model.name];
    if (changed) model.insert(changed);

    // and deletes, skipping models that have no deleted document IDs
    const deleted = deletedData[model.name];
    if (deleted) model.delete(deleted);
  }
});

Store cache-related metadata

When you insert and update documents, you can use the cache helper to store and access non-document data about your data sources. For example, you may want to reference a sync token or last updated timestamp from your CMS.

The cache helper is a key/value store that is available as an argument in each connector’s lifecycle, and provides two methods:

set: pass in a key and value to store or update
get: pass in a key to retrieve the stored value

For example:

const fetchCMSData = ({ since }) => {
  /* ... */
};

const makeNodesFromData = ({ cmsData, models }) => {
  for (const model of models) {
    model.insert(cmsData[model.name]);
  }
};

connector.sync(async ({ models, cache, isInitialSync }) => {
  if (isInitialSync) {
    // On initial sync, pass in a since value of null to get all data
    const cmsData = await fetchCMSData({ since: null });

    makeNodesFromData({
      models,
      cmsData,
    });
  } else {
    // On subsequent syncs, access the lastSync value we stored
    const lastSyncTime = await cache.get("lastSync");

    // Fetch data that changed since the last time we ran a sync
    const cmsData = await fetchCMSData({
      since: lastSyncTime,
    });

    makeNodesFromData({
      models,
      cmsData,
    });
  }

  // As a final step, we update the lastSync value to now
  await cache.set("lastSync", Date.now());
});
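
To try this timestamp logic outside of a running connector, you can pass in a stand-in for the cache helper. The following is a minimal sketch: createMemoryCache and syncWithTimestamp are hypothetical helpers, not part of the Netlify SDK, and the stand-in only mimics the get and set methods described above.

```javascript
// Hypothetical in-memory stand-in for the cache helper, for local tests only.
function createMemoryCache() {
  const values = new Map();
  return {
    async get(key) {
      return values.get(key);
    },
    async set(key, value) {
      values.set(key, value);
    },
  };
}

// The timestamp logic from the example above, extracted so it can be tested.
// `fetchChangedSince` and `now` are injected, hypothetical parameters.
async function syncWithTimestamp({ cache, isInitialSync, fetchChangedSince, now }) {
  // On initial sync there is no stored value, so request everything.
  const since = isInitialSync ? null : await cache.get("lastSync");
  const changed = await fetchChangedSince(since);
  // Record when this sync finished so the next sync can start from here.
  await cache.set("lastSync", now());
  return changed;
}
```

Injecting the clock and the fetch function keeps the sketch deterministic. In a real connector you would pass the cache argument from connector.sync() and a function that calls your CMS.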

If you can’t cache data

If your connector does not support caching, you must explicitly indicate this by setting supports.deltaSync to false in the call to addConnector().

For example:

extension.addConnector({
  typePrefix: "Example",
  supports: {
    connect: true,
    deltaSync: false,
  },
});

When supports.deltaSync is set to false, isInitialSync is false on every data sync and stale document deletion is enabled.

As a result, every time data syncs and connector.sync(fn) runs, your connector needs to re-insert all relevant documents. Any documents that aren’t re-inserted will be automatically deleted.
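
The effect of stale document deletion can be pictured with a small simulation. This sketch is only an illustration of the behavior described above. applyFullSync is a hypothetical helper, and it does not reflect how Netlify implements deletion internally.

```javascript
// Hypothetical simulation of stale document deletion: after a full sync,
// any stored document whose id was not re-inserted is removed.
function applyFullSync(storedDocuments, reinsertedDocuments) {
  const nextDocuments = new Map();
  for (const document of reinsertedDocuments) {
    nextDocuments.set(document.id, document);
  }
  const deletedIds = [...storedDocuments.keys()].filter((id) => !nextDocuments.has(id));
  return { nextDocuments, deletedIds };
}

const stored = new Map([
  ["Post-1", { id: "Post-1", description: "Hello world!" }],
  ["Post-2", { id: "Post-2", description: "Second post!" }],
]);

// The second sync only re-inserts Post-1, so Post-2 is treated as stale.
const result = applyFullSync(stored, [{ id: "Post-1", description: "Hello world again!" }]);
```

In this simulation, result.deletedIds contains only "Post-2", which is why a connector without delta sync has to re-insert every document it wants to keep.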

Normalize model field data

Sometimes the data in your data source doesn’t match the exact data shape defined in your models. You can normalize the data before it’s stored on Netlify by implementing a visitor function for your document, object, enum, and union definitions as well as for any field definition.

connector.model(async ({ define }) => {
  define.document({
    name: `ExampleDocument`,
    visitor: (document, info) => {
      // if the hasTitle field was defined as a boolean
      if (info.fields.hasTitle?.typeName === `Boolean`) {
        // set the hasTitle field as a boolean
        document.hasTitle = !!document.title;
      }

      return document;
    },
    fields: {
      title: {
        type: `String`,
        visitor: (title, info) => {
          // info about the field type can be inspected using the second argument.
          // this is mostly useful when you’re dynamically building your schema and
          // visitor functions
          // check the TS types for `info` in your IDE to review available fields
          //
          // In this example we just add some text to the end of every title, for illustration.
          return title + ` testing visitors`;

          // You could also use this to change the data structure,
          // for example by returning `title.value` if your title was an object
          // where the string value of the title was nested on a `.value` property:
          // return title.value;
        },
      },
      exampleObjectField: {
        type: define.object({
          name: `ExampleObject`,
          visitor: (object) => {
            object.subtitle += ` testing nested visitor`;
            return object;
          },
          fields: {
            subtitle: {
              type: `String`,
            },
          },
        }),
      },
    },
  });
});

In this example, every time an ExampleDocument is inserted, the title field will have some text appended to it. Similarly, any time a field with the ExampleObject type exists on an inserted document, the subtitle field on that object will have a string appended to it.

connector.sync(({ models }) => {
  models.ExampleDocument.insert({
    id: `1`,
    title: `A title: `,
    exampleObjectField: {
      subtitle: `A subtitle: `,
    },
  });
});

This data will be stored in the database as follows:

{
  "id": "1",
  "title": "A title: testing visitors",
  "exampleObjectField": {
    "subtitle": "A subtitle: testing nested visitor"
  }
}

If you implement visitor functions for your document, object, enum, and union definitions, you can avoid writing recursive normalization code when inserting data into Connect.

This is an important performance enhancement because the Netlify SDK also recursively normalizes your data. Using visitors will prevent the system from needing to recurse on the same CMS data multiple times.

Visitor context

If you need to pass some data down to each nested visitor in your models, you can use visitor context. Visitor context is a value which can be set in one visitor and then accessed in a child visitor.

A common use case for visitor context is passing the locale of a document down so that it can be used in the field values of that document.

In the following example, the locale of each document is added to the id so that documents can only link to other documents in the same locale.

define.document({
  name: `Page`,
  visitor: (document, info) => {
    info.setVisitorContext({
      locale: document.locale,
    });

    // here any Page document that’s inserted will have its locale prepended to its id.
    document.id = document.locale + document.id;

    return document;
  },
  fields: {
    locale: {
      type: `String`,
      required: true,
    },
    relatedPage: {
      type: `Page`,
      visitor: (relatedPageId, info) => {
        // here any "relatedPage" field id will have the locale from visitor context prepended to the relationship id
        return info.visitorContext.locale + relatedPageId;
      },
    },
  },
});

Visitor context can be used to pass any data down from any object or document model to any nested field at any depth.
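
To see what the Page visitors above produce for a sample document, you can run them through a simplified stand-in. This is a minimal sketch under stated assumptions: runVisitors is a hypothetical helper that calls the document visitor first and then each field visitor. The order and internals of the real SDK may differ.

```javascript
// Hypothetical, simplified runner: calls the document visitor first, then each
// field visitor, sharing one context value between them.
function runVisitors(definition, rawDocument) {
  let visitorContext = {};
  const info = {
    setVisitorContext(value) {
      visitorContext = value;
    },
    get visitorContext() {
      return visitorContext;
    },
  };

  const document = definition.visitor({ ...rawDocument }, info);
  for (const [fieldName, field] of Object.entries(definition.fields)) {
    if (field.visitor && document[fieldName] !== undefined) {
      document[fieldName] = field.visitor(document[fieldName], info);
    }
  }
  return document;
}

// The same visitor logic as the Page definition above, without the SDK.
const pageDefinition = {
  visitor: (document, info) => {
    info.setVisitorContext({ locale: document.locale });
    document.id = document.locale + document.id;
    return document;
  },
  fields: {
    locale: {},
    relatedPage: {
      visitor: (relatedPageId, info) => info.visitorContext.locale + relatedPageId,
    },
  },
};

const normalized = runVisitors(pageDefinition, { id: "1", locale: "en", relatedPage: "2" });
```

With this input, both the document id and the relatedPage reference receive the same "en" prefix, so the relationship still points at the matching Page.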

Concurrently fetch data

In the above examples, documents for each model type are fetched in series:

for (const model of models) {
  const cmsNodes = await fetchCMSData(model.name);

  model.insert(cmsNodes);
}

Fetching in series will work in a real-world connector, but you’ll lose out on the benefits of JavaScript’s asynchronous concurrency. Instead, you can use the models.concurrent method to fetch multiple data types from your CMS concurrently:

connector.sync(async ({ models }) => {
  await models.concurrent(4, async (model) => {
    const cmsNodes = await fetchCMSData(model.name);

    model.insert(cmsNodes);
  });
});

The first argument to models.concurrent() is the maximum number of calls to run at the same time. The second argument is the function to run for each model.

In the above example, assuming there are eight different model types defined, concurrent calls the function on the first four model types at the same time. Each time one of those calls resolves, concurrent calls the function again with the next model type, until every model type has been processed.

This can help you avoid hitting rate limits or overwhelming low-powered servers, and it’s a simple way to fetch more than one model at a time.
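
The scheduling behavior described above can be sketched in plain JavaScript. runConcurrent is a hypothetical stand-in written for illustration. It is not the implementation of models.concurrent(), and it takes the list of items as an explicit argument.

```javascript
// Hypothetical stand-in for a concurrency-limited loop: at most `limit`
// callbacks are in flight at once, and a new one starts whenever one resolves.
async function runConcurrent(limit, items, callback) {
  const queue = [...items];

  // Each worker keeps taking the next item until the queue is empty.
  const worker = async () => {
    while (queue.length > 0) {
      const item = queue.shift();
      await callback(item);
    }
  };

  const workerCount = Math.min(limit, queue.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
}
```

With eight items and a limit of four, four workers start immediately and each picks up another item as soon as its current callback resolves.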

Inspect model definitions while creating documents

You may need to check the types of model fields while fetching and inserting data. You can achieve this by checking the fields property on each model object.

connector.sync(({ models }) => {
  for (const model of models) {
    model.insert({
      id: `1`,
      // this is a contrived example to illustrate the point that you can introspect your model
      title: model.fields.title.is.scalar ? `HI` : model.fields.title.is.document ? `2` : undefined,
    });
  }
});

This is useful for dynamically building your schema and then dynamically determining how to fetch and insert data into each model. Refer to the TypeScript type for model.fields in your IDE to review the available data:

type Fields = {
  [fieldName: string]: Field;
};

type Field = {
  name: string;
  typeName: string;
  fields?: Fields;
  required: boolean;
  list: boolean | `required`;
  is: {
    document: boolean;
    object: boolean;
    union: boolean;
    scalar: boolean;
  };
};

Note that model.fields returned here may include fields that have additional fields within them. You must be careful when writing recursive code. A self-referencing field will have its own definition available infinitely deep, for example model.fields.relatedPost.fields.relatedPost.fields.relatedPost.fields.relatedPost.
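
One way to keep recursive code safe is to limit how deep it walks. The following is a minimal sketch, assuming field objects shaped like the Field type above. collectFieldPaths and the sample relatedPost definition are hypothetical and only illustrate the depth limit.

```javascript
// Collect dotted field paths from a model's `fields`, stopping at `maxDepth`
// so self-referencing fields cannot cause infinite recursion.
function collectFieldPaths(fields, maxDepth, prefix = "") {
  if (maxDepth <= 0 || !fields) return [];
  const paths = [];
  for (const [fieldName, field] of Object.entries(fields)) {
    const path = prefix ? `${prefix}.${fieldName}` : fieldName;
    paths.push(path);
    paths.push(...collectFieldPaths(field.fields, maxDepth - 1, path));
  }
  return paths;
}

// Hypothetical self-referencing definition, shaped like the Field type above.
const relatedPost = {
  name: "relatedPost",
  typeName: "Post",
  required: false,
  list: false,
  is: { document: true, object: false, union: false, scalar: false },
};
relatedPost.fields = { relatedPost };

// Walk at most three levels deep, even though the definition is cyclic.
const paths = collectFieldPaths({ relatedPost }, 3);
```

A depth limit is the simplest guard. Tracking visited typeName values is an alternative if you need to walk every distinct type exactly once.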

Accept webhook bodies while syncing data

If your data source relies on sending information to your connector through a webhook body, you can access the body in the first argument passed to connector.sync(fn):

connector.sync(async ({ webhookBody }) => {
  // webhook body is a JSON object here with the data from the POST request
});

To simulate sending a webhook body in local development, send a POST request with a JSON object as the body to http://localhost:8000/__refresh.
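
For instance, a small script can send that request. This is a sketch, assuming a runtime with a global fetch (such as Node.js 18 or later). The buildRefreshRequest and sendTestWebhook helpers and the payload fields are hypothetical, and the Content-Type header is an assumption about what the local server expects.

```javascript
// Build the request for the local refresh endpoint mentioned above.
// The payload shape is hypothetical; use whatever your data source sends.
function buildRefreshRequest(payload) {
  return {
    url: "http://localhost:8000/__refresh",
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    },
  };
}

// Send the simulated webhook and return the HTTP status code.
async function sendTestWebhook(payload) {
  const { url, options } = buildRefreshRequest(payload);
  const response = await fetch(url, options);
  return response.status;
}

// Usage, with the local development server running:
// sendTestWebhook({ event: "entry.updated", id: "Post-1" });
```

Keeping the request construction separate from the network call makes it easy to check the body your connector will receive as webhookBody.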
