Implementing unidirectional file synchronization in an integration
The unidirectional file synchronization interface allows you to implement one-way sync in your integration, importing files from an external service into Botpress.
Filesystem abstraction
The file synchronization interface provides a filesystem-like abstraction that works with any kind of data source. The external service doesn’t need to provide an actual filesystem - your integration just needs to represent the external data as files and folders.
For example:
- If you are building a website crawler, individual pages could be folders, and their HTML content and assets such as images or stylesheets could be files.
- For a notetaking platform, notebooks could be folders with individual notes being files.
- For an email provider, mailboxes or labels could be folders and individual emails could be files.
This abstraction allows the interface to work consistently regardless of what type of data is being synchronized from your external service.
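As an illustration of the notetaking example above, here is a minimal sketch of how notebooks and notes might be represented as folders and files. All type and field names here (Notebook, Note, FolderItem, FileItem, parentId, the .md suffix) are assumptions made for this sketch; the actual item schemas are defined by the interface.

```typescript
// Hypothetical item shapes for illustration; the real schemas are defined by
// the interface and may use different field names.
type FolderItem = { type: 'folder'; id: string; name: string; parentId?: string }
type FileItem = { type: 'file'; id: string; name: string; parentId?: string }

// Hypothetical data returned by a notetaking service.
type Note = { noteId: string; title: string }
type Notebook = { notebookId: string; title: string; notes: Note[] }

// Represent a notebook as a folder.
const notebookToFolder = (notebook: Notebook): FolderItem => ({
  type: 'folder',
  id: notebook.notebookId,
  name: notebook.title,
})

// Represent a note as a file inside its notebook's folder.
const noteToFile = (note: Note, notebook: Notebook): FileItem => ({
  type: 'file',
  id: note.noteId,
  name: `${note.title}.md`,
  parentId: notebook.notebookId,
})

const notebook: Notebook = {
  notebookId: 'nb-1',
  title: 'Meeting notes',
  notes: [{ noteId: 'n-1', title: 'Kickoff' }],
}

const folder = notebookToFolder(notebook)
const files = notebook.notes.map((note) => noteToFile(note, notebook))
```

The external service never has to expose a real filesystem: the integration decides which objects play the role of folders and which play the role of files.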
Terminology
Throughout this document, we will use the following terms:
- Integration
The code that connects Botpress to an external service.
- External service
The service from which you want to import files. This could be a cloud storage service, a file server, or any other type of external service that stores files.
- File synchronization interface
The interface that defines the contract for implementing unidirectional file synchronization in your integration. This interface specifies the actions and events that your integration must implement to support file synchronization.
- File synchronizer plugin
The Botpress plugin that orchestrates file synchronization. This plugin is responsible for managing the synchronization process, including scheduling, error handling, and reporting.
- File
A file is a single unit of data that can be synchronized from the external service to Botpress. Files can contain any type of data, such as text, images, or binary data. Files cannot contain other files or folders.
- Folder
A folder is a container for files. Folders can contain other folders and files, allowing for a hierarchical organization of data.
- Real-time synchronization
A synchronization mode where changes in the external service are immediately reflected in Botpress. This is typically achieved through webhooks or other push mechanisms. Integrations are not required to support this mode, but it is recommended for better user experience.
External service requirements
The external service providing file synchronization functionality must support the following:
- An API that allows listing all files and folders in a folder.
  - This API must support pagination. This means that it should return a limited number of items at a time, along with a token that can be used to retrieve the next set of items.
- An API that allows downloading files.
The external service may also support the following in order to provide real-time synchronization:
- Webhooks that can notify your integration of changes in the filesystem, such as a file being created, updated, or deleted, or a folder being deleted.
Updating your package.json file
Finding the current interface version
The current version of the files-readonly interface is:
You will need this version number for the next steps.
Adding the interface as a dependency
Once you have the file synchronization interface version, you can add it as a dependency to your integration:
Open the package.json file
Open your integration’s package.json file.
Add the bpDependencies section
If there is no bpDependencies section in your integration’s package.json file, create one:
Add the interface as a dependency
In the bpDependencies section, add the file synchronization interface as a dependency. For example, for version 0.2.0, you would add the following entry: "files-readonly": "interface:files-readonly@0.2.0"
It is very important to follow this syntax: "<interface-name>": "interface:<interface-name>@<version>".
Save the package.json file
Save the package.json file.
Install the interface
Now that you have added the file synchronization interface as a dependency, you can run the bp add command to install it. This command will:
- Download the interface from Botpress.
- Install it in a directory named bp_modules in your integration’s root directory.
Adding a helper build script
To keep your integration up to date, we recommend adding a helper build script to your package.json file:
Open the package.json file
Open your integration’s package.json file.
Add the build script
In the scripts section, add the following script:
If the build script already exists in your package.json file, please replace it.
Save the package.json file
Save the package.json file.
Now, whenever you run npm run build, it will automatically install the file synchronization interface and build your integration.
Editing your integration definition file
Adding the interface to your integration definition file
Now that the file synchronization interface is installed, you must add it to your integration definition file in order to implement it.
Open the integration.definition.ts file
Open your integration’s integration.definition.ts file.
Import the interface
At the top of the file, import the file synchronization interface:
Extend your definition
Use the .extend() function at the end of your new IntegrationDefinition() statement:
The exact syntax of .extend() will be explained in the next section.
Configuring the interface
The .extend() function takes two arguments:
- The first argument is a reference to the interface you want to implement. In this case, it is filesReadonly.
- The second argument is a configuration object. Using this object, you can override interface defaults with custom names, titles, and descriptions.
Whilst renaming actions, events and channels is optional, it is highly recommended to rename these to match the terminology of the external service. This will help you avoid confusion and make your integration easier to understand.
Renaming actions
The file synchronization interface defines two actions that are used to interact with the external service:
- listItemsInFolder: Used by the file synchronizer plugin to request a list of all files and folders in a folder.
- transferFileToBotpress: Used by the file synchronizer plugin to request that a file be downloaded from the external service and uploaded to Botpress.
If you want to rename these actions, you can do so in the configuration object. For example, if you want to rename listItemsInFolder to crawlFolder, you can do it like this:
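The following is a self-contained sketch of what such a rename could look like. It does not import the Botpress SDK: the ActionOverrides type and the { name } override shape are assumptions made for illustration, and resolveActionName is a hypothetical helper that only shows the effect of an override. In a real integration, the overrides would be supplied through the configuration argument of .extend().

```typescript
// Hypothetical local types for illustration only; the real configuration
// shape is defined by the Botpress SDK.
type ActionName = 'listItemsInFolder' | 'transferFileToBotpress'
type ActionOverrides = Partial<Record<ActionName, { name: string }>>

// Rename listItemsInFolder to crawlFolder.
// transferFileToBotpress is not listed, so it keeps its default name.
const actionOverrides: ActionOverrides = {
  listItemsInFolder: { name: 'crawlFolder' },
}

// Hypothetical helper: resolve the name an action is exposed under once
// the overrides are applied.
const resolveActionName = (action: ActionName, overrides: ActionOverrides): string =>
  overrides[action]?.name ?? action

const listActionName = resolveActionName('listItemsInFolder', actionOverrides)
const transferActionName = resolveActionName('transferFileToBotpress', actionOverrides)
```

Actions that are not overridden keep the names defined by the interface.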
For example, if you’re using a notetaking platform such as Microsoft OneNote, you might rename listItemsInFolder to listNotebooksAndPages and transferFileToBotpress to downloadPage. This way, the action names reflect the specific context of the notetaking platform, making your integration clearer and easier to understand.
Renaming events
The file synchronization interface defines these events to notify the plugin of changes in the external service:
- fileCreated: Emitted by your integration to notify the file synchronizer plugin that a new file has been created in the external service.
- fileUpdated: Emitted by your integration to notify the file synchronizer plugin that a file has been updated in the external service.
- fileDeleted: Emitted by your integration to notify the file synchronizer plugin that a file has been deleted in the external service.
- folderDeletedRecursive: Emitted by your integration to notify the file synchronizer plugin that a folder and all of its contents have been deleted in the external service.
If the external service emits several filesystem changes at once, your integration can also emit an aggregateFileChanges event, which contains all the changes in a single event.
If you want to rename these events, you can do so in the configuration object. For example, if you want to rename fileCreated to pageCreated, you can do it like this:
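As a sketch only, an event rename could be expressed as follows. The EventOverrides type and the { name } shape are assumptions made for this example rather than the SDK's confirmed configuration format; the final mapping simply lists the names under which the events would be exposed.

```typescript
// Hypothetical local types for illustration only.
type EventName =
  | 'fileCreated'
  | 'fileUpdated'
  | 'fileDeleted'
  | 'folderDeletedRecursive'
  | 'aggregateFileChanges'
type EventOverrides = Partial<Record<EventName, { name: string }>>

// Rename fileCreated to pageCreated; every other event keeps its default name.
const eventOverrides: EventOverrides = {
  fileCreated: { name: 'pageCreated' },
}

const interfaceEvents: EventName[] = [
  'fileCreated',
  'fileUpdated',
  'fileDeleted',
  'folderDeletedRecursive',
  'aggregateFileChanges',
]

// Names the events are exposed under once the overrides are applied.
const exposedEventNames = interfaceEvents.map((event) => eventOverrides[event]?.name ?? event)
```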
Implementing the interface
Implementing the actions
Implementing listItemsInFolder
The listItemsInFolder action is used by the file synchronizer plugin to request a list of all files and folders in a folder.
If you opted to rename the action to something other than listItemsInFolder in the “Configuring the interface” section, please use the new name instead of listItemsInFolder.
Please refer to the expected input and output schemas for the action: interface.definition.ts line 52.
This action should implement the following logic:
Get the folder ID
Get the folder identifier from input.folderId. When this value is undefined, it means the file synchronizer plugin is requesting a list of all items in the root directory of the external service. For root directory requests, please refer to the documentation of the external service to determine the correct root identifier - this is typically an empty string, a slash character (/), or a special value defined by the service.
Get the list of items
Use the external service’s API to get the list of items in the folder. If the external service supports filtering by item type (file or folder), by maximum file size, or by modification date, please use these filters to limit the number of items returned. This will help reduce the amount of data transferred and improve performance.
If a pagination token is provided (input.nextToken), use it to get the next page of items. The external service should return a new pagination token in the response, which you should return with the action’s response.
Do not list items recursively. The file synchronizer plugin is responsible for handling recursion. Your integration should only return the items in the specified folder.
Map each item to the expected schema
Map each item to the expected schema. The file synchronizer plugin expects the following schemas:
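The authoritative schemas are the ones in interface.definition.ts. The sketch below only illustrates the mapping step: ExternalItem is a hypothetical shape for what an external service might return, and the SyncFile and SyncFolder field names (parentId, sizeInBytes, lastModifiedDate) are assumptions that should be checked against the interface definition.

```typescript
// Hypothetical item returned by the external service's listing API.
type ExternalItem = {
  uid: string
  kind: 'document' | 'directory'
  title: string
  parentUid?: string
  bytes?: number
  modifiedAt?: string // ISO 8601 timestamp
}

// Assumed target shapes; check interface.definition.ts for the real fields.
type SyncFile = {
  type: 'file'
  id: string
  name: string
  parentId?: string
  sizeInBytes?: number
  lastModifiedDate?: string
}
type SyncFolder = { type: 'folder'; id: string; name: string; parentId?: string }

// Map one external item to either a folder or a file.
const mapItem = (item: ExternalItem): SyncFile | SyncFolder =>
  item.kind === 'directory'
    ? { type: 'folder', id: item.uid, name: item.title, parentId: item.parentUid }
    : {
        type: 'file',
        id: item.uid,
        name: item.title,
        parentId: item.parentUid,
        sizeInBytes: item.bytes,
        lastModifiedDate: item.modifiedAt,
      }

const mappedFolder = mapItem({ uid: 'd1', kind: 'directory', title: 'Docs' })
const mappedFile = mapItem({ uid: 'f1', kind: 'document', title: 'a.txt', parentUid: 'd1', bytes: 12 })
```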
Yield control back to the plugin
Yield control back to the file synchronizer plugin by returning the list of items. The file synchronizer plugin will then handle the rest of the synchronization process.
If the external service indicates it has more items, return the pagination token in the nextToken field. The file synchronizer plugin will use this token to request the next page of items. Otherwise, return undefined.
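The steps above can be sketched as follows. This is not the Dropbox implementation: ExternalClient, its listFolder method, the ROOT_FOLDER_ID value, and the flat { items, nextToken } output shape are all assumptions made for this example. The handler fetches a single page, without recursion, and passes the pagination token through.

```typescript
// Hypothetical shapes used only for this sketch.
type Item = { type: 'file' | 'folder'; id: string; name: string }
type ListInput = { folderId?: string; nextToken?: string }
type ListOutput = { items: Item[]; nextToken?: string }

// Hypothetical client for the external service's listing API.
interface ExternalClient {
  listFolder(args: { folderId: string; cursor?: string }): Promise<{ entries: Item[]; cursor?: string }>
}

// Assumed root identifier; check the external service's documentation.
const ROOT_FOLDER_ID = '/'

const listItemsInFolder = async (input: ListInput, service: ExternalClient): Promise<ListOutput> => {
  // 1. An undefined folderId means the root directory is being requested.
  const folderId = input.folderId ?? ROOT_FOLDER_ID

  // 2. Fetch one page only, non-recursively, resuming from the token if given.
  const page = await service.listFolder({ folderId, cursor: input.nextToken })

  // 3. Return the items and the next token (undefined when there are no more pages).
  return { items: page.entries, nextToken: page.cursor }
}

// In-memory stand-in for the external service, serving two pages.
const fakeService: ExternalClient = {
  listFolder: async ({ cursor }) =>
    cursor === undefined
      ? { entries: [{ type: 'folder', id: 'a', name: 'A' }], cursor: 'page-2' }
      : { entries: [{ type: 'file', id: 'b', name: 'b.txt' }] },
}

// Mimics a caller paging through a folder until no token is returned.
const listAll = async (folderId?: string): Promise<Item[]> => {
  const all: Item[] = []
  let nextToken: string | undefined
  do {
    const page = await listItemsInFolder({ folderId, nextToken }, fakeService)
    all.push(...page.items)
    nextToken = page.nextToken
  } while (nextToken !== undefined)
  return all
}
```

Keeping the handler to a single page per call leaves recursion and paging decisions with the caller, which is the division of responsibility described above.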
For reference, see how this logic is implemented in the Dropbox integration.
Implementing transferFileToBotpress
The transferFileToBotpress action is used by the file synchronizer plugin to request that a file be downloaded from the external service and uploaded to Botpress.
If you opted to rename the action to something other than transferFileToBotpress in the “Configuring the interface” section, please use the new name instead of transferFileToBotpress.
Please refer to the expected input and output schemas for the action: interface.definition.ts line 88.
This action should implement the following logic:
Get the file ID
Get the file identifier from input.file.id. This is the identifier of the file to be downloaded from the external service.
Download the file from the external service
Use the external service’s API to download the file’s content.
Upload the file to Botpress
Upload the file to Botpress using the client.uploadFile method. This method expects both the file’s content and a file key, which is provided by the file synchronizer plugin as input.fileKey.
Yield control back to the plugin
Yield control back to the file synchronizer plugin by returning the ID of the file that was uploaded to Botpress.
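The four steps above can be sketched as follows. This is not the Dropbox implementation. ExternalClient and its downloadFile method are hypothetical, BotpressFilesClient is a minimal stand-in for the Botpress client with an assumed uploadFile argument and return shape, and the botpressFileId output field is an assumption; check the action's schemas in interface.definition.ts for the real names.

```typescript
// Hypothetical shapes for this sketch; the real schemas are in interface.definition.ts.
type TransferInput = { file: { id: string }; fileKey: string }
type TransferOutput = { botpressFileId: string }

// Hypothetical download API of the external service.
interface ExternalClient {
  downloadFile(fileId: string): Promise<Uint8Array>
}

// Minimal stand-in for the Botpress client: only the part of uploadFile used
// here, with an assumed argument and return shape.
interface BotpressFilesClient {
  uploadFile(args: { key: string; content: Uint8Array }): Promise<{ file: { id: string } }>
}

const transferFileToBotpress = async (
  input: TransferInput,
  service: ExternalClient,
  client: BotpressFilesClient
): Promise<TransferOutput> => {
  // 1. Identify the file to download.
  const fileId = input.file.id

  // 2. Download its content from the external service.
  const content = await service.downloadFile(fileId)

  // 3. Upload it to Botpress under the key provided by the plugin.
  const { file } = await client.uploadFile({ key: input.fileKey, content })

  // 4. Return the ID of the uploaded Botpress file.
  return { botpressFileId: file.id }
}
```

Passing the two clients in as parameters keeps the sketch testable with in-memory fakes; in an integration they would come from the action handler's context.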
For reference, see how this logic is implemented in the Dropbox integration.
Implementing real-time sync
The file synchronizer plugin can be configured to use real-time synchronization. This means that changes in the external service are immediately reflected in Botpress. To enable this functionality, the external service must support webhooks that can notify your integration of changes in the filesystem.
Implementing fileCreated
Add a webhook handler
In your integration, add a webhook handler that can receive file change notifications from the external service.
Map the file to the expected schema
In your handler, map the file to the expected schema. The file synchronizer plugin expects the following schema:
Emit the event
Emit the fileCreated event with the mapped file as the payload. The file synchronizer plugin will then handle the rest of the synchronization process.
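A minimal sketch of these three steps is shown below. The webhook payload shape, the SyncFile fields, and the EmitEvent callback are all hypothetical stand-ins: a real handler would parse whatever the external service actually sends and would emit the event through the mechanism your integration uses.

```typescript
// Hypothetical webhook payload sent by the external service.
type WebhookPayload = { item: { uid: string; title: string; parentUid?: string } }

// Assumed file shape; check the interface definition for the real fields.
type SyncFile = { type: 'file'; id: string; name: string; parentId?: string }

// Stand-in for the mechanism your integration uses to emit events.
type EmitEvent = (event: { type: 'fileCreated'; payload: { file: SyncFile } }) => Promise<void>

const handleFileCreatedWebhook = async (rawBody: string, emit: EmitEvent): Promise<void> => {
  // 1. Receive and parse the notification from the external service.
  const payload = JSON.parse(rawBody) as WebhookPayload

  // 2. Map the external file to the expected shape.
  const file: SyncFile = {
    type: 'file',
    id: payload.item.uid,
    name: payload.item.title,
    parentId: payload.item.parentUid,
  }

  // 3. Emit the fileCreated event with the mapped file as the payload.
  await emit({ type: 'fileCreated', payload: { file } })
}
```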
Implementing fileUpdated
The logic is identical to the fileCreated event, but you should emit the fileUpdated event instead.
Implementing fileDeleted
The logic is identical to the fileCreated event, but you should emit the fileDeleted event instead.
Implementing folderDeletedRecursive
Add a webhook handler
In your integration, add a webhook handler that can receive file change notifications from the external service.
Map the folder to the expected schema
In your handler, map the folder to the expected schema. The file synchronizer plugin expects the following schema:
Emit the event
Emit the folderDeletedRecursive event with the mapped folder as the payload. The file synchronizer plugin will then handle the rest of the synchronization process.
Implementing aggregateFileChanges
The logic is identical to that of the fileCreated, fileUpdated, fileDeleted, and folderDeletedRecursive events, but you should emit the aggregateFileChanges event instead.
If your integration needs to emit more than one filesystem change event, you should combine them into a single aggregateFileChanges event. This is more efficient, and the file synchronizer plugin can process it faster.
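As a rough sketch of the combining step, the function below groups a batch of individual changes into one payload. The Change union and the created/updated/deleted payload layout are assumptions made for illustration; the real aggregateFileChanges schema is defined by the interface and may be structured differently.

```typescript
// Assumed item shapes for this sketch.
type SyncFile = { type: 'file'; id: string; name: string }
type SyncFolder = { type: 'folder'; id: string; name: string }

// Hypothetical representation of individual changes reported by a webhook.
type Change =
  | { kind: 'fileCreated'; file: SyncFile }
  | { kind: 'fileUpdated'; file: SyncFile }
  | { kind: 'fileDeleted'; file: SyncFile }
  | { kind: 'folderDeletedRecursive'; folder: SyncFolder }

// Hypothetical aggregate payload layout; check the interface definition.
type AggregatePayload = {
  created: SyncFile[]
  updated: SyncFile[]
  deleted: Array<SyncFile | SyncFolder>
}

// Group all changes from one notification into a single payload.
const aggregateChanges = (changes: Change[]): AggregatePayload => {
  const payload: AggregatePayload = { created: [], updated: [], deleted: [] }
  for (const change of changes) {
    switch (change.kind) {
      case 'fileCreated':
        payload.created.push(change.file)
        break
      case 'fileUpdated':
        payload.updated.push(change.file)
        break
      case 'fileDeleted':
        payload.deleted.push(change.file)
        break
      case 'folderDeletedRecursive':
        payload.deleted.push(change.folder)
        break
    }
  }
  return payload
}

const aggregated = aggregateChanges([
  { kind: 'fileCreated', file: { type: 'file', id: 'f1', name: 'a.txt' } },
  { kind: 'fileUpdated', file: { type: 'file', id: 'f2', name: 'b.txt' } },
  { kind: 'folderDeletedRecursive', folder: { type: 'folder', id: 'd1', name: 'Old' } },
])
```

The integration would then emit one aggregateFileChanges event carrying this payload instead of three separate events.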