You can streamline data retrieval with the GRC GraphQL API by using bulk operations, which allow you to fetch data in large quantities asynchronously. This API is specifically designed to simplify handling pagination for extensive datasets. It enables you to perform bulk queries on any connection field defined in the GRC GraphQL API schema.
Rather than manually managing pagination and client-side rate limiting, you can execute a bulk query operation. Drova's infrastructure handles the heavy lifting of executing your query and provides a URL to download all the requested data conveniently.
Flow
Bulk queries can return large volumes of data and can take minutes to process, so every bulk query operation runs asynchronously. Running one involves three steps:
initiate the operation
wait for the completion of the operation
download the results upon successful completion
The sequence diagram below shows the three steps.
Initiate bulk query
To start a bulk operation, the consumer sends the API the text of the GraphQL query that should run asynchronously. The following example shows how to start the asynchronous process of querying all events that exist in the given GRC site:
mutation ExecuteBulkQuery {
  executeBulkQuery(
    input: {
      query: "query GrcWebsite { grcWebsite(tenantId: \"DROVA\") { region tenantName tenantId events { totalCount edges { node { dateClosed elapsedTime eventDate id isBreach lossAmount permissions recordNumber remedialCost reportedDate riskRating statusDescription title } } } } }"
    }
  ) {
    userErrors {
      ... on ApiForServiceAccountOnlyError {
        authKind
        message
      }
      ... on GrcUserNotExistError {
        message
        tenantId
      }
      ... on GrcWebsiteNotFoundError {
        message
        tenantId
      }
      ... on InvalidQuerySyntaxError {
        message
        query
      }
      ... on QueryNotMatchSchemaError {
        message
        query
      }
    }
    bulkOperation {
      jobId
      status
      url
    }
  }
}
Note that the events property of the website does not have any pagination parameters such as first, after, etc.
The server returns a unique identifier of the operation and the current status.
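The inner query has to be embedded in the mutation as an escaped string, which is easy to get wrong by hand. The following is a minimal sketch of how a client might assemble the request body in Python. The function name build_bulk_mutation and the trimmed selection set are illustrative assumptions, not part of the API. The sketch relies on json.dumps to quote and escape the inner query, which produces escapes that GraphQL string literals also accept for typical queries.

```python
import json

# Mutation template; the selection set is trimmed from the documented example.
MUTATION_TEMPLATE = """mutation ExecuteBulkQuery {
  executeBulkQuery(input: { query: %s }) {
    userErrors {
      ... on InvalidQuerySyntaxError { message query }
      ... on QueryNotMatchSchemaError { message query }
    }
    bulkOperation { jobId status url }
  }
}"""


def build_bulk_mutation(inner_query: str) -> dict:
    """Return a JSON-serialisable request body for the executeBulkQuery mutation.

    build_bulk_mutation is a hypothetical helper name used for illustration.
    """
    # json.dumps adds the surrounding double quotes and escapes inner quotes,
    # backslashes and newlines, so the query can be written naturally.
    escaped = json.dumps(inner_query)
    return {"query": MUTATION_TEMPLATE % escaped}


if __name__ == "__main__":
    inner = 'query GrcWebsite { grcWebsite(tenantId: "DROVA") { events { edges { node { id title } } } } }'
    print(build_bulk_mutation(inner)["query"])
```

The returned dictionary can be serialised with json.dumps and sent as the body of the usual GraphQL POST request.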
Wait for the completion of the operation
After the operation has been initiated successfully, the consumer can retrieve its current status by querying it with the unique operation ID (jobId):
query GrcWebsite {
  grcWebsite(tenantId: "DROVA") {
    currentBulkQuery(jobId: "1123") {
      jobId
      status
      url
    }
  }
}
When the status of the operation indicates success, the query returns the URL of the file where the operation results are stored.
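The waiting step can be sketched as a simple polling loop. This is an illustration under stated assumptions rather than a prescribed client: fetch_operation is a hypothetical callable that runs the status query above and returns the currentBulkQuery object as a dictionary, and the status names that mean success or failure are passed in by the caller because this section does not list them.

```python
import time
from typing import Callable, Collection


def wait_for_bulk_query(
    fetch_operation: Callable[[], dict],
    success_statuses: Collection[str],
    failure_statuses: Collection[str],
    poll_interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the operation succeeds, then return the result file URL.

    fetch_operation is a hypothetical callable supplied by the caller; it
    should return a dict with the jobId, status and url fields.
    """
    deadline = time.monotonic() + timeout
    while True:
        operation = fetch_operation()
        status = operation["status"]
        if status in success_statuses:
            return operation["url"]
        if status in failure_statuses:
            raise RuntimeError(
                f"bulk query {operation.get('jobId')} ended with status {status}"
            )
        if time.monotonic() >= deadline:
            raise TimeoutError(f"bulk query still {status} after {timeout} seconds")
        # Wait before asking again to avoid hammering the API.
        sleep(poll_interval)
```

A fixed interval keeps the sketch short. Because processing can take minutes, a longer interval or a backoff may be a better fit for large datasets.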
Downloading results
The results file can be downloaded by sending an HTTP GET request to the given URL.
The query result is written in JSONL format, where each line in the file is a valid JSON object.
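Because the file is JSONL, it can be consumed as a stream, one line at a time, without loading the whole download into memory. The sketch below uses only the Python standard library. The function name iter_bulk_results is hypothetical, and the sketch assumes that the URL can be fetched with a plain GET request without additional headers.

```python
import json
import urllib.request
from typing import Iterator


def iter_bulk_results(url: str) -> Iterator[dict]:
    """Yield one decoded JSON object per non-empty line of the result file.

    Assumes the URL needs no extra authentication headers (an assumption,
    not something this section states).
    """
    # urlopen issues a GET request; iterating the response reads it line by
    # line, so the whole file never has to fit in memory at once.
    with urllib.request.urlopen(url) as response:
        for raw_line in response:
            line = raw_line.decode("utf-8").strip()
            if line:
                yield json.loads(line)
```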
Each line in the file is a record. If a record has a nested record collection, each child record is extracted into its own object on a following line. For example, the result of a similar bulk query of grcWebsite and all its compliances will be:
{"id":"DROVA","region":"AU","tenantName":"DROVA"}
{"id":"YXJpOmdyYzpBVTpERU1PMzMyOkNvbXBsaWFuY2U6Ng","permissions":[],"recordNumber":"COM0001","statusDescription":"Active","title":"Systems are maintained to ensure the business operates as per its internal and external requirements.","__typename":"GrcCompliance","recordType":"GrcCompliance","__relationName":"compliances","__relationType":"array","__parentId":"DROVA"}
{"id":"YXJpOmdyYzpBVTpERU1PMzMyOkNvbXBsaWFuY2U6Nw","permissions":[],"recordNumber":"COM0002","statusDescription":"Active","title":"Monitor levels of liquid assets test","__typename":"GrcCompliance","recordType":"GrcCompliance","__relationName":"compliances","__relationType":"array","__parentId":"DROVA"}
{"id":"YXJpOmdyYzpBVTpERU1PMzMyOkNvbXBsaWFuY2U6OA","permissions":[],"recordNumber":"COM0003","statusDescription":"Active","title":"Insurance policies in place to protect the business.","__typename":"GrcCompliance","recordType":"GrcCompliance","__relationName":"compliances","__relationType":"array","__parentId":"DROVA"}
{"id":"YXJpOmdyYzpBVTpERU1PMzMyOkNvbXBsaWFuY2U6OQ","permissions":[],"recordNumber":"COM0004","statusDescription":"Active","title":"Protection of staff and customers.","__typename":"GrcCompliance","recordType":"GrcCompliance","__relationName":"compliances","__relationType":"array","__parentId":"DROVA"}
{"id":"YXJpOmdyYzpBVTpERU1PMzMyOkNvbXBsaWFuY2U6MTA","permissions":[],"recordNumber":"COM0005","statusDescription":"Active","title":"Exposure to fraud.","__typename":"GrcCompliance","recordType":"GrcCompliance","__relationName":"compliances","__relationType":"array","__parentId":"DROVA"}
{"id":"YXJpOmdyYzpBVTpERU1PMzMyOkNvbXBsaWFuY2U6MTE","permissions":[],"recordNumber":"COM0006","statusDescription":"Active","title":"General Business Insurance Annual Renewal","__typename":"GrcCompliance","recordType":"GrcCompliance","__relationName":"compliances","__relationType":"array","__parentId":"DROVA"}
Each parent object appears before its children, and the __parentId field provides the reference to the object’s parent.
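Since parents always precede their children, the flat file can be turned back into a nested structure in a single pass. The following sketch shows one way to do it. The function name nest_records is hypothetical, the sketch assumes that id values are unique across the file, and it only handles children whose __relationType is "array", as in the sample above.

```python
import json
from typing import Iterable


def nest_records(lines: Iterable[str]) -> list:
    """Rebuild the parent/child tree from JSONL lines.

    Each child is appended to a list on its parent, keyed by the child's
    __relationName. Records without __parentId are returned as roots.
    """
    by_id = {}
    roots = []
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        by_id[record["id"]] = record
        parent_id = record.get("__parentId")
        if parent_id is None:
            roots.append(record)
            continue
        # Parents precede their children, so the parent is already indexed.
        parent = by_id[parent_id]
        parent.setdefault(record["__relationName"], []).append(record)
    return roots
```

Applied to the sample output above, this would return a single root record for the website with its child records collected under a compliances key.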
Operation restrictions
A bulk operation query needs to include a connection. If your query doesn’t use a connection, then it should be executed as a normal synchronous GraphQL query.
Bulk operations have some additional restrictions:
A maximum of 5 bulk query requests can run concurrently per service account.
The top-level nodes field can't be used. Bulk operations only support edges in the connection, as in the following query:
{
  grcWebsite(tenantId: "DROVA") {
    events {
      edges {
        node {
          id
          title
          formFields {
            edges {
              node {
                id
                fieldName
              }
            }
          }
        }
      }
    }
  }
}
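A client that launches many bulk queries may want to stay under the limit of 5 concurrent requests on its own side. One possible approach, sketched below, is a thread pool capped at five workers. Here run_one is a hypothetical callable that starts one bulk query, waits for it to complete, and returns the result URL. The sketch also assumes that no other process is using the same service account at the same time.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List

# Documented limit of concurrent bulk query requests per service account.
MAX_CONCURRENT_BULK_QUERIES = 5


def run_bulk_queries(queries: Iterable[str], run_one: Callable[[str], str]) -> List[str]:
    """Run every query through run_one with at most five in flight.

    run_one is a hypothetical caller-supplied function. Results are returned
    in the same order as the input queries.
    """
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_BULK_QUERIES) as pool:
        return list(pool.map(run_one, queries))
```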