Working with Datasets

Decouple your test logic from your test data to keep your drift.yaml clean and maintainable.


1. Local and Remote Datasets

Datasets can be stored locally or fetched from a remote URI.

# drift.yaml
sources:
  - name: product-ds # Source identifier (used in drift.yaml only)
    path: product.dataset.yaml # Local path

  - name: remote-data
    uri: https://data.internal.dev/product.dataset.yaml # Remote URI
    auth:
      username: admin
      secret: ${env:DATA_SECRET}

Best Practice

Name your source the same as the dataset name inside the file to avoid confusion. For example, if your dataset is named product-data in the dataset file, name the source product-data as well.
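As a minimal sketch of this practice, assuming the dataset inside product.dataset.yaml is named product-data, the source entry would carry the same name:

```yaml
# drift.yaml
sources:
  - name: product-data # Same as the dataset name inside product.dataset.yaml
    path: product.dataset.yaml
```

With matching names, the identifier in sources and the identifier used in ${product-data:...} references read the same, even though only the dataset name is used for lookups.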


2. Dataset Structure

Define your data objects within the data block of a drift-dataset-file.

# product.dataset.yaml
drift-dataset-file: V1
datasets:
  - name: product-data # ← This is the ID you reference in operations
    data:
      products:
        existingProduct:
          id: 10
          type: "beverage"
          price: 10.99
          name: "cola"
          version: "1.0.0"
        newProduct:
          id: 25
          type: "food"
          price: 5.49
          name: "chips"
          version: "2.0.0"

Multiple Datasets in One File

A single dataset file can contain multiple named datasets:

drift-dataset-file: V1
datasets:
  - name: product-data
    data:
      products:
        product10:
          id: 10
          name: "cola"

  - name: user-data
    data:
      users:
        admin:
          id: 1
          email: "admin@example.com"
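
Each dataset in the file is then addressed by its own name. The following is a sketch only: it assumes the file above is registered under sources in drift.yaml, and the operation names are hypothetical, chosen for illustration.

```yaml
# drift.yaml (sketch; target and expected omitted for brevity)
operations:
  getProduct_FromProductData: # hypothetical operation name
    dataset: product-data
    parameters:
      path:
        id: ${product-data:products.product10.id} # id of product10 (10)

  getUser_FromUserData: # hypothetical operation name
    dataset: user-data
    parameters:
      path:
        id: ${user-data:users.admin.id} # id of admin (1)
```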

3. Understanding Dataset References

Important: When referencing data in operations, use the dataset name from inside the dataset file, not the source name from drift.yaml.

# drift.yaml
sources:
  - name: my-source # ← Source identifier (NOT used in data references)
    path: data.dataset.yaml

operations:
  createProduct:
    dataset: product-data # ← Must match the 'name' inside data.dataset.yaml
    parameters:
      request:
        body: ${product-data:products.existingProduct} # ← Uses dataset name

If the dataset name doesn't match, you'll see an error like:

     ╭─[ drift.yaml:102:14 ]
     │
 102 │     dataset: product-data
     │              ──────┬─────
     │                    ╰─────── There is no dataset with name 'product-data'
─────╯

4. Injecting Data into Operations

Reference your data using the ${datasetName:path.to.key} syntax. The operation must also declare that dataset in its dataset field.

Single Object Reference

createProduct_Success:
  target: source-oas:createProduct
  dataset: product-data
  parameters:
    request:
      body: ${product-data:products.existingProduct}
  expected:
    response:
      statusCode: 201

Nested Field Reference

getProductByID:
  target: source-oas:getProductByID
  dataset: product-data
  parameters:
    path:
      id: ${product-data:products.existingProduct.id} # ← Access nested fields
  expected:
    response:
      statusCode: 200

Array References with Glob Syntax

Use the * glob to reference all items in a collection:

# Reference all product IDs
getProductByID_DoesNotExist:
  target: source-oas:getProductByID
  dataset: product-data
  parameters:
    path:
      id: ${product-data:notIn(products.*.id)} # ← Glob matches all product IDs
  expected:
    response:
      statusCode: 404

The * glob matches every key at that level, and the reference resolves to the values under those keys:

  • ${product-data:products.*} → all product objects
  • ${product-data:products.*.id} → all product IDs
  • ${product-data:products.*.name} → all product names
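
To make the expansion concrete, here is an annotated excerpt of the product-data dataset from section 2, showing which values each glob path would pick up. This is an illustration based on that example data; the order and exact shape of the expanded result are not specified here.

```yaml
# product.dataset.yaml (excerpt of the data block)
products:
  existingProduct:   # matched by products.*
    id: 10           # matched by products.*.id
    name: "cola"     # matched by products.*.name
  newProduct:        # matched by products.*
    id: 25           # matched by products.*.id
    name: "chips"    # matched by products.*.name
```

So for this data, products.*.id covers the IDs 10 and 25, and products.*.name covers "cola" and "chips".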

5. Common Patterns

Testing with Multiple Items

datasets:
  - name: product-data
    data:
      valid_products:
        - id: 10
          name: "cola"
        - id: 25
          name: "chips"

Reference using glob:

createProducts:
  dataset: product-data
  parameters:
    request:
      body: ${product-data:valid_products.*} # ← All items in array

Troubleshooting

Dataset Not Found

Error:

There is no dataset with name 'product-data'

Solution: Ensure the dataset field in your operation matches the name field inside your dataset file, not the source name in drift.yaml.

Path Not Found

Error:

${product-data:products.nonExistent}: There is no entry with key 'nonExistent'

Solution: Verify that the path, written in dot notation (e.g., products.existingProduct.id), exists in your dataset file.
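
As a sketch of the fix, assuming the product-data dataset from section 2 (which defines existingProduct and newProduct under products), the failing reference is replaced with a key that actually exists:

```yaml
parameters:
  request:
    # body: ${product-data:products.nonExistent}   # fails: no key 'nonExistent' under products
    body: ${product-data:products.existingProduct} # resolves: key exists in the dataset
```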