Create Datasets
As interactions with the Large Language Model (LLM) grow, it's essential to keep data organized and easily accessible. The Hyperstack Gen AI Platform provides datasets, which let users group their data into collections for future training, making it easier to manage data and fine-tune models.
Using the API, you can create datasets with the following endpoint:
curl -X POST https://api.genai.hyperstack.cloud/tailor/v1/datasets \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "your-dataset-name",
    "description": "A brief description of your dataset",
    "tags": ["tag1", "tag2"],
    "models": ["model1", "model2"],
    "from_date": "2024-01-01T00:00:00Z",
    "to_date": "2024-12-31T23:59:59Z"
  }'
Request Parameters
- name (required): The name of the dataset. Must be unique for the user.
- description (optional): A brief description of the dataset.
- tags (optional): Array of tags to filter the dataset by.
- models (optional): Array of model names to filter the dataset by.
- from_date (optional): Start date for filtering data (ISO 8601 format).
- to_date (optional): End date for filtering data (ISO 8601 format).
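The same request can also be assembled from a script. The following is a minimal sketch using only the Python standard library, assuming the endpoint, headers, and parameters behave as described above. The function name build_create_dataset_request is hypothetical and not part of any official SDK. Optional parameters are left out of the payload when they are not provided.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.genai.hyperstack.cloud/tailor/v1/datasets"


def build_create_dataset_request(api_key, name, description=None, tags=None,
                                 models=None, from_date=None, to_date=None):
    """Build (but do not send) a POST request that creates a dataset.

    Hypothetical helper for illustration only.
    """
    if not name:
        # "name" is the only required parameter.
        raise ValueError("name is required")

    payload = {"name": name}
    optional = {
        "description": description,
        "tags": tags,
        "models": models,
        "from_date": from_date,  # ISO 8601 string, e.g. "2024-01-01T00:00:00Z"
        "to_date": to_date,      # ISO 8601 string, e.g. "2024-12-31T23:59:59Z"
    }
    # Include only the optional parameters that were actually supplied.
    payload.update({k: v for k, v in optional.items() if v is not None})

    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen. Building the request separately from sending it makes the payload easy to inspect before any call is made.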
Response
On success, the API returns a 201 status code with the following response:
{
"status": "success",
"message": "Successfully created dataset {name}"
}
Error Responses
- 400 Bad Request: If the name parameter is missing
- 409 Conflict: If a dataset with the same name already exists for the user
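A client can branch on the status codes listed above. The sketch below is a hypothetical helper (describe_create_result is not part of the API) that covers only the documented codes 201, 400, and 409. The structure of error response bodies is not documented here, so the sketch reads the body only in the success case.

```python
def describe_create_result(status_code, body=None):
    """Turn a create-dataset response into a short human-readable message.

    Hypothetical helper; handles only the status codes documented above.
    """
    if status_code == 201:
        # Success bodies carry a "message" field, per the Response section.
        return (body or {}).get("message", "Dataset created")
    if status_code == 400:
        return "Bad request: the name parameter is missing"
    if status_code == 409:
        return "Conflict: a dataset with this name already exists"
    # Any other code is outside what this page documents.
    return f"Unexpected status code: {status_code}"
```

On a 409, a reasonable next step is to choose a different dataset name, since names must be unique per user.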
When creating a dataset from the platform, users can filter their data by selecting specific models, tags, and date ranges.
Create datasets through the UI
Data can also be uploaded directly from the platform. Users can drag and drop their JSON/JSONL files into the upload area. The system will validate the format and allow the user to upload the logs.
The upload data page allows you to manage your datasets and view logs.
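Before dragging a JSONL file into the upload area, it can help to check locally that every line parses as JSON. The sketch below is an assumption-laden pre-check, not the platform's own validation: the exact rules the platform applies are not documented here, and find_invalid_jsonl_lines is a hypothetical helper name.

```python
import json


def find_invalid_jsonl_lines(text):
    """Return the 1-based numbers of lines that are not valid JSON.

    Blank lines are skipped. This checks JSON syntax only; it does not
    know which fields the platform expects in each record.
    """
    invalid = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore empty lines
        try:
            json.loads(line)
        except json.JSONDecodeError:
            invalid.append(number)
    return invalid
```

For a file on disk, read its contents as UTF-8 text and pass them to the function. An empty list means every non-blank line parsed successfully.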