Uploading Data to Hyperstack Gen AI Platform
To build specialised, well-performing Large Language Model (LLM) deployments within an enterprise, it is crucial to collect and upload data on interactions between end users and LLMs. This page explains how to upload data to the Hyperstack Gen AI Platform through the platform's drag-and-drop interface and its API endpoints.
Upload through API
The file upload process through the API involves three steps:
- Get a signed URL for file upload
- Upload a JSONL file to the signed URL
- Associate the uploaded file with your account
To read more about the correct JSONL file format, go to the JSONL File Format section.
Step 1: Get a Signed URL
```shell
response=$(curl -X GET https://api.genai.hyperstack.cloud/tailor/v1/generate_signed_url \
  -H "X-API-KEY: $API_KEY")

signedUrl=$(echo "$response" | jq -r '.signedUrl')
filename=$(echo "$response" | jq -r '.filename')
```
The response contains a `signedUrl` and a `filename` that you'll need for the next steps.
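If you're scripting in Python rather than shell, the same two fields can be pulled out of the response with the standard `json` module. A minimal sketch follows; the response body here is an illustrative example, not real API output:

```python
import json

# Illustrative signed-URL response body (example values, not real output).
response_body = (
    '{"signedUrl": "https://storage.example.com/upload?sig=abc123",'
    ' "filename": "logs/2024-01-01-example.jsonl"}'
)

data = json.loads(response_body)
signed_url = data["signedUrl"]  # PUT target for the file upload (step 2)
filename = data["filename"]    # needed when associating the file (step 3)
```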
Step 2: Upload File to Signed URL
```shell
PATH_TO_FILE="data.jsonl"

# Quote the signed URL: it typically contains query parameters with '&'.
curl -X PUT "$signedUrl" \
  -H "Content-Type: application/octet-stream" \
  --data-binary @"$PATH_TO_FILE"
```
Step 3: Associate File with Account
```shell
curl -X POST https://api.genai.hyperstack.cloud/tailor/v1/custom_log_upload \
  -H "X-API-KEY: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "custom_logs_filename": "'"$filename"'",
    "save_logs_with_tags": ["tag1", "tag2"]
  }'
```
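The three steps above can also be sketched as a single Python script using only the standard library. This is an illustrative client, not an official SDK: the endpoint paths come from the curl examples, and the HTTP helpers are not executed here because they require a valid API key.

```python
import json
import urllib.request

API_BASE = "https://api.genai.hyperstack.cloud/tailor/v1"


def get_signed_url(api_key: str) -> tuple[str, str]:
    """Step 1: request a signed URL; returns (signedUrl, filename)."""
    req = urllib.request.Request(
        f"{API_BASE}/generate_signed_url",
        headers={"X-API-KEY": api_key},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["signedUrl"], data["filename"]


def upload_file(signed_url: str, path: str) -> None:
    """Step 2: PUT the JSONL file to the signed URL."""
    with open(path, "rb") as f:
        req = urllib.request.Request(
            signed_url,
            data=f.read(),
            method="PUT",
            headers={"Content-Type": "application/octet-stream"},
        )
    urllib.request.urlopen(req)


def build_association_payload(filename: str, tags: list[str]) -> bytes:
    """Step 3 request body: associate the uploaded file with your account."""
    return json.dumps({
        "custom_logs_filename": filename,
        "save_logs_with_tags": tags,
    }).encode("utf-8")


# Only payload construction is exercised here; the HTTP calls above
# need a real API key and network access.
payload = build_association_payload("example.jsonl", ["tag1", "tag2"])
```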
The following parameters are required in the final step:
- `custom_logs_filename`: The filename received from the signed URL generation
- `save_logs_with_tags`: An array of string tags to associate with the logs
Logging Data API Endpoints
Save Data
To save data from API interactions, use the data endpoint:
```shell
curl -X POST https://api.genai.hyperstack.cloud/tailor/v1/data \
  -H "X-API-KEY: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a friendly chatbot."},
      {"role": "user", "content": "Hello!"}
    ],
    "model": "your-model-name",
    "kwargs": {},
    "tags": ["tag1", "tag2"],
    "usage": {
      "prompt_tokens": 10,
      "completion_tokens": 20,
      "total_tokens": 30
    }
  }'
```
Required parameters:
- `messages`: Array of message objects, each containing a `role` and `content`
- `model`: Name of the model used

Optional parameters:
- `kwargs`: Additional parameters passed to the model
- `tags`: Array of string tags to associate with the logs
- `usage`: Object containing token usage information
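A small helper can assemble and sanity-check this request body before sending it. The sketch below only builds the JSON payload; the field names are taken from the example above, while the validation rules are an illustration, not the API's actual error behaviour:

```python
import json

REQUIRED = {"messages", "model"}
OPTIONAL = {"kwargs", "tags", "usage"}


def build_data_payload(messages, model, **optional):
    """Build the JSON body for POST /tailor/v1/data, rejecting unknown fields."""
    unknown = set(optional) - OPTIONAL
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    return json.dumps({"messages": messages, "model": model, **optional})


body = build_data_payload(
    messages=[
        {"role": "system", "content": "You are a friendly chatbot."},
        {"role": "user", "content": "Hello!"},
    ],
    model="your-model-name",
    tags=["tag1", "tag2"],
    usage={"prompt_tokens": 10, "completion_tokens": 20, "total_tokens": 30},
)
```

The resulting `body` string can then be sent as the `-d` value in the curl example above.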
Upload through UI
Data can also be uploaded directly from the platform: drag and drop your JSON/JSONL files into the upload area, and the system validates the format before uploading the logs.
The upload data page allows you to manage your datasets and view logs.
JSONL File Format
The JSONL file should contain one JSON object per line, where each object represents a conversation or interaction. Here's an example of the expected format:
```json
{"messages": [{"role": "user", "content": "What's the capital of Australia?"}, {"role": "assistant", "content": "The capital of Australia is Canberra."}]}
{"messages": [{"role": "system", "content": "You are a travel advisor."}, {"role": "user", "content": "Where should I go in Europe for a summer vacation?"}, {"role": "assistant", "content": "Consider Italy, Spain, or Greece—they offer great weather, food, and culture in the summer!"}]}
```
Each line in the JSONL file must be a valid JSON object containing:
- `messages`: An array of message objects. Each message object must have:
  - `role`: Either "system", "user", or "assistant"
  - `content`: The text content of the message
Make sure your JSONL file:
- Has one complete JSON object per line
- Uses proper JSON formatting
- Contains the required fields for each message
- Has no trailing commas
- Uses UTF-8 encoding
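The checklist above can be applied programmatically before uploading. Here is a minimal validator sketch; the error messages and the strictness level are illustrative, not the platform's own validation logic:

```python
import json

VALID_ROLES = {"system", "user", "assistant"}


def validate_jsonl(text: str) -> list[str]:
    """Return a list of error strings; an empty list means the JSONL looks valid."""
    errors = []
    for i, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines between records
        try:
            obj = json.loads(line)
        except json.JSONDecodeError as e:
            errors.append(f"line {i}: invalid JSON ({e.msg})")
            continue
        messages = obj.get("messages")
        if not isinstance(messages, list):
            errors.append(f"line {i}: missing 'messages' array")
            continue
        for m in messages:
            if not isinstance(m, dict):
                errors.append(f"line {i}: message is not an object")
                continue
            if m.get("role") not in VALID_ROLES:
                errors.append(f"line {i}: bad role {m.get('role')!r}")
            if not isinstance(m.get("content"), str):
                errors.append(f"line {i}: 'content' must be a string")
    return errors


sample = '{"messages": [{"role": "user", "content": "Hi"}, {"role": "assistant", "content": "Hello!"}]}'
print(validate_jsonl(sample))  # -> []
```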