Data Collection SDK
The collect_data function captures frames from a specified source at specified intervals. Users can configure how often frames are collected, how many frames are collected, and whether the collected frames are uploaded to a cloud storage service such as AWS S3. The collected frames are stored in a Label Studio project, where they can be annotated and used for training or retraining models.
| Parameter | Type | Default | Required | Description | When to Use | Accepted Values |
|---|---|---|---|---|---|---|
| source_id | Integer | None | Yes | Unique identifier for the video/camera source. | Always required. | Any valid source ID |
| time_interval | Integer | None | No | Time interval (in seconds) for frame collection. | Use when you want to collect frames based on time, not count. | Any positive integer |
| number_of_frames_requested | Integer | None | No | Total number of frames to collect. | Use when you want to collect a specific number of frames. | Any positive integer |
| skip_frame_count | Integer | 5 | No | Number of frames to skip between captures. | Use to reduce processing and avoid redundant data. | Any positive integer |
| s3_bucket_name | String | None | No | Name of the AWS S3 bucket. | Use if you want to upload collected frames to AWS. | Valid AWS S3 bucket name |
| s3_access_key | String | None | No | AWS access key for the specified S3 bucket. | Required only when using custom AWS credentials. | Valid AWS credentials |
| s3_secret_key | String | None | No | AWS secret key for the specified S3 bucket. | Required only when using custom AWS credentials. | Valid AWS credentials |
| s3_cloud_path | String | None | No | S3 path to upload the collected frames. | Use if uploading frames to a specific location in the S3 bucket. | Valid S3 path (e.g., s3://bucket-name/path/) |
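The table suggests that time_interval and number_of_frames_requested are alternative collection modes and that the integer parameters must be positive. The sketch below checks these constraints before calling the SDK. The helper name build_collect_args is hypothetical and not part of the SDK, and the rule that exactly one mode must be chosen is an assumption drawn from the "When to Use" column, not a documented requirement.

```python
from typing import Any, Dict, Optional


def build_collect_args(
    source_id: int,
    time_interval: Optional[int] = None,
    number_of_frames_requested: Optional[int] = None,
    skip_frame_count: int = 5,
) -> Dict[str, Any]:
    """Validate the inputs and return keyword arguments for collect_data.

    Hypothetical helper, not part of the SDK.
    """
    # Assumption: the two collection modes are mutually exclusive.
    if (time_interval is None) == (number_of_frames_requested is None):
        raise ValueError(
            "Pass exactly one of time_interval or number_of_frames_requested."
        )

    args: Dict[str, Any] = {
        "source_id": source_id,
        "skip_frame_count": skip_frame_count,
    }
    if time_interval is not None:
        args["time_interval"] = time_interval
    else:
        args["number_of_frames_requested"] = number_of_frames_requested

    # The parameter table lists "any positive integer" for these values.
    for name in ("time_interval", "number_of_frames_requested", "skip_frame_count"):
        value = args.get(name)
        if value is None:
            continue
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    return args
```

With an initialized sdk object, the result could be passed on as sdk.collect_data(**build_collect_args(174, time_interval=10)).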
Collect Data from a Source:
Description:
The collect_data method extracts frames from the source identified by source_id, either over a defined time interval or until a requested number of frames has been collected. Users can also skip frames between captures to reduce redundant data and optionally store the collected frames in an AWS S3 bucket.
Example 1: Collecting data based on the time interval
data_collect = sdk.collect_data(
    source_id=174,
    time_interval=10,  # 10 seconds
    skip_frame_count=5,  # Optional
)
print(data_collect)
Example 2: Collecting data based on the number of frames
data_collect = sdk.collect_data(
    source_id=174,
    number_of_frames_requested=200,
    skip_frame_count=5,  # Optional
)
print(data_collect)
Once the data has been collected, users receive a Label Studio URL where they can annotate the images. After completing the annotations, they can export the dataset as a zip file, which is then used for training or retraining.
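Before passing the exported zip file to training, it can be useful to check that it contains the expected images. The sketch below lists the image files in an archive using Python's standard zipfile module. The function name list_exported_images and the set of image extensions are assumptions, and the internal layout of the archive depends on the export format chosen in Label Studio.

```python
import zipfile
from pathlib import Path
from typing import List

# Assumption: exported frames use one of these common image extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def list_exported_images(zip_path: str) -> List[str]:
    """Return the sorted names of image files inside an exported zip."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if Path(name).suffix.lower() in IMAGE_SUFFIXES
        )
```

Comparing len(list_exported_images(path)) with the number of frames requested gives a quick sanity check on the export.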