Data Collection SDK

The collect_data method captures frames from a specified source at set intervals. You can configure how often frames are collected, how many frames to collect, and whether to upload them to a cloud storage service such as AWS S3. The collected frames are stored in a Label Studio project, where they can be annotated and used to train or retrain models.

| Parameter | Type | Default | Required | Description | When to Use | Accepted Values |
|---|---|---|---|---|---|---|
| source_id | Integer | None | Yes | Unique identifier for the video/camera source. | Always required. | Any valid source ID |
| time_interval | Integer | None | No | Time interval (in seconds) for frame collection. | Use when you want to collect frames based on time, not count. | Any positive integer |
| number_of_frames_requested | Integer | None | No | Total number of frames to collect. | Use when you want to collect a specific number of frames. | Any positive integer |
| skip_frame_count | Integer | 5 | No | Number of frames to skip between captures. | Use to reduce processing and avoid redundant data. | Any positive integer |
| s3_bucket_name | String | None | No | Name of the AWS S3 bucket. | Use if you want to upload collected frames to AWS. | Valid AWS S3 bucket name |
| s3_access_key | String | None | No | AWS access key for the specified S3 bucket. | Required only when using custom AWS credentials. | Valid AWS credentials |
| s3_secret_key | String | None | No | AWS secret key for the specified S3 bucket. | Required only when using custom AWS credentials. | Valid AWS credentials |
| s3_cloud_path | String | None | No | S3 path to upload the collected frames. | Use if uploading frames to a specific location in the S3 bucket. | Valid S3 path (e.g., s3://bucket-name/path/) |
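
The constraints in the table can be checked before a call is made. The following is a minimal sketch and not part of the SDK: `validate_collect_data_args` is a hypothetical helper, and the rule that the two S3 keys must be supplied together is an assumption drawn from the "custom AWS credentials" wording.

```python
from typing import Any, Dict


def validate_collect_data_args(**kwargs: Any) -> Dict[str, Any]:
    """Hypothetical helper (not part of the SDK): check arguments against the table."""
    # source_id is the only parameter the table marks as required.
    if kwargs.get("source_id") is None:
        raise ValueError("source_id is required")

    # The table lists these three as "any positive integer".
    for name in ("time_interval", "number_of_frames_requested", "skip_frame_count"):
        value = kwargs.get(name)
        if value is not None and (not isinstance(value, int) or value <= 0):
            raise ValueError(f"{name} must be a positive integer, got {value!r}")

    # Assumption: custom credentials only make sense as a pair.
    if (kwargs.get("s3_access_key") is None) != (kwargs.get("s3_secret_key") is None):
        raise ValueError("s3_access_key and s3_secret_key must be given together")

    # Drop unset values so the SDK defaults apply.
    return {key: value for key, value in kwargs.items() if value is not None}
```

The returned dictionary could then be unpacked into the call, for example `sdk.collect_data(**args)`.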

Collect Data from a Source:

Description:

The collect_data method extracts frames from the source identified by source_id, either over a defined time interval or up to a requested number of frames. Frames can be skipped between captures to reduce redundant data, and the collected frames can optionally be stored in an AWS S3 bucket.
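
One plausible reading of skip_frame_count is that after each captured frame, the next skip_frame_count frames are discarded. The sketch below illustrates that reading only; the SDK's actual selection logic is not documented here, and `captured_indices` is a hypothetical name.

```python
from typing import List


def captured_indices(total_frames: int, skip_frame_count: int = 5) -> List[int]:
    """Indices kept if `skip_frame_count` frames are dropped after each capture (assumed behaviour)."""
    step = skip_frame_count + 1  # one captured frame plus the skipped ones
    return list(range(0, total_frames, step))


print(captured_indices(20, skip_frame_count=5))  # [0, 6, 12, 18]
```

Under this assumption, a larger skip_frame_count yields fewer, more widely spaced frames from the same stream.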

Example 1: Collecting data based on the time interval

# Assumes `sdk` is an already initialized SDK client.
data_collect = sdk.collect_data(
    source_id=174,
    time_interval=10,     # 10 seconds
    skip_frame_count=5,   # Optional
)

print(data_collect)

Example 2: Collecting data based on the number of frames

# Assumes `sdk` is an already initialized SDK client.
data_collect = sdk.collect_data(
    source_id=174,
    number_of_frames_requested=200,
    skip_frame_count=5,   # Optional
)

print(data_collect)
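
Neither example above uses the optional S3 parameters. A minimal sketch of assembling them follows; the helper name `s3_upload_kwargs`, the environment variable names, and the bucket and path values are all placeholders rather than SDK conventions.

```python
import os
from typing import Dict, Mapping, Optional


def s3_upload_kwargs(
    bucket: str,
    cloud_path: str,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Hypothetical helper: build the s3_* keyword arguments listed in the table."""
    env = os.environ if env is None else env
    kwargs = {"s3_bucket_name": bucket, "s3_cloud_path": cloud_path}

    # Placeholder variable names; the table marks custom credentials as optional.
    access_key = env.get("COLLECT_S3_ACCESS_KEY")
    secret_key = env.get("COLLECT_S3_SECRET_KEY")
    if access_key and secret_key:
        kwargs["s3_access_key"] = access_key
        kwargs["s3_secret_key"] = secret_key
    return kwargs


# Possible usage, assuming `sdk` is an initialized client:
# data_collect = sdk.collect_data(
#     source_id=174,
#     number_of_frames_requested=200,
#     **s3_upload_kwargs("my-bucket", "s3://my-bucket/frames/"),
# )
```

Reading the keys from the environment keeps credentials out of source files; any other secret store would serve the same purpose.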

Once the data has been collected, users receive a Label Studio URL where they can annotate the images. After completing the annotations, they can export the dataset as a zip file, which is then used for training or retraining.
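
As a sketch of the hand-off to training, the exported zip can be unpacked with Python's standard zipfile module. The function name, archive name, and output directory are placeholders, and the internal layout of the archive depends on the export format chosen in Label Studio.

```python
import zipfile
from pathlib import Path
from typing import List


def unpack_export(zip_path: str, out_dir: str) -> List[str]:
    """Extract an exported dataset archive and return its file names, sorted."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out)
        # Skip directory entries so only files are listed.
        return sorted(name for name in archive.namelist() if not name.endswith("/"))


# Possible usage with placeholder paths:
# files = unpack_export("export.zip", "dataset/")
```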